Statistics miscellaneous
A temporary collection of uncategorized statistics jargon, tricks, and memes.
p(-value) hacking
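Also called data dredging or data fishing. The data are analyzed or manipulated in many different ways until some result looks statistically significant, and only that result is shown to convince the audience.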
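The p-value is the probability, assuming the null hypothesis is true, of observing a result at least as extreme as the one actually observed. By convention, p<0.05 is taken as evidence against the null hypothesis, which is then rejected; it does not mean the sampling is biased, and it is not the probability that the hypothesis is true.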
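A minimal sketch of why dredging works, assuming every test is independent and every null hypothesis is actually true (the function name `familywise_error_rate` is made up for illustration). The chance that at least one of k tests falls below the threshold by luck alone is 1 − (1 − α)^k:

```python
def familywise_error_rate(num_tests: int, alpha: float = 0.05) -> float:
    """Probability of at least one false positive among `num_tests`
    independent tests when every null hypothesis is true."""
    return 1.0 - (1.0 - alpha) ** num_tests


# The more comparisons are tried, the more likely a spurious "significant" hit.
for k in (1, 5, 20):
    print(k, round(familywise_error_rate(k), 3))
```

With 20 unplanned comparisons the chance of at least one "significant" result is already well above one half, which is why the number of tests tried should be reported along with the p-value.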
Semi-Variogram: Nugget, Range and Sill
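[Ref] The semi-variogram is half the average squared difference between all data pairs separated by a given spatial distance (lag), plotted against that distance. It comes from geostatistics and is widely used in Geographic Information Systems to show how the similarity between data points decays with spatial distance. The nugget is the value as the distance approaches zero (measurement error or small-scale variation), the sill is the plateau the curve levels off at, and the range is the distance at which the sill is reached, beyond which points are effectively uncorrelated.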
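As a rough illustration of the definition, the sketch below computes an empirical semi-variogram for samples along a 1-D line, γ(h) = Σ (z_i − z_j)² / (2 N(h)) over the N(h) pairs separated by distance h. The function name and the toy data are assumptions for illustration; real data at irregular locations would need the lags grouped into distance bins, which is skipped here.

```python
from collections import defaultdict


def empirical_semivariogram(positions, values):
    """Return {lag: gamma(lag)} for samples at 1-D positions.

    Lags are used exactly as computed (no binning), so this only suits
    regularly spaced sample locations.
    """
    sq_diff_sum = defaultdict(float)
    pair_count = defaultdict(int)
    n = len(values)
    for i in range(n):
        for j in range(i + 1, n):
            lag = abs(positions[i] - positions[j])
            sq_diff_sum[lag] += (values[i] - values[j]) ** 2
            pair_count[lag] += 1
    # Half the mean squared difference per lag.
    return {lag: sq_diff_sum[lag] / (2 * pair_count[lag])
            for lag in sorted(pair_count)}


positions = [0, 1, 2, 3]          # toy sample locations
values = [1.0, 2.0, 4.0, 3.0]     # toy measurements
gamma = empirical_semivariogram(positions, values)
print(gamma)
```

Nugget, range and sill are then read off (or fitted) from the curve of γ against lag.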
Likelihood
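Likelihood reverses the usual direction of probability. Probability asks how likely the data are for a fixed parameter value; likelihood treats the observed data as fixed and measures how plausible each candidate parameter value (hypothesis) is. It is used to compare hypotheses and is not itself the probability that a hypothesis is true.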
This example from McGill University is illustrative.
Example: American or Canadian M&M’s? (Discrete parameter): M&M’s sold in the United States have 50% red candies compared to 30% in those sold in Canada. In an experimental study, a sample of 5 candies were drawn from an unlabelled bag and 2 red candies were observed. Is it more plausible that this bag was from the United States or from Canada? The likelihood function is: L(p|x) ∝ p^2 (1 − p)^3, p = 0.3 or 0.5. L(0.3|x) = 0.03087 < 0.03125 = L(0.5|x), suggesting that it is more plausible that the bag used in the experiment was from the United States.
Source: http://www.medicine.mcgill.ca/epidemiology/hanley/bios601/Likelihood/Likelihood.pdf
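The arithmetic in the example can be checked with a few lines of Python. This is only a sketch: the function name `likelihood` and the dictionary of candidates are made up here, and the binomial coefficient is dropped because it is the same for both candidates and does not affect the comparison.

```python
def likelihood(p, reds=2, total=5):
    """Likelihood of red-candy proportion p, up to a constant factor,
    after seeing `reds` red candies in a sample of `total`."""
    return p ** reds * (1 - p) ** (total - reds)


candidates = {"Canada": 0.3, "United States": 0.5}
likelihoods = {country: likelihood(p) for country, p in candidates.items()}
most_plausible = max(likelihoods, key=likelihoods.get)

for country, value in likelihoods.items():
    print(country, round(value, 5))
print("More plausible origin:", most_plausible)
```

The two values are very close, so the data only weakly favour the United States; a larger sample would separate the two hypotheses more clearly.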
Variability estimation
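PEM (point estimate method), Rosenblueth's method. It looks similar to sampling methods such as bootstrapping, but instead of resampling it evaluates the model at a few fixed points (typically each input at its mean plus and minus one standard deviation) and combines the results with weights to estimate the mean and variance of the output. Rocscience Phase2.0 implementation: PDF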
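A minimal sketch of Rosenblueth's two-point estimate, assuming the inputs are uncorrelated and symmetrically distributed, so that all 2^n combinations of mean ± one standard deviation carry equal weight 1/2^n. The function name `point_estimate` is made up for illustration, and this is not the Rocscience implementation.

```python
from itertools import product


def point_estimate(func, means, stds):
    """Estimate the mean and variance of func(x1, ..., xn) by evaluating it
    at every combination of mean +/- one standard deviation.

    Assumes uncorrelated, symmetric inputs (equal weights for all points).
    """
    n = len(means)
    weight = 1.0 / 2 ** n
    first_moment = 0.0
    second_moment = 0.0
    for signs in product((-1, 1), repeat=n):
        point = [m + s * sd for m, s, sd in zip(means, signs, stds)]
        y = func(*point)
        first_moment += weight * y
        second_moment += weight * y * y
    return first_moment, second_moment - first_moment ** 2


# Toy check: for a sum of two independent inputs the variances add.
mean, variance = point_estimate(lambda x, y: x + y, [1.0, 2.0], [0.5, 0.5])
print(mean, variance)
```

The model is evaluated only 2^n times, which is cheap for a handful of inputs but grows quickly as more uncertain variables are added.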