
Statistics miscellaneous

A temporary home for uncategorized statistical jargon, tricks, and memes.

p(-value) hacking

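Also called data dredging or data fishing: slicing and re-analysing the data until some result appears statistically significant, then reporting only that result to convince the audience.
The p-value is the probability, assuming the null hypothesis is true, of obtaining a result at least as extreme as the one observed. By convention, p < 0.05 is taken as evidence against the null hypothesis, which is then rejected.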
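A minimal sketch of why dredging works: if many independent tests are run on data where nothing is going on, the chance that at least one of them crosses p < 0.05 grows quickly. The function names, the 20-test scenario, and the modelling of each null p-value as a uniform draw are assumptions made for illustration.

```python
import random

def familywise_error_rate(n_tests, alpha=0.05):
    """Chance of at least one false positive among n independent tests of true nulls."""
    return 1.0 - (1.0 - alpha) ** n_tests

def simulate_dredging(n_tests, alpha=0.05, n_studies=20000, seed=0):
    """Fraction of simulated studies where at least one test looks 'significant'.

    Under a true null hypothesis a p-value is uniform on [0, 1], so each test
    is modelled here as a single uniform draw (an assumption of this sketch).
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_studies):
        # A dredger reports the study if any one of the tests crosses alpha.
        if min(rng.random() for _ in range(n_tests)) < alpha:
            hits += 1
    return hits / n_studies

print(familywise_error_rate(20))  # analytic value, roughly 0.64
print(simulate_dredging(20))      # simulated value, close to the analytic one
```

With 20 looks at pure noise, a "significant" finding is more likely than not, which is why the number of tests tried matters as much as the p-value reported.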

p-value hacking, Data dredging

Semi-Variogram: Nugget, Range and Sill

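[Ref] The semi-variogram is half the average squared difference between the values of all data pairs separated by a given spatial distance (the lag), plotted against that distance. It originates in geostatistics and is widely used in Geographic Information Systems to show how the similarity between data points falls off with spatial distance. The nugget is the value of the curve as the lag approaches zero, the sill is the plateau the curve levels off at, and the range is the lag at which the sill is reached.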

Semi-variogram elements illustration
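As a rough sketch of the definition, the empirical semi-variogram can be computed by grouping all data pairs into lag bins and taking half the mean squared difference in each bin. The function name, the binning rule, and the toy data are assumptions made for illustration; real geostatistics packages add lag tolerances, directional options, and model fitting.

```python
import math
from itertools import combinations

def empirical_semivariogram(coords, values, lag_width, n_lags):
    """Return gamma for each lag bin [k * lag_width, (k + 1) * lag_width).

    gamma = (sum of squared differences in the bin) / (2 * number of pairs).
    Bins that contain no pairs are reported as None.
    """
    sums = [0.0] * n_lags
    counts = [0] * n_lags
    for (xi, zi), (xj, zj) in combinations(zip(coords, values), 2):
        k = int(math.dist(xi, xj) // lag_width)  # index of the lag bin
        if k < n_lags:
            sums[k] += (zi - zj) ** 2
            counts[k] += 1
    return [s / (2 * c) if c else None for s, c in zip(sums, counts)]

# Toy 1-D transect: four sample locations one unit apart.
coords = [(0.0,), (1.0,), (2.0,), (3.0,)]
values = [1.0, 2.0, 4.0, 7.0]
gamma = empirical_semivariogram(coords, values, lag_width=1.0, n_lags=4)
print(gamma)
```

In this toy data the values keep diverging with distance, so no sill is reached; on real data the nugget, sill, and range are usually read off a model curve fitted to these binned points.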


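Likelihood

Likelihood reverses the usual direction of probability: with the observed data held fixed, it measures how plausible each candidate hypothesis (parameter value) is. It is not itself the probability that the hypothesis is true.
This example from McGill University is illustrative.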

Example: American or Canadian M&M’s? (Discrete parameter): M&M’s sold in the United States have 50% red candies compared to 30% in those sold in Canada. In an experimental study, a sample of 5 candies was drawn from an unlabelled bag and 2 red candies were observed. Is it more plausible that this bag was from the United States or from Canada? The likelihood function is $L(p \mid x) \propto p^2 (1 - p)^3$, with $p = 0.3$ or $p = 0.5$. $L(0.3 \mid x) = 0.03087 < 0.03125 = L(0.5 \mid x)$, suggesting that it is more plausible that the bag used in the experiment was from the United States.

Source: http://www.medicine.mcgill.ca/epidemiology/hanley/bios601/Likelihood/Likelihood.pdf
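The arithmetic in the example can be checked with a few lines. This is only a sketch: the function name and the dictionary of candidate countries are made up for illustration, and the binomial coefficient is left out because it is the same for both hypotheses and does not affect the comparison.

```python
def likelihood_kernel(p, n_red=2, n_total=5):
    """Likelihood of red-candy proportion p, up to a constant factor."""
    return p ** n_red * (1 - p) ** (n_total - n_red)

# Candidate values of p taken from the example above.
candidates = {"Canada": 0.3, "United States": 0.5}
scores = {country: likelihood_kernel(p) for country, p in candidates.items()}
best = max(scores, key=scores.get)

for country, score in scores.items():
    print(country, round(score, 5))
print("More plausible origin:", best)
```

The two likelihoods are very close, so 2 red candies out of 5 favour the United States only slightly.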

Variability estimation

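PEM (point estimate method), also known as Rosenblueth's method. It resembles sampling-based approaches such as bootstrapping in that it estimates the variability of an output, but instead of resampling it evaluates the model at a few fixed points, typically the mean plus and minus one standard deviation of each input variable. Rocscience Phase2.0 implementation: PDF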
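A minimal sketch of the two-point version of Rosenblueth's method, assuming the input variables are uncorrelated and symmetrically distributed, so that every combination of mean plus or minus one standard deviation gets equal weight. The function name and the toy model are made up for illustration, and this is not the Rocscience implementation.

```python
from itertools import product

def rosenblueth_pem(g, means, stds):
    """Estimate the mean and variance of g(x1, ..., xn) from 2**n evaluations.

    Assumes uncorrelated, symmetric inputs: each corner point
    (mean_i +/- std_i) carries the same weight 1 / 2**n.
    """
    n = len(means)
    weight = 1.0 / 2 ** n
    first_moment = 0.0
    second_moment = 0.0
    for signs in product((-1.0, 1.0), repeat=n):
        point = [m + s * sd for m, s, sd in zip(means, signs, stds)]
        y = g(*point)
        first_moment += weight * y
        second_moment += weight * y * y
    return first_moment, second_moment - first_moment ** 2

# Toy model: the sum of two inputs, for which the exact answer is known
# (mean 10 + 5, variance 2**2 + 1**2).
mean, variance = rosenblueth_pem(lambda x, y: x + y, [10.0, 5.0], [2.0, 1.0])
print(mean, variance)
```

The cost is 2**n model runs for n inputs, which is the appeal when each run is an expensive numerical analysis, and also the limitation once n grows.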
