
Broadly speaking (not just in goodness of fit testing, but in many other situations), you simply can't conclude that the null is true, because there are alternatives that are effectively indistinguishable from the null at any given sample size.

Here are two distributions, a standard normal (green solid line) and a similar-looking one (90% standard normal and 10% standardized beta(2,2), marked with a red dashed line):

[figure: the two density curves, nearly indistinguishable by eye]

The red distribution is non-normal, and with a really large sample we could reject a test of normality based on a sample from it. But at, say, $n=100$, we have little chance of spotting the difference, so we can't assert that data are drawn from a normal distribution - what if they were from a non-normal distribution like the red one instead?

Smaller fractions of standardized betas with equal but larger parameters would be much harder to see as different from a normal.
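To see this in action, here is a minimal simulation sketch (the choice of the D'Agostino-Pearson test and the particular sample sizes are illustrative assumptions, not part of the figure above): it draws samples from the red mixture and records how often normality is rejected at the 5% level.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def sample_mixture(n, rng):
    """Draw n points from the red curve: 90% N(0,1), 10% standardized beta(2,2)."""
    from_beta = rng.random(n) < 0.10
    x = rng.standard_normal(n)
    # beta(2,2) has mean 1/2 and variance 1/20, so shift and scale to mean 0, sd 1
    b = rng.beta(2.0, 2.0, size=from_beta.sum())
    x[from_beta] = (b - 0.5) / np.sqrt(1 / 20)
    return x

# Compare rejection rates of a normality test at a modest and a very large n
for n in (100, 100_000):
    reps = 500
    rejections = sum(
        stats.normaltest(sample_mixture(n, rng)).pvalue < 0.05
        for _ in range(reps)
    )
    print(f"n = {n:6d}: normality rejected in {rejections / reps:.1%} of {reps} samples")
```

At the modest $n$ the rejection rate should sit near the nominal 5%, while at the very large $n$ the same distribution should be rejected nearly every time - the data haven't become 'more non-normal'; the test has simply acquired the power to notice.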

But given that real data are almost never from some simple distribution, if we had a perfect oracle (or effectively infinite sample sizes), we would essentially always reject the hypothesis that the data were from some simple distributional form. As George Box famously put it, "All models are wrong, but some are useful."

Consider, for example, testing normality. It may be that the data actually come from something close to a normal, but will they ever be exactly normal? They probably never are. Instead, the best you can hope for with that form of testing is the situation you describe. (See, for example, the post Is normality testing essentially useless?, but there are a number of other posts here that make related points.)

This is part of the reason I often suggest to people that the question they're actually interested in (which is often something nearer to 'are my data close enough to distribution $F$ that I can make suitable inferences on that basis?') is usually not well answered by goodness-of-fit testing. In the case of normality, the inferential procedures they wish to apply (t-tests, regression, etc.) often tend to work quite well in large samples - even when the original distribution is fairly clearly non-normal - just when a goodness of fit test will be very likely to reject normality. It's little use having a procedure that is most likely to tell you that your data are non-normal just when the question doesn't matter.

Consider the image above again: the red distribution is non-normal, yet at much smaller sample sizes than that, regressions and two-sample t-tests (and many other tests besides) will behave so nicely as to make it pointless to even worry about that non-normality at all.
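As a rough check of that claim (a sketch under illustrative assumptions - the group size, replication count, and the same mixture as above are all arbitrary choices): simulate two groups from the red distribution with identical means and see how close a two-sample t-test's rejection rate is to its nominal 5% level.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

def sample_mixture(n, rng):
    """90% N(0,1), 10% standardized beta(2,2) - the red curve, as above."""
    from_beta = rng.random(n) < 0.10
    x = rng.standard_normal(n)
    b = rng.beta(2.0, 2.0, size=from_beta.sum())
    x[from_beta] = (b - 0.5) / np.sqrt(1 / 20)
    return x

# Both groups share the same (non-normal) distribution, so the null of equal
# means is true; a well-behaved test should reject about 5% of the time.
n, reps = 30, 10_000
rejections = sum(
    stats.ttest_ind(sample_mixture(n, rng), sample_mixture(n, rng)).pvalue < 0.05
    for _ in range(reps)
)
print(f"n = {n} per group: type I error rate ≈ {rejections / reps:.3f} (nominal 0.05)")
```

If the type I error rate comes out close to 0.05 - and with a distribution this close to normal it should - then the non-normality that a huge-sample goodness of fit test would flag is doing no practical harm here.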

Similar considerations extend not only to other distributions but, in large part, to hypothesis testing more generally (even a two-tailed test of $\mu=\mu_0$, for example). One might as well ask the same kind of question - what is the point of performing such testing if we can't conclude whether or not the mean takes a particular value?

You might be able to specify some particular forms of deviation and look at something like equivalence testing, but it's tricky with goodness of fit because there are so many ways for a distribution to be close to but different from a hypothesized one, and different forms of difference can have different impacts on the analysis.
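For the simple $\mu=\mu_0$ case, at least, the equivalence-testing idea is easy to sketch with a TOST (two one-sided tests) procedure. The margin `delta` below is an arbitrary illustrative choice - and choosing such a margin sensibly is precisely what's hard to do for goodness of fit, where "deviation" isn't a single number.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
x = rng.normal(loc=0.02, scale=1.0, size=200)  # true mean near, but not equal to, 0

# TOST: H0 is |mu - mu0| >= delta; reject it (and conclude equivalence) only
# if BOTH one-sided tests reject.
mu0, delta = 0.0, 0.25
p_lower = stats.ttest_1samp(x, mu0 - delta, alternative='greater').pvalue
p_upper = stats.ttest_1samp(x, mu0 + delta, alternative='less').pvalue
p_tost = max(p_lower, p_upper)

verdict = "mean is within ±delta of mu0" if p_tost < 0.05 else "cannot conclude equivalence"
print(f"TOST p-value: {p_tost:.4f} -> {verdict}")
```

Note how this flips the logic of the ordinary two-tailed test: here a small p-value lets you positively conclude the mean is close to $\mu_0$, which is the kind of conclusion a plain goodness of fit test can never give you about a distribution.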
