The mathematics of the last century's worth of experiment design.

Compare with multiple testing

Probably the least sexy topic in statistics and, as such, usually taught by the least interesting professor in the department, or at least by one who couldn’t find an interesting enough excuse to get out of it, which is a fair indication. Said professor will then teach it to you as if you were, in turn, the least interesting student in the school, and so it goes on.

This is unfair, because it turns out to be an elegant and powerful tool if you can move past the stamp collecting of block and combinatorial designs, which few classes do, because cataloguing designs is the easiest way to fill in those long lecture hours.

- Bob Sturm has a nice take, drawn from Bailey, R. A. (2008). Design of Comparative Experiments. Cambridge; New York: Cambridge University Press.

## Questions

What do you do when your variance is not uniform (i.e. your data are *heteroskedastic*)? The entire edifice seems to tumble at that point, and all your pretty Latin squares come to naught. There are some data transforms, of course, such as taking logs for lognormal distributions, but what if your experimental interventions affect the variance in some non-regular way?
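To make the easy case concrete, here is a minimal sketch of the lognormal situation, where a log transform does rescue you. Everything in it is an assumption made for illustration: the simulated data, the function names, and the choice of a multiplicative treatment effect. It also computes Welch's t statistic, one common fallback when the group variances differ. It does not answer the harder question of interventions that change the variance in some irregular way.

```python
import math
import random
import statistics


def simulate_groups(n=2000, effect=0.5, sigma=0.4, seed=0):
    """Hypothetical lognormal responses; treatment scales the response by exp(effect)."""
    rng = random.Random(seed)
    control = [rng.lognormvariate(0.0, sigma) for _ in range(n)]
    treated = [rng.lognormvariate(effect, sigma) for _ in range(n)]
    return control, treated


def variance_ratio(a, b):
    """Sample variance of b divided by sample variance of a."""
    return statistics.variance(b) / statistics.variance(a)


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va = statistics.variance(a) / len(a)
    vb = statistics.variance(b) / len(b)
    t = (statistics.mean(b) - statistics.mean(a)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


control, treated = simulate_groups()

# On the raw scale the treated group has a visibly larger variance.
raw_ratio = variance_ratio(control, treated)

# On the log scale both groups share the same variance (sigma squared).
log_ratio = variance_ratio(
    [math.log(x) for x in control],
    [math.log(x) for x in treated],
)

# Welch's test compares the raw means without assuming equal variances.
t_stat, dof = welch_t(control, treated)

print(f"raw variance ratio: {raw_ratio:.2f}")
print(f"log variance ratio: {log_ratio:.2f}")
print(f"Welch t = {t_stat:.1f} on about {dof:.0f} degrees of freedom")
```

In this toy setup the heteroskedasticity is purely an artefact of the scale, which is exactly why the transform works; nothing here helps if the treatment changes the shape of the distribution rather than just its location on the log scale.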

## Snippets

- Lucile L, Robert Chang and Dmitriy Ryaboy of Twitter have a practical guide to risky testing at scale: Power, minimal detectable effect, and bucket size estimation in A/B tests
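As a rough companion to that guide, here is a minimal sketch of how power, minimal detectable effect and bucket size trade off. It uses the textbook normal-approximation formula for a two-sided, two-sample comparison of means with equal bucket sizes and a known common standard deviation. That choice is my assumption, as are the function names `bucket_size` and `minimal_detectable_effect`; it is not necessarily the method the linked guide uses.

```python
import math
from statistics import NormalDist


def _z_sum(alpha, power):
    """Sum of the two standard normal quantiles that appear in the power formula."""
    z = NormalDist()
    return z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)


def bucket_size(mde, sigma, alpha=0.05, power=0.8):
    """Observations needed per bucket to detect a mean difference of `mde`.

    Assumes equal bucket sizes, a common known standard deviation `sigma`,
    and a two-sided z-test at level `alpha`.
    """
    return math.ceil(2 * (_z_sum(alpha, power) * sigma / mde) ** 2)


def minimal_detectable_effect(n, sigma, alpha=0.05, power=0.8):
    """Smallest mean difference detectable with `n` observations per bucket."""
    return _z_sum(alpha, power) * sigma * math.sqrt(2 / n)


# Hypothetical numbers: detect a shift of 0.1 standard deviations.
n_per_bucket = bucket_size(mde=0.1, sigma=1.0)
mde_check = minimal_detectable_effect(n_per_bucket, sigma=1.0)

print(n_per_bucket)
print(round(mde_check, 4))
```

The thing to notice is the inverse-square relationship: halving the effect you want to detect quadruples the bucket size you need, which is where most of the pain in testing at scale comes from.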