Chapter 2 Basics of Experimental Design and Analysis
Here we briefly define and describe some general characteristics of statistical procedures that guide our decision making in the rest of this guide. We want to create research designs that have enough statistical power to tell us something meaningful about the new policy interventions that we are piloting; we want to use statistical tests that will rarely mislead us, that is, tests that rarely give a false positive result; and we want to use estimators without systematic error. These operating characteristics of our procedures depend on both the design of the study and the computational procedures that we choose. So, we describe them in more depth in the Power Analysis section, which comes after our sections on randomization and the design of experiments and our section on analysis choices.
2.1 Statistical Power: Designing Studies that Effectively Distinguish Signal from Noise
The research designs we use in the OES aim to enhance our ability to distinguish signal from noise: studies with very few observations cannot tell us much about the treatment effect, while studies with very many observations provide a lot of information about the treatment effect. A study that effectively distinguishes signal from noise has high “statistical power,” and a study that cannot do this has low statistical power. The Evidence in Governance and Politics (EGAP) Methods Guide 10 Things You Need to Know about Statistical Power describes in more detail what statistical power is and how to assess it.
Before we field a research design, we assess its statistical power. If we anticipate that the intervention will make only a small change in people’s behavior, then we will need a relatively large number of people in the study: too few people will result in a report saying something like, “The new policy might have improved the lives of the people in the study, but we can’t argue strongly that this is so because the study was too small.”
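To make the relationship between effect size, sample size, and power concrete, here is a minimal sketch in Python. It is not our production power analysis: it assumes a two-arm design with equal arm sizes, a known outcome standard deviation, and a normal approximation to the test statistic. The function name `approx_power` and the example values (an effect of 0.1 standard deviations) are hypothetical and chosen only for illustration.

```python
import math
from statistics import NormalDist


def approx_power(effect, sd, n_per_arm, alpha=0.05):
    """Approximate power of a two-sided test of a difference in means.

    Assumes two equal-sized arms, a common outcome standard deviation `sd`,
    and a normally distributed test statistic (illustrative assumptions).
    """
    std_normal = NormalDist()
    # Standard error of the difference in means with n_per_arm units per arm.
    se = sd * math.sqrt(2.0 / n_per_arm)
    # Critical value for a two-sided test at level alpha.
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Size of the true effect measured in standard errors.
    shift = abs(effect) / se
    # Probability that the test statistic falls in either rejection region.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


# A small effect (0.1 standard deviations) needs many people per arm.
for n in (50, 200, 800, 3200):
    print(n, round(approx_power(effect=0.1, sd=1.0, n_per_arm=n), 3))
```

Under these assumptions, power for a small effect rises toward 1 only as the number of people per arm grows large, which is the sense in which a study can be "too small" to support a strong claim.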
2.2 Error Rates of Tests
A good statistical test rarely rejects a true hypothesis and often rejects false hypotheses. The EGAP Methods Guide 10 Things to Know about Hypothesis Testing describes the basics of hypothesis tests and explains more about how one might know that a given \(p\)-value arises from a test with good properties in a given research design. When we make our analysis plans and complete our analyses and re-analyses, our team tries to follow these practices and to choose testing procedures that are unlikely to mislead analysts.
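One way to check whether a test "rarely rejects a true hypothesis" in a given design is to simulate the design when the null hypothesis of no effect is true and count how often the test rejects. The sketch below does this in Python for a hypothetical pool of 100 units with simulated outcomes, using a difference-of-means test with a normal approximation for the \(p\)-value. The function names, the pool size, and the choice of test are assumptions made for illustration, not a description of any particular OES analysis.

```python
import math
import random
from statistics import NormalDist, mean, variance


def two_sided_p(treated, control):
    """Two-sided p-value for a difference in means (normal approximation)."""
    se = math.sqrt(variance(treated) / len(treated)
                   + variance(control) / len(control))
    z = (mean(treated) - mean(control)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def false_positive_rate(n=100, sims=2000, alpha=0.05, seed=1):
    """Share of simulated experiments that reject a true null of no effect."""
    rng = random.Random(seed)
    # Hypothetical fixed pool of outcomes; treatment changes nothing.
    outcomes = [rng.gauss(0, 1) for _ in range(n)]
    rejections = 0
    for _ in range(sims):
        # Re-randomize: after shuffling, the first half is "treated".
        rng.shuffle(outcomes)
        treated, control = outcomes[: n // 2], outcomes[n // 2:]
        if two_sided_p(treated, control) < alpha:
            rejections += 1
    return rejections / sims


print("Simulated false positive rate:", false_positive_rate())
```

If the test has good properties in this design, the simulated rejection rate should be close to the nominal level `alpha`. A rate well above `alpha` would suggest that the test is likely to mislead in that design.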
2.3 Bias in Estimators
A good estimator is not systematically different from the truth, and an even better estimator also tends to produce estimates that are close to the truth across different experiments. Because the difference of means between treatment and control groups is well known to be an unbiased estimator of the average treatment effect within a given experimental pool, it is a primary quantity that our team reports. By contrast, the coefficient in a logistic regression of a binary outcome on a treatment indicator and a covariate is a biased estimator of the underlying causal difference in log-odds, so we use other approaches when we want to talk about the causal effect of a treatment on log-odds (Freedman 2008b).
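The idea of an estimator that is "not systematically different from the truth" can be illustrated by simulation. The Python sketch below builds a hypothetical experimental pool in which we pretend to know both potential outcomes for every unit, so the true average treatment effect is known. It then repeats the random assignment many times and averages the difference-of-means estimates. All names and numbers (40 units, an average effect near 0.5) are assumptions chosen for illustration.

```python
import random
from statistics import mean


def simulate_difference_in_means(n=40, sims=5000, seed=2):
    """Compare the true ATE with the average estimate across randomizations."""
    rng = random.Random(seed)
    # Hypothetical potential outcomes for a fixed pool: y0 under control,
    # y1 under treatment, with effects that vary from unit to unit.
    y0 = [rng.gauss(0, 1) for _ in range(n)]
    y1 = [y + rng.gauss(0.5, 0.5) for y in y0]
    true_ate = mean(y1) - mean(y0)

    units = list(range(n))
    estimates = []
    for _ in range(sims):
        # Complete random assignment: half of the pool is treated.
        rng.shuffle(units)
        treated, control = units[: n // 2], units[n // 2:]
        # We observe y1 only for treated units and y0 only for control units.
        estimate = (mean([y1[i] for i in treated])
                    - mean([y0[i] for i in control]))
        estimates.append(estimate)
    return true_ate, mean(estimates)


true_ate, average_estimate = simulate_difference_in_means()
print("True ATE in this pool:", round(true_ate, 3))
print("Average of estimates: ", round(average_estimate, 3))
```

Any single estimate can miss the truth, but across many randomizations the estimates should average out very close to the true effect in this pool. The same kind of simulation, applied to a different estimator, would reveal a systematic gap if that estimator were biased.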