Overview

This document explains how our team, the Office of Evaluation Sciences (OES) in the General Services Administration, tends to do statistical analysis, and why we do what we do.1 The OES research integrity process is already documented on the OES Methods Web Page; for example, that page provides templates for our research design and analysis pre-registration process. Here, we get into the nitty-gritty of our statistical work.

Purposes of this document

First, this document educates new team members about the decisions past team members have made regarding the design and analysis of the studies fielded so far. This educative role gives us a place to record decisions for our future selves and helps us harness the strength that arises from our disciplinary diversity. That is, we have decided, as a team, how to do certain statistical analyses, and these decisions may differ from those common in any given academic discipline. This document thus explains why we have landed on those decisions (for now) and how to implement those practices.

Second, this document records decisions that we have made in the absence of pre-analysis plans, or in circumstances unforeseen by our pre-analysis planning, and it should guide our future experimental design and analysis.

Third, this document should help us write better analysis plans and speed our practice of re-analysis. (Our team insists on a blind re-analysis of every study as a quality control for our results before they are reported to our agency partners.)

Fourth, this document should help other teams working to learn about the causal impacts of policy interventions. We hope it contributes to the federal government’s own pursuit of evidence-based public policy, and also helps teams elsewhere doing similar work.

Nature and limitations of this document

We (mostly) focus on randomized field experiments.

This document focuses on the design and analysis of randomized field experiments. Although we include some discussion of non-randomized studies, often known as observational studies, our team has so far focused primarily on randomized field experiments.

We present examples using R

As public servants and social and behavioral scientists, we use the R statistical analysis language because it is (a) one of the two industry standards in the field of data science (along with Python), (b) free, open source, and multiplatform, and (c) the standard for advanced methodological work in the statistical sciences as applied to the social and behavioral sciences (the latest statistical techniques for social and behavioral scientists tend to be developed in R).

Many members of our team also use Stata, SAS, SPSS, or Python. We welcome additions to this document using those languages as well.

Structure of the document

Each section of this document will include, if applicable:

  1. A description of our approach
  2. A description of how we implement our approach, including the relevant R functions, the key arguments they require, and the key values they return.2
  3. A general example using simulated data (perhaps including some evaluation of the tool as compared to other possible choices); a minimal sketch of such an example appears just after this list.
  4. A discussion of a specific example from OES (if applicable) in which we implemented the given procedure.
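
To make item 3 concrete, here is a minimal sketch of what a general example using simulated data might look like, using the randomizr and estimatr packages listed at the end of this document. The sample size, random seed, and treatment effect below are illustrative assumptions, not values from any OES study.

```r
# Minimal sketch (illustrative only): simulate a simple randomized experiment
# and estimate the average treatment effect with robust standard errors.
library(randomizr) # random assignment
library(estimatr)  # regression with robust (HC2) standard errors

set.seed(20220501)              # arbitrary seed for reproducibility
n  <- 1000                      # hypothetical sample size
z  <- complete_ra(N = n)        # complete random assignment to treatment (0/1)
y0 <- rnorm(n)                  # simulated outcome under control
y  <- y0 + 0.2 * z              # assume a true treatment effect of 0.2
dat <- data.frame(y = y, z = z)

# Difference in means estimated by OLS; the key outputs are the coefficient on
# z (the estimated effect), its standard error, and the confidence interval.
fit <- lm_robust(y ~ z, data = dat)
summary(fit)
```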

Throughout the document, we include links to the Glossary and Appendix, to clarify terms or explain tools and procedures in more depth.

Help us improve our work!

Since we hope to improve our analytic workflow with every project, this document should be seen as provisional: a record and a guide for our continuous learning and improvement. We invite comments via Issues and direct contributions via pull requests.

About this document

This book was written in bookdown. The complete source is available from GitHub. This version of the book was built with R version 4.1.2 (2021-11-01) and the following packages.

package version source
bfe 2.0 Github (gibbonscharlie/bfe@4eaebc00d12bc427a9c75aec3280c43a0034b416)
blockTools 0.6-3 CRAN (R 4.1.0)
bookdown 0.26 CRAN (R 4.1.1)
coin 1.4-2 CRAN (R 4.1.1)
DeclareDesign 0.30.0 CRAN (R 4.1.1)
devtools 2.4.3 CRAN (R 4.1.1)
estimatr 0.30.6 CRAN (R 4.1.1)
future 1.25.0 CRAN (R 4.1.2)
future.apply 1.9.0 CRAN (R 4.1.2)
here 1.0.1 CRAN (R 4.1.0)
ICC 2.3.0 CRAN (R 4.1.0)
kableExtra 1.3.4 CRAN (R 4.1.1)
lmtest 0.9-40 CRAN (R 4.1.2)
multcomp 1.4-18 CRAN (R 4.1.1)
nbpMatching 1.5.1 CRAN (R 4.1.0)
quickblock 0.2.0 CRAN (R 4.1.0)
randomizr 0.22.0 CRAN (R 4.1.1)
remotes 2.4.2 CRAN (R 4.1.1)
ri2 0.2.0 CRAN (R 4.1.1)
RItools 0.2.0.9004 Github (markmfredrickson/RItools@c64c8a111e31c678405cdc246dcd2f117bd506d9)
sandwich 3.0-1 CRAN (R 4.1.0)
tidyverse 1.3.1 CRAN (R 4.1.0)
withr 2.5.0 CRAN (R 4.1.1)
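
A listing like the one above can be regenerated when the book is built. The snippet below is a sketch of one way to do it, assuming the devtools package listed above; it is not necessarily the book’s actual build code.

```r
# Sketch (assumed approach): report the version and source (CRAN or GitHub
# commit) of R and each attached package.
library(devtools)
session_info()
```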

  1. We call this document a standard operating procedure (SOP) because we are inspired by the Green, Lin and Coppock SOP.

  2. We use R and R Markdown for our work and in general prefer open source tools in order to best serve the public.