A. INTRO

This file contains instructions on how to interpret and use the data
utilized in "Broadband in School: Impact on Student Performance" by
Rodrigo Belo, Pedro Ferreira, and Rahul Telang. The data were collected
from multiple sources and aggregated into the current dataset. These
sources consist mostly of publicly available data, but some raw data
cannot be disclosed. One such piece of data is the location of the
ISP's central offices (COs). We provide, however, the distance between
each school and its closest CO (but not the actual CO location).
Additionally, some data are readily available online and can be
gathered with little effort. One such example is the students' 9th
grade exam scores. The authors have decided to publish only a clean
panel with all variables aggregated at the school level, and to
enumerate the raw data sources.

Please contact the authors for additional questions.

B. DATA SOURCES

The data were gathered from five different sources: 

1. School Internet use data came from monthly reports collected by FCCN over the period 2005-2009. The data made available correspond
 to the school-level average monthly usage over an academic year.
2. School location and size were provided by the Ministry of Education upon request.
3. 9th grade national exam scores are available from the Portuguese Ministry of Education site: http://www.dgidc.min-edu.pt/jurinacionalexames/index.php?s=directorio&pid=33&ppid=4
4. Central office locations and traffic were provided by Portugal Telecom under a non-disclosure agreement (NDA).
5. Regional-level data such as average monthly earnings and population density were collected from the Portuguese National Statistics
 Institute (www.ine.pt).

C. DATA DESCRIPTION

We provide the data in STATA format (file broadband-schools-ms.dta),
with all variables adequately labeled. A detailed description of each
variable is available in the paper. The file
do-floats-broadband-schools.do contains the necessary STATA commands to
build all the floats presented in the paper. The exceptions are the
regressions that use the schools' corresponding total CO traffic, as
the authors are not authorized to disclose these data. The do-file can
still be run by setting this variable to zero. All results are robust
to including or excluding this variable.
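
As a rough illustration of the workflow described above, the following
STATA sketch loads the panel, sets the undisclosed CO traffic variable
to zero, and runs the do-file. The variable name co_traffic is a
hypothetical placeholder and is not taken from the dataset or the
do-file; check the do-file for the name it actually expects. The sketch
also assumes that both files sit in the current working directory and
that the do-file operates on the data in memory. If the do-file loads
the dataset itself, the variable would have to be set to zero inside
the do-file instead.

```stata
* Load the published school-level panel and list the labeled variables
use "broadband-schools-ms.dta", clear
describe

* ASSUMPTION: co_traffic is a hypothetical name for the undisclosed
* total CO traffic variable. Replace it with the name used in the do-file.
capture confirm variable co_traffic
if _rc != 0 {
    * Variable is absent from the public panel: create it as zero
    generate co_traffic = 0
}
else {
    * Variable is present: overwrite it with zero
    replace co_traffic = 0
}

* Build the floats presented in the paper
do "do-floats-broadband-schools.do"
```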