This directory contains all the files needed to reproduce the tables in "Opening up IP Strategy: Implications for Open Source Software Entry by Start-Up Firms" by Wen Wen, Marco Ceccagnoli, and Chris Forman. 

More specifically, RegressionPanel.dta can be used to create Tables 1, 4, 5, and 6. Table2a.dta and Table2b.dta can be used to create Tables 2a and 2b, respectively. Table3.dta can be used to create Table 3. RegressionPanel_Table7.dta can be used to create Table 7. The do-file Tables.do contains the Stata code to perform these analyses.
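
As a quick reference, the file-to-table correspondence described above can be written as a small lookup. This is only an illustrative sketch in Python and is not part of the replication package: the names TABLE_TO_DATA_FILE and data_file_for_table are hypothetical, and the actual analyses are run by Tables.do in Stata.

```python
# Data file that reproduces each table, as listed in this README.
# The dictionary and helper below are hypothetical conveniences,
# not part of the authors' replication code.
TABLE_TO_DATA_FILE = {
    "1": "RegressionPanel.dta",
    "2a": "Table2a.dta",
    "2b": "Table2b.dta",
    "3": "Table3.dta",
    "4": "RegressionPanel.dta",
    "5": "RegressionPanel.dta",
    "6": "RegressionPanel.dta",
    "7": "RegressionPanel_Table7.dta",
}


def data_file_for_table(table):
    """Return the .dta file name for a table label such as "2a" or 7."""
    key = str(table).strip().lower()
    if key not in TABLE_TO_DATA_FILE:
        raise KeyError(f"No data file listed for table {table!r}")
    return TABLE_TO_DATA_FILE[key]
```

For example, data_file_for_table("2b") returns "Table2b.dta". The returned file could then be opened in Stata, or read from Python with a Stata-file reader such as pandas.read_stata, assuming pandas is installed.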

********************************
The definitions of the major variables included in these data files are as follows.

segment_id: ID assigned to software market j
ossentrants: The number of start-up entry events through releasing new OSS products related to market j in year t
Entry_code: The number of entry events related to market j in year t
log_commons_cumclaims: Log of The Commons' claims-weighted patent count related to market j cumulated up to year t
cumulativeness: Log of cumulativeness of innovation in market j up to year t
concentr: Four-assignee citation concentration ratio in market j up to year t
sales_growth: The growth of market j's sales from year t-1 to year t
log_mkt_claims: Log of total claims-weighted patent count related to market j cumulated up to year t
log_quality: Log of the cumulative stock of citations received by the patents in market j (adjusted for truncation) divided by the total number of patents in market j up to year t
pat_age: The average age of patents in market j granted by year t
log_oin_cumclaims: Log of the Open Invention Network's claims-weighted patent count in market j cumulated up to year t
log_sso_cumclaims: Log of standard-setting organizations' claims-weighted patent count in market j cumulated up to year t
log_OSSsales: Log of the share of OSS downloads related to market j up to year t (as a measure of OSS opportunities in market j by year t) weighted by the total software sales in year t
IBMpatm: Log of IBM's claims-weighted count of patents that belong to the same patent class/subclass as the Commons patents and were applied for in the same year as the patents in the Commons
ossmkt: A dummy that is equal to 1 if market j is an OSS market and equal to 0 if market j is a proprietary market


********************************
Data sources of the major variables included in these data files are as follows.

Data source of the variables ossentrants and Entry_code: a proprietary database named PROMT and authors' calculations. See Section 5.1, Section 6.5, and Appendix A of the paper for the detailed steps used to construct these variables.
Data source of the variable log_commons_cumclaims: http://www.patent-commons.org and authors' calculations
Data source of the variables cumulativeness, concentr, log_mkt_claims, log_quality, pat_age, and IBMpatm: NBER patent data project (https://sites.google.com/site/patentdataproject/Home), combined with data from the USPTO website, and authors' calculations.
Data source of the variable sales_growth: a proprietary database named NETS and authors' calculations.
Data source of the variable log_oin_cumclaims: www.openinventionnetwork.com and authors' calculations. 
Data source of the variable log_sso_cumclaims: www.ssopatents.org and authors' calculations.
Data source of the variable log_OSSsales: www.sourceforge.net, combined with the data from the proprietary database NETS, and authors' calculations.
