Organizing Knowledge Production Teams within Firms for Innovation

How should firms organize their pool of inventive human capital for firm-level innovation? While access to diverse knowledge may aid knowledge recombination, which can facilitate innovation, prior literature has focused primarily on one way of achieving that: diversity of inventor-held knowledge within a given knowledge production team (“within-team knowledge diversity”). We introduce the concept of “across-team knowledge diversity,” which captures the distribution of inventor knowledge diversity across production teams, an overlooked dimension of a firm’s internal organization design. We study two contrasting forms of organizing the firm-level knowledge diversity environment in which a firm’s inventors are situated: “diffuse” (high within-team diversity and low across-team diversity) versus “concentrated” (low within-team diversity and high across-team diversity). Using panel data on new biotechnology ventures founded over a 21-year period and followed annually from inception, we find that concentrated structures are associated with higher firm-level innovation quality, and with more equal contributions from their teams (and the opposite for diffuse structures). Our empirical tests of the operative mechanisms point to the importance of within-team coordination costs in diffuse structures and across-team knowledge flows in concentrated knowledge structures. We end with a discussion of implications for future research on organizing for innovation.


Introduction
Collaborative (rather than solo) knowledge production is increasingly the norm in many creative domains (Wuchty et al. 2007). Patenting follows this same trend, driven in part by the increasingly high cumulative inventor knowledge necessary to master fields (Jones 2009). With inventors becoming narrower but deeper in their knowledge domains, a team-based patent production structure may compensate: empirical evidence suggests that teams are more likely than solo inventors to produce "breakthrough" inventions (Singh and Fleming 2010).
Invention studies in this "science of science" tradition (e.g., Fortunato et al. 2018) often use the population of U.S. inventors, and so the findings are quite general. However, these studies typically do not account for the organizational context in which inventive teams reside and are therefore silent on the managerially relevant question of how production team organization within a firm relates to firm-level innovation outcomes. The literature on product development teams in technology-based environments (the "teams" literature) is typically much richer in the depth of data collection, often via interviews and surveys (e.g., Ancona and Caldwell 1992, Hoegl et al. 2004), but it faces drawbacks in generalizing the findings, especially across organizations. Moreover, in this branch of the literature, the usual unit of analysis is the team rather than the firm, limiting inferences regarding the innovation role of the organizational knowledge environments in which teams are embedded.
We aim to bridge this gap by layering an organizational and team design lens onto a relatively large empirical sample of invention teams and firms. While the literature recognizes the central role of knowledge recombination in generating impactful innovation, the typical instrument of accessing novel knowledge to recombine (Fleming 2001) is within-team knowledge diversity. By broadening the lens to the firm, we introduce a new concept to the firm-level organizing for innovation literature of sourcing knowledge for potential recombination: "across-team" knowledge diversity, which captures the extent to which invention teams differ from one another with respect to their prior technical domain experience. At the extreme, firms achieve knowledge diversity in two contrasting ways: (1) complete within-team diversity paired with no across-team diversity (what we term a "diffuse" organizational knowledge [...]
[...] 2002), and top management team composition (e.g., Bantel and Jackson 1989). Yet data availability has likely constrained theorizing with regard to how production team organization within firms relates to firm-level innovation outcomes. We discuss two topics in this section: how knowledge diversity and access jointly create distinct organizational knowledge environments in which teams pursue innovation activities; and the link between these environments and firm-level innovation outcomes.

Organizational Knowledge Environments
We contend that the diversity of and access to knowledge within the firm must be considered jointly in order to understand the link between inventive human capital organization and innovation. First, the organization of human capital within the firm determines where the inventive process occurs. Second, given the locus of the inventive process, organization also determines the available knowledge upon which the team's inventive process operates. Team boundaries affect the motivation for and difficulty of accessing knowledge from others within the firm but outside the focal invention team.
Inventors embody knowledge accumulated through their past experiences (Gruber et al. 2013), with the aggregate set of inventive human capital inside a firm reflecting the available knowledge inventors can draw upon as part of the production process (Grant 1996, Simon 1991). 1 Under this view, the set of input knowledge to an invention is distributed throughout the firm. In this sense, a production team can build on knowledge that is "local" (contained within the team itself) or "boundary-spanning" (existing inside the firm but on a different production team). From a team design perspective, knowledge diversity both within a focal team as well as outside the focal team but within the firm, together with knowledge access in those same locales, will likely shape innovation output. 2 The access issues we discuss below relate to team boundaries, together with the relative ease and incentives to consult colleagues across versus within those team boundaries.
1 We focus our theorizing on input knowledge that is available within the firm (abstracting from input knowledge available from outside the firm). However, in our empirical analyses we control for the firm's use of this external knowledge by including covariates related to firm-level knowledge capabilities, which broadly capture the firm's absorptive capacity (Cohen and Levinthal 1990).
2 A mix of factors comes into play when determining the set of individuals who directly contribute to a production team. These factors include both individual agency and managerial fiat, with the structures and processes leading to patterns of team formation arising as an outcome of the firm's organizational architecture (Joseph and Ocasio 2012). While the determinants of this architecture may ultimately be subject to managerial control, numerous other factors (only a subset of which are observable) are likely to come into play. We thus abstract away from the determinants of the organization of inventive human capital, focusing instead on the implications of alternative such structures for firm-level knowledge production. We discuss this point further in the concluding section of the article.
Prior literature suggests that the diversity of knowledge among a firm's inventors benefits knowledge production (Hoisl et al. 2016). Access to diverse sets of knowledge facilitates innovation because it expands the scope of available material for recombinant search (Carnabuci and Operti 2013, Hargadon and Sutton 1997, Schumpeter 1934). Accordingly, the technical experience diversity of a firm's inventors critically determines a firm's innovative capacity (Dixon 1994). Yet this line of research is agnostic to whether intra-firm boundaries enable or impede the flow of knowledge available for recombination.
By contrast, another research stream suggests that the location of knowledge relative to an intra-firm boundary influences a firm's innovation output by influencing the ease of search for this knowledge. For example, hierarchical structures can inhibit knowledge sharing (Tsai 2002), while centralized R&D functions allow inventors to draw on a wider range of technologies (Argyres and Silverman 2004). However, this literature on intra-organizational boundaries does not explicitly consider the role of knowledge diversity among the firm's inventors. Locus of knowledge among teams within a firm immediately surfaces issues of access, both with regard to inventors' motivation to access, as well as with regard to inventors' ability to access and use knowledge outside the focal team. Knowledge that crosses team boundaries faces a higher hurdle compared to knowledge present within a team (Tushman 1977).
To understand how the organization of inventors inside a firm influences firm-level innovative output, we characterize the diversity of knowledge within a firm along two distinct dimensions. First, within-team knowledge diversity is the degree to which an average production team inside the firm has individuals who differ among themselves with respect to the knowledge embodied in their prior technical experience. Second, across-team knowledge diversity is the degree to which an average production team inside the firm differs from other teams with respect to the knowledge embodied in the aggregate technical experience of the team's inventors. These forms of knowledge diversity can be conceptualized as characteristics of the firm-level environment in which production teams are embedded. In the former case, individuals encounter knowledge diversity among their own team members. In the latter case, knowledge diversity resides across team boundaries. To ensure these concepts are meaningful when considered together, we confine our conceptual development and empirical analyses to firms organized into multiple production teams (Gerwin and Moffat 1997, Sabbagh 1996).
Electronic copy available at: https://ssrn.com/abstract=3502246
These dimensions of knowledge diversity allow us to characterize firms with regard to their distinctive "organizational knowledge environments," which we propose shape firm-level innovation outcomes. Figure 1 depicts two contrasting organizational knowledge environments. The top panel illustrates a firm with high levels of within-team knowledge diversity and low levels of across-team knowledge diversity. This is a knowledge environment in which knowledge in a given technological area can be seen as "diffuse" throughout the firm. By contrast, the bottom panel illustrates a firm with low levels of within-team knowledge diversity and high levels of across-team knowledge diversity. This is a knowledge environment where knowledge can be characterized as "concentrated," in the sense that teams hold unique knowledge relative to other teams in the firm. We turn next to the implications of these two distinct organizational knowledge environments.

Diffuse Organizational Knowledge Environments and Firm-Level Innovation
In diffuse organizational knowledge environments, the diversity of prior technical experience among inventors on a given production team means that these inventors may lack shared points of understanding and common building blocks that can help guide the inventive process. Such a structure may impede innovation quality due to within-team coordination challenges resulting from heterogeneous cognitive frameworks and the lack of shared common building blocks among team members (Reagans and Zuckerman 2001, Srikanth and Puranam 2011). At the same time, the existence of knowledge diversity within any given team likely reduces the motivation (and perceived need) for seeking knowledge outside the focal team, even though doing so may be comparatively low cost from an accessibility standpoint (shared knowledge backgrounds across teams may facilitate knowledge exchange). The lack of shared common building blocks, together with the more siloed approach to knowledge recombination, likely leads to lower innovation quality under a diffuse organizational knowledge environment:
Hypothesis 1a: Firms characterized by a "diffuse" organizational knowledge environment (high within-team diversity, low across-team diversity) are associated with lower innovation quality.
At the same time, this siloed approach to innovation may have a secondary effect that shapes the distribution of innovation outcomes across teams. Similar teams, each pursuing their own innovations while remaining relatively inert with respect to their surroundings (and other teams), likely arrive at solutions with differential levels of effectiveness. Reframed from an evolutionary perspective, teams composed of similar "genotypes" are likely to exhibit different "phenotypes" when innovating in greater isolation, resulting in more variation in quality across teams. The diversity of combinations produced within the context of a firm-level selection environment can be conceptualized as being akin to an evolutionary mechanism of variation and mutation in speciation. This process likely results in more diversity in the "fitness" of teams and their resulting combination of ideas. However, only a smaller subset of these diverse, inert teams is likely to be "fit" given the (revealed) selection environment. As a consequence, under a diffuse structure, we expect greater inequality across teams with respect to innovation quality:
Hypothesis 1b: Firms characterized by a "diffuse" organizational knowledge environment (high within-team diversity, low across-team diversity) are associated with more unequal team contributions to their innovation output.
The core operative mechanism underlying our prediction of lower innovation quality in Hypothesis 1a above is the lack of a shared cognitive framework and work process within production teams as a result of greater within-team diversity. The ensuing coordination frictions stymie the efficacy of within-team diversity when inventors with different knowledge backgrounds engage with one another as part of the innovation process. These challenges to within-team coordination can be mitigated through a variety of organizational means (Aggarwal and Wu 2015). Shared team processes are one approach to the challenge of within-team coordination in the presence of diverse cognitive frameworks. These processes arise when members of the same team informally learn by collaborating on prior inventions. A longer history of shared production experience by members of a production team allows for the emergence of routines and processes for coordination, thereby muting the negative effect of dissonant cognitive frameworks and work processes (Kotha et al. 2012). Consequently, this form of coordination through jointly shared prior experience may limit the detrimental innovation effects of inert teams under a diffuse organizational knowledge structure. 3 We predict:
Hypothesis 2: Factors that facilitate within-team coordination will positively moderate (i.e., reduce the negative effect of) Hypothesis 1a.

Concentrated Organizational Knowledge Environments and Firm-Level Innovation
Firms operating in a more concentrated organizational knowledge environment likely exhibit the opposite pattern to those in diffuse environments with respect to their innovation output. In a concentrated knowledge environment, the shared backgrounds of inventors provide for common individual starting points to the innovation process. This enables more immediate team productivity. At the same time, while the lack of common conceptual frameworks across production teams may limit access to more distant knowledge, the lack of knowledge diversity within the focal team implies greater benefits to spanning team boundaries (Rosenkopf and Nerkar 2001). Thus, there are likely to be high returns to sourcing knowledge from other teams within the firm. A concentrated knowledge environment therefore allows for a greater likelihood of across-team knowledge flows, which in turn benefit innovation quality:
Hypothesis 3a: Firms characterized by a "concentrated" organizational knowledge environment (low within-team diversity, high across-team diversity) are associated with higher innovation quality.
The higher level of motivation (and need) to engage externally to the team (but within the firm) likely increases flows of across-team knowledge and decreases the perceived threshold against which teams consult with others in the firm outside their own team. In other words, individuals may more readily consult with others outside their team when knowledge diversity is limited within the team, similar to the logic for when managers choose to consult with those outside their firm (Nickerson and Zenger 2004). Thus, despite knowledge being more concentrated within teams, team boundaries may effectively be more porous, enabling individuals to access areas of knowledge in which they themselves are not experts. While this more porous structure allows for higher mean levels of innovation quality as discussed above (due to the reduced need for coordination, paired with the benefits of accessing diverse knowledge as needed), it may also push firms toward greater overall homogeneity in innovation quality across teams. Given the motivation to engage externally, external consultation homogenizes differences and reduces expected quality variation across teams. Therefore, a more concentrated knowledge environment might engender more equal innovation outcomes among teams:
Hypothesis 3b: Firms characterized by a "concentrated" organizational knowledge environment (low within-team diversity, high across-team diversity) are associated with more equal team contributions to their innovation output.
The core operative mechanism underlying our prediction of higher innovation quality in Hypothesis 3a above is cross-team knowledge flows. These underlie the positive innovation quality outcomes expected from a concentrated organizational knowledge environment. In such a setting, there exists a shared (within-team) mutual understanding that allows for coordination, together with a rich organizational knowledge environment that requires boundary-spanning collaboration to reap knowledge-related benefits. When teams operate within a rich soup of (across-team) diversity, cross-fertilization of ideas allows a given team's inventors to benefit from the prior knowledge and experience of inventors on other teams. One implication of this is that firm characteristics which hinder the ease of across-team knowledge flows would reduce the positive effect of a concentrated team structure. For example, geographic dispersion, such as when the firm's inventors are spread across multiple locations, could increase the cost of across-team knowledge exchange (Funk 2013, Haas and Hansen 2007, Lahiri 2010) and thus reduce the positive effect of a concentrated knowledge environment. We predict:
Hypothesis 4: Factors that increase the cost of knowledge accessibility across teams within a firm will negatively moderate (i.e., reduce the positive effect of) Hypothesis 3a.
Taken together, the arguments regarding differences in knowledge flows between diffuse and concentrated organizational environments suggest differences in intra-firm patterns of knowledge flows. Specifically, the relative proportion of knowledge on which a team builds that resides outside the team but within the firm would likely be higher in a concentrated organizational knowledge environment as compared to a diffuse organizational knowledge environment. Because the benefits of operating in a concentrated setting arise from diverse knowledge that resides outside the team, teams in a concentrated environment are more likely to seek out and utilize the benefits of being situated in a soup of diverse knowledge. Doing so would imply building on a higher proportion of outside-the-team knowledge in their inventions as compared to inventors in a diffuse structure (who tend to be more inert and insular). This leads to our final prediction:
Hypothesis 5: Concentrated organizational knowledge environments will draw on a greater proportion of within-firm external-to-the-team knowledge as compared to diffuse organizational knowledge environments.

Industry Setting
To empirically test our hypotheses regarding the firm-level innovation implications of heterogeneous intra-firm knowledge environments, we seek an industry in which innovation output is a key performance metric, where multiple invention teams within a firm is the norm, and where any given inventive team likely holds knowledge that can be of use to other teams, since teams operate within a shared overall knowledge space. The biotechnology industry satisfies these criteria and also has other beneficial characteristics: the industry is one in which knowledge-based resources are an important driver of innovation output, consistent with our objective of studying the link between the organization of knowledge-oriented human capital and firm-level innovation; and studying new venture evolution in this industry avoids issues of left-censoring, allowing us to track the same firms as they develop over time.
Patents are widely regarded as a key means of value appropriation in the biotechnology industry (Levin et al. 1987), and as such we can be more confident in relying on patent data to capture the relevant individual- and team-level characteristics that serve as precursors to the firm's innovative output. We rely on patent data to infer the structure of the firm's production teams for empirical observation. Biotechnology firms generally patent their inventions, and the resulting patent records, which contain names of individual inventors, reveal team composition to observers. For empirical purposes, production teams are defined as the set of inventors meriting attribution on a patent team. Patent data therefore has the primary benefit of enabling us to observe staffing in a large sample panel, which would otherwise be difficult to obtain systematically, especially across firms. In addition, the historical patent record allows us to construct detailed inventor career histories, allowing for comprehensive tracking of an inventor's technical experience both before and during their tenure at an in-sample firm. We use this inventive experience as a measure of the inventor's knowledge.
Although we leverage this data in order to observe a large sample of realized team structures, identifying teams in this way does impose some inference limitations. Our data reveal the ex post realized team structures associated with inventions, but not necessarily the ex ante team structures, as teams which did not successfully patent are censored (see Discussion section for more on this issue).

Data and Sample
To construct our sample, we seek a set of firms that is as homogeneous as possible, apart from the dimension of team organization, to allow meaningful comparisons. Confining the sample to a single industry provides uniformity in interpreting firm-level objectives. Additionally, restricting the sample to venture capital-backed firms increases the commonality of the objectives and time horizons facing firms in the sample. Together, these factors reduce unobserved differences across firms, aside from the desired dimension of heterogeneity in team organization.
Our empirical sample is the universe of 476 venture capital-backed human biotechnology firms (SIC codes 2833-2836) founded between 1980 and 2000, as identified using the VentureXpert database. Our primary dataset is an unbalanced firm-year panel in which firms are observed from their year of founding through either 2009 or, if sooner, their year of dissolution (a longer time window facilitates within-firm inferences). In addition to including all years in which the firm is privately held, we also include in our observation window firm-years following an IPO or acquisition by another entity, together with controls for these ownership regimes. 4 We utilize several sources to construct our variables. The IQSS Patent Network database includes all U.S. Patent and Trademark Office data on patents applied for since 1975 (Li et al. 2014), allowing us to uniquely identify inventors associated with patents and to construct various measures of the production teams engaged in the creation of patents. Firm-year level attributes come from Deloitte Recap RDNA, Pharmaprojects, Inteleos, ThomsonOne, Zephyr, and SEC filings.

Dependent Variables
To measure the innovation performance implications of intra-firm organization (H1a, H2, H3a, H4), our main dependent variable measures the number of forward citations received within a four-year post-application window to the firm's patents in the focal firm-year. Forward citations are an accepted measure of innovation output (Jaffe and Trajtenberg 2002, Trajtenberg 1990), acting as a proxy for the economic value created by the patented innovations. 5 A fixed citation window facilitates meaningful comparisons across observation years: without such a window, older patents would be upward biased in citation counts. 6 To provide insight on the induced behavior underlying the aggregate firm-level innovation outcomes (H1b, H3b), we also construct the dependent variable, forward citations Herfindahl, which measures the concentration of a firm's forward citations among its patents in a focal year. The measure is calculated using the standard Herfindahl index formula. Each patent generates a share of the forward citations (four-year post-application window) to the firm's patents in a focal year. The share for a given patent is calculated as the forward citations to the given patent divided by the total forward citations to all of the firm's patents in the focal year. We then take the sum across the square of each of those shares to generate the firm-year level measure of forward citations Herfindahl. 7 This measure ranges from 0 to 1, with a lower value indicating that forward citations come from a broad base of the firm's patents, and a higher value indicating that forward citations emanate from a more limited (concentrated) set of the firm's patents.
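The share-and-square calculation described above can be sketched in a few lines. This is only an illustrative sketch: the function name, the list-of-counts input, and the choice to return None when a firm-year has no forward citations are assumptions made here, not details taken from the article.

```python
def forward_citation_herfindahl(citations):
    """Herfindahl index of forward citations across a firm's patents in one year.

    `citations` holds the four-year forward citation count of each patent.
    Returning None when the total is zero is an assumption of this sketch,
    since the shares are undefined in that case.
    """
    total = sum(citations)
    if total == 0:
        return None
    # Each patent's share of the firm-year total, squared and summed.
    return sum((c / total) ** 2 for c in citations)


# Hypothetical firm-years with four patents each.
even = forward_citation_herfindahl([5, 5, 5, 5])     # citations spread evenly
skewed = forward_citation_herfindahl([20, 0, 0, 0])  # one patent receives all citations
print(even, skewed)
```

In this stylized case the evenly cited portfolio yields a lower value than the portfolio whose citations all accrue to one patent. Note that with n cited patents the index cannot fall below 1/n, so the lower end of the range is approached only as the number of patents grows.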
To test H5, we build a measure of the utilization of knowledge within the firm but outside of the team generating a given patent. Firm non-team self-citation ratio is the count of backward citations by the firm's patents in the focal firm-year that are firm-level self-citations but do not cite patents by any of the citing patent team's inventors, divided by the total number of backward citations in the focal firm-year. This ratio normalizes the count relative to the overall volume of backward citations being generated by the firm, though the results we report below are also qualitatively the same for non-team self-citation counts (levels). In other words, the numerator of this measure captures the volume of knowledge flowing inside the firm from one team to another, distinct team, and the denominator adjusts this measure for the total volume of knowledge the firm builds on in the firm-year. 8
7 The formula for forward citations Herfindahl is $\sum_i \left[ c_i / \sum_j c_j \right]^2$, where $c_i$ denotes the forward citations to patent $i$ and the sums run over all patents by a firm in a year.
8 Sorensen and Stuart (2000) conceptualize organizational self-citation as a measure of building on own-knowledge, which naturally relates to the knowledge exploitation and exploration constructs. Rosenkopf and Nerkar (2001) consider self-citations in their study of organizational- and technology domain-spanning firm behavior. Both papers distinguish between self-citing and non-self-citing patents as a way of measuring the extent to which firms build on their own prior knowledge as compared to exploring new domains. Our conceptualization starts from this lineage, but extends it for our purposes. We use self-citations to measure internal inventive activity, similar to these two papers. Building on Rosenkopf and Nerkar's (2001) notion of "internal boundary-spanning," we exploit the fact that self-citations list associated co-inventors. We use this as a window to observe the distinction between self-citation of inventions with and without at least one co-inventor of a production team in common, a proxy for communication patterns and consultation within the firm. To our knowledge, this mechanism and measure are novel in the strategy literature.
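As a rough illustration of the firm non-team self-citation ratio, the sketch below counts backward citations that point to the firm's own patents while sharing no inventor with the citing team. The record layout (dictionaries with "inventors", "assignee", and "backward_citations" keys), the function name, and the None result for a firm-year with no backward citations are all hypothetical choices made for this example.

```python
def non_team_self_citation_ratio(firm_patents, firm_id):
    """Share of a firm-year's backward citations that are firm self-citations
    sharing no inventor with the citing patent's team.

    The dictionary layout used here is a hypothetical one for illustration.
    """
    total = 0
    non_team_self = 0
    for patent in firm_patents:
        team = set(patent["inventors"])
        for cited in patent["backward_citations"]:
            total += 1
            is_self_cite = cited["assignee"] == firm_id
            shares_inventor = bool(team & set(cited["inventors"]))
            if is_self_cite and not shares_inventor:
                non_team_self += 1
    # Undefined when the firm-year has no backward citations (sketch assumption).
    return non_team_self / total if total else None


# Hypothetical firm "F" with two patents in the focal year.
patents = [
    {"inventors": ["a", "b"],
     "backward_citations": [
         {"assignee": "F", "inventors": ["a", "c"]},  # self-cite, shared inventor
         {"assignee": "F", "inventors": ["d"]},       # self-cite, different team
         {"assignee": "G", "inventors": ["x"]},       # citation to another firm
     ]},
    {"inventors": ["d"],
     "backward_citations": [
         {"assignee": "F", "inventors": ["a", "b"]},  # self-cite, different team
     ]},
]
ratio = non_team_self_citation_ratio(patents, "F")
print(ratio)
```

Here two of the four backward citations are self-citations to patents by a distinct team, so the numerator isolates across-team knowledge use while the denominator scales by everything the firm builds on.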

Main Independent Variables
Our three primary independent variables address the composition of technical knowledge diversity across and within teams. Before we describe these three independent variables, we first describe two intermediate indices, within-team knowledge diversity and across-team knowledge diversity, which are used to construct the three independent variables used in the regression analyses.
Our primary independent variables of theoretical interest are a combination of within- and across-team knowledge diversity, which we measure using individual inventors' career patenting experience. The intermediate within-team knowledge diversity measure captures the diversity of knowledge among different inventors on a particular production team. 9 To create this measure, we first measure the angular distance between the knowledge experience of each pair of inventors on a team (as described in further detail below, as well as in Appendix Table A1), and then take the mean of this value across all pairs of inventors on a team. The within-team knowledge diversity measure is then the average of this team-level value for all teams in a firm-year (our level of analysis). The across-team knowledge diversity measure captures the degree to which production teams within the firm differ from one another with respect to each team's aggregate knowledge. This variable is the average of the knowledge experience angular distance between all pairs of teams in the firm-year (as described in further detail below, as well as in the Appendix).
The angular distances between pairs of inventors (used to calculate within-team knowledge diversity) and pairs of production teams (used to calculate across-team knowledge diversity) are both based on Jaffe's (1986) cosine similarity measure. For each inventor at each year of her career, we capture her knowledge experience with a "class experience vector," consisting of the total experience the inventor has had patenting in each technology class. An inventor's total experience includes her entire history, and as such captures experience not only in the context of her current firm, but also in all prior firms. For a particular inventor, a given entry in the class experience vector thus represents the stock count of that inventor's patents in that particular technological class through the focal year, with the dimension of the vector being the total number of primary USPTO patent classes. We follow Jaffe's (1986) cosine similarity measure by calculating the "angular separation" between the corresponding class experience vectors. Cosine diversity is defined as 1 minus the cosine similarity measure; the two diversity measures range from 0 to 1, where 1 is completely diverse (no overlap among class experience vectors) and 0 is completely homogeneous (full overlap among class experience vectors).
9 We use a dyadic measure of diversity, which, as opposed to a measure based upon concentration (i.e., Herfindahl index) or variance, allows us to take the full range of an inventor's experience into account and to provide a complete measure of diversity among all the technological experience dimensions within a production team (i.e., a particular patent).
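A minimal sketch of cosine diversity between two class experience vectors follows. Representing each vector as a dictionary from patent class to patent count, the class labels used in the example, and the decision to raise an error for an inventor with no prior patents are assumptions of this illustration rather than details from the article.

```python
import math


def cosine_diversity(u, v):
    """1 minus the cosine similarity of two class experience vectors.

    Vectors are dicts mapping a patent class to a stock count of patents
    (a representation assumed for this sketch). Classes missing from a
    dict count as zero experience.
    """
    dot = sum(count * v.get(cls, 0) for cls, count in u.items())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        # Undefined for an inventor with no patenting history (sketch assumption).
        raise ValueError("class experience vector has no patents")
    # Clamp at zero to absorb floating-point rounding.
    return max(0.0, 1.0 - dot / (norm_u * norm_v))


# Hypothetical inventors; the class labels are placeholders.
same_mix = cosine_diversity({"435": 3, "514": 1}, {"435": 6, "514": 2})
disjoint = cosine_diversity({"435": 2}, {"514": 5})
partial = cosine_diversity({"435": 1, "514": 1}, {"435": 1})
print(same_mix, disjoint, partial)
```

Because the measure depends only on the angle between vectors, two inventors with the same mix of classes score as homogeneous even when one has twice as many patents, while inventors with no classes in common score as completely diverse.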
The within-team knowledge diversity measure is created by forming all possible dyads between inventors on a particular production team, calculating the cosine diversity for each dyad, averaging this for inventor dyads on a production team, and then aggregating to the firm-year level by taking the average for all patents in the firm-year. The across-team knowledge diversity measure is created by first summing the class experience vectors of the inventors on a given production team to create a single aggregate class experience vector for each production team. We then take the set of production teams of a firm in a year, form all possible dyads among these patents, calculate the cosine diversity measure among patents based on their respective aggregate class experience vectors, 10 and then calculate the average over all dyads of patents in the firm-year. We provide a detailed example of the calculation of the two knowledge diversity measures in the Appendix (Table A1).
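The two measures can be illustrated with a minimal sketch. This is not the authors' code: the function names are hypothetical, class experience vectors are represented as plain lists of patent counts, and the treatment of single-inventor teams (skipped in the within-team average) and of inventors with no prior patents (treated as undefined) are assumptions that the text does not specify.

```python
from itertools import combinations
from math import sqrt


def cosine_diversity(u, v):
    """Return 1 minus the cosine similarity of two class experience vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        # Assumption: the measure is undefined for an all-zero vector.
        raise ValueError("cosine diversity is undefined for a zero vector")
    return 1.0 - dot / (norm_u * norm_v)


def mean_pairwise_diversity(vectors):
    """Average cosine diversity over all dyads; None if fewer than two vectors."""
    pairs = list(combinations(vectors, 2))
    if not pairs:
        return None
    return sum(cosine_diversity(u, v) for u, v in pairs) / len(pairs)


def within_team_diversity(teams):
    """Firm-year average of each team's mean inventor-dyad diversity."""
    values = [mean_pairwise_diversity(team) for team in teams]
    # Assumption: teams with a single inventor have no dyads and are skipped.
    values = [v for v in values if v is not None]
    return sum(values) / len(values) if values else None


def across_team_diversity(teams):
    """Mean diversity over team dyads, using summed inventor vectors per team."""
    aggregates = [[sum(column) for column in zip(*team)] for team in teams]
    return mean_pairwise_diversity(aggregates)
```

In this sketch, a firm-year is a list of teams and each team is a list of inventor vectors. A firm whose two teams each hold inventors from a single, different technology class yields a within-team value of 0 and an across-team value of 1, which is the "concentrated" pattern described in the text.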
Taking these intermediate measures of across-team diversity and within-team diversity, we construct three main independent variables to represent the possible permutations of diversity using these two dimensions. We empirically categorize "organizational knowledge environments" into mutually exclusive categories to characterize and emphasize their distinction, and to map them to managerially relevant organization design decisions. We dichotomize these intermediate measures at the 75th percentile of their underlying distribution to capture the meaningful variation in the data, as shown in the scatterplot in Appendix Figure A1 (for within-team diversity, the threshold is a value of 0.3, while for across-team diversity, the relevant value is 0.15) in order to empirically test the regimes of diversity discussed in our hypotheses. 11 Concentrated (low within high across diversity) is an indicator (0/1) for firm-years where within-team knowledge diversity is below the 75th percentile of the sample and across-team diversity is above the 75th percentile. Diffuse (high within low across diversity) (above 75th percentile and below 75th percentile respectively) and high within high across diversity (above 75th percentile and above 75th percentile respectively) are similarly defined. All estimated coefficients on these three variables are relative to the excluded baseline case of low within low across diversity. We make comparisons relative to this baseline to understand how the locus of knowledge diversity, implied by the alternative organizational knowledge environments, shapes firm-level innovation outcomes. In addition, organizing production teams in a diffuse versus concentrated manner represents alternative ways of achieving organizational knowledge diversity (and as compared to the high within high across diversity environment, such potential managerial tradeoffs are not as salient, and so we do not devote much effort to interpreting that estimate).
10 This measure contrasts with that of within-team knowledge diversity, where the cosine diversity measure is calculated among dyads of inventors within a team. In the case of across-team knowledge diversity, it is calculated among dyads of patents. We calculate across-team knowledge diversity across teams rather than across individuals (i.e., across the dyads of individuals in the firm) because we want a measure that captures the diversity of the firm in the context of the organization of teams within the firm. The inclusion of a firm-level diversity measure as calculated across all dyads of individual inventors in the firm does not change the sign or significance of the main independent variables, however.
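The categorization described above can be sketched as follows. The function name and labels are hypothetical, and the handling of values exactly equal to a threshold (treated here as "low") is an assumption, since the text specifies only "above" and "below."

```python
def classify_environment(within, across, within_cut=0.3, across_cut=0.15):
    """Map a firm-year's two diversity values to one of four environments.

    The default cutoffs are the 75th-percentile values reported in the text.
    Assumption: a value exactly at the cutoff counts as low.
    """
    high_within = within > within_cut
    high_across = across > across_cut
    if high_within and high_across:
        return "high within high across"
    if high_within:
        return "diffuse"
    if high_across:
        return "concentrated"
    return "low within low across"  # excluded baseline category
```

Because the four categories are mutually exclusive, the three indicator variables used in the regressions would each equal one for exactly one of the non-baseline labels.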

Moderating Independent Variables
We construct a measure of lowered within-team coordination costs to test Hypothesis 2: prior within-team collaborative experience between different inventors on a team. For any two inventors i and −i, joint experience at time t is defined as the stock count of patents for which both inventors were on the same patent team at any point in their career through time t. The measure is constructed by averaging the joint experience among all dyads of inventors on the same patent team, which is then averaged over all patent teams in a firm-year.
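A minimal sketch of this construction follows. The data layout (a list of `(year, set_of_inventors)` records) and all function names are assumptions made for illustration; whether the focal patent itself should count toward the stock is not stated in the text, so the caller is assumed to pass only the patent history that should be counted.

```python
from itertools import combinations


def joint_experience(a, b, year, patents):
    """Count patents dated through `year` that list both inventors a and b."""
    return sum(
        1
        for patent_year, inventors in patents
        if patent_year <= year and a in inventors and b in inventors
    )


def team_collaborative_experience(team, year, patents):
    """Average joint experience over all inventor dyads on one team."""
    pairs = list(combinations(sorted(team), 2))
    if not pairs:
        return None  # Assumption: undefined for single-inventor teams.
    total = sum(joint_experience(a, b, year, patents) for a, b in pairs)
    return total / len(pairs)


def firm_year_collaborative_experience(teams, year, patents):
    """Average the team-level values over all teams in the firm-year."""
    values = [team_collaborative_experience(t, year, patents) for t in teams]
    values = [v for v in values if v is not None]
    return sum(values) / len(values) if values else None
```

For example, with a history of three patents by inventors A, B, and C, a three-person team's value is the mean of the three dyad-level counts, and the firm-year value averages that figure with the values of the firm's other teams.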
To test Hypothesis 4, we build a measure for the accessibility of knowledge across teams within a firm. Team geographic dispersion is the largest number of unique three-digit ZIP code prefixes among inventors on a production team, across the production teams patenting in the focal firm-year (as of this writing, there are 923 [over 40,000] three-digit [five-digit] ZIP codes in the US). The basic US Postal Service ZIP code format consists of five digits, where the first three digits represent a broader geographic region than the full five digits. Higher team geographic dispersion would make across-team consultation more difficult.
11 The results are also robust to a threshold corresponding to the 90th percentile, though high thresholds yield sparse data in the theoretically relevant independent variables (resulting in the majority of observations falling into the omitted category, low across low within diversity).
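The dispersion measure can be sketched in a few lines. The function name is hypothetical, and ZIP codes are assumed to be stored as five-character strings so that leading zeros are preserved.

```python
def team_geographic_dispersion(teams):
    """Largest count of distinct three-digit ZIP prefixes on any single team.

    `teams` is a list of teams; each team is a list of inventor ZIP codes
    given as strings (an assumption, to keep leading zeros such as "02139").
    Raises ValueError if `teams` is empty.
    """
    return max(len({zip_code[:3] for zip_code in team}) for team in teams)
```

A firm-year with one team spanning prefixes 191 and 943 and another team located entirely within prefix 021 would take the value 2 under this sketch.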

Control Variables
We employ a set of time-varying control variables (measured at the firm-year level) to account for residual differences beyond the time-invariant firm-level characteristics absorbed by including firm fixed effects across all models.
We include two control variables based on the patenting history of the firm and its inventors.
Patent count is the total number of patents applied for by the firm in the firm-year. Total collaborative experience is the average level of joint experience among all dyads formed from the full set of inventors patenting in a firm-year, constructed in a similar manner to within-team collaborative experience, described above. This control serves primarily as a counterpart to within-team collaborative experience, so that this independent variable of theoretical interest measures collaboration only within teams and does not errantly measure all collaborations within the firm.
We further account for time-varying characteristics of the firm that could correlate with both knowledge diversity characteristics and innovation. Collectively, these variables capture characteristics of firm quality and development stage that are particularly relevant in our industry setting of early-stage venture capital-backed biotechnology firms. The variables we create are: age of the firm in years since founding (from VentureXpert and public sources); VC inflows stock, which measures the cumulative venture capital investment into the firm (from VentureXpert); strategic alliance stock, which measures the cumulative stock of alliances in which the firm has been involved to date (from Deloitte Recap RDNA); and active product, which is an indicator for whether the firm has at least one active product in the U.S. Food and Drug Administration (FDA) pipeline (from PharmaProjects and Inteleos). We log VC inflows stock and strategic alliance stock in our regression models due to their skewed distributions.
While privately-held is the baseline ownership regime, we also control for the firm's ownership using the post-IPO and post-M&A variables, hand-collected using archival news sources, as the ownership regime of the firm may influence both knowledge diversity characteristics and innovation (Aggarwal and Hsu 2014). Post-IPO indicates that the firm has undergone an initial public offering (IPO) in or after the focal year, and post-M&A is an indicator that the firm has been acquired in or after the focal year.
Electronic copy available at: https://ssrn.com/abstract=3502246
Finally, we control for top management team (TMT) characteristics that might influence both the firm's organization of knowledge diversity and its innovation output. Diverse top management teams are associated with greater innovation (Bantel and Jackson 1989), and TMT diversity may also correlate with the diversity of inventors that we study. For each firm, we manually collect data from public sources such as publicly viewable LinkedIn profiles and BoardEx to construct a full history of the top management team of each firm, focusing specifically on those holding C-suite titles (CEO, CTO, etc.

Model Specification
We employ conditional fixed effects Poisson models with robust standard errors in our main analyses at the firm-year level for our analysis of forward patent citations, and Tobit models for our analysis of forward citation Herfindahl and firm non-team self-citation ratio. The Poisson model accounts for the non-negative count nature of the forward citations (Ziedonis 2001, Hausman et al. 1984). 13 The Tobit model accounts for the bounded nature of the associated dependent variables, which are bounded at zero at the lower end and can take a maximum of one. To control for time-invariant firm characteristics, we include conditional firm fixed effects in the conditional Poisson model and traditional firm fixed effects in the Tobit model (indicator variable for each firm). For both the panel Poisson and the Tobit models, we include year fixed effects to control for year-over-year changes across firms. Together with the set of controls described above, these models facilitate the interpretation of our results as estimating within-firm, across-time effects. We report robust standard errors clustered at the firm level to account for possible within-firm error correlation across model years. 14
12 Our results are also robust to including any one of these measures individually.
13 A well-known issue with the Poisson model is over-dispersion (the conditional mean in empirical data is typically smaller than the variance). To address concerns about over-dispersion, we cluster (at the firm level) and implement robust standard errors in the panel conditional Poisson model. We also re-estimate our models using a conditional fixed-effects negative binomial model, and we find results that are statistically significant and consistent with those reported.
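For reference, the Poisson specification described above can be written compactly as follows. This is a sketch under assumed notation rather than an equation taken from the paper: $y_{it}$ denotes forward citations for firm $i$ in year $t$, $\alpha_i$ and $\delta_t$ are firm and year fixed effects, $X_{it}$ is the vector of time-varying controls, and the three indicators are the organizational knowledge environment variables defined earlier.

```latex
\begin{equation*}
E\left[\,y_{it} \mid \cdot\,\right]
  = \exp\!\left(
      \alpha_i + \delta_t
      + \beta_1\,\mathit{Diffuse}_{it}
      + \beta_2\,\mathit{Concentrated}_{it}
      + \beta_3\,\mathit{HighHigh}_{it}
      + \gamma' X_{it}
    \right)
\end{equation*}
```

Under this notation, the omitted category is low within low across diversity, so each $\exp(\beta_k)$ compares the expected citation count in the corresponding environment with that of the baseline, holding the other terms fixed.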

Results
In Table 3, we present regression estimates of the models testing our main hypotheses. All models use a conditional fixed effects Poisson estimation except for models (3-2) and (3-6), which use a Tobit estimation with fixed effects. For the conditional Poisson models, the reported coefficients are incidence-rate ratios, which represent the exponentiated form of the regression coefficients. These coefficients can be interpreted as follows: for a unit increase in an independent variable, the incidence rate of the dependent variable would be expected to be scaled (multiplied) by the value of the estimated coefficient.
Thus, a coefficient value less than one should be interpreted as a negative effect, while a coefficient value greater than one should be interpreted as a positive effect. The Tobit coefficients are reported in the standard manner (not as incidence-rate ratios).
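The interpretation of incidence-rate ratios can be made concrete with a short sketch. The function names are hypothetical, and the ratios used in the usage note are illustrative inputs rather than values quoted from Table 3.

```python
from math import exp


def coefficient_to_irr(beta):
    """Exponentiate a Poisson regression coefficient to get the incidence-rate ratio."""
    return exp(beta)


def irr_to_percent_change(irr):
    """Percent change in the expected count for a one-unit increase in the regressor."""
    return (irr - 1.0) * 100.0
```

For instance, an incidence-rate ratio of 0.624 would correspond to 37.6% fewer expected citations, a ratio of 1.439 to 43.9% more, and a ratio of exactly one to no change.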
All models include the three main independent variables: high within low across diversity ("diffuse" organizational knowledge environment), low within high across ("concentrated" organizational knowledge environment), and high within high across diversity. The base category from which the other coefficients should be compared is low within low across diversity. All specifications include control variables, firm fixed effects, and year fixed effects.
14 We use the statistical software Stata/MP 15.1 to conduct all our empirical analysis. To estimate conditional fixed-effects Poisson regressions, we use the command xtpoisson, with the options of fe and vce(robust). For xtpoisson, the robust standard error option by default clusters at the level of the panel variable, i.e., firm, and is equivalent to explicitly specifying the variable to identify the panel (e.g., Wooldridge 2013). For Tobit regressions, we use the tobit command with clustered standard errors by firm (vce(cluster firm)) with indicator variables for each firm.

Model (3-1) considers the dependent variable of forward citations, and model (3-2) considers forward citation Herfindahl, where higher values indicate inequality in invention quality. A diffuse team knowledge structure (high within low across diversity) correlates with lower firm forward citations and higher inequality in invention quality, supporting H1a and H1b respectively. The diffuse team structure is associated with 37.6% fewer forward citations to a firm's patents generated in a year. On the other hand, the concentrated team knowledge diversity structure (low within high across diversity) correlates with higher firm forward citations accompanied by more equal invention quality, supporting H3a and H3b respectively. The concentrated team structure is associated with 43.9% more forward citations to a firm's patents generated in a year. All of these effects are highly statistically significant (p < 0.01).
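The forward citation Herfindahl can be illustrated with the standard sum-of-squared-shares formula. This is a sketch, not the authors' code: the function name is hypothetical, and returning `None` when a firm-year has no forward citations is an assumption, since the text does not say how that case is handled.

```python
def citation_herfindahl(team_citations):
    """Sum of squared citation shares across patent teams in a firm-year."""
    total = sum(team_citations)
    if total == 0:
        # Assumption: shares are undefined when no team receives citations.
        return None
    return sum((count / total) ** 2 for count in team_citations)
```

A value of 1 means that a single team accounts for all of the firm-year's forward citations, while four teams with equal citation counts give 0.25; lower values thus correspond to more equal contributions across teams.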

Mechanisms
In model (3-3), we add a moderating interaction term to test H2, a factor that facilitates within-team coordination. We find that the base term of within-team collaborative experience is positively associated with forward citations. More importantly, we find that within-team collaborative experience positively moderates (makes less negative) the negative effect of a diffuse team structure (p < 0.01), supporting H2.
One additional unit of past patenting experience between two team members is associated with 15.6% more forward citations for diffuse structures; expressed differently, a standard deviation increase in within-team collaborative experience is associated with 44.0% more forward citations for diffuse structures. 15 In model (3-4), we find support for H4, the moderating effect of across-team knowledge accessibility on concentrated team structures. The base coefficient on team geographic dispersion is negative. When interacted with the concentrated team structure variable, we find a negative moderating effect (i.e., an effect that makes the main effect less positive) of team geographic dispersion (p < 0.01).
An additional three-digit ZIP code of an inventor residence for the most dispersed patent team in a firm-year is associated with 13.7% fewer forward citations for concentrated structures; alternatively, a one standard deviation increase in team geographic dispersion is associated with 19.3% fewer forward citations. Model (3-5) includes both moderators in the same specification, with results consistent with (3-3) and (3-4), while preserving the main direct effect results of diffuse and concentrated organizational knowledge structures.
Finally, model (3-6) demonstrates support for H5 by showing that concentrated organizational knowledge environments draw on a greater proportion of within-firm external-to-the-team knowledge as compared to diffuse organizational knowledge environments. We find that the concentrated structure is positively associated with firm non-team self-citation ratio (p < 0.01), while the diffuse structure is negatively related (p < 0.10). 16
[Insert Table 3 here]

Endogeneity Considerations
An endogeneity concern for our study is that there is an omitted variable that relates to firm quality which could correlate both with how knowledge diversity is organized within the firm (the diffuse and/or concentrated structure) as well as with innovation quality (the outcome variable). For example, a superior management team, in ways not captured by our TMT diversity control variable, may be able to impose a particular structure of knowledge diversity within the firm, while at the same time also influencing the firm's innovation output quality in ways independent from our measures of knowledge diversity structure.
More generally, the firm may select a structure due to an unobserved or unmeasurable factor that also drives innovation outcomes. We cannot fully exclude these possibilities, as we do not have an exogenous driver of organizational knowledge environments. However, the inclusion of firm fixed effects mitigates the role of unobserved, time-invariant factors, and the multiple sets of controls further address factors that vary over time.
16 To rule out the concern that firm non-team self-citation ratio has a mechanical relationship with concentrated or diffuse structures, i.e., inventors in either environment have a different risk set of within-firm non-team citations to make, we conduct an additional statistical test. The test evaluates whether these alternative organizational knowledge structures differentially relate to inventors' past co-patenting relationships in the form of exposure to peer patent class expertise through those relationships. We do not find a statistically significant difference in the co-patenting class exposure across concentrated versus diffuse structures, suggesting that our estimated relationships are not simply mechanical.
The second endogeneity concern is that of reverse causality between the dependent variable of innovation output and the main independent variables-i.e., the possibility that firm innovation output drives the organizational knowledge environment structure. This reverse effect could be the case if firms with more promising innovation possibilities choose to select more or less diverse team structures to implement the innovation. Alternatively, firms with more innovative output may be able to become more diverse because they have greater resources from their innovation and can choose to be diverse, while other firms may be constrained from doing so. The antecedents of production team formation are an important research agenda in their own right, an issue we touch upon in our concluding discussion.

Discussion
In contrast with much of the extant literature on diversity and innovation, which focuses on the team or sub-unit-level implications of knowledge diversity, we show that knowledge diversity arising from the firm-level organization of inventive human capital can play a key role in influencing the efficacy of a firm's knowledge-generation processes. We do so by introducing the concept of across-team knowledge diversity, which stands in contrast to the comparatively well-developed construct of within-team knowledge diversity. The across-team knowledge diversity concept opens new terrain in the ways in which organizations can design for knowledge diversity.

Theoretical Implications
The insights we develop have implications for several streams of the strategy literature, including work on intra-firm knowledge networks and the knowledge-based view of the firm. With regard to the former, a central theme of work on intra-firm networks is that the structure of social ties has implications for the degree to which individuals have access to and can use knowledge from distant parts of the firm (Grigoriou and Rothaermel 2013, Nerkar and Paruchuri 2005, Tsai 2002). Generally missing in this more structural account of intra-firm networks, however, are considerations regarding the characteristics of the nodes (e.g., the knowledge held by a particular inventor), as well as the ways in which these characteristics are distributed across the firm. Our results on across- and within-team knowledge diversity suggest that the distribution of inventive experience across invention teams is a key lever influencing firm-level innovation output due to the mechanisms of within-team coordination costs and across-team knowledge flows. Knowledge access and use stemming from the distribution of human capital experience vis-à-vis production teams thus offers an explanation for heterogeneity in firm-level innovation that is distinct from network-related considerations. Future work might examine the degree to which these two sets of factors interact with one another to inform our understanding of firm-level innovation output.
Our work also has implications for the broader "knowledge-based view" of the firm (Grant 1996, Kogut and Zander 1992). A central theme of this literature is that competitive advantage, particularly in knowledge-based industries, is driven by differential firm-level abilities to organize internally-held heterogeneous knowledge. Less attention has been paid, however, to the firm-level design principles necessary to achieve desired outcomes in this regard. As such, while the knowledge-based view has been deeply influential, it has made less headway with regard to the internal organizational correlates of innovation. One reason for this is that while the micro-foundational processes underlying the knowledge-based view have been examined at the level of the individual team (substantial prior work has examined the knowledge-based correlates of team-level innovation (Ancona and Caldwell 1992, Hoegl et al. 2004)), limited progress has been made in developing a firm-level view that takes into account both within-team factors (e.g., coordination costs) together with across-team factors (e.g., cross-team knowledge flows).
The insights from our study underscore the importance of considering firm-level team design in evaluating the efficacy of a firm's knowledge pool. Stated differently, when viewed at the level of the firm (versus the team), the organization design of a firm's inventive human capital can shape the efficacy of firm-level knowledge generation.
Finally, our paper has implications for recent work on the types of knowledge generated by organizations. Recent work by Pontikes and Barnett (2017) introduces the idea of "knowledge consistency," with the thesis that accretive knowledge (at the organizational level of analysis) is favored in the relevant product space. Two interlinked elements seem important to this concept: first, knowledge builds cumulatively and comprehensively; and second, the speed at which knowledge can be built matters. One extension of these ideas would be to examine the extent to which knowledge consistency can be achieved at the production team level, consistent with our focus on the locus of knowledge within firms. In particular, the organizational production team lens might be a way to study whether knowledge consistency operates mainly at the level of the firm or at the level of the production team, with implications for the knowledge accumulation process, particularly when the organizational environment shifts.

Limitations
In our conceptual development and empirical analysis, we abstract away from the antecedents of production team formation, instead focusing primarily on the innovation consequences of alternate patterns of knowledge diversity organization. Conceptually, production teams may arise from some combination of managerial fiat and self-organizing activities by the inventors themselves. In hierarchical organizations, managers may explicitly assign formal teams. Alternatively, production teams can self-organize: team members may organically search for team members, and then decide to work together.
There is a limited literature on the process and outcome of self-organization, but such situations can be conceptualized either as informal working relationships that form "at the water cooler," or in a more intentional way, as in agile or scrum development teams, which are common in software industry settings.
In a recent paper, Mortensen and Haas (2016)  where production teams are randomly assigned with respect to each inventor's patenting experience across technology classes. In addition, studies employing in-depth qualitative methods might examine the micro-processes through which individuals access and use knowledge, and in so doing contribute further to our understanding of how the organization of diversity shapes the efficacy and concentration of innovation output within a firm.

Conclusion
Prior literature focuses primarily on access to diverse knowledge as an input to the knowledge recombination process. We suggest that when conceptualized at the firm-level, knowledge diversity can be characterized along two distinct dimensions: within-team and across-team. The latter captures the extent to which knowledge diversity of a firm's inventors is distributed across, rather than within, production teams, thereby taking into consideration the firm's internal organization. Our theory development and results suggest that the alternate forms of inventive human capital organization that arise by considering across-team diversity can be differentially effective with respect to firm-level innovation, as well as with respect to the concentration of such innovation across production teams in the firm.
To conclude, by introducing to the literature the novel dimension of across-team knowledge diversity, we advance our understanding of the link between the intra-firm organization of inventive human capital and firm-level innovation.
17 A different issue related to observed invention production teams is that we use patent data to study such teams. A common concern in studies using patent data to study organizations is censoring, i.e., that some assembled teams did not successfully produce a patent, and therefore do not appear in our data. Under the assumption that the inventor teams which did not successfully receive granted patents would have received no forward citations as well if we were able to observe them, we explore the distribution of forward citations, segmented by diffuse and concentrated organizational knowledge structures, in the histogram presented in Appendix Figure A2 (before any regression analysis). An empirical concern would be that the post-regression positive relation of a concentrated organizational knowledge structure with innovation quality is upward biased due to the censored left density of forward citations by teams with a concentrated structure. While we cannot know if censored knowledge structures are more or less likely to appear in the left tail of the forward citation distribution, from the data in our sample, we see that the overall concentrated structure effect is likely driven by the non-zero part of the distribution (while the opposite may be true for the diffuse structure, which has more density in the left tail of the distribution). Nevertheless, a caveat to our analysis is that censored team compositions may bias our estimates.

Figure 1. Organizational Knowledge Environments: The Locus of Knowledge Diversity
The dashed lines represent production team boundaries and the solid lines represent firm boundaries. Distinct shapes represent inventors with different technological specializations. Two alternate firm-level approaches to organizing inventors on a team are depicted. Firm A represents a firm with a diffuse organizational knowledge environment consisting of high within-team knowledge diversity and low across-team knowledge diversity. Firm B represents a firm with a concentrated organizational knowledge environment with low within-team knowledge diversity and high across-team knowledge diversity.

Forward citations
Total forward citations within a four-year window to patents filed in the focal firm-year 5.14 21.46

Forward citation Herfindahl
Herfindahl concentration of forward citations among patent teams in the focal firm-year 0.76 0.30

Firm non-team self-citation ratio
Proportion of firm non-team self-citations out of all backward citations in a focal firm-year 0.02 0.07

Main Independent Variables
Note: For conditional Poisson models, the reported exponentiated coefficients are incidence rate ratios: a unit increase in an independent variable scales (multiplies) the dependent variable by the estimated coefficient. A coefficient value less (greater) than one represents a negative (positive) effect. For Tobit models, reported coefficients are standard. Robust standard errors clustered at the firm level are shown in parentheses and p-values are shown in brackets. * p<0.10 ** p<0.05 *** p<0.01.