Exchange-Traded Funds and the Wash Sale Loophole

Published Online: https://doi.org/10.1287/mnsc.2024.06944

Abstract

Tax wash sale rules prohibit the recognition of capital losses when substantially identical securities are sold and repurchased within short windows. This study examines whether institutional investors use exchange-traded funds (ETFs) to circumvent wash sale rules. Consistent with tax-motivated demand for ETFs, incumbent ETFs both create more shares and experience more trading volume upon the introduction of nearly identical ETFs, particularly when recent returns are negative. We show that tax-sensitive institutions’ investment in highly correlated ETFs has proliferated in recent years, exceeding a quarter of their assets under management. Furthermore, tax-sensitive institutions holding more ETFs are significantly more likely to engage in swapping nearly identical ETFs. This swapping behavior has become widespread, with tax-sensitive institutional investors swapping $417 billion of nearly identical ETFs since 2001. We estimate that tax-sensitive institutions realized more than $84 billion in losses in highly correlated ETFs associated with the swapping activity since 2001.

This paper was accepted by Shiva Rajgopal, accounting.

Funding: The authors thank their institutions for financial support. M. Dambra thanks the Kenneth W. Colwell endowment for research support, C. M. C. Lee thanks the Kermit O. Hanson endowment for research support, and P. J. Quinn thanks PricewaterhouseCoopers for research support.

Supplemental Material: The online appendix and data files are available at https://doi.org/10.1287/mnsc.2024.06944.

Tax loss harvesting is the crown jewel of tax management strategies, and it represents what we consider to be the only reliable way for investors to outperform the market … (Malkiel 2023)

1. Introduction

For over 100 years, U.S. law has prohibited claiming a tax loss on a wash sale, where a security is sold for a realized loss (RL) within 30 days of purchasing a “substantially identical” security (26 CFR section 1.1091-1). By disallowing tax losses from the sale and repurchase of substantially identical securities within a short window, the wash sale rule prevents harvesting capital losses with trades that lack economic substance. Although theory argues that investors should immediately harvest capital losses for tax benefits (Constantinides 1983), the wash sale rule significantly limits investors’ ability to do so (Grinblatt and Keloharju 2004, Jensen and Marekwica 2011).

Exchange-traded funds (ETFs) provide an efficient way for investors to circumvent the trading frictions associated with the wash sale rule. Specifically, investors can sell a depreciated ETF security and realize a capital loss while simultaneously purchasing another nearly identical ETF security. This form of swap trading allows investors to maintain a substantially identical economic position while harvesting a capital loss that can be used to offset realized gains (RG) and other taxable income.1 With an explosion in available ETFs over the past two decades, these securities have become ideal vehicles for circumventing the wash sale rule. More than 2,000 ETFs trade today on U.S. exchanges (Lundholm and Zheng 2025), and many track economically equivalent baskets of securities. These broad-based ETFs typically offer greater liquidity and lower fees than traditional diversified investment vehicles, such as mutual funds (Huang and Guedj 2009, Ben-David et al. 2023). More importantly for our purposes, investors in ETFs can buy and sell their shares directly throughout the day; they do not need to purchase (or redeem) their shares from the mutual fund at end-of-day prices.

Our study examines the extent to which investors swap shares of highly correlated (HC) ETFs to circumvent the wash sale rule. Although the economic intent of the wash sale rule is straightforward, significant uncertainty remains as to the permissibility of tax deductions achieved through ETF swaps. The Internal Revenue Service (IRS) has not ruled on what constitutes a “substantially identical” security, leaving financial advisors to navigate a foggy legal landscape. Some advisors seem to take the regulatory silence as tacit permission to swap ETFs that hold identical securities or that are benchmarked to the same index (e.g., Lasser 2011). Others argue that if an investor’s economic position has not changed after swapping ETFs, the spirit of the wash sale rule has likely been violated (e.g., Fischer 2010). Against this backdrop of legal uncertainty, the extent to which investors engage in tax avoidance through ETF wash sales remains largely unknown.

Our analyses proceed in three stages, each employing a different database. In the first stage, we use ETF-level data to examine primary and secondary market demand for incumbent ETFs when a near-identical ETF alternative is introduced. Absent tax effects, economic theory predicts that the introduction of a near-identical alternative security will reduce investor demand for the incumbent ETF as the two funds are economic substitutes. However, if the second ETF provides a wash sale benefit to investors, the two ETFs may serve as complementary goods, leading to increased demand for both assets. We explore this possibility by examining the change in demand for incumbent ETF shares around the introduction of a new near-identical ETF.

Using index ETF data from 1993 through 2022, we identify ETFs with highly correlated returns (i.e., ≥99%) as potential loss-harvesting peers.2 Our focus is on the change in demand for an incumbent ETF at the introduction of a near-identical peer ETF. We approximate primary market demand with ETF shares outstanding (e.g., Brown et al. 2021) and secondary market demand with ETF trading volume (e.g., Ben-David et al. 2018, Li 2024). We then conduct a difference-in-differences (DiD) analysis to determine whether the introduction of a highly correlated ETF increases or decreases demand for the incumbent ETF. To avoid potential heterogeneity in treatment effects that could confound a staggered DiD design, we align the introductions of loss harvest pairs in event time and run a stacked regression that compares the treated ETF with a clean control group of unpaired ETFs (Baker et al. 2022).
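The pair-identification step can be illustrated with a minimal sketch. It assumes a dictionary of aligned return series per ticker and flags every pair whose Pearson correlation is at least 0.99. The tickers, return values, and function names are hypothetical, and the return frequency and estimation window used in the study are not specified here, so this is an illustration rather than the authors' procedure.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length return series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def highly_correlated_pairs(returns, threshold=0.99):
    """Return ticker pairs whose return correlation meets the threshold."""
    pairs = []
    for a, b in combinations(sorted(returns), 2):
        if pearson(returns[a], returns[b]) >= threshold:
            pairs.append((a, b))
    return pairs


# Hypothetical monthly returns (made-up values, for illustration only).
returns = {
    "ETF_A": [0.010, -0.020, 0.030, -0.010, 0.020, -0.030],
    "ETF_B": [0.011, -0.019, 0.029, -0.011, 0.021, -0.029],
    "ETF_C": [0.020, 0.010, -0.030, 0.020, -0.010, 0.000],
}
hc_pairs = highly_correlated_pairs(returns)
print(hc_pairs)
```

Only ETF_A and ETF_B clear the threshold in this toy input; ETF_C is negatively correlated with both and is left unpaired.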

Our results show that demand for incumbent ETFs’ shares increases with the introduction of a new near-identical ETF. First, consistent with an increase in primary market demand, we find a significant increase in the incumbent ETF’s total shares outstanding (indicating net new issuance of ETF shares). Second, consistent with an increase in secondary market demand, we find a significant increase in incumbent ETF trading volume. Our trading volume results confirm those in Li (2024), which provides the first evidence of increases in incumbent ETF trading volume following new ETF introductions using a staggered DiD and a Callaway and Sant’Anna (2021)-style DiD. The findings in our study and Li (2024) broadly support a complementarity effect between pairs of nearly identical ETFs.

One plausible alternative explanation for the increase in shares outstanding and trading volume after ETF introductions is that these introductions coincide with a surge in investor demand for low-cost, diversified portfolios (e.g., Agapova 2011). To the extent that ETF competitors introduce near-identical products when a comparable incumbent product has been especially successful, the increased trading volume may not be because of tax loss harvesting or wash sale avoidance. Instead, the increased activity may simply reflect exceptionally strong growth in the underlying demand for an incumbent ETF product. Furthermore, contemporaneous research has identified other tax advantages to holding ETFs, including their ability to avoid distributing capital gains (Moussawi et al. 2025). To discriminate between these alternative explanations, we analyze institutional investor activities at the micro level.3

In the second stage of our analysis, we use granular trading data from Abel Noser (formerly Ancerno) to document the existence of institutional investor same-day, same-fund swapping of highly correlated ETFs. This database contains all trades made by Abel Noser’s sizeable client base from January 2001 to December 2010. Prior research (e.g., Puckett and Yan 2011, Hu et al. 2018) shows that the characteristics of stocks owned and traded by Abel Noser’s clients do not significantly differ from the characteristics of stocks owned and traded by the average 13F-filing institution.

Our evidence shows that even during the early days of ETF availability (i.e., 2001–2010), institutional investors engaged in direct, same-day swap trades using highly correlated ETFs. Focusing on highly correlated ETFs traded by the same fund in opposite directions on the same trading day, we find that the Abel Noser database contains many such trades. Over our sample, the frequency of near-identical, same-day ETF swaps increases from a single trade in 2001 to more than 1,400 trades in 2010. The existence of these highly correlated ETF swap trades supports the wash sale hypothesis and corroborates the macro-level empirical evidence. Institutional investors incurred trading costs to swap between ETFs with return correlations of 99% or greater, with funds often tracking identical indices. We are unaware of any economic rationale other than tax loss harvesting that could explain this phenomenon.

Although the highly granular Abel Noser data are useful in establishing the existence of wash sale transactions among institutional trades, the limited coverage of these data (in terms of sample, institutional coverage, the lack of holdings data, and our inability to track investors over a time series) prevents us from linking ETF swaps directly to funds’ loss recognition activities. To establish this link more directly and to provide an estimate of the scope of tax avoidance through such activities, we turn to institutional investors’ quarterly filings.

In the third stage of our analysis, we use Form ADV (Uniform Application for Investment Adviser Registration and Report by Exempt Reporting Advisers) and WhaleWisdom 13F data to examine the extent to which institutional investors use ETF wash sales for tax avoidance. We classify institutional investors as tax sensitive if they report non-zero high-net-worth (HNW) clients on Form ADV. Tax-insensitive institutions are identified as either those reporting zero high-net-worth clients on Form ADV or those identified as a pension plan or an endowment via Brian Bushee’s Institutional Investor Classification Database.4 The institutional investors that we cannot classify using these techniques are excluded from the sample.5 WhaleWisdom contains every position that each 13F filer reports to the Securities and Exchange Commission (SEC). We interact 13F filers’ tax sensitivity with measures of highly correlated ETF availability in ordinary least squares (OLS) regressions with filer fixed effects. Our results show that tax-sensitive institutions hold a greater number of highly correlated ETFs and have a significantly larger portion of their assets under management (AUM) in these instruments. Furthermore, tax-sensitive institutions’ investment in highly correlated ETFs has proliferated in recent years, exceeding 25% of AUM. These findings are broadly consistent with highly correlated ETFs offering significant tax benefits to institutional investors.

To further understand whether the increased ETF holdings are driven by incentives to avoid the wash sale rule, we examine the extent to which investors swap highly correlated ETFs. We proxy for institutional trades using quarterly changes in 13F holdings. ETF swaps are identified as the offsetting quarterly buys and sells in separate ETFs with prices correlated at ≥99%. We find that tax-sensitive institutions engage in substantial swapping. These buys and sells of highly correlated ETFs seem to lack economic substance beyond harvesting capital losses. Since 2001, tax-sensitive institutional investors in our sample have swapped approximately $417 billion between highly correlated ETFs, including $106 billion in 2022 alone.
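One way to picture the swap proxy is the sketch below. It assumes that, for a single filer-quarter, the inferred dollar change in each ETF position is known (negative for net sales, positive for net purchases), and it counts the smaller leg of each opposite-direction pair as swapped dollars. The matching rule, the function name, and the dollar figures are assumptions made for illustration; the exact variable construction is in the paper's appendix and may differ.

```python
def swap_dollars(dollar_changes, hc_pairs):
    """Offsetting buy/sell dollars across highly correlated ETF pairs.

    dollar_changes: ticker -> inferred quarterly dollar change
                    (negative = net sale, positive = net purchase).
    hc_pairs: iterable of (ticker_a, ticker_b) highly correlated pairs.
    """
    total = 0.0
    for a, b in hc_pairs:
        da = dollar_changes.get(a, 0.0)
        db = dollar_changes.get(b, 0.0)
        if da * db < 0:  # one leg sold, the other bought
            total += min(abs(da), abs(db))
    return total


# Hypothetical filer-quarter, in $ millions: sells 10 of ETF_A, buys 8 of ETF_B.
changes = {"ETF_A": -10.0, "ETF_B": 8.0, "ETF_C": 5.0}
swapped = swap_dollars(changes, [("ETF_A", "ETF_B")])
print(swapped)  # 8.0
```

This simplified rule would double count an ETF that belongs to several pairs, so a fuller implementation would need to allocate each sale across its candidate peers.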

Finally, to determine whether ETF swapping is indeed associated with the realization of capital losses, we follow Sikes (2014) and Blouin et al. (2017) to estimate institutional investors’ basis in each holding and calculate their realized losses, unrealized losses (UL), and the proportion of losses realized (PLR) in a given quarter (Odean 1998). Our results show a strong positive association between highly correlated ETF swaps and the PLR, with the tax-sensitive institutions’ trading activity being the main driver. On average, a tax-sensitive institution that incurs a one-standard-deviation increase in swapping activity recognizes 1.2% more of its available losses each quarter, a significant increase considering that the median quarterly PLR in our tax-sensitive institutional sample is 17.3%.
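As a rough illustration of the PLR measure, the sketch below follows the Odean (1998) ratio of realized losses to the sum of realized and unrealized losses. Using dollar amounts rather than position counts is an assumption here, as are the function name and the input values. The last line puts the reported effect in context by comparing it with the median PLR, reading the 1.2% as percentage points.

```python
def proportion_of_losses_realized(realized_loss, unrealized_loss):
    """PLR = RL / (RL + UL), in percent; None when no losses are available."""
    available = realized_loss + unrealized_loss
    if available == 0:
        return None
    return 100.0 * realized_loss / available


# Hypothetical filer-quarter, in $ millions.
plr = proportion_of_losses_realized(realized_loss=20.0, unrealized_loss=80.0)
print(plr)  # 20.0

# Reported effect relative to the median PLR of tax-sensitive institutions.
relative_effect = 1.2 / 17.3
print(round(relative_effect, 3))  # 0.069
```

Under that reading, a one-standard-deviation increase in swapping corresponds to roughly a 7% increase relative to the median PLR.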

Our estimate of the extent of incremental capital loss recognition attributable to ETF swaps is affected by at least four sources of potential noise. First, to the extent that different funds within the same filing entity buy and sell offsetting amounts of highly correlated ETFs by happenstance, our estimate of swapping activity will be noisy and may be biased upward or downward. Second, our measure of quarterly changes in holdings will not capture intraquarter swapping of highly correlated ETFs. This source of noise will lead to a downward bias in our estimate of swapping. Third, to identify near-identical ETFs, we impose a high threshold for their correlation in returns (≥99%). If institutional investors use a less stringent threshold, our estimate of tax-related swaps will again be biased downward.6 Finally, given the increase in ETF swapping observed throughout our sample period, the averages that we compute are likely to understate future swapping activities significantly. Considering all four sources of noise, we believe that our estimate of the capital loss recognition attributable to ETF swaps is likely a lower bound on the institutional investors’ tax avoidance.

Our study contributes to several lines of existing research. First, we add to the literature on how institutional investors incorporate tax strategies into their active portfolio management (Sialm and Starks 2012, Blouin et al. 2017). We are unaware of other studies that examine how institutional investors exploit ETFs to harvest capital losses while maintaining a nearly identical portfolio position. Empirical evidence also documents that investors realize losses too late and inefficiently (Odean 1998, Barber and Odean 2004). Our evidence suggests that institutional investors utilize ETFs to circumvent trading frictions imposed by the wash sale rule to capture available capital losses.

Second, we add to the literature on the increasing importance of ETFs in U.S. capital markets. In the past 20 years, the global ETF market has exploded from around 300 ETFs representing $200 billion in 2003 to more than 8,700 ETFs representing $9.5 trillion in 2022 (Statista 2023). Prior research emphasizes the role that ETFs play in passive investment (see Ben-David et al. 2017 and Liebi 2020 for recent reviews) and financial analysis (Lundholm and Zheng 2025). Concurrent work by Moussawi et al. (2025) illustrates the favorable tax treatment of holding ETFs rather than mutual funds. Whereas Moussawi et al. (2025) sheds light on how ETF managers avoid ETF-level taxes in the primary market, our work shows how institutional investors avoid investor-level taxes by trading near-identical ETFs in the secondary market. In particular, we present novel evidence that the accumulation and aggressive swapping of ETFs are concentrated among tax-sensitive investors.

Our work complements a contemporaneous working paper by Li (2024). Like Li (2024), we use ETF-level data to document an increase in trading volume of incumbent ETFs after the introduction of a highly correlated ETF. Both studies also find that the relationship is negatively associated with past returns. However, beyond these common findings, the two papers take different approaches to develop their inferences. Li (2024) develops an analytical model and uses this model to estimate the tax revenue loss and tax alpha from tax-loss harvesting. By “tuning model parameters to match the empirical estimation of the tax-loss trading volume,” Li (2024, p. 6) infers an annual tax revenue loss of “about 0.52% of AUM of highly correlated ETFs, equivalent to 25 billion USD.”

In contrast, we draw our inferences from multiple data sources. We first use ETF-level data to show that new near-identical ETF entries affect incumbent ETF demand in both the primary market and the secondary market. We then use two additional institutional trading databases, one at the trade level and one at the 13F filer level, to estimate the degree to which institutions engage in swapping highly correlated ETFs. In conducting these tests, our research design focuses on partitioning institutions into tax-sensitive and tax-insensitive types. This design allows us to (a) discriminate between higher trading volume because of increased demand for ETF products and higher volume because of tax-loss harvesting and (b) provide a direct estimate of the magnitude of the tax-related swapping activity. Overall, we estimate that 13F filers realized $84 billion of losses from swapping highly correlated ETFs during our sample. The findings in Li (2024) and our study are complementary, and together, they provide robust evidence that highly correlated ETFs are increasingly used to harvest significant losses that would otherwise be disallowed by the wash sale rule.

Finally, we inform the policy debate around loss harvesting. Resource-constrained regulators consider success likelihood when deciding to challenge tax positions (Nessa et al. 2020), and the IRS weighs enforcement costs when deciding whether to clarify tax law (e.g., how to audit whether two ETFs are “too correlated” or “too similar in economic substance”). The IRS has not defined the term “substantially identical” in its wash sale provision, which fosters uncertainty about whether swapping highly correlated ETFs violates the wash sale rule among legal scholars and practitioners (Fischer 2010, Ganel 2024). Our analysis and estimate of the magnitude of the trading activity seemingly driven by institutional investors’ loss harvesting give regulators a lower bound on ETF swapping activity as they weigh continued silence regarding asset similarity.

2. Wash Sale Background

An optimal investment strategy is to realize capital losses immediately and defer capital gains until the event of a forced liquidation (e.g., Constantinides 1983), all while maintaining the same risk/return exposures in the portfolio. However, the disallowance of losses generated via a wash sale creates a meaningful friction to this “realize and rebalance” strategy. The U.S. Federal Government has taken the position that although the goal of tax minimization is not inherently problematic, avoiding taxes with activities that otherwise lack economic purpose will not be permitted.

The U.S. Congress passed the Revenue Act of 1921 (the Act) in November 1921 (Blakey 1922). The Act was projected to “considerably increase the revenue by preventing taxpayers from taking [recognizable] losses in wash sales and other fictitious exchanges” and specifies that no deduction is permitted for losses on a sale “where it appears that within 30 days after such a sale the taxpayer purchases identical securities” (U.S. Senate 1921, p. 14). The congressional prohibition on wash sales coincided with the growing notoriety of a tax-loss-harvesting-via-wash-sale strategy suggested by the popular press (Wall Street Journal 1920). Enforcement of the law fell to the Treasury Department and the Internal Revenue Service. The IRS interprets wash sales via its Publication 550:

A wash sale occurs when you sell or trade stock or securities at a loss and within 30 days before or after the sale you: (1) buy substantially identical stock or securities, (2) acquire substantially identical stock or security in a fully taxable trade, (3) acquire a contract or option to buy substantially identical stock or securities, or (4) acquire substantially identical stock for your individual retirement arrangement or Roth IRA. (Internal Revenue Service 2021)

Thus, investors wishing to recognize a capital loss must sell the security in question and not purchase a “substantially identical” security within 30 days—before or after—the sale. This rule forces investors to choose between recognizing losses and maintaining an optimal portfolio (Jensen and Marekwica 2011). The IRS clarified in Revenue Ruling 2008-5 that investors cannot claim a tax loss on a substantially identical security that is purchased through an individual retirement account (IRA) within 30 days (Silow 2008) and has sanctioned various individuals and entities for violations of the wash sale rule (e.g., Internal Revenue Service 1983 (Estate of Estroff v. Commissioner), 1985 (Rev. Ruling 85-87), 2017 (Technical Advice Memorandum 201737011)).

To effectively circumvent the wash sale rule, an asset must be readily exchangeable for a highly correlated but not identical asset. However, considerable ambiguity remains on what constitutes a “substantially identical” security (e.g., Fischer 2010). This ambiguity has led investors to consider alternative strategies to circumvent the trading frictions imposed by the wash sale rule. With the proliferation of near-identical ETFs, it has become extremely easy for investors to circumvent the wash sale rule by selling and immediately buying back highly correlated securities. But what exactly does “substantially identical” mean in the context of ETFs?

To better understand the current guidance and rules of thumb available to investors, we searched both practitioner and academic sources. Our results, summarized in Figure 1 in the Online Appendix, reveal a wide range of interpretations. Some practitioners employ strict interpretations of what constitutes substantially identical and caution against using highly correlated ETFs to harvest tax losses (Goodman 2015, Matthews 2016, Holt 2023). These authors typically believe that two ETFs are substantially identical if their return correlation and/or holdings overlap exceed a certain threshold. For example, Matthews (2016, p. 12) proposes that “ETFs are substantially identical if they are reasonably expected to: (1) generate returns that mirror each other within a fraction of a percent; and (2) hold portfolios that overlap by 99% or more.”

Other sources adopt more lenient interpretations. One common view is that two ETFs are likely substantially identical if they track the same underlying index. For example, BlackRock (2024, p. 1) states that swapping “ETF or Index Mutual Fund with same index” has a higher “risk of wash sale classification.” Dong (2022) argues that substantially identical requires two ETFs to track the same index: “VOO tracks the Standard and Poor’s (S&P) 500 Index, while VTI tracks the Center for Research in Security Prices (CRSP) U.S. Total Market Index. This makes them sufficiently differentiated in the eyes of the IRS, even though VTI has a 0.99 correlation to VOO.”7 Finally, some authors, such as Meziani and Yang (2012), argue that even tracking the same index does not necessarily make two ETFs substantially identical if they are issued by different trustees and/or are structured differently.

3. Sample and Descriptive Statistics

3.1. Sample

We conduct analyses at both the ETF level and the institutional investor level. We identify ETFs in CRSP, with index ETFs identified as those with share code = 73 in the CRSP monthly stock database and index fund flag = “B” or “D” in the CRSP mutual fund database. We utilize monthly returns, shares outstanding, and trading volume from 1993 to 2022. Removing monthly ETF observations with missing transaction volume, observations where we cannot compute spreads, and observations where the cumulative factor to adjust share prices/shares equals zero leaves us with a sample of approximately 234,000 ETF-month observations (panel A of Table 1 in the Online Appendix).

Our fund-level trading data span from 2001 to 2010 and come from Abel Noser. The data include granular trading activity for Abel Noser’s anonymous institutional clients (Hu et al. 2018 provides a comprehensive discussion of the Abel Noser database). Different vintages of the Abel Noser data have slightly different client and trade information (e.g., the reported numbers of yearly institutions and trades differ in the descriptive statistics of Puckett and Yan 2011 and Hu et al. 2018).

Our institutional-investor-level holdings data come from WhaleWisdom, which provides 13F filing data since 2001 for institutions with AUM of over $100 million.8 Institutional investors must disclose long positions valued at more than $200,000 via Form 13F, which allows us to infer trading activity and wash sale rule avoidance. Observations in WhaleWisdom are at the investor-quarter-ticker level. After requiring identifying information, a match in CRSP, and sufficient information to determine a basis, we have a sample of 143 million investor-quarter-ticker observations. We measure trading activity using differences in holdings from one quarter to the next. For example, if we observe that Filer A holds 5,000 shares of Security B in a quarter and in the next quarter, Filer A reports holding only 1,000 shares of Security B, we infer that Filer A sold 4,000 shares of Security B.9 We merge security prices from CRSP with the WhaleWisdom data set, and then, we estimate the cost basis of securities traded following Sikes (2014) and Blouin et al. (2017).
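The holdings-difference approach to inferring trades can be sketched as follows, using the Filer A and Security B example from the text. The function name is hypothetical, and treating a position that is absent from a filing as zero shares is an assumption, since small positions may simply fall below the reporting threshold.

```python
def infer_trades(prev_holdings, curr_holdings):
    """Net shares traded per ticker between two quarterly filings.

    Positive values are inferred purchases; negative values are inferred sales.
    Tickers missing from a filing are treated as zero shares (an assumption).
    """
    changes = {}
    for ticker in sorted(set(prev_holdings) | set(curr_holdings)):
        delta = curr_holdings.get(ticker, 0) - prev_holdings.get(ticker, 0)
        if delta != 0:
            changes[ticker] = delta
    return changes


# Filer A holds 5,000 shares of Security B, then 1,000 shares one quarter later.
prev_q = {"Security B": 5000}
curr_q = {"Security B": 1000}
trades = infer_trades(prev_q, curr_q)
print(trades)  # {'Security B': -4000}
```

Because only quarter-end positions are observed, round-trip trades within a quarter net to zero and are invisible to this measure.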

We identify the tax sensitivity of institutional investors via reported clientele on Form ADV. Form ADV is a required filing for professional investment advisors that provides information on the institution’s investment approach, employees, and clientele. We match Form ADV filings to 13F filers and define a tax-sensitive filer as one that serves high-net-worth individuals as clients. Tax-insensitive institutional advisors are those that either report zero HNW individual clients on Form ADV or are identified as pensions or endowments via Brian Bushee’s Institutional Investor Classification Database. Investors who file a Form 13F but not a Form ADV and are not identified as pensions or endowments remain unclassified and do not appear in our sample.

Panel B of Table 1 in the Online Appendix details our sample selection at the 13F filer level, beginning with investor positions available in 13F filings from 2001 to 2022. To conduct our tests, we require non-missing information from WhaleWisdom, CRSP, and Form ADV as well as investor-level AUM and trading activities above zero over our sample. We progressively collapse the sample toward measures aggregated at the filer-quarter level, which is the unit of observation for our second set of empirical tests. We are left with 208,252 filer-quarter observations.

3.2. Descriptive Statistics

Panel A of Table 1 presents our descriptive statistics at the ETF-month level. The median (mean) monthly trading volume is 709,000 (12.9 million) shares. The median (mean) ETF market value is $140.3 million ($1.67 billion). Roughly 30% of our ETF-month observations have a highly correlated peer ETF (i.e., ≥99% return correlation).

Table 1. Descriptive Statistics

Panel A: CRSP ETFs, market-level statistics

Variable              N        Mean    Standard deviation  P10     P25     Median  P75     P90
ln(Volume)            234,073  8.90    2.55                5.66    7.20    8.87    10.57   12.23
Volume                234,073  12.93   51.73               0.03    0.13    0.71    3.90    20.51
ln(SharesOut)         234,073  8.31    2.13                5.30    6.80    8.31    9.83    11.19
SharesOut             234,073  28.08   70.70               0.20    0.90    4.05    18.65   72.10
HC ETF Peer           234,073  0.30    0.46                0.00    0.00    0.00    1.00    1.00
Spread                234,073  0.00    0.01                0.00    0.00    0.00    0.00    0.01
ln(Market Value)      234,073  18.81   2.32                15.70   17.08   18.76   20.43   21.98
Market Value          234,073  1,665   5,411               6.60    26.22   140.30  743.70  3,502
Volatility            234,073  0.05    0.03                0.01    0.03    0.04    0.06    0.09
B&H Trailing 12       234,073  0.06    0.22                −0.18   −0.05   0.05    0.17    0.30

Panel B: Overall 13F filer sample

Variable              N        Mean    Standard deviation  P10     P25     Median  P75     P90
AUM                   208,252  3,146   10,988              89.60   153.76  349.22  1,198   5,161
HC ETF AUM            208,252  130.83  536.19              0.00    0.00    3.49    63.26   232.84
Volume                208,252  1,106   3,952               11.88   30.70   101.79  436.19  1,869
Turnover              208,252  0.43    0.50                0.06    0.13    0.26    0.55    0.99
Realized Loss         208,252  30.47   142.12              0.01    0.21    1.47    8.94    46.31
Realized Gain         208,252  66.56   245.07              0.12    0.83    4.24    23.27   115.65
Unrealized Loss       208,252  80.84   374.75              0.27    1.41    5.79    25.24   124.42
Unrealized Gain       208,252  683.47  2,851               3.56    14.01   50.41   195.53  897.04
Realized Loss/Vol     208,252  0.04    0.07                0.00    0.00    0.02    0.04    0.09
Unrealized Loss/Vol   208,252  0.25    0.75                0.00    0.01    0.06    0.19    0.55
Unique HC ETFs        208,252  13.43   27.09               0.00    0.00    2.00    15.00   39.00
HC ETF Intensity      208,252  13.46   22.98               0.00    0.00    0.61    17.24   50.81
Tax Sensitivity       208,252  0.70    0.46                0.00    0.00    1.00    1.00    1.00
HC ETF Swaps          208,252  0.20    0.60                0.00    0.00    0.00    0.00    0.57
Available HC ETFs     208,252  362.02  206.70              68.00   175.00  330.00  554.00  658.00
PLR                   204,110  28.69   27.33               1.05    6.42    19.93   44.01   73.15
PGR                   206,442  15.48   19.73               0.68    2.39    7.49    20.34   43.31
HC ETF PLR            88,595   33.10   37.24               0.00    0.94    14.99   62.66   100.00
HC ETF PGR            115,571  14.51   26.98               0.00    0.10    1.87    12.65   54.91

Panel C: Differences between tax-sensitive and tax-insensitive 13F filers

Variable              Tax sensitive mean  Tax insensitive mean  Standardized difference [(TS − TINS)/σ]
AUM $                 2,520               4,635                 −0.19***
HC ETF AUM $          126.90              140.19                −0.02
Volume $              804.60              1,822                 −0.26***
Turnover              0.34                0.64                  −0.60***
Realized Loss $       22.21               50.13                 −0.20***
Realized Gain $       48.99               108.37                −0.24***
Unrealized Loss $     60.87               128.40                −0.18***
Unrealized Gain $     563.08              970.13                −0.14***
Realized Loss/Vol     0.04                0.04                  −0.10***
Unrealized Loss/Vol   0.25                0.26                  −0.01
Unique HC ETFs        17.53               3.66                  0.51***
HC ETF Intensity      16.05               7.31                  0.38***
HC ETF Swaps          0.25                0.07                  0.29***
PLR                   25.77               35.89                 −0.37***
PGR                   11.28               25.66                 −0.73***
HC ETF PLR            30.95               45.51                 −0.39***
HC ETF PGR            11.25               29.50                 −0.68***


Notes. Panel A presents summary statistics for the CRSP ETF data. The unit of analysis is ETF-month. Pre-transformed Volume, SharesOut, and Market Value are provided for convenience (in millions). Panel B presents the summary statistics for the WhaleWisdom 13F data. The unit of observation is filer-quarter. Variables are defined in Appendix B. Dollar values are in millions. PLR, PGR, and HC ETF Intensity are percentages. Panel C presents variable means for tax-sensitive and tax-insensitive filers separately. The rightmost column presents the difference in means scaled by the population standard deviation. The t-tests in panel C are based on filer and year-quarter clustered standard errors.

 ***Significance at 1%.

Moving to our institutional investor data in panel B of Table 1, we classify 70% of filer-quarters as tax sensitive. Across the full sample, institutional investors have median (mean) assets under management of $349 million ($3.1 billion) and hold highly correlated ETFs equal to 1% (13%) of total AUM. They also exhibit a greater average propensity to realize losses than gains (28.7% versus 15.5%). Panel C of Table 1 presents the differences in means between tax-sensitive and tax-insensitive institutions. Tax-sensitive filers tend to have less AUM ($2,520 million versus $4,635 million) but hold more highly correlated ETF tickers (18 versus 4) and devote more of their portfolios to highly correlated ETFs (16% versus 7%). These differences are consistent with ETFs providing tax advantages for tax-sensitive filers.10
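The standardized differences in panel C can be spot-checked from the reported group means and the full-sample standard deviations in panel B, since the table notes describe the measure as the difference in means scaled by the population standard deviation. The sketch below recomputes three rows; the function and variable names are illustrative only.

```python
def standardized_difference(mean_ts, mean_tins, sd):
    """Difference in means (tax sensitive minus tax insensitive) scaled by sd."""
    return (mean_ts - mean_tins) / sd


# (tax-sensitive mean, tax-insensitive mean, panel B standard deviation)
rows = {
    "AUM": (2520, 4635, 10988),
    "Turnover": (0.34, 0.64, 0.50),
    "Unique HC ETFs": (17.53, 3.66, 27.09),
}
diffs = {name: round(standardized_difference(*vals), 2) for name, vals in rows.items()}
print(diffs)
```

The recomputed values (−0.19, −0.60, and 0.51) agree with the rightmost column of panel C.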

4. Empirical Design and Results

We predict that the availability of ETFs mitigates a trading friction facing investors: harvesting losses disrupts portfolios. We start our analyses at the macro level by exploring whether the introduction of a near-identical ETF increases or decreases demand for the incumbent ETF.

4.1. Do Emerging ETFs Act as Complements or Substitutes for Incumbent ETFs?

An ETF is only useful for avoiding the wash sale rule if at least one other highly correlated ETF is available to swap. Specifically, in this setting, a “treated” ETF would be a security with a near-identical peer available such that investors can exit their current ETF position and enter a new one without meaningfully changing their existing aggregate portfolio.

Li (2024), in an independently developed working paper, provides evidence that the trading volume of incumbent ETFs increases upon the introduction of a highly similar peer ETF using a staggered DiD as the main specification. There is a range of practitioner and legal opinions regarding the legality of tax loss harvesting with highly correlated ETFs, with several opinions concentrating at a 99% correlation threshold (e.g., Jennings et al. 2020, Wealthfront 2021, Dong 2022, Holt 2023). Other academic work considers ETF similarity based on return correlation thresholds ranging from 90% (Khomyn et al. 2024) to 99% (Brown et al. 2022, Li 2024). We use a 99% threshold as a starting point to examine how incumbent ETFs respond to the introduction of a highly correlated ETF.

In contrast to Li (2024), we study the complementary versus substitutive effects of the introduction of highly correlated peers in the primary market (i.e., share creation) as well as the secondary market (i.e., trading volume). We implement a “stacked regression” approach (Gormley and Matsa 2011, Baker et al. 2022, Dambra et al. 2024) that estimates average treatment effects within a cohort. The stacked regression allows us to obtain clean counterfactuals by “stacking” cohort-specific data sets to avoid contamination from comparing newly treated ETFs with those already treated. Specifically, we create a cohort for each year-month in which incumbent ETFs have near-identical peers introduced.11 Over our sample period, 110 cohorts are established, and we analyze ETF volume over the 36 months before and after the introduction of a paired ETF. We retain “never treated” ETFs that existed during the same window as control observations.12 This creates a 2 × 2 data set for each cohort (Baker et al. 2022).
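The cohort stacking described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' code: `build_stack`, the integer month index, the `first_peer_month` mapping, and the choice to code the introduction month itself as pre-period are all hypothetical.

```python
WINDOW = 36  # months kept before and after each introduction (from the text)


def build_stack(panel, first_peer_month):
    """
    panel: list of (etf, month_index) observations.
    first_peer_month: etf -> month index in which its first highly correlated
        peer is introduced; ETFs missing from the dict are never treated.
    Returns one row per ETF-month-cohort with Treat and Post indicators.
    """
    cohorts = sorted(set(first_peer_month.values()))
    rows = []
    for j, event in enumerate(cohorts):
        for etf, month in panel:
            rel = month - event
            if abs(rel) > WINDOW:
                continue  # outside the cohort's event window
            treated_here = first_peer_month.get(etf) == event
            never_treated = etf not in first_peer_month
            if not (treated_here or never_treated):
                continue  # drop ETFs treated in some other cohort
            treat = int(treated_here)
            post = int(rel > 0)  # assumption: month 0 counts as pre-period
            rows.append({"cohort": j, "etf": etf, "month": month,
                         "treat": treat, "post": post,
                         "treat_x_post": treat * post})
    return rows


# Toy panel: A is treated in month 10, B in month 50, C is never treated.
panel = [(etf, m) for etf in ("A", "B", "C") for m in range(61)]
stack = build_stack(panel, {"A": 10, "B": 50})
```

Because each cohort keeps only its own treated ETFs and the never-treated controls, an ETF treated in another cohort never serves as a control, which is the contamination the stacked design is meant to avoid.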

Using this stacked data set, we conduct an analysis similar to that in Li (2024). Our focus is on the introduction of the first highly correlated ETF, meaning that an ETF is treated no more than once. For each 2 × 2 data set, we create a data set-specific identifier (e.g., cohort j). We then estimate

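$$[\text{Outcome}]_{itj} = \beta_1\,\text{Treat}_{ij} \times \text{Post}_{tj} + \beta\,\text{Controls} + \alpha_{ij} + \lambda_{tj} + \varepsilon_{it}, \tag{1}$$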
where the dependent variable in Equation (1), [Outcome], denotes measures of incumbent ETF demand. We estimate primary market demand via the number of ETF shares outstanding (e.g., Brown et al. 2021), where ln(SharesOut) is the natural log-transformed monthly shares outstanding of each ETF i in our sample for month t in cohort j. Secondary market activity is captured by ln(Volume), the natural log-transformed monthly trading volume of each ETF i in our sample for month t in cohort j. Treat equals one for ETF i if a new ETF is introduced in cohort j with returns correlated at ≥99% with ETF i and is zero otherwise. As in Li (2024) and Khomyn et al. (2024), we measure the monthly return correlations between ETFs. We require at least 12 months of overlap to generate the return correlations, but we utilize the entire overlapping life of the ETFs.
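As a rough illustration of how the Treat indicator could be derived, the sketch below computes a Pearson correlation over the full overlapping life of two ETFs and applies the 12-month minimum and the 99% cutoff. The function names and the return series are invented for the example; only the two thresholds come from the text.

```python
from math import sqrt

MIN_OVERLAP = 12   # minimum months of overlapping returns (from the text)
THRESHOLD = 0.99   # correlation cutoff for a highly correlated peer (from the text)


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0  # assumption: treat a constant series as uncorrelated
    return sxy / sqrt(sxx * syy)


def is_highly_correlated(returns_a, returns_b):
    """returns_a, returns_b: dicts mapping a month label to a monthly return."""
    common = sorted(set(returns_a) & set(returns_b))  # entire overlapping life
    if len(common) < MIN_OVERLAP:
        return False
    xs = [returns_a[m] for m in common]
    ys = [returns_b[m] for m in common]
    return pearson(xs, ys) >= THRESHOLD


# Made-up monthly returns, for illustration only.
months = [f"2004-{m:02d}" for m in range(1, 13)] + ["2005-01", "2005-02"]
base = [0.010, -0.020, 0.030, 0.015, -0.010, 0.020, -0.030,
        0.005, 0.012, -0.007, 0.018, -0.022, 0.009, 0.004]
incumbent = dict(zip(months, base))
# A peer that tracks the incumbent with a tiny alternating tracking error.
peer = {m: r + (0.0002 if k % 2 else -0.0002)
        for k, (m, r) in enumerate(zip(months, base))}
inverse = {m: -r for m, r in incumbent.items()}   # perfectly negatively correlated
young_peer = {m: peer[m] for m in months[:6]}     # only 6 months of overlap

treat_peer = is_highly_correlated(incumbent, peer)
treat_inverse = is_highly_correlated(incumbent, inverse)
treat_young = is_highly_correlated(incumbent, young_peer)
```

In this toy data only the close tracker qualifies; the inverse fund fails the correlation cutoff and the young peer fails the overlap requirement.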

This construction uses future realizations to determine highly correlated ETFs. To the extent that investors are uncertain of how closely two index-based ETFs will comove immediately upon the introduction of a new ETF, this construction should bias against detecting any share or volume changes in the original ETF. We set Post equal to one for observations in cohort j that occur in the 36 months after the introduction of the highly correlated peer ETF and to zero otherwise.

The coefficient of interest in Equation (1) is β1. To the extent that the introduction of a near-identical ETF acts as a substitute for an incumbent ETF (as economic theory would predict upon the introduction of a nearly identical asset), we expect β1 < 0. Alternatively, to the extent that introducing a near-identical ETF provides a convenient investment tool to harvest losses and maintain near-identical portfolio allocations, we would expect β1 > 0. Following prior ETF literature, our control matrix includes the size of the ETF, the bid-ask spread, volatility, and the prior year’s buy-and-hold performance (Li 2024, Moussawi et al. 2025). We include ETF-cohort ($\alpha_{ij}$) and month-cohort ($\lambda_{tj}$) fixed effects, which control for ETF- and time-invariant factors that may affect ETF trading activity. Given the inclusion of fund-cohort and month-cohort fixed effects, the individual $\text{Treat}_{ij}$ and $\text{Post}_{tj}$ terms are omitted from Equation (1). Non-discrete variables in our regression analyses are winsorized by year at the 1st and 99th percentiles, and we cluster our standard errors by ETF.

We start our regression analyses by excluding control variables because time-varying covariates in a stacked regression can partially capture treatment effects and confound inferences (Gao and Huang 2020). In column (1) in Table 2, we observe a 20% increase in shares outstanding for the incumbent ETF in the 36 months after the introduction of a highly correlated paired ETF (coefficient = 0.186, p < 0.01).13
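Because the dependent variables are in logs, a coefficient translates into a percentage change through exp(β) − 1. The short check below applies that standard conversion to the Treat × Post coefficients in columns (1) and (3) of Table 2; the helper name `log_coef_to_pct` is ours.

```python
from math import exp


def log_coef_to_pct(beta):
    """Percent change in the level of Y implied by a coefficient on ln(Y)."""
    return (exp(beta) - 1.0) * 100.0


shares_effect = log_coef_to_pct(0.186)  # Treat x Post, column (1), ln(SharesOut)
volume_effect = log_coef_to_pct(0.269)  # Treat x Post, column (3), ln(Volume)
print(round(shares_effect, 1), round(volume_effect, 1))
```

Rounded to whole percentages, these correspond to increases of roughly 20% in shares outstanding and 31% in trading volume.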

Table 2. Effects of Introducing a Highly Correlated Partner ETF

| Variable | (1) Primary: ln(SharesOut) | (2) Primary: ln(SharesOut) | (3) Secondary: ln(Volume) | (4) Secondary: ln(Volume) |
|---|---|---|---|---|
| Treat × Post | 0.186*** | 0.213*** | 0.269*** | 0.103** |
| | (3.27) | (3.57) | (3.90) | (2.50) |
| Spread | | −14.150*** | | −7.249*** |
| | | (−12.98) | | (−8.76) |
| Volatility | | 1.774*** | | 4.989*** |
| | | (4.29) | | (14.08) |
| ln(Market Value) | | | | 0.842*** |
| | | | | (62.34) |
| B&H Trailing 12 | | 0.233*** | | 0.411*** |
| | | (5.03) | | (9.11) |
| B&H Trailing 12 × Treat | | 0.059 | | −0.292*** |
| | | (0.68) | | (−3.69) |
| B&H Trailing 12 × Post | | 0.146*** | | −0.267*** |
| | | (3.94) | | (−7.05) |
| B&H Trailing 12 × Treat × Post | | −0.291*** | | −0.234** |
| | | (−2.59) | | (−2.34) |
| Observations | 4,690,132 | 4,690,132 | 4,690,132 | 4,690,132 |
| Fixed effects | Cohort × date, cohort × ETF | Cohort × date, cohort × ETF | Cohort × date, cohort × ETF | Cohort × date, cohort × ETF |
| Adjusted R² | 0.912 | 0.914 | 0.852 | 0.903 |


Notes. This table presents results of stacked DiD regressions that examine monthly ETF primary and secondary market activity surrounding the introduction of a highly correlated peer. ln(SharesOut) is the natural log of month-end shares outstanding. ln(Volume) is the natural log of monthly trading volume. Treat is equal to one for an ETF incumbent if another ETF within a cohort has returns that are correlated at 99% or more and is equal to zero otherwise. Post is equal to one for observations that occur in the 36 months following the highly correlated peer ETF introduction and is equal to zero otherwise. In column (2), we omit ln(Market Value) to avoid multicollinearity between market value and shares outstanding. Cohorts are created with never treated control observations. Standard errors are clustered by ETF, and t-statistics are reported in parentheses. Appendix B contains detailed variable definitions.

 **Significance at 5%; ***significance at 1%.

We expect ETF swapping demand to be more prominent when investors generate capital losses on their investment holdings. In column (2) in Table 2, we interact Treat × Post with the trailing 12-month buy-and-hold return (B&H Trailing 12). A 12-month return window coincides with tax rules for long-term capital gains and losses, and it is consistent with the window considered in Brown et al. (2022) and Li (2024).14 We find that the main effect of high prior returns is increased shares outstanding. However, the coefficient on the triple interaction is negative (coefficient = −0.291, p < 0.01), indicating that shares outstanding following the introduction of a paired ETF increase (decrease) when the incumbent ETF’s trailing returns are negative (positive). This pattern is symptomatic of traders being more likely to exit (hold) their existing positions to generate capital losses (delay recognition of capital gains). We further explore the relation between ETF holdings and 13F filer loss recognition in Section 4.5.

We find similar evidence to Li (2024) in the secondary market with ln(Volume) as the dependent variable. The coefficient on Treat × Post in column (3) in Table 2 represents a 31% increase in ETF trading volume in the postperiod. Collectively, our results suggest that the introductions of near-identical ETFs act as complements to incumbent ETFs. To further analyze time-series changes in shares outstanding and trading volume for newly paired ETFs, we analyze the differential trends in shares outstanding and trading volume between treatment ETFs (i.e., incumbents) and control ETFs (i.e., funds where a highly correlated fund is not introduced) in the 36 months surrounding the introduction of a near-identical ETF. In the pre-ETF introduction period, we observe relatively stable shares outstanding and volume differentials between treatment and control ETFs. Post-ETF introduction, we observe a marked increase in the difference in both shares outstanding and trading volume between treatment and control ETFs (see Figure 1). This illustrative evidence supports our conjecture that the demand for a newly paired ETF is permanent and that the increase in ETF differential shares outstanding and trading activity only occurs after the introduction of a new ETF.

Figure 1. (Color online) Event Time Differences in ETFs with Highly Correlated Partners
Notes. This figure presents a graph of the ETF market in event time, where period 0 is the month in which an ETF has a highly correlated peer ETF introduced. Treated ETFs are those that have a highly correlated partner, and control ETFs are those that do not. The differences in both primary market activity and secondary market activity are plotted (i.e., standardized shares outstanding or trading volume for treated ETFs minus standardized shares outstanding or trading volume for control ETFs) for each month t individually for months between t = −24 and t = 24 and jointly for months −36 ≤ t ≤ −25 and 25 ≤ t ≤ 36.

The results thus far are highly suggestive of ETF wash sale activities. However, to the extent that ETF competitors introduce near-identical products when a comparable incumbent product has been especially successful, our results may not reflect tax loss harvesting. If these ETF introductions coincide with strong surges in underlying demand for indexed ETF products, the increased trading volume and share creation may reflect a strong growth trend in the sector rather than tax loss harvesting. Furthermore, increases in ETF trading volume and share creation alone would not link institutional investors to wash sale avoidance. For these reasons, we turn to institutional trading data to provide additional evidence.

4.2. Granular Evidence of Swap Trades

The complementary relation between incumbent and new-entry ETFs suggests that investors use near-identical assets to harvest tax losses. If that is what drives the increased demand in redundant ETFs, we should observe institutional investors swapping between highly correlated ETFs within the wash sale window. To test this conjecture, we exploit proprietary daily institutional investor trading data from Abel Noser. We start by counting the instances in which the same fund both buys and sells HC ETFs on the same day. In Figure 2, we observe that the frequency of HC ETF swaps in our Abel Noser sample has risen sharply, from a handful in the early 2000s to more than 1,400 in 2010. Notably, our tranche of Abel Noser data predates the recent ETF boom and only includes 8%–15% of the total dollar amount of U.S. institutional investor trades (Puckett and Yan 2011, Hu et al. 2018). Nevertheless, we observe a marked increase in institutional investor transactions consistent with harvesting tax losses as more HC ETFs come online through the early 2000s.

Figure 2. Institutional Investors and Highly Correlated ETFs
Notes. This figure plots the frequency of same day, same fund trades between highly correlated ETFs in the Abel Noser data. The ClientMgrCode data field from Abel Noser is used to identify intrafund activity. Only investment managers (ClientTypeCode = 2) are included in the chart.

For illustrative purposes, in Appendix A, we provide examples of several ETF swaps across different Abel Noser clients. These trades are executed on the same day by the same fund. For example, Abel Noser client #732 sold $3,615,739 of IVV and purchased $3,563,270 of SPY on December 21, 2004. The correlation between IVV and SPY monthly price movements in the year leading up to the swap trade was 1.000, consistent with the client expecting tax advantages from the trade rather than differences in pretax returns. Thus, client #732 engaged in trades totaling $7.2 million in dollar volume while shifting its economic position by only $52,000 (less than 1% of the dollar volume). We are unaware of any economic rationale other than tax loss harvesting that explains these trades.15
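The same-day, same-fund matching can be illustrated with a minimal sketch that pairs a client's sells with its buys of a highly correlated ETF on the same date, then reports the gross dollar volume and the net shift in exposure. The record layout, the `same_day_swaps` function, and the `HC_PAIRS` set are hypothetical; the single trade record reproduces the client #732 example from the text.

```python
from collections import defaultdict

# Hypothetical set of highly correlated pairs (correlation of 99% or more).
HC_PAIRS = {frozenset(("IVV", "SPY"))}


def same_day_swaps(trades, hc_pairs=HC_PAIRS):
    """Pair each sell with same-day buys of a highly correlated ETF by the same client."""
    by_key = defaultdict(list)
    for t in trades:
        by_key[(t["client"], t["date"])].append(t)

    swaps = []
    for (client, date), group in by_key.items():
        sells = [t for t in group if t["side"] == "sell"]
        buys = [t for t in group if t["side"] == "buy"]
        for s in sells:
            for b in buys:
                if frozenset((s["ticker"], b["ticker"])) in hc_pairs:
                    swaps.append({
                        "client": client,
                        "date": date,
                        "sold": s["ticker"],
                        "bought": b["ticker"],
                        "gross": s["dollars"] + b["dollars"],     # total dollar volume
                        "net": abs(s["dollars"] - b["dollars"]),  # shift in exposure
                    })
    return swaps


trades = [
    {"client": 732, "date": "2004-12-21", "ticker": "IVV", "side": "sell", "dollars": 3_615_739},
    {"client": 732, "date": "2004-12-21", "ticker": "SPY", "side": "buy", "dollars": 3_563_270},
]
swaps = same_day_swaps(trades)
net_share = swaps[0]["net"] / swaps[0]["gross"]
```

For this trade pair the gross volume is $7,179,009 and the net shift is $52,469, which is under 1% of the dollar volume, in line with the figures reported above.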

4.3. Estimating the Magnitude of ETF-Based Wash Sale Avoidance

Abel Noser data are not well suited to estimating the magnitude of loss harvesting. First, Abel Noser trading data extend from 2001 to 2010 when ETF trading volume averaged only 25% of 2022 volume. Second, Abel Noser’s fund manager identifier (ClientMgrCode) is not stable across time, limiting our ability to identify time-series variation in institutional investor trading (see Hu et al. 2018 for a review of Abel Noser data features).16 Finally, the data include trades by institutions but not their historical holdings, making it difficult to estimate investment bases and calculate gains and losses. To address these limitations and to more broadly estimate the magnitude of tax-motivated ETF holdings, trading volume, and capital loss recognition across a comprehensive panel of tax-sensitive institutional traders, we turn to 13F filings. We start these analyses by first examining whether tax-sensitive investors relative to tax-insensitive investors are more likely to invest in highly correlated ETFs. We estimate the following model:

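$$[\text{HC ETF Preference}]_{it} = \beta_1\,\text{Tax Sensitivity}_i + \beta_2\,\text{Available HC ETFs}_t + \beta_3\,\text{Tax Sensitivity}_i \times \text{Available HC ETFs}_t + \theta_i + \varepsilon_{it}. \tag{2}$$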

We measure HC ETF Preference as either Unique HC ETFs (the distinct number of highly correlated ETFs held by investor i in quarter t) or HC ETF Intensity (highly correlated ETF AUM holdings over total AUM holdings for investor i in quarter t). Both are measures of a preference for and adoption of highly correlated ETFs. With quarterly 13F data, our unit of observation is at the filer-quarter level. We define Tax Sensitivity as an indicator equal to one when a 13F filer reports non-zero HNW clients on Form ADV and zero if there are no HNW clients or if the institution is classified as a pension or endowment. Investment advisors file Form ADV with the SEC and must disclose information about their organization and clientele, including how many of their clients fall into various categorizations (e.g., individuals, high-net-worth individuals, pensions, and governments). We collected Form ADVs from 2009 to 2022. We merge the time series of ADV classifications with the WhaleWisdom data using a fuzzy match of multiple identifiers (year, quarter, Central Index Key (CIK), name, and phone number).17 We then identify the modal value of Tax Sensitivity for each institution and apply that classification for all periods.18 We predict that, ceteris paribus, tax-sensitive institutional investors will hold more near-identical ETFs (β1 > 0).

We further predict that tax-sensitive institutional investors will become more active in accumulating near-identical ETFs as the number of available highly correlated ETFs increases. We measure Available HC ETFs as the standardized count of unique highly correlated ETFs available in CRSP each quarter t of our sample. The interaction between Tax Sensitivity of institution i and Available HC ETFs in quarter t captures our empirical prediction. A β3 > 0 suggests that tax-sensitive institutions intensify their ETF holdings more rapidly than their peers as ETF availability increases. We also include filer-level fixed effects (θi) in some specifications.19
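Two of the variable constructions above are simple enough to sketch: applying the modal Tax Sensitivity value to every period of a filer, and standardizing the quarterly count of available HC ETFs. The helper names and the inputs are invented, and the use of the population standard deviation is our assumption, since the text does not say which version is used.

```python
from statistics import mean, mode, pstdev


def modal_flag(flags):
    """Most common Tax Sensitivity value across a filer's Form ADV filings."""
    return mode(flags)


def standardize(counts):
    """Z-score a series of quarterly counts (population SD assumed)."""
    mu, sigma = mean(counts), pstdev(counts)
    return [(c - mu) / sigma for c in counts]


# Invented inputs, for illustration only.
filer_flags = [1, 1, 0, 1]                         # one filer's yearly classifications
available_counts = [2, 4, 4, 4, 5, 5, 7, 9]        # HC ETFs available per quarter

tax_sensitivity = modal_flag(filer_flags)          # applied to all of the filer's quarters
available_hc_etfs = standardize(available_counts)
```

With a standardized regressor, β2 and β3 in Equation (2) read as the change in the outcome per one-standard-deviation increase in ETF availability.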

We start our empirical analysis by illustrating time-series changes in institutional investor holdings between tax-sensitive and tax-insensitive investors. Panel (a) of Figure 3 shows that Unique HC ETFs have increased significantly more rapidly for tax-sensitive investors than tax-insensitive investors. In panel (b) of Figure 3, we show that HC ETF Intensity increased rapidly for tax-sensitive investors over the past 20 years, whereas HC ETF Intensity has remained largely flat for tax-insensitive investors since 2010. Although this provides descriptive evidence of tax-sensitive investors’ preference for highly correlated ETFs, we explore this prediction further in our empirical analyses.

Figure 3. (Color online) Institutional Investors and Highly Correlated ETFs
Notes. This figure presents descriptive graphs of the ETF market for each year of our sample. Panel (a) shows the change in average quarterly count of HC ETFs held for both investor types. Panel (b) shows the change in HC ETF Intensity (defined as HC ETF AUM/AUM) for both investor types over time. Data are from CRSP and WhaleWisdom. (a) Average quarterly count of unique HC ETFs held. (b) Average quarterly HC ETF intensity.

Table 3 presents the results of estimating Equation (2) with Unique HC ETFs as the dependent variable. In column (1) in Table 3, we find that tax-sensitive investors hold 13 more unique HC ETFs than tax-insensitive institutional investors (coefficient = 13.488, p < 0.01). In column (2) in Table 3, we interact Tax Sensitivity with Available HC ETFs. We find that as more HC ETFs become available, tax-sensitive institutions are incrementally more likely to own a greater number of distinct highly correlated ETF tickers. The second and fourth specifications include a filer fixed effect and thus, omit the static Tax Sensitivity measure. Column (3) in Table 3 presents results with HC ETF Intensity as the dependent variable. Our main effect is consistent with tax-sensitive institutions holding 8.4 percentage points more of their AUM in highly correlated ETFs (p < 0.01). The interaction of Tax Sensitivity and Available HC ETFs is positive but statistically insignificant for the full sample (t-statistic: 1.50). Panel (b) of Figure 3 shows that the intensity of the tax-sensitive institutional holdings accelerates relative to the tax-insensitive institutions only in the latter half of the sample. We confirm this by reperforming the test from column (4) in Table 3 after excluding the first five years of the sample and find positive and significant results (untabulated) consistent with tax-sensitive institutions having an incremental preference for highly correlated ETFs.

Table 3. Unique HC ETFs Held and HC ETF Intensity

| Variable | (1) Unique HC ETFs | (2) Unique HC ETFs | (3) HC ETF Intensity | (4) HC ETF Intensity |
|---|---|---|---|---|
| Tax Sensitivity | 13.488*** | | 8.441*** | |
| | (10.32) | | (8.51) | |
| Available HC ETFs | 9.343*** | 2.847*** | 7.375*** | 2.059*** |
| | (43.28) | (6.59) | (34.24) | (6.38) |
| Tax Sensitivity × Available HC ETFs | | 5.777*** | | 0.479 |
| | | (9.75) | | (1.50) |
| Constant | 3.931*** | | 7.520*** | |
| | (4.44) | | (10.08) | |
| Observations | 208,252 | 208,140 | 208,252 | 208,140 |
| Fixed effects | None | Filer | None | Filer |
| Adjusted R² | 0.174 | 0.803 | 0.133 | 0.845 |


Notes. This table presents results of regressing Unique HC ETFs and HC ETF Intensity on Tax Sensitivity and Available HC ETFs. The dependent variable Unique HC ETFs is the count of distinct ETFs held at the end of the quarter. The dependent variable HC ETF Intensity is equal to (HC ETF AUM/Total AUM) ×100 at the end of each quarter. We measure Tax Sensitivity as an indicator equal to one if an institution serves high-net-worth individuals. We measure Available HC ETFs as the standardized count of unique HC ETFs available in CRSP each quarter from 2001 to 2022. Standard errors are clustered by filer and by year-quarter, and t-statistics are reported in parentheses.

 ***Significance at 1%.

4.4. Establishing ETF Swapping as an Institutional Investor Tax Strategy

Although simple, the results in Table 3 and illustrations in Figure 3 establish an important empirical fact. There is a starkly different appetite for highly correlated ETFs based on the tax sensitivity of an institutional investor. It can be inferred that ETFs offer something to tax-sensitive institutions that is not appealing to tax-insensitive investors. However, the relation between ETF holdings and investor type may be driven by investor clientele effects unrelated to tax sensitivity. For instance, Bhattacharya et al. (2017) find that retail investors tend to return chase using ETFs, and the strong market performance of the S&P 500 in recent years could drive the uptick in ETF AUM among tax-sensitive investors.

Alternatively, the increase in ETF holdings could be driven by tax minimization preferences for ETFs unrelated to skirting wash sale rules. Moussawi et al. (2025) argue that the increase in tax-sensitive institutional investor holdings is influenced by the ability of ETFs to avoid within-ETF capital gains taxes from portfolio position changes. ETFs can use “heartbeat trades” (i.e., creation and in-kind redemption in the primary market) to avoid distributing taxable capital gains from trading appreciated securities (e.g., around index reconstitutions).

In this section, we examine whether 13F filers swap ETFs that are highly correlated with one another. In other words, we identify instances where an investor sells one ETF (e.g., VOO) while simultaneously buying an offsetting amount of another ETF (e.g., IVV). We focus on secondary market trading activity to better understand whether the increase in ETF holdings documented above can be attributable to avoiding the wash sale rule. Secondary market swapping of highly correlated ETFs is unrelated to the other tax advantages documented by Moussawi et al. (2025). Using the following empirical design, we examine ETF trading activity in the secondary market:

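$$\text{HC ETF Swaps}_{it} = \beta_1\,\text{Tax Sensitivity}_i + \beta_2\,\text{HC ETF Intensity}_{it} + \beta_3\,\text{Tax Sensitivity}_i \times \text{HC ETF Intensity}_{it} + \beta\,\text{Controls} + \psi_t + \theta_i + \varepsilon_{it}. \tag{3}$$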

HC ETF Swaps is the natural log of one plus the dollar volume of offsetting purchases and sales of highly correlated ETFs for investor i in quarter t.20 We identify ETF swap activity by observing quarters in which an investor sold an ETF while simultaneously buying one or more ETFs that are correlated with the sold ETF at 99% or greater. In other words, when we observe offsetting purchases and sales of two highly correlated ETFs in a given fiscal quarter (call them ETF b and ETF c) such that $\Delta\text{ETF}_{bt} < 0$ and $\Delta\text{ETF}_{ct} > 0$ with $c \neq b$, then we define the HC ETF Swaps as the dollar volume offset between the purchases and sales transactions: $\min\left[\,|\Delta\text{ETF}_{bt}| \times \text{Price}_{bt},\ |\Delta\text{ETF}_{ct}| \times \text{Price}_{ct}\,\right]$.
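A minimal sketch of this swap measure follows. It is not the authors' implementation: the function `hc_etf_swaps`, the input dictionaries, the prices, and the rule of offsetting each sold ETF against the combined purchases of its highly correlated peers are all assumptions made for illustration.

```python
from math import log


def hc_etf_swaps(delta_shares, prices, hc_peers):
    """
    delta_shares: ticker -> quarter-over-quarter change in shares held
    prices:       ticker -> price used to value the change
    hc_peers:     ticker -> set of tickers correlated with it at 99% or more
    Returns (offsetting dollar volume, ln(1 + offsetting dollar volume)).
    """
    offset = 0.0
    for b, d_b in delta_shares.items():
        if d_b >= 0:
            continue  # only a sold ETF can start a swap
        sold = abs(d_b) * prices[b]
        # Assumption: purchases of all correlated peers are pooled per sold ETF.
        bought = sum(
            delta_shares[c] * prices[c]
            for c in hc_peers.get(b, ())
            if c != b and delta_shares.get(c, 0) > 0
        )
        offset += min(sold, bought)
    return offset, log(1.0 + offset)


# Invented holdings changes and prices, for illustration only.
delta = {"VOO": -1000, "IVV": 900, "QQQ": 50}
prices = {"VOO": 400.0, "IVV": 440.0, "QQQ": 300.0}
peers = {"VOO": {"IVV"}, "IVV": {"VOO"}}

swap_dollars, swap_measure = hc_etf_swaps(delta, prices, peers)
```

Here $400,000 of VOO is sold and $396,000 of IVV is bought, so the offsetting volume is the smaller amount, $396,000; the QQQ purchase is ignored because it is not a highly correlated peer of VOO.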

We define HC ETF Intensity as noted above. Our coefficients of interest in Equation (3) are β1, the coefficient on Tax Sensitivity, and β3, the coefficient on the interaction of Tax Sensitivity and HC ETF Intensity. We expect that tax-sensitive institutional investors will engage in more ETF swaps than tax-insensitive investors (i.e., β1 > 0). In addition, we expect that tax-sensitive institutional investors will engage in more ETF swaps as their ETF holdings increase (β3 > 0). Our control matrix follows prior literature as we control for turnover, fourth-quarter selling, and the level of realized gains, realized losses, unrealized gains (UGs), and unrealized losses (Sikes 2014).

In column (1) in Table 4, we regress HC ETF Swaps on Tax Sensitivity and year and quarter fixed effects. Consistent with our prediction that swapping between HC ETFs serves a tax purpose, we find that institutions classified as tax-sensitive have significantly higher HC ETF Swaps (coefficient = 0.165; p < 0.01).

Table 4. Swapping Activity

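| Dependent variable = HC ETF Swaps | (1) | (2) | (3) |
|---|---|---|---|
| Tax Sensitivity | 0.165*** | 0.040*** | |
| | (8.10) | (3.76) | |
| HC ETF Intensity | | 0.003*** | 0.001*** |
| | | (5.98) | (3.87) |
| Tax Sensitivity × HC ETF Intensity | | 0.006*** | 0.002*** |
| | | (8.52) | (3.91) |
| Unique HC ETFs | | | 0.014*** |
| | | | (27.35) |
| Fourth Quarter | | | 0.011 |
| | | | (1.07) |
| Realized Loss/Vol | | | 0.036** |
| | | | (2.00) |
| Unrealized Loss/Vol | | | −0.004 |
| | | | (−1.60) |
| Realized Gain/Vol | | | −0.022 |
| | | | (−1.07) |
| Unrealized Gain/Vol | | | −0.007*** |
| | | | (−9.13) |
| Turnover | | | 0.010*** |
| | | | (2.69) |
| Observations | 208,252 | 208,252 | 208,140 |
| Fixed effects | Year, quarter | Year, quarter | Year, filer |
| Adjusted R² | 0.074 | 0.168 | 0.628 |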


Notes. This table presents results of regressing dollar volume of highly correlated ETF swaps on Tax Sensitivity and HC ETF Intensity. The dependent variable, HC ETF Swaps, equals the natural log of one plus the dollar summation of offsetting buys and sells in HC ETFs for each filer in each quarter. We measure Tax Sensitivity as an indicator equal to one if an institution serves high-net-worth individuals. We measure HC ETF Intensity as equal to (HC ETF AUM/Total AUM) × 100 at the end of each quarter. We measure Unique HC ETFs as the count of distinct HC ETFs held at the end of the quarter. We measure Fourth Quarter as an indicator variable equal to one for the fourth calendar quarter. We measure Realized Loss (Gain)/Vol as equal to realized losses (gains)/volume. We measure Unrealized Loss (Gain)/Vol as equal to unrealized losses (gains)/volume. We measure Turnover as equal to volume/total AUM. Standard errors are clustered by filer and by year-quarter, and t-statistics are reported in parentheses.

 **Significance at 5%; ***significance at 1%.

Next, we include HC ETF Intensity and the interaction of Tax Sensitivity and HC ETF Intensity in column (2) in Table 4. Consistent with Tax Sensitivity being especially important to harvesting ETF losses in the presence of increasing HC ETF Intensity, we find that the coefficient on the interaction is significant at the 1% level (coefficient = 0.006). In column (3) in Table 4, we estimate our full Equation (3) model, including additional controls and investor fixed effects, and we again find a positive coefficient on Tax Sensitivity × HC ETF Intensity (coefficient = 0.002; p < 0.01). Notably, HC ETF Swaps is positively associated with the ratio of realized losses to volume (Realized Loss/Vol; coefficient = 0.036, p < 0.05). In other words, more swapping is significantly associated with increased realization of losses, a finding that we examine more directly in Section 4.5. Thus, consistent with our prediction, we find that tax-sensitive institutions engage in more swapping of highly correlated ETFs, that this propensity is an increasing function of AUM devoted to highly correlated ETFs, and that the swapping activity is associated with more realized losses.

Figure 4 plots the data by tax sensitivity when scaling the amount of HC ETF Swaps by total volume traded. Among tax-sensitive institutional investors, ETF swaps increased from $280 million in 2001 to $106 billion in 2022. Tax-sensitive institutional investors have amassed $417 billion of ETF swaps since 2001.21

Figure 4. (Color online) Average Quarterly Trading Volume Dedicated to HC ETF Swaps
Notes. This figure shows the dollar volume of HC ETF Swaps scaled by the dollar volume of total trades during quarter t. We separately plot tax-sensitive and tax-insensitive filers. Data are from CRSP and WhaleWisdom.

4.5. Realizing Losses and Gains

Although simultaneously entering and exiting economically identical ETFs seems to have no purpose beyond tax avoidance, institutions may act for other reasons. For example, they could be swapping into ETFs with lower management fees (Brown et al. 2022) or ETFs managed by personal acquaintances. To further validate that tax-sensitive institutions swapping ETFs are harvesting losses, we employ the proportion of losses realized (PLR) measure from prior literature (Odean 1998, Sikes 2014). We estimate Equation (4) with PLR as the dependent variable:

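$$\text{PLR}_{it} = \beta_1\,\text{HC ETF Swaps}_{it} + \beta_2\,\text{Tax Sensitivity}_{it} \times \text{HC ETF Swaps}_{it} + \beta\,\text{Controls} + \psi_t + \theta_i + \varepsilon_{it}. \tag{4}$$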

Calculating realized gains and losses requires a cost basis at the time of sale. We follow a similar approach to prior literature to proxy for cost basis, valuation, and trading behavior (Jin 2006, Sikes 2014, Blouin et al. 2017). Securities held during the first quarter of 2001 (the first year with quarterly 13Fs available from WhaleWisdom) are treated as purchased during that quarter. From then on, quarter-to-quarter increases in each security are treated as “buys,” and quarter-to-quarter decreases are treated as “sells.” If an institution owns multiple lots, we assume a highest in, first out (HIFO) approach to calculate realized gains and losses.22 After establishing each position’s basis, sales price, and implied gain or loss, we aggregate gains and losses to the institution-quarter level. We measure PLR as realized losses for institution i in quarter t scaled by the sum of realized losses and unrealized losses (ULs) in that quarter: $\text{PLR}_{it} = \text{RL}_{it} / (\text{RL}_{it} + \text{UL}_{it})$.
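To make the HIFO assumption and the PLR ratio concrete, the sketch below sells from the highest-cost lots first, values the remaining lots at a quarter-end price, and forms realized losses over realized plus unrealized losses. The function names, lot representation, and numbers are invented; this is a simplified illustration, not the authors' procedure.

```python
def sell_hifo(lots, shares_to_sell, sale_price):
    """
    lots: list of [cost_per_share, shares]; modified in place.
    Sells from the highest-cost lot first and returns (realized_gain, realized_loss).
    """
    gain = loss = 0.0
    lots.sort(key=lambda lot: lot[0], reverse=True)  # highest cost basis first
    remaining = shares_to_sell
    for lot in lots:
        if remaining <= 0:
            break
        used = min(lot[1], remaining)
        pnl = (sale_price - lot[0]) * used
        if pnl >= 0:
            gain += pnl
        else:
            loss += -pnl
        lot[1] -= used
        remaining -= used
    lots[:] = [lot for lot in lots if lot[1] > 0]  # drop exhausted lots
    return gain, loss


def unrealized(lots, price):
    """Unrealized (gain, loss) on the lots still held at the given price."""
    gain = sum((price - cost) * n for cost, n in lots if price > cost)
    loss = sum((cost - price) * n for cost, n in lots if price < cost)
    return gain, loss


def plr(realized_loss, unrealized_loss):
    """Proportion of losses realized; None when there are no losses at all."""
    total = realized_loss + unrealized_loss
    return realized_loss / total if total > 0 else None


# Invented position: three lots of 100 shares bought at $50, $60, and $40.
lots = [[50.0, 100], [60.0, 100], [40.0, 100]]
rg, rl = sell_hifo(lots, 150, 45.0)   # sell 150 shares at $45
ug, ul = unrealized(lots, 45.0)       # value what is left at $45
ratio = plr(rl, ul)
```

In this example the $60 lot and half of the $50 lot are sold, realizing $1,750 of losses; the remaining $50 shares carry $250 of unrealized losses, so PLR is 1,750 / 2,000 = 0.875.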

As a baseline test, we first examine the filer’s overall PLR as our dependent variable in Equation (4). We do so to address our fundamental research question: whether ETFs mitigate frictions imposed by the wash sale rule in harvesting capital losses. Our second set of tests restricts the PLR to be related only to highly correlated ETFs (HC ETF PLR) to more directly tie the ETF swapping activity to losses harvested in the filer’s ETF portfolio. We anticipate that HC ETF Swaps will lead to higher PLRs for tax-sensitive investors (β2 > 0). We additionally control for the proportion of gains realized (PGR) to account for the tax incentive to realize losses as offsets for concurrently realized gains. We construct PGR as the proportion of realized capital gains (RGs) relative to total RGs and UGs in a quarter: $\text{PGR}_{it} = \text{RG}_{it} / (\text{RG}_{it} + \text{UG}_{it})$. In addition to PGR, we include an indicator to capture fourth-quarter trading and 13F filer turnover (i.e., trading volume scaled by AUM) as well as filer fixed effects to identify within-investor effects and limit correlated omitted variable concerns.23 Appendix B provides detailed variable definitions.

Table 5 presents the results of regressing PLR and HC ETF PLR on measures of Tax Sensitivity and HC ETF Swaps. In column (1) in Table 5, we see that PLR is an increasing function of HC ETF Swaps (coefficient = 2.584; p < 0.01). In other words, institutions that engage in more swapping between highly correlated ETFs capture a larger proportion of their losses. Thus, our evidence suggests that ETF swapping ameliorates the trading frictions imposed by the wash sale rule (e.g., Jensen and Marekwica 2011) in that investors maintain a similar portfolio allocation and harvest tax losses. Column (2) in Table 5 makes clear that tax-sensitive institutions drive the HC ETF Swaps as the size and significance of the effect are concentrated in the interaction of Tax Sensitivity and HC ETF Swaps. In column (3) in Table 5, we include control variables for turnover, the fourth quarter, PGR, and the count of unique HC ETFs held. Consistent with prior literature, we find that significantly greater proportions of losses are realized in the fourth calendar quarter (Sikes 2014). A one-standard-deviation increase in HC ETF Swaps for tax-sensitive institutional investors (0.65) is thus associated with a 1.2-percentage-point increase in PLR (1.806 × 0.65). Given that the sum of the mean realized losses and unrealized losses for our tax-sensitive subsample is $83.08 million (22.21 + 60.87), a one-standard-deviation increase in swapping activity equates to $972,000 of additional losses realized per tax-sensitive filer-quarter. Aggregated over our sample, 70% of which are tax sensitive, a one-standard-deviation increase equates to $138 billion ($972,000 × 70% × 202,168 observations).24
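The economic magnitude above can be retraced step by step. The snippet below only reproduces the arithmetic using the figures quoted in the text; the variable names are ours, and the rounding of the PLR change to two decimals is our reading of how the $972,000 figure arises.

```python
coef = 1.806            # Tax Sensitivity x HC ETF Swaps, column (3) of Table 5
sd_swaps = 0.65         # one SD of HC ETF Swaps for tax-sensitive filers
plr_change_pp = coef * sd_swaps                  # change in PLR, percentage points

loss_base = (22.21 + 60.87) * 1e6                # mean RL + mean UL, in dollars
# Assumption: the PLR change is rounded to 1.17 points before scaling.
extra_loss_per_filer_quarter = loss_base * round(plr_change_pp, 2) / 100

tax_sensitive_share = 0.70
observations = 202_168
aggregate = 972_000 * tax_sensitive_share * observations

print(round(plr_change_pp, 4))
print(round(extra_loss_per_filer_quarter))
print(round(aggregate / 1e9, 1))
```

The product 1.806 × 0.65 is about 1.17 percentage points, which applied to the $83.08 million loss base gives roughly $972,000 per tax-sensitive filer-quarter and about $138 billion in aggregate.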

Table 5. Proportion of Losses Realized and ETF Proportion of Losses Realized

Table 5. Proportion of Losses Realized and ETF Proportion of Losses Realized

Dependent variable(1)(2)(3)(4)(5)(6)
PLRPLRPLRHC ETF PLRHC ETF PLRHC ETF PLR
HC ETF Swaps2.584***0.5050.772**4.226***2.039***2.215***
(9.53)(1.47)(2.48)(12.39)(3.90)(4.16)
Tax Sensitivity × HC ETF Swaps2.578***1.806***2.668***2.364***
(5.13)(4.01)(4.45)(4.23)
Unique HC ETFs−0.007−0.070***
(−1.16)(−5.82)
Fourth Quarter3.759***0.934
(8.01)(0.93)
PGR−0.029**
(−2.24)
HC ETF PGR0.132***
(10.48)
Turnover26.969***22.406***
(26.61)(23.01)
Observations203,986203,986202,16888,28188,28177,864
Fixed effectsYear, quarter, filerYear, quarter, filerYear, filerYear, quarter, filerYear, quarter, filerYear, filer
Adjusted R20.2810.2810.4210.1430.1440.208


Notes. This table presents results of regressing PLR (HC ETF PLR) on HC ETF Swaps and Tax Sensitivity. The dependent variable PLR is defined as (Realized Losses/All Losses) × 100. We define HC ETF PLR similarly but only for HC ETF losses. We measure HC ETF Swaps as the log of one plus the dollar summation of offsetting sells and buys in highly correlated ETFs. We measure Tax Sensitivity as an indicator equal to one if an institution serves high-net-worth individuals. We measure Unique HC ETFs as the count of distinct HC ETFs held at the end of the quarter. We measure Fourth Quarter as an indicator variable equal to one for the fourth calendar quarter. We measure PGR (HC ETF PGR) in the same manner as PLR (HC ETF PLR), but for gains rather than losses. We define Turnover as equal to volume/total AUM. Standard errors are clustered by filer and by year-quarter, and t-statistics are reported in parentheses.

 **Significance at 5%; ***significance at 1%.

We also find that, unsurprisingly, Turnover is the primary driver of PLR.25 The relationship is mechanical as more trading implies more realization of capital gains and losses. The coefficient on the interaction of Tax Sensitivity and HC ETF Swaps remains positive and significant, indicating that tax-sensitive investors’ ETF swaps are a driver of harvesting capital losses.

Columns (4)–(6) in Table 5 present a similar empirical design but with HC ETF PLR as the dependent variable. The sample size for this test is smaller than that of the first three specifications because HC ETF PLR is missing for filer-quarters with no highly correlated ETF holdings.26 Again, we find results consistent with our predictions. Notably, in column (6) in Table 5, we observe that the coefficient on the Fourth Quarter indicator is not significantly different from zero. This suggests that harvesting losses among ETFs is less seasonally driven than among traditional equities (as documented in Sikes 2014), consistent with ETF swapping being a viable strategy throughout the calendar year.27 Matthews (2016) corroborates this empirical finding, noting that institutions face no tracking error and minimal transaction costs when swapping ETFs intrayear, which makes year-end-only ETF loss harvesting inefficient.

Another approach to estimating losses harvested via wash sale avoidance is to sum realized losses associated with HC ETF swapping activity. Table 6 does so for different definitions of “highly correlated,” ranging from a 95% correlation threshold to ETFs that track the same index. Using our ≥99% correlation threshold, we estimate that 13F filers realized $84 billion of losses over our sample period from HC ETFs that we identify as being involved in swapping activity. Loss realizations are higher in bear market years (e.g., 2002, 2008–2009, 2018, and 2022) and in later years (untabulated). If the losses offset short-term capital gains or income, this represents $31 billion in forgone tax revenue. For scale, the IRS 2023 enacted budget was $12.3 billion. Because of the assumptions required to generate these summations (i.e., assumptions regarding acquisition and disposal dates and prices, initial basis assignments, intraquarter activity, and intrafund activity) as well as the likelihood that investor behavior would change with rule clarification (i.e., if regulators specified that swapping HC ETFs violated the wash sale rule, the extent of swapping behavior would likely decline), we caution against interpreting these estimates as the likely increase in treasury revenue if the wash sale rule were clarified to disallow losses from swapping HC ETFs. Rather, we view these figures as indicative of the significant and growing size of this activity absent intervention.
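
The conversion from realized losses to potential tax revenue is a simple multiplication by an assumed tax rate. The sketch below applies the 37% short-term and 20% long-term rates reported in Table 6 to the loss totals for the same-index and 99% correlation definitions; the dictionary names are hypothetical.

```python
# Losses associated with swapping-involved HC ETFs, in $ millions (Table 6).
losses_musd = {"same index": 6_756, "99% correlation": 84_479}

# Tax rates used in Table 6.
rates = {"short-term": 0.37, "long-term": 0.20}

# Potential tax revenue in $ millions, rounded to the nearest million.
revenue_musd = {
    definition: {name: round(loss * rate) for name, rate in rates.items()}
    for definition, loss in losses_musd.items()
}

for definition, by_rate in revenue_musd.items():
    print(definition, by_rate)
```

At the 99% threshold this yields about $31.3 billion at the short-term rate and about $16.9 billion at the long-term rate.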

Table 6. Estimating Tax Revenue Losses

| | Same index | 99% correlation | 95% correlation |
|---|---|---|---|
| Tax-sensitive dollars swapped, $ | 41,749M | 417,192M | 1,276,258M |
| Tax-sensitive losses associated with swapping-involved HC ETFs, $ | 6,756M | 84,479M | 163,618M |
| Potential tax revenue at a 37% short-term capital gains rate, $ | 2,500M | 31,257M | 61,649M |
| Potential tax revenue at a 20% long-term capital gains rate, $ | 1,351M | 16,896M | 33,342M |


Notes. This table provides estimates of swapping activity and associated losses at alternative threshold definitions of “highly correlated.” The estimates are derived by (1) identifying same quarter swaps between HC ETFs, (2) estimating the cost basis of each swapped security, (3) flagging the HC ETFs involved in the swap that have realized losses, and (4) summing the realized losses for only institutions classified as tax sensitive. “Same index” ETFs are those that explicitly state that they track the same defined index (e.g., we would include swapping between two ETFs that both track the S&P 500 in this measure but not between two ETFs that both focus on “midcap growth equities” without specifying a specific index). M, millions.
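
Step (2) in the notes above requires a cost basis for each position, and Appendix B states that basis is computed with a highest in, first out (HIFO) methodology as in Blouin et al. (2017). The following is a minimal sketch of HIFO lot matching under simplifying assumptions: the function name, the lot data, and the sale price are hypothetical, and a single sale price stands in for the average of quarterly month-end prices that Appendix B uses for sale proceeds.

```python
def hifo_realized_pnl(lots, shares_sold, sale_price):
    """Sell shares from the highest-cost lots first.

    lots: list of (shares, cost_per_share) tuples.
    Returns (realized_pnl, remaining_lots); a negative pnl is a realized loss.
    """
    # Highest cost basis first, so losses are realized before gains.
    ordered = sorted(lots, key=lambda lot: lot[1], reverse=True)
    pnl = 0.0
    to_sell = shares_sold
    remaining = []
    for shares, cost in ordered:
        take = min(shares, to_sell)
        pnl += take * (sale_price - cost)
        to_sell -= take
        if shares > take:
            remaining.append((shares - take, cost))
    if to_sell > 0:
        raise ValueError("cannot sell more shares than are held")
    return pnl, remaining


# Hypothetical position: three lots bought at different prices.
lots = [(100, 50.0), (100, 60.0), (100, 40.0)]
pnl, remaining = hifo_realized_pnl(lots, shares_sold=150, sale_price=45.0)
print(pnl, remaining)
```

In this example the 150 shares sold come from the $60 lot and then the $50 lot, so the sale realizes a loss even though the lowest-cost lot carries an unrealized gain.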

4.6. A Separate Effect from Primary Market Tax Advantages

Research documents two other tax benefits of ETFs. First, many ETFs have low turnover in their holdings, which limits the frequency with which they realize taxable gains (e.g., Colon 2023). Second, investors in ETFs benefit from section 852(b)(6), which exempts in-kind distributions from taxation through a process referred to as “heartbeat trades” (Choi et al. 2025, Moussawi et al. 2025). Large inflows from authorized participants create new ETF shares followed within a few days by large outflows in the form of in-kind redemptions of appreciated shares on the primary market. In this way, appreciated securities are disbursed without any realized capital gains that would have been generated if the securities had been sold and cash had been disbursed. This second tax benefit accrues to investors for holding ETFs.

Similar to our findings, Moussawi et al. (2025) document that tax-sensitive institutions allocate larger portions of their portfolios to ETFs. The authors attribute this preference to the primary market tax advantages. In Table 7, we illustrate that tax-sensitive filers are significantly more likely to hold ETFs that engaged in large dollar volumes of heartbeat trades (column (1) in Table 7) and large numbers of heartbeat trades (column (2) in Table 7), consistent with Moussawi et al. (2025). An alternative explanation to our main findings is that they are simply a side effect of investors crowding into ETFs for the primary market tax advantages. To demonstrate that the effects that we document are both distinct from and incremental to those identified in Moussawi et al. (2025), we estimate the following model:

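$$\text{HC ETF Swaps}_{it} = \beta_1\,\text{Tax Sensitivity}_{it} + \beta_2\,\text{HC ETF Intensity}_{it} + \beta_3\,\text{Tax Sensitivity}_{it} \times \text{HC ETF Intensity}_{it} + \beta_4\,\text{Heartbeat Intensity Quintile}_{it} + \psi_t + \varepsilon_{it}. \tag{5}$$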

Table 7. Heartbeat Trades and Swapping Activity

| Dependent variable | (1) Heartbeat Size Intensity | (2) Heartbeat Count Intensity | (3) HC ETF Swaps | (4) HC ETF Swaps |
|---|---|---|---|---|
| Tax Sensitivity | 68.707*** | 0.046*** | −0.011 | 0.001 |
| | (5.26) | (10.79) | (−1.22) | (0.55) |
| HC ETF Intensity | | | 0.003*** | 0.002*** |
| | | | (5.15) | (5.51) |
| Tax Sensitivity × HC ETF Intensity | | | 0.005*** | 0.004*** |
| | | | (7.04) | (6.37) |
| Heartbeat Count Intensity Quintile | | | 0.064*** | |
| | | | (11.98) | |
| Observations | 208,252 | 208,252 | 208,252 | 136,289 |
| Fixed effects | Year, quarter | Year, quarter | Year, quarter | Year, quarter |
| Adjusted R² | 0.029 | 0.093 | 0.191 | 0.110 |


Notes. Columns (1) and (2) present results of regressing proxies for heartbeat trade activity on Tax Sensitivity. The dependent variable, Heartbeat Size Intensity, is the dollar volume of heartbeat trade activity scaled by the dollar value of ETFs held by the filer in each quarter. We measure Heartbeat Count Intensity as the number of heartbeat trades executed among the ETFs held by the filer in each quarter scaled by the number of ETFs held by the filer. In columns (3) and (4), the dependent variable is HC ETF Swaps as previously defined. We define Heartbeat Count Intensity Quintile as a count variable sorting each filer-quarter into a quintile of Heartbeat Count Intensity. Column (4) limits the sample to only filer-quarters that are below the median of Heartbeat Count Intensity. Heartbeat trades are identified using ETF Global Fund Flows data according to the procedure defined in Moussawi et al. (2025). Standard errors are clustered by filer and by year-quarter, and t-statistics are reported in parentheses.

 ***Significance at 1%.

Variables are previously defined, and Heartbeat Intensity Quintile is a count variable that sorts each filer-quarter into its quintile of ETF heartbeat trade activity. ETF heartbeat trade activity is identified by following the procedure described in Moussawi et al. (2025) and using the ETF Global database for ETF fund flows. If our main findings are subsumed by a preference for ETFs that offer heartbeat trades, we anticipate an insignificant β3 coefficient with a positive and significant β4 coefficient. On the other hand, if institutional investors swap highly correlated ETFs for reasons beyond the primary market tax advantages, we would predict β3 > 0. Importantly, we do not believe that the primary market tax advantages and secondary market tax advantages are mutually exclusive; rather, we would anticipate that both play a meaningful role for tax-sensitive institutions.
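
The quintile variable described above can be sketched as a rank-based sort. The coding of quintiles from 1 (lowest) to 5 (highest), the treatment of ties by input order, and the pooling of all observations in one sort are assumptions of this illustration; the function name and intensity values are hypothetical.

```python
def quintile_ranks(values):
    """Assign each value a quintile from 1 (lowest) to 5 (highest) by rank."""
    n = len(values)
    # Indices of the observations ordered from smallest to largest value.
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0] * n
    for position, idx in enumerate(order):
        # Integer arithmetic splits the ranked observations into five bins.
        ranks[idx] = position * 5 // n + 1
    return ranks


# Hypothetical heartbeat count intensity for ten filer-quarters.
intensity = [0.10, 0.40, 0.05, 0.90, 0.30, 0.70, 0.20, 0.60, 0.80, 0.50]
quintiles = quintile_ranks(intensity)
print(quintiles)
```

Entering the quintile rather than the raw intensity lets β4 capture a monotonic association without imposing linearity in the underlying heartbeat measure.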

Results from estimating Equation (5) are in column (3) in Table 7. Both β3 and β4 are positive and significant, suggesting that the secondary market wash sale avoidance that we document is incremental to tax-sensitive preferences for ETFs that engage in heartbeat trades. We also re-estimate Equation (5) on a subsample of filer-quarters that hold ETFs below the median level of heartbeat trade activity. Consistent results in this subsample would suggest that the effects are not primarily driven by swapping ETFs that engage in heartbeat trades. We find that the estimates are very similar to those in Table 4 in this subsample, and we report the results in column (4) in Table 7. Again, this finding indicates that the secondary market trading advantages are distinct from and incremental to the primary market advantages of holding ETFs.

4.7. Alternative Measures of Tax Sensitivity

Our base measure of tax sensitivity is an objective cutoff between having and not having HNW individuals as clients. Prior literature exploits differential cutoffs of HNW individuals on Form ADV. Table 8 displays the robustness of our main results to alternative measures of Tax Sensitivity. Column (1) in Table 8 provides our base results from column (3) in Table 5.28 In columns (2)–(4) in Table 8, we utilize different HNW individual cutoffs and re-estimate Equation (4). In column (2) in Table 8 (column (3) in Table 8), we define a tax-sensitive institution as one in which at least 25% (50%) of its clients are HNW individuals and a tax-insensitive institution as one that has zero HNW clients or that is a pension fund or an endowment. In column (4) in Table 8, we classify a tax-sensitive institution as one where at least 50% of its clients are HNW individuals and a tax-insensitive institution as one where either (1) an investment advisor’s combined charity, government, and pension clients exceed 50% of their total clients or (2) the institution is a corporate pension fund, a public pension fund, or an endowment. In column (5) in Table 8, we utilize the tax sensitivity measure in Blouin et al. (2017). This constrains our sample size given that (1) WhaleWisdom does not share a primary identifier with the Thomson Reuters data used in Blouin et al. (2017), necessitating a matching procedure from which we keep only high-quality matches, and that (2) the measure is only available through 2018.29 In column (6) in Table 8, we apply each filer’s modal value of tax sensitivity from Blouin et al. (2017) to all of its observations, carrying its classification forward through 2022. Finally, we create continuous and time-varying measures of tax sensitivity using the proportion of clients who are individuals (individuals + HNW individuals) and the proportion of AUM that is attributable to individual clients in columns (7) and (8) in Table 8.

Table 8. Tax Sensitivity Robustness

Dependent variable = PLR(1)(2)(3)(4)(5)(6)(7)(8)
Tax Sensitivity definitionSimple cut25% HNW50% HNW50/50Blouin et al. (2017)Blouin et al. (2017) (modal)Continuous clientContinuous dollars
Tax Sensitivity−0.553−1.702*−0.340
(−1.23)(−1.87)(−0.33)
HC ETF Swaps0.772**0.981***1.394***1.027**0.766***0.809***0.896***1.144***
(2.48)(3.84)(5.38)(2.17)(3.17)(3.83)(3.47)(4.81)
Tax Sensitivity × HC ETF Swaps1.806***1.801***1.527***1.342**1.406***1.616***2.060***2.074***
(4.01)(4.54)(4.03)(2.14)(2.68)(4.51)(4.70)(4.57)
Unique HC ETFs−0.007−0.005−0.003−0.0070.031***0.000−0.007−0.008
(−1.16)(−0.69)(−0.40)(−0.72)(2.70)(0.05)(−1.18)(−1.23)
Fourth Quarter3.759***3.884***3.840***4.726***3.764***3.745***3.760***3.770***
(8.01)(8.25)(8.29)(7.47)(7.74)(8.13)(8.00)(8.02)
PGR−0.029**−0.033**−0.033**−0.035**−0.059***−0.057***−0.029**−0.029**
(−2.24)(−2.39)(−2.42)(−2.54)(−4.02)(−4.37)(−2.23)(−2.22)
Turnover26.969***27.027***26.571***39.465***32.769***30.439***26.992***26.987***
(26.61)(24.66)(23.82)(30.77)(32.09)(32.83)(26.50)(26.52)
Observations202,168189,970176,404100,654127,056177,851197,932197,932
Fixed effectsYear, filerYear, filerYear, filerYear, filerYear, filerYear, filerYear, filerYear, filer
Adjusted R20.4210.4180.4160.4170.4460.4220.4230.423


Notes. Each column presents the test from column (3) in Table 5 with a different definition of Tax Sensitivity. The dependent variable in all specifications is PLR as previously defined. Column (1) uses the same definition as in all other tables. Columns (2) and (3) define tax sensitives as filers that report that HNW individuals comprise >25% or >50% of their clientele, respectively. Column (4) defines tax sensitives as filers with >50% HNW clients and tax insensitives as those with >50% charity, government, or pension clients. Column (5) uses the tax sensitivity measure from Blouin et al. (2017). Column (6) uses the modal value of the measure from Blouin et al. (2017) for each filer applied to all sample years. Column (7) is a continuous measure created by summing the percentage of clients who are individuals or HNW individuals. Column (8) is similar to column (7) but uses the percentage of AUM rather than percentage of clients. Fixed effects are noted near the bottom of each specification. Standard errors are clustered by filer and by year-quarter, and t-statistics are reported in parentheses.

 *Significance at 10%; **significance at 5%; ***significance at 1%.

In each column in Table 8, we observe statistically significant evidence that the relation between ETF swapping and loss recognition is incrementally stronger for tax-sensitive filers. These tests indicate that our results are not dictated by how we measure tax sensitivity.

5. Conclusion

The United States has disallowed tax losses from wash sales for over 100 years. Prior literature has posited that the wash sale rule has prevented investors from maximizing tax-efficient returns in their investment portfolios (e.g., Jensen and Marekwica 2011). In recent years, the expansion of ETFs has provided investors with a new, low-cost tool with which capital losses can be realized without disturbing an optimal portfolio. We contribute to the growing ETF literature and document trading behavior consistent with ETF wash sale avoidance and tax loss harvesting. First, we find that the introduction of a near-identical ETF leads to more demand (as measured by share creation) for the incumbent ETF. Similar to Li (2024), we also find that incumbent ETFs have increased secondary market trading volume subsequent to the introduction of a near-identical ETF. Next, we document direct evidence of institutional investors swapping highly correlated ETFs in same-day, same-fund trades. Finally, we find that tax-sensitive institutions hold a more diverse set of highly correlated ETFs, invest a larger portion of their AUM in these ETFs, engage in more swapping between near-identical ETFs, and capture more capital losses with this swapping. We estimate conservatively that capital loss recognition attributable to annual swapping among tax-sensitive institutional investors is in the tens of billions of dollars.

It seems clear to us that the direct swapping of highly correlated ETFs for the sole purpose of harvesting tax losses violates the spirit of the original wash sale rule. But, even as this behavior has become increasingly widespread, regulators have remained silent on how the term “substantially identical securities” should be applied to ETFs. Absent such guidance, it is not surprising that a wide range of opinions on the matter has emerged among taxpayers and their advisors. Considering the rapid proliferation of nearly identical ETFs in recent years, we believe that more detailed guidance on the applicability of the wash sale rule to ETFs would help align potential enforcement actions with the original intent of the rule.

One possible explanation for the regulatory inaction is that policy setters have not yet fully grasped the scale and prevalence of the tax-loss-harvesting activities associated with highly correlated ETFs. If this is the case, we hope that by providing a conservative estimate of the magnitude of these activities, our analyses will help elevate this issue and bring it forward for regulatory consideration.

The reason for the regulatory inaction may also be more complicated, with other factors at play. Perhaps legal ambiguity and the practical challenges of implementing and enforcing new ETF rules are prohibitively high. Or, perhaps resistance from institutional investors and high-net-worth individuals makes such actions politically infeasible. Perhaps even regulators’ personal trading incentives have a role to play. Rather than speculate further, we leave these questions to future research.

Acknowledgments

The authors acknowledge Mary Cowx (discussant), Robert Feldgarden, Wentao Li, Shaphan Ng (discussant), Ben Yost (discussant), and seminar and workshop participants at the 2025 Financial Accounting and Reporting Section Midyear Meeting, the 2025 Hawaii Accounting Research Conference, the 2024 Conference on Financial Economics and Accounting, the 2024 University of Georgia Fall Accounting Symposium, Baruch College, Georgetown University, the University at Buffalo, the University of Illinois Chicago, the University of Washington, the Brigham Young University Accounting Research Symposium, The Stanford Initiative for Business, Taxation, and Society Lab, the PhD Accounting Workshop Series, two anonymous reviewers, an anonymous associate editor, and Shiva Rajgopal (editor) for helpful suggestions. They also thank Douglas Laporte and Christina Zhu for assistance with the ETF Global and Abel Noser data. Suhani Aggarwal, Luke Alexander, Isha Bhansali, Stefanie Luo, and Justin Sialm provided excellent research assistance.

Appendix A. Abel Noser Direct Washing Evidence

Table A.1. Abel Noser Direct Washing Evidence

| Client | Date | Symbol | Traded, $ | ρ | Δ Position, $ | Volume traded, $ | Δ Position/volume, % |
|---|---|---|---|---|---|---|---|
| 732 | 12/21/2004 | IVV | (3,615,739) | 1.000 | 52,469 | 7,179,009 | 0.7 |
| | | SPY | 3,563,270 | | | | |
| 1,166 | 1/11/2007 | SPY | (28,493,020) | 0.990 | 3,494,151 | 53,491,889 | 6.5 |
| | | OEF | 24,998,869 | | | | |
| 1,191 | 3/19/2007 | VTI | (2,798,431) | 0.999 | 30,345 | 5,566,517 | 0.5 |
| | | IWV | 2,768,086 | | | | |
| 1,015 | 9/5/2007 | EEM | (10,461,004) | 0.990 | 183,209 | 20,738,799 | 0.9 |
| | | VWO | 10,277,795 | | | | |
| 732 | 5/13/2008 | XLE | (6,714,513) | 0.996 | 386,662 | 13,815,688 | 2.8 |
| | | IYE | 7,101,175 | | | | |
| 1,325 | 11/18/2008 | VWO | (10,595,292) | 0.990 | 61,510 | 21,252,094 | 0.3 |
| | | EEM | 10,656,802 | | | | |
| 1,325 | 1/28/2009 | IYR | (4,621,272) | 0.991 | 607,229 | 8,635,315 | 7.0 |
| | | ICF | 4,014,043 | | | | |


Notes. A small selection of same-day, same-fund trades in highly correlated ETFs in the Abel Noser data is shown. The Δ Position column is computed as the net change in exposure to the assets underlying the ETFs (i.e., abs[dollars sold − dollars bought]). The final column indicates the relative proportion of the change in position to the volume traded. EEM, iShares MSCI Emerging Markets ETF; ICF, iShares Select U.S. REIT ETF; IWV, iShares Russell 3000 ETF; IYR, iShares U.S. Real Estate ETF; IYE, iShares U.S. Energy ETF; OEF, iShares S&P 100 ETF; VWO, Vanguard Emerging Markets Stock Index Fund; XLE, State Street Energy Select Sector SPDR ETF.
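
The derived columns in Table A.1 follow directly from the two legs of each swap. As a minimal sketch (the function name is hypothetical), the Δ Position, volume traded, and their ratio can be recomputed from the dollars sold and dollars bought:

```python
def swap_summary(dollars_sold, dollars_bought):
    """Return (delta_position, volume_traded, ratio_pct) for one swap.

    delta_position is abs(dollars sold - dollars bought), volume_traded is
    the sum of both legs, and ratio_pct is their ratio in percent, rounded
    to one decimal as in the table.
    """
    delta_position = abs(dollars_sold - dollars_bought)
    volume_traded = dollars_sold + dollars_bought
    ratio_pct = round(100 * delta_position / volume_traded, 1)
    return delta_position, volume_traded, ratio_pct


# First row of Table A.1: client 732 sells IVV and buys SPY on 12/21/2004.
print(swap_summary(3_615_739, 3_563_270))

# Last row: client 1,325 sells IYR and buys ICF on 1/28/2009.
print(swap_summary(4_621_272, 4_014_043))
```

A small ratio indicates that the sale and the purchase nearly offset, so the trade leaves exposure to the underlying assets almost unchanged.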

Appendix B. Variable Definitions

Table B.1. Variable Definitions

| Variable | Description | Source |
|---|---|---|
| ln(Volume) | The natural log of ETF shares (in hundreds) traded in a month. This is at the ETF level and is found in Table A.1 and Table 3. | CRSP |
| ln(SharesOut) | The natural log of split-adjusted ETF shares outstanding at the end of the month. | CRSP |
| HC ETF Peer | An indicator equal to the value of Treat × Post in Equation (1). | CRSP Mutual Funds |
| Spread | Bid-ask scaled by nominal price. | CRSP |
| Volatility | The standard deviation of the prior 12 monthly returns. | CRSP |
| ln(Market Value) | The natural log of (shares outstanding × price). | CRSP |
| B&H Trailing 12 | The buy and hold return over the trailing 12 months. | CRSP |
| Tax Sensitivity | Indicator equal to one for 13F filers that serve a non-zero proportion of high-net-worth clients and zero otherwise. High-net-worth clients determined via question 5D of Form ADV. We also classify pensions and endowments as tax insensitive (indicator equal to 0) per the “Type” variable in the Institutional Investor Classification Data available on Brian Bushee’s website. | IAPD (SEC), Institutional Investor Classification Data |
| Assets Under Management (AUM) | Computed as Σ(Average of quarterly month-end prices for security i from CRSP × Number of shares of security i held) for quarter t for all 13F securities available in CRSP. | WhaleWisdom 13F, CRSP |
| HC ETF Intensity | (HC ETF AUM in quarter t/AUM in quarter t) with HC ETF AUM computed following the AUM calculation above but limited to securities with CRSP share code 73, which have a highly correlated peer available in a given quarter. | WhaleWisdom 13F, CRSP |
| Realized Loss | Realized losses occur when (Sale Proceeds for security i − Basis for security i) < 0. Sale Proceeds are computed as the (Number of shares of security i sold × Average of quarterly month-end prices for security i from CRSP). Realized losses are aggregated and reported at the filer-quarter level. Basis is computed using a highest in, first out methodology as in Blouin et al. (2017). | WhaleWisdom 13F, CRSP |
| Unrealized Loss | Unrealized losses occur when (Market Value for security i − Basis for security i) < 0. Market Value for security i is computed as (Average of quarterly month-end prices for security i from CRSP × Number of shares of security i held). Unrealized losses are aggregated and reported at the filer-quarter level. Basis is computed using a highest in, first out methodology as in Blouin et al. (2017). | WhaleWisdom 13F, CRSP |
| Realized Gains | Calculations are the same as for realized losses but for (Sale Proceeds for security i − Basis for security i) > 0. | WhaleWisdom 13F, CRSP |
| Unrealized Gains | Calculations are the same as for unrealized losses but for (Market Value for security i − Basis for security i) > 0. | WhaleWisdom 13F, CRSP |
| Proportion of Losses Realized (PLR) | Realized Losses in quarter t scaled by (Realized Losses + Unrealized Losses) in quarter t. | WhaleWisdom 13F, CRSP |
| Proportion of Gains Realized (PGR) | Realized Gains for quarter t/(Realized Gains + Unrealized Gains) in quarter t. | WhaleWisdom 13F, CRSP |
| HC ETF PLR (HC ETF PGR) | Calculations are the same as for PLR (PGR) but for only HC ETF securities identified with CRSP share code 73, which have a highly correlated peer available in a given quarter. | WhaleWisdom 13F, CRSP |
| Volume, $ | Estimated dollar volume of total buys and sells for quarter t for each filer. Total buys are computed as Σ(Average of quarterly month-end prices for security i from CRSP × Increase in the number of shares of security i held in quarter t compared with t − 1). Total sells are computed as Σ(Average of quarterly month-end prices for security i from CRSP × Decrease in the number of shares of security i held in quarter t compared with t − 1). | WhaleWisdom 13F, CRSP |
| Realized Loss (Gain)/Vol | Realized Losses (Gains) for quarter t/Volume for quarter t. | WhaleWisdom 13F, CRSP |
| Unrealized Loss (Gain)/Vol | Unrealized Losses (Gains) for quarter t/Volume for quarter t. | WhaleWisdom 13F, CRSP |
| Turnover | Volume for quarter t/AUM for quarter t. | WhaleWisdom 13F, CRSP |
| Unique HC ETFs | The count of distinct HC ETFs held by the 13F filer for quarter t. | WhaleWisdom 13F, CRSP |
| Available HC ETFs | The count of distinct HC ETFs available in CRSP for each quarter t. | CRSP |
| Fourth Quarter | Indicator equal to 1 for the 13F filings associated with the fourth quarter of the calendar year and zero otherwise. | WhaleWisdom 13F |
| HC ETF Swaps | The natural log of one plus the summation of offsetting buys and sells in highly correlated ETFs. Total Buys and Total Sells are computed the same as for the Volume variable but limited to HC ETFs. We record an HC ETF swap if there is both a Buy and a Sell of two different but highly correlated ETFs for quarter t. The offset is defined as min[(Buy of ETF 1), (Sell of ETF 2)]. We consider only one Sell for each Buy. We limit our calculation to the offsetting quantity as that is the only portion of the trade that the wash sale rule would apply to. | WhaleWisdom 13F, CRSP |
| Heartbeat Size Intensity | Dollar volume of heartbeat trade activity (as defined in Moussawi et al. 2025) scaled by the dollar value of ETFs held by the filer in quarter t. | ETF Global |
| Heartbeat Count Intensity | Count of heartbeat trades among ETFs held by the filer in quarter t scaled by the number of ETFs held. | ETF Global |
| Heartbeat Count Intensity Quintile | Count variable formed by sorting each filer-quarter into a quintile of Heartbeat Count Intensity. | ETF Global |


Notes. This table provides variable definitions and data sources. All nondiscrete variables are winsorized by year in our regression analyses at the 1% and 99% levels. All quarterly measures are as of the end of quarter 13F filing unless noted otherwise. IAPD, Investment Adviser Public Disclosure.
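
The HC ETF Swaps definition in Table B.1 can be sketched in a few lines. The sketch assumes that buys and sells of highly correlated ETFs have already been matched one-to-one within a filer-quarter; the function name, the input layout, and the use of raw dollars as the unit are hypothetical choices for illustration.

```python
import math


def hc_etf_swaps(matched_pairs):
    """Return ln(1 + sum of offsets) for one filer-quarter.

    matched_pairs: list of (buy_dollars, sell_dollars) tuples, each pairing
    a buy of one ETF with a sell of a different, highly correlated ETF.
    Only the offsetting amount, min(buy, sell), counts toward the swap.
    """
    offset_total = sum(min(buy, sell) for buy, sell in matched_pairs)
    return math.log1p(offset_total)


# Hypothetical filer-quarter with one matched pair: buy $3.0M, sell $3.5M.
swaps = hc_etf_swaps([(3_000_000, 3_500_000)])
print(swaps)

# A filer-quarter with no matched pairs has zero swapping.
print(hc_etf_swaps([]))
```

Taking the minimum of the two legs restricts the measure to the portion of the trade to which the wash sale rule would apply, and adding one before taking the log keeps filer-quarters with no swaps defined at zero.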

Endnotes

1 Fischer (2010) cites the U.S. Court of Appeals’ opinion from Hanlin v. Commissioner that (in the context of the wash sale rule) “the words ‘substantially identical’ indicate that something less than precise correspondence will suffice to make the transaction a wash sale.” For the remainder of the paper, we reserve the phrase “substantially identical” for discussion of the legal definition. We use “near identical” and “highly correlated” interchangeably to denote ETFs with price movements correlated at 0.99 or greater.

2 Although the 99% threshold is arbitrary, it is consistent with what some brokerage firms have suggested for their clients (e.g., Wealthfront 2021) as well as other academic work (e.g., Matthews 2016, Brown et al. 2022, Li 2024).

3 We focus on institutional investors rather than individual investors for several reasons. First, because institutional investors are required to distribute capital gains regularly, they have an immediate need to harvest tax losses. In contrast, individual investors can avoid capital gains via a step-up in basis at death (Dambra et al. 2020). Second, institutional investors can be partitioned by tax sensitivity, providing a natural control group when evaluating tax-motivated ETF hypotheses. Third, institutional investors comprise a significant majority of trading volume in the United States, allowing us to understand better how prevalent ETF-based swapping has become.

4 The data are available at https://accounting-faculty.wharton.upenn.edu/bushee/. WRDS provides a CIK-mgrno linking table, which we use to match our WhaleWisdom data to the Institutional Investor Classification Data. We retain only matches flagged by WRDS as high quality.

5 We also re-examine our main results using a variety of alternate definitions of tax sensitivity and find them to be robust (see Section 4.7).

6 For instance, when we employ a lower threshold of 95% return correlation to identify ETF pairs, the magnitude of swapping since 2001 balloons to $1.3 trillion (versus $417 billion of swapping using a 99% return correlation).

7 Prior legal precedent exists in support of a more lenient interpretation in non-ETF settings. For example, a legal expert referred us to a Supreme Court decision on the deductibility of a loss from a savings and loan’s (S&L) sale of an interest in multiple single-family mortgages to nearby S&Ls in simultaneous exchange for an equivalent interest in a different package of single-family mortgages (Cottage Savings Ass’n v. Commissioner, 499 U.S. 554 (1991)). The Supreme Court stated that an exchange of property generates a tax loss deduction only if the properties exchanged are “materially” or “essentially” different. The Supreme Court concluded that because the interests exchanged by Cottage Savings and the other S&Ls derived from loans made to different obligors and secured by different homes, the exchanged interests embodied distinct entitlements. The expert argued that Cottage Savings creates doubt that the IRS could successfully challenge the ETF loss harvesting that we document.

8 The literature typically uses Thomson Reuters 13F data, which allow researchers to start estimating the cost basis of positions beginning in the 1980s. Given that Thomson Reuters is missing many ETFs (WRDS Research 2017), we use another data provider, WhaleWisdom, that offers superior coverage of ETFs. We compare reported holdings for a small number of filers in original SEC filings, and we find that the WhaleWisdom database reflects the SEC filings more accurately than the Thomson Reuters 13F data. Further, few highly correlated ETFs existed before 2001.

9 Although the 13F holdings data provide useful insight into institutional investor equity holdings, they have limitations. First, we lack insight into intraquarter trading as holdings data are only available quarterly. Second, 13F data omit short positions, obscuring our insight into more complicated portfolio strategies. Finally, 13F data are aggregated at the investor (not fund) level, which induces measurement error into our classifications of which funds are tax sensitive and which trades are intrafund. Our results should be interpreted with these caveats in mind.

10 Interestingly, tax-sensitive investors have lower proportions of losses and gains realized (as measured following Sikes 2014) than tax-insensitive investors. However, the fact that tax-sensitive investors have lower realized capital gains and capital losses is a direct implication of the differential trading volume between the two subsets of investors. Turnover for tax-insensitive investors is nearly double that of tax-sensitive investors: the dollar volume of buys and sells is 64% of total AUM for tax-insensitive investors versus 34% for tax-sensitive investors.

11 To illustrate, consider cohort construction for SPY and IVV, two ETFs that track the S&P 500. Their price movements are correlated at close to 1.00. SPY was introduced first (1993) and is the “incumbent.” IVV was introduced in May of 2000. A cohort is constructed with May of 2000 set as zero in event time, and ETFs that do not gain a highly correlated partner in the 72 months around May of 2000 are included as clean controls. This forms the cohort. “Always treated” ETFs are excluded (e.g., IVV does not appear in the cohort because it has no preperiod; it is introduced in a treated state). This cohort-forming process is repeated for the first introduction of a highly correlated partner for any ETF with pre- and posttreatment data.

12 Results are robust to including both “never treated” and “not yet treated” observations as controls (untabulated).

13 We find qualitatively similar evidence in column (1) in Table 2 with cumulative fund flows from the ETF Global database as an alternative dependent variable (untabulated).

14 In defining highly correlated ETFs, Brown et al. (2022) state: “Return correlations are calculated using daily returns over the trailing 12 months, and we use correlation thresholds of 95% and 99%.”
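
As a rough illustration of the quoted definition, the sketch below computes a Pearson correlation over a trailing window of daily returns and compares it with a threshold. The function names and the sample returns are hypothetical, and the default of 252 trading days is an assumed approximation of the trailing 12 months.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length return series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / sqrt(sxx * syy)


def is_highly_correlated(returns_a, returns_b, threshold=0.99, window=252):
    """True if the trailing-window correlation meets the threshold.

    window=252 is an ASSUMPTION standing in for 12 months of trading days.
    """
    return pearson(returns_a[-window:], returns_b[-window:]) >= threshold


# Hypothetical daily returns for two funds tracking the same index.
fund_a = [0.010, -0.020, 0.015, -0.005, 0.030, -0.012]
fund_b = [r + 0.0001 for r in fund_a]   # same moves plus a constant offset
fund_c = [-r for r in fund_a]           # moves in the opposite direction

same_index = is_highly_correlated(fund_a, fund_b, threshold=0.99)
opposite = is_highly_correlated(fund_a, fund_c, threshold=0.95)
```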

15 We selected examples for Table A.1 in Appendix A based on trade size. The dollar volume of the typical swap tends to be lower and without perfectly offsetting purchase and sale amounts, consistent with Abel Noser’s clients harvesting tax losses as part of a broader investment strategy.

16 We can only ensure that trades occurred intrafund by limiting our investigation to single-day trades. This limitation prevents a time-series analysis, but it allows us to identify intrafund swap trades in highly correlated ETFs.
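
The single-day restriction can be illustrated with a small sketch that flags a swap only when the same fund sells one ETF and buys a highly correlated partner on the same date. The record layout (keys fund, date, ticker, side), the fund identifiers, and the trades are hypothetical and are not taken from the Abel Noser data.

```python
from collections import defaultdict


def find_same_day_swaps(trades, hc_pairs):
    """Return (fund, date, sold, bought) tuples for same-day intrafund swaps.

    trades:   iterable of dicts with keys fund, date, ticker, side
              (side is "buy" or "sell"); this layout is hypothetical
    hc_pairs: set of frozensets, each holding two highly correlated tickers
    """
    by_fund_day = defaultdict(lambda: {"buy": set(), "sell": set()})
    for trade in trades:
        key = (trade["fund"], trade["date"])
        by_fund_day[key][trade["side"]].add(trade["ticker"])

    swaps = []
    for (fund, date), sides in sorted(by_fund_day.items()):
        for sold in sorted(sides["sell"]):
            for bought in sorted(sides["buy"]):
                if frozenset((sold, bought)) in hc_pairs:
                    swaps.append((fund, date, sold, bought))
    return swaps


hc_pairs = {frozenset(("SPY", "IVV"))}
trades = [
    # Fund F1 sells SPY and buys IVV on the same day: flagged.
    {"fund": "F1", "date": "2008-10-01", "ticker": "SPY", "side": "sell"},
    {"fund": "F1", "date": "2008-10-01", "ticker": "IVV", "side": "buy"},
    # Fund F2 splits the two legs across days: not flagged.
    {"fund": "F2", "date": "2008-10-01", "ticker": "SPY", "side": "sell"},
    {"fund": "F2", "date": "2008-10-02", "ticker": "IVV", "side": "buy"},
    # The two legs sit in different funds: not flagged.
    {"fund": "F3", "date": "2008-10-03", "ticker": "SPY", "side": "sell"},
    {"fund": "F4", "date": "2008-10-03", "ticker": "IVV", "side": "buy"},
]
swaps = find_same_day_swaps(trades, hc_pairs)
```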

17 For unmatched observations, we further match to the institution “type” field from the Institutional Investor Classification Database available on Brian Bushee’s website. Matched observations with “type” of pension or endowment are classified as tax insensitive per Blouin et al. (2017).

18 We utilize high-net-worth individuals—who account for a significant portion of assets under management on Form ADV—to classify institutions as tax sensitive in our main specification (e.g., Jin 2006, Blouin et al. 2017). Our measure, however, uses a coarser threshold than some others (e.g., Jin 2006, Sikes 2014, Blouin et al. 2017). Our “greater-than-zero” threshold has the advantage of classifying the most filer-quarters. We rely on the assumption that although investment advisors serving few high-net-worth individuals may have small tax-avoidance incentives, they unambiguously have more tax-avoidance incentives than institutions serving zero high-net-worth individuals. We assume that tax sensitivity is a mostly static trait, consistent with prior literature (Blouin et al. 2017).

19 In untabulated robustness tests, our results are quantitatively similar in Equations (2)–(5) when we deploy a less conservative 95% return correlation threshold for classifying a highly correlated ETF. In empirical analyses with filer-level fixed effects, the time-invariant variable Tax Sensitivity is not reported.

20 Results are robust to using a scaled measure of swapping (e.g., ETF Swap Vol/Total Vol). We do not consider leveraged ETFs to be highly correlated pairs (e.g., an S&P 500 fund and an S&P 500 2X fund) because they represent a different risk profile even if correlated at ≥99%. Further, the CRSP mutual fund database does not treat leveraged ETFs as an index (i.e., not index fund flag B or D).

21 Expanding our definition of highly correlated ETFs to include those whose returns are correlated at 95% or more increases the magnitude of swapping activity to $1.3 trillion.

22 For an excellent numerical illustration of estimating quarterly gains and losses with HIFO, see Blouin et al. (2017, appendix A). Because institutions can use HIFO, their positions in large liquid ETFs are rarely zeroed out. Indeed, among “sell” transactions involved in an HC ETF swap, we see only 7% of instances in which the quarterly position is fully liquidated (untabulated). Holding multiple tranches of multiple HC ETFs provides more potential harvesting opportunities when faced with return volatility.
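
For readers unfamiliar with HIFO (highest in, first out), the sketch below shows the generic lot-matching rule: a sale is matched against the lots with the highest cost basis first. It is not the quarterly estimation procedure in Blouin et al. (2017, appendix A). The function name, lot sizes, and prices are hypothetical.

```python
def hifo_sell(lots, shares_to_sell, sale_price):
    """Match a sale against tax lots, highest cost basis first.

    lots: list of (shares, cost_per_share) tuples
    Returns (realized_gain_or_loss, remaining_lots).
    """
    ordered = sorted(lots, key=lambda lot: lot[1], reverse=True)
    if shares_to_sell > sum(shares for shares, _ in ordered):
        raise ValueError("cannot sell more shares than are held")

    realized = 0.0
    remaining = []
    left_to_sell = shares_to_sell
    for shares, cost in ordered:
        sold = min(shares, left_to_sell)
        realized += sold * (sale_price - cost)
        left_to_sell -= sold
        if shares > sold:
            remaining.append((shares - sold, cost))  # unsold part of the lot
    return realized, remaining


# Hypothetical position: three lots bought at different prices.
lots = [(100, 50.0), (100, 60.0), (100, 40.0)]
# Sell 150 shares at 45: the 60.0 lot goes first, then half of the 50.0 lot.
realized, remaining = hifo_sell(lots, 150, 45.0)
```

Here the sale realizes a loss of 1,750 (100 shares at a loss of 15 each plus 50 shares at a loss of 5 each) while the position is only partly liquidated. The lower-basis lots remain in the portfolio, which matches the note's observation that positions are rarely zeroed out.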

23 Note that the filer fixed effect omits the main Tax Sensitivity effect as the variable does not vary within filer.

24 Using an alternative threshold of a ≥95% return correlation increases the magnitude of a one-standard-deviation increase to $192 billion.

25 Turnover is also highly correlated with PGR, which is included in column (3) in Table 5. We document a negative association between PGR and PLR in column (3) in Table 5. In untabulated tests, we confirm that the negative relationship is because of the inclusion of both Turnover and PGR in the regression. Excluding Turnover yields a positive relation between PGR and PLR.

26 Approximately 18% of our filers never hold a highly correlated ETF; these filers account for 12% of our filer-quarter observations. In untabulated results, we find that our results in Tables 3–5 are robust to excluding these observations.

27 Our results cannot be explained by investors’ propensity to window dress. When investors historically harvested losses, the stock with negative recent returns (i.e., the harvested stock) would be replaced by a different stock with positive recent returns (i.e., the window dressing). However, with highly correlated ETFs (i.e., ≥99%), the ETF with negative recent returns (i.e., the harvested ETF) is replaced by an ETF that has nearly identical negative recent returns. Thus, in our setting, ETF swapping does not allow an institution to window dress.

28 In columns (1)–(4) and (6) in Table 8, the alternative measures of Tax Sensitivity do not vary over time, and thus, the Tax Sensitivity base coefficients are subsumed by filer fixed effects.

29 We use the WRDS CIK-mgrno linking table (wrds_13f_link), which flags high-quality matches.

References