Spanning Tests for Assets with Option-Like Payoffs: The Case of Hedge Funds

We draw on the skewness literature to propose regression-based performance evaluation tests designed for investments with option-like returns. These tests deliver conclusions valid for all risk-averse mean-variance-skewness investors and can better account for non-linearities in returns than option-based factor models. Applied to mutual funds and hedge funds, our tests usually suggest selecting different funds than standard tests, and find that a significant fraction, 11%, of hedge funds add value to investors, whereas this is an insignificant 4% for mutual funds. We also analyze the economic significance of these option-like returns, and their out-of-sample persistence.


Introduction
Performance evaluation of mutual funds and hedge funds is commonly based on linear factor models such as the CAPM, the Fama-French three-factor model, or the Fama-French-Carhart four-factor model (see also Harvey et al., 2015, for a comprehensive list of additional factors). This approach is fully justified for mean-variance investors, possibly in a multi-factor economy (Fama, 1996), because a positive alpha is equivalent to an improvement in their investment opportunity set. However, trading strategies, particularly those of hedge funds, involve dynamic trading, derivative usage, and/or leverage. These give rise to non-linearities and skewness in returns that are not captured by linear factor models and that make the mean-variance assumption unappealing. This paper argues that a remedy for this shortcoming of linear factor models is to account for investors' skewness preference in performance evaluation, 1 and develops a general framework to assess whether assets with option-like returns improve an investment opportunity set. Our approach generalizes the mean-variance spanning and intersection tests of Huberman and Kandel (1987) with a risk-free asset. 2 We use our framework to better understand the costs and benefits of skewness in hedge fund returns compared with mutual fund returns and to shed new light on the usefulness of hedge funds for investors.
To see why non-linearities and skewness are important empirically and not captured by linear factor models, Figure 1 (a) plots the monthly excess returns on an S&P 500 out-of-the-money put writing strategy (henceforth, put1) versus the excess returns on the total return index. The figure shows that Jensen's alpha of put1 is economically large at 0.59% per month (t-statistic of 5.60). In addition, the relation between put1 and index returns is not entirely captured by the covariance, which measures only linear dependence. The relation between put1 and index returns is not only increasing; it is also concave. Thus, put1 has low returns (relative to the linear relationship) when squared market returns are large, that is, in times commonly considered bad. More formally, there is a significant negative relation between the residual of the CAPM regression and the squared demeaned index returns, or residual co-skewness, as evidenced in Figure 1 (c). Therefore, the alpha of put1 comes at the cost of co-skewness, which deteriorates portfolio skewness, and many investors may prefer not to invest in put1.

Figure 1: Performance evaluation of put writing strategies. Panel (a) plots the excess returns of put writing strategy 1, $r_{p1}$, versus those of the S&P 500 TR index, $r_s$, and Panel (b) plots the excess returns of put writing strategy 2, $r_{p2}$, versus $r_{p1}$. These panels also show the coefficients and t-statistics of a regression of $r_{p1}$ on $r_s$ and of $r_{p2}$ on $r_{p1}$. Panels (c) and (d) plot the residuals of these regressions, $\varepsilon_{p1}$ and $\varepsilon_{p2}$, versus the respective squared demeaned independent variable, $\tilde{r}_s^2$ and $\tilde{r}_{p1}^2$, and show the coefficients and t-statistics of the corresponding regressions. Their slopes are proportional to residual co-skewness, i.e., $\mathrm{Cov}(\varepsilon, \tilde{r}^2)$. The put writing strategies are constructed following Jurek and Stafford (2015) and use OptionMetrics data on S&P 500 index options from 1996 to 2014. $r_{p1}$ ($r_{p2}$) uses a leverage ratio of 2 (4) and shorts options with strikes approximately 7% (14%) below prevailing index levels.
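The two regressions behind Figure 1 can be illustrated with a minimal sketch. The data below are simulated, not the paper's put writing returns: the payoff coefficients, sample size, and function names are hypothetical and chosen only to produce a concave relation with the index.

```python
import random

def ols(y, x):
    """Univariate OLS of y on x; returns (intercept, slope, residuals)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    resid = [b - intercept - slope * a for a, b in zip(x, y)]
    return intercept, slope, resid

random.seed(0)
T = 5000
# Hypothetical monthly index excess returns (simulated, not the paper's data).
r_s = [random.gauss(0.006, 0.04) for _ in range(T)]
# Hypothetical concave strategy: linear in r_s minus a penalty on squared returns.
r_p = [0.005 + 0.5 * r - 2.0 * r ** 2 + random.gauss(0.0, 0.005) for r in r_s]

# Step 1: CAPM-style regression gives Jensen's alpha and the residuals.
alpha, beta, eps = ols(r_p, r_s)

# Step 2: regress the residuals on squared demeaned index returns.
mean_s = sum(r_s) / T
sq_dev = [(r - mean_s) ** 2 for r in r_s]
_, coskew_slope, _ = ols(eps, sq_dev)

print(f"alpha = {alpha:.5f}, residual co-skewness slope = {coskew_slope:.3f}")
```

In this simulated setting the strategy earns a positive alpha while the slope in the second regression is negative, which is the pattern described for put1.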
A common approach to capture non-linearities is to include factors with non-linear returns, such as option payoffs, next to the standard factors. However, this approach does not suffice in general because option-like payoffs can still generate a positive alpha at the expense of negative skewness. 3 To show this, it suffices to construct an alternative put writing strategy (henceforth, put2), which shorts further out-of-the-money puts, to again obtain an alpha with respect to that benchmark. This is conveyed in Figure 1 (b), which plots put2 against put1. The alpha with put1 as a factor is 0.23% per month (t-statistic of 3.42).
3 Notice that adding the squared benchmark return directly to the linear factor model, as in the quadratic market model, does not help for performance evaluation. Indeed, the slope on the squared market return is negative for strategies with negative residual co-skewness, and estimated alphas are then higher because average squared returns are positive.
Considering an investment opportunity set of benchmark assets versus a larger set of benchmark plus additional assets, we propose two tests for whether risk-averse mean-variance-skewness investors (henceforth, skewness investors) benefit from additional assets. Throughout the paper, we refer to one investor as a particular combination of preferences over mean, variance, and skewness, and to all investors as any combination of these preference intensities. First, a spanning test considers the hypothesis that no investor benefits from the additional assets versus the alternative that at least one investor benefits. It is the equivalent of the mean-variance spanning test in the mean-variance-skewness case. Second, an overlap test considers the hypothesis that at least one investor does not benefit versus the alternative that all investors benefit. This overlap concept is new and recognizes that, for a group of investors, the positive alpha of assets such as those depicted in Figure 1 is offset by the negative co-skewness with the benchmark. In a sense, the overlap test generalizes the mean-variance intersection test. Whereas the mean-variance intersection test considers the hypothesis that there is one mean-variance investor who does not benefit versus the alternative that all investors benefit, our overlap test considers the hypothesis that there is a group of mean-variance-skewness investors who do not benefit versus the same alternative.
We study the costs and benefits of option-like characteristics in returns with a large sample of live and dead hedge funds and mutual funds from Morningstar in the period from 1994 to 2014. Considering investors who currently invest in stocks and bonds, we obtain different results with our method compared with standard spanning tests: among the funds which benefit all mean-variance investors, 73% still do so for the Fama-French-Carhart four-factor model and the Fung-Hsieh eight-factor model, but only 15% improve the investment opportunity set of all skewness investors. Thus, the majority of the hedge funds that are attractive from a mean-variance point of view exhibit negative co-skewness with stocks and bonds, a trade-off that makes them unattractive for groups of mean-variance-skewness investors. More generally, all skewness investors improve their investment opportunity set with around 11% of the hedge funds but with less than 4% of the mutual funds. 4
To gain more economic understanding of our results, we analyze the relation between co-skewness and hedge funds' characteristics and strategies. Although hedge funds characterized by longer lock-up and advance notice periods have more negative co-skewness, we find that the type of strategy better explains the cross-sectional variation in co-skewness. Event and relative value funds mostly follow arbitrage strategies, and we find that these funds have more negative co-skewness. Global derivatives funds instead have desirable positive co-skewness and are also those most likely to be beneficial to all skewness investors. A detailed subsample analysis shows that rejection rates are highest among funds following systematic futures strategies. These strategies prosper when markets demonstrate sustained bullish or bearish trends, i.e., when market returns are large in absolute value, or equivalently when squared market returns are large. This generates positive co-skewness and explains their attractiveness for all skewness investors.
4 All rejection rates throughout the paper are at the 5% significance level.
We evaluate the economic significance of our results by calculating the utility gains of taking higher moments into account for the funds which significantly improve the investment opportunity set of all skewness investors. More specifically, we compare Jensen's alphas to alphas adjusted for skewness preference. This skewness adjustment is 0.35% per year on average for hedge funds and attains 1% on average in the highest co-skewness quintile. For mutual funds, it is 0.21% on average and attains 0.67% in the highest co-skewness quintile. Going beyond skewness, we find that ignoring higher moments has little impact. The differences between alphas adjusted for skewness and alphas adjusted for all moments are mostly below 0.10% per year, suggesting that our framework provides a good summary of the option-like characteristics in hedge and mutual fund returns.
Finally, because performance analysis always uses observed returns, most of our results are obtained from an in-sample analysis. In robustness checks, we study whether our test statistics are persistent. We find that hedge funds desirable for all skewness or all mean-variance investors also tend to be desirable out-of-sample. By contrast, mutual funds which have been desirable for both types of investors do not tend to be so out-of-sample. Thus, our results suggest that a significant subset of hedge funds add value to all investors, in and out-of-sample, and even to those concerned about skewness.
(Kosowski et al., 2007; Buraschi et al., 2014a). This research builds on the evidence that higher-order risk and tail risk help explain the cross-section of hedge fund returns (Agarwal et al., 2009; Kelly and Jiang, 2012; Hübner et al., 2015), which suggests that traditional performance metrics may not accurately measure the alpha of hedge funds.
5 See Hsieh (1997, 2001) and Buraschi et al. (2014b) for risk factors in hedge fund returns, and Patton and Ramadorai (2013) for time variation of risk exposures.
The most widely used method to assess hedge fund performance is a linear factor model whose factors, motivated by Glosten and Jagannathan (1994), include returns on option-based strategies to capture the option-like characteristics of hedge fund returns. We show that because these models effectively restrict additional assets to be a fixed linear combination of non-linear returns, they are unable to account for general forms of non-linearities. Our method instead jointly considers alpha and co-skewness and outperforms these models due to its increased flexibility.
Our methodology differs from approaches that allow for time-varying betas or, equivalently, managed portfolios to capture the non-linearities in hedge fund returns. For example, Ferson and Schadt (1996) and Patton and Ramadorai (2013) augment the set of benchmark assets with managed portfolios based on conditioning information to account for dynamic trading. Patton and Ramadorai (2013) find that alphas tend to be larger, on average, with time-varying exposure models. Time-varying exposure models and our approach are both motivated by non-linearities in returns, but our approach explicitly accounts for investors' preference for skewness by looking at both alpha and co-skewness. In fact, in our methodology the set of benchmark assets can be augmented with the managed portfolios suggested by Patton and Ramadorai (2013) to analyze whether skewness investors benefit from an investment.
Existing research on skewness has studied the portfolio choice problem with skewness. 6 In addition, Bali and Murray (2013) have introduced the concept of a skewness asset, which is a portfolio of options and the underlying stock designed to take a position in the risk-neutral skewness of an underlying. Bali and Murray document a strong negative relation between risk-neutral skewness and subsequent skewness asset returns. Bali and Murray (2013) and our paper share a similar motivation, skewness preference, but our focus is to propose a performance evaluation approach which allows for investors with skewness preference. In such a framework, a positive alpha and/or a positive co-skewness are valid reasons to add an asset to a portfolio. In the context of hedge funds, Bali et al. (2012) analyze the predictive power of variance, skewness, and kurtosis for returns. They find that variance is positively related to subsequent returns, and that skewness and kurtosis do not predict returns. We do not aspire to introduce a new predictor of hedge fund returns and conduct an out-of-sample analysis only in a robustness check to verify that investors can identify ex ante the funds which are desirable ex post from a skewness perspective.
6 An incomplete reference list is de Athayde and Flores (2004); Jondeau and Rockinger (2006); Mitton and Vorkink (2007); Guidolin and Timmermann (2008); Martellini and Ziemann (2010).
We provide simple tests for whether all skewness investors benefit from additional assets. The most closely related research in this direction proposes a parametric test for mean-variance-skewness spanning. Under its distributional assumptions, the efficient frontier can be generated by only three funds, which is not the case in general, and hence its spanning conditions are different from the conditions derived in this paper and do not nest the Huberman and Kandel (1987) tests as a special case. In addition, we contribute to this literature by introducing the overlap concept and test.
Existing research on portfolio allocation with hedge funds finds that hedge funds improve the mean-variance trade-off in a portfolio of stocks and bonds at the expense of lower skewness (see Amin and Kat, 2003; Davies et al., 2009). Our paper provides formal tests to analyze the cost of skewness for investors and finds that about 11% of the hedge funds provide both mean-variance and skewness benefits for stock and bond investors. In a related study, Almeida et al. (2018) propose a performance evaluation method based on families of non-linear discount factors and also find a lower performance of hedge funds. Our study differs in that we focus on the first three moments, consider a different set of benchmark assets, and use a spanning methodology which is closely linked to portfolio choice by construction. Finally, Back et al. (2018) analyze residual co-skewness in mutual fund returns, and find a trade-off between alpha and residual co-skewness across investment styles and for active funds. Their analysis motivates us to include mutual funds in our comparison sample.

Evaluating investments with option-like returns
To facilitate the exposition, we limit the discussion here and in the empirical section to the case of investing with a risk-free asset. The general case without a risk-free asset can be found in Chapter 3 of Karehnke (2014).
We start by stating the popular mean-variance spanning restrictions and the additional skewness restriction, and provide the formal derivation in the next subsection. Let the vectors of excess returns on the $k$ benchmark and $n$ test assets be denoted by $r_{x,t}$ and $r_{y,t}$, respectively. Mean-variance spanning tests use the intercepts of the multivariate regression

$$r_{y,t} = \alpha + B\, r_{x,t} + \varepsilon_t, \tag{1}$$

where $\varepsilon_t$ is the $n$-dimensional vector of residuals, $\alpha$ is the $n$-dimensional vector of intercepts, and $B$ is the $n \times k$-dimensional matrix of slope coefficients. In the standard setting of a multiple regression with simple returns and without a risk-free asset, mean-variance spanning implies that the intercepts are zero and the slope coefficients for each regression sum to one. With a risk-free asset, there are no constraints on portfolio weights and spanning only requires that the intercepts in (1) are zero. This result provides the rationale to advise investors to buy funds or stocks which have a positive alpha in a linear factor model. For benchmark and test assets with non-normally distributed returns and investors with preferences over higher moments, this result no longer applies. The next subsection shows that in a mean-variance-skewness framework additional conditions are imposed on the $n \times k^2$-dimensional co-skewness matrix of the residual $\varepsilon_t$ with the benchmark assets,

$$S_{\varepsilon xx} = E\left[\varepsilon_t \left(\tilde{r}_{x,t}' \otimes \tilde{r}_{x,t}'\right)\right], \tag{2}$$

with typical element $S_{\varepsilon_i x_l x_j} = E\left[\varepsilon_i \tilde{r}_{x_l} \tilde{r}_{x_j}\right]$, where $\tilde{r}$ are demeaned returns, $i = 1, \ldots, n$, and $l, j = 1, \ldots, k$. This result highlights that factor models augmented with a skewness factor representing the return on a low minus high co-skewness portfolio cannot account for an investor's skewness preference. In addition, the result that it is the residual co-skewness that matters can be traced back to the asset pricing study of Ingersoll (1975), although it is not explicitly named there. This residual co-skewness is identical to the numerator of Harvey and Siddique's (2000) empirically motivated co-skewness measure when the aggregate stock market is the only benchmark asset. More recent asset pricing derivations of residual co-skewness are in Back (2014). The theoretical contribution of the next subsection is to derive these results in a spanning and intersection setting with multiple benchmark and additional assets, and to introduce the new overlap case.

The theory
Consider the portfolio choice problem of skewness investors who can either invest in the risk-free security and the benchmark assets $r_x$, or in a larger universe, which additionally includes the test assets $r_y$. If the optimal portfolio of only one investor is the same with the benchmark assets only as with the benchmark and test assets, the mean-variance-skewness frontiers of $r_x$ and $(r_x, r_y)$ intersect (Huberman and Kandel, 1987). If the optimal portfolio of at least one investor is the same with the benchmark assets only as with the benchmark and test assets, the mean-variance-skewness frontiers of $r_x$ and $(r_x, r_y)$ overlap. If the optimal portfolio of $r_x$ and $(r_x, r_y)$ is the same for all skewness investors, the benchmark assets are said to span the test assets. In the following, we develop these concepts formally.
Let the $(k+n)$-vector of excess returns be denoted by $r \equiv [r_x' \; r_y']'$, where $'$ stands for transpose, and let the vector of expected excess returns and the matrix of covariances be denoted by $\mu$ and $\Sigma$, respectively. Bold letters denote vectors or matrices throughout the paper and, if it is not specified otherwise, vectors and matrices have the dimension $(k+n) \times 1$ and $(k+n) \times (k+n)$, respectively. In addition, we sometimes use the subscripts $x$ and $y$ to refer to the respective moments of benchmark and test assets (e.g., $\mu_y$ is the vector of the expected returns of the $n$ test assets and $\Sigma_{yx}$ is the $n \times k$ matrix of the covariances between test and benchmark assets). The $(k+n) \times (k+n)^2$ matrix of co-skewnesses is given by

$$S = E\left[\tilde{r}\left(\tilde{r}' \otimes \tilde{r}'\right)\right],$$

where $\otimes$ is the Kronecker product, $\tilde{r}$ denotes demeaned returns, and a typical element is $S_{r_i r_l r_j} = E\left(\tilde{r}_i \tilde{r}_l \tilde{r}_j\right)$ for $i, l, j = 1, \ldots, k+n$. 7
A skewness investor likes the mean and skewness of his or her portfolio returns and dislikes the variance. He or she chooses a portfolio $w$ in the $k+n$ assets to maximize the mean-variance-skewness objective

$$\max_{w} \; w'\mu - \frac{\gamma_1}{2}\, w'\Sigma w + \frac{\gamma_2}{3}\, w' S (w \otimes w), \tag{3}$$

where $\gamma_1$ and $\gamma_2$ are two positive scalars which measure the aversion to variance and the preference for skewness (relative to the preference for the mean). In an expected utility framework, $\gamma_1$ can be interpreted as the coefficient of risk aversion and $\gamma_2$ as the coefficient of downside risk aversion of Crainich and Eeckhoudt (2008), or equivalently as the product of risk aversion and prudence. Thus, in our framework, as in expected utility theory (see, e.g., Menezes et al., 1980; Crainich and Eeckhoudt, 2008), aversion to downside risk and preference for skewness are equivalent. Section 1 of the Technical Appendix derives these interpretations for $\gamma_1$ and $\gamma_2$ and discusses possible parameter values for popular utility functions.
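To make the role of $\gamma_1$ and $\gamma_2$ concrete, the following sketch solves the one-asset version of the problem. It assumes the objective $w\mu - \frac{\gamma_1}{2} w^2 \sigma^2 + \frac{\gamma_2}{3} w^3 s$, a normalization chosen here to be consistent with the second-order condition stated in footnote 8; the moment and preference values are hypothetical.

```python
import math

def mvs_utility(w, mu, var, skew3, gamma1, gamma2):
    """Mean-variance-skewness objective for a single risky asset (assumed form)."""
    return w * mu - 0.5 * gamma1 * var * w ** 2 + gamma2 / 3.0 * skew3 * w ** 3

def optimal_weight(mu, var, skew3, gamma1, gamma2):
    """Root of the first-order condition mu - gamma1*var*w + gamma2*skew3*w^2 = 0
    at which the second-order condition -gamma1*var + 2*gamma2*skew3*w < 0 holds."""
    if gamma2 == 0.0 or skew3 == 0.0:
        return mu / (gamma1 * var)  # mean-variance solution
    disc = (gamma1 * var) ** 2 - 4.0 * gamma2 * skew3 * mu
    if disc < 0.0:
        raise ValueError("first-order condition has no real root")
    # At this root the second derivative equals -sqrt(disc), so it is a local maximum.
    return (gamma1 * var - math.sqrt(disc)) / (2.0 * gamma2 * skew3)

# Hypothetical monthly moments: mean 0.6%, volatility 4%, negative third moment.
mu, var, skew3 = 0.006, 0.0016, -3.84e-5
gamma1, gamma2 = 4.0, 10.0  # hypothetical preference parameters

w_mv = optimal_weight(mu, var, skew3, gamma1, 0.0)      # ignores skewness
w_mvs = optimal_weight(mu, var, skew3, gamma1, gamma2)  # accounts for skewness
foc = mu - gamma1 * var * w_mvs + gamma2 * skew3 * w_mvs ** 2

print(f"mean-variance weight: {w_mv:.4f}, skewness-adjusted weight: {w_mvs:.4f}")
```

With a negative third moment, the skewness investor in this example holds less of the risky asset than the mean-variance investor with the same $\gamma_1$.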
Throughout the paper, we assume that the first-order conditions of the investors are necessary and sufficient, which is tantamount to assuming that the mean-variance-skewness investors are risk-averse. Mathematically, this is the case either if the second-order condition of (3) holds, or if we consider the first-order conditions of (3) to be a second-order approximation of the first-order condition of an investor with a concave utility function. 8 By solving the portfolio choice problem for a specific investor, i.e., a given pair $(\gamma_1, \gamma_2)$, and imposing the condition that he or she optimally invests only in the set of benchmark assets, we get the condition for mean-variance-skewness intersection for a specific investor in the next proposition. The proof of this proposition and all other proofs are in Section 2 of the Technical Appendix.
7 In statistics, skewness usually refers to the third standardized moment (i.e., the third moment divided by the cube of the standard deviation). Here, skewness refers to the third unstandardized moment in line with the portfolio choice literature.
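Proposition 1 The mean-variance-skewness frontiers of $r_x$ and $(r_x, r_y)$ intersect if

$$\mu_y - \Sigma_{yx}\Sigma_{xx}^{-1}\mu_x + \gamma_2 \left(S_{yxx} - \Sigma_{yx}\Sigma_{xx}^{-1} S_{xxx}\right)\left(w_x^* \otimes w_x^*\right) = 0 \tag{4}$$

holds for one particular pair of preference parameters $(\gamma_1, \gamma_2)$ and corresponding $w_x^*$.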
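A short derivation sketch may help to see where the intersection condition comes from. It assumes the objective $w'\mu - \frac{\gamma_1}{2} w'\Sigma w + \frac{\gamma_2}{3} w'S(w \otimes w)$, a normalization adopted here because it reproduces the second-order condition in footnote 8; the formal proof is in the Technical Appendix.

```latex
\begin{align*}
&\text{First-order condition:}
  && \mu - \gamma_1 \Sigma w + \gamma_2 S (w \otimes w) = 0. \\
&\text{Impose } w = (w_x^{*\prime},\, 0')':
  && \mu_x - \gamma_1 \Sigma_{xx} w_x^* + \gamma_2 S_{xxx} (w_x^* \otimes w_x^*) = 0, \\
& && \mu_y - \gamma_1 \Sigma_{yx} w_x^* + \gamma_2 S_{yxx} (w_x^* \otimes w_x^*) = 0. \\
&\text{Solve the first block:}
  && \gamma_1 w_x^* = \Sigma_{xx}^{-1}\left[\mu_x + \gamma_2 S_{xxx} (w_x^* \otimes w_x^*)\right]. \\
&\text{Substitute into the second:}
  && \mu_y - \Sigma_{yx}\Sigma_{xx}^{-1}\mu_x
     + \gamma_2 \left(S_{yxx} - \Sigma_{yx}\Sigma_{xx}^{-1} S_{xxx}\right)(w_x^* \otimes w_x^*) = 0.
\end{align*}
```

The first term is the vector of intercepts of the regression of test on benchmark asset returns, and the matrix in parentheses is the residual co-skewness, so the condition trades off alpha against residual co-skewness for the investor under consideration.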
Using (4) and noting that spanning means that any skewness investor only holds the benchmark assets, we obtain the conditions for spanning.
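Proposition 2 The mean-variance-skewness frontier of $r_x$ spans the frontier of $(r_x, r_y)$ if

$$\mu_y - \Sigma_{yx}\Sigma_{xx}^{-1}\mu_x = 0, \tag{5}$$

$$S_{yxx} - \Sigma_{yx}\Sigma_{xx}^{-1} S_{xxx} = 0. \tag{6}$$

Notice that our conditions for spanning and intersection nest the conditions for mean-variance spanning as a special case. We get the condition for mean-variance spanning, i.e., (5), if we set $\gamma_2$ in (4) to zero. In addition, to see that $S_{\varepsilon xx}$ in (2) contains the restriction in (6), observe that $S_{\varepsilon xx}$ can be rewritten as $E\left[\varepsilon\left(\tilde{r}_x' \otimes \tilde{r}_x'\right)\right] = S_{yxx} - \Sigma_{yx}\Sigma_{xx}^{-1} S_{xxx}$.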
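The rewriting of the residual co-skewness can be checked numerically. The sketch below does so for one test asset and one benchmark asset with simulated returns; the data-generating coefficients and helper names are hypothetical. With sample moments and OLS residuals the identity holds exactly up to floating-point error.

```python
import random

random.seed(1)
T = 2000
# Simulated returns (hypothetical): the test asset is concave in the benchmark.
r_x = [random.gauss(0.006, 0.04) for _ in range(T)]
r_y = [0.002 + 0.8 * x - 1.5 * x ** 2 + random.gauss(0.0, 0.01) for x in r_x]

def demean(values):
    m = sum(values) / len(values)
    return [v - m for v in values]

def moment(*series):
    """Sample mean of the element-wise product of the given series."""
    total = 0.0
    for vals in zip(*series):
        prod = 1.0
        for v in vals:
            prod *= v
        total += prod
    return total / len(series[0])

x_t, y_t = demean(r_x), demean(r_y)
sigma_xx = moment(x_t, x_t)
sigma_yx = moment(y_t, x_t)
beta = sigma_yx / sigma_xx
eps = [y - beta * x for x, y in zip(x_t, y_t)]  # OLS residual, mean zero

s_eps_xx = moment(eps, x_t, x_t)                # residual co-skewness
s_yxx = moment(y_t, x_t, x_t)
s_xxx = moment(x_t, x_t, x_t)
rhs = s_yxx - sigma_yx / sigma_xx * s_xxx       # right-hand side of the identity

print(f"direct: {s_eps_xx:.3e}, via co-skewness moments: {rhs:.3e}")
```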
Extensions of mean-variance intersection and spanning tests that take into account short-sale constraints and transaction costs can be adapted to the mean-variance-skewness case. In the following, we present the extension to short-sale constraints. The spanning restrictions (5) and (6) then have to hold for each subset of the benchmark assets on which the short-sale constraints are simultaneously not binding. Let these subsets be denoted by $x_j$, for $j = 1, 2, \ldots, M$. Using this notation, we can state the following proposition.
8 This is illustrated in Section 1 of the Technical Appendix, and the global second-order conditions require that $-\gamma_1 \Sigma + 2\gamma_2 S (w \otimes I)$, where $I$ is a $(k+n) \times (k+n)$ identity matrix, is negative semidefinite for all relevant $w$. This assumption is a necessary working assumption implicitly made also by papers studying the pricing of skewness. It rules out the extreme case in which investors wish to invest an infinite amount in the risky asset(s) due to convex (risk-seeking) utility. For our overlap test and spanning with short-sales constraints, this condition is always satisfied when the elements of $S$ are non-positive.
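Proposition 3 The mean-variance-skewness frontier of $r_x$ spans the frontier of $(r_x, r_y)$ without short sales on benchmark and test assets if

$$\mu_y - \Sigma_{y x_j}\Sigma_{x_j x_j}^{-1}\mu_{x_j} \le 0, \tag{7}$$

$$S_{y x_j x_j} - \Sigma_{y x_j}\Sigma_{x_j x_j}^{-1} S_{x_j x_j x_j} \le 0, \tag{8}$$

for $j = 1, \ldots, M$, where the inequalities apply element-wise.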
Because skewness spanning contains the conditions for mean-variance spanning as a special case, skewness spanning may not be satisfied even though test assets deteriorate portfolio skewness and some investors prefer to hold only the benchmark assets. To detect this situation, we introduce a new concept, specific to our framework, and label it overlap. In the context of no short sales, it asks whether mean-variance-skewness frontiers overlap, and no overlap means that test assets provide diversification benefits for all skewness investors.
Corollary 1 The mean-variance-skewness frontiers of $r_x$ and $(r_x, r_y)$ overlap without short sales on benchmark and test assets if at least one element of the left-hand sides of (7) and (8) is non-positive.
Our framework is closely linked to portfolio choice, and portfolio weights can be expressed as a function of alpha and residual co-skewness. This is shown in the next corollary for $n = 1$ and $k = 1$. The general case and the portfolio weights in the benchmark asset are in the proof of the corollary in the Technical Appendix.
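Corollary 2 Suppose that there is one test asset and one benchmark asset. The portfolio weight in the test asset, $w_y$, is implied by

$$\alpha - \gamma_1 \Sigma_{\varepsilon\varepsilon} w_y + \gamma_2\left(S_{\varepsilon xx} w_x^2 + 2 S_{\varepsilon xy} w_x w_y + S_{\varepsilon yy} w_y^2\right) = 0,$$

where $\alpha$ and $\varepsilon$ are the intercept and the residual in (1), $w_x$ is the portfolio weight in the benchmark asset, $\mathrm{Var}(\varepsilon) = \Sigma_{\varepsilon\varepsilon}$, $\mathrm{Cov}(\varepsilon, \tilde{r}_x \tilde{r}_y) = S_{\varepsilon xy}$, and $S_{\varepsilon xx}$ and $S_{\varepsilon yy}$ are defined analogously.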

Tests
We first outline the construction of Wald tests for spanning, 9 and then introduce a bootstrap test for overlap. The Technical Appendix contains further details on the implementation of the tests, an analysis showing that the tests have good size and power properties, and a performance analysis of simulated option strategies.

Spanning
The spanning test with short sales is based on Wald tests with equality constraints. Let $h$ denote a column vector which contains the $n + nk(k+1)/2$ restrictions in (5)-(6), and $\mathrm{Var}[h]$ its variance. As explained in detail in Section 3.1 of the Technical Appendix, we obtain the sample equivalent of $h$, denoted by $\hat{h}$, and its estimated covariance matrix $\mathrm{Var}[\hat{h}]$ from multivariate regressions. The Wald statistic for the null hypothesis $h = 0$ is

$$\hat{h}'\, \mathrm{Var}[\hat{h}]^{-1}\, \hat{h}.$$

Under the null hypothesis and standard regularity assumptions, i.e., that the returns on test and benchmark assets are stationary and ergodic, the Wald statistic has a $\chi^2$ limiting distribution with $n + nk(k+1)/2$ degrees of freedom (the dimension of the column vector $h$).
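As an illustration of the mechanics, the sketch below evaluates a Wald statistic for the smallest case, one test asset and one benchmark asset, where $h$ has two elements (the intercept and one residual co-skewness). The numerical inputs are hypothetical, and the estimation of $\hat{h}$ and its covariance matrix, which the Technical Appendix describes, is not reproduced here.

```python
import math

def n_restrictions(n, k):
    """Number of spanning restrictions: n intercepts plus n*k*(k+1)/2 co-skewnesses."""
    return n + n * k * (k + 1) // 2

def wald_2d(h, V):
    """Quadratic form h' V^{-1} h for two restrictions and a 2x2 covariance matrix V."""
    (a, b), (c, d) = V
    det = a * d - b * c
    if det <= 0.0:
        raise ValueError("covariance matrix must be positive definite")
    # V^{-1} = (1/det) * [[d, -b], [-c, a]]
    return (h[0] * (d * h[0] - b * h[1]) + h[1] * (-c * h[0] + a * h[1])) / det

def chi2_sf_2df(x):
    """Survival function of a chi-square variable with 2 degrees of freedom."""
    return math.exp(-x / 2.0)

# Hypothetical estimates (in arbitrary units) for n = k = 1.
h_hat = [1.0, 2.0]
V_hat = [[1.0, 0.0], [0.0, 4.0]]

stat = wald_2d(h_hat, V_hat)
p_value = chi2_sf_2df(stat)
print(f"restrictions: {n_restrictions(1, 1)}, Wald = {stat:.2f}, p = {p_value:.3f}")
```

The closed-form tail probability is specific to two degrees of freedom; for other dimensions a general chi-square distribution function would be needed.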
To test for spanning without short sales, we use Wald tests with inequality constraints (Gourieroux, Holly, and Monfort, 1982; Kodde and Palm, 1986). Let $h^s$ denote the vector which contains the inequality conditions in (7)-(8), and $\mathrm{Var}[h^s]$ its variance. The Wald statistic for the null hypothesis $h^s \le 0$ is

$$\mathrm{Span}^{MVS}_x = \min_{h \le 0}\, (\hat{h}^s - h)'\, \mathrm{Var}[\hat{h}^s]^{-1}\, (\hat{h}^s - h).$$

Under the null hypothesis and standard regularity conditions, the probability of $\mathrm{Span}^{MVS}_x$ exceeding a certain value is a weighted sum of $\chi^2$ tail probabilities (see Kodde and Palm, 1986), where $\chi^2_0$ has unit mass at zero, $d$ is the number of elements in the vector $h^s$, and the weight $\omega(d, i, \mathrm{Var}[\hat{h}^s])$ is the probability that $i$ of the $d$ elements of a vector with a $N(0_d, \mathrm{Var}[\hat{h}^s])$ distribution are strictly negative. Following Gourieroux, Holly, and Monfort (1982) (2010), the researcher seeks to prove or disprove the alternative hypothesis.
Our null hypothesis is that at least one element of $h^s$ is non-positive, and the alternative is $h^s > 0$, which can be rewritten as $\min_{i=1,\ldots,d} h^s_i > 0$, where, as before, $d$ is the number of elements in $h^s$. 11 12 The test statistic is the sample analogue of this minimum, $\min_{i=1,\ldots,d} \hat{h}^s_i$. Critical values for this test statistic are not known, and we obtain them with a bootstrap procedure. We obtain $B$ bootstrapped samples from the original sample using a randomized block bootstrap with replacement, and calculate $\hat{h}^b$ with the bootstrapped return series. The test statistic is recomputed for each bootstrapped series, and the p-value of the test is obtained from the bootstrap distribution of these statistics. To eliminate the impact of cross-sectional heteroscedasticity in test asset returns and different standard errors in alphas and residual co-skewnesses, we follow Patton and Timmermann (2010) and use the studentized version of this bootstrap.
10 We have verified that 100,000 draws are sufficient to obtain accurate weights.
11 The vector $h^s$ can be augmented with elements of the skewness matrix (with opposite sign) to test whether the second-order conditions are globally satisfied.
12 Notice that the alternative in the spanning test with short-sales constraints would correspond to $\max_{i=1,\ldots,d} h^s_i > 0$ in this framework.
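The bootstrap logic can be sketched for one test asset and one benchmark asset, where $h^s$ consists of the intercept and the residual co-skewness. Several choices below are assumptions rather than the paper's implementation: the randomized block bootstrap is implemented as a stationary bootstrap with a hypothetical mean block length, the bootstrap statistics are recentred at the sample estimates, and studentization uses bootstrap standard deviations. The returns are simulated.

```python
import random
import statistics

def estimate_h(r_y, r_x):
    """Return [alpha, residual co-skewness] for one test and one benchmark asset."""
    T = len(r_x)
    mx, my = sum(r_x) / T, sum(r_y) / T
    xd = [x - mx for x in r_x]
    beta = sum(a * (y - my) for a, y in zip(xd, r_y)) / sum(a * a for a in xd)
    alpha = my - beta * mx
    eps = [y - alpha - beta * x for x, y in zip(r_x, r_y)]
    coskew = sum(e * a * a for e, a in zip(eps, xd)) / T
    return [alpha, coskew]

def stationary_indices(T, mean_block, rng):
    """Time indices for one stationary-bootstrap sample (random block lengths)."""
    idx = [rng.randrange(T)]
    while len(idx) < T:
        if rng.random() < 1.0 / mean_block:
            idx.append(rng.randrange(T))   # start a new block
        else:
            idx.append((idx[-1] + 1) % T)  # continue the current block
    return idx

def overlap_pvalue(r_y, r_x, B=200, mean_block=6, seed=0):
    """Bootstrap p-value for H0: min_i h_i <= 0 against H1: min_i h_i > 0."""
    rng = random.Random(seed)
    T = len(r_x)
    h_hat = estimate_h(r_y, r_x)
    draws = []
    for _ in range(B):
        idx = stationary_indices(T, mean_block, rng)
        draws.append(estimate_h([r_y[i] for i in idx], [r_x[i] for i in idx]))
    d = len(h_hat)
    # Studentize each element by its bootstrap standard deviation (assumption).
    se = [statistics.stdev([draw[i] for draw in draws]) for i in range(d)]
    stat = min(h_hat[i] / se[i] for i in range(d))
    # Recentre bootstrap estimates at the sample estimates (assumption).
    boot = [min((draw[i] - h_hat[i]) / se[i] for i in range(d)) for draw in draws]
    return sum(b > stat for b in boot) / B

# Simulated asset with positive alpha and positive residual co-skewness.
rng = random.Random(42)
r_x = [rng.gauss(0.006, 0.04) for _ in range(1200)]
r_y = [0.01 + 0.3 * x + 5.0 * (x - 0.006) ** 2 + rng.gauss(0.0, 0.005) for x in r_x]
p = overlap_pvalue(r_y, r_x)
print(f"overlap test p-value: {p:.3f}")
```

A small p-value rejects overlap, that is, it indicates that both alpha and residual co-skewness are positive, which is the case in which all skewness investors benefit in this two-asset setting.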

Empirical results

Data
Our analysis takes the perspective of investors who can initially invest in bonds, stocks, and a risk-free asset. 13 The proxies for stocks and bonds are the S&P 500 total return index from Morningstar (henceforth, stocks) and the 10-year US Treasury bond index from CRSP (henceforth, bonds). The 30-day T-bill index from CRSP is used as a proxy for the risk-free rate. The hedge funds are live and dead funds with USD as base currency. We start with all funds available in the database and then apply the following standard filters. First, we delete the first 12 months of return data for each fund to mitigate the instant history or back-fill bias. Second, each fund is required to have at least 36 months of return data available to have enough data at the individual fund level for our analysis (Patton and Ramadorai, 2013). Finally, to avoid having very similar investments multiple times, we keep the fund with the longest return history for each strategy identifier. Applying these filters reduces our sample from 9,717 to 4,753 funds (3,021 liquidated, 75 merged, and 1,657 live). The data range from January 1993 to December 2014 but effectively start in January 1994 after the first twelve months of returns for each fund have been deleted. We use unsmoothed hedge fund returns, i.e., returns adjusted such that their first-order autocorrelation is zero (Getmansky et al., 2004). Adjusting for smoothing hardly changes the mean and unstandardized skewness of returns but increases the average volatility from 3.61% to 4.28% per month in our dataset. Hence, using unsmoothed returns makes our results more conservative, and we verify in the robustness section that our results are robust to using raw returns.
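The three sample filters can be sketched as follows. The record layout (keys `fund_id`, `strategy_id`, `returns`) is hypothetical, and the sketch assumes that the 36-month requirement is applied after the first 12 months have been dropped, which the text does not state explicitly.

```python
def apply_filters(funds, burn_in=12, min_months=36):
    """Apply the back-fill, minimum-history, and duplicate filters.

    funds: list of dicts with hypothetical keys 'fund_id', 'strategy_id',
    and 'returns' (monthly returns in chronological order).
    """
    best = {}
    for fund in funds:
        kept = fund["returns"][burn_in:]       # drop the first 12 months
        if len(kept) < min_months:             # require at least 36 months
            continue
        current = best.get(fund["strategy_id"])
        if current is None or len(kept) > len(current["returns"]):
            # keep the longest return history per strategy identifier
            best[fund["strategy_id"]] = {**fund, "returns": kept}
    return list(best.values())

# Toy example with made-up funds and constant returns.
sample = [
    {"fund_id": "A", "strategy_id": "s1", "returns": [0.01] * 60},
    {"fund_id": "B", "strategy_id": "s1", "returns": [0.01] * 80},
    {"fund_id": "C", "strategy_id": "s2", "returns": [0.01] * 40},
]
filtered = apply_filters(sample)
print([(f["fund_id"], len(f["returns"])) for f in filtered])
```

In the toy example, fund C is dropped for having too short a history after the burn-in period, and fund B is kept over fund A because it has the longer history under the same strategy identifier.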
Morningstar classifies hedge funds in six broad categories: directional debt, directional equity, event, global derivatives, multistrategy, and relative value. Throughout the analysis, we essentially keep the Morningstar classification except that we assign funds of funds, which are included in the multistrategy category by Morningstar, to a distinct fund of funds category. Funds without any Morningstar classification are assigned to an additional category labeled other. To get an idea of the performance of the different categories, we analyze net-asset-value-weighted portfolios of hedge funds in each category in addition to individual funds.
Finally, we also consider the two strategies writing S&P 500 index put options proposed by Jurek and Stafford (2015) and mentioned in the introduction. These option strategies are designed to satisfy exchange margin requirements and to account for bid-ask spreads, and have been shown to successfully replicate aggregate hedge fund performance. We construct the returns on these strategies with OptionMetrics data from January 1996 to December 2014. At the end of each month, these strategies buy puts with an expiration as close as possible to but longer than one month, and sell the previously bought puts. Put1 thereby uses a leverage ratio of 2 and shorts options with strikes on average approximately 7% below prevailing index levels. Put2 uses a leverage ratio of 4 and strikes approximately 14% below prevailing index levels. 14
The descriptive statistics of the returns on benchmark assets, option strategies, and portfolios of hedge funds are reported in Panel A of Table 1. Panels B to D contain the percentiles of the cross-sectional distribution of the summary statistics of returns and characteristics of individual funds.
Panel A shows that the average monthly excess returns on stocks and bonds are 0.63% and 0.28% and their standard deviations are 4.32% and 2.03%, respectively, over the sample period. The option strategies seem to almost dominate bonds and stocks based on the first two moments but have more negative skewness and higher excess kurtosis. Portfolios invested in directional debt, event, and multistrategy have a similar average excess return and a lower standard deviation than stocks and are therefore likely to provide a better mean-variance trade-off than stocks. Portfolios invested in multistrategy and relative value have a higher average excess return and a lower standard deviation than bonds. Panel A also shows that all hedge fund strategies have significantly skewed returns. The critical value at a 5% significance level for the null hypothesis that skewness is zero for a sample size of 252 observations is approximately 0.30 under the normal distribution, and the skewness of every strategy exceeds that threshold in absolute value. The strategies also exhibit significant kurtosis because the critical value for the null of kurtosis equal to three is approximately 3.60 under the normal distribution. Although kurtosis is statistically significant, we will show in Section 4.4.3 that the effect of kurtosis on utility is economically small. In terms of the number of funds in each category, the directional equity category contains the largest number of funds, followed by the fund of funds category. All categories contain more than 200 funds.
14 These strikes are selected based on Z-scores at each rebalancing date to avoid time-varying systematic risk exposures. We refer to Jurek and Stafford (2015) for more details on the construction and advantage of these strategies.
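The two critical values quoted above can be reproduced with a short calculation. The sketch assumes the usual large-sample standard errors under normality, $\sqrt{6/T}$ for skewness and $\sqrt{24/T}$ for kurtosis, and a two-sided test; whether the paper uses exactly these approximations is an assumption.

```python
import math
from statistics import NormalDist

def normal_critical_values(T, level=0.05):
    """Two-sided critical values for sample skewness and kurtosis under normality,
    based on the asymptotic standard errors sqrt(6/T) and sqrt(24/T)."""
    z = NormalDist().inv_cdf(1.0 - level / 2.0)
    skew_cv = z * math.sqrt(6.0 / T)
    kurt_cv = 3.0 + z * math.sqrt(24.0 / T)
    return skew_cv, kurt_cv

skew_cv, kurt_cv = normal_critical_values(252)
print(f"skewness: {skew_cv:.2f}, kurtosis: {kurt_cv:.2f}")
```

For 252 monthly observations, these formulas give values close to the 0.30 and 3.60 reported in the text.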
Portfolios of funds understate the standard deviation of hedge fund returns since the average standard deviation is 4.27% in Panel B, which is more than twice as large as the standard deviation of the portfolios of hedge funds. Individual hedge funds have slightly lower average excess returns than mutual funds but also lower standard deviations. The dispersion in the cross-sectional distribution of skewness of individual hedge fund returns is larger than the dispersion in skewness of mutual funds, and the magnitude of mutual fund skewness is very similar to that of the market, suggesting that skewness is essentially spanned by the market. Finally, our hedge funds have typical hedge fund features such as few assets under management, substantial management and performance fees, withdrawal restrictions, and high minimum investment requirements.

Portfolios of hedge funds and option returns
The summary statistics suggest that some portfolios of hedge funds may provide substantial diversification benefits to bond and stock investors. Formal tests of this hypothesis are reported in Table 2. We consider in columns 1-4 mean-variance and skewness spanning, denoted by the superscripts $MV$ and $MVS$, respectively. The mean-variance tests are included because these are the standard tests used in the literature in the form of hypothesis tests on alphas in performance regressions, and they can be compared to our tests to determine whether alpha or co-skewness is driving a rejection. The last two columns of the table contain the results of the overlap tests. To understand which benchmark asset drives the rejection, we also test for overlap with only stocks as the benchmark, denoted by the subscript $s$.
Our tests suggest that there is strong evidence against spanning with short sales, and this evidence is slightly weaker without short sales. Spanning is then rejected for the option strategies, event, global derivatives, multistrategy, and relative value. 15 Notice that the benefits of the portfolio of directional debt in terms of skewness can only be achieved with short positions, and it is important to take short sales into account when testing for skewness spanning.
For most hedge fund strategies, skewness spanning with short-sale constraints, $\mathrm{Span}^{MVS}$, is rejected despite a negative residual co-skewness. As shown by $\mathrm{Span}^{MV}$, this happens because a subset of the skewness investors wish to invest in the hedge funds. More precisely, once alpha [...], suggesting that the tail risk of the portfolio of hedge funds is coming from their exposure to stocks. As we show in the next section, this is not always the case for individual funds; it is likely a by-product of diversification across several hedge funds.
15 Throughout the paper significant means significant at the 5% level.
The bottom line is that only portfolios of global derivatives seem to offer benefits for all skewness investors. As shown in the descriptive statistics, however, portfolios have a lower standard deviation than individual returns and may therefore overstate the diversification benefits. In addition, most hedge funds have very high minimum investment requirements (see Panel D in Table 1), making it likely that some investors only choose to hold a single hedge fund within their portfolio of bonds and stocks. These issues motivate the following analysis of the diversification benefits of individual hedge funds.

Individual funds
Investors initially invest in stocks, or in stocks and bonds, and face short-sale constraints. We compare the results of our tests to the mean-variance spanning case and to three factor models: the Fama-French-Carhart four-factor model, 16 the Fung and Hsieh (2001) eight-factor model, 17 and a two-factor model with put1 and the return on stocks. Given that the analysis relies on rejection rates, we also carry out the analysis on the mutual fund sample to have a basis for comparison. All results are reported in Table 3.
16 The [...] Fama and French (1993); these factors are obtained from: http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/index.html.
17 The factors are the market, size, bond, credit spread, emerging market, and look-back straddles tracking bond, currency, and commodity trend-following returns; see https://faculty.fuqua.duke.edu/~dah7/HFRFData.htm for further information, and http://faculty.fuqua.duke.edu/~dah7/DataLibrary/TF-FAC.xls for the data.
Table 3. [...] The investment opportunity set in rows (1)-(3) consists of only stocks, indicated by $s$, and of stocks and bonds, indicated by $s,b$, in rows (4)-(6). $\mathrm{Span}^{MVS}$ ($\mathrm{Span}^{MV}$) refers to the corresponding mean-variance-skewness (mean-variance) spanning tests without short sales, and $\mathrm{Over}^{MVS}$ is the mean-variance-skewness overlap test. We compare these spanning tests to the rejection of the null of a non-positive alpha in the Fama-French-Carhart model in row (7) FF4, the Fung-Hsieh factor model in row (8) FH, and a two-factor model with put1 and the market return in row (9) [...] test is rejected. The columns 'most similar' and 'least similar' report in descending order the three tests which would be considered as such based on the conditional rejection frequencies.
All tests use [...]. This is perhaps unsurprising given that these tests take into account an additional performance attribute, co-skewness, in addition to alpha, and require funds to be attractive in terms of both alpha and co-skewness. When compared to each other, these tests can still lead to different conclusions because funds' co-skewnesses with bonds and the product of stocks and bonds, which enter the calculation of $\mathrm{Over}$ [...] suggests that less than 4% of the mutual funds reliably improve the investment opportunity set of all skewness investors. Given our size analysis in Section 4 of the Technical Appendix, if all mutual fund returns were generated by the same underlying no-skill distribution (after fees), this number would equal 5% by luck alone. The lower number then suggests a data-generating process in which funds have negative skill (measured by negative co-skewness and/or alpha). More generally, the rejection rates for all tests are low, in the range of 2.5% to 9.2%. Perhaps these low rejection rates are unsurprising given that, due to equilibrium accounting, the alphas before fees (and the residual co-skewness) of all investment strategies need to sum to zero, and thus the net-of-fee alphas of most mutual funds should be, and are generally found to be, negative (Fama and French, 2010). [...] Back et al. (2018). In sum, the results suggest that (co-)skewness matters for mutual funds as well.
By explicitly comparing different tests including the Fung-Hsieh eight-factor model, an option-based factor model, and simple mean-variance spanning tests, [...] Table 4 show that the rejection rates of spanning and overlap increase with skewness, and the hedge funds in the top skewness group are most likely to improve the investment opportunity set of all skewness investors. For mutual funds, the pattern is less clear-cut and the rejection rates of spanning and overlap for the funds in the highest quintile are lower than in the fourth quintile. A possible explanation is that funds in the high skewness quintile have an inferior performance because they have a lower skill in risk management and invest in lottery-type stocks that help attract investor flows, especially from retail investors. 19
The table also analyses spanning and overlap for funds classified by their market and macro-timing ability. For market timing ability, we follow Henriksson and Merton (1981) and run a regression of fund returns on an intercept, excess market returns, and the market-timing variable equal to the excess market return when it is positive and zero otherwise. 20 Successful market timing involves an increase in exposure prior to a market rise, and translates into a positive and significant coefficient on the timing variable. Macro-timing ability is estimated following Bali et al. (2014) and using their macroeconomic risk index. 21 Specifically, we run a regression of fund returns on an intercept, the macroeconomic risk index, and the macro-timing variable equal to the macroeconomic risk index when it is above its median and zero otherwise. A positive and significant coefficient on the timing variable indicates macro-timing ability. The right-hand side of Table 4 reports the rejection rates separately for funds with negative, positive, and insignificant market and macro-timing ability. The results suggest a positive relationship between spanning/overlap and timing ability.
19 We thank the associate editor for suggesting this interpretation.
20 An alternative way to assess market timing is to follow Treynor and Mazuy (1966) and use the quadratic market model. Here, market timing is measured by the coefficient on the squared market return, which is almost perfectly correlated with co-skewness and would obfuscate the distinction between co-skewness and market timing.
21 The macroeconomic risk index is available on Turan Bali's website: http://faculty.msb.edu/tgb27/workingpapers.html. We thank him for making his data available.
Table 4. [...] Funds are assigned to bins based either on their skewness (quintiles Q1 to Q5, header: skewness) or on their timing ability (header: timing). Following Henriksson and Merton (1981), market-timing ability is estimated as the coefficient on a variable equal to the maximum of zero and excess market returns in a multiple regression with the fund excess returns as the dependent variable and the excess market return as the control variable. Following Bali et al. (2014) and using their uncertainty index, we estimate a similar regression in which the excess market return is replaced by the uncertainty index, and macro-timing ability is measured by the coefficient on a variable equal to the uncertainty index when the index is above its time-series median and zero otherwise. We assign funds into bins depending on whether the slope coefficient of the timing variable is significantly negative (neg), positive (pos), or insignificant (no). The table reports the number of funds in each bin (#funds), and the rejection rates for the tests outlined in the caption of Table 3.
All skewness investors are most likely to benefit from hedge funds with positive market timing ability and positive macro-timing ability. The rejection rates for mutual funds also suggest that all skewness investors are most likely to benefit from mutual funds with positive market timing ability, although the total fraction of funds with positive market timing is only half as large. In addition, only very few mutual funds (21 out of 3905) have macro-timing ability. As a consequence, these rejection rates should be interpreted with caution.
Taken together, the results show that hedge funds are better able to actively vary their exposure to market and macroeconomic risk in an advantageous way than mutual funds. As a result they generate both positive alpha and desirable co-skewness, and are more likely to be attractive to all skewness investors.
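To make the market-timing regression of Henriksson and Merton (1981) used in this section concrete, the following is a minimal sketch on simulated data. The function name, the simulated parameter values, and the use of conventional (homoskedastic) OLS t-statistics are our assumptions for illustration; they are not the standard errors used in the paper.

```python
import numpy as np

def henriksson_merton(fund_excess, market_excess):
    """OLS of fund excess returns on [1, r_m, max(r_m, 0)].

    Returns the coefficients (intercept, market beta, timing coefficient)
    and conventional OLS t-statistics.
    """
    y = np.asarray(fund_excess, dtype=float)
    rm = np.asarray(market_excess, dtype=float)
    X = np.column_stack([np.ones_like(rm), rm, np.maximum(rm, 0.0)])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    dof = len(y) - X.shape[1]
    sigma2 = resid @ resid / dof
    cov = sigma2 * np.linalg.inv(X.T @ X)
    t_stats = coef / np.sqrt(np.diag(cov))
    return coef, t_stats

# Simulated fund with genuine timing ability (all numbers hypothetical).
rng = np.random.default_rng(0)
rm = rng.normal(0.005, 0.04, size=240)
fund = 0.001 + 0.5 * rm + 0.4 * np.maximum(rm, 0.0) + rng.normal(0.0, 0.005, size=240)
coef, t_stats = henriksson_merton(fund, rm)
```

A fund would be placed in the "pos" bin when the third t-statistic is significantly positive, in the "neg" bin when it is significantly negative, and in the "no" bin otherwise. The macro-timing regression has the same structure, with the market return replaced by the macroeconomic index and the timing variable defined relative to the index median.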

Relation to fund characteristics and investment strategies
To get more insight into how our tests are related to hedge fund characteristics and investment strategies, we run cross-sectional regressions of the intercept of the mean-variance regression in $\mathrm{Span}^{MV}_{s,b}$ on fund characteristics and dummies for the investment strategies. 22 We also run these regressions for each component of residual co-skewness: the co-skewness with stocks, with bonds, and the interaction of stocks and bonds. Table 5 reports the results.
Simple alphas are positively related to management and performance fees. This upholds the intuition that better funds are able to charge higher management fees and that performance fees better align the incentives of managers and investors. In addition, simple alphas are positively related to minimum investment requirements. To the extent that high minimum investment requirements make an investment more illiquid, this positive relation may reflect that investor capital is competitively supplied to the hedge fund industry (Aragon, 2007). For residual co-skewness with stocks, there is no robust relationship with compensation, but a significant negative relation with the lockup and advance notice period, a proxy for managerial discretion. This can be explained if funds choose their capital structure and investment strategies jointly in the vein of Hombert and Thesmar (2014): funds focusing on investments with lower residual co-skewness may optimally choose more managerial discretion to avoid liquidating their positions after large temporary losses.
22 The dependent variable in these regressions is estimated, which can introduce a bias if the measurement error is correlated with the regressors. In our setting as in, e.g., Aragon (2007) or Teo (2011), this is unlikely to be the case because the regressors are all predetermined.
Table 5. [...] Each column reports a regression of the indicated dependent variable depvar on hedge fund characteristics. Residual co-skewness with stocks (bonds) is denoted by $s^2$ ($b^2$), and the cross residual co-skewness is denoted by $s \times b$. The independent variables are (intercepts not tabulated) the management fee in percent (mfee), the performance fee in percent (pfee), a dummy equal to 1 if the fund has a high watermark provision ([...]
There is no robust relationship between fund characteristics and the co-skewness with bonds and with the product of stocks and bonds.
When comparing the explanatory power of characteristics and investment strategies, characteristics better explain cross-sectional variations in alpha but investment strategies have more explanatory power for the co-skewnesses. As suggested by the portfolio analysis in Section 4.2 and compared to other funds, global derivatives funds have positive co-skewnesses with the benchmark assets, which are significant for the co-skewnesses with stocks and the product of stocks and bonds. Mitchell and Pulvino (2001) show that risk arbitrage hedge funds have returns that are similar to those obtained from short-selling index put options. 23 In the Morningstar classification, event and relative value funds mostly follow arbitrage strategies, and the table suggests that these funds have more negative co-skewness (although only at the 10% significance level for event funds). Finally, directional equity funds have a larger co-skewness with bonds.
In Table TA.3 of the Technical Appendix, we analyze the relation of investment strategies with residual co-skewness further with rejection rates in Morningstar subcategories (see [...], for a description of these categories). This analysis shows that in the four global derivatives subcategories (currency, global macro, systematic futures, and volatility) the desirable positive co-skewness is most pronounced in the systematic futures category. The rejection rates within the fund of funds subcategories confirm this result because the rejection rates are the highest for the macro/systematic category and of similar magnitude to those in the systematic futures category. Taken together, this evidence suggests that the systematic futures strategy, which involves trend-following strategies in liquid global futures, options, and foreign-exchange contracts, is most appropriate for all skewness investors. This makes intuitive sense because these strategies prosper when markets demonstrate sustained bullish or bearish trends or, in other words, when monthly market returns are large in absolute value. This generates a positive relation between the excess returns on the strategy and squared market returns.
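The link between payoffs that are large when the market moves strongly in either direction and a positive loading on squared market returns can be illustrated with a stylized simulation. The payoff below (proportional to the absolute market return, minus a fixed cost) is a hypothetical stand-in for a trend-following strategy and is not calibrated to any fund in our sample.

```python
import numpy as np

def squared_market_slope(strategy_excess, market_excess):
    """Coefficient on r_m^2 in an OLS of strategy returns on [1, r_m, r_m^2]."""
    y = np.asarray(strategy_excess, dtype=float)
    rm = np.asarray(market_excess, dtype=float)
    X = np.column_stack([np.ones_like(rm), rm, rm**2])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[2]

# Stylized, simulated data (all numbers hypothetical): the strategy earns more
# when the absolute market move is large and pays a fixed cost otherwise.
rng = np.random.default_rng(1)
market = rng.normal(0.0, 0.04, size=600)
strategy = 0.5 * np.abs(market) - 0.015 + rng.normal(0.0, 0.01, size=600)
slope = squared_market_slope(strategy, market)
```

In this simulation the slope on the squared market return is positive, which is the sign associated with desirable co-skewness in the discussion above.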

Magnitude of utility gains
Having shown in Section 4.3 that 11% of the hedge funds significantly improve the investment opportunity set of all skewness investors, we analyze below the magnitude of the utility gains of including these funds in a portfolio. We measure utility gains with alphas, and distinguish between the gains due to the first two return moments, and the gains due to skewness. To obtain an alpha-like measure for skewness, it is necessary to further specify the risk preferences of the investor. This corresponds to Proposition 1, and we can define the alpha adjusted for skewness (henceforth also referred to as prudent alpha) as [...] where $\alpha_i$ is the intercept of the linear regression of asset $i$'s return on the benchmark asset returns (i.e., the simple or Jensen's alpha), $\varepsilon_i$ is the residual of that regression, and $\tilde r_p$ is the demeaned return on the optimal benchmark portfolio of the investor. 25 Notice that if the individual with preferences $(\gamma_1, \gamma_2)$ optimally holds portfolio $r_p$, then $\alpha^{pr}_i = 0$. If there is no overlap, $\alpha^{pr}_i$ measures the additional return the asset delivers in excess of the return required by the investor.
23 The analysis of Morningstar subcategories discussed below also points to the merger arbitrage strategies, which lose most of their appeal once co-skewness is accounted for: the rejection rate of 81% for mean-variance spanning is reduced to 9.5% when all skewness investors are considered.
The comparison of simple and prudent alphas measures the magnitude of the utility gains associated with skewness. In addition, prudent alpha can be compared with the generalized alpha (i.e., the alpha adjusted for all higher moments), and the kurtosis-adjusted alpha to assess how important the skewness adjustment is. The generalized alpha for a CRRA utility function [...] is [...] where $\beta^{CRRA}_i = \mathrm{Cov}\left(r_i, (R_f + r_p)^{-\gamma}\right) / \mathrm{Cov}\left(r_p, (R_f + r_p)^{-\gamma}\right)$, $\gamma$ is the coefficient of relative risk aversion of the investor and $R_f$ is the gross risk-free rate. Below we also calculate the impact of the next moment, kurtosis. The alpha adjusted for moments up to kurtosis will be referred to as temperant alpha, and it can be obtained in the same way as the prudent alpha. 26 It equals [...] where $\gamma_3$ is a positive scalar which measures the aversion to kurtosis (relative to the preference for the mean). 27 Next, we calculate these adjusted alphas for the individual funds for which $\mathrm{Over}^{MVS}_{s,b}$ is rejected to assess the magnitude of the utility gains when overlap is rejected. The adjusted alphas are calculated for a CRRA utility function with relative risk aversion of 10, which implies $\gamma_1 = \gamma/(R_f + \mu_p)$ and $\gamma_2 = \frac{1}{2}\gamma(\gamma + 1)/(R_f + \mu_p)^2 = 55/(R_f + \mu_p)^2$. 28 We also report the alphas for $\gamma = 4$ to show that the choice of $\gamma$ hardly matters for the alpha adjustments.
24 The term prudence was coined by Kimball (1990), and refers to a positive third derivative of a utility function.
25 To obtain the expression for prudent alpha, assume that (4) is not equal to zero but equal to $\alpha^{pr}_i$ for test asset $i$. Then notice that $\alpha_i = \mu_{y_i} - \Sigma_{y_i x}\Sigma^{-1}$ [...]
26 Temperance is a term coined by Kimball (1992) to refer to a negative fourth derivative of the utility function which measures the aversion to the fourth moment.
27 Following the approach of Appendix 1, $\gamma_3$ equals $-\frac{1}{6} u''''(R_f + \mu_p)/u'(R_f + \mu_p)$ in an expected utility framework.
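The mapping from the CRRA coefficient to $(\gamma_1, \gamma_2)$ and the definition of $\beta^{CRRA}_i$ given above translate directly into code. The sketch below is illustrative only: the function names are ours, sample covariances replace population moments, and it covers just the two quantities whose formulas appear in the text.

```python
import numpy as np

def crra_gammas(gamma, gross_rf, mean_excess_portfolio):
    """gamma_1 and gamma_2 implied by CRRA utility, evaluated at R_f + mu_p."""
    kappa = gross_rf + mean_excess_portfolio
    gamma_1 = gamma / kappa
    gamma_2 = 0.5 * gamma * (gamma + 1.0) / kappa**2
    return gamma_1, gamma_2

def crra_beta(asset_excess, portfolio_excess, gamma, gross_rf):
    """Cov(r_i, (R_f + r_p)^-gamma) / Cov(r_p, (R_f + r_p)^-gamma), sample version."""
    r_i = np.asarray(asset_excess, dtype=float)
    r_p = np.asarray(portfolio_excess, dtype=float)
    marginal_utility = (gross_rf + r_p) ** (-gamma)
    cov_asset = np.cov(r_i, marginal_utility)[0, 1]
    cov_portfolio = np.cov(r_p, marginal_utility)[0, 1]
    return cov_asset / cov_portfolio

# With gamma = 10 and R_f + mu_p normalized to 1, gamma_2 equals 55.
g1, g2 = crra_gammas(gamma=10.0, gross_rf=1.0, mean_excess_portfolio=0.0)
```

By construction, the benchmark portfolio itself has a CRRA beta of one, which provides a simple sanity check for an implementation.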
For each fund, the adjustment is calculated for a portfolio of stocks and bonds, which is optimally chosen to maximize the average CRRA utility over the sample period. The results are reported in Table 6. To maximize the cross-sectional dispersion, we sort the funds into residual co-skewness quintiles. The table reports the annualized [...] The average annualized simple alpha of hedge funds is 6%, and the average difference between prudent and simple alphas is 0.35% (for $\gamma = 10$). In the cross-section, this difference reaches 1% per year on average in the highest quintile. Accounting for moments beyond skewness only leads to modest additional utility gains: the differences between prudent and generalized alphas, which take into account all higher moments, are 0.05% on average (0.14% in the high residual co-skewness quintile). The kurtosis adjustments are mostly below 0.05% on average, suggesting that the next moment beyond skewness has less economic significance.

For mutual funds, the average simple alpha is 2.3%, and the skewness adjustments average 0.21% (0.68% in the high residual co-skewness quintile). Adjusting further for kurtosis and all higher moments changes alphas in absolute value by up to 0.04% and 0.1%, respectively. Finally, skewness adjustments are similar for $\gamma = 4$ and $\gamma = 10$. Indeed, a lower relative risk aversion decreases prudence but increases optimal portfolio allocations in the risky assets. These two opposing effects approximately cancel each other out in the skewness adjustments.
Overall, these results suggest that taking into account skewness in performance evaluation has significant economic value for investors. In addition, the skewness framework seems to capture the most meaningful difference between the mean-variance framework and a more specific framework with all higher moments.
28 A relative risk aversion of 10 is considered to be in the range of reasonable values of risk aversion and yields realistic portfolio allocations.
29 In results available upon request, we have constructed a similar table by sorting on residual co-kurtosis. The high-minus-low quintile difference between prudent and temperant alphas increases then to up to 0.23%.
Table 6. [...] It reports the alpha of individual funds with respect to the benchmark assets ($\alpha$), the alpha adjusted for higher order co-moments up to order three (prudent alpha; $\alpha^{pr}$), up to order four (temperant alpha; $\alpha^{te}$), and all higher co-moments ($\alpha^{CRRA}$). To adjust alpha for higher order co-moments, we consider a portfolio of stocks and bonds which [...]
Having shown that rejection rates are considerably higher for hedge funds than for mutual funds, in Table 7 we investigate the robustness of this result using subsets of hedge funds and alternative modeling assumptions. The insights can be summarized as follows. First, rejection rates are very similar for small and large funds, as measured by the funds' first net asset value. 31 Second, older funds (funds with longer return histories) tend to perform better by mean-variance metrics whereas younger funds are more interesting from a skewness perspective. Third, over time the fraction of funds that provides mean-variance spanning benefits is cut in half while the fraction of funds that provides skewness benefits decreases less and even increases in the case of $\mathrm{Over}^{MVS}$. This suggests that skewness is especially important in the recent subsample. It may also mean that hedge funds are becoming increasingly sophisticated in managing skewness.
Fourth, rejection rates within Morningstar categories are similar to the results with portfolios in Table 2: global derivatives stand out as being most attractive for all skewness investors. In addition, directional debt, event, and multistrategy funds are much less attractive from a skewness perspective than they are from a mean-variance perspective. Fifth, using simple returns (instead of smoothing-adjusted returns) or deleting the first 24 returns for each fund (instead of deleting only the first 12 returns) to adjust for back-fill bias yields similar results. Sixth, including de-listing returns for liquidated funds of −25% and −50% as suggested by Aiken et al. (2013) [...] While overlap is harder to reject with currencies than stocks as benchmarks, the overall rejection rates with currencies or commodities combined with stocks and bonds are similar. This last panel also separately reports rejection rates for global derivatives funds, which include currency funds. Even here, rejection rates of overlap are about the same as previously. Hence, it is unlikely that global derivatives achieve their superior performance mechanically by investing in assets not spanned by the benchmarks. Overall, the rejection rates in Table 7 are very similar to the rejection rates in the main analysis.
31 We use this measure because it is identifiable at launch by investors and not mechanically correlated with a fund's lifetime performance.
32 The data on the currency factor is available at https://people.stanford.edu/hlustig/data-and-code, and we use the net-of-fee factor with all countries, which is the factor with the highest Sharpe ratio and skewness.

Out-of-sample analysis
While performance can only be analyzed ex-post, some readers may be interested to see whether investors can select hedge funds and mutual funds ex-ante which provide desirable skewness properties ex-post. The rejection rates presented so far suggest that some hedge funds provide desirable higher moment exposure to stocks or bonds. This could indicate skill and hence persistence. Mutual funds, on the other hand, do not seem to exhibit desirable higher moment and alpha properties, suggesting no skill and no performance persistence.
To investigate whether there is further evidence for this conjecture, we form equally weighted decile portfolios at the end of every year based on $\mathrm{Over}$ [...] regression. In the latter sort, using t-statistics makes the results more comparable with $\mathrm{Over}^{MVS}$, which is studentized, and also produces better out-of-sample results by taking into account the noise in estimated alphas. [...] is rejected for similar hedge funds.
Overall, the results highlight that non-linearities are important even out-of-sample for hedge funds, but less so for mutual funds.
Motivated by extensive evidence that hedge fund returns exhibit option-like characteristics and the general inability of linear factor models to account for these non-linearities, this research extends the mean-variance spanning and intersection approach of Huberman and Kandel (1987) to skewness and proposes regression-based tests. We use our tests to study option-like characteristics in option strategies, mutual fund returns, and hedge fund returns, and to ask whether hedge funds add value to investors. In summary, our findings highlight the importance of skewness in performance evaluation, and suggest that in the cross-section and even after controlling for co-skewness, a subset of hedge funds improves the investment opportunity set. This paper focuses on hedge and mutual fund returns. Our method can also be applied to other asset classes. In particular, the returns of currency trading strategies and emerging markets have been in the limelight for their skewness (see Brunnermeier et al., 2008, and Ghysels et al., 2016, respectively), so that is a possible area for future research.
33 The results are similar for rolling windows of 36 months.
Table 7. [...] Panel A reports the rejection rates in percent separately for small funds (first net asset value below median) and large funds ([...]

Technical Appendix to accompany Spanning Tests for
where $u^{(i)}(\cdot)$ is the $i$th derivative of $u$ with respect to $x$ (i.e., $u^{(1)}(\cdot) = u'(\cdot)$, etc.), $R_f$ is the gross risk-free rate, $r_p$ is the excess portfolio return and $\mu_p$ its expectation. 34 Truncating the approximation at the second order and inserting the approximation in the first-order condition for asset $i$ yields [...] where $\kappa \equiv R_f + \mu_p$ and we ignore that the equalities in (TA.1) are approximations. If the first-order condition of the investor holds, the right-hand side of (TA.1) can be rewritten to [...] which is very similar to the first-order condition of the investor with mean-variance-skewness utility and, therefore, $\gamma_1 = -u''(\kappa)/u'(\kappa)$ and $\gamma_2 = \frac{1}{2} u'''(\kappa)/u'(\kappa)$. Hence, the aversion to variance relative to the preference for the mean is the absolute risk aversion, i.e., $\gamma_1 = A(\kappa)$, and the preference for skewness relative to the preference for the mean is one half of the product of risk aversion and prudence, i.e., $\gamma_2 = \frac{1}{2} A(\kappa) P(\kappa)$.
34 As a reference and for the technical details see Jurczenko and Maillet (2006) and the references cited therein.
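The expressions $\gamma_1 = -u''(\kappa)/u'(\kappa)$ and $\gamma_2 = \frac{1}{2}u'''(\kappa)/u'(\kappa)$ can be checked numerically for any smooth utility function. The sketch below uses central finite differences (a numerical shortcut of ours, not part of the derivation) and CRRA utility with $\gamma = 4$ as a test case, for which the closed forms are $\gamma/\kappa$ and $\frac{1}{2}\gamma(\gamma+1)/\kappa^2$.

```python
def taylor_gammas(u, kappa, h=1e-3):
    """Approximate gamma_1 = -u''/u' and gamma_2 = u'''/(2 u') at kappa.

    Derivatives are central finite differences with step h.
    """
    d1 = (u(kappa + h) - u(kappa - h)) / (2.0 * h)
    d2 = (u(kappa + h) - 2.0 * u(kappa) + u(kappa - h)) / h**2
    d3 = (u(kappa + 2.0 * h) - 2.0 * u(kappa + h)
          + 2.0 * u(kappa - h) - u(kappa - 2.0 * h)) / (2.0 * h**3)
    return -d2 / d1, 0.5 * d3 / d1

def crra_utility(x, gamma=4.0):
    """CRRA utility x^(1 - gamma) / (1 - gamma), for gamma != 1."""
    return x ** (1.0 - gamma) / (1.0 - gamma)

# kappa = R_f + mu_p; the value 1.01 is a hypothetical example.
gamma_1, gamma_2 = taylor_gammas(crra_utility, kappa=1.01)
```

For this example the numerical values should be close to $4/1.01$ and $10/1.01^2$, respectively.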
Low relative risk aversion and very high relative prudence can now be achieved with $\gamma$ close to zero and $k$ very close to $\kappa$. 36

2 Proofs

Proof of Proposition 1. The optimal portfolio $w^*$ satisfies the first-order conditions [...] where the subscripts $x$ and $y$ refer to the $k$ benchmark assets and $n$ test assets, respectively, $w^*_x$ and $w^*_y$ are the subvectors of $w^*$, $\mu_x$ and $\mu_y$ are the subvectors of $\mu$, $\Sigma_{xx}$, $\Sigma_{xy}$, $\Sigma_{yx}$ and $\Sigma_{yy}$ are the submatrices of $\Sigma$, and $S_{xxx}$, $S_{xxy}$, $S_{xyx}$, $S_{xyy}$, $S_{yxx}$, $S_{yxy}$, $S_{yyx}$, and $S_{yyy}$ are the submatrices of $S$. If there is intersection (i.e., $w^*_y = 0$), (TA.3) becomes [...] The first $k$ rows of (TA.4) can then be written as [...] 37 and using this to rewrite the last $n$ rows of (TA.4) gives the result of the proposition.
Proof of Proposition 2. Spanning requires that (4) is satisfied for all values of $\gamma_2$ and associated $w^*_x$. Hence, sufficient conditions are [...]
Proof of Proposition 3. The portfolio problem with short-sales constraints is [...] Let the vector $\delta$ contain the Kuhn-Tucker multipliers for the restriction that portfolio weights are non-negative. The mean-variance-skewness efficient portfolio $w^*$ satisfies [...] If there is intersection, (TA.5) can be rewritten to [...] We proceed in a similar fashion as [...] and take the mean-variance-skewness efficient portfolio for a particular value of $(\gamma_1, \gamma_2)$. Let $r_x^{\eta}$ refer to the $L$-dimensional subvector of $r_x$ which contains only the returns of the assets for which short-sales constraints are not binding and let superscripts $\eta$ refer to this subset. (TA.6) becomes then [...] Using (TA.7) we get the condition on the test assets for intersection [...] Spanning implies that (TA.8) holds for all relevant values of $\gamma_2^{\eta}$. Again, we follow the exposition of [...] to derive the spanning conditions. Let $\Gamma_j$ be the set of $\gamma_2^{\eta}$ for which the subset of assets for which the short-sales constraints in the mean-variance-skewness efficient portfolios are not binding is the same. In addition, let the $L_j$-dimensional vector of returns of these assets be denoted by $r_x^{j}$, i.e., $r_x^{j} = r_x^{\eta}$ if and only if $\gamma_2^{\eta} \in \Gamma_j$. As before, each variable which refers to the set $r_x^{j}$, $j = 1, 2, \ldots, M$, is denoted by a superscript $j$. Hence, we have spanning if and only if the [...] $\forall \gamma_2^{\eta} \in \Gamma_j$, hold. Note that a sufficient condition for part B of (TA.9) to be non-positive is that all [...]
Proof of Corollary 1. Immediate.
Proof of Corollary 2. To simplify the equations, we use $S_{xxy} w^*$ [...] The first $k$ rows of (TA.3) can then be written as [...] and using this to rewrite the last $n$ rows of (TA.3) gives [...] (TA.10) Now consider the regression $r_{y,t} = \alpha + B r_{x,t} + \varepsilon_t$.
35 Notice that $E[r_i (r_p - \mu_p)]$ is the covariance of asset $i$ with the optimal portfolio. $E[r_i (r_p - \mu_p)^2]$ is not exactly the co-skewness of asset $i$ with the optimal portfolio because [...] Hence, interpretations of $\gamma_1$ and $\gamma_2$ in terms of expected utility hold only approximately. But this is reasonable because empirically alphas adjusted for skewness according to the mean-variance-skewness framework with $\gamma_1 = -u''(\kappa)/u'(\kappa)$ and $\gamma_2 = \frac{1}{2} u'''(\kappa)/u'(\kappa)$ are very close to alphas adjusted using an expected utility framework.
36 E.g., $\gamma = 1/10000$, $\kappa = 1$ and $k = 9999/10000$ imply $\gamma_1 = 1$ and $\gamma_2 = 5{,}000.5$.
37 Note that the mean-variance-skewness portfolio problem has no closed-form solution for portfolio weights. In addition, there is no three-fund separation for arbitrary distributions because it is not possible to write the optimal portfolio of any investor as a function of three distinct funds. Three-fund separation can be obtained with additional distributional assumptions as for example in [...]
It is easy to check that we have the following relations [...] These relations enable us to rewrite (TA.10) to [...] We get then [...] It is easy to see that for $\gamma_2 = 0$, we recover the familiar mean-variance case: [...] Finally, if there is only one benchmark asset ($k = 1$ and $\mathbf{x} = x$) and one test asset ($n = 1$ and $\mathbf{y} = y$), we obtain [...]

Implementation of the test

3.1 Spanning
We obtain consistent estimates $\hat h$ for $h$ and the asymptotic covariance matrix of $\hat h$, $\mathrm{Var}(\hat h)$, with multivariate regressions. Below we sketch how to obtain $\hat h$ and $\mathrm{Var}(\hat h)$ with simple OLS standard errors for ease of exposition. The corresponding Matlab code is available for download at http://tiny.cc/9fzt4y. This code also offers the possibility to obtain estimates of $\mathrm{Var}(\hat h)$ based on [...] standard errors (as used in our analysis) and Newey and West (1987) standard errors.
Step 1: Run the multivariate regression $r_{y_i,t} = \alpha_i + \beta_i r_{x,t} + \varepsilon_{i,t}$, for $i = 1, \ldots, n$, and $t = 1, \ldots, T$, (TA.11) where $r_{y_i,t}$ is the excess return on test asset $i$ in $t$, and $r_{x,t}$ is the $k$-dimensional vector of excess returns on the $k$ benchmark assets. Let $b_{MV} \equiv [\alpha_1, \beta_1, \ldots, \alpha_n, \beta_n]$ be the $(k+1)n$-dimensional vector of coefficients, and [...] where $\otimes$ denotes the Kronecker product and $1_n$ is an $n$-dimensional vector of ones. Let $\hat b_{MV}$ be the OLS estimate for $b_{MV}$. A consistent estimate for the asymptotic covariance matrix of [...] where $\mathrm{Cov}(\hat\varepsilon, \hat\varepsilon)$ is the sample covariance matrix of the estimated residuals in (TA.11).
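As a rough illustration of Step 1, the sketch below estimates the regressions in (TA.11) jointly and stacks the coefficients asset by asset. It is written in Python rather than Matlab and is not the code referenced above. The covariance estimate uses the textbook expression for equation-by-equation OLS with common regressors, the Kronecker product of the residual covariance matrix and $(X'X)^{-1}$, which we assume corresponds to the simple OLS case described in the text; all names are illustrative.

```python
import numpy as np

def mean_variance_regression(test_excess, bench_excess):
    """Step-1 regressions r_{y_i,t} = alpha_i + beta_i r_{x,t} + eps_{i,t}.

    test_excess: (T, n) array; bench_excess: (T, k) array.
    Returns b_mv stacked as (alpha_1, beta_1, ..., alpha_n, beta_n),
    an OLS covariance estimate for b_mv, and the (T, n) residuals.
    """
    Y = np.asarray(test_excess, dtype=float)
    R = np.asarray(bench_excess, dtype=float)
    T = Y.shape[0]
    X = np.column_stack([np.ones(T), R])           # (T, k+1) regressors
    coef, *_ = np.linalg.lstsq(X, Y, rcond=None)   # (k+1, n); column i is asset i
    resid = Y - X @ coef
    resid_cov = resid.T @ resid / T                # (n, n) residual covariance
    b_mv = coef.T.reshape(-1)                      # asset-by-asset stacking
    cov_b = np.kron(resid_cov, np.linalg.inv(X.T @ X))
    return b_mv, cov_b, resid
```

With this ordering, block $(i, j)$ of the covariance matrix is the residual covariance between assets $i$ and $j$ times $(X'X)^{-1}$, so the standard error of $\hat\alpha_i$ is the square root of diagonal element $i(k+1)$ (counting from zero).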
Step 2: Run the multivariate regression $z_{i,t} = \alpha_{S,i} + \beta_{S,i} X_{S_i,t} + u_{i,t}$, for $i = 1, \ldots, k^2$ and $t = 1, \ldots, T$, (TA.13) where $z_{i,t}$ is the $n$-dimensional vector $z_{i,t} \equiv \sigma^2_{X_{S_i}} \epsilon_{t+1}$, and $X_{S_i,t}$ is the $i$th element in the $k^2$-dimensional vector $X_{S,t} \equiv \tilde{r}_{x,t} \otimes \tilde{r}_{x,t}$ ($\tilde{r}_{x,t}$ is the vector of demeaned excess returns in $t$ on the benchmark assets). Note that the left-hand-side variable in (TA.13) is scaled by the variance of the right-hand-side variable to recover the residual co-skewnesses in the slope coefficients. Let $b_S \equiv [\mathrm{vec}(\alpha_{S,1}, \beta_{S,1}), \ldots, \mathrm{vec}(\alpha_{S,k^2}, \beta_{S,k^2})]$ be the $2nk^2$-dimensional vector of coefficients (vec denotes the vec operator), and . . (TA.14) Let $\hat{b}_S$ denote the OLS estimate for $b_S$. The OLS estimate for the asymptotic covariance matrix of $\hat{b}_S$ is

Step 3: Calculate the asymptotic covariance matrix of $\hat{b}_{MV}$ and $\hat{b}_S$. To compute the test statistics, it is convenient to combine the OLS estimates in one big vector $\hat{b} \equiv [\hat{b}_{MV}, \hat{b}_S]$, which has the asymptotic covariance matrix

Step 4: To select the alphas and co-skewnesses used for the spanning tests, we define the matrix where $I_n$ is an $n \times n$ identity matrix, and $A$ is a diagonal matrix for which the elements on the diagonal equal $\mathrm{vec}(T_k) \otimes 1_n$ ($T_k$ is a $k \times k$ upper triangular matrix in which all nonzero entries equal one). The purpose of $A$ is to eliminate the repeated rows in $\hat{b}_S$ and the corresponding asymptotic covariance matrix $\hat{Q}_S$.

From the last step, we obtain $\hat{h} = H\hat{b}$ and $\mathrm{Var}(\hat{h}) = H\hat{Q}H'$. Note that the vector $\hat{h}$ has $2n + 2nk^2$ rows but $nk(3k - 1)/2$ of them are zero. These zero-rows do not affect the asymptotic distribution of the tests and can be removed (in which case $\mathrm{Var}(\hat{h})$ needs to be adjusted accordingly).

Table TA.1: Column headers refer to the significance levels for which the average rejection rates of the null hypothesis (small sample size) of the tests in 10,000 simulations are reported. Samples range from 3 to 25 years of monthly data. Spanning tests with short-sales and without short-sales are in Panel A and Panel B, respectively, and the overlap tests are in Panel C. The simulated data assumes that the returns on the benchmark assets follow a multivariate skew normal distribution (Azzalini and Valle, 1996) and that the new asset is spanned by the benchmark assets.
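Steps 1 and 2 can be illustrated for the simplest case of one test asset and one benchmark asset (n = 1, k = 1). The following Python sketch is not the Matlab code referred to above; it assumes, as one reading of (TA.13), that the left-hand-side variable is the Step 1 residual scaled by the sample variance of the squared demeaned benchmark return, and it uses contemporaneous residuals. The function names and the simulated data are hypothetical.

```python
import random


def mean(values):
    return sum(values) / len(values)


def ols(y, x):
    """Intercept and slope of a univariate OLS regression of y on x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((a - x_bar) ** 2 for a in x)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def alpha_and_coskewness(r_y, r_x):
    """Alpha (Step 1) and residual co-skewness (Step 2) for n = 1, k = 1."""
    # Step 1: mean-variance regression of the test asset on the benchmark.
    alpha, beta = ols(r_y, r_x)
    resid = [y - alpha - beta * x for y, x in zip(r_y, r_x)]
    # Step 2: regress the variance-scaled residual on the squared demeaned
    # benchmark return (assumed reading of the scaling in (TA.13)).
    x_bar = mean(r_x)
    x_s = [(x - x_bar) ** 2 for x in r_x]
    xs_bar = mean(x_s)
    var_xs = sum((v - xs_bar) ** 2 for v in x_s) / len(x_s)
    z = [var_xs * e for e in resid]
    _, coskew = ols(z, x_s)
    return alpha, beta, coskew, resid, x_s


# Hypothetical data: a payoff that is concave in the benchmark return.
rng = random.Random(1)
r_x = [rng.gauss(0.005, 0.04) for _ in range(120)]
r_y = [0.5 * x - 2.0 * (x - 0.005) ** 2 + rng.gauss(0.0, 0.01) for x in r_x]
alpha, beta, coskew, resid, x_s = alpha_and_coskewness(r_y, r_x)
print(alpha, beta, coskew)
```

Because of the scaling, the Step 2 slope coincides with the sample covariance between the residual and the squared demeaned benchmark return, which is why the slope can be read as a residual co-skewness.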

Overlap
Following , the overlap test is implemented with the Politis and Romano (1994) stationary bootstrap. As parameters, we choose 1000 bootstrapped samples and a block length of 6.
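The resampling scheme can be sketched as follows. This is a minimal Python illustration of the stationary bootstrap, assuming that the "block length of 6" is the expected length of the geometrically distributed blocks; the function names are hypothetical and the overlap test statistic itself is not reproduced here.

```python
import random


def stationary_bootstrap_indices(n_obs, mean_block=6.0, rng=None):
    """Index sequence of one stationary-bootstrap resample.

    Blocks start at a uniformly drawn position, wrap around the sample,
    and have geometrically distributed length with mean `mean_block`.
    """
    rng = rng or random.Random()
    p_new_block = 1.0 / mean_block
    indices = [rng.randrange(n_obs)]
    while len(indices) < n_obs:
        if rng.random() < p_new_block:
            indices.append(rng.randrange(n_obs))        # start a new block
        else:
            indices.append((indices[-1] + 1) % n_obs)   # continue the block
    return indices


def bootstrap_samples(series, n_boot=1000, mean_block=6.0, seed=0):
    """Resample `series` n_boot times with the stationary bootstrap."""
    rng = random.Random(seed)
    n_obs = len(series)
    return [
        [series[i] for i in stationary_bootstrap_indices(n_obs, mean_block, rng)]
        for _ in range(n_boot)
    ]


# Hypothetical usage on a placeholder series of 60 observations.
samples = bootstrap_samples(list(range(60)), n_boot=50)
print(len(samples), len(samples[0]))
```

For multivariate data, the same index sequence would be applied to all return series at once so that cross-sectional dependence is preserved within each resample.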

Small sample size and power
This section analyzes the small sample properties of our tests with simulations. Table TA.1 contains the average number of rejections of overlap and spanning with n = 1 and k = 2 in 10,000 simulations for the asymptotic significance levels 0.01, 0.05, and 0.10. The data-generating process of the two benchmark assets is a multivariate skew normal distribution (Azzalini and Valle, 1996) whose parameters are chosen to fit the first three moments of the benchmark assets in the empirical analysis. The data-generating process of the test asset assumes spanning (e.g., (1) holds with α = 0 and each slope coefficient equals 0.5). The regression residual is generated from a skew normal distribution with variance 0.23% and skewness −0.04, which correspond to the average sample values in the empirical section. We simulate 3-25 years of monthly data and report the average number of rejections for the spanning test with short sales in Panel A and without short sales in Panel B. Panel C reports the size of the overlap test.

Figure TA.1 contains the rejection rates of spanning and overlap as a function of the intercept of the mean-variance regression. The power of the spanning test with short-sales seems to be slightly larger than its theoretical power. In addition, the power of the test with inequality constraints generally exceeds the power of the test with equality constraints, which is in line with the results for the mean-variance case of . The figure also contains the rejection rates for the overlap test. While a positive alpha does not invalidate the null hypothesis here, it appears that the null is still rejected more often for positive alphas.
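The data-generating process of the test asset under the null can be sketched as follows. This Python illustration assumes a zero-mean univariate skew normal residual whose parameters are obtained by the method of moments from the stated variance (0.23%, i.e., 0.0023) and skewness (−0.04). The benchmark returns are passed in as data; the Gaussian benchmark draws in the usage lines are placeholders, not the multivariate skew normal used in the simulations, and all function names are hypothetical.

```python
import math
import random

SKEW_CONST = (4.0 - math.pi) / 2.0


def skew_normal_params(mean, variance, skewness):
    """Method-of-moments (location, scale, delta) of a univariate skew normal."""
    g = abs(skewness) ** (2.0 / 3.0)
    m_sq = g / (g + SKEW_CONST ** (2.0 / 3.0))   # equals delta^2 * 2 / pi
    delta = math.copysign(math.sqrt(math.pi / 2.0 * m_sq), skewness)
    scale = math.sqrt(variance / (1.0 - 2.0 * delta ** 2 / math.pi))
    location = mean - scale * delta * math.sqrt(2.0 / math.pi)
    return location, scale, delta


def implied_skewness(delta):
    """Skewness of a skew normal with shape parameter delta."""
    m = delta * math.sqrt(2.0 / math.pi)
    return SKEW_CONST * m ** 3 / (1.0 - m ** 2) ** 1.5


def draw_skew_normal(location, scale, delta, rng):
    """One draw via the representation delta*|Z0| + sqrt(1 - delta^2)*Z1."""
    z0, z1 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    return location + scale * (delta * abs(z0) + math.sqrt(1.0 - delta ** 2) * z1)


def simulate_spanned_asset(benchmarks, rng, variance=0.0023, skewness=-0.04):
    """Test-asset returns with alpha = 0 and slopes of 0.5 on two benchmarks."""
    location, scale, delta = skew_normal_params(0.0, variance, skewness)
    return [
        0.5 * x1 + 0.5 * x2 + draw_skew_normal(location, scale, delta, rng)
        for x1, x2 in benchmarks
    ]


# Hypothetical usage: 10 years of monthly data with placeholder benchmarks.
rng = random.Random(0)
benchmarks = [(rng.gauss(0.005, 0.04), rng.gauss(0.003, 0.03)) for _ in range(120)]
r_y = simulate_spanned_asset(benchmarks, rng)
print(len(r_y))
```

Repeating the simulation many times and recording how often the test rejects at each nominal level would give rejection rates of the kind reported in Table TA.1.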

Simulated option strategies
To provide evidence on the effectiveness of our approach, we conduct a performance analysis of call writing strategies in a simulated economy with lognormally distributed market returns. While such an economy is complete, a positive Jensen's alpha can be achieved at the expense of negative co-skewness with strategies that sell put or call options, i.e., concave investment strategies. In our simulations, factor models in the vein of  and  (those which include option returns or polynomials of the market return as factors) usually conclude that the alphas of option strategies are significant. This happens because these models restrict strategy returns to be a fixed linear combination of factors and therefore cannot account for concave relations between strategy and benchmark returns as long as the option factors have different strikes than the strategy to be explained. Because our approach jointly considers alphas and co-skewnesses, it is more flexible and can account for the concave relation between benchmark assets and strategy returns. In the simulations, our test concludes that option strategies do not improve the investment opportunity set of investors concerned about skewness.
To show that our method is able to determine whether benchmark assets span test assets with option-like returns, we assess the performance of option strategies in a simulated economy. As in , the market return follows a geometric Brownian motion with constant drift and volatility, and options are priced with Black-Scholes formulas. This setting is attractive because the economy is complete and the representative investor is known to have CRRA utility (He and Leland, 1993). We consider strategies shorting call options. 39 The proceeds from the sale and the margin requirements are invested at the risk-free rate. The strategies are calculated for different strike levels, and margin requirements are determined as in Santa-Clara and Saretto (2009). We calculate the returns on these strategies for 100 observations and repeat the simulations 10,000 times. Table TA.2 reports the results.
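One simulation draw of such a strategy can be sketched as follows. This Python illustration makes several assumptions that are not taken from the paper: the 12% expected return, 15% volatility, and 5% risk-free rate are read as simple per-period moments and converted to log parameters; each observation restarts the market at 100; the call is written with 1.5 periods to maturity and bought back one period later; and margin requirements are ignored, so the output is a profit per written call rather than a strategy return. All names are hypothetical.

```python
import math
import random

MEAN, VOL, RF = 0.12, 0.15, 0.05                       # simple per-period values
LOG_VAR = math.log(1.0 + (VOL / (1.0 + MEAN)) ** 2)    # variance of log return
LOG_VOL = math.sqrt(LOG_VAR)
LOG_MEAN = math.log(1.0 + MEAN) - 0.5 * LOG_VAR        # mean of log return
LOG_RF = math.log(1.0 + RF)                            # continuously compounded


def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(spot, strike, rate, vol, maturity):
    """Black-Scholes price of a European call on a non-dividend-paying asset."""
    if maturity <= 0.0:
        return max(spot - strike, 0.0)
    sd = vol * math.sqrt(maturity)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol ** 2) * maturity) / sd
    d2 = d1 - sd
    return spot * norm_cdf(d1) - strike * math.exp(-rate * maturity) * norm_cdf(d2)


def short_call_pnl(strike, n_obs=100, seed=0, s0=100.0):
    """Per-period profit of writing one call and buying it back a period later."""
    rng = random.Random(seed)
    premium = bs_call(s0, strike, LOG_RF, LOG_VOL, 1.5)
    pnl = []
    for _ in range(n_obs):
        s1 = s0 * math.exp(LOG_MEAN + LOG_VOL * rng.gauss(0.0, 1.0))
        buyback = bs_call(s1, strike, LOG_RF, LOG_VOL, 0.5)
        # The premium earns the risk-free rate for one period (no margin).
        pnl.append(premium * (1.0 + RF) - buyback)
    return pnl


pnl = short_call_pnl(strike=100.0)
print(sum(pnl) / len(pnl), min(pnl), max(pnl))
```

The profit is capped at the premium plus interest and is unbounded below, which is the concave relation to the market return that the performance tests need to account for.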
Call writing strategies have positive Jensen's alphas on average and also in most simulations, and they are significant around 50% of the time. The generalized  alphas calculated with power utility are also positive but smaller in magnitude and usually insignificant. 40 The residual co-skewness of the option strategies is negative in all simulations and usually significant. Hence, the mean-variance benefits of these strategies come at the cost of more left-tail risk, and the overlap test correspondingly does not reject the null hypothesis in any simulation. Notice that the spanning test with short-sale constraints rejects the null in almost half of the simulations because a subset of the investors (the mean-variance investors) wish to invest in the option strategies. We also report results for residual co-kurtosis. It turns out that residual co-kurtosis is significant in at most 10% of the simulations, in line with the idea that the skewness framework captures the main features of option-like payoffs in this simple setting.

39 In unreported robustness checks, we have also considered other strategies such as shorting puts or straddles, and found very similar results.

40 We have verified that the generalized alphas get even closer to zero as we increase the number of observations from 100 to a larger number, in line with the results in  and theory.

Table TA.2: Each column analyzes the performance of call writing strategies with the indicated strikes in a simulated economy with a log-normally distributed market return with expected return of 12% and volatility of 15%. The initial value of the market is 100, and the risk-free rate is 5%. These values are as in . The strategies short a call option with a maturity of 1.5 periods, and buy the shorted option one period later. Options are traded at Black-Scholes prices, and there are CBOE type margin requirements as in Santa-Clara and Saretto (2009). The statistics are calculated for 100 realizations of market returns, and the simulations are repeated 10,000 times. $\alpha$ is the simple alpha, and $\alpha_{CRRA}$ is Leland (1999)'s generalized alpha using the relative risk aversion implied by the market return parameters, 3.63. Coskew and cokurt are the residual co-moments, and Span MVS s is the rejection rate for the spanning test with short-sale constraints, and Over MVS s for the overlap test. The last rows report the intercepts of a one factor model with a short call (strike 100), $\alpha_1$; and two factor models with: two short calls (strikes 100 and 120), $\alpha_2$; market and short call (strike 100), $\alpha_3$; and 's quadratic market model with market and squared market return, $\alpha_4$. The table reports the average value of these coefficients across all simulations, the fraction of times in which the coefficient was positive in brackets, and significantly different from zero at the 5% level in parentheses.
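The relative risk aversion of 3.63 quoted with Table TA.2 can be checked with a short calculation. The sketch below uses the expression usually associated with Leland (1999), the log excess expected gross market return divided by the variance of the log market return, and assumes that the 12%, 15%, and 5% are simple per-period values under a lognormal market; under these assumptions the result rounds to 3.63. The function name is hypothetical.

```python
import math


def implied_risk_aversion(mean_return, volatility, risk_free):
    """Risk aversion implied by a lognormal market (assumed Leland-type formula)."""
    # Variance of the log gross return implied by the simple mean and volatility.
    log_var = math.log(1.0 + (volatility / (1.0 + mean_return)) ** 2)
    log_premium = math.log(1.0 + mean_return) - math.log(1.0 + risk_free)
    return log_premium / log_var


gamma = implied_risk_aversion(0.12, 0.15, 0.05)
print(round(gamma, 2))  # 3.63
```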
The table also reports the intercepts of popular factor models used to capture non-linear payoffs such as a single-factor model with call returns, and two-factor models with the market and call returns, or two call returns. These specifications are inspired by the popular factor models of Fung and Hsieh (2001) and  who use returns on option strategies to capture hedge funds' non-linear returns. To be clear, these papers aim to explain the variation in hedge fund returns, whereas we ask whether linear factor models augmented with option returns can evaluate the performance of non-linear investment strategies. It turns out this is not the case, and the option-based factor models tend to find that the call option strategies have significant intercepts in 20% to 67% of the simulations. This is quite striking because the test assets are deliberately constructed to differ from the factors only by their strikes. Unreported alphas and rejection rates are even higher if different strategies such as selling straddles are used as test assets. Finally, 's quadratic market model, with the excess market return and the squared excess market return, yields alphas that are large, positive, and significant in all simulation draws.
The quadratic market model is unable to assess the alpha of these strategies, because the slope on the squared market return is negative (in line with negative residual co-skewness) and the estimated intercepts are higher because average squared returns are positive.
Overall, the evidence in Table TA.2 suggests the overlap test is very effective in assessing that these option strategies do not add value to investors concerned about skewness, whereas linear factors are in general unable to get to this conclusion. Our simulation analysis is deliberately kept simple, as we provide further evidence on the effectiveness of our approach with real-world option strategies in the empirical section of the paper.

Spanning and overlap in subsamples
This section analyzes the rejection rates within the subcategories explained in Morningstar (2011). Table TA.3 reports the results, and the number of funds in each subcategory. The rejection rates suggest that among the global derivatives funds, the systematic futures funds are most likely to improve the investment opportunity set of all skewness investors. This is expected because they follow trend-following strategies which prosper when markets demonstrate sustained bearish or bullish trends, thus generating a positive relation between funds' excess returns and squared stock market returns. Other strategies, such as event driven funds, lose most of their appeal once all skewness investors are considered.

and Column 5 reports the number of funds in each subcategory. These categories are described in , and the tests are constructed as explained in the main text.