Is Imperfection Better? Evidence from Predicting Stock and Bond Returns

Abstract

The standard predictive regression assumes expected returns to be perfectly correlated with predictors. In the recently introduced predictive system, imperfect predictors account for only part of the variance in expected returns. However, the out-of-sample benefits of relaxing the assumption of perfect correlation are unclear. We compare the performance of the two models from an investor's perspective. In the Bayesian setup, we allow for various distributions of $R^2$ to account for different degrees of optimism about predictability. We find that relaxing the assumption of perfect predictors does not pay off out-of-sample. Furthermore, extreme optimism or pessimism reduces the performance of both models.

The existence of return predictability is one of the most discussed questions in finance. Papers finding evidence in favor of predictability mostly use predictive regression to assess the predictive power of different variables. Excess returns are regressed on various predictors, assuming a perfect correlation between the expected returns and predictors. On the other hand, the predictive system, proposed by Pastor and Stambaugh (2009), relaxes the assumption of perfect correlation. In their setup, the unobservable process of expected returns is (weakly) correlated with predictors; that is, predictors are imperfect in the sense that they do not deliver full information about the expected return. Pastor and Stambaugh empirically show that this correlation between expected returns and predictors is far from perfect. However, their focus is more on assessing the strength of imperfection in data, and less on forecasting. Therefore, a comprehensive comparison of the predictive system and predictive regression, as well as an assessment of the economic effects associated with their predictions, is missing from the literature.
Whereas Wachter and Warusawitharana (2009) show that the predictive regression outperforms the non-predictability model out-of-sample, it is not clear whether there is an additional benefit from going to a more complex predictive model with imperfect predictors. In our main analysis, we build on the findings of Wachter and Warusawitharana and extend the predictive regression to allow for imperfect predictors in the predictive system. We document that the predictive system does not outperform the predictive regression out-of-sample; that is, allowing for imperfection in predictors does not improve the certainty equivalent returns (henceforth, CERs). Thus, an investor does not obtain any economic gains from following the predictions of a more comprehensive model.

The comparison of these two models is not straightforward. As shown in Wachter and Warusawitharana (2009), different degrees of investor skepticism about predictability (modeled as a prior distribution on $R^2$) are highly relevant for the performance of predictive regression. To compare the performance of the predictive system and predictive regression, we have to ensure that the priors used in these two models are comparable. To achieve this, we propose an approach whereby priors on relevant parameters in the predictive system are chosen to match the prior on $R^2$ in the predictive regression.

The present article investigates the predictive system in relation to a special case of the system, the predictive regression. As discussed in Pastor and Stambaugh (2009), the predictive system allows for imperfect predictors in terms of any degree of correlation between the predictors and expected returns. Instead of modeling the expected return as an explicit combination of predictors, they model each time series (returns as well as predictors) as autoregressive processes of order one (AR(1)) and let them interact through the covariance matrix of error terms.
Therefore, the expected return does not depend only on the most recent value of the predictor, but on its lagged values as well. Furthermore, estimated residuals from predictive regressions are usually autocorrelated, but the associated complications are often ignored (see Stambaugh, 1999). An advantage of the predictive system is that it allows for modeling such serial correlation in residuals directly.

In contrast to Pastor and Stambaugh, who use the system to analyze the effects of the correlation between the unexpected returns and innovations in expected returns on the properties of the estimates, we focus on the performance and asset allocation effects of different beliefs about the predictive power of the model. Different prior beliefs on the distribution of $R^2$, which corresponds to the fraction of explained variance, play a key role in our article. This goodness-of-fit measure is often more intuitive for investors and easier to elicit than the individual coefficients of the models. Moreover, we analyze the out-of-sample properties of the models, which are more relevant for an investor than the frequently presented in-sample evidence. The main criterion for judging and comparing the economic performance of the predictive regression and the predictive system is the CER, which is derived from the asset allocation implied by the out-of-sample predictions from both models.

The goal of this article is to compare the two models in a consistent way. Wachter and Warusawitharana (2009) use CERs derived from asset allocations to investigate the role of priors on $R^2$, but they do this only for the predictive regression. Pastor and Stambaugh (2009) compare the predictive system to the predictive regression, but use a Bayesian approach to estimate the system, and ordinary least squares (OLS) for the regression. Our comparison is based on estimating both models by the Bayesian Markov Chain Monte Carlo (MCMC) technique.
Therefore, we are able to compare the characteristics and implications of the two models in a way unaffected by the estimation method. Although we emphasize the importance of prior beliefs, we do not aim to analyze several variables with potential predictive power. We only choose the most prominent predictors—namely, the dividend–price ratio for stock returns and the yield spread for bond returns (Campbell and Shiller, 1988, 1991; Fama and French, 1988). Our empirical results indicate that the predictive system might not have the ability to increase economic gains for an investor in the out-of-sample analysis. Moreover, potential economic gains are driven much more by the prior belief of how strong the predictors are (in terms of the coefficient of determination) relative to the prior belief of how imperfect the predictors are (the correlation between the expected and unexpected return). In particular, investors with modest prior on the coefficient of determination achieve superior performance relative to investors with extreme priors (either too optimistic or too skeptical). We find the same patterns in different time periods and check that they are not driven by extreme portfolio allocations. Furthermore, instead of individual predictors for assets we estimate both models with multiple predictors. The findings are in line with the stream of literature, showing that complexity does not always improve the performance, in particular when predicting very noisy stock and bond returns. However, economically motivated constraints (introduced via priors) are able to boost the performance. Our article further relates to Binsbergen and Koijen (2010) who focus on return predictability using filtering techniques. They investigate how the current price–dividend ratio is able to predict future aggregate returns and dividend growth rates as a part of a state space system. 
In particular, they focus on the interaction term between return and dividend growth predictability, and the role of the reinvestment strategy of dividends. We contribute by analyzing the state–space predictive system in the out-of-sample setup and the crucial role of the prior belief about the predictive strength captured by the coefficient of determination. In addition, this article is related to Kelly and Pruitt (2013) who, like us, focus on out-of-sample performance. Moreover, they similarly study possible predictors as noisy measures of the latent expected return. Whereas they highlight how to use rich cross-sectional information in achieving high out-of-sample performance, we focus on the imperfection of predictors and the predictive power.

1 Literature Review

Our results bring more insight into the relationship between predictors and expected returns. In the first models, such as Samuelson (1965, 1969) and Merton (1969), excess returns were assumed to be unpredictable and investors were to keep portfolio weights constant over time. However, the empirical literature in the 1980s found variables with predictive power to explain stock and bond returns (Keim and Stambaugh, 1986; Fama and French, 1989; Cochrane, 1992). After strong evidence in favor of return predictability on the aggregate level in the 1990s and 2000s, more recent evidence suggests that return predictability is debatable or even illusory. In their comprehensive study with many different predictors, Welch and Goyal (2008) show that although the in-sample predictive power of the models might be significant, out-of-sample forecasts are poor. They argue that no variable has any significant predictive power. However, their conclusion is based on the OLS estimation of predictive regressions for various predictors, and is not robust with respect to different predictive models and estimation techniques.
Therefore, the question whether there are variables containing some predictable components remains unsettled and still fascinates many researchers. Attempts to deal with this question come from many areas of empirical finance. The first stream of literature improves the forecasting performance by small refinements. Campbell and Thompson (2008) respond to Welch and Goyal (2008) by imposing constraints on the sign of the coefficients and return forecasts. Rapach, Strauss, and Zhou (2010) take combinations—that is, means or medians—of predictions from different predictive regressions to obtain a better performance. Avramov (2002) adopts a Bayesian model averaging methodology to exploit the information from different predictors at once and finds both in-sample and out-of-sample predictability. A second stream of literature attempts to explain the predictability phenomenon by various versions of time variation, such as structural breaks, or time-varying coefficients. Pesaran and Timmermann (2002) identify one structural break around 1991 after which predictability disappears. However, later studies differ quite considerably in terms of the timing of breaks and their number. Furthermore, the out-of-sample performance is found to be poor because breaks cannot be reliably detected in real time (Lettau and Van Nieuwerburgh, 2008). Dangl and Halling (2012) assume the predictive regression with time-varying coefficients in a Bayesian framework, and provide a comprehensive look at the performance in this setup. They find that an investor following the optimal strategy implied by their model would be consistently better off than an investor using the historic mean. The third way of looking at this phenomenon is by using regime-switching models. Most studies which find support for two regimes (Henkel, Martin, and Nardari, 2011)—interpreted as recession and expansion—find a countercyclical pattern. 
Whereas predictability during recessions is significantly better than the historical average, predictability during expansions is typically weaker, if at all. The intuition is simple. In bad times, investors demand a higher risk premium. Furthermore, volatility is higher. The prices are adjusted much more to discount rates per unit of price change. As a consequence, prices are more sensitive to a more volatile price–dividend ratio. Wachter and Warusawitharana (2015) assume an investor who distinguishes two states of the world—when returns are predictable and when returns are unpredictable—and assigns prior beliefs on the two states, that is, the two models. They find strong support in favor of predictability. Most of the recent papers in favor of predictability rely on the Bayesian estimation technique. This method allows an investor to incorporate her prior beliefs about the model to determine the optimal weights. The initiation of applying this approach in the asset allocation literature goes back to the paper by Kandel and Stambaugh (1996) and their simulation study. Although predictability seems to be weak in terms of frequentist statistical measures, they show that an investor observing the simulated data might significantly change her asset allocation and improve her performance. This article supplements the literature on predictability by elaborating on a fair comparison of predictive regression and the predictive system in the Bayesian framework. It evaluates the out-of-sample performance of both models for different prior beliefs and compares the changes in asset allocation with respect to the model and the prior distribution. In the empirical part, we find that the predictive regression—a more parsimonious model with perfect predictors—turns out to perform better in terms of out-of-sample CERs. The remainder of the article is structured as follows. Section 2 presents the modeling framework and the estimation technique. 
We discuss both investigated models in detail and highlight their differences. In Section 3, we describe the data used for modeling, apply the suggested models to them, and report our empirical findings. Section 4 concludes and provides suggestions for future research.

2 Econometric Methodology

In this section, we introduce predictive regression and the predictive system, explain their main differences, and describe the estimation technique and criteria we use to evaluate the out-of-sample forecasts.

2.1 Model Setup

The most common way of modeling predictability is by using predictive regression. Here, we assume that predictors, usually some financial ratios, provide full information about the expected returns. However, we might have some doubts whether these variables fully capture the actual market expectations. Therefore, we can search for an efficient estimate of unobservable expectations, given the noisy proxies that are available. The predictive system offers one way to put some structure on the return process and model the noisy (that is, imperfect) predictors. We define the realized return $r_{t+1}$ as the sum of the expected return $\delta_t$ and the unexpected return $u_{t+1}$:

$$ r_{t+1} = \delta_t + u_{t+1}. \tag{1} $$

The two models we consider in this article differ with respect to the relation between the expected return $\delta_t$ and predictors. In predictive regression, the expected return depends only on the most recent value of the predictors. The model is given by

$$ r_{t+1} = a + b' x_t + u_{t+1}, \tag{2} $$

$$ x_{t+1} = (I - A) E_x + A x_t + v_{t+1}, \tag{3} $$

where $x_t$ is a vector of predictors, $a$, $b$, $A$, and $E_x$ are regression coefficients, and

$$ \begin{bmatrix} u_t \\ v_t \end{bmatrix} \sim (0, \Sigma), \quad \text{where} \quad \Sigma \equiv \begin{bmatrix} \sigma_u^2 & \sigma_{uv} \\ \sigma_{vu} & \Sigma_v \end{bmatrix} \tag{4} $$

are identically distributed errors. Because the expected return is modeled as a linear function of the predictors, $\delta_t = a + b' x_t$, it implies a perfect correlation between predictors and expected returns (not realized returns). This means that the entire variance in the expected returns is explained by the current value of the predictors.
The predictive system proposed by Pastor and Stambaugh (2009, 2012) relaxes this assumption of perfect correlation. It takes the form

$$ r_{t+1} = \delta_t + u_{t+1}, \tag{5} $$

$$ x_{t+1} = (I - A) E_x + A x_t + v_{t+1}, \tag{6} $$

$$ \delta_{t+1} = (1 - \beta) E_r + \beta \delta_t + w_{t+1}, \tag{7} $$

where $\delta_t$ is the unobservable expected excess return and $A$, $\beta$, $E_x$, and $E_r$ are system coefficients. The error distribution is

$$ \begin{bmatrix} u_t \\ v_t \\ w_t \end{bmatrix} \sim N(0, \Sigma), \quad \text{where} \quad \Sigma \equiv \begin{bmatrix} \sigma_u^2 & \sigma_{uv} & \sigma_{uw} \\ \sigma_{vu} & \Sigma_v & \sigma_{vw} \\ \sigma_{wu} & \sigma_{wv} & \sigma_w^2 \end{bmatrix} \tag{8} $$

and the errors are identically distributed. Both expected excess returns and predictors follow AR(1) processes. In the predictive system, the connection between the expected return and the predictors is not obvious. In fact, they are related through the error covariance matrix. In the case of a single predictor, the correlation between the expected return $\delta_t$ and the predictor $x_t$ is determined by the correlation between the errors $\rho_{vw}$ and the scalars $A$ and $\beta$:

$$ \rho_{x\delta} = \rho_{vw} \sqrt{\frac{(1 - A^2)(1 - \beta^2)}{(1 - A\beta)^2}}. \tag{9} $$

Therefore, it can be easily shown that the predictive system collapses to predictive regression if $A = \beta$ and $\rho_{vw} = \pm 1$. Then, there exists $b$ such that $w_t = b' v_t$, and the dynamics for the expected return can be rewritten as

$$ \delta_t = (1 - A) E_r + A \delta_{t-1} + b' v_t = E_r + b' \sum_{\tau=0}^{\infty} A^{\tau} v_{t-\tau} = (E_r - b' E_x) + b' x_t = a + b' x_t, \tag{10} $$

where the second equality sums the geometric series $(1 - A) E_r \sum_{\tau} A^{\tau} = E_r$ and the third uses $x_t = E_x + \sum_{\tau=0}^{\infty} A^{\tau} v_{t-\tau}$ from equation (6). The expected return is then perfectly correlated with the predictor.

In the predictive system, unlike in the predictive regression, the current value of the predictor is not the only source of information about the expected return. The additional information in the lagged realized returns and predictors is incorporated in a parsimonious way via the covariance structure. The hypothesis that predictors are not perfectly correlated with the expected return is thoroughly discussed in Pastor and Stambaugh (2009). They argue that serial correlation of estimated residuals, typically found in empirical studies of the predictive regression, is justified using the predictive system.
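As a quick numerical check of equation (9), the following minimal sketch evaluates the implied correlation between the predictor and the expected return for a single predictor. The function name and the parameter values are illustrative assumptions, not taken from the article.

```python
import math


def corr_x_delta(rho_vw: float, A: float, beta: float) -> float:
    """Equation (9): correlation between predictor x_t and expected return delta_t.

    Both AR(1) coefficients must be stationary (absolute value below one).
    """
    if not (abs(A) < 1 and abs(beta) < 1):
        raise ValueError("A and beta must lie strictly inside (-1, 1)")
    return rho_vw * math.sqrt((1 - A**2) * (1 - beta**2)) / (1 - A * beta)


# Hypothetical values: equal persistence and perfectly correlated shocks
# reproduce the predictive regression (a perfect predictor).
perfect = corr_x_delta(rho_vw=1.0, A=0.9, beta=0.9)

# Hypothetical values: different persistence weakens the link even when
# the shocks themselves are perfectly correlated.
imperfect = corr_x_delta(rho_vw=1.0, A=0.5, beta=0.97)

print(f"A == beta: {perfect:.4f}; A != beta: {imperfect:.4f}")
```

The second case illustrates why both conditions, $A = \beta$ and $\rho_{vw} = \pm 1$, are needed for the system to collapse to the predictive regression.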
Moreover, different values of the correlation $\rho_{uw}$ allow for modeling various types of dependence of expected returns on lagged values of the predictor. As discussed in their paper, the bond price is purely driven by discount rate shocks, which implies the correlation between the innovations in expected returns and the unexpected return to be 1. For stocks, the analogous effect on the negative correlation might be weaker, but still present. Therefore, they investigate the role of more or less informative priors by varying the mass put on the negative values. They show that they get more precise in-sample estimates by assuming an informative prior. The present article contributes to the analysis of the system by looking at its out-of-sample properties, which are important for an investor. Besides the prior distribution on the correlation $\rho_{uw}$ that is analyzed in Pastor and Stambaugh (2009), we emphasize the importance of the priors on parameters relevant for the implied prior distribution on $R^2$. Moreover, we provide a comparison of the predictions from the predictive system to predictive regression, where both models are estimated in a Bayesian framework with the same prior beliefs.

2.2 Estimation Technique and Prior Distribution

As shown in Wachter and Warusawitharana (2009), the estimation technique has an effect on performance. In their article, the predictive regression estimated by OLS exhibits poor performance (in terms of CER) compared to the same model estimated by Bayesian methods. The dependence of the results on the estimation technique suggests using a Bayesian approach, via MCMC, to estimate the predictive system as well. Furthermore, a Bayesian approach allows us to consider different investor beliefs about the predictive power of the system before knowing the data. Economically significant effects of different priors are also emphasized in Shanken and Tamayo (2012).
In contrast to this article and the paper by Wachter and Warusawitharana (2009) with their focus on the prior about the coefficient of determination, they analyze a larger set of parameters and their prior distributions. However, it is not easy for an investor to form prior beliefs about many individual parameters. We offer a more parsimonious approach where an investor might have an opinion about the $R^2$, but the prior beliefs about the rest of the parameters are noninformative.

For the estimation of the predictive regression, we adopt the framework proposed by Wachter and Warusawitharana (2009). They allow for different prior beliefs on $R^2$, which is usually interpreted as an indicator of predictive power. In this article, we develop a similar approach for the predictive system that allows us to compare the two models and analyze the effects of the priors.

In the predictive system, we need to estimate the unobservable series of expected returns $\delta_t$ and several parameters: $\alpha$, $\beta$, $A$, $B$, and a covariance matrix $\Sigma$. As discussed in Pastor and Stambaugh (2009), the system is not fully identified by the data alone. Only the coefficients ($\alpha$, $\beta$, $A$, $B$) and the covariance of innovations in the process for predictors, $\Sigma_v$, can be identified by the data. However, in the Bayesian framework we can impose additional structure on the covariance matrix $\Sigma$ via priors, which guarantees the identification of all parameters in the covariance matrix. The Bayesian setup allows us to put an informative prior distribution on the parameters about which we have some intuition, and noninformative priors otherwise. Our main focus is on the parameters that have an impact on the $R^2$ of the first equation in the system, which is defined as

$$ R^2 = 1 - \frac{\sigma_u^2}{\sigma_r^2} = 1 - \frac{\sigma_u^2}{\dfrac{\sigma_w^2}{1 - \beta^2} + \sigma_u^2}. \tag{11} $$

We implement different priors on $R^2$ by imposing restrictions on the distributions of $\sigma_u^2$, $\sigma_w^2$, and $\beta$.
We model the prior on $R^2$ indirectly by imposing a specific structure on the prior distribution of the error covariance matrix $\Sigma$. We choose an informative prior on the error covariance submatrix $\Sigma_{11}$,

$$ \Sigma_{11} = \begin{bmatrix} \sigma_u^2 & \sigma_{uw} \\ \sigma_{uw} & \sigma_w^2 \end{bmatrix}, \tag{12} $$

but a noninformative prior on the other elements of the error covariance matrix $\Sigma$. Stambaugh (1997) argues that such a prior can be modeled as a posterior of $\Sigma$ with a noninformative prior and an alternative hypothetical sample of $T_1$ observations of $v$ and $T_2$ observations of $(u, w)$, where $T_1 \ll T_2 \ll T$ and $T$ is the actual number of observations. As in Pastor and Stambaugh (2009), we use $T_1 = K + 3$ and $T_2 = T/5$, where $K$ is the number of predictors. Therefore, the informative prior on the submatrix $\Sigma_{11}$ has an inverted Wishart distribution

$$ \Sigma_{11} \sim IW(T_2 \hat{\Sigma}_{11},\, T_2 - K), \tag{13} $$

where the elements of $\hat{\Sigma}_{11}$ are chosen so that $E(\Sigma_{11})$ delivers different distributions of $R^2$. In contrast to Pastor and Stambaugh (2009), we do not analyze the prior on $\rho_{uw}$ (which is driven by the off-diagonal element of $\hat{\Sigma}_{11}$), but we focus on $\sigma_u^2$ and $\sigma_w^2$, which are related to the diagonal elements of $\hat{\Sigma}_{11}$. By using the relation

$$ \sigma_\delta^2 = \frac{\sigma_w^2}{1 - \beta^2} \tag{14} $$

in equation (11), we see that these variances, together with the value of $\beta$, determine the value of $R^2$.

As a starting point, we investigate an investor with the same priors as used by Pastor and Stambaugh (2009). We set the elements of $\hat{\Sigma}_{11}$ in such a way that the prior mean of the variance of the unexpected return $\sigma_u^2$ equals 95% of the sample return variance $\sigma_r^2$. The prior mean of the variance of the shocks in the expected returns $\sigma_w^2$ is chosen so that, combined with $\beta = 0.97$, the variance of the expected return $\sigma_\delta^2$ equals 5% of the sample return variance. Pastor and Stambaugh argue that these values imply a plausible prior on $R^2$. We go one step further and analyze a richer set of priors to see the effect of different beliefs about predictability.
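The benchmark calibration can be illustrated with a short sketch that backs out the shock variance $\sigma_w^2$ from equation (14) and then evaluates equation (11) at the prior means. The function names and the sample return variance are hypothetical, and evaluating $R^2$ at the prior means is a simplification: it is not the same as the prior mean of $R^2$.

```python
def shock_variance(frac_explained: float, beta: float, var_r: float) -> float:
    """sigma_w^2 such that sigma_delta^2 = frac_explained * var_r (equation 14)."""
    return frac_explained * var_r * (1 - beta**2)


def r_squared(var_u: float, var_w: float, beta: float) -> float:
    """Equation (11): share of return variance due to the expected return."""
    var_delta = var_w / (1 - beta**2)
    return 1 - var_u / (var_delta + var_u)


var_r = 0.0064   # hypothetical quarterly sample return variance (8% volatility)
beta = 0.97      # persistence of expected returns used for the benchmark prior

var_u = 0.95 * var_r                       # unexpected-return variance: 95% of var_r
var_w = shock_variance(0.05, beta, var_r)  # chosen so that sigma_delta^2 is 5% of var_r
r2 = r_squared(var_u, var_w, beta)

print(f"sigma_w^2 = {var_w:.3e}, R^2 at prior means = {r2:.3f}")
```

Because $\beta$ is close to one, a very small shock variance is enough to generate the targeted variance of the expected return.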
An investor with the priors used in Pastor and Stambaugh (2009) is our benchmark investor and, with a prior expected value of 5% for the coefficient of determination, is considered to be modest (compared to the other types of investors). Moreover, we consider a few more types of investors with

$$ E(\sigma_\delta^2) = k \sigma_r^2 \quad \text{and} \quad E(\sigma_u^2) = (1 - k) \sigma_r^2, \tag{15} $$

where $k$ is the fraction of explained variance and $1 - k$ is the fraction of unexplained variance. The explained variance is lower for skeptical investors; more specifically, we consider $k$ = 1%, 0.5%, and 0.1%. We chose these values to obtain priors comparable to those in Wachter and Warusawitharana (2009). Finally, we also consider more optimistic priors with $k$ = 50% and 10%.

A closed-form solution for the relation between $E(R^2)$ and $k$ is not available. However, we can sketch this relation by using the formula derived in Ali, Pal, and Woo (2007). They consider two independent random variables $X$ and $Y$ with inverted gamma distributions (one-dimensional inverse Wishart distributions) and derive a formula for the ratio $X/(X+Y)$ in terms of the gamma function. In our case, $X$ corresponds to the variance of $\delta$ and $Y$ is the variance of $u$. (In general, however, they do not have to be independent in our model.) With this one-dimensional simplification, we can calculate the expected value of $R^2$ for all $k$, except $k$ = 50%, which is outside the range for which the formula yields a usable value. From Table 1, we see that $E(R^2)$ and $k$ are practically the same and, thus, for ease of exposition in the rest of the article, we refer to $E(R^2)$ and $k$ as the same parameter.

Table 1. Relation of the expected value of the prior $R^2$ and the parameter k

Parameter k, %    50    10       5       1       0.5     0.1
E(R^2), %         –     10.356   5.081   1.003   0.508   0.100

Notes: The mean of $R^2$ is calculated by using a formula for the ratio of two inverted gamma-distributed variables (Ali, Pal, and Woo, 2007). The reported values are based on the length of the time series used in the empirical part. For a longer time series, the expected value of $R^2$ would be even closer to the parameter k.

The framework described above allows us to estimate the predictive system with different means of the prior distribution on $R^2$. Moreover, the framework derived in Wachter and Warusawitharana (2009) enables us to model different means of the prior on $R^2$ in the predictive regression. By defining the normalized variable $\eta$,

$$ \eta = \frac{\sigma_x \gamma}{\sigma_u}, \tag{16} $$

with the normally distributed prior $\eta \sim N(0, \sigma_\eta^2)$, Wachter and Warusawitharana show that

$$ R^2 = \frac{\eta^2}{\eta^2 + 1}. \tag{17} $$

Therefore, by choosing the corresponding pairs of parameters $\sigma_\eta$ in the predictive regression and $k$ in the predictive system, we can estimate the models for investors with various beliefs about predictability. Table 2 reports which values of $\sigma_\eta$ from Wachter and Warusawitharana (2009) correspond to the parameter $k$ for the predictive system in our framework.
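To see how a choice of $\sigma_\eta$ translates into a prior mean of $R^2$ through equation (17), one can integrate $\eta^2/(\eta^2+1)$ against the normal prior numerically. The sketch below uses a plain trapezoidal rule; the function names, grid size, and integration width are assumptions made for illustration, and the output is only expected to be close to the rounded pairings reported in the article.

```python
import math


def r2_from_eta(eta: float) -> float:
    """Equation (17): R^2 implied by the normalized coefficient eta."""
    return eta**2 / (eta**2 + 1)


def prior_mean_r2(sigma_eta: float, n: int = 20001, width: float = 10.0) -> float:
    """E[R^2] under eta ~ N(0, sigma_eta^2), via trapezoidal integration
    over [-width * sigma_eta, width * sigma_eta]."""
    lo = -width * sigma_eta
    step = 2 * width * sigma_eta / (n - 1)
    norm = sigma_eta * math.sqrt(2 * math.pi)
    total = 0.0
    for i in range(n):
        eta = lo + i * step
        density = math.exp(-0.5 * (eta / sigma_eta) ** 2) / norm
        weight = 0.5 if i in (0, n - 1) else 1.0  # trapezoid end-point weights
        total += weight * r2_from_eta(eta) * density
    return total * step


# Prior standard deviations of eta as listed for the predictive regression.
for sigma_eta in (1.60, 0.37, 0.24, 0.10, 0.07, 0.03):
    print(f"sigma_eta = {sigma_eta:4.2f} -> E(R^2) = {100 * prior_mean_r2(sigma_eta):6.3f}%")
```

A tighter prior on $\eta$ concentrates mass near zero and therefore expresses more skepticism about predictability.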
In the comparison of the models, we thus assume that the prior distributions on $R^2$ in the system and in the regression have the same mean. However, we have to point out that the distributions are not identical and differ in higher moments. Although we have looked carefully at the prior distributions implied by different values of model parameters that have an impact on $R^2$, we are not able to match the higher moments.

Table 2. Comparison of parameters in the predictive system and the predictive regression

Fraction of explained variance (%)    50     10     5      1      0.5     0.1
Predictive system, k                  0.5    0.1    0.05   0.01   0.005   0.001
Predictive regression, σ_η            1.60   0.37   0.24   0.10   0.07    0.03

As we do not focus on the correlation $\rho_{uw}$ between the unexpected return and innovations in the expected return, we estimate the model for the same three priors on $\rho_{uw}$ that are used in Pastor and Stambaugh (2009). They argue that this correlation is negative and, thus, the informative priors reflect this belief. The three priors are a noninformative prior with the same mass on positive and negative values, a less informative prior with positive mass only on negative values, and a more informative prior with most of its mass on negative values close to 1 in absolute value. However, we argue based on the empirical results below that this prior does not have a strong effect on the out-of-sample performance.
All priors on the other parameters of the system are the same as in Pastor and Stambaugh (2009). Thus, we have defined a framework that allows for a comparison of the predictive system and the predictive regression. Both models are estimated by the Bayesian approach. Therefore, we are able to compare the characteristics and implications of the two models in a way that is unaffected by the estimation technique.

2.3 Out-of-Sample Performance

Our focus is to investigate the out-of-sample performance of the predictive system and predictive regression given different investor beliefs. Although many papers find strong in-sample predictability, the results in real-time evaluations of the models are mostly much weaker. In this article, we compare the performance of the two suggested models estimated by the Bayesian framework in real time. We now describe the evaluation procedure in more detail.

For measuring out-of-sample performance, we use an expanding window strategy. Following Wachter and Warusawitharana (2009), we estimate the models after observing at least 20 years of quarterly data. By simulating 200,000 (75,000) draws, dropping the first 50,000 (1000) as a burn-in phase, and taking every third draw from the rest to decrease the serial correlation, we obtain the posterior distributions for the predictive regression (predictive system). Both models are re-estimated every four quarters. Predictions for t + 1 are computed every quarter, holding estimates fixed throughout a year, but using observed predictors lagged by one quarter.

In the optimal portfolio choice problem, we consider an investor who holds stocks, bonds, and a risk-free asset in her portfolio. She maximizes expected utility in the next period t + 1 conditional on the information available now (period t). Although we look at the one-period static asset allocation problem, it would be possible to predict more periods ahead and look at the effects of the dynamic asset allocation problem.
However, this is beyond the scope of this article, which provides a first step in comparing the performance of the models in a static setup. The investor solves the one-period portfolio choice problem

$$ \max E_t\big(U(W_{t+1})\big), $$

where $U(\cdot)$ is a utility function, $W_t$ is the wealth at time $t$, and the value is calculated conditional on all available information through time $t$. We consider an investor with a mean–variance utility function to make our results comparable to other studies (Wachter and Warusawitharana, 2009; Dangl and Halling, 2012). Therefore, the stock and bond weights from time $t$ to $t+1$ are given by

$$ \omega_t^i = \frac{1}{A} \frac{E_t(r_{t+1}^i)}{\mathrm{Var}_t(r_{t+1}^i)}, \tag{18} $$

where $A$ is a risk-aversion coefficient, $i$ is the index of risky assets, and $E_t(r_{t+1}^i)$ and $\mathrm{Var}_t(r_{t+1}^i)$ are the first two moments of the posterior distribution of the returns at time $t+1$, conditional on the information at time $t$. Given a draw $j$ from the posterior distribution of the model parameters, a draw from the predictive distribution of asset returns is given by $r_j = \delta_j + u_j$ for the predictive system and by $r_j = a_j + b_j x_t + u_j$ for the predictive regression, where $u_j \sim N(0, \sigma_{u,j}^2)$. The optimal portfolio is then the solution to (18) with the mean and variance computed by simulating draws $r_j$. Both moments are determined for each model separately.

As it stands, the covariance among assets in the predictive regression and the predictive system cannot be easily modeled. Therefore, the asset weights in equation (18) are based on an asset correlation of zero. To investigate the role of covariance, we additionally use the correlation between the risky assets from the historical data (i.e., the sample correlation) as a robustness check in the empirical part. The final wealth is given by

$$ W_{t+1} = W_t \underbrace{\Big( \sum_i \omega_t^i r_{t+1}^i + r_{f,t+1} \Big)}_{r_{t+1}^p}, \tag{19} $$

where $r_t^i$ is the realized return on the risky asset $i$, $r_{f,t}$ is the realized return on the risk-free asset, and $r_t^p$ is the portfolio return.
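The mapping from posterior draws to portfolio weights in equations (18) and (19) can be sketched as follows. The posterior draws here are simulated from made-up distributions purely for illustration (in the article they come from the MCMC output), and all names and numbers are assumptions.

```python
import random
from statistics import mean, variance


def mean_variance_weight(return_draws, risk_aversion):
    """Equation (18): predictive mean divided by (risk aversion x predictive variance)."""
    return mean(return_draws) / (risk_aversion * variance(return_draws))


def portfolio_return(weights, excess_returns, risk_free):
    """Term in parentheses in equation (19): weighted risky returns plus the risk-free return."""
    return sum(w * r for w, r in zip(weights, excess_returns)) + risk_free


rng = random.Random(42)
n_draws = 20_000

# Hypothetical stand-ins for posterior draws of delta_j and sigma_{u,j}.
delta_draws = [rng.gauss(0.01, 0.002) for _ in range(n_draws)]
sigma_u_draws = [0.08] * n_draws

# Predictive draws for the predictive system: r_j = delta_j + u_j.
return_draws = [d + rng.gauss(0.0, s) for d, s in zip(delta_draws, sigma_u_draws)]

stock_weight = mean_variance_weight(return_draws, risk_aversion=5.0)
realized = portfolio_return([stock_weight], [0.02], risk_free=0.005)

print(f"weight = {stock_weight:.3f}, realized portfolio return = {realized:.4f}")
```

Computing the weight for each risky asset separately, as here, corresponds to the zero-correlation assumption stated in the text.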
For assessing out-of-sample performance, what matters to an investor is how predictability maps into the gains and losses of her strategy. Because the investor takes the first and second moments of the return distribution into account in the optimal asset allocation problem, it is natural to take these moments into account when evaluating model performance as well. Therefore, we measure the performance in terms of the CER, which evaluates the model in economic terms, adjusting for risk. For the portfolio over the considered out-of-sample time period, we define CER as

$$\mathrm{CER} = \bar{r}^p - \frac{A}{2}\, v_{r^p}, \qquad \bar{r}^p = \frac{1}{n}\sum_t r_t^p, \qquad v_{r^p} = \frac{1}{n-1}\sum_t \big(r_t^p - \bar{r}^p\big)^2, \tag{20}$$

where $t$ indexes all quarters in the out-of-sample period 1972–2011, $n$ is the number of these quarters, $\bar{r}^p$ is the average portfolio return over the out-of-sample period, and $v_{r^p}$ is the portfolio variance over the out-of-sample period. In the tables below, CERs are multiplied by 400 to express them as annual percentages.

Assessing the significance of the out-of-sample performance is not easy, because the parameters are re-estimated sequentially and the usual tests known from in-sample analysis cannot be transferred to an out-of-sample context. Dangl and Halling (2012) use daily data to estimate the monthly variance of the portfolio of stocks and the risk-free asset. However, as we do not have daily data for bond prices, we cannot repeat their test. Furthermore, the simulation exercise undertaken by Wachter and Warusawitharana (2009) for the predictive regression is infeasible for the predictive system due to time constraints. Therefore, for each CER, we opt to focus on the bootstrap standard error (Efron and Gong, 1983) that reflects the variability of the mean of CER. In particular, to account for possible time dependence in realized returns, we rely on block bootstrap standard errors.
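The CER in Equation (20) and a block bootstrap standard error can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the simulated return series are hypothetical, the blocks are drawn with uniformly random starting quarters and may overlap (a moving-block scheme, which the text does not specify), and the standard error is taken as the standard deviation of the CER across bootstrap replications. The block length of four quarters and the 10,000 replications follow the procedure used in this article.

```python
import numpy as np


def cer(portfolio_returns, risk_aversion):
    """Equation (20): mean return minus A/2 times the sample variance."""
    r = np.asarray(portfolio_returns, dtype=float)
    return r.mean() - 0.5 * risk_aversion * r.var(ddof=1)


def block_bootstrap_se(portfolio_returns, risk_aversion, block_len=4,
                       n_boot=10_000, seed=0):
    """Standard error of the CER from resampled blocks of consecutive quarters.

    Assumption: blocks may overlap and start at a uniformly drawn quarter;
    the resampled series is trimmed to the original length.
    """
    r = np.asarray(portfolio_returns, dtype=float)
    n = r.size
    rng = np.random.default_rng(seed)
    n_blocks = -(-n // block_len)              # ceil(n / block_len)
    offsets = np.arange(block_len)
    stats = np.empty(n_boot)
    for k in range(n_boot):
        starts = rng.integers(0, n - block_len + 1, size=n_blocks)
        idx = (starts[:, None] + offsets).ravel()[:n]
        stats[k] = cer(r[idx], risk_aversion)
    return stats.std(ddof=1)


if __name__ == "__main__":
    # Hypothetical quarterly portfolio returns for 160 quarters (40 years).
    returns = np.random.default_rng(42).normal(0.02, 0.05, 160)
    for A in (2, 5):
        point = 400 * cer(returns, A)          # annualized, in percent
        se = 400 * block_bootstrap_se(returns, A, n_boot=2_000)
        print(f"A={A}: CER = {point:.2f}% (s.e. {se:.2f})")
```

Resampling whole blocks rather than single quarters keeps short-run dependence in the realized returns intact within each block, which is the motivation given in the text for using a block bootstrap.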
We bootstrap blocks of four sequential quarterly out-of-sample returns until we obtain a series of the same length as the original out-of-sample window, and we repeat this procedure 10,000 times. Based on the simulated returns, we calculate the standard error.

3 Data and Empirical Findings

We conducted our analysis on quarterly data spanning the first quarter of 1952 to the last quarter of 2011. Following other studies, we began our sample after 1951, when the Fed was allowed to pursue an independent monetary policy. All financial data are obtained from the Center for Research in Security Prices (CRSP). Excess stock returns are defined as the quarterly returns on the NYSE-AMEX-NASDAQ index in excess of the three-month Treasury bill. Similarly, the excess bond returns are constructed as the quarterly returns on the ten-year Treasury bond minus the three-month Treasury bill. The dividend–price ratio, used as a predictor of the stock returns (Campbell and Shiller, 1988; Fama and French, 1988), is constructed as the sum of total dividends paid over the previous 12 months divided by the current price. Dividends are calculated from the monthly returns, inclusive and exclusive of dividends, on the value-weighted NYSE-AMEX-NASDAQ index. The yield spread, used as a predictor of the bond returns (Campbell and Shiller, 1991), is constructed as the yield on the five-year bond minus the yield on the three-month bond.

3.1 Results

This section describes the results obtained from the estimation of both models considered: the predictive regression and the predictive system. First, we evaluate the out-of-sample performance for the entire sample period, and then we conduct several robustness checks. We start with the analysis of the effects of the prior distribution for R2 on the in-sample model performance. Table 3 shows the mean and standard deviation of the posterior distribution of R2 for the predictive system and predictive regression for both risky assets.
A similar analysis has already been done for stock returns in Pastor and Stambaugh (2009). However, they do not distinguish between different priors on R2 and use only one prior, labeled P&S in our tables. Moreover, the predictive regression in their paper is estimated by OLS, which makes a comparison to the Bayesian estimates obtained for the predictive system difficult. Nevertheless, our more comprehensive results are consistent with their conclusion on the R2 for stock returns. For every column (i.e., prior on the fraction of the explained variance), the mean of the in-sample R2 for the predictive system is higher than for the predictive regression. However, the results for bond returns are less clear. For the most optimistic prior, the predictive system yields a higher posterior mean for R2 than the predictive regression. For all other priors, the results are the opposite. In any case, given the high standard deviations, deriving strong conclusions may be problematic, and we take these results only as a first indication of the models' performance.

Table 3.
In-sample performance: Posterior R2

Posterior R2 (%), 1952−2011

| Fraction of explained variance (%) | 50 | 10 | 5 | 1 | 0.5 | 0.1 |
|---|---|---|---|---|---|---|
| | W&W | | P&S | W&W | W&W | W&W |
| | optimistic | | | | | pessimistic |
| Stocks | | | | | | |
| Predictive system | 8.70 | 5.55 | 4.63 | 3.00 | 2.38 | 1.17 |
| | (3.28) | (4.61) | (5.17) | (6.23) | (5.42) | (4.13) |
| Predictive regression | 1.36 | 1.30 | 1.26 | 1.02 | 0.84 | 0.16 |
| | (1.57) | (1.06) | (0.83) | (0.65) | (0.59) | (0.20) |
| Bonds | | | | | | |
| Predictive system | 8.62 | 2.65 | 1.82 | 0.92 | 0.59 | 0.18 |
| | (2.97) | (2.05) | (2.82) | (3.50) | (2.79) | (1.43) |
| Predictive regression | 5.54 | 5.22 | 4.78 | 2.75 | 1.65 | 0.23 |
| | (3.00) | (2.85) | (2.64) | (1.69) | (1.14) | (0.24) |

Notes: This table reports means and standard deviations (in parentheses) of posterior R2 for the predictive system and predictive regression. The predictor is the dividend–price ratio for the stock returns and the yield spread for the bond returns. Different beliefs about the prior distribution on R2 are considered, characterized by the mean of the prior distribution. Lower values represent more skeptical investors. For the predictive system, we report the results for the more informative prior on the correlation between the expected and unexpected returns. Data are quarterly and span from 1952 to 2011.

The out-of-sample coefficients of determination are reported in Table 4.
The in-sample performance does not persist out of sample, and R2 decreases for both stock and bond returns. For stock returns, the predictive system delivers a higher R2 than the predictive regression for investors who have modest priors. For bonds, the predictive regression achieves a higher R2. Although the coefficient of determination is a very common statistical measure, it is not clear what overall effect these R2s have on an investor's portfolio. Therefore, we focus on the CERs in the rest of the analysis.

Table 4. Out-of-sample R2

Out-of-sample R2 (%), 1972−2011

| Fraction of explained variance (%) | 50 | 10 | 5 | 1 | 0.5 | 0.1 |
|---|---|---|---|---|---|---|
| | W&W | | P&S | W&W | W&W | W&W |
| | optimistic | | | | | pessimistic |
| Stocks | | | | | | |
| Predictive system | −2.28 | −0.71 | 0.26 | 1.42 | 0.68 | 0.26 |
| Predictive regression | −0.18 | −0.17 | −0.26 | 0.30 | 0.68 | 1.07 |
| Bonds | | | | | | |
| Predictive system | 0.23 | −0.07 | 0.30 | 0.46 | 0.49 | 0.62 |
| Predictive regression | 4.17 | 4.15 | 4.05 | 3.45 | 2.64 | 0.86 |

Notes: This table reports out-of-sample R2 for the predictive system and predictive regression. The predictor is the dividend–price ratio for the stock returns and the yield spread for the bond returns. Different beliefs about the prior distribution on R2 are considered, characterized by the mean of the prior distribution. Lower values represent more skeptical investors. For the predictive system, we report the results for the more informative prior on the correlation between the expected and unexpected returns. Data are quarterly and span from 1952 to 2011.

Table 5 reports our main results, presenting CERs of asset allocation strategies obtained from the entire out-of-sample period 1972–2011. We report CERs for both models under different prior distributions on R2 and, for the predictive system, under different priors on the correlation ρuw. We consider investors with two different risk-aversion coefficients, A = 2 and A = 5. For both degrees of risk aversion, the results are qualitatively very similar.

Table 5.
Out-of-sample performance: CERs

| Fraction of explained variance (%) | 50 | 10 | 5 | 1 | 0.5 | 0.1 |
|---|---|---|---|---|---|---|
| | W&W | | P&S | W&W | W&W | W&W |
| | optimistic | | | | | pessimistic |
| Panel A: Risk aversion A = 2, 1972−2011 | | | | | | |
| Predictive system, prior on ρuw: | | | | | | |
| More informative | 3.45 | 7.49 | 8.31 | 8.37 | 6.83 | 6.08 |
| | (0.38) | (0.31) | (0.29) | (0.20) | (0.18) | (0.16) |
| Less informative | 5.25 | 8.18 | 8.65 | 7.49 | 7.67 | 8.39 |
| | (0.37) | (0.32) | (0.30) | (0.27) | (0.25) | (0.17) |
| Noninformative | 4.54 | 7.94 | 8.73 | 8.08 | 8.07 | 9.19 |
| | (0.42) | (0.34) | (0.30) | (0.27) | (0.25) | (0.20) |
| Predictive regression | 8.12 | 8.64 | 8.95 | 10.76 | 10.80 | 9.77 |
| | (0.39) | (0.37) | (0.34) | (0.26) | (0.21) | (0.17) |
| Panel B: Risk aversion A = 5, 1972−2011 | | | | | | |
| Predictive system, prior on ρuw: | | | | | | |
| More informative | 4.91 | 6.46 | 6.79 | 6.81 | 6.19 | 5.87 |
| | (0.15) | (0.12) | (0.12) | (0.08) | (0.07) | (0.07) |
| Less informative | 5.59 | 6.73 | 6.91 | 6.44 | 6.52 | 6.83 |
| | (0.15) | (0.13) | (0.12) | (0.11) | (0.10) | (0.07) |
| Noninformative | 5.30 | 6.62 | 6.94 | 6.69 | 6.70 | 7.17 |
| | (0.17) | (0.14) | (0.12) | (0.11) | (0.10) | (0.08) |
| Predictive regression | 6.76 | 6.96 | 7.08 | 7.79 | 7.81 | 7.40 |
| | (0.15) | (0.14) | (0.13) | (0.11) | (0.09) | (0.07) |

Notes: CERs are calculated for the predictive system and predictive regression. The portfolio consists of stocks, bonds, and a risk-free asset. The predictor is the dividend–price ratio for the stock returns and the yield spread for the bond returns. Optimal weights are calculated for a mean–variance investor with risk-aversion coefficients A = 2 (Panel A) and A = 5 (Panel B). At the beginning of each year, starting in 1972, we estimated the model and used the estimated parameters for calculating the optimal portfolio for each quarter in this year. Different beliefs about the prior distribution on R2 are considered, characterized by the expected value of the prior distribution. Lower values represent more skeptical investors. For the predictive system, we report the results for different priors on the correlation between the expected and unexpected returns as in Pastor and Stambaugh (2009). The noninformative prior is flat on most of the (−1,1) interval; the less informative implies most mass below zero; and the more informative imposes most mass below −0.7. Data are quarterly and span from 1952 to 2011. Numbers in parentheses are block bootstrap standard errors.

First, we discuss the results for the risk-aversion coefficient A = 2. Comparing the predictive system with the predictive regression for the same prior on R2 (i.e., comparing across rows within a column of Table 5), we find that the CER for the predictive regression is higher for any prior on ρuw. In other words, relaxing the assumption of a perfect correlation between expected returns and the predictor does not seem to pay off.
With regard to the behavior of CER with respect to the fraction of explained variance, we find an inverted U- or J-shape. Extreme investors on both tails (an optimistic investor with a high expected value of prior R2 and a skeptical investor with a low expected value) tend to perform worse than investors with a modest prior distribution. However, the most optimistic investor in our study is in a worse position than the most skeptical investor for both models. For A = 5, the difference between the models is slightly weaker, but the predictive regression still exhibits superior performance to the predictive system. As for A = 2, the CER for A = 5 varies with the prior beliefs about predictability (R2) and exhibits an inverted U-shape. By contrast, the performance is not very sensitive to the prior on the correlation ρuw.

We now investigate the asset allocation weights. The effects of different levels of optimism on the stock weights for the predictive system can be seen in Figure 1, Panel A. We plot the estimated weights given by (18) for different priors on predictive power, while choosing the prior on ρuw to be noninformative. More optimism about predictability is mirrored in more volatile weights. The more pessimistic an investor is, the less weight (in absolute terms) she puts in the risky assets. Although the weights for the most optimistic investor vary from −200% to 200%, the weights for the most pessimistic investor are almost constant at approximately 40%. These results are consistent with the findings of Wachter and Warusawitharana (2009) for the predictive regression. Furthermore, the dynamics of the weights over time reflect an unfavorable situation for stocks in the 1990s, when the holdings of stocks were mostly negative. The spikes each year are caused by the fact that we estimate the model on a yearly basis and keep the estimated parameters constant for all quarters in that year.

Figure 1.
Stock weights for investors with different beliefs in predictability. Panel (A): Sensitivity to the prior on R2 for the predictive system using a noninformative prior on ρuw. Panel (B): Sensitivity to the prior on ρuw for a prior on R2 fixed at 5%. Panel (C): Sensitivity to the model choice for a prior on R2 fixed at 5%.

Panel B of Figure 1 shows the sensitivity to the prior discussed in Pastor and Stambaugh (2009). We fix the prior on R2 to 5% and plot the stock weights for different priors on ρuw. There is no clear monotonicity or any clear pattern in the weights when changing the prior. A similar pattern holds for bond holdings. This indicates that the prior on the correlation between expected and unexpected returns in the predictive system does not play a key role for the investor.

Finally, in Panel C of Figure 1 we compare the stock weights for both the predictive regression (PR in the legend) and the predictive system (PS in the legend) with the same prior beliefs about the predictive power, R2. As in Panel B, the prior on R2 is fixed to 5%. The weights from the predictive regression are slightly less volatile than the weights from the predictive system for any prior on ρuw. The higher volatility for the predictive system might be explained by the higher precision of the estimated parameters documented in Pastor and Stambaugh (2009). Lower parameter uncertainty increases the weights that a mean–variance investor is willing to allocate.

3.2 Robustness Checks

To investigate the robustness of our results, we conducted several additional checks.
As a first robustness check, we explore the role of correlation between the stock and bond returns. As we are able to estimate stock and bond returns for each model only separately, we have assumed zero correlation among these assets so far. To show that the results are not driven by this assumption, we use the correlation between the stock and bond returns obtained from historical data to calculate the optimal weights. The CERs accounting for this correlation are reported in Table 6. The absolute values are lower than in the main analysis. However, the comparative advantage of the predictive regression remains. As the correlation from historical data might not be the same as that implied by the predictive models, we further calculate the CER for a constant correlation, in particular 10% and 20% (chosen to be of a similar magnitude as the time-varying correlations from the historical data). As the results are qualitatively the same, we do not report them in a separate table.

Table 6.
Out-of-sample performance: CERs, correlation included

| Fraction of explained variance (%) | 50 | 10 | 5 | 1 | 0.5 | 0.1 |
|---|---|---|---|---|---|---|
| | W&W | | P&S | W&W | W&W | W&W |
| | optimistic | | | | | pessimistic |
| Panel A: Risk aversion A = 2, 1972−2011 | | | | | | |
| Predictive system, prior on ρuw: | | | | | | |
| More informative | 2.56 | 6.52 | 7.20 | 7.43 | 5.85 | 5.31 |
| | (0.38) | (0.32) | (0.29) | (0.21) | (0.18) | (0.16) |
| Less informative | 3.76 | 7.14 | 7.72 | 7.39 | 6.85 | 8.02 |
| | (0.37) | (0.33) | (0.31) | (0.28) | (0.25) | (0.16) |
| Noninformative | 3.22 | 6.81 | 7.86 | 7.23 | 7.49 | 8.99 |
| | (0.42) | (0.35) | (0.30) | (0.28) | (0.25) | (0.19) |
| Predictive regression | 7.66 | 8.15 | 8.68 | 10.16 | 10.61 | 8.77 |
| | (0.37) | (0.36) | (0.34) | (0.26) | (0.21) | (0.19) |
| Panel B: Risk aversion A = 5, 1972−2011 | | | | | | |
| Predictive system, prior on ρuw: | | | | | | |
| More informative | 4.54 | 6.07 | 6.34 | 6.43 | 5.79 | 5.56 |
| | (0.15) | (0.13) | (0.12) | (0.08) | (0.07) | (0.06) |
| Less informative | 4.97 | 6.30 | 6.53 | 6.34 | 6.19 | 6.68 |
| | (0.16) | (0.13) | (0.12) | (0.11) | (0.10) | (0.07) |
| Noninformative | 4.76 | 6.17 | 6.58 | 6.34 | 6.46 | 7.09 |
| | (0.17) | (0.14) | (0.12) | (0.11) | (0.10) | (0.08) |
| Predictive regression | 6.58 | 6.76 | 6.98 | 7.55 | 7.73 | 7.26 |
| | (0.15) | (0.14) | (0.13) | (0.11) | (0.09) | (0.07) |

Notes: CERs are calculated for the predictive system and predictive regression. The portfolio consists of stocks, bonds, and a risk-free asset. The predictor is the dividend–price ratio for the stock returns and the yield spread for the bond returns. The correlation between the risky assets is calculated from the historical average and used for both models. Optimal weights are calculated for a mean–variance investor with risk-aversion coefficients A = 2 (Panel A) and A = 5 (Panel B). At the beginning of each year, starting in 1972, we estimated the model and used the estimated parameters for calculating the optimal portfolio for each quarter in this year. Different beliefs about the prior distribution on R2 are considered, characterized by the expected value of the prior distribution. Lower values represent more skeptical investors. For the predictive system, we report the results for different priors on the correlation between the expected and unexpected returns as in Pastor and Stambaugh (2009). The noninformative prior is flat on most of the (−1,1) interval; the less informative implies most mass below zero; and the more informative imposes most mass below −0.7. Data are quarterly and span from 1952 to 2011. Numbers in parentheses are block bootstrap standard errors.

Second, we consider different subsamples. As there seems to be a persistent decline in expected returns and an increase in the steady-state growth rate of the economy at the beginning of the 1990s (Pesaran and Timmermann, 2002; Lettau and Van Nieuwerburgh, 2008), we split the out-of-sample period into two halves: 1972−1991 and 1992−2011.
Tables 7 and 8 report CERs for these subsamples and for the same degrees of risk aversion as for the entire period. The absolute CER in the first subsample is higher than in the second. This is consistent with the evidence in the literature indicating weaker or no predictability starting in the 1990s. In the first period, there is no clear preference for either model. For a not too optimistic investor (k = 5% or k = 10%), the predictive system outperforms the predictive regression. However, this is the only case in our robustness exercises in which the predictive system pays off compared to the predictive regression. For pessimistic prior beliefs, the predictive regression outperforms all other models. For more risk-averse investors with A = 5, the results are less volatile and less sensitive to the prior on R2. In the second period, the absolute returns are lower. Nevertheless, if we compare the predictive regression to the predictive system, the predictive regression always outperforms the system. Thus, the main conclusion does not change when considering these subsamples.

Table 7.
Out-of-sample performance: CERs, subsamples

Fraction of explained variance (%)        50      10       5       1     0.5     0.1
Labels: W&W, P&S, W&W, W&W, W&W; optimistic (left) to pessimistic (right)

Panel A: Risk aversion A = 2, 1972−1991
Predictive system, prior on ρuw
  More informative                      9.42   14.45   14.53   10.99    9.18    8.24
                                      (0.87)  (0.50)  (0.42)  (0.35)  (0.35)  (0.24)
  Less informative                     10.56   15.29   15.27   12.56   11.15    8.45
                                      (0.94)  (0.77)  (0.74)  (0.64)  (0.57)  (0.31)
  Noninformative                        8.63   15.69   15.44   12.14   10.68    7.80
                                      (1.12)  (0.77)  (0.73)  (0.65)  (0.60)  (0.42)
Predictive regression                  11.65   12.69   13.40   15.45   13.99    9.56
                                      (0.93)  (0.86)  (0.79)  (0.57)  (0.47)  (0.40)

Panel B: Risk aversion A = 5, 1972−1991
Predictive system, prior on ρuw
  More informative                      8.86   10.80   10.83    9.37    8.62    8.22
                                      (0.34)  (0.20)  (0.17)  (0.14)  (0.15)  (0.15)
  Less informative                      9.32   11.14   11.12   10.01    9.43    8.34
                                     (0.338)  (0.31)  (0.29)  (0.25)  (0.16)  (0.13)
  Noninformative                        8.55   11.29   11.18    9.83    9.25    8.09
                                      (0.45)  (0.31)  (0.29)  (0.26)  (0.24)  (0.17)
Predictive regression                   9.71   10.12   10.40   11.19   10.59    8.80
                                      (0.36)  (0.33)  (0.31)  (0.22)  (0.19)  (0.16)

Notes: CERs are calculated for the predictive system and predictive regression. The portfolio consists of stocks, bonds, and a risk-free asset. The predictor is the dividend–price ratio for the stock returns and the yield spread for the bond returns. Optimal weights are calculated for a mean–variance investor with risk-aversion coefficients A = 2 (Panel A) and A = 5 (Panel B). At the beginning of each year, starting in 1972 and ending in 1991, we estimated the model and used the estimated parameters for calculating the optimal portfolio for each quarter in this year. Different beliefs about the prior distribution on R2 are considered, characterized by the expected value of the prior distribution. Lower values represent more skeptical investors. For the predictive system, we report the results for different priors on the correlation between the expected and unexpected returns as in Pastor and Stambaugh (2009). The noninformative prior is flat on most of the (−1,1) interval; the less informative implies most mass below zero; and the more informative imposes most mass below −0.7. Data are quarterly and span from 1952 to 2011. Numbers in parentheses are block bootstrap standard errors.

Table 8.
Out-of-sample performance: CERs, subsamples

Fraction of explained variance (%)        50      10       5       1     0.5     0.1
Labels: W&W, P&S, W&W, W&W, W&W; optimistic (left) to pessimistic (right)

Panel A: Risk aversion A = 2, 1992−2011
Predictive system, prior on ρuw
  More informative                     –2.38    0.80    2.30    5.76    4.48    3.92
                                      (0.75)  (0.56)  (0.50)  (0.34)  (0.34)  (0.43)
  Less informative                      0.51    1.52    2.43    2.67    4.31    8.31
                                      (0.77)  (0.48)  (0.48)  (0.43)  (0.44)  (0.35)
  Noninformative                        1.06    0.70    2.42    4.19    5.54   10.55
                                      (0.83)  (0.55)  (0.48)  (0.47)  (0.47)  (0.49)
Predictive regression                   4.74    4.73    4.65    6.17    7.64    9.97
                                      (0.58)  (0.57)  (0.56)  (0.45)  (0.36)  (0.29)

Panel B: Risk aversion A = 5, 1992−2011
Predictive system, prior on ρuw
  More informative                      1.15    2.39    2.97    4.32    3.81    3.58
                                      (0.30)  (0.23)  (0.20)  (0.13)  (0.14)  (0.17)
  Less informative                      2.29    2.68    3.03    3.11    3.76    5.34
                                      (0.31)  (0.20)  (0.20)  (0.17)  (0.18)  (0.15)
  Noninformative                        2.50    2.35    3.03    3.73    4.27    6.25
                                      (0.33)  (0.22)  (0.19)  (0.19)  (0.19)  (0.20)
Predictive regression                   4.01    4.00    3.96    4.55    5.12    6.02
                                      (0.22)  (0.22)  (0.21)  (0.17)  (0.14)  (0.12)

Notes: CERs are calculated for the predictive system and predictive regression. The portfolio consists of stocks, bonds, and a risk-free asset. The predictor is the dividend–price ratio for the stock returns and the yield spread for the bond returns. Optimal weights are calculated for a mean–variance investor with risk-aversion coefficients A = 2 (Panel A) and A = 5 (Panel B). At the beginning of each year, starting in 1992 and ending in 2011, we estimated the model and used the estimated parameters for calculating the optimal portfolio for each quarter in this year. Different beliefs about the prior distribution on R2 are considered, characterized by the expected value of the prior distribution. Lower values represent more skeptical investors. For the predictive system, we report the results for different priors on the correlation between the expected and unexpected returns as in Pastor and Stambaugh (2009). The noninformative prior is flat on most of the (−1,1) interval; the less informative implies most mass below zero; and the more informative imposes most mass below −0.7. Data are quarterly and span from 1952 to 2011. Numbers in parentheses are block bootstrap standard errors.

In addition to splitting the sample, we limit the asset weights of risky assets to a range between 0% and 150% of the overall portfolio as in Dangl and Halling (2012). From Table 9, we can derive that the absolute performance increases for all models. In terms of relative performance, the results indicate the same pattern as in the analysis without constraints, but now they are even more pronounced.
The predictive regression exhibits consistently better performance than the predictive system. Moreover, we check the performance by using a rolling window of 20 years instead of an expanding window for the estimation. Because the results are almost the same as in the main analysis, we do not report them in a separate table.

Table 9. Out-of-sample performance: CERs, constraints on weights

Fraction of explained variance (%)        50      10       5       1     0.5     0.1
Labels: W&W, P&S, W&W, W&W, W&W; optimistic (left) to pessimistic (right)

Panel A: Risk aversion A = 2, 1972−2011
Predictive system, prior on ρuw
  More informative                      9.60   10.13   10.17    9.20    8.17    7.65
                                      (0.33)  (0.29)  (0.29)  (0.20)  (0.17)  (0.16)
  Less informative                     10.15   10.01    9.89    9.03    9.04    9.08
                                      (0.33)  (0.30)  (0.28)  (0.28)  (0.26)  (0.17)
  Noninformative                        9.91    9.48    9.60    9.14    9.07    8.95
                                      (0.34)  (0.33)  (0.31)  (0.28)  (0.28)  (0.20)
Predictive regression                  10.45   10.38   10.36   10.39   10.03    9.70
                                      (0.31)  (0.31)  (0.30)  (0.25)  (0.21)  (0.18)

Panel B: Risk aversion A = 5, 1972−2011
Predictive system, prior on ρuw
  More informative                      6.92    7.16    7.07    6.85    6.39    6.31
                                      (0.13)  (0.12)  (0.11)  (0.08)  (0.08)  (0.07)
  Less informative                      7.58    7.35    7.27    6.58    6.62    6.97
                                      (0.13)  (0.12)  (0.12)  (0.11)  (0.11)  (0.07)
  Noninformative                        7.46    7.15    7.10    6.81    6.86    7.28
                                      (0.14)  (0.13)  (0.12)  (0.11)  (0.11)  (0.08)
Predictive regression                   8.39    8.43    8.44    8.42    8.03    8.35
                                      (0.13)  (0.13)  (0.12)  (0.10)  (0.09)  (0.08)

Notes: CERs are calculated for the predictive system and predictive regression. The portfolio consists of stocks, bonds, and a risk-free asset. The predictor is the dividend–price ratio for the stock returns and the yield spread for the bond returns. Optimal weights are calculated for a mean–variance investor with risk-aversion coefficients A = 2 (Panel A) and A = 5 (Panel B). We limit the weights to lie between 0 and 1.5 for each risky asset. At the beginning of each year, starting in 1972, we estimated the model and used the estimated parameters for calculating the optimal portfolio for each quarter in this year. Different beliefs about the prior distribution on R2 are considered, characterized by the expected value of the prior distribution. Lower values represent more skeptical investors. For the predictive system, we report the results for different priors on the correlation between the expected and unexpected returns as in Pastor and Stambaugh (2009). The noninformative prior is flat on most of the (−1,1) interval; the less informative implies most mass below zero; and the more informative imposes most mass below −0.7. Data are quarterly and span from 1952 to 2011. Numbers in parentheses are block bootstrap standard errors.

Finally, we consider multiple predictors for both risky assets, using the dividend–price ratio and the yield spread jointly to predict the returns on both risky assets. The CERs for both models are reported in Table 10. The performance of the predictive system is improved, but still does not outperform the predictive regression.
Note that the highest economic gains are now achieved for slightly more optimistic investors: the peak of the inverted U-shape has moved to higher values of the prior R2. This evidence is consistent with our expectation that, with more predictors, we can hold more optimistic views on the predictive power of the model.

Table 10. Out-of-sample performance: CERs, multiple predictors

Fraction of explained variance (%)        50      10       5       1     0.5     0.1
Labels: P&S, W&W, W&W, W&W; optimistic (left) to pessimistic (right)

Panel A: Risk aversion A = 2, 1972−2011
Predictive system, prior on ρuw
  More informative                      8.13    9.48    9.40    9.92    9.13    9.77
                                      (0.44)  (0.23)  (0.22)  (0.20)  (0.21)  (0.19)
  Less informative                      8.31   10.31   10.03    8.85    8.67    7.33
                                      (0.41)  (0.27)  (0.23)  (0.17)  (0.18)  (0.20)
  Noninformative                        6.21   10.30    9.68    8.06    8.12    8.01
                                      (0.51)  (0.25)  (0.21)  (0.21)  (0.22)  (0.22)
Predictive regression                  10.72   10.82   10.70   11.18   11.10    9.89
                                      (0.34)  (0.34)  (0.33)  (0.27)  (0.22)  (0.17)

Panel B: Risk aversion A = 5, 1972−2011
Predictive system, prior on ρuw
  More informative                      6.79    7.27    7.31    7.48    7.17    7.43
                                      (0.16)  (0.11)  (0.10)  (0.07)  (0.07)  (0.08)
  Less informative                      6.88    7.63    7.52    7.04    6.97    6.41
                                      (0.17)  (0.09)  (0.09)  (0.08) (0.008)  (0.08)
  Noninformative                        6.03    7.64    7.38    6.71    6.74    6.70
                                      (0.20)  (0.10)  (0.09)  (0.09)  (0.09)  (0.09)
Predictive regression                   7.75    7.79    7.74    7.94    7.91    7.44
                                      (0.13)  (0.13)  (0.11)  (0.11)  (0.09)  (0.07)

Notes: CERs are calculated for the predictive system and predictive regression. The portfolio consists of stocks, bonds, and a risk-free asset. The predictors are the dividend–price ratio and the yield spread for both the stock returns and the bond returns. Optimal weights are calculated for a mean–variance investor with risk-aversion coefficients A = 2 (Panel A) and A = 5 (Panel B). At the beginning of each year, starting in 1972, we estimated the model and used the estimated parameters for calculating the optimal portfolio for each quarter in this year. Different beliefs about the prior distribution on R2 are considered, characterized by the expected value of the prior distribution. Lower values represent more skeptical investors. For the predictive system, we report the results for different priors on the correlation between the expected and unexpected returns as in Pastor and Stambaugh (2009). The noninformative prior is flat on most of the (−1,1) interval; the less informative implies most mass below zero; and the more informative imposes most mass below −0.7. Data are quarterly and span from 1952 to 2011. Numbers in parentheses are block bootstrap standard errors.

To sum up, we find that prior beliefs matter for model performance. Although in-sample forecasts indicate better results for the predictive system, out-of-sample performance, which is more relevant for an investor, favors the predictive regression. Overall, an investor following predictions from the predictive regression estimated by the MCMC framework obtains superior out-of-sample performance relative to the predictive system.
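The CER comparison reported in the tables can be illustrated with a minimal Python sketch. It assumes the standard mean–variance certainty equivalent, CER = mean(r_p) − (A/2)·var(r_p), and the textbook unconstrained weights w = (1/A)·Σ⁻¹·μ for two risky assets. The function names, the input numbers, and the elementwise clipping of weights to [0, 1.5] are illustrative assumptions, not the article's implementation; a full constrained optimizer could give different weights when a bound binds.

```python
from statistics import mean, variance


def mean_variance_weights(mu, cov, risk_aversion, bounds=None):
    """Two-asset mean-variance weights w = (1/A) * inv(cov) * mu.

    mu: expected excess returns (stocks, bonds); cov: 2x2 covariance matrix.
    bounds: optional (lower, upper) pair applied to each weight by clipping,
    a simplification assumed here in place of a constrained optimizer.
    """
    (s11, s12), (s21, s22) = cov
    det = s11 * s22 - s12 * s21
    if det <= 0:
        raise ValueError("covariance matrix must have a positive determinant")
    # Closed-form inverse of a 2x2 matrix applied to mu, scaled by 1/A.
    raw = [
        (s22 * mu[0] - s12 * mu[1]) / det / risk_aversion,
        (-s21 * mu[0] + s11 * mu[1]) / det / risk_aversion,
    ]
    if bounds is None:
        return raw
    lower, upper = bounds
    return [min(max(w, lower), upper) for w in raw]


def portfolio_returns(weights_by_period, excess_returns, risk_free):
    """Realized returns: risk-free rate plus weighted excess returns, per period."""
    return [
        rf + sum(w * r for w, r in zip(weights, excess))
        for weights, excess, rf in zip(weights_by_period, excess_returns, risk_free)
    ]


def certainty_equivalent_return(returns, risk_aversion):
    """CER = mean - (A / 2) * sample variance of realized portfolio returns."""
    return mean(returns) - 0.5 * risk_aversion * variance(returns)


# Hypothetical quarterly inputs, chosen only to exercise the functions.
mu = (0.02, 0.01)
cov = ((0.04, 0.0), (0.0, 0.01))
weights = mean_variance_weights(mu, cov, risk_aversion=2, bounds=(0.0, 1.5))
realized = portfolio_returns(
    [weights] * 3,
    excess_returns=[(0.05, 0.01), (-0.03, 0.02), (0.04, -0.01)],
    risk_free=[0.01, 0.01, 0.01],
)
print(weights, certainty_equivalent_return(realized, risk_aversion=2))
```

In an out-of-sample exercise of the kind described, the weights would be recomputed each period from the model's forecasts, and the CER would be evaluated on the resulting realized returns; the same weights are reused here only to keep the example short.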
4 Conclusion

This article proposes an approach for comparing the predictive regression and the predictive system on the basis of similar prior distributions. We consider optimistic and pessimistic investors who differ in their beliefs about predictability, and we compare the models in terms of the out-of-sample CERs from the asset allocation strategy.

The article contributes to the literature in two ways. First, we elaborate the setup under which the predictive regression and the predictive system are comparable. This involves a framework for defining priors in the predictive system that matches the mean of the prior on R2 in the predictive regression, and the use of the same MCMC technique for estimating both models. Second, we investigate the out-of-sample performance, which tends to correlate poorly with the in-sample results. For the post-war data on bond and stock returns, our results cast doubt on the ability of the predictive system to deliver higher economic gains and indicate that relaxing the assumption of a perfect predictor does not improve the performance out-of-sample. Our results are in line with the recent literature showing that more complex models do not necessarily improve performance (Feldhütter et al., 2012; Sarno, Schneider, and Wagner, 2016).

Furthermore, we explore the role of investors' beliefs about predictability. Our results for the predictive system are consistent with the findings in Wachter and Warusawitharana (2009) for the predictive regression: beliefs about predictability matter, and it pays off to have a modest prior. When estimating the model, reasonable prior beliefs help an investor to construct a strategy with better performance (in terms of CERs).

To check the robustness of our results, we conduct several exercises. First, we explore the effect of different levels of correlation among risky assets.
Second, we look at different subsamples, before and after 1991, to investigate the effect of a possible structural break in that year. Third, we constrain the weights to lie between 0% and 150% and, fourth, we use a rolling window of 20 years instead of an expanding window for the estimation. Finally, we look at the sensitivity of results to the prior discussed in Pastor and Stambaugh (2009). The pattern in all robustness checks is similar to that in the main analysis.

The article could be extended along several dimensions. In the future, we would like to look at a richer set of predictors, and investigate whether the predictive regression outperforms the predictive system for other predictors as well. Furthermore, we plan to explore possibilities to impose additional restrictions to model the stock and bond returns jointly in the predictive system. Rytchkov (2012) was able to impose additional restrictions when jointly modeling returns and dividend growth. His constraints are based on the present value relation. However, in the setup where stock and bond returns are modeled jointly, the constraints are not clear. Another interesting extension would be to consider the multi-period asset allocation problem that has been studied in the simpler setup (Barberis, 1999) or in-sample in the predictive system (Pastor and Stambaugh, 2012).

Supplementary Data

Supplementary data are available at Journal of Financial Econometrics online.

Appendix: Details of the Estimation Technique for the Predictive System

Consider the predictive system given by
$$r_{t+1} = \delta_t + u_{t+1}, \tag{21}$$
$$x_{t+1} = (I - A)E_x + A x_t + v_{t+1}, \tag{22}$$
$$\delta_{t+1} = (1 - \beta)E_r + \beta\delta_t + w_{t+1}, \tag{23}$$
where $\delta_t$ is the unobservable expected excess return and $A$, $\beta$, $E_x$, $E_r$ are system coefficients. The error distribution is
$$\begin{bmatrix} u_t \\ v_t \\ w_t \end{bmatrix} \sim N(0, \Sigma), \quad \text{where} \quad \Sigma \equiv \begin{bmatrix} \sigma_u^2 & \sigma_{uv} & \sigma_{uw} \\ \sigma_{vu} & \Sigma_v & \sigma_{vw} \\ \sigma_{wu} & \sigma_{wv} & \sigma_w^2 \end{bmatrix}. \tag{24}$$
The estimation procedure consists of two steps: First, we draw the time series of the unobservable conditional expected return $\delta_t$ conditional on the parameter values.
In this step, we apply the forward-filtering, backward-sampling algorithm introduced by Frühwirth-Schnatter (1994) and Carter and Kohn (1994). Second, we obtain a new set of parameters conditional on the most recent draw of the time series for $\delta_t$. We use conditional distributions identical to those derived in Pastor and Stambaugh (2009) (Supplementary Appendix, Sections B3 and B5) and refer to that paper for all formulas needed to obtain a new draw. However, as our article focuses on a novel interpretation of priors, we devote substantial space here to discussing all details of setting up the priors.

The more general setup of the predictive system relative to the predictive regression offers more flexibility, but introduces identification problems for some of the parameters. In particular, the full covariance matrix of innovations $\Sigma$ cannot be identified by the data alone. Only the covariance of innovations in the predictor’s process, $\Sigma_v$, can be identified by the data. Therefore, the prior on the covariance matrix is a crucial ingredient of the identification strategy. We split the covariance matrix $\Sigma$ into the submatrix
$$\Sigma_{11} \equiv \begin{bmatrix} \sigma_u^2 & \sigma_{uw} \\ \sigma_{wu} & \sigma_w^2 \end{bmatrix}, \tag{25}$$
for which we choose an informative prior, and the remaining elements, $(\Sigma_v, \sigma_{uv}, \sigma_{vw})$, for which we stay noninformative. To do that, we implement the idea of Stambaugh (1997) and construct a hypothetical sample in which there are $T_2$ observations of $(u, w)$, but only $T_1 \ll T_2 \ll T$ observations of $v$. We set $T_1$ equal to 4 for one predictor and equal to 6 for three predictors. We take $T_2$ to be one-fifth of our sample size. If the noninformative prior is updated with this hypothetical sample, we get posteriors that can be used as priors in the predictive system. We describe the priors in the next three paragraphs.

The informative prior on the submatrix $\Sigma_{11}$ has an inverted Wishart distribution, $\Sigma_{11} \sim IW(T_2\hat{\Sigma}_{11}, T_2 - K)$, with prior mean $E(\Sigma_{11}) = \hat{\Sigma}_{11}\,T_2/(T_2 - K - 3)$, where $K$ is the number of predictors.
Elements of $\hat{\Sigma}_{11}$ are chosen in such a way that $E(\Sigma_{11})$ implies various distributions of $R^2$. In contrast to Pastor and Stambaugh (2009), we do not target the prior on $\rho_{uw}$ (which is driven by the nondiagonal element of $\hat{\Sigma}_{11}$), but focus on $\sigma_u^2$ and $\sigma_w^2$, which are related to the diagonal elements of $\hat{\Sigma}_{11}$. By using the relation $\sigma_\delta^2 = \sigma_w^2/(1 - \beta^2)$ in Equation (11), we see that these variances, together with the value of $\beta$, determine the value of $R^2$. As a starting point, we set the elements of $\hat{\Sigma}_{11}$ in such a way that the prior mean of the variance of the unexpected return $\sigma_u^2$ equals 95% of the sample return variance $\sigma_r^2$ ($0.95 = 1 - k$, where $k = 0.05$; see Equation (15)). The prior mean of the variance of the shocks in the expected returns, $\sigma_w^2$, is chosen such that, combined with $\beta = 0.97$, it sets the variance of the expected return $\sigma_\delta^2$ to $k = 5\%$ of the sample return variance. These are the same priors as used by Pastor and Stambaugh (2009).

Further, we analyze a richer set of priors to see the effect of different beliefs about predictability. An investor with the priors used in Pastor and Stambaugh (2009) is our benchmark investor and, with the prior expected value of 5% for the coefficient of determination, he is considered to be modest (compared to the other types of investors). Moreover, we consider a few more types of investors with $E(\sigma_\delta^2) = k\sigma_r^2$ and $E(\sigma_u^2) = (1 - k)\sigma_r^2$, where $k$ is the fraction of explained variance and $1 - k$ is the fraction of unexplained variance. The explained variance is lower for skeptical investors; more specifically, we consider $k = 1\%, 0.5\%, 0.1\%$. We choose these values to obtain priors comparable to those in Wachter and Warusawitharana (2009). We also consider more optimistic priors with $k = 50\%$ and $10\%$. Altogether, we consider six different priors to capture various levels of optimism.

For the prior on the nondiagonal elements of $\Sigma_{11}$, we follow exactly the same hyperparameter approach as Pastor and Stambaugh (2009).
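As a rough illustration of the calibration of the diagonal elements described above, the following sketch maps a target fraction of explained variance $k$ into prior means for $\sigma_u^2$ and $\sigma_w^2$, and then into the diagonal of $\hat{\Sigma}_{11}$ by inverting the inverted Wishart mean formula. The function name, the return variance used in the loop, and the default $T_2 = 48$ (one-fifth of 240 quarters) are assumptions made for illustration only; they are not taken from the article's own code.

```python
def calibrate_sigma11_diagonal(sigma_r2, k, beta=0.97, T2=48, K=1):
    """Map a target explained-variance fraction k into prior hyperparameters.

    Hypothetical helper: the name and the default values are illustrative
    assumptions, not the article's implementation.
    """
    if not 0.0 < k < 1.0:
        raise ValueError("k must lie strictly between 0 and 1")
    if not -1.0 < beta < 1.0:
        raise ValueError("beta must lie in (-1, 1) for a stationary AR(1)")
    if T2 - K - 3 <= 0:
        raise ValueError("T2 - K - 3 must be positive for the prior mean to exist")

    mean_sigma_u2 = (1.0 - k) * sigma_r2   # E(sigma_u^2) = (1 - k) * sigma_r^2
    mean_sigma_delta2 = k * sigma_r2       # E(sigma_delta^2) = k * sigma_r^2
    # Invert sigma_delta^2 = sigma_w^2 / (1 - beta^2) to get the shock variance.
    mean_sigma_w2 = mean_sigma_delta2 * (1.0 - beta ** 2)

    # E(Sigma11) = Sigma11_hat * T2 / (T2 - K - 3), so rescale the prior means.
    scale = (T2 - K - 3) / T2
    return {
        "mean_sigma_u2": mean_sigma_u2,
        "mean_sigma_w2": mean_sigma_w2,
        "hat_sigma_u2": mean_sigma_u2 * scale,
        "hat_sigma_w2": mean_sigma_w2 * scale,
        # Sanity check on the prior means only: share of expected-return variance.
        "implied_k": mean_sigma_delta2 / (mean_sigma_delta2 + mean_sigma_u2),
    }


if __name__ == "__main__":
    # The six levels of optimism considered in the text; sigma_r2 is a made-up value.
    for k in (0.50, 0.10, 0.05, 0.01, 0.005, 0.001):
        print(k, calibrate_sigma11_diagonal(sigma_r2=0.0064, k=k))
```

The `implied_k` entry only confirms that the two prior means split the return variance as intended; it is not the full prior distribution of $R^2$, which depends on the joint draws of all parameters.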
We first consider staying noninformative about the correlation between $u_t$ and $w_t$ and assume that the nondiagonal element of $\hat{\Sigma}_{11}$ has a uniform prior distribution on the interval $(-0.9\hat{\sigma}_u\hat{\sigma}_w,\ 0.9\hat{\sigma}_u\hat{\sigma}_w)$, where $\hat{\sigma}_u^2$ and $\hat{\sigma}_w^2$ are the diagonal elements of $\hat{\Sigma}_{11}$. Second, for a less informative prior, we consider a uniform prior distribution on the interval $(-0.9\hat{\sigma}_u\hat{\sigma}_w,\ -0.35\hat{\sigma}_u\hat{\sigma}_w)$ and, for a more informative prior, a uniform prior distribution on the interval $(-0.9\hat{\sigma}_u\hat{\sigma}_w,\ -0.87\hat{\sigma}_u\hat{\sigma}_w)$.

For the rest of the elements in $\Sigma$, $(\Sigma_v, \sigma_{uv}, \sigma_{vw})$, we consider a noninformative prior. We construct it by the following substitution: We run a regression of $v_t$ on $(u_t, w_t)$ with zero intercept, take the slope $C = [\sigma_{vu}\ \ \sigma_{vw}]\,\Sigma_{11}^{-1}$ and the residual covariance matrix $\Omega = \Sigma_v - C\Sigma_{11}C'$, and put a normal-inverted-Wishart prior on them: $\Omega \sim IW(T_1\hat{\Omega}_0, T_1)$ and $\mathrm{vec}(C)\,|\,\Omega \sim N(\hat{c}_1, \Omega \otimes (X_1'X_1)^{-1})$. The variances of $\Omega$ are high because $T_1$ is small, and the variance of $C$ is high because $X_1'X_1$ is set to a diagonal matrix with small numbers on the diagonal. The choice of values for $\hat{\Omega}_0$ and $\hat{c}_1$ is not relevant, because they correspond to means of distributions for which the variance is high. Therefore, the priors on these variables are noninformative.

Finally, we discuss the priors on $(A, \beta, E_x, E_r)$. We follow the setup by Pastor and Stambaugh (2009). First, we need to guarantee stationarity for both AR processes. Therefore, we require all eigenvalues of $A$ to lie within the unit circle and $\beta$ to be from $(-1, 1)$. Further, we assume that all parameters have independent prior distributions and are from the following distributions
$$\mathrm{vec}(A) \sim N(0, \sigma_A^2 I_{K^2}), \tag{26}$$
$$\beta \sim N(0.99, 0.15^2), \tag{27}$$
$$E_r \sim N(\bar{\delta}, \sigma_{E_r}^2), \tag{28}$$
$$E_x \sim N(0, \sigma_{E_x}^2 I_K), \tag{29}$$
where $\sigma_A$ and $\sigma_{E_x}$ are large numbers, $\sigma_{E_r}$ is equal to 1% per quarter, and $\bar{\delta}$ is the sample mean.

For illustration, Figure 2 displays the density of the prior on the coefficient of determination, given various levels of optimism for the predictive regression and the predictive system.
The density is obtained by simulating from the prior distribution of relevant parameters and then calculating R² for each draw based on Equation (17), for the predictive regression, and Equation (11), for the predictive system.

Figure 2. Density of the prior on R² for the predictive regression and the predictive system. Column (A): Predictive regression. Column (B): Predictive system. Notes: The density is obtained by simulating from the prior distribution of relevant parameters and then calculating the R² for each draw based on Equation (17) for the predictive regression and Equation (11) for the predictive system. For better readability (ranges for y-axis are very different), six possible priors are split in two figures (note a different scale). The two upper graphs plot priors of more optimistic investors and the two lower graphs correspond to more pessimistic priors. The left column gives the density for the predictive regression and the right column for the predictive system.

Footnotes

* I thank Doron Avramov, Alois Geyer, Stefan Pichler, and Leopold Sögner for helpful discussions and comments.
This article also benefited from suggestions provided by Havva Özlem Dursun, Lennart Hoogerheide, Marcin Jaskowski, Paul Schneider, and participants of the Southern Finance Association Meeting, the World Finance Conference, the German Finance Association Meeting, the Forecasting Financial Markets Conference, the CEQURA Conference, the CFE Conference, the VGSF Conference, and the Slovak Economic Association Meeting. I gratefully acknowledge support from the Slovak Research and Development Agency (contract no. APVV-14-0357). All errors are my own.

References

Ali M. M., Pal M., Woo J. 2007. On the Ratio of Inverted Gamma Variates. Austrian Journal of Statistics 36(2): 153–159.
Avramov D. 2002. Stock Return Predictability and Model Uncertainty. Journal of Financial Economics 64: 423–458.
Barberis N. 1999. Investing for the Long Run When Returns Are Predictable. Journal of Finance 55: 225–264.
Binsbergen J. H. V., Koijen R. S. J. 2010. Predictive Regressions: A Present-Value Approach. Journal of Finance 65(4): 1439–1471.
Campbell J. Y., Shiller R. J. 1988. The Dividend-Price Ratio and Expectations of Future Dividends and Discount Factors. Review of Financial Studies 1(3): 195–228.
Campbell J. Y., Shiller R. J. 1991. Yield Spreads and Interest Rate Movements: A Bird’s Eye View. Review of Economic Studies 58(3): 495–514.
Campbell J. Y., Thompson S. B. 2008. Predicting Excess Stock Returns Out of Sample: Can Anything Beat the Historical Average? Review of Financial Studies 21(4): 1509–1531.
Carter C. K., Kohn R. 1994. On Gibbs Sampling For State Space Models. Biometrika 81(3): 541–553.
Cochrane J. H. 1992. Explaining the Variance of Price-Dividend Ratios. Review of Financial Studies 5(2): 243–280.
Dangl T., Halling M. 2012. Predictive Regressions with Time-Varying Coefficients. Journal of Financial Economics 106(1): 157–181.
Efron B., Gong G. 1983. A Leisurely Look at the Bootstrap, the Jackknife, and Cross-Validation. The American Statistician 37(1): 36–48.
Fama E. F., French K. R. 1988. Dividend Yields and Expected Stock Returns. Journal of Financial Economics 22(1): 3–25.
Fama E. F., French K. R. 1989. Business Conditions and Expected Returns on Stocks and Bonds. Journal of Financial Economics 25: 23–49.
Feldhütter P., Larsen L. S., Munk C., Trolle A. B. 2012. “Keep It Simple: Dynamic Bond Portfolios Under Parameter Uncertainty.” Working Paper, London Business School. Available at: http://feldhutter.com/ModelUncertainty.pdf (accessed 20 December 2017).
Frühwirth-Schnatter S. 1994. Data Augmentation and Dynamic Linear Models. Journal of Time Series Analysis 15(2): 183–202.
Henkel S. J., Martin J. S., Nardari F. 2011. Time-Varying Short-Horizon Predictability. Journal of Financial Economics 99(3): 560–580.
Kandel S., Stambaugh R. F. 1996. On The Predictability of Stock Returns: An Asset-Allocation Perspective. Journal of Finance 51(2): 385–424.
Keim D. B., Stambaugh R. F. 1986. Predicting Returns in the Stock and Bond Markets. Journal of Financial Economics 17(2): 357–390.
Kelly B., Pruitt S. 2013. Market Expectations in the Cross-Section of Present Values. Journal of Finance 68(5): 1721–1756.
Lettau M., Van Nieuwerburgh S. 2008. Reconciling the Return Predictability Evidence. Review of Financial Studies 21(4): 1607–1652.
Merton R. C. 1969. Lifetime Portfolio Selection under Uncertainty: The Continuous-Time Case. The Review of Economics and Statistics 51(3): 247–257.
Pastor L., Stambaugh R. F. 2009. Predictive Systems: Living with Imperfect Predictors. Journal of Finance 64(4): 1583–1628.
Pastor L., Stambaugh R. F. 2012. Are Stocks Really Less Volatile in the Long Run? Journal of Finance 67(2): 431–478.
Pesaran M., Timmermann A. 2002. Market Timing and Return Prediction under Model Instability. Journal of Empirical Finance 9(5): 495–510.
Rapach D. E., Strauss J. K., Zhou G. 2010. Out-of-Sample Equity Premium Prediction: Combination Forecasts and Links to the Real Economy. Review of Financial Studies 23(2): 821–862.
Rytchkov O. 2012. Filtering Out Expected Dividends and Expected Returns. The Quarterly Journal of Finance 2(03): 1250012.
Samuelson P. 1965. Proof That Properly Anticipated Prices Fluctuate Randomly. Industrial Management Review 6: 41–49.
Samuelson P. A. 1969. Lifetime Portfolio Selection by Dynamic Stochastic Programming. The Review of Economics and Statistics 51(3): 239–46.
Sarno L., Schneider P., Wagner C. 2016. The Economic Value of Predicting Bond Risk Premia. Journal of Empirical Finance 37: 247–267.
Shanken J. A., Tamayo A. M. 2012. Dividend Yield, Risk, and Mispricing: A Bayesian Analysis. Journal of Financial Economics 105: 131–152.
Stambaugh R. F. 1997. Analyzing Investments Whose Histories Differ In Length. Journal of Financial Economics 45(3): 285–331.
Stambaugh R. F. 1999. Predictive Regressions. Journal of Financial Economics 54(3): 375–421.
Wachter J. A., Warusawitharana M. 2009. Predictable Returns and Asset Allocation: Should a Skeptical Investor Time the Market? Journal of Econometrics 148(2): 162–178.
Wachter J. A., Warusawitharana M. 2015. What Is the Chance That the Equity Premium Varies Over Time? Evidence from Regressions on Dividend-Price Ratio. Journal of Econometrics 186: 74–93.
Welch I., Goyal A. 2008. A Comprehensive Look at the Empirical Performance of Equity Premium Prediction. Review of Financial Studies 21(4): 1455–1508.

© The Author(s) 2018. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/about_us/legal/notices)

Journal of Financial Econometrics, Oxford University Press

Is Imperfection Better? Evidence from Predicting Stock and Bond Returns


Publisher
Oxford University Press
Copyright
© The Author(s) 2018. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com
ISSN
1479-8409
eISSN
1479-8417
DOI
10.1093/jjfinec/nby003

Abstract

The standard predictive regression assumes expected returns to be perfectly correlated with predictors. In the recently introduced predictive system, imperfect predictors account only for a partial variance in expected returns. However, the out-of-sample benefits of relaxing the assumption of perfect correlation are unclear. We compare the performance of the two models from an investor’s perspective. In the Bayesian setup, we allow for various distributions of R² to account for different degrees of optimism about predictability. We find that relaxing the assumption of perfect predictors does not pay off out-of-sample. Furthermore, extreme optimism or pessimism reduces the performance of both models.

The existence of return predictability is one of the most discussed questions in finance. Papers finding evidence in favor of predictability mostly use predictive regression to assess the predictive power of different variables. Excess returns are regressed on various predictors, assuming a perfect correlation between the expected returns and predictors. On the other hand, the predictive system, proposed by Pastor and Stambaugh (2009), relaxes the assumption of perfect correlation. In their setup, the unobservable process of expected returns is (weakly) correlated with predictors; that is, predictors are imperfect in the sense that they do not deliver full information about the expected return. Pastor and Stambaugh empirically show that this correlation between expected returns and predictors is far from perfect. However, their focus is more on assessing the strength of imperfection in data, and less on forecasting. Therefore, a comprehensive comparison of the predictive system and predictive regression as well as an assessment of the economic effects associated with their predictions is missing from the literature.
Whereas Wachter and Warusawitharana (2009) show that the predictive regression outperforms the non-predictability model out-of-sample, it is not clear whether there is an additional benefit from going to a more complex predictive model with imperfect predictors. In our main analysis, we build on findings of Wachter and Warusawitharana and extend the predictive regression by imperfect predictors in the predictive system. We document that the predictive system does not outperform the predictive regression out-of-sample—that is, allowing for imperfection in predictors does not improve the certainty equivalent returns (henceforth, CERs). Thus, an investor does not have any economic gains from following predictions from a more comprehensive model. The comparison of these two models is not straightforward. As shown in Wachter and Warusawitharana (2009), different degrees of investor skepticism about predictability (modeled as a prior distribution on R2) are highly relevant for the performance of predictive regression. To compare the performance of the predictive system and predictive regression, we have to ensure that the priors used in these two models are comparable. To achieve this, we propose an approach whereby priors on relevant parameters in the predictive system are chosen to match the prior on R2 in the predictive regression. The present article investigates the predictive system in relation to a special case of the system, the predictive regression. As discussed in Pastor and Stambaugh (2009), the predictive system allows for imperfect predictors in terms of any degree of correlation between the predictors and expected returns. Instead of modeling the expected return as an explicit combination of predictors, they model each time series (returns as well as predictors) as autoregressive processes of order one (AR(1)) and let them interact through the covariance matrix of error terms. 
Therefore, the expected return does not depend only on the most recent value of the predictor, but on its lagged values as well. Furthermore, estimated residuals from predictive regressions are usually autocorrelated, but the associated complications are often ignored (see Stambaugh, 1999). An advantage of the predictive system is that it allows for modeling such serial correlation in residuals directly. In contrast to Pastor and Stambaugh who use the system to analyze the effects of the correlation between the unexpected returns and innovations in expected returns on the properties of the estimates, we focus on the performance and asset allocation effects of different beliefs about the predictive power of the model. Different prior beliefs on the distribution of R2, that correspond to the fraction of explained variance, play a key role in our article. This goodness-of-fit measure is often more intuitive for investors and easier to elicit than the individual coefficients of the models. Moreover, we analyze the out-of-sample properties of the models that are more relevant for an investor than the frequently presented in-sample evidence. The main criterion for judging and comparing the economic performance of the predictive regression and the predictive system are CERs. These are derived from the asset allocation implied by the out-of-sample predictions from both models. The goal of this article is to compare the two models in a consistent way. Wachter and Warusawitharana (2009) use CERs derived from asset allocations to investigate the role of priors on R2, but they do this only for the predictive regression. Pastor and Stambaugh (2009) compare the predictive system to the predictive regression, but use a Bayesian approach to estimate the system, and ordinary least squares (OLS) for the regression. Our comparison is based on estimating both models by the Bayesian Markov Chain Monte Carlo (MCMC) technique. 
Therefore, we are able to compare the characteristics and implications of the two models in a way unaffected by the estimation method. Although we emphasize the importance of prior beliefs, we do not aim to analyze several variables with potential predictive power. We only choose the most prominent predictors—namely, the dividend–price ratio for stock returns and the yield spread for bond returns (Campbell and Shiller, 1988, 1991; Fama and French, 1988). Our empirical results indicate that the predictive system might not have the ability to increase economic gains for an investor in the out-of-sample analysis. Moreover, potential economic gains are driven much more by the prior belief of how strong the predictors are (in terms of the coefficient of determination) relative to the prior belief of how imperfect the predictors are (the correlation between the expected and unexpected return). In particular, investors with modest prior on the coefficient of determination achieve superior performance relative to investors with extreme priors (either too optimistic or too skeptical). We find the same patterns in different time periods and check that they are not driven by extreme portfolio allocations. Furthermore, instead of individual predictors for assets we estimate both models with multiple predictors. The findings are in line with the stream of literature, showing that complexity does not always improve the performance, in particular when predicting very noisy stock and bond returns. However, economically motivated constraints (introduced via priors) are able to boost the performance. Our article further relates to Binsbergen and Koijen (2010) who focus on return predictability using filtering techniques. They investigate how the current price–dividend ratio is able to predict future aggregate returns and dividend growth rates as a part of a state space system. 
In particular, they focus on the interaction term between return and dividend growth predictability, and the role of the reinvestment strategy of dividends. We contribute by analyzing the state-space predictive system in an out-of-sample setup and the crucial role of the prior belief about the predictive strength captured by the coefficient of determination. In addition, this article is related to Kelly and Pruitt (2013), who, like us, focus on out-of-sample performance. Moreover, they similarly treat possible predictors as noisy measures of the latent expected return. Whereas they highlight how to use rich cross-sectional information in achieving high out-of-sample performance, we focus on the imperfection of predictors and the predictive power.

1 Literature Review

Our results bring more insight into the relationship between predictors and expected returns. In the first models, such as Samuelson (1965, 1969) and Merton (1969), excess returns were assumed to be unpredictable and investors were to keep portfolio weights constant over time. However, the empirical literature in the 1980s found variables with predictive power to explain stock and bond returns (Keim and Stambaugh, 1986; Fama and French, 1989; Cochrane, 1992). After strong evidence in favor of return predictability on the aggregate level in the 1990s and 2000s, more recent evidence suggests that return predictability is debatable or even illusory. In their comprehensive study with many different predictors, Welch and Goyal (2008) show that although the in-sample predictive power of the models might be significant, out-of-sample forecasts are poor. They argue that no variable has any significant predictive power. However, their conclusion is based on the OLS estimation of predictive regressions for various predictors, and is not robust with respect to different predictive models and estimation techniques.
Therefore, the question whether there are variables containing some predictable components remains unsettled and still fascinates many researchers. Attempts to deal with this question come from many areas of empirical finance. The first stream of literature improves the forecasting performance by small refinements. Campbell and Thompson (2008) respond to Welch and Goyal (2008) by imposing constraints on the sign of the coefficients and return forecasts. Rapach, Strauss, and Zhou (2010) take combinations—that is, means or medians—of predictions from different predictive regressions to obtain a better performance. Avramov (2002) adopts a Bayesian model averaging methodology to exploit the information from different predictors at once and finds both in-sample and out-of-sample predictability. A second stream of literature attempts to explain the predictability phenomenon by various versions of time variation, such as structural breaks, or time-varying coefficients. Pesaran and Timmermann (2002) identify one structural break around 1991 after which predictability disappears. However, later studies differ quite considerably in terms of the timing of breaks and their number. Furthermore, the out-of-sample performance is found to be poor because breaks cannot be reliably detected in real time (Lettau and Van Nieuwerburgh, 2008). Dangl and Halling (2012) assume the predictive regression with time-varying coefficients in a Bayesian framework, and provide a comprehensive look at the performance in this setup. They find that an investor following the optimal strategy implied by their model would be consistently better off than an investor using the historic mean. The third way of looking at this phenomenon is by using regime-switching models. Most studies which find support for two regimes (Henkel, Martin, and Nardari, 2011)—interpreted as recession and expansion—find a countercyclical pattern. 
Whereas predictability during recessions is significantly better than the historical average, predictability during expansions is typically weaker, if at all. The intuition is simple. In bad times, investors demand a higher risk premium. Furthermore, volatility is higher. The prices are adjusted much more to discount rates per unit of price change. As a consequence, prices are more sensitive to a more volatile price–dividend ratio. Wachter and Warusawitharana (2015) assume an investor who distinguishes two states of the world—when returns are predictable and when returns are unpredictable—and assigns prior beliefs on the two states, that is, the two models. They find strong support in favor of predictability. Most of the recent papers in favor of predictability rely on the Bayesian estimation technique. This method allows an investor to incorporate her prior beliefs about the model to determine the optimal weights. The initiation of applying this approach in the asset allocation literature goes back to the paper by Kandel and Stambaugh (1996) and their simulation study. Although predictability seems to be weak in terms of frequentist statistical measures, they show that an investor observing the simulated data might significantly change her asset allocation and improve her performance. This article supplements the literature on predictability by elaborating on a fair comparison of predictive regression and the predictive system in the Bayesian framework. It evaluates the out-of-sample performance of both models for different prior beliefs and compares the changes in asset allocation with respect to the model and the prior distribution. In the empirical part, we find that the predictive regression—a more parsimonious model with perfect predictors—turns out to perform better in terms of out-of-sample CERs. The remainder of the article is structured as follows. Section 2 presents the modeling framework and the estimation technique. 
We discuss both investigated models in detail and highlight their differences. In Section 3, we describe the data used for modeling, apply the suggested models to them, and report our empirical findings. Section 4 concludes and provides suggestions for future research.

2 Econometric Methodology

In this section, we introduce predictive regression and the predictive system, explain their main differences, and describe the estimation technique and criteria we use to evaluate the out-of-sample forecasts.

2.1 Model Setup

The most common way of modeling predictability is by using predictive regression. Here, we assume that predictors, usually some financial ratios, provide full information about the expected returns. However, we might have some doubts whether these variables fully capture the actual market expectations. Therefore, we can search for an efficient estimate of unobservable expectations, given the noisy proxies that are available. The predictive system offers one way to put some structure on the return process and model the noisy, that is, imperfect, predictors. We define the realized return $r_{t+1}$ as a sum of the expected return $\delta_t$ and the unexpected return $u_{t+1}$:
$$r_{t+1} = \delta_t + u_{t+1}. \tag{1}$$
The two models we consider in this article differ with respect to the relation between the expected return $\delta_t$ and predictors. In predictive regression, the expected return depends only on the most recent value of the predictors. The model is given by
$$r_{t+1} = a + b'x_t + u_{t+1}, \tag{2}$$
$$x_{t+1} = (I - A)E_x + A x_t + v_{t+1}, \tag{3}$$
where $x_t$ is a vector of predictors, $a$, $b$, $A$, and $E_x$ are regression coefficients and
$$\begin{bmatrix} u_t \\ v_t \end{bmatrix} \sim (0, \Sigma), \quad \text{where} \quad \Sigma \equiv \begin{bmatrix} \sigma_u^2 & \sigma_{uv} \\ \sigma_{vu} & \Sigma_v \end{bmatrix} \tag{4}$$
are identically distributed errors. Because the expected return is modeled as a linear function of the predictors, $\delta_t = a + b'x_t$, it implies a perfect correlation between predictors and expected returns (not realized returns). This means that the entire variance in the expected returns is explained by the current value of the predictors.
The predictive system proposed by Pastor and Stambaugh (2009, 2012) relaxes this assumption of perfect correlation. It takes the form

$$r_{t+1} = \delta_t + u_{t+1}, \tag{5}$$

$$x_{t+1} = (I - A) E_x + A x_t + v_{t+1}, \tag{6}$$

$$\delta_{t+1} = (1 - \beta) E_r + \beta \delta_t + w_{t+1}, \tag{7}$$

where $\delta_t$ is the unobservable expected excess return and $A$, $\beta$, $E_x$, and $E_r$ are system coefficients. The error distribution is

$$\begin{bmatrix} u_t \\ v_t \\ w_t \end{bmatrix} \sim N(0, \Sigma), \quad \text{where} \quad \Sigma \equiv \begin{bmatrix} \sigma_u^2 & \sigma_{uv} & \sigma_{uw} \\ \sigma_{vu} & \Sigma_v & \sigma_{vw} \\ \sigma_{wu} & \sigma_{wv} & \sigma_w^2 \end{bmatrix}, \tag{8}$$

and the errors are identically distributed. Both expected excess returns and predictors follow AR(1) processes. In the predictive system, the connection between the expected return and the predictors is not obvious. In fact, they are related through the error covariance matrix. In the case of a single predictor, the correlation between the expected return $\delta_t$ and the predictor $x_t$ is determined by the correlation between the errors $\rho_{vw}$ and the scalars $A$ and $\beta$:

$$\rho_{x\delta} = \rho_{vw} \sqrt{\frac{(1 - A^2)(1 - \beta^2)}{(1 - A\beta)^2}}. \tag{9}$$

Therefore, it can be easily shown that the predictive system collapses to predictive regression if $A = \beta$ and $\rho_{vw} = \pm 1$. Then, there exists $b$ such that $w_t = b' v_t$ and the dynamics for the expected return can be rewritten as

$$\delta_t = (1 - A) E_r + A \delta_{t-1} + b' v_t = (1 - A) E_r + b' \sum_{\tau=0}^{\infty} A^{\tau} v_{t-\tau} = (1 - A)(E_r - b' E_x) + b' x_t = a + b' x_t, \tag{10}$$

and the expected return is perfectly correlated with the predictor. In the predictive system, unlike in the predictive regression, the current value of the predictor is not the only source of information about the expected return. The additional information in the lagged realized returns and predictors is incorporated in a parsimonious way via the covariance structure. The hypothesis that predictors are not perfectly correlated with the expected return is thoroughly discussed in Pastor and Stambaugh (2009). They argue that serial correlation of estimated residuals, typically found in empirical studies of the predictive regression, is justified using the predictive system.
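The single-predictor correlation in equation (9) is easy to evaluate numerically. The sketch below assumes equation (9) is read as $\rho_{vw}$ times the square root of $(1-A^2)(1-\beta^2)/(1-A\beta)^2$; the function name and the example values are hypothetical and serve only to show the collapse to perfect correlation when $A = \beta$ and $\rho_{vw} = 1$.

```python
import math


def corr_expected_return_predictor(rho_vw, A, beta):
    """Equation (9): correlation between delta_t and x_t for a single predictor."""
    if not (abs(A) < 1 and abs(beta) < 1):
        raise ValueError("A and beta must lie strictly inside (-1, 1)")
    return rho_vw * math.sqrt((1 - A**2) * (1 - beta**2)) / (1 - A * beta)


# Equal persistence and perfectly correlated shocks: the system collapses
# to the predictive regression.
perfect = corr_expected_return_predictor(rho_vw=1.0, A=0.9, beta=0.9)

# Different persistence: the predictor is imperfect even with rho_vw = 1.
imperfect = corr_expected_return_predictor(rho_vw=1.0, A=0.9, beta=0.5)

print(perfect, imperfect)
```

The second case shows that a mismatch in persistence alone is enough to make the predictor imperfect, even when the shocks to the predictor and to the expected return are perfectly correlated.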
Moreover, different values of the correlation $\rho_{uw}$ allow for modeling various types of dependence of expected returns on lagged values of the predictor. As discussed in their paper, the bond price is purely driven by discount rate shocks, which implies that the correlation between the innovations in expected returns and the unexpected return is −1. For stocks, the analogous negative correlation might be weaker, but it is still present. Therefore, they investigate the role of more or less informative priors by varying the mass put on the negative values. They show that they get more precise in-sample estimates by assuming an informative prior. The present article contributes to the analysis of the system by looking at its out-of-sample properties that are important for an investor. Besides the prior distribution on the correlation $\rho_{uw}$ that is analyzed in Pastor and Stambaugh (2009), we emphasize the importance of the priors on parameters relevant for the implied prior distribution on $R^2$. Moreover, we provide a comparison of the predictions from the predictive system to predictive regression, where both models are estimated in a Bayesian framework with the same prior beliefs.

2.2 Estimation Technique and Prior Distribution

As shown in Wachter and Warusawitharana (2009), the estimation technique has an effect on performance. In their article, the predictive regression estimated by OLS exhibits poor performance (in terms of CER) compared to the same model estimated by Bayesian methods. The dependence of the results on the estimation technique suggests using a Bayesian approach, via MCMC, to estimate the predictive system as well. Furthermore, a Bayesian approach allows us to consider different investor beliefs about the predictive power of the system before observing the data. Economically significant effects of different priors are also emphasized in Shanken and Tamayo (2012).
In contrast to this article and the paper by Wachter and Warusawitharana (2009) with their focus on the prior about the coefficient of determination, they analyze a larger set of parameters and their prior distributions. However, it is not so obvious for an investor to form prior beliefs about many individual parameters. We offer a more parsimonious approach where an investor might have an opinion about the $R^2$, but the prior beliefs about the rest of the parameters are noninformative. For the estimation of the predictive regression, we adopt the framework proposed by Wachter and Warusawitharana (2009). They allow for different prior beliefs on $R^2$, which is usually interpreted as an indicator of predictive power. In this article, we develop a similar approach for the predictive system that allows us to compare the two models and analyze the effects of the priors. In the predictive system, we need to estimate the unobservable series of expected returns $\delta_t$ and several parameters: $\alpha$, $\beta$, $A$, $B$, and a covariance matrix $\Sigma$. As discussed in Pastor and Stambaugh (2009), the system is not fully identified only by the data. Only the coefficients ($\alpha$, $\beta$, $A$, $B$) and the covariance of innovations in the process for predictors, $\Sigma_v$, can be identified by the data. However, we can impose an additional structure on the covariance matrix $\Sigma$ in the Bayesian framework via priors that guarantees the identification of all parameters in the covariance matrix. The Bayesian setup allows us to put an informative prior distribution on the parameters about which we have some intuition, and noninformative priors otherwise. Our main focus is on the parameters that have an impact on the $R^2$ of the first equation in the system, which is defined as

$$R^2 = 1 - \frac{\sigma_u^2}{\sigma_r^2} = 1 - \frac{\sigma_u^2}{\dfrac{\sigma_w^2}{1 - \beta^2} + \sigma_u^2}. \tag{11}$$

We implement different priors on $R^2$ by imposing restrictions on the distributions of $\sigma_u^2$, $\sigma_w^2$, and $\beta$.
We model the prior on $R^2$ indirectly by imposing a specific structure on the prior distribution of the error covariance matrix $\Sigma$. We choose an informative prior on the error covariance submatrix $\Sigma_{11}$,

$$\Sigma_{11} = \begin{bmatrix} \sigma_u^2 & \sigma_{uw} \\ \sigma_{uw} & \sigma_w^2 \end{bmatrix}, \tag{12}$$

but a noninformative prior on the other elements of the error covariance matrix $\Sigma$. Stambaugh (1997) argues that such a prior can be modeled as a posterior of $\Sigma$ with a noninformative prior and an alternative hypothetical sample of $T_1$ observations of $v$ and $T_2$ observations of $(u, w)$, where $T_1 \ll T_2 \ll T$ and $T$ is the actual number of observations. As in Pastor and Stambaugh (2009), we use $T_1 = K + 3$ and $T_2 = T/5$, where $K$ is the number of predictors. Therefore, the informative prior on the submatrix $\Sigma_{11}$ has an inverted Wishart distribution

$$\Sigma_{11} \sim IW(T_2 \hat{\Sigma}_{11}, T_2 - K), \tag{13}$$

where the elements of $\hat{\Sigma}_{11}$ are chosen in such a way that $E(\Sigma_{11})$ yields different distributions of $R^2$. In contrast to Pastor and Stambaugh (2009), we do not analyze the prior on $\rho_{uw}$ (which is driven by the off-diagonal element of $\hat{\Sigma}_{11}$), but we focus on $\sigma_u^2$ and $\sigma_w^2$, which are related to the diagonal elements of $\hat{\Sigma}_{11}$. By using the relation

$$\sigma_\delta^2 = \frac{\sigma_w^2}{1 - \beta^2} \tag{14}$$

in equation (11), we see that these variances, together with the value of $\beta$, determine the value of $R^2$. As a starting point, we investigate an investor with the same priors as used by Pastor and Stambaugh (2009). We set the elements of $\hat{\Sigma}_{11}$ in such a way that the prior mean of the variance of the unexpected return $\sigma_u^2$ equals 95% of the sample return variance $\sigma_r^2$. The prior mean of the variance of the shocks in the expected returns $\sigma_w^2$ is chosen so that, combined with $\beta = 0.97$, it sets the variance of the expected return $\sigma_\delta^2$ to 5% of the sample return variance. Pastor and Stambaugh argue that these values imply a plausible prior on $R^2$. We go one step further and analyze a richer set of priors to see the effect of different beliefs about predictability.
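The calibration of the prior means can be sketched directly from equations (11) and (14). The code below is a minimal illustration under stated assumptions: the function names are hypothetical, the sample return variance is a made-up number, and the parameter `k` is simply a label for the fraction of return variance attributed to the expected return (5% for the benchmark prior described above).

```python
def expected_return_variance(sigma_w2, beta):
    """Equation (14): unconditional variance of the AR(1) expected return."""
    return sigma_w2 / (1 - beta**2)


def r_squared(sigma_u2, sigma_w2, beta):
    """Equation (11): share of return variance due to the expected return."""
    sigma_delta2 = expected_return_variance(sigma_w2, beta)
    return 1 - sigma_u2 / (sigma_delta2 + sigma_u2)


def calibrate_prior_means(sample_var_r, k, beta):
    """Prior means of sigma_u^2 and sigma_w^2 for an explained fraction k."""
    sigma_u2 = (1 - k) * sample_var_r            # unexpected-return variance
    sigma_w2 = k * sample_var_r * (1 - beta**2)  # inverts equation (14)
    return sigma_u2, sigma_w2


# Hypothetical quarterly sample variance of excess returns (illustration only).
sample_var_r = 0.0064
sigma_u2, sigma_w2 = calibrate_prior_means(sample_var_r, k=0.05, beta=0.97)
implied_r2 = r_squared(sigma_u2, sigma_w2, beta=0.97)
print(sigma_u2, sigma_w2, implied_r2)
```

Note that this evaluates $R^2$ at the prior means of the variances, which is not the same as the prior mean of $R^2$ itself, because $R^2$ is a nonlinear function of random variances.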
An investor with the priors used in Pastor and Stambaugh (2009) is our benchmark investor and, with a prior expected value of 5% for the coefficient of determination, is considered modest (compared to the other types of investors). Moreover, we consider a few more types of investors with

$$E(\sigma_\delta^2) = k \sigma_r^2 \quad \text{and} \quad E(\sigma_u^2) = (1 - k) \sigma_r^2, \tag{15}$$

where $k$ is the fraction of explained variance and $1 - k$ is the fraction of unexplained variance. The explained variance is lower for skeptical investors; more specifically, we consider $k = 1\%$, $0.5\%$, and $0.1\%$. We chose these values to obtain priors comparable to those in Wachter and Warusawitharana (2009). Finally, we also consider more optimistic priors with $k = 50\%$ and $10\%$. A closed-form solution for the relation between $E(R^2)$ and $k$ cannot be derived. However, we can sketch this relation by using the formula derived in Ali, Pal, and Woo (2007). They consider two independent random variables $X$ and $Y$ with inverted gamma distributions (one-dimensional inverse Wishart distributions) and derive the formula for the ratio $X/(X+Y)$ by using the gamma function. In our case, $X$ corresponds to the variance of $\delta$ and $Y$ is the variance of $u$. (In our model, however, they do not have to be independent in general.) With this one-dimensional simplification, we can calculate the expected value of $R^2$ for all $k$ except $k = 50\%$, which is outside the range for which the formula yields a usable value. From Table 1, we see that $E(R^2)$ and $k$ are practically the same and, thus, for ease of exposition in the rest of the article, we refer to $E(R^2)$ and $k$ as the same parameter.

Table 1.
Relation between the prior expected value of $R^2$ and the parameter $k$

| Parameter $k$, % | 50 | 10 | 5 | 1 | 0.5 | 0.1 |
|---|---|---|---|---|---|---|
| $E(R^2)$, % | – | 10.356 | 5.081 | 1.003 | 0.508 | 0.100 |

Notes: The mean of $R^2$ is calculated by using a formula for the ratio of two inverted gamma-distributed variables (Ali, Pal, and Woo, 2007). The reported values are based on the length of the time series used in the empirical part. For a longer time series, the expected value of $R^2$ would be even closer to the parameter $k$.

The framework described above allows us to estimate the predictive system with different means of the prior distribution on $R^2$. Moreover, the framework derived in Wachter and Warusawitharana (2009) enables us to model different means of the prior on $R^2$ in the predictive regression. By defining the normalized variable $\eta$,

$$\eta = \frac{\sigma_x \gamma}{\sigma_u}, \tag{16}$$

with the normally distributed prior $\eta \sim N(0, \sigma_\eta^2)$, Wachter and Warusawitharana show that

$$R^2 = \frac{\eta^2}{\eta^2 + 1}. \tag{17}$$

Therefore, by choosing the corresponding pairs of parameters $\sigma_\eta$ in the predictive regression and $k$ in the predictive system, we can estimate the models for investors with various beliefs about predictability. Table 2 reports which values of $\sigma_\eta$ from Wachter and Warusawitharana (2009) correspond to the parameter $k$ for the predictive system in our framework.
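The link between the prior standard deviation of $\eta$ and the prior mean of $R^2$ in equation (17) can be checked numerically. The sketch below is not taken from the article: it assumes the prior $\eta \sim N(0, \sigma_\eta^2)$ stated above and approximates the prior mean of $R^2$ by Simpson's rule over a standard normal grid; the function name, integration range, and grid size are hypothetical choices.

```python
import math


def expected_r2(sigma_eta, half_width=10.0, n_steps=4000):
    """E[eta^2 / (eta^2 + 1)] for eta ~ N(0, sigma_eta^2) by Simpson's rule."""
    if n_steps % 2:
        raise ValueError("n_steps must be even for Simpson's rule")
    h = 2 * half_width / n_steps
    total = 0.0
    for i in range(n_steps + 1):
        z = -half_width + i * h            # grid point for a standard normal
        eta = sigma_eta * z                # rescale to eta ~ N(0, sigma_eta^2)
        density = math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)
        weight = 1 if i in (0, n_steps) else (4 if i % 2 else 2)
        total += weight * density * eta**2 / (eta**2 + 1)
    return total * h / 3


# Prior standard deviations of eta considered for the predictive regression.
for sigma_eta in (1.60, 0.37, 0.24, 0.10, 0.07, 0.03):
    print(sigma_eta, round(100 * expected_r2(sigma_eta), 2))
```

Under these assumptions, the implied prior means of $R^2$ should land near the corresponding fractions of explained variance, which is the sense in which a value of $\sigma_\eta$ is paired with a value of $k$.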
In the comparison of the models, we thus assume that the prior distribution on $R^2$ in the system and the regression has the same mean. However, we have to point out that the distributions are not identical and differ in higher moments. Although we have looked carefully at the prior distributions implied by different values of model parameters that have an impact on $R^2$, we are not able to match the higher moments.

Table 2. Comparison of parameters in the predictive system and the predictive regression

| Fraction of explained variance (%) | 50 | 10 | 5 | 1 | 0.5 | 0.1 |
|---|---|---|---|---|---|---|
| Predictive system, $k$ | 0.5 | 0.1 | 0.05 | 0.01 | 0.005 | 0.001 |
| Predictive regression, $\sigma_\eta$ | 1.60 | 0.37 | 0.24 | 0.10 | 0.07 | 0.03 |

As we do not focus on the correlation $\rho_{uw}$ between the unexpected return and innovations in the expected return, we estimate the model for the same three priors on $\rho_{uw}$ that are used in Pastor and Stambaugh (2009). They argue that this correlation is negative, and the informative priors reflect this belief. The three priors are a noninformative prior with the same mass on positive and negative values, a less informative prior with positive mass only on negative values, and a more informative prior with most mass on negative values close to −1. However, we argue based on the empirical results below that this prior does not have a strong effect on the out-of-sample performance.
All priors on the other parameters of the system are the same as in Pastor and Stambaugh (2009). Thus, we have defined a framework that allows for a comparison of the predictive system and the predictive regression. Both models are estimated by the Bayesian approach. Therefore, we are able to compare the characteristics and implications of the two models in a way that is unaffected by the estimation technique.

2.3 Out-of-Sample Performance

Our focus is to investigate the out-of-sample performance of the predictive system and predictive regression given different investor beliefs. Although many papers find strong in-sample predictability, the results in real-time evaluations of the models are mostly much weaker. In this article, we compare the performance of the two suggested models estimated by the Bayesian framework in real time. We now describe the evaluation procedure in more detail. For measuring out-of-sample performance, we use an expanding window strategy. Following Wachter and Warusawitharana (2009), we estimate the models after observing at least 20 years of quarterly data. By simulating 200,000 (75,000) draws, dropping the first 50,000 (1000) as a burn-in phase, and taking every third draw from the rest to decrease the serial correlation, we obtain the posterior distributions for the predictive regression (predictive system). Both models are re-estimated every four quarters. Predictions for $t + 1$ are computed every quarter, holding estimates fixed throughout a year, but using observed predictors lagged by one quarter. In the optimal portfolio choice problem, we consider an investor who holds stocks, bonds, and a risk-free asset in her portfolio. She maximizes expected utility in the next period $t + 1$ conditional on the information available now (period $t$). Although we look at the one-period static asset allocation problem, it would be possible to predict more periods ahead and look at the effects of the dynamic asset allocation problem.
However, this is beyond the scope of this article, which provides a first step in comparing the performance of the models in a static setup. The investor solves the one-period portfolio choice problem

$$\max E_t\big(U(W_{t+1})\big),$$

where $U(\cdot)$ is a utility function, $W_t$ is the wealth at time $t$, and the value is calculated conditional on all available information through time $t$. We consider an investor with a mean–variance utility function to make our results comparable to other studies (Wachter and Warusawitharana, 2009; Dangl and Halling, 2012). Therefore, the stock and bond weights from time $t$ to $t + 1$ are given by

$$\omega_t^i = \frac{1}{A} \frac{E_t(r_{t+1}^i)}{\mathrm{Var}_t(r_{t+1}^i)}, \tag{18}$$

where $A$ is a risk-aversion coefficient, $i$ is the index of risky assets, and $E_t(r_{t+1}^i)$ and $\mathrm{Var}_t(r_{t+1}^i)$ are the first two moments of the posterior distribution of the returns at time $t + 1$, conditional on the information at time $t$. Given a draw $j$ from the posterior distribution of the model parameters, a draw from the predictive distribution of asset returns is given by $r_j = \delta_j + u_j$ for the predictive system and by $r_j = a_j + b_j x_t + u_j$ for the predictive regression, where $u_j \sim N(0, \sigma_{u,j}^2)$. The optimal portfolio is then the solution to (18) with the mean and variance computed by simulating draws $r_j$. Both moments are determined for each model separately. As it stands, the covariance among assets in the predictive regression and the predictive system cannot be easily modeled. Therefore, the asset weights in Equation (18) are based on an asset correlation of zero. To investigate the role of covariance, we additionally use the correlation between the risky assets from the historical data (i.e., the sample correlation) as a robustness check in the empirical part. The final wealth is given by

$$W_{t+1} = W_t \underbrace{\Big(\sum_i \omega_t^i r_{t+1}^i + r_{f,t+1}\Big)}_{r_{t+1}^p}, \tag{19}$$

where $r_t^i$ is the realized return on risky asset $i$, $r_{f,t}$ is the realized return on the risk-free asset, and $r_t^p$ is the portfolio return.
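The weight rule in equation (18) can be sketched as a small routine that takes predictive draws of a single asset's return and returns the mean–variance weight. This is an illustration only: the posterior parameter draws below are hypothetical random numbers standing in for MCMC output, the function names are made up, and each asset is treated separately, in line with the zero-correlation assumption stated above.

```python
import random
import statistics


def mean_variance_weight(return_draws, risk_aversion):
    """Equation (18): weight = E_t(r) / (A * Var_t(r)) from predictive draws."""
    mean = statistics.fmean(return_draws)
    variance = statistics.variance(return_draws)  # sample variance of the draws
    return mean / (risk_aversion * variance)


def predictive_draws_regression(param_draws, x_t, rng):
    """One return draw per posterior draw (a_j, b_j, sigma_u_j):
    r_j = a_j + b_j * x_t + u_j with u_j ~ N(0, sigma_u_j^2)."""
    return [a + b * x_t + rng.gauss(0.0, sigma_u) for a, b, sigma_u in param_draws]


rng = random.Random(1)
# Hypothetical stand-in for posterior draws of (a, b, sigma_u); not real MCMC output.
param_draws = [
    (rng.gauss(0.0, 0.002), rng.gauss(0.5, 0.1), 0.08) for _ in range(20000)
]
draws = predictive_draws_regression(param_draws, x_t=0.04, rng=rng)

w2 = mean_variance_weight(draws, risk_aversion=2)
w5 = mean_variance_weight(draws, risk_aversion=5)
print(w2, w5)
```

Because the predictive mean and variance do not depend on risk aversion, the weights for different values of $A$ differ only by a scaling factor.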
In assessing the out-of-sample performance, what matters to an investor is how predictability is mapped into the gains and losses of her strategy. Because an investor takes into account the first and second moments of the return distribution in the optimal asset allocation problem, it is natural to take these moments into account when evaluating the model performance as well. Therefore, we measure the performance in terms of the CER, which evaluates the model in economic terms, adjusting for risk. For the portfolio over the considered out-of-sample time period, we define CER as

$$\mathrm{CER} = \bar{r}^p - \frac{A}{2} v_{r^p}, \qquad \bar{r}^p = \frac{1}{n} \sum_t r_t^p, \qquad v_{r^p} = \frac{1}{n-1} \sum_t (r_t^p - \bar{r}^p)^2, \tag{20}$$

where $t$ is the index of all quarters in the out-of-sample period 1972–2011, $\bar{r}^p$ is the average portfolio return over the out-of-sample period, and $v_{r^p}$ is the portfolio variance over the out-of-sample period. In the tables below, CERs are multiplied by 400 to express them as annual percentages. Assessing the significance of the out-of-sample performance is not easy, because the parameters are re-estimated sequentially, and the usual tests known from an in-sample analysis cannot be applied (i.e., transferred) to an out-of-sample context. Dangl and Halling (2012) use daily data to estimate the monthly variance of the portfolio from stocks and the risk-free asset. However, as we do not have the daily data for bond prices, we cannot repeat their test. Furthermore, the simulation exercise undertaken by Wachter and Warusawitharana (2009) in the predictive regression is infeasible for the predictive system due to time constraints. Therefore, for each CER, we opt to focus on the bootstrap standard error (Efron and Gong, 1983) that reflects the variability of the mean of CER. In particular, to avoid possible time dependency in realized returns, we rely on the block bootstrap standard errors.
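The CER in equation (20) and a block bootstrap standard error can be sketched as follows. This is a hedged illustration rather than the procedure behind the reported numbers: the return series is hypothetical, the function names are made up, and the bootstrap draws blocks of consecutive returns with randomly chosen (possibly overlapping) starting points, which is one of several possible block schemes. The block length and the number of replications are parameters of the sketch.

```python
import random
import statistics


def cer(portfolio_returns, risk_aversion):
    """Equation (20): mean return minus A/2 times the sample variance."""
    mean = statistics.fmean(portfolio_returns)
    variance = statistics.variance(portfolio_returns)  # divides by n - 1
    return mean - 0.5 * risk_aversion * variance


def block_bootstrap_se(portfolio_returns, risk_aversion, block_len=4,
                       n_boot=1000, seed=0):
    """Standard error of the CER from resampled blocks of consecutive returns."""
    rng = random.Random(seed)
    n = len(portfolio_returns)
    stats = []
    for _ in range(n_boot):
        sample = []
        while len(sample) < n:
            start = rng.randrange(n - block_len + 1)  # random block start
            sample.extend(portfolio_returns[start:start + block_len])
        stats.append(cer(sample[:n], risk_aversion))  # trim to original length
    return statistics.stdev(stats)


# Hypothetical quarterly portfolio returns (illustration only).
returns = [0.02, -0.01, 0.03, 0.00, 0.01, 0.04, -0.02, 0.02]
annualized_cer = 400 * cer(returns, risk_aversion=2)
se = 400 * block_bootstrap_se(returns, risk_aversion=2)
print(annualized_cer, se)
```

Resampling blocks rather than single quarters keeps short-run dependence within each block intact, which is the motivation for using a block scheme here.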
We bootstrap samples of four quarterly sequential out-of-sample returns until we obtain a series of the same length as the original out-of-sample window and repeat it 10,000 times. Based on the simulated returns, we calculate the standard error.

3 Data and Empirical Findings

We conducted our analysis on quarterly data spanning the first quarter of 1952 until the last quarter of 2011. Following other studies, we began our sample after 1951 when the Fed was allowed to pursue an independent monetary policy. All financial data are obtained from the Center for Research in Security Prices (CRSP). Excess stock returns are defined as the quarterly returns on the NYSE-AMEX-NASDAQ index in excess of the three-month Treasury bill. Similarly, the excess bond returns are constructed as the quarterly returns on the ten-year Treasury bond minus the three-month Treasury bill. The dividend–price ratio, used as a predictor of the stock returns (Campbell and Shiller, 1988; Fama and French, 1988), is constructed as the sum of total dividends paid over the previous 12 months divided by the current price. Dividends are calculated from the monthly stock returns, inclusive of dividends and exclusive of dividends on the value-weighted NYSE-AMEX-NASDAQ index. The yield spread, used as a predictor of the bond returns (Campbell and Shiller, 1991), is constructed as the yield on the five-year bond minus the yield on the three-month bond.

3.1 Results

This section describes the results obtained from the estimation of both considered models, the predictive regression and the predictive system. First, we evaluate the out-of-sample performance for the entire sample period, and then we conduct several robustness checks. We start with the analysis of the effects of the prior distribution for R2 on the in-sample model performance. Table 3 shows the mean and standard deviation of the posterior distribution of R2 for the predictive system and predictive regression for both risky assets.
A similar analysis has been already done for stock returns in Pastor and Stambaugh (2009). However, they do not distinguish different priors on R2, but only use one prior labeled P&S in our tables. Moreover, the predictive regression in their paper is estimated by OLS, which makes a comparison to the Bayesian estimates obtained for the predictive system difficult. Nevertheless, our more comprehensive results are consistent with their conclusion on the R2 for stock returns. For every column (i.e., prior on the fraction of the explained variance), the mean of in-sample R2 for the predictive system is higher than for the predictive regression. However, the results for bond returns are less clear. For the most optimistic prior, the predictive system yields a higher posterior mean for R2 than the predictive regression. For all other priors, the results are opposite. In any case, given the high standard deviations, deriving strong conclusions may be problematic, and we take these results only as a first indication of the model’s performance. Table 3. 
In-sample performance: Posterior R2

Posterior R2 (%), 1952−2011. Columns give the fraction of explained variance (%); standard deviations are in parentheses.

| | 50 | 10 | 5 | 1 | 0.5 | 0.1 |
|---|---|---|---|---|---|---|
| | W&W | | P&S | W&W | W&W | W&W |
| | optimistic | | | | | pessimistic |
| Stocks | | | | | | |
| Predictive system | 8.70 (3.28) | 5.55 (4.61) | 4.63 (5.17) | 3.00 (6.23) | 2.38 (5.42) | 1.17 (4.13) |
| Predictive regression | 1.36 (1.57) | 1.30 (1.06) | 1.26 (0.83) | 1.02 (0.65) | 0.84 (0.59) | 0.16 (0.20) |
| Bonds | | | | | | |
| Predictive system | 8.62 (2.97) | 2.65 (2.05) | 1.82 (2.82) | 0.92 (3.50) | 0.59 (2.79) | 0.18 (1.43) |
| Predictive regression | 5.54 (3.00) | 5.22 (2.85) | 4.78 (2.64) | 2.75 (1.69) | 1.65 (1.14) | 0.23 (0.24) |

Notes: This table reports means and standard deviations (in parentheses) of posterior R2 for the predictive system and predictive regression. The predictor is the dividend–price ratio for the stock returns and the yield spread for the bond returns. Different beliefs about the prior distribution on R2 are considered, characterized by the mean of the prior distribution. Lower values represent more skeptical investors. For the predictive system, we report the results for the more informative prior on the correlation between the expected and unexpected returns. Data are quarterly and span from 1952 to 2011.

The out-of-sample coefficients of determination are reported in Table 4.
The in-sample performance does not persist out of sample, and R2 decreases for both stock and bond returns. The predictive system delivers higher performance than the predictive regression for stock returns and investors who have modest priors. For bonds, the predictive regression achieves a higher R2. Although the coefficient of determination is a very common statistical measure, it is not clear what overall effect these R2s have on an investor's portfolio and, therefore, we focus on the CERs in the rest of the analysis.

Table 4. Out-of-sample R2

Out-of-sample R2 (%), 1972−2011. Columns give the fraction of explained variance (%).

| | 50 | 10 | 5 | 1 | 0.5 | 0.1 |
|---|---|---|---|---|---|---|
| | W&W | | P&S | W&W | W&W | W&W |
| | optimistic | | | | | pessimistic |
| Stocks | | | | | | |
| Predictive system | −2.28 | −0.71 | 0.26 | 1.42 | 0.68 | 0.26 |
| Predictive regression | −0.18 | −0.17 | −0.26 | 0.30 | 0.68 | 1.07 |
| Bonds | | | | | | |
| Predictive system | 0.23 | −0.07 | 0.30 | 0.46 | 0.49 | 0.62 |
| Predictive regression | 4.17 | 4.15 | 4.05 | 3.45 | 2.64 | 0.86 |

Notes: This table reports out-of-sample R2 for the predictive system and predictive regression. The predictor is the dividend–price ratio for the stock returns and the yield spread for the bond returns. Different beliefs about the prior distribution on R2 are considered, characterized by the mean of the prior distribution. Lower values represent more skeptical investors. For the predictive system, we report the results for the more informative prior on the correlation between the expected and unexpected returns. Data are quarterly and span from 1952 to 2011.

Table 5 reports our main results, presenting CERs of asset allocation strategies obtained from the entire out-of-sample period 1972–2011. We report CERs calculated for both models with different prior distributions on R2 and, for the predictive system, with different priors on the correlation ρuw. We consider investors with two different risk aversion coefficients, A = 2 and A = 5. For both degrees of risk aversion, the results are qualitatively very similar.

Table 5.
Out-of-sample performance: CERs

Fraction of explained variance (%)
Priors: W&W, P&S, W&W, W&W, W&W; ordered from optimistic (left) to pessimistic (right)

                               50       10        5        1      0.5      0.1
Panel A: Risk aversion A = 2, 1972−2011
Predictive system, prior on ρuw
  More informative           3.45     7.49     8.31     8.37     6.83     6.08
                           (0.38)   (0.31)   (0.29)   (0.20)   (0.18)   (0.16)
  Less informative           5.25     8.18     8.65     7.49     7.67     8.39
                           (0.37)   (0.32)   (0.30)   (0.27)   (0.25)   (0.17)
  Noninformative             4.54     7.94     8.73     8.08     8.07     9.19
                           (0.42)   (0.34)   (0.30)   (0.27)   (0.25)   (0.20)
Predictive regression        8.12     8.64     8.95    10.76    10.80     9.77
                           (0.39)   (0.37)   (0.34)   (0.26)   (0.21)   (0.17)
Panel B: Risk aversion A = 5, 1972−2011
Predictive system, prior on ρuw
  More informative           4.91     6.46     6.79     6.81     6.19     5.87
                           (0.15)   (0.12)   (0.12)   (0.08)   (0.07)   (0.07)
  Less informative           5.59     6.73     6.91     6.44     6.52     6.83
                           (0.15)   (0.13)   (0.12)   (0.11)   (0.10)   (0.07)
  Noninformative             5.30     6.62     6.94     6.69     6.70     7.17
                           (0.17)   (0.14)   (0.12)   (0.11)   (0.10)   (0.08)
Predictive regression        6.76     6.96     7.08     7.79     7.81     7.40
                           (0.15)   (0.14)   (0.13)   (0.11)   (0.09)   (0.07)

Notes: CERs are calculated for the predictive system and predictive regression. The portfolio consists of stocks, bonds, and a risk-free asset. The predictor is the dividend–price ratio for the stock returns and the yield spread for the bond returns. Optimal weights are calculated for a mean–variance investor with risk-aversion coefficients A = 2 (Panel A) and A = 5 (Panel B). At the beginning of each year, starting in 1972, we estimated the model and used the estimated parameters for calculating the optimal portfolio for each quarter in this year. Different beliefs about the prior distribution on R2 are considered, characterized by the expected value of the prior distribution. Lower values represent more skeptical investors. For the predictive system, we report the results for different priors on the correlation between the expected and unexpected returns as in Pastor and Stambaugh (2009). The noninformative prior is flat on most of the (−1,1) interval; the less informative implies most mass below zero; and the more informative imposes most mass below −0.7. Data are quarterly and span from 1952 to 2011. Numbers in parentheses are block bootstrap standard errors.

First, we discuss the results for the risk-aversion coefficient A = 2. Comparing the predictive system with the predictive regression for the same prior on R2 (i.e., comparing across rows within a column of Table 5), we find that the CER for the predictive regression is higher for any prior on ρuw. In other words, relaxing the assumption of a perfect correlation between expected returns and the predictor does not seem to pay off.
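As a rough illustration of how a CER and its bracketed standard error could be obtained from a series of realized portfolio returns, the Python sketch below assumes the textbook mean–variance definition, CER = mean − (A/2) × variance, together with a moving-block bootstrap. The function names, the block length of four quarters, the number of bootstrap draws, and the synthetic return series are all illustrative assumptions; the paper's exact CER definition and bootstrap design are not restated in this section.

```python
import numpy as np


def certainty_equivalent_return(portfolio_returns, risk_aversion):
    """Mean-variance CER: sample mean minus (A/2) times sample variance."""
    r = np.asarray(portfolio_returns, dtype=float)
    return r.mean() - 0.5 * risk_aversion * r.var(ddof=1)


def block_bootstrap_se(portfolio_returns, risk_aversion,
                       block_len=4, n_boot=1000, seed=0):
    """Standard error of the CER from a moving-block bootstrap.

    block_len, n_boot and seed are hypothetical choices, not values
    taken from the paper. Requires len(portfolio_returns) >= block_len.
    """
    r = np.asarray(portfolio_returns, dtype=float)
    n = r.size
    rng = np.random.default_rng(seed)
    n_blocks = int(np.ceil(n / block_len))
    last_start = n - block_len  # last index at which a full block fits
    stats = np.empty(n_boot)
    for b in range(n_boot):
        # Draw block start points with replacement, then stitch the
        # blocks together and trim to the original sample length.
        starts = rng.integers(0, last_start + 1, size=n_blocks)
        idx = (starts[:, None] + np.arange(block_len)[None, :]).ravel()[:n]
        stats[b] = certainty_equivalent_return(r[idx], risk_aversion)
    return stats.std(ddof=1)


# Synthetic quarterly returns (160 quarters), NOT the paper's data.
demo_rng = np.random.default_rng(1)
quarterly = demo_rng.normal(0.02, 0.08, size=160)
cer = certainty_equivalent_return(quarterly, risk_aversion=2)
se = block_bootstrap_se(quarterly, risk_aversion=2)
print(f"CER = {cer:.4f}, block-bootstrap SE = {se:.4f}")
```

Resampling blocks rather than single quarters keeps short-run serial dependence in the returns intact, which is the usual motivation for a block bootstrap in this setting.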
With regard to the behavior of the CER with respect to the fraction of explained variance, we find an inverted U- or J-shape. Extreme investors on both tails (an optimistic investor with a high expected value of prior R2 and a skeptical investor with a low expected value) tend to perform worse than investors with a moderate prior distribution. However, the most optimistic investor in our study is in a worse position than the most skeptical investor for both models.

For A = 5, the difference between the models is slightly smaller, but the predictive regression still outperforms the predictive system. As for A = 2, the CER for A = 5 varies with the prior beliefs about predictability (R2) and exhibits an inverted U-shape. In contrast, the performance is not very sensitive to the prior on the correlation ρuw.

We now investigate the asset allocation weights. The effects of different levels of optimism on the stock weights for the predictive system can be seen in Figure 1, Panel A. We plot estimated weights given by (18) for different priors on predictive power, while choosing the prior on ρuw to be noninformative. More optimism about predictability is mirrored in more volatile weights. The more pessimistic an investor is, the less weight (in absolute terms) is allocated to the risky assets. Whereas the weights for the most optimistic investor vary from −200% to 200%, the weights for the most pessimistic investor are almost constant at approximately 40%. These results are consistent with the findings of Wachter and Warusawitharana (2009) for the predictive regression. Furthermore, the dynamics of the weights over time reflect an unfavorable situation for stocks in the 1990s, when the stock holdings were mostly negative. The spikes each year arise because we estimate the model on a yearly basis and keep the estimated parameters constant for all quarters in that year.

Figure 1.
Stock weights for investors with different beliefs in predictability. Panel (A): Sensitivity to the prior on R2 for the predictive system using a noninformative prior on ρuw. Panel (B): Sensitivity to the prior on ρuw for a prior on R2 fixed at 5%. Panel (C): Sensitivity to the model choice for a prior on R2 fixed at 5%.

Panel B of Figure 1 shows the sensitivity to the prior discussed in Pastor and Stambaugh (2009). We fix the prior on R2 to 5% and plot the stock weights for different priors on ρuw. The weights show no monotonicity or other clear pattern when the prior changes. A similar result holds for bond holdings. This indicates that the prior on the correlation between expected and unexpected returns in the predictive system does not play a key role for the investor.

Finally, in Panel C of Figure 1 we compare the stock weights for both the predictive regression (PR in the legend) and the predictive system (PS in the legend) with the same prior beliefs about the predictive power, R2. As in Panel B, the prior on R2 is fixed to 5%. The weights from the predictive regression are slightly less volatile than the weights from the predictive system for any prior on ρuw. The higher volatility for the predictive system might be explained by the higher precision of the estimated parameters documented in Pastor and Stambaugh (2009). Lower parameter uncertainty increases the weights a mean–variance investor is willing to allocate.

3.2 Robustness Checks
To investigate the robustness of our results, we conducted several additional checks.
As a first robustness check, we explore the role of correlation between the stock and bond returns. Because we can estimate stock and bond returns only separately for each model, we have so far assumed zero correlation between these assets. To show that the results are not driven by this assumption, we use the correlation between the stock and bond returns obtained from historical data to calculate the optimal weights. The CERs accounting for this correlation are reported in Table 6. The absolute values are lower than in the main analysis, but the comparative advantage of the predictive regression remains. As the correlation from historical data might not be the same as that implied by the predictive models, we further calculate the CER for a constant correlation, in particular 10% and 20% (chosen to be of a similar magnitude as the time-varying correlations from the historical data). As the results are qualitatively the same, we do not report them in a separate table.

Table 6.
Out-of-sample performance: CERs, correlation included

Fraction of explained variance (%)
Priors: W&W, P&S, W&W, W&W, W&W; ordered from optimistic (left) to pessimistic (right)

                               50       10        5        1      0.5      0.1
Panel A: Risk aversion A = 2, 1972−2011
Predictive system, prior on ρuw
  More informative           2.56     6.52     7.20     7.43     5.85     5.31
                           (0.38)   (0.32)   (0.29)   (0.21)   (0.18)   (0.16)
  Less informative           3.76     7.14     7.72     7.39     6.85     8.02
                           (0.37)   (0.33)   (0.31)   (0.28)   (0.25)   (0.16)
  Noninformative             3.22     6.81     7.86     7.23     7.49     8.99
                           (0.42)   (0.35)   (0.30)   (0.28)   (0.25)   (0.19)
Predictive regression        7.66     8.15     8.68    10.16    10.61     8.77
                           (0.37)   (0.36)   (0.34)   (0.26)   (0.21)   (0.19)
Panel B: Risk aversion A = 5, 1972−2011
Predictive system, prior on ρuw
  More informative           4.54     6.07     6.34     6.43     5.79     5.56
                           (0.15)   (0.13)   (0.12)   (0.08)   (0.07)   (0.06)
  Less informative           4.97     6.30     6.53     6.34     6.19     6.68
                           (0.16)   (0.13)   (0.12)   (0.11)   (0.10)   (0.07)
  Noninformative             4.76     6.17     6.58     6.34     6.46     7.09
                           (0.17)   (0.14)   (0.12)   (0.11)   (0.10)   (0.08)
Predictive regression        6.58     6.76     6.98     7.55     7.73     7.26
                           (0.15)   (0.14)   (0.13)   (0.11)   (0.09)   (0.07)

Notes: CERs are calculated for the predictive system and predictive regression. The portfolio consists of stocks, bonds, and a risk-free asset. The predictor is the dividend–price ratio for the stock returns and the yield spread for the bond returns. The correlation between the risky assets is calculated from the historical average and used for both models. Optimal weights are calculated for a mean–variance investor with risk-aversion coefficients A = 2 (Panel A) and A = 5 (Panel B). At the beginning of each year, starting in 1972, we estimated the model and used the estimated parameters for calculating the optimal portfolio for each quarter in this year. Different beliefs about the prior distribution on R2 are considered, characterized by the expected value of the prior distribution. Lower values represent more skeptical investors. For the predictive system, we report the results for different priors on the correlation between the expected and unexpected returns as in Pastor and Stambaugh (2009). The noninformative prior is flat on most of the (−1,1) interval; the less informative implies most mass below zero; and the more informative imposes most mass below −0.7. Data are quarterly and span from 1952 to 2011. Numbers in parentheses are block bootstrap standard errors.

Second, we consider different subsamples. As there seems to be a persistent decline in expected returns and an increase in the steady-state growth rate of the economy at the beginning of the 1990s (Pesaran and Timmermann, 2002; Lettau and Van Nieuwerburgh, 2008), we split the out-of-sample period into two halves: 1972−1991 and 1992−2011.
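The subsample split described here amounts to evaluating the same performance measure separately on each half of the out-of-sample period. A minimal Python sketch, assuming the textbook mean–variance CER (mean minus A/2 times variance) and a hypothetical array of calendar years aligned with the quarterly portfolio returns, could look as follows; the function names and the synthetic inputs are illustrative and not taken from the paper.

```python
import numpy as np


def cer(returns, risk_aversion):
    """Mean-variance CER: sample mean minus (A/2) times sample variance."""
    r = np.asarray(returns, dtype=float)
    return r.mean() - 0.5 * risk_aversion * r.var(ddof=1)


def cer_by_subsample(years, returns, risk_aversion, split_year=1992):
    """Return (CER before split_year, CER from split_year onward)."""
    years = np.asarray(years)
    returns = np.asarray(returns, dtype=float)
    first = returns[years < split_year]    # e.g. 1972-1991
    second = returns[years >= split_year]  # e.g. 1992-2011
    return cer(first, risk_aversion), cer(second, risk_aversion)


# Four quarters per year over 1972-2011; returns are synthetic placeholders.
years = np.repeat(np.arange(1972, 2012), 4)
returns = np.random.default_rng(0).normal(0.02, 0.08, size=years.size)
cer_early, cer_late = cer_by_subsample(years, returns, risk_aversion=2)
print(f"1972-1991: {cer_early:.4f}   1992-2011: {cer_late:.4f}")
```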
Tables 7 and 8 report CERs for these subsamples and the same degrees of risk aversion as reported for the entire period. The absolute CER in the first subsample is higher than in the second. This is consistent with the evidence in the literature indicating weaker or no predictability from the 1990s onward. In the first period, there is no clear preference for either model. For a not too optimistic investor (k = 5% or k = 10%), the predictive system outperforms the predictive regression. However, this is the only case in our robustness exercises in which the predictive system pays off compared to the predictive regression. For pessimistic prior beliefs, the predictive regression outperforms all other models. For more risk-averse investors with A = 5, the results are less volatile and less sensitive to the prior on R2. In the second period, the absolute CERs are lower. Nevertheless, the predictive regression outperforms the predictive system in every case except the most pessimistic prior on R2 combined with the noninformative prior on ρuw. Thus, the main conclusion does not change when considering these subsamples.

Table 7.
Out-of-sample performance: CERs, subsamples

Fraction of explained variance (%)
Priors: W&W, P&S, W&W, W&W, W&W; ordered from optimistic (left) to pessimistic (right)

                               50       10        5        1      0.5      0.1
Panel A: Risk aversion A = 2, 1972−1991
Predictive system, prior on ρuw
  More informative           9.42    14.45    14.53    10.99     9.18     8.24
                           (0.87)   (0.50)   (0.42)   (0.35)   (0.35)   (0.24)
  Less informative          10.56    15.29    15.27    12.56    11.15     8.45
                           (0.94)   (0.77)   (0.74)   (0.64)   (0.57)   (0.31)
  Noninformative             8.63    15.69    15.44    12.14    10.68     7.80
                           (1.12)   (0.77)   (0.73)   (0.65)   (0.60)   (0.42)
Predictive regression       11.65    12.69    13.40    15.45    13.99     9.56
                           (0.93)   (0.86)   (0.79)   (0.57)   (0.47)   (0.40)
Panel B: Risk aversion A = 5, 1972−1991
Predictive system, prior on ρuw
  More informative           8.86    10.80    10.83     9.37     8.62     8.22
                           (0.34)   (0.20)   (0.17)   (0.14)   (0.15)   (0.15)
  Less informative           9.32    11.14    11.12    10.01     9.43     8.34
                          (0.338)   (0.31)   (0.29)   (0.25)   (0.16)   (0.13)
  Noninformative             8.55    11.29    11.18     9.83     9.25     8.09
                           (0.45)   (0.31)   (0.29)   (0.26)   (0.24)   (0.17)
Predictive regression        9.71    10.12    10.40    11.19    10.59     8.80
                           (0.36)   (0.33)   (0.31)   (0.22)   (0.19)   (0.16)

Notes: CERs are calculated for the predictive system and predictive regression. The portfolio consists of stocks, bonds, and a risk-free asset. The predictor is the dividend–price ratio for the stock returns and the yield spread for the bond returns. Optimal weights are calculated for a mean–variance investor with risk-aversion coefficients A = 2 (Panel A) and A = 5 (Panel B). At the beginning of each year, starting in 1972 and ending in 1991, we estimated the model and used the estimated parameters for calculating the optimal portfolio for each quarter in this year. Different beliefs about the prior distribution on R2 are considered, characterized by the expected value of the prior distribution. Lower values represent more skeptical investors. For the predictive system, we report the results for different priors on the correlation between the expected and unexpected returns as in Pastor and Stambaugh (2009). The noninformative prior is flat on most of the (−1,1) interval; the less informative implies most mass below zero; and the more informative imposes most mass below −0.7. Data are quarterly and span from 1952 to 2011. Numbers in parentheses are block bootstrap standard errors.

Table 8.
Out-of-sample performance: CERs, subsamples

Fraction of explained variance (%)
Priors: W&W, P&S, W&W, W&W, W&W; ordered from optimistic (left) to pessimistic (right)

                               50       10        5        1      0.5      0.1
Panel A: Risk aversion A = 2, 1992−2011
Predictive system, prior on ρuw
  More informative          −2.38     0.80     2.30     5.76     4.48     3.92
                           (0.75)   (0.56)   (0.50)   (0.34)   (0.34)   (0.43)
  Less informative           0.51     1.52     2.43     2.67     4.31     8.31
                           (0.77)   (0.48)   (0.48)   (0.43)   (0.44)   (0.35)
  Noninformative             1.06     0.70     2.42     4.19     5.54    10.55
                           (0.83)   (0.55)   (0.48)   (0.47)   (0.47)   (0.49)
Predictive regression        4.74     4.73     4.65     6.17     7.64     9.97
                           (0.58)   (0.57)   (0.56)   (0.45)   (0.36)   (0.29)
Panel B: Risk aversion A = 5, 1992−2011
Predictive system, prior on ρuw
  More informative           1.15     2.39     2.97     4.32     3.81     3.58
                           (0.30)   (0.23)   (0.20)   (0.13)   (0.14)   (0.17)
  Less informative           2.29     2.68     3.03     3.11     3.76     5.34
                           (0.31)   (0.20)   (0.20)   (0.17)   (0.18)   (0.15)
  Noninformative             2.50     2.35     3.03     3.73     4.27     6.25
                           (0.33)   (0.22)   (0.19)   (0.19)   (0.19)   (0.20)
Predictive regression        4.01     4.00     3.96     4.55     5.12     6.02
                           (0.22)   (0.22)   (0.21)   (0.17)   (0.14)   (0.12)

Notes: CERs are calculated for the predictive system and predictive regression. The portfolio consists of stocks, bonds, and a risk-free asset. The predictor is the dividend–price ratio for the stock returns and the yield spread for the bond returns. Optimal weights are calculated for a mean–variance investor with risk-aversion coefficients A = 2 (Panel A) and A = 5 (Panel B). At the beginning of each year, starting in 1992 and ending in 2011, we estimated the model and used the estimated parameters for calculating the optimal portfolio for each quarter in this year. Different beliefs about the prior distribution on R2 are considered, characterized by the expected value of the prior distribution. Lower values represent more skeptical investors. For the predictive system, we report the results for different priors on the correlation between the expected and unexpected returns as in Pastor and Stambaugh (2009). The noninformative prior is flat on most of the (−1,1) interval; the less informative implies most mass below zero; and the more informative imposes most mass below −0.7. Data are quarterly and span from 1952 to 2011. Numbers in parentheses are block bootstrap standard errors.

In addition to splitting the sample, we limit the weights of the risky assets to a range between 0% and 150% of the overall portfolio as in Dangl and Halling (2012). Table 9 shows that the absolute performance increases for most specifications. In terms of relative performance, the results indicate the same pattern as in the analysis without constraints, but now they are even more pronounced.
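To make the weight restriction concrete, the Python sketch below computes textbook mean–variance weights for two risky assets, w = (1/A) Σ⁻¹ μ, and then limits each weight to the interval [0, 1.5]. This is a sketch under stated assumptions rather than the paper's weight formula: the function names and the numerical inputs are hypothetical, and the section does not say whether the constraint is imposed by simple truncation or by re-optimization. With zero correlation the problem separates by asset, so truncation coincides with the constrained optimum; with nonzero correlation it is only an approximation.

```python
import numpy as np


def mean_variance_weights(mu, sigma, rho, risk_aversion):
    """Unconstrained weights w = (1/A) * inv(Sigma) @ mu for two risky assets.

    mu: expected excess returns, sigma: return volatilities,
    rho: correlation between the two assets (0 reproduces the
    zero-correlation case).
    """
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    off_diag = rho * sigma[0] * sigma[1]
    cov = np.array([[sigma[0] ** 2, off_diag],
                    [off_diag, sigma[1] ** 2]])
    return np.linalg.solve(cov, mu) / risk_aversion


def clip_weights(weights, lower=0.0, upper=1.5):
    """Limit each risky-asset weight to [lower, upper] (0% to 150%)."""
    return np.clip(weights, lower, upper)


# Hypothetical quarterly inputs for (stocks, bonds); not estimates from the paper.
mu = [0.02, 0.005]
sigma = [0.08, 0.04]
w_free = mean_variance_weights(mu, sigma, rho=0.0, risk_aversion=2)
w_capped = clip_weights(w_free)
risk_free_share = 1.0 - w_capped.sum()  # remainder goes to the risk-free asset
print(w_free, w_capped, risk_free_share)
```

The lower bound rules out short positions in the risky assets, and the upper bound limits leverage, which removes the most extreme positions an optimistic prior would otherwise produce.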
The predictive regression exhibits consistently better performance than the predictive system. Moreover, we check the performance by using a rolling window of 20 years instead of an expanding window for the estimation. Because the results are almost the same as in the main analysis, we do not report them in a separate table.

Table 9. Out-of-sample performance: CERs, constraints on weights

Column headings give the fraction of explained variance (%) and the corresponding prior. Block bootstrap standard errors are in parentheses below each CER.

Panel A: Risk aversion A = 2, 1972−2011

| Model | Prior on ρ_uw | 50 (W&W, optimistic) | 10 | 5 (P&S) | 1 (W&W) | 0.5 (W&W) | 0.1 (W&W, pessimistic) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Predictive system | More informative | 9.60 | 10.13 | 10.17 | 9.20 | 8.17 | 7.65 |
| | | (0.33) | (0.29) | (0.29) | (0.20) | (0.17) | (0.16) |
| Predictive system | Less informative | 10.15 | 10.01 | 9.89 | 9.03 | 9.04 | 9.08 |
| | | (0.33) | (0.30) | (0.28) | (0.28) | (0.26) | (0.17) |
| Predictive system | Noninformative | 9.91 | 9.48 | 9.60 | 9.14 | 9.07 | 8.95 |
| | | (0.34) | (0.33) | (0.31) | (0.28) | (0.28) | (0.20) |
| Predictive regression | | 10.45 | 10.38 | 10.36 | 10.39 | 10.03 | 9.70 |
| | | (0.31) | (0.31) | (0.30) | (0.25) | (0.21) | (0.18) |

Panel B: Risk aversion A = 5, 1972−2011

| Model | Prior on ρ_uw | 50 (W&W, optimistic) | 10 | 5 (P&S) | 1 (W&W) | 0.5 (W&W) | 0.1 (W&W, pessimistic) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Predictive system | More informative | 6.92 | 7.16 | 7.07 | 6.85 | 6.39 | 6.31 |
| | | (0.13) | (0.12) | (0.11) | (0.08) | (0.08) | (0.07) |
| Predictive system | Less informative | 7.58 | 7.35 | 7.27 | 6.58 | 6.62 | 6.97 |
| | | (0.13) | (0.12) | (0.12) | (0.11) | (0.11) | (0.07) |
| Predictive system | Noninformative | 7.46 | 7.15 | 7.10 | 6.81 | 6.86 | 7.28 |
| | | (0.14) | (0.13) | (0.12) | (0.11) | (0.11) | (0.08) |
| Predictive regression | | 8.39 | 8.43 | 8.44 | 8.42 | 8.03 | 8.35 |
| | | (0.13) | (0.13) | (0.12) | (0.10) | (0.09) | (0.08) |

Notes: CERs are calculated for the predictive system and predictive regression. The portfolio consists of stocks, bonds, and a risk-free asset. The predictor is the dividend–price ratio for the stock returns and the yield spread for the bond returns. Optimal weights are calculated for a mean–variance investor with risk-aversion coefficients A = 2 (Panel A) and A = 5 (Panel B). We limit the weights to lie between 0 and 1.5 for each risky asset. At the beginning of each year, starting in 1972, we estimated the model and used the estimated parameters for calculating the optimal portfolio for each quarter in this year. Different beliefs about the prior distribution on R²