The Income Fluctuation Problem and the Evolution of Wealth

Qingyin Ma (a), John Stachurski (b) and Alexis Akira Toda (c)

(a) International School of Economics and Management, Capital University of Economics and Business
(b) Research School of Economics, Australian National University
(c) Department of Economics, University of California San Diego

January 30, 2020

arXiv:1905.13045v3 [econ.TH] 29 Feb 2020

Abstract. We analyze the household savings problem in a general setting where returns on assets, non-financial income and impatience are all state dependent and fluctuate over time. All three processes can be serially correlated and mutually dependent. Rewards can be bounded or unbounded and wealth can be arbitrarily large. Extending classic results from an earlier literature, we determine conditions under which (a) solutions exist, are unique and are globally computable, (b) the resulting wealth dynamics are stationary, ergodic and geometrically mixing, and (c) the wealth distribution has a Pareto tail. We show how these results can be used to extend recent studies of the wealth distribution. Our conditions have natural economic interpretations in terms of asymptotic growth rates for discounting and return on savings.

Keywords: Income fluctuation, optimality, stochastic stability, wealth distribution.

We thank the editors and two anonymous referees for many valuable comments and suggestions. This paper has also benefited from discussion with many colleagues. We particularly thank Fedor Iskhakov, Larry Liu and Chung Tran for their insightful feedback and suggestions. The second author gratefully acknowledges financial support from ARC grant FT160100423. Email addresses: qingyin.ma@cueb.edu.cn, john.stachurski@anu.edu.au, atoda@ucsd.edu.

1. Introduction

It has been observed that, in the US and several other large economies, the wealth distribution is heavy tailed and wealth inequality has risen sharply over the last few decades.
This matters not only for its direct impact on taxation and redistribution policies, but also for potential flow-on effects for productivity growth, business cycles and fiscal policy, as well as for the political environment that shapes these and other economic outcomes.

Footnote: For example, in a study based on capital income data, Saez and Zucman (2016) find that, in the case of the US, the share of total household wealth held by the top 0.1% increased from 7 percent to 22 percent between 1978 and 2012. For a discussion of the heavy-tailed property of the wealth distribution, see Pareto (1896), Davies and Shorrocks (2000), Benhabib and Bisin (2018), Vermeulen (2018) or references therein.

Footnote: One analysis of the two-way interactions between inequality and political decision making can be found in Acemoglu and Robinson (2002). Glaeser et al. (2003) show how inequality can alter economic and social outcomes through subversion of institutions. The same study contains references on linkages between inequality and growth. Regarding fiscal policy, Brinca et al. (2016) find strong correlations between wealth inequality and the magnitude of fiscal multipliers, while Bhandari et al. (2018) study the connection between fiscal-monetary policy, business cycles and inequality. Ahn et al. (2018) discuss the impact of distributional properties on macroeconomic aggregates.

At present, our understanding of these phenomena is hampered by the fact that standard tools of analysis, such as those used for heterogeneous agent models, are not well adapted to studying the wealth distribution as it stands. For example, while we have sound understanding of the household problem when returns on savings and rates of time discount are constant (see, e.g., Schechtman (1976), Schechtman and Escudero (1977), Deaton and Laroque (1992), Carroll (1997), or Açıkgöz (2018)), our knowledge is far more limited in settings where these values are stochastic. This is problematic, since injecting such features into the household problem is essential for accurately representing the joint distribution of income and wealth (e.g., Benhabib et al. (2015), Benhabib et al. (2017), Stachurski and Toda (2019)). Moreover, models with time-varying discount rates and returns on assets are at the forefront of recent quantitative analysis of wealth and inequality.

Footnote: Also related is the recent experimental study of Epper et al. (2018), which finds a strong positive connection between dispersion in subjective rates of time discounting across the population and realized dispersion in the wealth distribution. This in turn is consistent with earlier empirical studies such as Lawrance (1991).

Footnote: For a recent quantitative study see, for example, Hubmer et al. (2018), where returns on savings and discount rates are both state dependent (as is labor income). Kaymak et al. (2018) find that asset return heterogeneity is required to match the upper tail of the wealth distribution.

While it might be hoped that the analysis of the income fluctuation problem (or household consumption and savings problem) changes little when we shift from constant to state dependent asset returns and rates of time discount, this turns out not to be the case. Effectively modeling these features and the way they map to the wealth distribution requires significant advances in our understanding of choice and stochastic dynamics in the setting of optimal savings.

One difficulty is that state-dependent discounting takes us beyond the bounds of traditional dynamic programming theory. This matters little if there exists some constant $\bar\beta < 1$ such that the discount process $\{\beta_t\}$ satisfies $\beta_t \le \bar\beta$ for all $t$ with probability one, since, in this case, a standard contraction mapping argument can still be applied (see, e.g., Miao (2006) or Cao (2020)). However, recent quantitative studies extend beyond such settings.
For example, AR(1) specifications are increasingly common, in which case the support of $\beta_t$ is unbounded above at every point in time. Even if discretization is employed, the outcome $\beta_t \ge 1$ can occur with positive probability when the approximation is sufficiently fine. Moreover, such outcomes are not inconsistent with empirical and experimental evidence, at least for some households in some states of the world. Do there exist conditions on $\{\beta_t\}$ that allow for $\beta_t \ge 1$ in some states and yet imply existence of optimal policies and practical computational techniques?

Footnote: See, for example, Hills and Nakata (2018), Hubmer et al. (2018) or Schorfheide et al. (2018).

Footnote: See, for example, Loewenstein and Prelec (1991) and Loewenstein and Sicherman (1991).

Another source of complexity for the income fluctuation problem in the general setting considered here is that the set of possible values for household assets is typically unbounded above. For example, when returns on assets are stochastic, a sufficiently long sequence of favorable returns can compound one another to project a household to arbitrarily high levels of wealth. This model feature is desirable: We wish to analyze these kinds of outcomes rather than rule them out. Indeed, Benhabib et al. (2015) and other related studies argue convincingly that such outcomes are a key causal mechanism behind the heavy tail of the current distribution of wealth. However, if we accept this logic, then stationarity and ergodicity of the wealth process, which are fundamental both for estimation and for simulation-based numerical methods, must now be established in a setting where the wealth distribution has unbounded support. In such a scenario, what conditions on preferences and financial and labor income are necessary for these properties to hold?

Footnote: One related study is Benhabib et al. (2011), who show that capital income risk is the driving force of the heavy-tail properties of the stationary wealth distribution. In Blanchard-Yaari style economies, Toda (2014), Toda and Walsh (2015) and Benhabib et al. (2016) show that idiosyncratic investment risk generates a double Pareto stationary wealth distribution. Gabaix et al. (2016) point out that a positive correlation of returns with wealth ("scale dependence") in addition to persistent heterogeneity in returns ("type dependence") can well explain the speed of changes in the tail inequality observed in the data.

A final and related example of the need for deeper analysis is as follows: To understand the upper tail of the wealth distribution, we must avoid unnecessarily truncating the upper tail of the set of possible asset values in quantitative work. While truncation is convenient because finite or compact state spaces are easier to handle computationally, we can attain greater accuracy in modeling the wealth distribution if truncation at the upper tail can be replaced locally by a parameterized savings function, such as a linear function (Gouin-Bonenfant and Toda, 2018). However, any such approximation must be justified by theory. What conditions can be imposed on primitives to generate such properties while still maintaining realistic assumptions for asset returns and non-financial income?

In this paper we address all of these questions, along with other key properties of the income fluctuation problem, such as continuity and monotonicity of the optimal consumption policy. Our setting admits capital income risk, labor earnings shocks and time-varying discount rates, driven by a combination of iid innovations and an exogenous Markov chain $\{Z_t\}$. The supports of the innovations can be unbounded, so we admit practical innovation sequences such as normal and lognormal. As a whole, this environment allows for a range of realistic features, such as stochastic volatility in returns on asset holdings, or correlation in the shocks impacting asset returns and non-financial income.
The utility function can be unbounded both above and below, with no specific structure imposed beyond differentiability, concavity and the usual slope (Inada) conditions.

Footnote: While the assumption that the exogenous state process $\{Z_t\}$ is a (finite state) Markov chain might appear restrictive, it fits most practical settings and avoids a host of technical issues that tend to obscure the key ideas. Moreover, the innovation shocks are not restricted to be discrete, and the same is true for assets and consumption.

To begin, when considering optimality in the household problem, we require a condition on the state dependent discount process $\{\beta_t\}$ that generalizes the classical condition $\beta < 1$ from the constant case and, for reasons discussed above, permits $\beta_t > 1$ with positive probability. To this end, we introduce the restriction

$$G_\beta < 1 \quad \text{where} \quad G_\beta := \lim_{n \to \infty} \left( E \prod_{t=1}^{n} \beta_t \right)^{1/n}. \tag{1}$$

Footnote: Here and below we set $\beta_0 \equiv 1$, so $\prod_{t=1}^{n} \beta_t = \prod_{t=0}^{n} \beta_t$.

Condition (1) clearly generalizes the classical condition $\beta < 1$ for the constant discount case. In the stochastic case, $\ln G_\beta$ can be understood as the asymptotic growth rate of the probability weighted average discount factor. Indeed, if $B_n := E \prod_{t=1}^{n} \beta_t$ is the average $n$-period discount factor, then, from the definition of $G_\beta$ and some straightforward analysis, we obtain $\ln(B_{n+1}/B_n) \to \ln G_\beta$, so the condition $G_\beta < 1$ implies that the asymptotic growth rate of the average $n$-period discount factor is negative, drifting down from its initial condition $\beta_0 \equiv 1$ at the rate $\ln G_\beta$. This does not, of course, preclude the possibility that $\beta_t > 1$ at any given $t$.

We show that condition (1) is in fact a necessary condition in those settings where the classical condition is necessary for finite lifetime values. In this sense it cannot be further weakened for the income fluctuation problem apart from special cases. At the same time, it admits the use of convenient specifications such as the discretized AR(1) process from Hubmer et al. (2018).
In addition, we prove that $G_\beta$ can be represented as the spectral radius of a nonnegative matrix, and hence can be computed by numerical linear algebra (as discussed below).

We also generalize the standard condition $\beta R < 1$, where $R$ is the gross interest rate in the constant case, which is used to ensure stability of the asset path and finiteness of lifetime valuations, as well as existence of stationary Markov policies (see, e.g., Deaton and Laroque (1992), Chamberlain and Wilson (2000) or Li and Stachurski (2014)). Analogous to (1), we introduce the generalized condition

$$G_{\beta R} < 1 \quad \text{where} \quad G_{\beta R} := \lim_{n \to \infty} \left( E \prod_{t=1}^{n} \beta_t R_t \right)^{1/n}. \tag{2}$$

Here $\{R_t\}$ is a stochastic capital income process. Analogous to the case of $G_\beta$, the value $\ln G_{\beta R}$ can be understood as the asymptotic growth rate of average gross payoff on assets, discounted to present value. We show that, when Conditions (1)-(2) hold and non-financial income satisfies two moment conditions, a unique optimal consumption policy exists. We also show that the policy can be computed by successive approximations and analyze its properties, such as monotonicity and asymptotic linearity. This asymptotic linearity can be used to successfully model wealth inequality by accurately representing asset path dynamics for very high wealth households (Gouin-Bonenfant and Toda, 2018).

One important feature of Conditions (1)-(2) is that they take into account the autocorrelation structure of preference shocks and asset returns. For example, if these processes depend only on iid innovations, then (1) reduces to $E \beta_t < 1$ and (2) reduces to $E \beta_t R_t < 1$. But returns on assets are typically not iid, since both mean returns and volatility are, in general, time varying, and preference shocks are typically modeled as correlated (see, e.g., Hubmer et al. (2018) or Schorfheide et al. (2018)). This dependence must be and is accounted for in (2), since long upswings in $\{\beta_t\}$ and $\{R_t\}$ can lead to explosive paths for valuations and assets.
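The spectral radius representation can be illustrated with a small numerical sketch. The construction below is an assumption made for illustration rather than the authors' code: it takes $\beta_t = \beta(Z_t, \varepsilon_t)$ with a finite state chain, writes `b[z]` for the conditional mean of $\beta$ in state $z$, and uses the matrix $L(z, z') = P(z, z')\, b(z')$, for which $E_z \prod_{t=1}^{n} \beta_t = (L^n \mathbf{1})(z)$. The transition matrix and the values in `beta_mean` are hypothetical.

```python
import numpy as np

def growth_rate(P, b):
    """Spectral radius of L(z, z') = P(z, z') * b(z'), where b(z') is the
    conditional mean of the multiplicative factor in state z'."""
    L = P * b[np.newaxis, :]
    return np.max(np.abs(np.linalg.eigvals(L)))

# Hypothetical two-state example: the discount factor exceeds 1 in state 1.
P = np.array([[0.9, 0.1],
              [0.1, 0.9]])
beta_mean = np.array([0.90, 1.01])

G_beta = growth_rate(P, beta_mean)

# Cross-check against the definition in (1): the largest entry of L^n 1
# is max_z E_z prod_{t=1}^n beta_t, and its n-th root approaches G_beta.
n = 400
L = P * beta_mean[np.newaxis, :]
B_n = np.linalg.matrix_power(L, n) @ np.ones(2)
G_beta_from_definition = B_n.max() ** (1 / n)

# When Z is iid (identical rows of P), the growth rate is just the mean.
P_iid = np.full((2, 2), 0.5)
G_beta_iid = growth_rate(P_iid, beta_mean)

print(G_beta, G_beta_from_definition, G_beta_iid)
```

In this example the discount factor is above one in a persistent state, yet the growth rate stays below one, which is the situation condition (1) is designed to admit. The same function applies to (2) if `b` is replaced by the conditional mean of $\beta_t R_t$.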
Next we study asymptotic stability, stationarity and ergodicity of wealth. Such properties are essential to existence of stationary equilibria in heterogeneous agent models (e.g., Huggett (1993), Aiyagari (1994) or Cao (2020)), as well as standard estimation, calibration and simulation techniques that connect time series averages with cross-sectional moments. These properties require an additional restriction, placed on the asymptotic growth rate of mean returns. Analogous to (1) and (2), this is defined as

$$G_R := \lim_{n \to \infty} \left( E \prod_{t=1}^{n} R_t \right)^{1/n}. \tag{3}$$

We show that if $G_R$ is sufficiently restricted and a degree of social mobility is present, then there exists a unique stationary distribution for the state process, the distributional path of the state process under the optimal path converges globally to the stationary distribution, and the stationary distribution is ergodic. We also show that, under some mild additional conditions, the rate of convergence of marginal distributions to the stationary distribution is geometric, and that a version of the Central Limit Theorem is valid. Finally, under some mild additional conditions, we prove that the stationary distribution of assets is Pareto tailed, consistent with the data.

Our study is related to Benhabib et al. (2015), who prove the existence of a heavy-tailed wealth distribution in an infinite horizon heterogeneous agent economy with capital income risk. In the process, they show that households facing a stochastic return on savings possess a unique optimal consumption policy characterized by the (boundary constraint-contingent) Euler equation, and that a unique and unbounded stationary distribution exists for wealth under this consumption policy. They assume isoelastic utility, constant discounting, and mutually independent, iid returns and labor income processes, both supported on bounded closed intervals with strictly positive lower bounds. We relax all of these assumptions.
Apart from allowing more general utility and state dependent discounting, this permits such realistic features for household income as positive correlations between labor earnings and wealth returns (an extension that was suggested by Benhabib et al. (2015)), or time varying volatility in returns.

Footnote: A well-known example of a computational technique that uses ergodicity can be found in Krusell and Smith (1998). On the estimation side see, for example, Hansen and West (2002).

Another related paper is Chamberlain and Wilson (2000), which studies an income fluctuation problem with stochastic income and asset returns and obtains many significant results on asymptotic properties of consumption. Their study imposes relatively few restrictions on the wealth return and labor income processes. Our paper extends their work by allowing for random discounting, as well as dropping their boundedness restriction on the utility, which prevents their work from being used in many standard settings such as constant relative risk aversion. We also develop a set of new results on stability and ergodicity, as well as asymptotic normality of the wealth process.

Our optimality theory draws on techniques found in Li and Stachurski (2014), who show that the time iteration operator is a contraction mapping with respect to a metric that evaluates consumption differences in terms of marginal utility, while assuming a constant discount factor and constant rate of return on assets. We show that these ideas extend to a setting where both returns and discount rates are stochastic and time varying. Our results on dynamics under the optimal policy have no counterparts in Li and Stachurski (2014).

In a similar vein, our work is related to several other papers that treat the standard income fluctuation problems with constant rates of return on assets and constant discount rates, such as Rabault (2002), Carroll (2004) and Kuhn (2013).
While Carroll (2004) constructs a weighted supremum norm contraction and works with the Bellman operator, the other two papers focus on time iteration. In particular, Rabault (2002) exploits the monotonicity structure, while Kuhn (2013) applies a version of the Tarski fixed point theorem. Our techniques for studying optimality are close to those in Li and Stachurski (2014), as discussed above.

Footnote: Empirical motivation for these kinds of extensions can be found in numerous studies, including Guvenen and Smith (2014) and Fagereng et al. (2016a,b).

Footnote: Coleman (1990) introduced the time iteration operator as a constructive method for solving stochastic growth models. It has since been used in Datta et al. (2002), Morand and Reffett (2003) and many other studies.

Footnote: Our paper is also related to Cao and Luo (2017), who study wealth inequality in a continuous-time framework with heterogeneous returns following a two-state Markov chain. While we do not pursue the connection here, the generality of our setup, including a persistent shock structure to wealth returns, might permit a study of the continuous-time limit that yields the tail results of Cao and Luo (2017) in a general framework.

The rest of this paper is structured as follows. Section 2 formulates the problem and establishes optimality results. Sufficient conditions for the existence and uniqueness of optimal policies are discussed. Section 3 focuses on stochastic stability. Section 4 discusses our key conditions and how they can be checked. Section 5 provides a set of applications and Section 6 concludes. All proofs are deferred to the appendix. Code that generates our figures can be found at https://github.com/jstac/ifp_public.

2. The Income Fluctuation Problem and Optimality Results

This section formulates the income fluctuation problem we consider, establishes the existence, uniqueness and computability of a solution, and derives its properties.

2.1. Problem Statement.
We consider a general income fluctuation problem, where a household chooses a consumption-asset path $\{(c_t, a_t)\}$ to solve

$$\begin{aligned} &\max \; E_0 \left\{ \sum_{t=0}^{\infty} \left( \prod_{i=0}^{t} \beta_i \right) u(c_t) \right\} \\ &\text{s.t.} \quad a_{t+1} = R_{t+1}(a_t - c_t) + Y_{t+1}, \\ &\phantom{\text{s.t.} \quad} 0 \le c_t \le a_t, \quad (a_0, Z_0) = (a, z) \text{ given}. \end{aligned} \tag{4}$$

Here $u$ is the utility function, $\{\beta_t\}_{t \ge 0}$ is the discount factor process with $\beta_0 = 1$, $\{R_t\}_{t \ge 1}$ is the gross rate of return on wealth, and $\{Y_t\}_{t \ge 1}$ is non-financial income. These stochastic processes obey

$$\beta_t = \beta(Z_t, \varepsilon_t), \quad R_t = R(Z_t, \zeta_t), \quad \text{and} \quad Y_t = Y(Z_t, \eta_t), \tag{5}$$

where $\beta$, $R$ and $Y$ are measurable nonnegative functions and $\{Z_t\}_{t \ge 0}$ is an irreducible time-homogeneous Markov chain taking values in a finite set $\mathsf{Z}$. Let $P(z, \hat z)$ be the probability of transitioning from $z$ to $\hat z$ in one step. The innovation processes $\{\varepsilon_t\}$, $\{\zeta_t\}$ and $\{\eta_t\}$ are iid and independent, and their supports can be continuous and vector-valued.

The function $u$ maps $\mathbb{R}_+$ to $\{-\infty\} \cup \mathbb{R}$, is twice differentiable on $(0, \infty)$, satisfies $u' > 0$ and $u'' < 0$ everywhere on $(0, \infty)$, and satisfies $u'(c) \to \infty$ as $c \to 0$ and $u'(c) < 1$ as $c \to \infty$. We define

$$E_{a,z} := E\,[\, \cdot \mid (a_0, Z_0) = (a, z)] \quad \text{and} \quad E_z := E\,[\, \cdot \mid Z_0 = z]. \tag{6}$$

The next period value of a random variable $X$ is typically denoted $\hat X$. Expectation without a subscript refers to the stationary process, where $Z_0$ is drawn from its (necessarily unique) stationary distribution.

2.2. Key Conditions. Our conditions for optimality are listed below. In what follows, $G_\beta$ is the asymptotic growth rate of the discount process as defined in (1).

Assumption 2.1. The discount factor process satisfies $G_\beta < 1$.

Assumption 2.1 is a natural extension of the standard condition $\beta < 1$ from the constant discount case. If $\beta_t \equiv \beta$ for all $t$, then $G_\beta = \beta$, as follows immediately from the definition. It is weaker than the obvious sufficient condition $\beta_t \le \bar\beta$ with probability one for some constant $\bar\beta < 1$, since in such a setting we have $G_\beta \le \bar\beta < 1$. In fact it cannot be significantly weakened, as the next proposition shows.

Proposition 2.1 (Necessity of the discount condition).
Let $\beta_t$ and $u(Y_t)$ be positive with probability one for all $t$ and all initial states $z$ in $\mathsf{Z}$. If, in this setting, we have $G_\beta \ge 1$, then the objective in (4) is infinite at every initial state $(a, z)$.

The positivity assumed here may or may not hold in applications, but Proposition 2.1 shows that special conditions will have to be imposed on preferences if Assumption 2.1 fails. Put differently, allowing $G_\beta \ge 1$ is tantamount to allowing $\beta \ge 1$ in the case when the discount rate is constant.

Next, we need to ensure that the present discounted value of wealth does not grow too quickly, which requires a joint restriction on asset returns and discounting. When $\{R_t\}$ and $\{\beta_t\}$ are constant at values $R$ and $\beta$, the standard restriction from the existing literature is $\beta R < 1$. A generalization using $G_{\beta R}$ as defined in (2) is

Assumption 2.2. The discount factor and return processes satisfy $G_{\beta R} < 1$.

Finally, we impose routine technical restrictions on non-financial income. The second restriction is needed to exploit first order conditions.

Assumption 2.3. $E\, Y < \infty$ and $E\, u'(Y) < \infty$.

Next we provide one example where Assumptions 2.1-2.3 are easily verified. More complex examples are deferred to Sections 4 and 5.

Example 2.1. Suppose, as in Benhabib et al. (2015), that there is a constant discount factor $\beta < 1$, utility is CRRA with $\gamma \ge 1$, $\{R_t\}$ and $\{Y_t\}$ are iid, mutually independent, supported on bounded closed intervals of strictly positive real numbers, and, moreover,

$$\beta\, E R_t^{1-\gamma} < 1 \quad \text{and} \quad \left( \beta\, E R_t^{1-\gamma} \right)^{1/\gamma} E R_t < 1. \tag{7}$$

Assumptions 2.1-2.3 are all satisfied in this case. To see this, observe that $G_\beta = \beta < 1$ in the constant discount case, so Assumption 2.1 holds. Since $x \mapsto x^{1-\gamma}$ is convex on the positive reals when $\gamma \ge 1$, Jensen's inequality implies that $E R_t^{1-\gamma} \ge (E R_t)^{1-\gamma}$. Multiplying both sides of the last inequality by $\beta (E R_t)^{\gamma}$ yields

$$G_{\beta R} = \beta\, E R_t = \beta (E R_t)^{1-\gamma} (E R_t)^{\gamma} \le \left( \beta\, E R_t^{1-\gamma} \right) (E R_t)^{\gamma}.$$

By the second condition of (7), the right-hand side is less than one, so Assumption 2.2 holds. Assumption 2.3 also holds because $Y_t$ is restricted to a compact subset of the positive reals.

2.3.
Optimality: Definitions and Fundamental Properties. To consider optimality, we temporarily assume that $a_0 > 0$ and set the asset space to $(0, \infty)$. The state space for $\{(a_t, Z_t)\}_{t \ge 0}$ is then $S_0 := (0, \infty) \times \mathsf{Z}$. A feasible policy is a Borel measurable function $c \colon S_0 \to \mathbb{R}$ with $0 \le c(a, z) \le a$ for all $(a, z) \in S_0$. A feasible policy $c$ and initial condition $(a, z) \in S_0$ generate an asset path $\{a_t\}_{t \ge 0}$ via (4) when $c_t = c(a_t, Z_t)$ and $(a_0, Z_0) = (a, z)$. The lifetime value of policy $c$ is

$$V_c(a, z) = E_{a,z} \sum_{t=0}^{\infty} \beta_0 \cdots \beta_t \, u[c(a_t, Z_t)], \tag{8}$$

where $\{a_t\}$ is the asset path generated by $(c, (a, z))$. In the Appendix we show that $V_c$ is well-defined on $S_0$. A feasible policy $c^*$ is called optimal if $V_c \le V_{c^*}$ on $S_0$ for any feasible policy $c$.

Footnote: Assumption 2.3 combined with $u'(0) = \infty$ implies that $P\{Y_t > 0\} = 1$ for all $t \ge 1$. Hence, $P\{a_t > 0\} = 1$ for all $t \ge 1$ and excluding zero from the asset space makes no difference to optimality.

A feasible policy is said to satisfy the first order optimality condition if

$$(u' \circ c)(a, z) \ge E_z\, \hat\beta \hat R\, (u' \circ c)\left( \hat R\,[a - c(a, z)] + \hat Y, \hat Z \right) \tag{9}$$

for all $(a, z) \in S_0$, and equality holds when $c(a, z) < a$. Noting that $u'$ is decreasing, the first order optimality condition can be compactly stated as

$$(u' \circ c)(a, z) = \max\left\{ E_z\, \hat\beta \hat R\, (u' \circ c)\left( \hat R\,[a - c(a, z)] + \hat Y, \hat Z \right),\; u'(a) \right\} \tag{10}$$

for all $(a, z) \in S_0$. A feasible policy is said to satisfy the transversality condition if, for all $(a, z) \in S_0$,

$$\lim_{t \to \infty} E_{a,z}\, \beta_0 \cdots \beta_t \, (u' \circ c)(a_t, Z_t)\, a_t = 0. \tag{11}$$

Theorem 2.1 (Sufficiency of first order and transversality conditions). If Assumptions 2.1-2.3 hold, then every feasible policy satisfying the first order and transversality conditions is an optimal policy.

2.4. Existence and Computability of Optimal Consumption. Let $\mathscr{C}$ be the space of continuous functions $c \colon S_0 \to$
$\mathbb{R}$ such that $c$ is increasing in the first argument, $0 < c(a, z) \le a$ for all $(a, z) \in S_0$, and

$$\sup_{(a,z) \in S_0} \left| (u' \circ c)(a, z) - u'(a) \right| < \infty. \tag{12}$$

To compare two consumption policies, we pair $\mathscr{C}$ with the distance

$$\rho(c, d) := \| u' \circ c - u' \circ d \| := \sup_{(a,z) \in S_0} \left| (u' \circ c)(a, z) - (u' \circ d)(a, z) \right|, \tag{13}$$

which evaluates the maximal difference in terms of marginal utility. While elements of $\mathscr{C}$ are not generally bounded, $\rho$ is a valid metric on $\mathscr{C}$. In particular, $\rho$ is finite on $\mathscr{C}$ since $\rho(c, d) \le \| u' \circ c - u' \| + \| u' \circ d - u' \|$, and the last two terms are finite by (12). In Appendix B, we show that $(\mathscr{C}, \rho)$ is a complete metric space.

The following proposition shows that, for any policy in $\mathscr{C}$, the first order optimality condition (10) implies the transversality condition.

Proposition 2.2 (Sufficiency of first order condition). Let Assumptions 2.1-2.3 hold. If $c \in \mathscr{C}$ and the first order optimality condition (10) holds for all $(a, z) \in S_0$, then $c$ satisfies the transversality condition. In particular, $c$ is an optimal policy.

We aim to characterize the optimal policy as the fixed point of the time iteration operator $T$ defined as follows: for fixed $c \in \mathscr{C}$ and $(a, z) \in S_0$, the value of the image $Tc$ at $(a, z)$ is defined as the $\xi \in (0, a]$ that solves

$$u'(\xi) = \psi_c(\xi, a, z), \tag{14}$$

where $\psi_c$ is the function on

$$G := \{ (\xi, a, z) \in \mathbb{R}_+ \times (0, \infty) \times \mathsf{Z} : 0 < \xi \le a \} \tag{15}$$

defined by

$$\psi_c(\xi, a, z) := \max\left\{ E_z\, \hat\beta \hat R\, (u' \circ c)[\hat R (a - \xi) + \hat Y, \hat Z],\; u'(a) \right\}. \tag{16}$$

The following theorem shows that the time iteration operator is an $n$-step contraction mapping on a complete metric space of candidate policies and its fixed point is the unique optimal policy.

Theorem 2.2 (Existence, uniqueness and computability of optimal policies). If Assumptions 2.1-2.3 hold, then there exists an $n$ in $\mathbb{N}$ such that $T^n$ is a contraction mapping on $(\mathscr{C}, \rho)$. In particular,
(1) $T$ has a unique fixed point $c^* \in \mathscr{C}$.
(2) The fixed point $c^*$ is the unique optimal policy in $\mathscr{C}$.
(3) For all $c \in \mathscr{C}$ we have $\rho(T^k c, c^*) \to 0$ as $k \to \infty$.
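A minimal sketch of the time iteration operator defined in (14)-(16) may help fix ideas. It is not the authors' implementation (their code is at the repository cited in the Introduction). It assumes CRRA utility, a two-state chain, a state dependent discount factor with no innovation, two equally likely values for each of the return and income innovations, and a policy stored on a finite asset grid with linear interpolation. All parameter values and the names `policy`, `euler_rhs` and `T` are hypothetical.

```python
import numpy as np
from scipy.optimize import brentq

# All parameter values below are illustrative assumptions, not from the paper.
gamma = 2.0                                       # CRRA coefficient
P = np.array([[0.9, 0.1], [0.1, 0.9]])            # transition matrix of Z
beta = np.array([0.90, 0.96])                     # beta(z), no innovation here
R_vals = np.array([[0.97, 1.05], [0.97, 1.05]])   # R(z, zeta), zeta equally likely
Y_vals = np.array([[0.5, 1.0], [1.0, 2.0]])       # Y(z, eta), eta equally likely
grid = np.linspace(1e-3, 10.0, 30)                # asset grid

def u_prime(c):
    return c ** (-gamma)

def policy(a, c_vals):
    """Evaluate a gridded policy: linear interpolation on the grid,
    linear extrapolation above the largest grid point."""
    slope = (c_vals[-1] - c_vals[-2]) / (grid[-1] - grid[-2])
    above = c_vals[-1] + slope * (a - grid[-1])
    return np.where(a > grid[-1], above, np.interp(a, grid, c_vals))

def euler_rhs(xi, a, z, c):
    """E_z[ beta' R' u'(c(R'(a - xi) + Y', Z')) ], the expectation in (16)."""
    total = 0.0
    for z_next in range(len(P)):
        R = R_vals[z_next][:, None]               # column of return draws
        Y = Y_vals[z_next][None, :]               # row of income draws
        a_next = R * (a - xi) + Y                 # all (zeta, eta) pairs
        vals = beta[z_next] * R * u_prime(policy(a_next, c[z_next]))
        total += P[z, z_next] * vals.mean()       # draws are equally likely
    return total

def T(c):
    """One step of time iteration: solve (14) at every grid point."""
    c_new = np.empty_like(c)
    for z in range(len(P)):
        for i, a in enumerate(grid):
            if euler_rhs(a, a, z, c) <= u_prime(a):
                c_new[z, i] = a                   # constraint binds, xi = a
            else:
                c_new[z, i] = brentq(
                    lambda xi: u_prime(xi) - euler_rhs(xi, a, z, c), 1e-10, a)
    return c_new

c = np.tile(grid, (len(P), 1))                    # initial guess c(a, z) = a
dists = []
for k in range(50):
    c_next = T(c)
    # Distance (13), evaluated on the grid only.
    dists.append(np.max(np.abs(u_prime(c_next) - u_prime(c))))
    c = c_next
```

The root-finding step relies on the fact that $u'(\xi)$ is decreasing in $\xi$ while the expectation in (16) is increasing in $\xi$ when $c$ is increasing in assets, so the bracket $(0, a]$ contains at most one solution. The list `dists` records successive distances in the marginal utility metric, which gives a rough view of convergence for this particular parameterization.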
Part (3) shows that, under our conditions, the familiar time iteration algorithm is globally convergent, provided one starts with some policy in the candidate class $\mathscr{C}$.

2.5. Properties of Optimal Consumption. In this section we study the properties of the optimal consumption function obtained in Theorem 2.2. Assumptions 2.1-2.3 are assumed to hold throughout. The following two propositions establish monotonicity of the consumption function, which is intuitive.

Proposition 2.3 (Monotonicity with respect to wealth). The optimal consumption and savings functions $c^*(a, z)$ and $i^*(a, z) := a - c^*(a, z)$ are increasing in $a$.

Proposition 2.4 (Monotonicity with respect to income). If $\{Y_{1t}\}$ and $\{Y_{2t}\}$ are two income processes satisfying $Y_{1t} \le Y_{2t}$ for all $t$ and $c_1^*$ and $c_2^*$ are the corresponding optimal consumption functions, then $c_1^* \le c_2^*$ pointwise on $S_0$.

Under further assumptions we can show that the optimal policy is concave and asymptotically linear with respect to the wealth level.

Proposition 2.5 (Concavity and asymptotic linearity of consumption function). If for each $z \in \mathsf{Z}$ and $c \in \mathscr{C}$ that is concave in its first argument,

$$x \mapsto (u')^{-1}\left[ E_z\, \hat\beta \hat R\, (u' \circ c)(\hat R x + \hat Y, \hat Z) \right] \text{ is concave on } \mathbb{R}_+, \tag{17}$$

then
(1) $a \mapsto c^*(a, z)$ is concave, and
(2) there exists $\alpha(z) \in [0, 1]$ such that $\lim_{a \to \infty} [c^*(a, z)/a] = \alpha(z)$.

Remark 2.1. Condition (17) imposes some concavity structure on utility. It holds for the constant relative risk aversion (CRRA) utility function

$$u(c) = \frac{c^{1-\gamma}}{1-\gamma} \text{ if } \gamma > 0,\ \gamma \ne 1, \quad \text{and} \quad u(c) = \log c \text{ if } \gamma = 1, \tag{18}$$

as shown in Appendix B.

Proposition 2.5 states that $c^*(a, z) \approx \alpha(z) a + b(z)$ for some function $b(z)$ when $a$ is large. This provides justification for linearly extrapolating the policy functions when computing them at high wealth levels. Together, parts (1) and (2) of Proposition 2.5 imply the linear lower bound $c^*(a, z) \ge \alpha(z) a$, although they do not provide a concrete number for $\alpha(z)$. The following proposition establishes an explicit linear lower bound.
Proposition 2.6 (Linear lower bound on consumption). If there exists a nonnegative constant $\bar s$ such that

$$\bar s < 1 \quad \text{and} \quad E_z\, \hat\beta \hat R\, u'(\hat R\, \bar s\, a) \le u'(a) \text{ for all } (a, z) \in S_0, \tag{19}$$

then $c^*(a, z) \ge (1 - \bar s) a$ for all $(a, z) \in S_0$.

The second inequality in (19) restricts marginal utility derived from transferring wealth to the next period and then consuming versus consuming wealth today. The value $\bar s$ can be clarified once primitives are specified, as the next example illustrates.

Footnote: We adopt the convention $0 \cdot \infty = 0$, so condition (19) does not rule out the case $P\{R_t = 0 \mid Z_{t-1} = z\} > 0$. Indeed, as shown in the proofs, the conclusions still hold if we replace this condition by the weaker alternative $E_z\, \hat\beta \hat R\, u'[\hat R\, \bar s\, a + (1 - \bar s) \hat Y] \le u'(a)$ for all $(a, z) \in S_0$.

Example 2.2. Suppose that utility is CRRA, as in (18). If we now take

$$\bar s := \left( \max_{z \in \mathsf{Z}} E_z\, \hat\beta \hat R^{1-\gamma} \right)^{1/\gamma} \tag{20}$$

and $\bar s < 1$, then the conditions of Proposition 2.6 hold. In particular, the second inequality in (19) holds, as follows directly from the definition of $\bar s$ and $u'(x) = x^{-\gamma}$. In the case of Benhabib et al. (2015), where the discount rate is constant and returns are iid, the expression in (20) reduces to $\bar s := (\beta\, E R_t^{1-\gamma})^{1/\gamma}$. The requirement $\bar s < 1$ then reduces to $\beta\, E R_t^{1-\gamma} < 1$, which is one of their assumptions (see Example 2.1).

3. Stationarity, Ergodicity, and Tail Behavior

This section focuses on stationarity, ergodicity and tail behavior of wealth under the unique optimal policy $c^*$ obtained in Theorem 2.2. So that this policy exists, Assumptions 2.1-2.3 are always taken to be valid. We extend $c^*$ to $S$ by setting $c^*(0, z) = 0$ for all $z \in \mathsf{Z}$ and consider dynamics of $(a_t, Z_t)$ on $S := \mathbb{R}_+ \times \mathsf{Z}$, the law of motion for which is

$$a_{t+1} = R(Z_{t+1}, \zeta_{t+1})\, [a_t - c^*(a_t, Z_t)] + Y(Z_{t+1}, \eta_{t+1}), \tag{21a}$$
$$Z_{t+1} \sim P(Z_t, \cdot). \tag{21b}$$

Let $Q$ be the joint stochastic kernel of $(a_t, Z_t)$ on $S$. See Appendix A for this and related definitions.

3.1. Stationarity.
To obtain existence of a stationary distribution we need to restrict the asymptotic growth rate for asset returns $G_R$ defined in (3).

Assumption 3.1. There exists a constant $\bar s$ such that (19) holds and $\bar s\, G_R < 1$.

Below is one straightforward example of a setting where this holds, with more complex applications deferred to Sections 4-5.

Example 3.1. Assumption 3.1 holds in the setting of Benhabib et al. (2015). As shown in Example 2.2, with $\bar s := (\beta\, E R_t^{1-\gamma})^{1/\gamma}$ and the assumptions of Benhabib et al. (2015) in force, the conditions of (19) hold. Moreover, in their iid setting we have $G_R = E R_t$, so $\bar s\, G_R < 1$ reduces to $(\beta\, E R_t^{1-\gamma})^{1/\gamma}\, E R_t < 1$. This is one of their conditions, as discussed in Example 2.1.

By Proposition 2.6, the value $\bar s$ in Assumption 3.1 is an upper bound on the rate of savings. $G_R$ is an asymptotic growth rate for each unit of savings invested. If the product of these is less than one, then probability mass contained in the wealth distribution will not drift to $+\infty$, which allows us to obtain the following result.

Theorem 3.1 (Existence of a stationary distribution). If Assumption 3.1 holds, then $Q$ admits at least one stationary distribution on $S$.

Stationarity of the form obtained in Theorem 3.1 is required to establish existence of stationary recursive equilibria in heterogeneous agent models with idiosyncratic risk, such as Huggett (1993) or Aiyagari (1994).

Footnote: Assumption 3.1 is weaker than any restriction implying wealth is bounded from above, a common device for compactifying the state space and thereby obtaining a stationary distribution. Indeed, under many specifications of $\{Y_t\}$ and $\{R_t\}$ that fall within our framework, wealth of a given household can and will, over an infinite horizon, exceed any finite bound with probability one. See, for example, Benhabib et al. (2015), Proposition 6.
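As a rough numerical illustration of Assumption 3.1 under CRRA utility, the sketch below computes $\bar s$ from (20) and checks $\bar s\, G_R < 1$, first in an iid setting of the kind described in Example 3.1 and then for a two-state chain. It assumes finitely many equally likely return values, and it computes $G_R$ in the Markov case as the spectral radius of the matrix with entries $P(z, z')\, E R(z', \zeta)$, which is one way to evaluate the limit in (3) for a finite chain. All numbers and variable names are hypothetical.

```python
import numpy as np

def spectral_radius(M):
    return np.max(np.abs(np.linalg.eigvals(M)))

gamma = 2.0                                   # CRRA coefficient (assumed)

# --- iid case: constant beta, iid returns with two equally likely values.
beta_const = 0.95
R_iid = np.array([0.90, 1.10])
ER = R_iid.mean()                             # G_R in the iid case
ER_pow = (R_iid ** (1 - gamma)).mean()        # E R^(1 - gamma)
s_bar_iid = (beta_const * ER_pow) ** (1 / gamma)
product_iid = s_bar_iid * ER                  # second condition in (7)

# Jensen step from Example 2.1: beta * E R <= (beta * E R^(1-gamma)) (E R)^gamma.
G_betaR_iid = beta_const * ER
jensen_bound = beta_const * ER_pow * ER ** gamma

# --- Markov case: beta and the return distribution depend on Z.
P = np.array([[0.9, 0.1], [0.1, 0.9]])
beta = np.array([0.90, 0.96])                 # beta(z)
R_vals = np.array([[0.95, 1.05],              # R(z, zeta), zeta equally likely
                   [1.00, 1.12]])

# E_z[beta' R'^(1-gamma)] = sum_z' P(z, z') beta(z') E[R(z', zeta)^(1-gamma)]
m = beta * (R_vals ** (1 - gamma)).mean(axis=1)
s_bar = np.max(P @ m) ** (1 / gamma)          # expression (20)

G_R = spectral_radius(P * R_vals.mean(axis=1)[np.newaxis, :])
product = s_bar * G_R

print(s_bar_iid, product_iid, s_bar, G_R, product)
```

In the Markov example the growth rate of returns exceeds one, so wealth would drift upward if everything were saved, but the bound on the savings rate is tight enough that the product stays below one.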
Footnote: For models with aggregate shocks, such as Krusell and Smith (1998), a fully specified recursive equilibrium requires that households take the wealth distribution as one component of the state in their savings problem, and that stationarity holds for the entire joint distribution (defined over a product space encompassing both the wealth distribution and the exogenous state process). These problems fall outside the scope of Theorem 3.1, since $\{Z_t\}$ is finite-valued. For a careful treatment of stationary recursive equilibrium in Krusell–Smith type models, see Cao (2020).

3.2. Ergodicity. While Assumption 3.1 implies existence of a stationary distribution, it is not in general sufficient for uniqueness or stability. For these additional properties to hold, we must impose sufficient mixing. In doing so, we consider the following two cases:

(Y1) The support of $\{Y_t\}$ is finite.
(Y2) The process $\{Y_t\}$ admits a density representation.

Condition (Y2) means that there exists a function $f$ from $\mathbb R_+ \times \mathsf Z$ to $\mathbb R_+$ such that

$$P\{Y_t \in A \mid Z_t = z\} = \int_A f(y \mid z) \, dy \tag{22}$$

for all Borel sets $A \subset \mathbb R_+$ and all $z$ in $\mathsf Z$.

Assumption 3.2. There exists a $\bar z$ in $\mathsf Z$ such that $P(\bar z, \bar z) > 0$. Moreover, with $y_\ell \ge 0$ defined as the greatest lower bound of the support of $\{Y_t\}$, either

(Y1) holds and $P\{Y_t = y_\ell \mid Z_t = \bar z\} > 0$, or
(Y2) holds and there exists a $\delta > y_\ell$ such that $f(\cdot \mid \bar z) > 0$ on $(y_\ell, \delta)$.

Assumption 3.2 requires that there is a positive probability of receiving low labor income at some relatively persistent state of the world $\bar z$. This is a mixing condition that enforces social mobility. The reason is that $\{Z_t\}$ is already assumed to be irreducible, so $\bar z$ is eventually visited by each household. For any such household, there is a positive probability of low labor income over a long period. Wealth then declines. In other words, currently rich households or dynasties will not be rich forever. This guarantees sufficient social mobility between rich and poor, generating ergodicity.

To state our uniqueness and stability results, let $Q^t$ be the $t$-step stochastic kernel, let $\|\cdot\|_{TV}$ be total variation norm and let $V(a,z) := a + m_V$, where $m_V$ is a constant to be defined in the proof. For any integrable real-valued function $h$ on $S$, let

$$\bar h(a,z) := h(a,z) - E h(a_t, Z_t) \quad \text{and} \quad \sigma^2 := E \bar h^2(a_0, Z_0) + 2 \sum_{t=1}^\infty E \bar h(a_0, Z_0) \bar h(a_t, Z_t),$$

where, here and in the theorem below, $E$ indicates expectation under stationarity.

Theorem 3.2 (Uniqueness, stability, ergodicity and mixing). If Assumptions 3.1 and 3.2 hold, then

(1) the stationary distribution $\psi_\infty$ of $Q$ is unique and there exist constants $\lambda < 1$ and $M < \infty$ such that

$$\| Q^t((a,z), \cdot) - \psi_\infty \|_{TV} \le \lambda^t M V(a,z) \quad \text{for all } (a,z) \in S.$$

(2) For all $(a,z) \in S$ and every real-valued function $h$ on $S$ such that $E|h(a_t, Z_t)| < \infty$,

$$P_{a,z} \left\{ \lim_{T \to \infty} \frac{1}{T} \sum_{t=1}^T h(a_t, Z_t) = E h(a_t, Z_t) \right\} = 1.$$

(3) $Q$ is $V$-geometrically mixing. Moreover, if $\sigma^2 > 0$ and $h^2 / V$ is bounded, then

$$\frac{1}{\sigma \sqrt{T}} \sum_{t=1}^T \bar h(a_t, Z_t) \stackrel{d}{\to} N(0,1) \quad \text{as } T \to \infty.$$

Part 1 of Theorem 3.2 states that the stationary distribution is unique and asymptotically attracting at a geometric rate. Part 2 states that the state process is ergodic, and hence long-run sample moments for individual households coincide with cross-sectional moments. The notion of mixing discussed in Part 3 is defined in the appendix. It states that social mobility holds asymptotically and mixing occurs at a geometric rate, although the rate may be arbitrarily slow. This mixing is enough to provide a Central Limit Theorem for the state process, which is the second claim in Part 3.

3.3. Tail Behavior. Having established the stationarity and ergodicity of wealth, we now study the tail behavior of the wealth distribution. We show that the wealth distribution is either bounded or (unbounded and) heavy-tailed under mild conditions.
To prove this result we introduce the following assumption.

Assumption 3.3. The assumptions of Proposition 2.5 are satisfied, so the optimal policy $a \mapsto c^*(a,z)$ is concave and asymptotically linear: $\lim_{a \to \infty} c^*(a,z)/a = \alpha(z) \in [0,1]$. Furthermore, there exists $\bar z \in \mathsf Z$ such that $P(\bar z, \bar z) > 0$ and

$$P_{\bar z} \{ R(\bar z, \zeta)(1 - \alpha(\bar z)) > 1 \} > 0. \tag{23}$$

Remark 3.1. Condition (23) implies that wealth grows with nonzero probability when it is large. Indeed, using the law of motion (21a) and noting that $Y_{t+1} \ge 0$, if $Z_t = Z_{t+1} = \bar z$, then by (23) we have

$$\frac{a_{t+1}}{a_t} \ge R(\bar z, \zeta_{t+1}) \, [1 - c^*(a_t, \bar z)/a_t] > 1$$

with positive probability if $a_t$ is large enough.

To state our result on tail behavior, we introduce the following notation. For any nonnegative function $A(z, \hat z, \hat\zeta)$, define the $\mathsf Z \times \mathsf Z$ matrix-valued function $M_A$ by

$$(M_A(s))(z, \hat z) = E_{z, \hat z} A(z, \hat z, \hat\zeta)^s. \tag{24}$$

Elements of $M_A(s)$ are conditional moment generating functions of $\log A$. In the statement below, $\odot$ denotes the Hadamard (entry-wise) product, and $r(\cdot)$ returns the spectral radius of a matrix. Also $a_\infty$ is a random variable with distribution $\psi_\infty(\cdot, \mathsf Z)$.

Theorem 3.3 (Tail behavior). Let Assumptions 3.1–3.3 hold and define

$$G(z, \hat z, \hat\zeta) = R(\hat z, \hat\zeta)(1 - \alpha(z)), \tag{25a}$$

$$A(z, \hat z, \hat\zeta) = G(z, \hat z, \hat\zeta) \, \mathbb 1\{G(z, \hat z, \hat\zeta) > 1\}, \quad \text{and} \tag{25b}$$

$$\lambda(s) = r(P \odot M_A(s)). \tag{25c}$$

Then $\lambda$ is convex in $s \ge 0$. Assume that there exists $s > 0$ in the interior of the domain of $\lambda$ such that $1 < \lambda(s) < \infty$ and let

$$\kappa := \inf\{ s > 0 \mid \lambda(s) > 1 \}. \tag{26}$$

If $a_\infty$ has unbounded support, then it is heavy-tailed. In particular, for any $\varepsilon > 0$,

$$\liminf_{a \to \infty} a^{\kappa + \varepsilon} P\{a_\infty \ge a\} > 0. \tag{27}$$

Remark 3.2. The assumption $1 < \lambda(s) < \infty$ for some $s > 0$ is weak. Because the $(\bar z, \bar z)$-th element of $P \odot M_A(s)$ is

$$P(\bar z, \bar z) \, E_{\bar z, \bar z} \, G(\bar z, \bar z, \hat\zeta)^s \, \mathbb 1\{G(\bar z, \bar z, \hat\zeta) > 1\},$$

by the definition of $G$ in (25a) and condition (23), we always have $\lambda(s) \to \infty$ as $s \to \infty$. Hence there exists $s > 0$ such that $\lambda(s) \in (1, \infty)$ if, for example, the distribution of $\hat\zeta$ has compact support.
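The objects in (24)–(26) are straightforward to evaluate once the state space is finite. The sketch below is one possible implementation under loud assumptions: the innovation $\hat\zeta$ is taken to have finitely many values, the asymptotic consumption rates $\alpha(z)$ are supplied as given numbers rather than computed from the model, and the function names `lam` and `tail_exponent` are hypothetical. The bisection step relies on the convexity of $\lambda$ stated in Theorem 3.3 together with $\lambda(0) \le 1$.

```python
import numpy as np

def lam(s, P, R_vals, probs, alpha):
    """Spectral radius of the Hadamard product of P and M_A(s), cf. (25c).

    P      : (n, n) transition matrix of the exogenous state
    R_vals : (n, K) array with R_vals[j, k] = R(z_j, zeta_k) (discrete innovation, assumed)
    probs  : (K,) probabilities of the innovation values
    alpha  : (n,) asymptotic consumption rates alpha(z), treated as given
    """
    # G[i, j, k] = R(z_j, zeta_k) * (1 - alpha(z_i)), cf. (25a)
    G = R_vals[np.newaxis, :, :] * (1.0 - alpha)[:, np.newaxis, np.newaxis]
    # A^s equals G^s where G > 1 and 0 elsewhere, cf. (25b)
    A_s = np.where(G > 1.0, G ** s, 0.0)
    M = A_s @ probs  # (n, n) matrix of conditional moments, cf. (24)
    return np.max(np.abs(np.linalg.eigvals(P * M)))  # P * M is entry-wise

def tail_exponent(P, R_vals, probs, alpha, s_hi=50.0, tol=1e-10):
    """Bisection for inf{s > 0 : lam(s) > 1}, cf. (26)."""
    if lam(s_hi, P, R_vals, probs, alpha) <= 1.0:
        raise ValueError("lam(s_hi) <= 1; increase s_hi")
    lo, hi = 0.0, s_hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if lam(mid, P, R_vals, probs, alpha) > 1.0:
            hi = mid
        else:
            lo = mid
    return hi

# Hypothetical one-state example: R is 0.9 or 1.3 with equal probability, alpha = 0.1.
P = np.array([[1.0]])
R_vals = np.array([[0.9, 1.3]])
probs = np.array([0.5, 0.5])
alpha = np.array([0.1])
kappa = tail_exponent(P, R_vals, probs, alpha)
print(kappa)
```

In this one-state example only the high return satisfies $G > 1$, so $\lambda(s) = 0.5 \cdot 1.17^s$ and the exponent solves $1.17^s = 2$. The numbers are illustrative only and have not been checked against Assumptions 3.1–3.3.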
Condition (27) implies that for any $\varepsilon > 0$, there exists a constant $C(\varepsilon) > 0$ such that $P\{a_\infty \ge a\} \ge C(\varepsilon) a^{-\kappa - \varepsilon}$ for large enough $a$, so the upper tail of the wealth distribution is at least Pareto.

Remark 3.3. Toda (2019) constructs an example of a Huggett (1993) economy with Pareto-tailed wealth distribution when discount factors are random. Theorem 3.3 is significantly more general as we allow for stochastic returns and income. Stachurski and Toda (2019) prove that with constant discount factor, constant asset return, and light-tailed income, the wealth distribution is always light-tailed. Theorem 3.3 shows that sufficient heterogeneity in discount factor or returns generates heavy tails.

Example 3.2. The CRRA-iid setting of Benhabib et al. (2015) satisfies the assumptions of Theorem 3.3. When utility is CRRA, by Proposition 5 of Benhabib et al. (2015), condition (23) holds if $R(\bar z, \hat\zeta) > 1/\bar s$ with positive probability, where $\bar s$ is given in Example 2.2. In the iid case, this condition reduces to $P\{(\beta E R_t^{1-\gamma})^{1/\gamma} R_t > 1\} > 0$, which holds under the conditions of Benhabib et al. (2015). Thus, Assumption 3.3 holds. The existence of $s > 0$ with $\lambda(s) \in (1, \infty)$ follows from Remark 3.2 and the assumption that $R_t$ has a compact support.

4. Testing the Growth Conditions

The three key conditions in the paper are the restrictions on the growth rates $G_\beta$, $G_{\beta R}$ and $G_R$, with the first two required for optimality and the last for stationarity (see Assumptions 2.1, 2.2 and 3.1 respectively). In this section we explore the restrictions implied by these conditions. We begin with the following result, which yields a straightforward method for computing these growth rates.

Lemma 4.1 (Long-run growth rates and spectral radii). Let $\varphi_t = \varphi(Z_t, \xi_t)$, where $\varphi$ is a nonnegative measurable function and $\{\xi_t\}$ is an iid sequence with marginal distribution $\pi$.
In this setting we have

$$G_\varphi = r(L_\varphi), \quad \text{where} \quad G_\varphi := \lim_{n \to \infty} \left( E \prod_{t=1}^n \varphi_t \right)^{1/n} \tag{28}$$

and $r(L_\varphi)$ is the spectral radius of the matrix defined by

$$L_\varphi(z, \hat z) = P(z, \hat z) \int \varphi(\hat z, \hat\xi) \, \pi(d\hat\xi). \tag{29}$$

Footnote (to Example 3.2): Benhabib et al. (2015) assume that $P\{\beta R_t > 1\} > 0$, so it suffices to show that $(\beta E R_t^{1-\gamma})^{1/\gamma} \ge \beta$ or, equivalently, $E(\beta R_t)^{1-\gamma} \ge 1$. By Jensen's inequality and their restriction $\gamma \ge 1$, the last bound is true whenever $(E \beta R_t)^{1-\gamma} \ge 1$. But this must hold because, under their conditions, we have $\beta E R_t < 1$, as shown in Example 2.1.

Footnote: The matrix $L_\varphi$ is expressed as a function on $\mathsf Z \times \mathsf Z$ in (29) but can be represented in traditional matrix notation by enumerating $\mathsf Z$.

What factors determine the long-run average growth rates embedded in our assumptions, such as $G_\beta$ or $G_{\beta R}$? Lemma 4.1 tells us how to compute these values for a given specification of dynamics, but how should we understand them intuitively and what factors determine their size?

To address these questions, let us consider an AR(1) discount factor process, which has been adopted in several recent quantitative studies (see, e.g., Hubmer et al. (2018) or Hills and Nakata (2018)). In particular, suppose that the state process follows a discretized version of

$$Z_{t+1} = (1 - \rho)\mu + \rho Z_t + (1 - \rho^2)^{1/2} \sigma \upsilon_{t+1}, \quad \{\upsilon_t\} \stackrel{iid}{\sim} N(0,1), \tag{30}$$

and $\beta_t = Z_t$. (The discretization implies that $\beta_t$ is always positive.) To simplify interpretation, the process (30) is structured so that the stationary distribution of $\{Z_t\}$ is $N(\mu, \sigma^2)$. We use Rouwenhorst (1995)'s method to discretize $\{Z_t\}$ and then calculate $G_\beta$ using Lemma 4.1, studying how $G_\beta$ is affected by the parameters in (30).

Since $\beta_t = Z_t$ for all $t$, the structure of (30) implies that $\mu$ is the long-run unconditional mean of $\{\beta_t\}$. It can therefore be set to a standard calibrated value for the discount factor, such as 0.99 from Krusell and Smith (1998). What we wish to understand is how the remaining parameters $\rho$ and $\sigma$ affect the value of $G_\beta$.
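The computation just described, discretizing (30) and then applying Lemma 4.1, can be sketched as follows. This is a minimal illustration rather than the authors' code: the number of grid points `n`, the function names, and the particular values of the persistence and volatility parameters are assumptions made here. The Rouwenhorst construction is the standard symmetric one, with grid half-width $\sqrt{n-1}$ times the stationary standard deviation.

```python
import numpy as np

def rouwenhorst(n, rho, sigma, mu=0.0):
    """Discretize an AR(1) process with persistence rho, stationary mean mu
    and stationary standard deviation sigma on an n-point grid (n >= 2)."""
    p = (1.0 + rho) / 2.0
    P = np.array([[p, 1.0 - p], [1.0 - p, p]])
    for m in range(3, n + 1):
        Pn = np.zeros((m, m))
        Pn[:-1, :-1] += p * P
        Pn[:-1, 1:] += (1.0 - p) * P
        Pn[1:, :-1] += (1.0 - p) * P
        Pn[1:, 1:] += p * P
        Pn[1:-1, :] /= 2.0  # interior rows were counted twice
        P = Pn
    half_width = np.sqrt(n - 1) * sigma
    grid = np.linspace(mu - half_width, mu + half_width, n)
    return grid, P

def growth_rate(P, phi_mean):
    """Spectral radius of L(z, z') = P(z, z') * E phi(z', .), as in Lemma 4.1."""
    L = P * phi_mean[np.newaxis, :]  # scale column z' by the conditional mean at z'
    return np.max(np.abs(np.linalg.eigvals(L)))

# Hypothetical parameter values for illustration; beta_t = Z_t, so phi_mean is the grid.
grid, P = rouwenhorst(n=15, rho=0.98, sigma=0.013, mu=0.99)
G_beta = growth_rate(P, grid)
print(f"G_beta = {G_beta:.5f}")
```

A useful sanity check is the case $\rho = 0$: the rows of the discretized transition matrix then coincide, the state is effectively iid, and the spectral radius reduces to the unconditional mean $\mu$.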
While no closed form expression is available in this case, Figure 1 sheds some light by providing a contour plot of $G_\beta$ over a set of $(\rho, \sigma)$ pairs. The figure shows that $G_\beta$ grows with both the persistence term $\rho$ and volatility term $\sigma$. In particular, the condition $G_\beta < 1$ fails when the persistence and volatility of the discount factor process are sufficiently high. This is because $G_\beta$ is the limit of $(E \prod_{t=1}^n \beta_t)^{1/n}$ and, for positive random variables, sequences of large outcomes have a strong compounding effect on their product. High volatility and high persistence reinforce this effect.

This discussion has focused on $G_\beta$ but similar intuition applies to both $G_R$ and $G_{\beta R}$. If $\beta_t$ and $R_t$ are both increasing functions of the state process, then these asymptotic growth rates also increase with greater persistence and volatility in the state process, as well as higher unconditional mean. The next section further illustrates these points.

Footnote (continued, on the matrix representation of (29)): Specifically, if $\mathsf Z := \{z_1, \ldots, z_N\}$, then $L_\varphi = P D_\varphi$ where $P$ is, as before, the transition matrix for the exogenous state, and $D_\varphi := \mathrm{diag}(E_{z_1} \varphi, \ldots, E_{z_N} \varphi)$ when $E_z \varphi := E \varphi(z, \xi)$. In what follows, $D_\beta$, $D_R$ and $D_{\beta R}$ are defined analogously to $D_\varphi$.

Figure 1. Contour plot of $G_\beta$ under AR(1) discounting

5. Application: Stochastic Volatility and Mean Persistence

We showed in Examples 2.1, 2.2 and 3.1 that, in the setting of Benhabib et al. (2015), where the discount factor is constant and returns and labor income are iid, Assumptions 2.1–2.3 and Assumption 3.1 are all satisfied. Hence, by Theorems 2.2 and 3.1, the household optimization problem has a unique optimal policy and the wealth process under this policy has a stationary solution.
If, in addition, the support of $Y_t$ is finite or $Y_t$ has a positive density, say, then the conditions of Theorem 3.2 also hold and the stationary solution is ergodic, geometrically mixing and its time series averages are asymptotically normal.

Let us now bring the model closer to the data by relaxing the iid restrictions on financial and non-financial returns, introducing both mean persistence and time varying volatility in returns on assets. In particular, we set

$$\log R_t = \mu_t + \sigma_t \zeta_t, \tag{31}$$

where $\{\zeta_t\}$ is iid and standard normal and $\{\mu_t\}$ and $\{\sigma_t\}$ are finite-state Markov chains, discretized from

$$\mu_t = (1 - \rho_\mu)\bar\mu + \rho_\mu \mu_{t-1} + \delta_\mu \upsilon_t^\mu \quad \text{and} \quad \log \sigma_t = (1 - \rho_\sigma)\bar\sigma + \rho_\sigma \log \sigma_{t-1} + \delta_\sigma \upsilon_t^\sigma.$$

Footnote: The importance of these features for wealth dynamics was highlighted in Fagereng et al. (2016a).

Innovations are iid and standard normal. Using the data in Fagereng et al. (2016b) on Norwegian financial returns over 1993–2003, we estimate these AR(1) models to obtain $\bar\mu = 0.0281$, $\rho_\mu = 0.5722$, $\delta_\mu = 0.0067$, $\bar\sigma = -3.2556$, $\rho_\sigma = 0.2895$ and $\delta_\sigma = 0.1896$. Based on this calibration, the stationary mean and standard deviation of $\{R_t\}$ are around 1.03 and 4%, respectively.

To distinguish the effects of stochastic volatility and mean persistence, we consider two subsidiary models. The first reduces $\{\mu_t\}$ to its stationary mean $\bar\mu$, while the second reduces $\{\sigma_t\}$ to its stationary mean $\tilde\sigma := e^{\bar\sigma + \delta_\sigma^2 / 2(1 - \rho_\sigma^2)}$. In summary,

$$\log R_t = \bar\mu + \sigma_t \zeta_t \tag{Model I}$$

$$\log R_t = \mu_t + \tilde\sigma \zeta_t \tag{Model II}$$

We set $\beta = 0.95$ and $\gamma = 1.5$. To test the stability properties of Model I, we explore a neighborhood of the calibrated $(\rho_\sigma, \delta_\sigma)$ values, while in Model II, we do likewise for $(\rho_\mu, \delta_\mu)$ pairs. In each scenario, other parameters are fixed to the benchmark.

The results are shown in Figures 2 and 3. In part (a) of each figure, we see that $G_{\beta R}$ is increasing in the persistence and volatility parameters of the state process. The intuition behind this feature was explained in Section 4 for the case of $G_\beta$ and is similar here.
(Note that $G_{\beta R} = \beta G_R$ in the present case, since $\beta_t$ is a constant, so $G_{\beta R}$ has the same shape as $G_R$ in terms of contours.) The dots in the figures show that $G_{\beta R} < 1$ at the estimated parameter values.

Part (b) of each figure shows the set of parameters under which the model is globally stable and ergodic. The stability threshold is the boundary of the set of parameter pairs that produce $\max\{G_{\beta R}, \bar s, \bar s G_R\} < 1$, where $\bar s$ is given by (20). For such pairs, Assumptions 2.2 and 3.1 both hold, so the conditions of Theorems 3.1–3.2 are satisfied. (We are continuing to suppose that $Y_t$ is finite or has a positive density, so that Assumption 3.2 holds. Assumptions 2.1 and 2.3 are always valid in the current setting.) Observe that the estimated parameter values (dot points) lie inside the stable set.

Figure 2. Stability tests for Model I. (a) Contour plot of $G_{\beta R}$. (b) Range and threshold of stability. The estimated pair (0.2895, 0.1896) is marked in both panels.

Figure 3. Stability tests for Model II. (a) Contour plot of $G_{\beta R}$. (b) Range and threshold of stability. The estimated pair (0.5722, 0.0067) is marked in both panels.

6. Conclusion

We studied an updated version of the income fluctuation problem, the "common ancestor" of modern macroeconomic theory (Ljungqvist and Sargent (2012), p. 3). Working in a setting where returns on financial assets, non-financial income and impatience are all state dependent and fluctuate over time, we obtained conditions under which the household savings problem has a unique solution that can be computed by successive approximations and the wealth process under the optimal savings policy has a unique stationary distribution with Pareto right tail.
We also obtained conditions under which wealth is ergodic and exhibits geometric mixing and asymptotic normality. We investigated the nature of our conditions and provided methods for testing them in applications. While our work was motivated by the desire to better understand the joint distribution of income and wealth, the income fluctuation problem also has applications in asset pricing, life-cycle choice, fiscal policy, monetary policy, optimal taxation, and social security. The ideas contained in this paper should be helpful for those fields after suitable modifications or extensions.

Appendix A. Preliminaries

Given a topological space $T$, let $\mathscr B(T)$ be the Borel $\sigma$-algebra and $\mathscr P(T)$ be the probability measures on $\mathscr B(T)$. A stochastic kernel $\Pi$ on $T$ is a map $\Pi : T \times \mathscr B(T) \to [0,1]$ such that $x \mapsto \Pi(x, B)$ is $\mathscr B(T)$-measurable for each $B \in \mathscr B(T)$ and $B \mapsto \Pi(x, B)$ is a probability measure on $\mathscr B(T)$ for each $x \in T$. For all $t \in \mathbb N$, $x, y \in T$ and $B \in \mathscr B(T)$, we define $\Pi^1 := \Pi$ and $\Pi^t(x, B) := \int \Pi^{t-1}(y, B) \, \Pi(x, dy)$. Furthermore, for all $\mu \in \mathscr P(T)$, let $(\mu \Pi^t)(B) := \int \Pi^t(x, B) \, \mu(dx)$. $\Pi$ is called Feller if $x \mapsto \int h(y) \, \Pi(x, dy)$ is continuous on $T$ whenever $h$ is bounded and continuous on $T$. We call $\psi \in \mathscr P(T)$ stationary for $\Pi$ if $\psi \Pi = \psi$.

A sequence $\{\mu_n\} \subset \mathscr P(T)$ is called tight if, for all $\varepsilon > 0$, there exists a compact $K \subset T$ such that $\mu_n(T \setminus K) \le \varepsilon$ for all $n$. A stochastic kernel $\Pi$ is called bounded in probability if the sequence $\{\Pi^t(x, \cdot)\}_{t \ge 0}$ is tight for all $x \in T$. Given $\mu \in \mathscr P(T)$, we define the total variation norm $\|\mu\|_{TV} := \sup_{g : |g| \le 1} \int g \, d\mu$. Given any measurable map $V : T \to [1, \infty)$, we say that $\Pi$ is $V$-geometrically mixing if there exist constants $M < \infty$ and $\lambda < 1$ such that, for all $x \in T$ and $t \ge 0$, the corresponding Markov process $\{X_t\}$ satisfies

$$\sup_{k \ge 0; \; h^2, g^2 \le V} \left| E_x \, g(X_t) h(X_{t+k}) - [E_x \, g(X_t)] \, [E_x \, h(X_{t+k})] \right| \le \lambda^t M V(x).$$

Below we use $(\Omega, \mathscr F, \mathbb P)$ to denote a fixed probability space on which all random variables are defined. $E$ denotes expectation with respect to $\mathbb P$.
The state process $\{Z_t\}$ and the innovation processes $\{\varepsilon_t\}$, $\{\zeta_t\}$ and $\{\eta_t\}$ introduced in (5) live on this space. In what follows, $\{Z_t\}$ is a stationary version of the chain, where $Z_0$ is drawn from its unique stationary distribution, henceforth denoted $\pi_Z$. The marginal distributions of the innovations are denoted by $\pi_\varepsilon$, $\pi_\zeta$ and $\pi_\eta$ respectively. We let $\{\mathscr F_t\}$ be the natural filtration generated by $\{Z_t\}$ and the three innovation processes. $\mathbb P_z$ conditions on $Z_0 = z$ and $E_z$ is expectation under $\mathbb P_z$.

We first prove Lemma 4.1, since its implications will be used immediately below. In the proof, we consider the matrix $L_\varphi$ as a linear operator on $\mathbb R^{\mathsf Z}$ and identify vectors in $\mathbb R^{\mathsf Z}$ with real-valued functions on $\mathsf Z$.

Proof of Lemma 4.1. A proof by induction confirms that, for any function $h \in \mathbb R^{\mathsf Z}$,

$$L_\varphi^n h(z) = E_z \prod_{t=1}^n \varphi_t \, h(Z_n), \tag{32}$$

where $L_\varphi^n$ is the $n$-th composition of the operator $L_\varphi$ with itself (or, equivalently, the $n$-th power of the matrix $L_\varphi$). The positivity of $L_\varphi$ and Theorem 9.1 of Krasnosel'skii et al. (2012) imply that $r(L_\varphi) = \lim_{n \to \infty} \|L_\varphi^n h\|^{1/n}$ when $\|\cdot\|$ is any norm on $\mathbb R^{\mathsf Z}$ and $h$ is everywhere positive on $\mathsf Z$. With $h \equiv 1$ and $\|f\| = E|f(Z_0)|$, this becomes

$$r(L_\varphi) = \lim_{n \to \infty} \left( E L_\varphi^n \mathbb 1(Z_0) \right)^{1/n} = \lim_{n \to \infty} \left( E \, E_{Z_0} \prod_{t=1}^n \varphi_t \right)^{1/n} = \lim_{n \to \infty} \left( E \prod_{t=1}^n \varphi_t \right)^{1/n}, \tag{33}$$

where the second equality is due to (32) and $h \equiv 1$ and the third is by the law of iterated expectations.

Lemma A.1. Let $\{\varphi_t\}$ and $G_\varphi$ be as defined in Lemma 4.1. If $G_\varphi < 1$, then there exists an $N$ in $\mathbb N$ and a $\delta < 1$ such that $\max_{z \in \mathsf Z} E_z \prod_{t=1}^n \varphi_t < \delta^n$ whenever $n \ge N$.

Proof. Recalling from the proof of Lemma 4.1 that $r(L_\varphi) = \lim_{n \to \infty} \|L_\varphi^n h\|^{1/n}$ when $\|\cdot\|$ is any norm on $\mathbb R^{\mathsf Z}$ and $h$ is everywhere positive on $\mathsf Z$, we can again take $h \equiv 1$ but now switch to $\|f\| = \max_{z \in \mathsf Z} |f(z)|$, so that (33) becomes

$$r(L_\varphi) = \lim_{n \to \infty} \left( \max_{z \in \mathsf Z} L_\varphi^n \mathbb 1(z) \right)^{1/n} = \lim_{n \to \infty} \left( \max_{z \in \mathsf Z} E_z \prod_{t=1}^n \varphi_t \right)^{1/n}. \tag{34}$$

Since $r(L_\varphi) = G_\varphi$ and $G_\varphi < 1$, the claim in Lemma A.1 now follows.

Appendix B. Proof of Section 2 Results

Proof of Proposition 2.1. Pick any $a \ge 0$ and $z \in \mathsf Z$.
Since $c_t = Y_t$ for all $t$ is dominated by a feasible consumption path, monotonicity of $u$ and the law of iterated expectations give

$$\max E_{a,z} \sum_{t=0}^\infty \prod_{i=0}^t \beta_i \, u(c_t) \ge E_z \sum_{t=0}^\infty \prod_{i=0}^t \beta_i \, u(Y_t) = \sum_{t=0}^\infty E_z \prod_{i=0}^t \beta_i \, h(Z_t),$$

where $h(Z_t) := E_{Z_t} u(Y_t)$ and the monotone convergence theorem has been employed to pass the expectation through the sum. In view of (32) and $\beta_0 = 1$, we then have

$$\max E_{a,z} \sum_{t=0}^\infty \prod_{i=0}^t \beta_i \, u(c_t) \ge \sum_{t=0}^\infty L_\beta^t h(z). \tag{35}$$

By the assumed almost sure positivity of $\beta_t$ and the irreducibility of $P$, the matrix $L_\beta$ is irreducible. Hence, by the Perron–Frobenius theorem, we can choose an everywhere positive eigenfunction $e$ such that $L_\beta e = r(L_\beta) e$. By the everywhere positivity of $u(Y_t)$, the function $h$ is everywhere positive on $\mathsf Z$, and hence we can choose $\alpha > 0$ such that $e_\alpha := \alpha e$ is less than $h$ pointwise on $\mathsf Z$. We then have

$$\sum_{t=0}^\infty L_\beta^t h(z) \ge \sum_{t=0}^\infty L_\beta^t e_\alpha(z) = \alpha \sum_{t=0}^\infty r(L_\beta)^t e(z).$$

By Lemma 4.1 we know that $r(L_\beta) \ge 1$, and since $\alpha$ and $e$ are positive, this expression is infinite. Returning to (35), we see that the value function is infinite at our arbitrarily chosen pair $(a,z)$.

For the rest of this section we suppose that Assumptions 2.1–2.3 hold.

Lemma B.1. $M_1 := \max_{z \in \mathsf Z} E_z \sum_{t=0}^\infty \prod_{i=1}^t \beta_i$ and $M_2 := \max_{z \in \mathsf Z} E_z \sum_{t=0}^\infty \prod_{i=1}^t \beta_i R_i$ are finite, as are the constants $M_3 = \max_{z \in \mathsf Z} E_z Y$ and $M_4 = \max_{z \in \mathsf Z} E_z u'(Y)$.

Proof. That $M_1$ and $M_2$ are finite follows directly from Lemma A.1, with $\varphi_t = \beta_t$ and $\varphi_t = \beta_t R_t$ respectively. Regarding $M_3$, Assumption 2.3 states that $E Y_t < \infty$. By the Law of Iterated Expectations, we can write this as $\sum_{z \in \mathsf Z} E_z Y \, \pi_Z(z) < \infty$. As $\{Z_t\}$ is irreducible, we know that $\pi_Z$ is positive everywhere on $\mathsf Z$. Hence, $M_3 < \infty$ must hold. The proof of $M_4 < \infty$ is similar.

Lemma B.2. For the maximal asset path $\{\tilde a_t\}$ defined by

$$\tilde a_{t+1} = R_{t+1} \tilde a_t + Y_{t+1} \quad \text{and} \quad (\tilde a_0, \tilde z_0) = (a, z) \text{ given}, \tag{36}$$

we have, for each $(a,z) \in S_0$, that $M(a,z) := \sum_{t=0}^\infty E_{a,z} \prod_{i=0}^t \beta_i \, \tilde a_t < \infty$.

Proof.
Iterating backward on (36), we can show that $\tilde a_t = \prod_{i=1}^t R_i \, a + \sum_{j=1}^t Y_j \prod_{i=j+1}^t R_i$. Taking expectation yields

$$E_{a,z} \prod_{i=0}^t \beta_i \, \tilde a_t = E_z \prod_{i=1}^t \beta_i R_i \, a + \sum_{j=1}^t E_z \prod_{i=j+1}^t \beta_i R_i \prod_{k=0}^j \beta_k \, Y_j.$$

Then the Monotone Convergence Theorem and the Markov property imply that

$$M(a,z) = \sum_{t=0}^\infty E_z \prod_{i=1}^t \beta_i R_i \, a + \sum_{t=0}^\infty \sum_{j=1}^t E_z \prod_{i=j+1}^t \beta_i R_i \prod_{k=0}^j \beta_k \, Y_j$$

$$= \sum_{t=0}^\infty E_z \prod_{i=1}^t \beta_i R_i \, a + \sum_{j=1}^\infty \sum_{i=0}^\infty E_z \prod_{k=0}^j \beta_k \, Y_j \prod_{\ell=1}^i \beta_{j+\ell} R_{j+\ell}$$

$$= \sum_{t=0}^\infty E_z \prod_{i=1}^t \beta_i R_i \, a + \sum_{j=1}^\infty E_z \prod_{k=0}^j \beta_k \, Y_j \, E_{Z_j} \sum_{i=0}^\infty \prod_{\ell=1}^i \beta_\ell R_\ell.$$

By Lemma B.1, we now have, for all $(a,z) \in S_0$,

$$M(a,z) \le M_2 a + M_2 \sum_{t=1}^\infty E_z \prod_{i=0}^t \beta_i \, Y_t = M_2 a + M_2 \sum_{t=1}^\infty E_z \prod_{i=0}^t \beta_i \, E_{Z_t} Y.$$

Applying Lemma B.1 again gives $M(a,z) < \infty$, as was to be shown.

Proposition B.1. The value $V_c(a,z)$ in (8) is well-defined in $\{-\infty\} \cup \mathbb R$.

Proof. By the assumptions on the utility function, there exists a constant $B \in \mathbb R$ such that $u(c) \le c + B$, and hence

$$V_c(a,z) \le \sum_{t=0}^\infty E_{a,z} \prod_{i=0}^t \beta_i \, u(\tilde a_t) \le M(a,z) + B \sum_{t=0}^\infty E_z \prod_{i=0}^t \beta_i.$$

The last term is finite by Lemma A.1.

Proof of Theorem 2.1. The proof is a long but relatively straightforward extension of Theorem 1 of Benhabib et al. (2015) and thus omitted. A full proof is available from the authors upon request.

Proposition B.2. $(\mathscr C, \rho)$ is a complete metric space.

Proof. The proof is a straightforward extension of Proposition 4.1 of Li and Stachurski (2014) and thus omitted. A full proof is available from the authors upon request.

Proof of Proposition 2.2. Let $c$ be a policy in $\mathscr C$ satisfying (10). To show that any asset path generated by $c$ satisfies the transversality condition (11), observe that, by condition (12), we have

$$c \in \mathscr C \implies \exists M \in \mathbb R_+ \ \text{s.t.} \ u'(a) \le (u' \circ c)(a,z) \le u'(a) + M, \quad \forall (a,z) \in S_0, \tag{37}$$

and therefore

$$E_{a,z} \prod_{i=0}^t \beta_i \, (u' \circ c)(a_t, Z_t) \, a_t \le E_{a,z} \prod_{i=0}^t \beta_i \, u'(a_t) a_t + M E_{a,z} \prod_{i=0}^t \beta_i \, a_t. \tag{38}$$

Regarding the first term on the right hand side of (38), fix $A > 0$ and observe that

$$u'(a_t) a_t = u'(a_t) a_t \mathbb 1\{a_t \le A\} + u'(a_t) a_t \mathbb 1\{a_t > A\} \le A u'(a_t) + u'(A) a_t \le A u'(Y_t) + u'(A) \tilde a_t$$

with probability one, where $\tilde a_t$ is the maximal path defined in (36). We then have

$$E_{a,z} \prod_{i=0}^t \beta_i \, u'(a_t) a_t \le A E_z \prod_{i=0}^t \beta_i \, u'(Y_t) + u'(A) E_{a,z} \prod_{i=0}^t \beta_i \, \tilde a_t. \tag{39}$$

By Lemma B.1, we have

$$A E_z \prod_{i=0}^t \beta_i \, u'(Y_t) = A E_z \prod_{i=0}^t \beta_i \, E_{Z_t} u'(Y_t) \le M_4 A E_z \prod_{i=0}^t \beta_i,$$

and the last expression converges to zero as $t \to \infty$ by Lemma A.1. The second term in (39) also converges to zero by Lemma B.2. Hence $E_{a,z} \prod_{i=0}^t \beta_i \, u'(a_t) a_t \to 0$ as $t \to \infty$, which, combined with (38) and another application of Lemma B.2, gives our desired result.

Proposition B.3. For all $c \in \mathscr C$ and $(a,z) \in S_0$, there exists a unique $\xi \in (0, a]$ that solves (14).

Proof. Fix $c \in \mathscr C$ and $(a,z) \in S_0$. Because $c \in \mathscr C$, the map $\xi \mapsto \psi_c(\xi, a, z)$ is increasing. Since $\xi \mapsto u'(\xi)$ is strictly decreasing, the equation (14) can have at most one solution. Hence uniqueness holds.

Existence follows from the intermediate value theorem provided we can show that

(a) $\xi \mapsto \psi_c(\xi, a, z)$ is a continuous function,
(b) $\exists \xi \in (0, a]$ such that $u'(\xi) \ge \psi_c(\xi, a, z)$, and
(c) $\exists \xi \in (0, a]$ such that $u'(\xi) \le \psi_c(\xi, a, z)$.

For part (a), it suffices to show that $g(\xi) := E_z \hat\beta \hat R \, (u' \circ c)[\hat R(a - \xi) + \hat Y, \hat Z]$ is continuous on $(0, a]$. To this end, fix $\xi \in (0, a]$ and $\xi_n \to \xi$. By (37) we have

$$\hat\beta \hat R \, (u' \circ c)[\hat R(a - \xi_n) + \hat Y, \hat Z] \le \hat\beta \hat R \, (u' \circ c)(\hat Y, \hat Z) \le \hat\beta \hat R \, u'(\hat Y) + \hat\beta \hat R M. \tag{40}$$

The last term is integrable, as follows easily from Lemma B.1. Hence the dominated convergence theorem applies. From this fact and the continuity of $c$, we obtain $g(\xi_n) \to g(\xi)$. Hence, $\xi \mapsto \psi_c(\xi, a, z)$ is continuous.

Part (b) clearly holds, since $u'(\xi) \to \infty$ as $\xi \to 0$ and $\xi \mapsto$
$\psi_c(\xi, a, z)$ is increasing and always finite (since it is continuous as shown in the previous paragraph). Part (c) is also trivial (just set $\xi = a$).

Proposition B.4. We have $Tc \in \mathscr C$ for all $c \in \mathscr C$.

Proof. Fix $c \in \mathscr C$ and let $g(\xi, a, z) := E_z \hat\beta \hat R \, (u' \circ c)[\hat R(a - \xi) + \hat Y, \hat Z]$.

Step 1. We show that $Tc$ is continuous. To apply a standard fixed point parametric continuity result such as Theorem B.1.4 of Stachurski (2009), we first show that $\psi_c$ is jointly continuous on the set $G$ defined in (15). This will be true if $g$ is jointly continuous on $G$. For any $\{(\xi_n, a_n, z_n)\}$ and $(\xi, a, z)$ in $G$ with $(\xi_n, a_n, z_n) \to (\xi, a, z)$, we need to show that $g(\xi_n, a_n, z_n) \to g(\xi, a, z)$. To that end, we define

$$h_1(\xi, a, \hat Z, \hat\varepsilon, \hat\zeta, \hat\eta), \ h_2(\xi, a, \hat Z, \hat\varepsilon, \hat\zeta, \hat\eta) := \hat\beta \hat R [u'(\hat Y) + M] \pm \hat\beta \hat R \, (u' \circ c)[\hat R(a - \xi) + \hat Y, \hat Z],$$

where $\hat\beta := \beta(\hat Z, \hat\varepsilon)$, $\hat R := R(\hat Z, \hat\zeta)$ and $\hat Y := Y(\hat Z, \hat\eta)$ as defined in (5). Then $h_1$ and $h_2$ are continuous in $(\xi, a, \hat Z)$ by the continuity of $c$ and nonnegative by (40). By Fatou's lemma and Theorem 1.1 of Feinberg et al. (2014),

$$\iiint \sum_{\hat z \in \mathsf Z} h_i(\xi, a, \hat z, \hat\varepsilon, \hat\zeta, \hat\eta) P(z, \hat z) \, \pi_\varepsilon(d\hat\varepsilon) \pi_\zeta(d\hat\zeta) \pi_\eta(d\hat\eta)$$

$$\le \iiint \liminf_{n \to \infty} \sum_{\hat z \in \mathsf Z} h_i(\xi_n, a_n, \hat z, \hat\varepsilon, \hat\zeta, \hat\eta) P(z_n, \hat z) \, \pi_\varepsilon(d\hat\varepsilon) \pi_\zeta(d\hat\zeta) \pi_\eta(d\hat\eta)$$

$$\le \liminf_{n \to \infty} \iiint \sum_{\hat z \in \mathsf Z} h_i(\xi_n, a_n, \hat z, \hat\varepsilon, \hat\zeta, \hat\eta) P(z_n, \hat z) \, \pi_\varepsilon(d\hat\varepsilon) \pi_\zeta(d\hat\zeta) \pi_\eta(d\hat\eta).$$

This implies that

$$\liminf_{n \to \infty} \pm E_{z_n} \hat\beta \hat R \, (u' \circ c)[\hat R(a_n - \xi_n) + \hat Y, \hat Z] \ge \pm E_z \hat\beta \hat R \, (u' \circ c)[\hat R(a - \xi) + \hat Y, \hat Z].$$

The function $g$ is then continuous, since the above inequality is equivalent to the statement $\liminf_{n \to \infty} g(\xi_n, a_n, z_n) \ge g(\xi, a, z) \ge \limsup_{n \to \infty} g(\xi_n, a_n, z_n)$. Hence, $\psi_c$ is continuous on $G$, as was to be shown. Moreover, since $\xi \mapsto \psi_c(\xi, a, z)$ takes values in the closed interval

$$I(a,z) := \left[ u'(a), \ u'(a) + E_z \hat\beta \hat R (u'(\hat Y) + M) \right],$$

and the correspondence $(a,z) \mapsto I(a,z)$ is nonempty, compact-valued and continuous, Theorem B.1.4 of Stachurski (2009) then implies that $Tc$ is continuous on $S_0$.

Step 2. We show that $Tc$ is increasing in $a$.
Suppose that for some $z \in \mathsf Z$ and $a_1, a_2 \in (0, \infty)$ with $a_1 < a_2$, we have $\xi_1 := Tc(a_1, z) > Tc(a_2, z) =: \xi_2$. Since $c$ is increasing in $a$ by assumption, $\psi_c$ is increasing in $\xi$ and decreasing in $a$. Then $u'(\xi_1) < u'(\xi_2) = \psi_c(\xi_2, a_2, z) \le \psi_c(\xi_1, a_1, z) = u'(\xi_1)$. This is a contradiction.

Step 3. We have shown in Proposition B.3 that $Tc(a,z) \in (0, a]$ for all $(a,z) \in S_0$.

Step 4. We show that $\|u' \circ (Tc) - u'\| < \infty$. Since $u'[Tc(a,z)] \ge u'(a)$, we have

$$|u'[Tc(a,z)] - u'(a)| = u'[Tc(a,z)] - u'(a) \le E_z \hat\beta \hat R \, (u' \circ c)(\hat R[a - Tc(a,z)] + \hat Y, \hat Z) \le E_z \hat\beta \hat R [u'(\hat Y) + M]$$

for all $(a,z) \in S_0$. The right hand side is easily shown to be finite via Lemma B.1.

To prove Theorem 2.2, let $\mathscr H$ be the set of all continuous functions $h : S_0 \to \mathbb R$ that are decreasing in their first argument and such that $(a,z) \mapsto h(a,z) - u'(a)$ is bounded and nonnegative. Given $h \in \mathscr H$, let $\tilde T h$ be the function mapping $(a,z) \in S_0$ into the $\kappa$ that solves

$$\kappa = \max\{ E_z \hat\beta \hat R \, h(\hat R[a - (u')^{-1}(\kappa)] + \hat Y, \hat Z), \ u'(a) \}. \tag{41}$$

Moreover, consider the bijection $H : \mathscr C \to \mathscr H$ defined by $Hc := u' \circ c$.

Lemma B.3. The operator $\tilde T$ maps $\mathscr H$ into $\mathscr H$ and satisfies $\tilde T H = H T$ on $\mathscr C$.

Proof. Pick any $c \in \mathscr C$ and $(a,z) \in S_0$. Let $\xi := Tc(a,z)$, then $\xi$ solves

$$u'(\xi) = \max\{ E_z \hat\beta \hat R \, (u' \circ c)[\hat R(a - \xi) + \hat Y, \hat Z], \ u'(a) \}. \tag{42}$$

We need to show that $HTc$ and $\tilde T H c$ evaluate to the same number at $(a,z)$. In other words, we need to show that $u'(\xi)$ is the solution to

$$\kappa = \max\{ E_z \hat\beta \hat R \, (u' \circ c)(\hat R[a - (u')^{-1}(\kappa)] + \hat Y, \hat Z), \ u'(a) \}.$$

But this is immediate from (42). Hence, we have shown that $\tilde T H = H T$ on $\mathscr C$. Since $H : \mathscr C \to \mathscr H$ is a bijection, we have $\tilde T = H T H^{-1}$. Since in addition $T : \mathscr C \to \mathscr C$ by Proposition B.4, we have $\tilde T : \mathscr H \to \mathscr H$. This concludes the proof.

Lemma B.4. $\tilde T$ is order preserving on $\mathscr H$. That is, $\tilde T h_1 \le \tilde T h_2$ for all $h_1, h_2 \in \mathscr H$ with $h_1 \le h_2$.

Proof. Let $h_1, h_2$ be functions in $\mathscr H$ with $h_1 \le h_2$. Suppose to the contrary that there exists $(a,z) \in S_0$ such that $\kappa_1 := \tilde T h_1(a,z) > \tilde T h_2(a,z) =: \kappa_2$.
Since functions in $\mathscr H$ are decreasing in the first argument, we have

$$\kappa_1 = \max\{ E_z \hat\beta \hat R \, h_1(\hat R[a - (u')^{-1}(\kappa_1)] + \hat Y, \hat Z), \ u'(a) \}$$

$$\le \max\{ E_z \hat\beta \hat R \, h_2(\hat R[a - (u')^{-1}(\kappa_1)] + \hat Y, \hat Z), \ u'(a) \}$$

$$\le \max\{ E_z \hat\beta \hat R \, h_2(\hat R[a - (u')^{-1}(\kappa_2)] + \hat Y, \hat Z), \ u'(a) \} = \kappa_2.$$

This is a contradiction. Hence, $\tilde T$ is order preserving.

Lemma B.5. There exists an $n \in \mathbb N$ and $\theta < 1$ such that $\tilde T^n$ is a contraction mapping of modulus $\theta$ on $(\mathscr H, d_\infty)$.

Proof. Since $\tilde T$ is order preserving and $\mathscr H$ is closed under the addition of nonnegative constants, based on Blackwell (1965), it remains to verify the existence of $n \in \mathbb N$ and $\theta < 1$ such that $\tilde T^n(h + \gamma) \le \tilde T^n h + \theta \gamma$ for all $h \in \mathscr H$ and $\gamma \ge 0$. By Lemma A.1 and Assumption 2.2, it suffices to show that for all $k \in \mathbb N$ and $(a,z) \in S_0$, we have

$$\tilde T^k(h + \gamma)(a,z) \le \tilde T^k h(a,z) + \gamma E_z \prod_{i=1}^k \beta_i R_i. \tag{43}$$

Fix $h \in \mathscr H$, $\gamma \ge 0$, and let $h_\gamma(a,z) := h(a,z) + \gamma$. By the definition of $\tilde T$, we have

$$\tilde T h_\gamma(a,z) = \max\{ E_z \hat\beta \hat R \, h_\gamma(\hat R[a - (u')^{-1}(\tilde T h_\gamma(a,z))] + \hat Y, \hat Z), \ u'(a) \}$$

$$\le \max\{ E_z \hat\beta \hat R \, h(\hat R[a - (u')^{-1}(\tilde T h_\gamma(a,z))] + \hat Y, \hat Z), \ u'(a) \} + \gamma E_z \beta_1 R_1$$

$$\le \max\{ E_z \hat\beta \hat R \, h(\hat R[a - (u')^{-1}(\tilde T h(a,z))] + \hat Y, \hat Z), \ u'(a) \} + \gamma E_z \beta_1 R_1.$$

Here, the first inequality is elementary and the second is due to the fact that $h \le h_\gamma$ and $\tilde T$ is order preserving. Hence, $\tilde T(h + \gamma)(a,z) \le \tilde T h(a,z) + \gamma E_z \beta_1 R_1$ and (43) holds for $k = 1$. Suppose (43) holds for arbitrary $k$. It remains to show that it holds for $k + 1$. For $z \in \mathsf Z$, define $f(z) := \gamma E_z \beta_1 R_1 \cdots \beta_k R_k$.
By the induction hypothesis, the monotonicity of $\tilde T$ and the Markov property,

$$\tilde T^{k+1} h_\gamma(a,z) = \max\{ E_z \hat\beta \hat R \, (\tilde T^k h_\gamma)(\hat R[a - (u')^{-1}(\tilde T^{k+1} h_\gamma(a,z))] + \hat Y, \hat Z), \ u'(a) \}$$

$$\le \max\{ E_z \hat\beta \hat R \, (\tilde T^k h + f)(\hat R[a - (u')^{-1}(\tilde T^{k+1} h_\gamma(a,z))] + \hat Y, \hat Z), \ u'(a) \}$$

$$\le \max\{ E_z \hat\beta \hat R \, (\tilde T^k h)(\hat R[a - (u')^{-1}(\tilde T^{k+1} h_\gamma(a,z))] + \hat Y, \hat Z), \ u'(a) \} + E_z \beta_1 R_1 f(Z_1)$$

$$\le \max\{ E_z \hat\beta \hat R \, (\tilde T^k h)(\hat R[a - (u')^{-1}(\tilde T^{k+1} h(a,z))] + \hat Y, \hat Z), \ u'(a) \} + \gamma E_z \beta_1 R_1 E_{Z_1} \beta_1 R_1 \cdots \beta_k R_k$$

$$= \tilde T^{k+1} h(a,z) + \gamma E_z \beta_1 R_1 \cdots \beta_{k+1} R_{k+1}.$$

Hence, (43) is verified by induction. This concludes the proof.

Proof of Theorem 2.2. Let $n$ and $\theta$ be as in Lemma B.5. In view of Propositions 2.2, B.2 and B.4, to show that $T^n$ is a contraction and verify claims (1)–(3) of Theorem 2.2, based on the Banach contraction mapping theorem, it suffices to show that $\rho(T^n c, T^n d) \le \theta \rho(c, d)$ for all $c, d \in \mathscr C$. To this end, pick any $c, d \in \mathscr C$. Note that the topological conjugacy result established in Lemma B.3 implies that $\tilde T = H T H^{-1}$. Hence, $\tilde T^n = (H T H^{-1}) \cdots (H T H^{-1}) = H T^n H^{-1}$ and $\tilde T^n H = H T^n$. By the definition of $\rho$ and the contraction property established in Lemma B.5,

$$\rho(T^n c, T^n d) = d_\infty(H T^n c, H T^n d) = d_\infty(\tilde T^n H c, \tilde T^n H d) \le \theta \, d_\infty(Hc, Hd) = \theta \rho(c, d).$$

Hence, $T^n$ is a contraction and claims (1)–(3) are verified.

Our next goal is to prove Proposition 2.3. To begin with, we define

$$\mathscr C_0 = \{ c \in \mathscr C : a \mapsto a - c(a,z) \text{ is increasing for all } z \in \mathsf Z \}.$$

Lemma B.6. $\mathscr C_0$ is a closed subset of $\mathscr C$, and $Tc \in \mathscr C_0$ for all $c \in \mathscr C_0$.

Proof. To see that $\mathscr C_0$ is closed, for a given sequence $\{c_n\}$ in $\mathscr C_0$ and $c \in \mathscr C$ with $\rho(c_n, c) \to 0$, we need to show that $c \in \mathscr C_0$. This obviously holds since $a \mapsto a - c_n(a,z)$ is increasing for all $n$, and, in addition, $\rho(c_n, c) \to 0$ implies that $c_n(a,z) \to c(a,z)$ for all $(a,z) \in S_0$.

Fix $c \in \mathscr C_0$. We now show that $\xi := Tc \in \mathscr C_0$. Since $\xi \in \mathscr C$ by Proposition B.4, it remains to show that $a \mapsto a - \xi(a,z)$ is increasing.
Suppose the claim is false, then there exist $z \in \mathsf Z$ and $a_1, a_2 \in (0, \infty)$ such that $a_1 < a_2$ and $a_1 - \xi(a_1, z) > a_2 - \xi(a_2, z)$. Since $a_1 - \xi(a_1, z) \ge 0$, $a_2 - \xi(a_2, z) \ge 0$ and $\xi(a_1, z) \le \xi(a_2, z)$ by Proposition B.4, we have $\xi(a_1, z) < a_1$ and $\xi(a_1, z) < \xi(a_2, z)$. However, based on the property of the time iteration operator, we then have

$$(u' \circ \xi)(a_1, z) = E_z \hat\beta \hat R \, (u' \circ c)(\hat R[a_1 - \xi(a_1, z)] + \hat Y, \hat Z) \le E_z \hat\beta \hat R \, (u' \circ c)(\hat R[a_2 - \xi(a_2, z)] + \hat Y, \hat Z) \le (u' \circ \xi)(a_2, z),$$

which implies that $\xi(a_1, z) \ge \xi(a_2, z)$. This is a contradiction. Hence, $a \mapsto a - \xi(a,z)$ is increasing, and $T$ is a self-map on $\mathscr C_0$.

Proof of Proposition 2.3. Since $T$ maps elements of the closed subset $\mathscr C_0$ into itself by Lemma B.6, Theorem 2.2 implies that $c^* \in \mathscr C_0$. Hence, the stated claims hold.

Proof of Proposition 2.4. Let $T_j$ be the time iteration operator for the income process $j$ established in Proposition B.4. It suffices to show $T_1 c \le T_2 c$ for all $c \in \mathscr C$. To see this, note that by Lemma B.4, we have $T_j c_1 \le T_j c_2$ whenever $c_1 \le c_2$. Therefore if $T_1 c \le T_2 c$ for all $c \in \mathscr C$, we obtain $T_1 c_1 \le T_1 c_2 \le T_2 c_2$. Iterating this starting from any $c \in \mathscr C$, by Theorem 2.2, it follows that $c_1^* = \lim_{n \to \infty} (T_1)^n c \le \lim_{n \to \infty} (T_2)^n c = c_2^*$, completing the proof.

To show that $T_1 c \le T_2 c$ for any $c \in \mathscr C$, take any $(a,z) \in S_0$ and define $\xi_j = (T_j c)(a,z)$. To show $\xi_1 \le \xi_2$, suppose on the contrary that $\xi_1 > \xi_2$. Since $c$ is increasing in $a$ and $u'' < 0$ (hence $u'$ is decreasing), it follows from the definition of the time iteration operator in (14)–(16), $Y_1 \le Y_2$, $u'' < 0$ and the monotonicity of $c \in \mathscr C$ that

$$u'(\xi_2) > u'(\xi_1) = \max\{ E_z \hat\beta \hat R \, (u' \circ c)[\hat R(a - \xi_1) + \hat Y_1, \hat Z], \ u'(a) \} \ge \max\{ E_z \hat\beta \hat R \, (u' \circ c)[\hat R(a - \xi_2) + \hat Y_2, \hat Z], \ u'(a) \} = u'(\xi_2),$$

which is a contradiction.

To prove Proposition 2.5, we need several lemmas.

Lemma B.7. For all $c \in \mathscr C_0$, there exists a threshold $\bar a_c(z)$ such that $Tc(a,z) = a$ if and only if $a \le \bar a_c(z)$.
In particular, there exists a threshold a (z) such that c (a; z) = a if and only if a  a (z). Proof. Recall that, for all c 2 C , (a; z) := Tc(a; z) solves 0 0 0 ^ ^ ^ ^ ^ (u  ) (a; z) = maxfE R (u  c) (R [a (a; z)] + Y ; Z ); u (a)g: (44) For each z 2 Z and c 2 C , de ne 0 0 ^ ^ ^ ^ a  (z) := (u ) [E R (u  c) (Y ; Z )] and a (z) := a   (z): (45) c z c To prove the rst claim, by Lemma B.6, it suces to show that (a; z) < a implies a > a  (z). This obviously holds since in view of (44), the former implies that 0 0 0 0 ^ ^ ^ ^ ^ ^ ^ ^ ^ u (a) < E R (u  c) (R [a (a; z)] + Y ; Z )  E R (u  c) (Y ; Z ) = u [a  (z)]; z z c which then yields a > a  (z). The second claim follows immediately from the rst claim and the fact that c 2 C is the unique xed point of T in C . Consider a subset C de ned by C := fc 2 C : a 7! c(a; z) is concave for all z 2 Zg. 1 1 0 Lemma B.8. C is a closed subset of C and C , and, Tc 2 C for all c 2 C . 1 0 1 1 34 Proof. The rst claim is immediate because limits of concave functions are concave. To prove the second claim, x c 2 C . We have Tc 2 C by Lemma B.6. It remains to 1 0 show that a 7! (a; z) := Tc(a; z) is concave for all z 2 Z. Given z 2 Z, Lemma B.7 implies that (a; z) = a for a  a  (z) and that (a; z) < a for a > a  (z). Since in c c addition a 7! (a; z) is continuous and increasing, to show the concavity of  with respect to a, it suces to show that a 7! (a; z) is concave on (a  (z);1). Suppose there exist some z 2 Z, 2 [0; 1], and a ; a 2 (a  (z);1) such that 1 2 c ((1 )a + a ; z) < (1 )(a ; z) + (a ; z): (46) 1 2 1 2 ^ ^ ^ ^ Let h(a; z; ! ^ ) := R [a (a; z)] + Y , where ! ^ := (R; Y ). Then by Lemma B.7 and noting that consumption is interior, we have 0 0 ^ ^ ^ (u  ) ((1 )a + a ; z) = E R (u  c)fh[(1 )a + a ; z; ! ^ ]; Zg 1 2 z 1 2 ^ ^ ^ E R (u  c) [(1 )h(a ; z; ! ^ ) + h(a ; z; ! ^ ); Z ]: z 1 2 Using condition (17) then yields 0 1 0 ^ ^ ^ ((1 )a + a ; z)  (u ) fE R(u  c)[(1 )h(a ; z; ! ^ ) + h(a ; z; ! 
^ ); Z ]g 1 2 z 1 2 0 1 0 0 1 0 ^ ^ ^ ^ ^ ^ (1 )(u ) fE R(u  c)[h(a ; z; ! ^ ); Z ]g + (u ) fE R(u  c)[h(a ; z; ! ^ ); Z ]g z 1 z 2 0 1 0 0 1 0 = (1 )(u ) f(u  )(a ; z)g + (u ) f(u  )(a ; z)g = (1 )(a ; z) + (a ; z); 1 2 1 2 which contradicts (46). Hence, a 7! (a; z) is concave for all z 2 Z. Proof of Proposition 2.5. By Theorem 2.2, T : C ! C is a contraction mapping with unique xed point c . Since C is a closed subset of C and TC  C by Lemma B.8, 1 1 1 we know that c 2 C . The rst claim is veri ed. Regarding the second claim, note that c 2 C implies that a 7! c (a; z) is increasing and concave for all z 2 Z. Hence, a 7! c (a; z)=a is a decreasing function for all z 2 Z. Since 0  c (a; z)  a for all (a; z) 2 S , (z) := lim c (a; z)=a is well-de ned and (z) 2 [0; 1]. 0 a!1 Proof of Remark 2.1. For each c in C concave in its rst argument, let h (x; ! ^ ) := ^ ^ ^ ^ c(Rx + Y ; z ^), where ! ^ := (R; Y ; z ^). Then x 7! h (x; ! ^ ) is concave. Based on the generalized Minkowski's inequality (see, e.g., Hardy et al. (1952), page 146, theorem 35 198), we have 1 1 ^ ^ ^ ^ [E R h ( x + (1 )x ; ! ^ ) ]  fE R [ h (x ; ! ^ ) + (1 )h (x ; ! ^ )] g z c 1 2 z c 1 c 2 1 1 1 ^ ^ ^ ^ = fE [ ( R) h (x ; ! ^ ) + (1 )( R) h (x ; ! ^ ) ] g z c 1 c 2 1 1 1 1 ^ ^ ^ ^ (E [ ( R) h (x ; ! ^ )] ) + (E [(1 )( R) h (x ; ! ^ )] ) z c 1 z c 2 1 1 ^ ^ ^ ^ = [E R h (x ; ! ^ ) ] + (1 )[E R h (x ; ! ^ ) ] ; z c 1 z c 2 Since u (c) = c , the above inequality implies that condition (17) holds. To prove Proposition 2.6, let s  be as in (19) and de ne C := fc 2 C : c(a; z)  (1 s )a for all (a; z) 2 S g : (47) 2 0 Lemma B.9. C is a closed subset of C , and Tc 2 C for all c 2 C . 2 2 2 Proof. To see that C is closed, for a given sequence fc g in C and c 2 C with 2 n 2 (c ; c) ! 0, we need to verify that c 2 C . This obviously holds since c (a; z)=a n 2 n 1 s  for all n and (a; z) 2 S , and, on the other hand, (c ; c) ! 0 implies that 0 n c (a; z) ! c(a; z) for all (a; z) 2 S . 
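We next show that $T$ is a self-map on $\mathcal{C}_2$. Fix $c \in \mathcal{C}_2$. We have $Tc \in \mathcal{C}$ since $T$ is a self-map on $\mathcal{C}$. It remains to show that $\xi := Tc$ satisfies $\xi(a,z) \ge (1-\bar s)a$ for all $(a,z) \in \mathsf{S}_0$. Suppose $\xi(a,z) < (1-\bar s)a$ for some $(a,z) \in \mathsf{S}_0$. Then
\[
u'((1-\bar s)a) < (u' \circ \xi)(a,z) = \max\{ \mathbb{E}_z \hat\beta \hat R\,(u' \circ c)(\hat R[a - \xi(a,z)] + \hat Y, \hat Z),\; u'(a) \}.
\]
Since $u'((1-\bar s)a) \ge u'(a)$ and $c \in \mathcal{C}_2$, this implies that
\[
\begin{aligned}
u'((1-\bar s)a) &< \mathbb{E}_z \hat\beta \hat R\,(u' \circ c)(\hat R[a - \xi(a,z)] + \hat Y, \hat Z) \\
&\le \mathbb{E}_z \hat\beta \hat R\, u'\{ (1-\bar s)\hat R[a - \xi(a,z)] + (1-\bar s)\hat Y \} \\
&\le \mathbb{E}_z \hat\beta \hat R\, u'[(1-\bar s)\hat R \bar s a + (1-\bar s)\hat Y] \le \mathbb{E}_z \hat\beta \hat R\, u'[\hat R \bar s (1-\bar s)a],
\end{aligned}
\]
which contradicts (19) since $((1-\bar s)a, z) \in \mathsf{S}_0$. As a result, $\xi(a,z) \ge (1-\bar s)a$ for all $(a,z) \in \mathsf{S}_0$ and we conclude that $Tc \in \mathcal{C}_2$.

Proof of Proposition 2.6. We have shown in Theorem 2.2 that $T$ is a contraction mapping on the complete metric space $(\mathcal{C}, \rho)$, with unique fixed point $c^*$. Since in addition $\mathcal{C}_2$ is a closed subset of $\mathcal{C}$ and $T\mathcal{C}_2 \subset \mathcal{C}_2$ by Lemma B.9, we know that $c^* \in \mathcal{C}_2$. The stated claim is verified.

Appendix C. Proof of Section 3 Results

As before, Assumptions 2.1–2.3 are in force. Notice that Assumption 2.2, Assumption 3.1 and Lemma A.1 imply existence of an $n$ in $\mathbb{N}$ such that
\[
\theta := \max_{z \in \mathsf{Z}} \mathbb{E}_z \prod_{t=1}^{n} \beta_t R_t < 1 \quad \text{and} \quad \gamma := \bar s^{\,n} \max_{z \in \mathsf{Z}} \mathbb{E}_z \prod_{t=1}^{n} R_t < 1. \tag{48}
\]

Lemma C.1. For all $(a,z) \in \mathsf{S}$, we have $\sup_{t \ge 0} \mathbb{E}_{a,z}\, a_t < \infty$.

Proof. Since $c^*(0,z) = 0$, Proposition 2.6 implies that $c^*(a,z) \ge (1-\bar s)a$ for all $(a,z) \in \mathsf{S}$. For all $t \ge 1$, we have $t = kn + j$ in general, where the integers $k \ge 0$ and $j \in \{0, 1, \ldots, n-1\}$. Using these facts and (4), we have:
\[
\begin{aligned}
a_t &\le \bar s^{\,t} R_t \cdots R_1 a + \bar s^{\,t-1} R_t \cdots R_2 Y_1 + \cdots + \bar s R_t Y_{t-1} + Y_t \\
&= \bar s^{\,kn+j} R_{kn+j} \cdots R_1 a + \sum_{\ell=1}^{j} \bar s^{\,kn+j-\ell} R_{kn+j} \cdots R_{\ell+1} Y_\ell \\
&\quad + \sum_{m=1}^{k} \sum_{\ell=1}^{n} \bar s^{\,mn-\ell} R_{kn+j} \cdots R_{(k-m)n+j+\ell+1} Y_{(k-m)n+j+\ell}
\end{aligned}
\]
with probability one.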
Taking expectations of the above while noting that M := max E R < 1 by Assumption 3.1 and Lemma A.1, we have 1`n; z2Z z t t=1 k j k j` E a  s  E R  R a + s  E R  R Y a;z t z j 1 z j `+1 ` `=1 k1 n X X m n` + s  E R  R Y z (km)n+j (km1)n+j+`+1 (km)n+j+` m=0 `=1 k1 n X X X k k m M a + M E Y + M E Y 0 0 z ` 0 z (km1)n+j+` m=0 `=1 `=1 M a + M M n + M M n < 1: 0 0 3 0 3 m=0 or all (a; z) 2 S and t  0. Here we have used M in Lemma B.1 and the Markov property. Hence, sup E a < 1 for all (a; z) 2 S, as was claimed. a;z t t0 A function w : S ! R is called norm-like if all its sublevel sets (i.e., sets of the form fx 2 S : w(x)  bg; b 2 R ) are precompact in S (i.e., any sequence in a given sublevel set has a subsequence that converges to a point of S). 37 Proof of Theorem 3.1. Based on Lemma D.5.3 of Meyn and Tweedie (2009), a sto- chastic kernel Q is bounded in probability if and only if for all x 2 S, there exists a norm-like function w : S ! R such that the (Q; x)-Markov process fX g satis- + t t0 es lim sup E [w (X )] < 1. Fix (a; z) 2 S. Since Z is nite, P is bounded x t t!1 in probability. Hence, there exists a norm-like function w : Z ! R such that lim sup E w(Z ) < 1. Then w : S ! R de ned by w (a ; Z ) := a + w(Z ) z t + 0 0 0 0 t!1 is a norm-like function on S. The stochastic kernel Q is then bounded in prob- ability since Lemma C.1 implies that lim sup E w (a ; Z )  sup E a + a;z t t a;z t t!1 t0 lim sup E w(Z ) < 1. Regarding existence of stationary distribution, since P is z t t!1 Feller (due to the niteness of Z), whenever z ! z, the product measure satis es P (z ;) ! P (z;) Since in addition c is continuous, a simple application of the generalized Fatou's lemma of Feinberg et al. (2014) (Theorem 1.1) shows that the stochastic kernel Q is Feller. Moreover, since Q is bounded in probability, based on the Krylov-Bogolubov theorem (see, e.g., Meyn and Tweedie (2009), Proposition 12.1.3 and Lemma D.5.3), Q admits at least one stationary distribution. Lemma C.2. 
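The borrowing constraint binds in finite time with positive probability. That is, for all $(a,z) \in \mathsf{S}$, we have $\mathbb{P}_{a,z}(\cup_{t \ge 0}\{c_t = a_t\}) > 0$.

Proof. The claim holds trivially when $a = 0$. Suppose the claim does not hold on $\mathsf{S}_0$ (recall that $\mathsf{S}_0 = \mathsf{S} \setminus \{0\}$). Then $\mathbb{P}_{a,z}(\cap_{t \ge 0}\{c_t < a_t\}) = 1$ for some $(a,z) \in \mathsf{S}_0$, i.e., the borrowing constraint never binds with probability one. Hence,
\[
\mathbb{P}_{a,z}\{ (u' \circ c^*)(a_t, Z_t) = \mathbb{E}[\beta_{t+1} R_{t+1} (u' \circ c^*)(a_{t+1}, Z_{t+1}) \mid \mathcal{F}_t] \} = 1
\]
for all $t \ge 0$. Then we have
\[
\begin{aligned}
(u' \circ c^*)(a,z) &= \mathbb{E}_{a,z}\, \beta_1 R_1 \cdots \beta_t R_t\, (u' \circ c^*)(a_t, Z_t) \\
&\le \mathbb{E}_{a,z}\, \beta_1 R_1 \cdots \beta_t R_t\, [u'(a_t) + M] \le \mathbb{E}_z\, \beta_1 R_1 \cdots \beta_t R_t\, [u'(Y_t) + M]
\end{aligned} \tag{49}
\]
for all $t \ge 1$. Let $n$ and $\theta$ be defined by (48). Let $t = kn + 1$. Based on the Markov property and Lemma B.1, as $k \to \infty$,
\[
\mathbb{E}_z \beta_1 R_1 \cdots \beta_t R_t = \mathbb{E}_z \beta_1 R_1 \cdots \beta_{t-1} R_{t-1} \mathbb{E}_{Z_{t-1}} \beta_1 R_1
\le \Big( \max_{z \in \mathsf{Z}} \mathbb{E}_z \beta_1 R_1 \Big) \mathbb{E}_z \beta_1 R_1 \cdots \beta_{nk} R_{nk}
\le \Big( \max_{z \in \mathsf{Z}} \mathbb{E}_z \beta_1 R_1 \Big) \theta^k \to 0.
\]
Similarly, as $k \to \infty$,
\[
\mathbb{E}_z \beta_1 R_1 \cdots \beta_t R_t\, u'(Y_t) = \mathbb{E}_z \beta_1 R_1 \cdots \beta_{t-1} R_{t-1} \mathbb{E}_{Z_{t-1}} [\beta_1 R_1 u'(Y_1)]
\le \Big( \max_{z \in \mathsf{Z}} \mathbb{E}_z \hat\beta \hat R u'(\hat Y) \Big) \mathbb{E}_z \beta_1 R_1 \cdots \beta_{nk} R_{nk}
\le \Big( \max_{z \in \mathsf{Z}} \mathbb{E}_z \hat\beta \hat R u'(\hat Y) \Big) \theta^k \to 0.
\]
Letting $k \to \infty$, (49) then implies that $(u' \circ c^*)(a,z) \le 0$, which contradicts the fact that $u' > 0$. Thus, we must have $\mathbb{P}_{a,z}(\cup_{t \ge 0}\{c_t = a_t\}) > 0$ for all $(a,z) \in \mathsf{S}$.

Our next goal is to prove Theorem 3.2. In proofs we apply the theory of Meyn and Tweedie (2009). Important definitions (with their locations in the textbook) include: $\psi$-irreducibility (Section 4.2), small set (page 102), strong aperiodicity (page 114), petite set (page 117), Harris chain (page 199), and positivity (page 230).

Recall that $\mathbb{R}^m$ paired with its Euclidean topology is a second countable topological space (i.e., its topology has a countable base). Since $\mathbb{R}_+$ and $\mathsf{Z}$ are respectively Borel subsets of $\mathbb{R}$ and $\mathbb{R}^m$ paired with the relative topologies, they are also second countable. Hence, $\mathsf{S} := \mathbb{R}_+ \times \mathsf{Z}$ satisfies $\mathcal{B}(\mathsf{S}) = \mathcal{B}(\mathbb{R}_+) \otimes \mathcal{B}(\mathsf{Z})$ (see, e.g., page 149, Theorem 4.44 of Aliprantis and Border (2006)). Recall (22).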
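With slight abuse of notation, in proofs, we use $f$ to denote the density of $\{Y_t\}$ in both cases (Y1) and (Y2) and write $\mathrm{d}y = \pi(\mathrm{d}y)$, where $\pi$ is the related measure. Specifically, $\pi$ is the Lebesgue measure when (Y2) holds. Moreover, let $\vartheta$ be the counting measure. Recall $\bar z \in \mathsf{Z}$ and the greatest lower bound $y_\ell \ge 0$ of the support of $\{Y_t\}$ given by Assumption 3.2. Let $\bar p := P(\bar z, \bar z)$. Then $\bar p > 0$ by Assumption 3.2.

Lemma C.3. $\mathbb{P}_{(a,\bar z)}\{ \cup_{t \ge 0} [\{c_t = a_t\} \cap (\cap_{i=0}^{t} \{Z_i = \bar z\})] \} > 0$ for all $a \in (0,\infty)$.

Proof. Fix $a \in (0,\infty)$. If $a \le \bar a(\bar z)$, the claim holds trivially by Lemma B.7. Now consider the case $a > \bar a(\bar z)$. Suppose $\mathbb{P}_{(a,\bar z)}\{ \cup_{t \ge 0} [\{c_t = a_t\} \cap (\cap_{i=0}^{t} \{Z_i = \bar z\})] \} = 0$. Then, based on De Morgan's law, we have
\[
\begin{aligned}
&\mathbb{P}_{(a,\bar z)}\big\{ \cap_{t \ge 0} \big[ \{c_t < a_t\} \cup \big( \cup_{i=0}^{t} \{Z_i \ne \bar z\} \big) \big] \big\} = 1 \\
\Longrightarrow\ &\mathbb{P}_{(a,\bar z)}\big\{ \{c_t < a_t\} \cup \big( \cup_{i=0}^{t} \{Z_i \ne \bar z\} \big) \big\} = 1 \text{ for all } t \in \mathbb{N} \\
\Longrightarrow\ &\mathbb{P}_{(a,\bar z)}\big\{ \{c_k < a_k\} \cup \big( \cup_{i=0}^{t} \{Z_i \ne \bar z\} \big) \big\} = 1 \text{ for all } k, t \in \mathbb{N} \text{ with } k \le t \\
\Longrightarrow\ &\mathbb{P}_{(a,\bar z)}\big\{ \big( \cap_{i=0}^{t} \{c_i < a_i\} \big) \cup \big( \cup_{i=0}^{t} \{Z_i \ne \bar z\} \big) \big\} = 1 \text{ for all } t \in \mathbb{N}.
\end{aligned}
\]
Note that the set $\Delta(t) := (\cap_{i=0}^{t} \{c_i < a_i\}) \cup (\cup_{i=0}^{t} \{Z_i \ne \bar z\})$ can be written as
\[
\Delta(t) = \Delta_1(t) \cup \Delta_2(t), \quad \text{where } \Delta_1(t) \cap \Delta_2(t) = \emptyset,
\]
\[
\Delta_1(t) := \big( \cap_{i=0}^{t} \{c_i < a_i\} \big) \cap \big( \cap_{i=0}^{t} \{Z_i = \bar z\} \big) \quad \text{and} \quad \Delta_2(t) := \cup_{i=0}^{t} \{Z_i \ne \bar z\}.
\]
Assumption 3.2 then implies that, for all $t \ge 0$,
\[
\mathbb{P}_{(a,\bar z)}\{\Delta_1(t)\} = 1 - \mathbb{P}_{\bar z}\{\Delta_2(t)\} = \mathbb{P}_{\bar z}\big\{ \cap_{i=0}^{t} \{Z_i = \bar z\} \big\} = \bar p^{\,t} > 0.
\]
Let $n$ and $\theta$ be defined by (48) and let $t = kn + 1$. Similar to the proof of Lemma B.7, we can show that, with probability $\bar p^{\,t} > 0$,
\[
(u' \circ c^*)(a, \bar z) \le \theta^k \Big[ \max_{z \in \mathsf{Z}} \mathbb{E}_z \hat\beta \hat R u'(\hat Y) + M \max_{z \in \mathsf{Z}} \mathbb{E}_z \hat\beta \hat R \Big]
\]
for some constant $M \in \mathbb{R}_+$. Since $\theta \in (0,1)$ and $(u' \circ c^*)(a, \bar z) > 0$, Lemma B.1 implies that there exists $N \in \mathbb{N}$ such that
\[
\theta^N \Big[ \max_{z \in \mathsf{Z}} \mathbb{E}_z \hat\beta \hat R u'(\hat Y) + M \max_{z \in \mathsf{Z}} \mathbb{E}_z \hat\beta \hat R \Big] < (u' \circ c^*)(a, \bar z).
\]
As a result, we have $(u' \circ c^*)(a, \bar z) < (u' \circ c^*)(a, \bar z)$ with probability $\bar p^{\,Nn+1} > 0$. This is a contradiction. Hence the stated claim is verified.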
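Lemma B.7 and Lemma C.3 both concern the region where the borrowing constraint binds. As a purely numerical illustration, and not part of the original proofs, the following Python sketch iterates the time iteration equation (44) on a grid and then evaluates the threshold defined in (45). Every primitive here is an assumption made for illustration: CRRA utility with parameter 2, a constant discount factor and gross return, a two-state Markov chain with income depending only on the next state, the asset grid, the bisection solver, and linear extrapolation above the grid.

```python
import numpy as np

# Illustrative primitives (assumptions, not taken from the paper).
GAMMA = 2.0                            # CRRA parameter, u'(c) = c**(-GAMMA)
BETA, R = 0.95, 1.0                    # constant discount factor and gross return
P = np.array([[0.9, 0.1],              # transition matrix of the finite chain
              [0.1, 0.9]])
Y = np.array([0.5, 1.5])               # next-period income as a function of Z'
GRID = np.linspace(0.01, 10.0, 200)    # asset grid on (0, infinity)


def interp(x, vals):
    """Piecewise-linear interpolation on GRID, linear extrapolation above it."""
    out = np.interp(x, GRID, vals)
    slope = (vals[-1] - vals[-2]) / (GRID[-1] - GRID[-2])
    return np.where(x > GRID[-1], vals[-1] + slope * (x - GRID[-1]), out)


def expected_marginal_utility(c, savings, z):
    """E_z[beta * R * u'(c(R * savings + Y', Z'))] for an array of savings."""
    total = np.zeros_like(savings)
    for z_next in range(len(Y)):
        a_next = R * savings + Y[z_next]
        total += P[z, z_next] * interp(a_next, c[:, z_next]) ** (-GAMMA)
    return BETA * R * total


def time_iteration_step(c, n_bisect=50):
    """Apply T once: for each (a, z), find xi in (0, a] solving (44)."""
    new_c = np.empty_like(c)
    for z in range(len(Y)):
        lo, hi = np.full_like(GRID, 1e-12), GRID.copy()
        for _ in range(n_bisect):
            mid = 0.5 * (lo + hi)
            rhs = np.maximum(expected_marginal_utility(c, GRID - mid, z),
                             GRID ** (-GAMMA))
            too_small = mid ** (-GAMMA) > rhs   # u'(mid) too high: raise xi
            lo = np.where(too_small, mid, lo)
            hi = np.where(too_small, hi, mid)
        new_c[:, z] = hi                        # hi stays at a when constrained
    return new_c


def solve(tol=1e-6, max_iter=1000):
    """Iterate T from the initial guess c(a, z) = a until successive iterates agree."""
    c = np.tile(GRID[:, None], (1, len(Y)))
    for _ in range(max_iter):
        c_new = time_iteration_step(c)
        if np.max(np.abs(c_new - c)) < tol:
            break
        c = c_new
    return c_new


def binding_threshold(c):
    """Threshold from (45): (u')^{-1} applied to E_z[beta * R * u'(c(Y', Z'))]."""
    zero = np.zeros(1)
    return np.array([expected_marginal_utility(c, zero, z)[0] ** (-1.0 / GAMMA)
                     for z in range(len(Y))])


c_star = solve()
a_bar = binding_threshold(c_star)
```

Under these assumed primitives, one can compare `c_star[:, z]` with `GRID` on either side of `a_bar[z]`: consumption should equal assets below the threshold and fall strictly below assets above it, in line with Lemma B.7. Bisection is used only because the residual in (44) is monotone in the candidate consumption level; an endogenous grid method would be a natural alternative.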
Let F (da j a ; Z ; Z ) be de ned such that Pfa 2 A j (a ; Z ; Z ) = (a; z; z )g = t+1 t t t+1 t+1 t t t+1 0 0 0 1fa 2 AgF (da j a; z; z ) at A 2 B(R ). Lemma C.4. Let h : S ! R be an integrable map such that a 7! h(a; z) is de- creasing for all z 2 Z. Then, for all t 2 N and z 2 Z, the map a 7! `(a; z; t) := 0 0 t 0 0 h(a ; z )Q ((a; z); d(a ; z )) is decreasing. Proof. Fix z 2 Z. When t = 1, (21a) implies that Z Z 0 0 0 0 0 0 `(a; z; 1) = h(a ; z )F (da j a; z; z ) P (z; z )#(dz ): Since a 7! h(a; z) is decreasing, and by Proposition 2.3 and (21a), the optimal asset accumulation path a is increasing in a with probability one, we know that a 7! t+1 t 0 0 0 0 0 h(a ; z )F (da j a; z; z ) is decreasing for all z 2 Z. Thus, a 7! `(a; z; 1) is decreasing. The claim holds for t = 1. Suppose this claim holds for arbitrary t, it remains to show that it holds for t + 1. Note that ZZ 00 00 t 0 0 00 00 0 0 `(a; z; t + 1) = h(a ; z )Q ((a ; z ); d(a ; z ))Q((a; z); d(a ; z )) 0 0 0 0 = `(a ; z ; t)Q((a; z); d(a ; z )): 0 0 0 0 Since a 7! `(a ; z ; t) is decreasing for all z 2 Z, based on the induction argument, a 7! `(a; z; t + 1) is decreasing. The stated claim then follows. Lemma C.5. The Markov process f(a ; Z )g is -irreducible. t t t0 40 Proof. Recall  > y given by Assumption 3.2. Let D 2 B(S) be de ned by D := fy gfz g if (Y1) holds and D := (y ; )fz g if (Y2) holds. We de ne the measure ' ` ` on B(S) by '(A) := (#)(A\D) for A 2 B(S). Clearly ' is a nontrivial measure. In particular, #(fz g) = 1 as # is the counting measure. Moreover, since y is the greatest lower bound of the support of fY g, it must be the case that (fy g) > 0 if (Y1) holds t ` and that ((y ; )) > 0 if (Y2) holds. As a result, '(S) = (fy g) #(fz g) > 0 when ` ` (Y1) holds and '(S) = ((y ; )) #(fz g) > 0 when (Y2) holds. We rst show that f(a ; Z )g is '-irreducible. Let A be an element of B(S) such that t t '(A) > 0. Fix (a; z) 2 S. 
We need to show that f(a ; Z )g visits set A in nite time t t with positive probability. Since fz g is irreducible, P fZ = z g > 0 for some integer N  0. By Lemma C.1, t z N 0 there exists a ~ < 1 such that P fa < a; ~ Z = z g > 0. By Lemma C.3, there N N (a;z) 0 0 exists T 2 N such that P fc = a ; Z = z g  P c = a ; \ fZ = z g > (a; ~ z ) T T T (a; ~ z ) T T i i=0 0. Lemma B.7 and Lemma C.4 then imply that P 0 fc = a ; Z = z g > 0 for all (a ;z ) T T T a 2 (0; a ~). Hence, for N := N + T and E := fc = a ; Z = z g, we have 0 N N N N 0 0 P (E)  P 0 fc = a ; Z = z gQ ((a; z); d(a ; z )) > 0 (50) (a;z) (a ;z ) T T T 0 0 fa a; ~ z =z g based on the Markov property. By (21a), we have P f(a ; Z ) 2 Ag  P f(a ; Z ) 2 A; a = c ; Z = z g (a;z) N +1 N +1 (a;z) N +1 N +1 N N N = P f(a ; Z ) 2 A j a = c ; Z = z g P (E) (a;z) N +1 N +1 N N N (a;z) = P f(Y ; Z ) 2 A; a = c ; Z = z g : (51) (a;z) N +1 N +1 N N N 00 00 00 00 00 Note that, by Assumption 3.2, f (y j z )P (z ; z ) > 0 whenever (y ; z ) 2 D. Since in addition '(A) = (  #)(A\ D) > 0, we have 00 00 00 00 00 f (y j z )P (z ; z )(  #)[d(y ; z )] > 0: Let 4 := P f(a ; Z ) 2 Ag. Then (50) and (51) imply that (a;z) N +1 N +1 Z Z 00 00 0 00 00 00 N 0 0 4  f (y j z )P (z ; z )(  #)[d(y ; z )] Q ((a; z); d(a ; z )) > 0: E A Therefore, we have shown that any measurable subset with positive ' measure can be reached in nite time with positive probability, i.e., f(a ; Z )g is '-irreducible. Based t t on Proposition 4.2.2 of Meyn and Tweedie (2009), there exists a maximal probability measure on B(S) such that f(a ; Z )g is -irreducible. t t 41 Lemma C.6. Let the function a  be de ned as in (45). Then a (z )  y if (Y1) holds, while a (z ) > y if (Y2) holds. Proof. Suppose (Y1) holds and a (z ) < y . 
Then, by Lemma B.7, for all t 2 N, t t fc = a g\ \ fZ = z g = fa  a (Z )g\ \ fZ = z g t t i t t i i=0 i=0 fa < y g\ \ fZ = z g  fa < y g: (52) t ` i t ` i=0 Hence, for all a 2 (0;1) and t 2 N, P fc = a g\ \ fZ = z g  P fa < y g = 0; (a;z ) t t i (a;z ) t ` i=0 where the last equality follows from (21a), which implies that a  Y  y with t t ` probability one. This is contradicted with Lemma C.3. Suppose (Y2) holds and a (z )  y . By de nition, P fY  y g = 0 for all z 2 Z ` z t ` and t 2 N. Since a  Y with probability one, we have P fa  y g = 0 for t t (a;z) t ` all (a; z) 2 S and t 2 N. Via similar analysis to (52), Lemma B.7 implies that [fc = a g\ (\ fZ = z g)]  fa  y g for all t 2 N. Hence, for all a 2 (0; 1) and t t i t ` i=0 t 2 N, we have P [fc = a g\ (\ fZ = z g)]  P fa  y g = 0. Again, this t t i t ` (a;z ) (a;z ) i=0 contradicts Lemma C.3. Lemma C.7. The Markov process f(a ; Z )g is strongly aperiodic. t t t0 Proof. By the de nition of strong aperiodicity, we need to show that there exists a v -small set D with v (D ) > 0, i.e., there exists a nontrivial measure v on B(S) 1 1 1 1 1 and a subset D 2 B(S) such that v (D ) > 0 and 1 1 1 inf Q ((a; z); A)  v (A) for all A 2 B(S): (53) (a;z)2D For  > 0 given by Assumption 3.2, let C := (y ; minf; a (z )g) and let D := fy gfz g ` 1 ` if (Y1) holds and D := C  fz g if (Y2) holds. We now show that D satis es the 1 1 0 0 0 0 0 0 0 above conditions. De ne r(a ; z ) := f (a j z )P (z ; z ) and note that r(a ; z ) > 0 on 0 0 0 0 D . De ne the measure v on B(S) by v (A) := r(a ; z )(  #)[d(a ; z )]. If (Y1) 1 1 1 holds, then (fy g) > 0 as shown above, and, if (Y2) holds, Lemma C.6 implies that (C) > 0. Since in addition #(fz g) > 0, it always holds that (  #)(D ) > 0. 0 0 Moreover, since r(a ; z ) > 0 on D , we have v (D ) > 0 and v is a nontrivial measure. 
1 1 1 1 For all (a; z) 2 D and A 2 B(S), Lemma B.7 implies that 0 0 0 0 Q ((a; z); A) = r(a ; z )(  #)[d(a ; z )] = v (A): Hence, D satis es (53) and f(a ; Z )g is strongly aperiodic. 1 t t t0 42 Lemma C.8. The set [0; d] Z is a petite set for all d 2 R . Proof. Fix d 2 (0;1) and z 2 Z. Let B := [0; d]fzg. By Lemma C.3, P fc = a ; Z = z g > 0 for some N 2 N: (54) (d;z) N1 N1 N1 We start by showing that there exists a nontrivial measure v on B(S) such that inf Q ((a; z); A)  v (A) for all A 2 B(S): (55) (a;z)2B In other words, B is a v -small set. Fix A 2 B(S). For all z 2 Z, de ne Z Z 0 00 00 00 00 00 0 00 00 m(z ) := 1f(y ; z ) 2 Agf (y j z ) dy P (z ; z )#(dz ): Note that for all (a; z) 2 B, Lemma B.7 implies that Q ((a; z); A)  P f(Y ; Z ) 2 A; a  a (Z ); Z = z g a;z N N N1 N1 N1 0 0 0 0 N1 0 0 = m(z )1fa  a (z ); z = z gQ ((a; z); d(a ; z )): 0 0 0 0 0 0 Since a 7! m(z )1fa  a (z ); z = z g is decreasing for all z 2 Z, by Lemma C.4, N 0 0 0 0 N1 0 0 Q ((a; z); A)  m(z )1fa  a (z ); z = z gQ ((d; z); d(a ; z )) = P f(Y ; Z ) 2 A; c = a ; Z = z g =: v (A): d;z N N N1 N1 N1 N Note that v is a nontrivial measure on B(S) since (54) implies that v (S) > 0. N N Furthermore, since (a; z) is chosen arbitrarily, the above inequality implies that (55) holds. We have shown that B is a v -small set, and hence a petite set. Since nite union of petite sets is petite for -irreducible chains (see, e.g., Proposition 5.5.5 of Meyn and Tweedie (2009)), the set [0; d] Z must also be petite. Recall s 2 [0; 1) in Assumption 3.1, n 2 N and 2 (0; 1) in (48). Let B := [0; d] Z. Lemma C.9. There exist constants b 2 R ,  2 (0; 1) and a measurable map V : S ! [n=;1) that is bounded on B, such that, for suciently large d 2 R and all (a; z) 2 S, we have E V (a ; Z ) V (a; z)  V (a; z) + b1f(a; z) 2 Bg. a;z n n 43 Proof. 
Since c (a; z)  (1 s )a by Proposition 2.6 and M := max E R < 1 by 0 z2Z z Assumption 3.1 and Lemma A.1, by Lemma B.1 and the Markov property, n nt E a  s  E R  R a + s  E R  R Y a;z n z n 1 z n t+1 t t=1 n n X X nt nt nt a + s  E Y E R  R  a + s  M M : z t Z t+1 n 3 t 0 t=1 t=1 nt nt De ne b := s  M M . Note that b < 1. Choose  2 (0; 1 ), m  n= 0 3 0 V t=1 and d 2 R such that (1 )d  b + m . Then, for V (a; z) := a + m , + 0 V V E V (a ; Z ) V (a; z)  (1 )a + b = a (1 )a + b a;z n n 0 0 = V (a; z) (1 )a + b + m : (56) 0 V In particular, if (a; z) 2= B, then a > d and (56) implies that E V (a ; Z ) V (a; z)  V (a; z) (1 )d + b + m  V (a; z): (57) a;z n n 0 V Let b := b + m . Then the stated claim follows from (56){(57) and the fact that V 0 V is bounded on B. Proof of Theorem 3.2. Claim (1) can be proved by applying Theorem 19.1.3 (or a combination of Proposition 5.4.5 and Theorem 15.0.1) of Meyn and Tweedie (2009). The required conditions in those theorems have been established by Lemmas C.5, C.7, C.8 and C.9 above. Regarding claim (2), Lemmas C.8 and C.9 imply that E V (a ; Z )V (a; z)  n+b1f(a; z) 2 Bg for all (a; z) 2 S, where B := [0; d]Z is a;z n n petite. Since in addition f(a ; Z )g is -irreducible by Lemma C.5, Theorem 19.1.2 of t t Meyn and Tweedie (2009) implies that f(a ; Z )g is a positive Harris chain. Claim (2) t t then follows from Theorem 17.1.7 of Meyn and Tweedie (2009). To verify claim (3), since we have shown that  := f(a ; Z )g is positive Harris with t t stationary distribution , based on Theorem 16.1.5 and Theorem 17.5.4 of Meyn and Tweedie (2009), it suces to show that Q is V -uniformly ergodic. Let  be the n-skeleton of  (see page 62 of Meyn and Tweedie (2009)). Then  is -irreducible and aperiodic by Proposition 5.4.5 of Meyn and Tweedie (2009). 
Theorem 16.0.1 of Meyn and Tweedie (2009) and Lemmas C.8 and C.9 then imply that  is V - nN uniformly ergodic, and, there exists N 2 N such that jjjQ 1 jjj < 1, where 1 V kk := sup j g dj for  2 P (S) and, for all t 2 N, g:jgjV kQ ((a; z);) k 1 V jjjQ 1 jjj := sup : 1 V V (a; z) (a;z)2S 44 To show that Q is V -uniformly ergodic, by Theorem 16.0.1 of Meyn and Tweedie (2009), it remains to verify: jjjQ 1 jjj < 1 for t  nN . This obviously holds 1 V since, by the proof of Lemma C.9, there exist L ; L 2 R such that, for all t 2 N, 0 1 0 0 t 0 0 jf (a ; z )jQ ((a; z); d(a ; z )) jjjQ 1 jjj  sup sup + L 1 V 0 V (a; z) (a;z)2SkfkV 0 0 t 0 0 V (a ; z )Q ((a; z); d(a ; z )) sup + L  L + L < 1: 0 0 1 V (a; z) (a;z)2S Hence, Q is V -uniformly ergodic and claim (3) follows. The proof is now complete. Proof of Theorem 3.3. Take an arbitrarily large constant k < 1 such that P (z ; z ) > 0 and P fkG(z ; z ;  ) > 1g > 0; which is possible by Assumption 3.3 and the de nition of G in (25a). For this k, since lim c (a; z)=a = (z) and Z is a nite set, we can take a  > 0 such that a!1 c (a; z) 1  k(1 (z)) for all z 2 Z and a  a . Multiplying both sides by R(z ^;  )  0, it follows from the law of motion (21a), Y (z ^;  ^)  0, and the de nition of G in (25a) that for a  a , a ^ = R(z ^;  )(a c (a; z)) + Y (z ^;  ^) c (a; z) ^ ^ R(z ^;  )(a c (a; z)) = R(z ^;  ) 1 a ^ ^ R(z ^;  )k(1 (z))a = kG(z; z ^;  )a: ~ ^ ^ ^ ^ Let A(z; z ^;  ) := kG(z; z ^;  )1fkG(z; z ^;  ) > 1g. Then for all z; z ^; ;  ^ and all a  a , ~ ^ a ^  A(z; z ^;  )a: (58) Start the wealth accumulation process a from a  a . Consider the following process: t 0 S = A(Z ; Z ;  )S ; t+1 t t+1 t+1 t where S = a . We now show that a  S with probability one for all t by induction. 0 0 t t Since S = a , the case t = 0 is trivial. Suppose the claim holds up to t. Because 0 0 a  0 and S remains 0 once it becomes 0, without loss of generality we may assume t t ~ ~ ~ S ; : : : ; S are all positive. Hence A ; : : : ; A > 0. 
By the de nition of A, we have 0 t 1 t ~ ~ A > 1 whenever A > 0. Therefore ~ ~ S = A  A S  S = a  a: t t 1 0 0 0 45 Hence applying (58), we get ~ ~ a  A(Z ; Z ;  )a  A(Z ; Z ;  )S = S : t+1 t t+1 t+1 t t t+1 t+1 t t+1 Now take any p 2 (0; 1) and let T be a geometric random variable with mean 1=p that is independent of everything. De ne (s) = (1 p)r(P M (s)); ~ ~ where M (s) is as in (24). Since clearly A  A and p > 0, we have  > . By Lemma 3.1 of Beare and Toda (2017), ;  are convex, and hence continuous in the interior of their domains. Therefore () = 1 and (s) > 1 for small enough s > . Hence, for any " > 0, we can take small enough p 2 (0; 1) and large enough k < 1 ~ ~ such that () < 1 < ( + ") < 1. By Lemma 3.1 of Beare and Toda (2017), there exists a unique  ~ 2 (;  + ") such that ( ~) = 1. Theorem 3.4 of Beare and Toda (2017) then implies that lim inf a P fS > ag > 0 a ;z T 0 0 a!1 for all (a ; z ) 2 S. In particular, for any initial (a ; z ) 2 S with a  a , 0 0 0 0 0 +" lim inf a P fS > ag > 0: (59) a ;z T 0 0 a!1 Now suppose that we draw a from the ergodic distribution. Then a has the same 0 t distribution as a , and so does a . Therefore 1 T Pfa > ag = Pfa > ag 1 T = Pfa < a gPfa > aj a < a g + Pfa  a gPfa > aj a  a g: (60) 0 T 0 0 T 0 If the ergodic distribution of fa g has unbounded support, then Pfa  a g > 0. As t 0 we have seen above, conditional on a  a , we have a  S for all t. Therefore 0 t t +" +" lim inf a Pfa > a j a  a g  lim inf a PfS > a j a  a g > 0 (61) T 0 T 0 a!1 a!1 by (59), and so (27) follows from (60) and (61). References Acemoglu, D. and J. A. Robinson (2002): \The Political Economy of the Kuznets curve," Review of Development Economics, 6, 183{203. Ac kgoz, O. T. (2018): \On the Existence and Uniqueness of Stationary Equilib- rium in Bewley Economies with Production," Journal of Economic Theory, 173, 18{55. 46 Ahn, S., G. Kaplan, B. Moll, T. Winberry, and C. 
Wolf (2018): “When Inequality Matters for Macro and Macro Matters for Inequality,” NBER Macroeconomics Annual, 32, 1–75.
Aiyagari, S. R. (1994): “Uninsured Idiosyncratic Risk and Aggregate Saving,” Quarterly Journal of Economics, 109, 659–684.
Aliprantis, C. D. and K. C. Border (2006): Infinite Dimensional Analysis: A Hitchhiker's Guide, Springer.
Beare, B. K. and A. A. Toda (2017): “Geometrically Stopped Markovian Random Growth Processes and Pareto Tails,” Tech. rep., UC San Diego.
Benhabib, J. and A. Bisin (2018): “Skewed Wealth Distributions: Theory and Empirics,” Journal of Economic Literature, 56, 1261–1291.
Benhabib, J., A. Bisin, and M. Luo (2017): “Earnings Inequality and Other Determinants of Wealth Inequality,” American Economic Review: Papers and Proceedings, 107, 593–597.
Benhabib, J., A. Bisin, and S. Zhu (2011): “The Distribution of Wealth and Fiscal Policy in Economies with Finitely Lived Agents,” Econometrica, 79, 123–
Benhabib, J., A. Bisin, and S. Zhu (2015): “The Wealth Distribution in Bewley Economies with Capital Income Risk,” Journal of Economic Theory, 159, 489–515.
Benhabib, J., A. Bisin, and S. Zhu (2016): “The Distribution of Wealth in the Blanchard–Yaari Model,” Macroeconomic Dynamics, 20, 466–481.
Bhandari, A., D. Evans, M. Golosov, and T. J. Sargent (2018): “Inequality, Business Cycles, and Monetary-Fiscal Policy,” Tech. rep., National Bureau of Economic Research.
Blackwell, D. (1965): “Discounted Dynamic Programming,” Annals of Mathematical Statistics, 36, 226–235.
Brinca, P., H. A. Holter, P. Krusell, and L. Malafry (2016): “Fiscal Multipliers in the 21st Century,” Journal of Monetary Economics, 77, 53–69.
Cao, D. (2020): “Recursive Equilibrium in Krusell and Smith (1998),” Journal of Economic Theory, 186.
Cao, D. and W. Luo (2017): “Persistent heterogeneous returns and top end wealth inequality,” Review of Economic Dynamics, 26, 301–326.
Carroll, C. (2004): “Theoretical Foundations of Buffer Stock Saving,” Tech. rep., National Bureau of Economic Research.
Carroll, C. D. (1997): “Buffer-stock Saving and the Life Cycle/Permanent Income Hypothesis,” Quarterly Journal of Economics, 112, 1–55.
Chamberlain, G. and C. A. Wilson (2000): “Optimal Intertemporal Consumption under Uncertainty,” Review of Economic Dynamics, 3, 365–395.
Coleman, II, W. J. (1990): “Solving the Stochastic Growth Model by Policy-Function Iteration,” Journal of Business and Economic Statistics, 8, 27–29.
Datta, M., L. J. Mirman, and K. L. Reffett (2002): “Existence and Uniqueness of Equilibrium in Distorted Dynamic Economies with Capital and Labor,” Journal of Economic Theory, 103, 377–410.
Davies, J. B. and A. F. Shorrocks (2000): “The Distribution of Wealth,” in Handbook of Income Distribution, Elsevier, vol. 1, 605–675.
Deaton, A. and G. Laroque (1992): “On the Behaviour of Commodity Prices,” Review of Economic Studies, 59, 1–23.
Epper, T., E. Fehr, H. Fehr-Duda, C. Kreiner, D. Lassen, S. Leth-Petersen, and G. Rasmussen (2018): “Time Discounting and Wealth Inequality,” Tech. rep., Working paper.
Fagereng, A., L. Guiso, D. Malacrino, and L. Pistaferri (2016a): “Heterogeneity and Persistence in Returns to Wealth,” Tech. rep., National Bureau of Economic Research.
Fagereng, A., L. Guiso, D. Malacrino, and L. Pistaferri (2016b): “Heterogeneity in Returns to Wealth and the Measurement of Wealth Inequality,” American Economic Review: Papers and Proceedings, 106, 651–655.
Feinberg, E. A., P. O. Kasyanov, and N. V. Zadoianchuk (2014): “Fatou's Lemma for Weakly Converging Probabilities,” Theory of Probability & Its Applications, 58, 683–689.
Gabaix, X., J.-M. Lasry, P.-L. Lions, and B. Moll (2016): “The Dynamics of Inequality,” Econometrica, 84, 2071–2111.
Glaeser, E., J. Scheinkman, and A. Shleifer (2003): “The Injustice of Inequality,” Journal of Monetary Economics, 50, 199–222.
Gouin-Bonenfant, E. and A. A. Toda (2018): “Pareto Extrapolation: Bridging Theoretical and Quantitative Models of Wealth Inequality,” Tech. rep., SSRN
Guvenen, F. and A. A. Smith (2014): “Inferring Labor Income Risk and Partial Insurance from Economic Choices,” Econometrica, 82, 2085–2129.
Hansen, B. E. and K. D. West (2002): “Generalized Method of Moments and Macroeconomics,” Journal of Business & Economic Statistics, 20, 460–469.
Hardy, G. H., J. E. Littlewood, and G. Pólya (1952): Inequalities, Cambridge University Press.
Hills, T. S. and T. Nakata (2018): “Fiscal Multipliers at the Zero Lower Bound: The Role of Policy Inertia,” Journal of Money, Credit and Banking, 50, 155–172.
Hubmer, J., P. Krusell, and A. A. Smith, Jr. (2018): “A Comprehensive Quantitative Theory of the US Wealth Distribution,” Tech. rep., Yale.
Huggett, M. (1993): “The Risk-free Rate in Heterogeneous-agent Incomplete-insurance Economies,” Journal of Economic Dynamics and Control, 17, 953–969.
Kaymak, B., C. S. Leung, and M. Poschke (2018): “The Determinants of Wealth Inequality and Their Implications for Economic Policy,” Tech. rep., Society for Economic Dynamics.
Krasnosel'skii, M. A., G. M. Vainikko, R. Zabreyko, Y. B. Ruticki, and V. V. Stet'senko (2012): Approximate Solution of Operator Equations, Springer Netherlands.
Krusell, P. and A. A. Smith, Jr. (1998): “Income and Wealth Heterogeneity in the Macroeconomy,” Journal of Political Economy, 106, 867–896.
Kuhn, M. (2013): “Recursive Equilibria in an Aiyagari-style Economy with Permanent Income Shocks,” International Economic Review, 54, 807–835.
Lawrance, E. C. (1991): “Poverty and the Rate of Time Preference: Evidence from Panel Data,” Journal of Political Economy, 99, 54–77.
Li, H. and J. Stachurski (2014): “Solving the Income Fluctuation Problem with Unbounded Rewards,” Journal of Economic Dynamics and Control, 45, 353–365.
Ljungqvist, L. and T. J. Sargent (2012): Recursive Macroeconomic Theory, MIT Press, 4th ed.
Loewenstein, G. and D. Prelec (1991): “Negative Time Preference,” American Economic Review, 81, 347–352.
Loewenstein, G. and N. Sicherman (1991): “Do Workers Prefer Increasing Wage Profiles?” Journal of Labor Economics, 9, 67–84.
Meyn, S. P. and R. L. Tweedie (2009): Markov Chains and Stochastic Stability, Springer Science & Business Media.
Miao, J. (2006): “Competitive Equilibria of Economies with a Continuum of Consumers and Aggregate Shocks,” Journal of Economic Theory, 128, 274–298.
Morand, O. F. and K. L. Reffett (2003): “Existence and Uniqueness of Equilibrium in Nonoptimal Unbounded Infinite Horizon Economies,” Journal of Monetary Economics, 50, 1351–1373.
Pareto, V. (1896): La Courbe de la Répartition de la Richesse, Lausanne: Imprimerie Ch. Viret-Genton.
Rabault, G. (2002): “When Do Borrowing Constraints Bind? Some New Results on the Income Fluctuation Problem,” Journal of Economic Dynamics and Control, 26, 217–245.
Rouwenhorst, K. G. (1995): “Asset Pricing Implications of Equilibrium Business Cycle Models,” in Frontiers of Business Cycle Research, ed. by T. F. Cooley, Princeton University Press, chap. 10, 294–330.
Saez, E. and G. Zucman (2016): “Wealth Inequality in the United States since 1913: Evidence from Capitalized Income Tax Data,” Quarterly Journal of Economics, 131, 519–578.
Schechtman, J. (1976): “An Income Fluctuation Problem,” Journal of Economic Theory, 12, 218–241.
Schechtman, J. and V. L. S. Escudero (1977): “Some Results on ‘An Income Fluctuation Problem’,” Journal of Economic Theory, 16, 151–166.
Schorfheide, F., D. Song, and A. Yaron (2018): “Identifying Long-Run Risks: A Bayesian Mixed-Frequency Approach,” Econometrica, 86, 617–654.
Stachurski, J. (2009): Economic Dynamics: Theory and Computation, MIT Press.
Stachurski, J. and A. A. Toda (2019): “An Impossibility Theorem for Wealth in Heterogeneous-agent Models with Limited Heterogeneity,” Journal of Economic Theory, 182, 1–24.
Toda, A. A. (2014): “Incomplete Market Dynamics and Cross-Sectional Distributions,” Journal of Economic Theory, 154, 310–348.
||| (2019): \Wealth Distribution with Random Discount Factors," Journal of Monetary Economics, 104, 101{113. Toda, A. A. and K. Walsh (2015): \The Double Power Law in Consumption and Implications for Testing Euler Equations," Journal of Political Economy, 123, 1177{1200. Vermeulen, P. (2018): \How Fat Is the Top Tail of the Wealth Distribution?" Review of Income and Wealth, 64, 357{387. http://www.deepdyve.com/assets/images/DeepDyve-Logo-lg.png Economics arXiv (Cornell University)

The Income Fluctuation Problem and the Evolution of Wealth

Economics, Volume 2020 (1905) – May 29, 2019

ISSN
0022-0531
eISSN
ARCH-3349
DOI
10.1016/j.jet.2020.105003

Abstract

The Income Fluctuation Problem and the Evolution of Wealth

Qingyin Ma (a), John Stachurski (b) and Alexis Akira Toda (c)

(a) International School of Economics and Management, Capital University of Economics and Business
(b) Research School of Economics, Australian National University
(c) Department of Economics, University of California San Diego

January 30, 2020

Abstract. We analyze the household savings problem in a general setting where returns on assets, non-financial income and impatience are all state dependent and fluctuate over time. All three processes can be serially correlated and mutually dependent. Rewards can be bounded or unbounded and wealth can be arbitrarily large. Extending classic results from an earlier literature, we determine conditions under which (a) solutions exist, are unique and are globally computable, (b) the resulting wealth dynamics are stationary, ergodic and geometrically mixing, and (c) the wealth distribution has a Pareto tail. We show how these results can be used to extend recent studies of the wealth distribution. Our conditions have natural economic interpretations in terms of asymptotic growth rates for discounting and return on savings.

Keywords: Income fluctuation, optimality, stochastic stability, wealth distribution.

[Footnote: We thank the editors and two anonymous referees for many valuable comments and suggestions. This paper has also benefited from discussion with many colleagues. We particularly thank Fedor Iskhakov, Larry Liu and Chung Tran for their insightful feedback and suggestions. The second author gratefully acknowledges financial support from ARC grant FT160100423. Email addresses: qingyin.ma@cueb.edu.cn, john.stachurski@anu.edu.au, atoda@ucsd.edu.]

arXiv:1905.13045v3 [econ.TH] 29 Feb 2020

1. Introduction

It has been observed that, in the US and several other large economies, the wealth distribution is heavy tailed and wealth inequality has risen sharply over the last few decades.
This matters not only for its direct impact on taxation and redistribution policies, but also for potential flow-on effects for productivity growth, business cycles and fiscal policy, as well as for the political environment that shapes these and other economic outcomes.

[Footnote: For example, in a study based on capital income data, Saez and Zucman (2016) find that, in the case of the US, the share of total household wealth held by the top 0.1% increased from 7 percent to 22 percent between 1978 and 2012. For a discussion of the heavy-tailed property of the wealth distribution, see Pareto (1896), Davies and Shorrocks (2000), Benhabib and Bisin (2018), Vermeulen (2018) or references therein.]

[Footnote: One analysis of the two-way interactions between inequality and political decision making can be found in Acemoglu and Robinson (2002). Glaeser et al. (2003) show how inequality can alter economic and social outcomes through subversion of institutions. The same study contains references on linkages between inequality and growth. Regarding fiscal policy, Brinca et al. (2016) find strong correlations between wealth inequality and the magnitude of fiscal multipliers, while Bhandari et al. (2018) study the connection between fiscal-monetary policy, business cycles and inequality. Ahn et al. (2018) discuss the impact of distributional properties on macroeconomic aggregates.]

At present, our understanding of these phenomena is hampered by the fact that standard tools of analysis, such as those used for heterogeneous agent models, are not well adapted to studying the wealth distribution as it stands. For example, while we have sound understanding of the household problem when returns on savings and rates of time discount are constant (see, e.g., Schechtman (1976), Schechtman and Escudero (1977), Deaton and Laroque (1992), Carroll (1997), or Açıkgöz (2018)), our knowledge is far more limited in settings where these values are stochastic. This is problematic, since injecting such features into the household problem is essential for accurately representing the joint distribution of income and wealth (e.g., Benhabib et al. (2015), Benhabib et al. (2017), Stachurski and Toda (2019)). Moreover, models with time-varying discount rates and returns on assets are at the forefront of recent quantitative analysis of wealth and inequality.

[Footnote: Also related is the recent experimental study of Epper et al. (2018), which finds a strong positive connection between dispersion in subjective rates of time discounting across the population and realized dispersion in the wealth distribution. This in turn is consistent with earlier empirical studies such as Lawrance (1991).]

[Footnote: For a recent quantitative study see, for example, Hubmer et al. (2018), where returns on savings and discount rates are both state dependent (as is labor income). Kaymak et al. (2018) find that asset return heterogeneity is required to match the upper tail of the wealth distribution.]

While it might be hoped that the analysis of the income fluctuation problem (or household consumption and savings problem) changes little when we shift from constant to state dependent asset returns and rates of time discount, this turns out not to be the case. Effectively modeling these features and the way they map to the wealth distribution requires significant advances in our understanding of choice and stochastic dynamics in the setting of optimal savings.

One difficulty is that state-dependent discounting takes us beyond the bounds of traditional dynamic programming theory. This matters little if there exists some constant $\bar\beta < 1$ such that the discount process $\{\beta_t\}$ satisfies $\beta_t \le \bar\beta$ for all $t$ with probability one, since, in this case, a standard contraction mapping argument can still be applied (see, e.g., Miao (2006) or Cao (2020)). However, recent quantitative studies extend beyond such settings.
For example, AR(1) specifications are increasingly common, in which case the support of $\beta_t$ is unbounded above at every point in time. Even if discretization is employed, the outcome $\beta_t \ge 1$ can occur with positive probability when the approximation is sufficiently fine. Moreover, such outcomes are not inconsistent with empirical and experimental evidence, at least for some households in some states of the world. Do there exist conditions on $\{\beta_t\}$ that allow for $\beta_t \ge 1$ in some states and yet imply existence of optimal policies and practical computational techniques?

[Footnote: On AR(1) specifications see, for example, Hills and Nakata (2018), Hubmer et al. (2018) or Schorfheide et al. (2018).]

[Footnote: On the empirical and experimental evidence see, for example, Loewenstein and Prelec (1991) and Loewenstein and Sicherman (1991).]

Another source of complexity for the income fluctuation problem in the general setting considered here is that the set of possible values for household assets is typically unbounded above. For example, when returns on assets are stochastic, a sufficiently long sequence of favorable returns can compound one another to project a household to arbitrarily high levels of wealth. This model feature is desirable: We wish to analyze these kinds of outcomes rather than rule them out. Indeed, Benhabib et al. (2015) and other related studies argue convincingly that such outcomes are a key causal mechanism behind the heavy tail of the current distribution of wealth. However, if we accept this logic, then stationarity and ergodicity of the wealth process (which are fundamental both for estimation and for simulation-based numerical methods) must now be established in a setting where the wealth distribution has unbounded support. In such a scenario, what conditions on preferences and financial and labor income are necessary for these properties to hold?

[Footnote: One related study is Benhabib et al. (2011), who show that capital income risk is the driving force of the heavy-tail properties of the stationary wealth distribution. In Blanchard-Yaari style economies, Toda (2014), Toda and Walsh (2015) and Benhabib et al. (2016) show that idiosyncratic investment risk generates a double Pareto stationary wealth distribution. Gabaix et al. (2016) point out that a positive correlation of returns with wealth ("scale dependence") in addition to persistent heterogeneity in returns ("type dependence") can well explain the speed of changes in the tail inequality observed in the data.]

A final and related example of the need for deeper analysis is as follows: To understand the upper tail of the wealth distribution, we must avoid unnecessarily truncating the upper tail of the set of possible asset values in quantitative work. While truncation is convenient because finite or compact state spaces are easier to handle computationally, we can attain greater accuracy in modeling the wealth distribution if truncation at the upper tail can be replaced locally by a parameterized savings function, such as a linear function (Gouin-Bonenfant and Toda, 2018). However, any such approximation must be justified by theory. What conditions can be imposed on primitives to generate such properties while still maintaining realistic assumptions for asset returns and non-financial income?

In this paper we address all of these questions, along with other key properties of the income fluctuation problem, such as continuity and monotonicity of the optimal consumption policy. Our setting admits capital income risk, labor earnings shocks and time-varying discount rates, driven by a combination of iid innovations and an exogenous Markov chain $\{Z_t\}$. The supports of the innovations can be unbounded, so we admit practical innovation sequences such as normal and lognormal. As a whole, this environment allows for a range of realistic features, such as stochastic volatility in returns on asset holdings, or correlation in the shocks impacting asset returns and non-financial income.
The utility function can be unbounded both above and below, with no specific structure imposed beyond differentiability, concavity and the usual slope (Inada) conditions.

[Footnote: While the assumption that the exogenous state process $\{Z_t\}$ is a (finite state) Markov chain might appear restrictive, it fits most practical settings and avoids a host of technical issues that tend to obscure the key ideas. Moreover, the innovation shocks are not restricted to be discrete, and the same is true for assets and consumption.]

To begin, when considering optimality in the household problem, we require a condition on the state dependent discount process $\{\beta_t\}$ that generalizes the classical condition $\beta < 1$ from the constant case and, for reasons discussed above, permits $\beta_t > 1$ with positive probability. To this end, we introduce the restriction

$$G_\beta < 1 \quad \text{where} \quad G_\beta := \lim_{n \to \infty} \left( E \prod_{t=1}^{n} \beta_t \right)^{1/n}. \tag{1}$$

[Footnote: Here and below we set $\beta_0 \equiv 1$, so $\prod_{t=1}^{n} \beta_t = \prod_{t=0}^{n} \beta_t$.]

Condition (1) clearly generalizes the classical condition $\beta < 1$ for the constant discount case. In the stochastic case, $\ln G_\beta$ can be understood as the asymptotic growth rate of the probability weighted average discount factor. Indeed, if $B_n := E \prod_{t=1}^{n} \beta_t$ is the average $n$-period discount factor, then, from the definition of $G_\beta$ and some straightforward analysis, we obtain $\ln(B_{n+1}/B_n) \to \ln G_\beta$, so the condition $G_\beta < 1$ implies that the asymptotic growth rate of the average $n$-period discount factor is negative, drifting down from its initial condition $\beta_0 \equiv 1$ at the rate $\ln G_\beta$. This does not, of course, preclude the possibility that $\beta_t > 1$ at any given $t$.

We show that condition (1) is in fact a necessary condition in those settings where the classical condition $\beta < 1$ is necessary for finite lifetime values. In this sense it cannot be further weakened for the income fluctuation problem apart from special cases. At the same time, it admits the use of convenient specifications such as the discretized AR(1) process from Hubmer et al. (2018).
In addition, we prove that $G_\beta$ can be represented as the spectral radius of a nonnegative matrix, and hence can be computed by numerical linear algebra (as discussed below).

We also generalize the standard condition $\beta R < 1$, where $R$ is the gross interest rate in the constant case, which is used to ensure stability of the asset path and finiteness of lifetime valuations, as well as existence of stationary Markov policies (see, e.g., Deaton and Laroque (1992), Chamberlain and Wilson (2000) or Li and Stachurski (2014)). Analogous to (1), we introduce the generalized condition

$$G_{\beta R} < 1 \quad \text{where} \quad G_{\beta R} := \lim_{n \to \infty} \left( E \prod_{t=1}^{n} \beta_t R_t \right)^{1/n}. \tag{2}$$

Here $\{R_t\}$ is a stochastic capital income process. Analogous to the case of $G_\beta$, the value $\ln G_{\beta R}$ can be understood as the asymptotic growth rate of average gross payoff on assets, discounted to present value.

We show that, when Conditions (1)–(2) hold and non-financial income satisfies two moment conditions, a unique optimal consumption policy exists. We also show that the policy can be computed by successive approximations and analyze its properties, such as monotonicity and asymptotic linearity. This asymptotic linearity can be used to successfully model wealth inequality by accurately representing asset path dynamics for very high wealth households (Gouin-Bonenfant and Toda, 2018).

One important feature of Conditions (1)–(2) is that they take into account the autocorrelation structure of preference shocks and asset returns. For example, if these processes depend only on iid innovations, then (1) reduces to $E \beta_t < 1$ and (2) reduces to $E \beta_t R_t < 1$. But returns on assets are typically not iid, since both mean returns and volatility are, in general, time varying, and preference shocks are typically modeled as correlated (see, e.g., Hubmer et al. (2018) or Schorfheide et al. (2018)). This dependence must be and is accounted for in (2), since long upswings in $\{\beta_t\}$ and $\{R_t\}$ can lead to explosive paths for valuations and assets.
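The spectral radius representation mentioned above can be illustrated with a short numerical sketch. It assumes that the matrix in question has entries $L(z, \hat z) = P(z, \hat z)\, m(\hat z)$, where $P$ is the transition matrix of $\{Z_t\}$ and $m(\hat z)$ is the conditional mean of the relevant factor (for example $E[\beta_t \mid Z_t = \hat z]$); this form follows from iterating expectations over a finite chain, but it is stated here as an assumption rather than quoted from the paper. The function names and the two-state numbers are hypothetical.

```python
import numpy as np

def growth_rate(P, m):
    """Spectral radius of L(z, z') = P(z, z') * m(z').

    P : (N, N) transition matrix of the finite Markov chain {Z_t}.
    m : (N,) conditional means, e.g. m[z'] = E[beta_t | Z_t = z'].
    """
    P = np.asarray(P, dtype=float)
    m = np.asarray(m, dtype=float)
    L = P * m  # broadcasting multiplies column z' by m[z']
    return float(np.max(np.abs(np.linalg.eigvals(L))))

def finite_horizon_rate(P, m, n):
    """(E prod_{t=1}^n m(Z_t))^(1/n) with Z_0 uniform over the states.

    The uniform initial condition is an arbitrary illustrative choice.
    """
    P = np.asarray(P, dtype=float)
    L = P * np.asarray(m, dtype=float)
    B_n = np.linalg.matrix_power(L, n).sum() / len(P)
    return float(B_n ** (1.0 / n))

# Hypothetical two-state example: the mean discount factor exceeds 1 in state 2.
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])
mean_beta = np.array([0.92, 1.02])

G_beta = growth_rate(P, mean_beta)
print("G_beta                 :", G_beta)
print("n-th root of B_n, n=200:", finite_horizon_rate(P, mean_beta, 200))
```

In this illustration the discount factor is above one in one state, yet the computed growth rate is below one, which is the situation condition (1) is designed to admit. Passing the conditional means of $\beta_t R_t$ instead of $\beta_t$ would give the analogous quantity for condition (2).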
Next we study asymptotic stability, stationarity and ergodicity of wealth. Such properties are essential to existence of stationary equilibria in heterogeneous agent models (e.g., Huggett (1993), Aiyagari (1994) or Cao (2020)), as well as standard estimation, calibration and simulation techniques that connect time series averages with cross-sectional moments. These properties require an additional restriction, placed on the asymptotic growth rate of mean returns. Analogous to (1) and (2), this is defined as

$$G_R := \lim_{n \to \infty} \left( E \prod_{t=1}^{n} R_t \right)^{1/n}. \tag{3}$$

[Footnote: A well-known example of a computational technique that uses ergodicity can be found in Krusell and Smith (1998). On the estimation side see, for example, Hansen and West (2002).]

We show that if $G_R$ is sufficiently restricted and a degree of social mobility is present, then there exists a unique stationary distribution for the state process, the distributional path of the state process under the optimal path converges globally to the stationary distribution, and the stationary distribution is ergodic. We also show that, under some mild additional conditions, the rate of convergence of marginal distributions to the stationary distribution is geometric, and that a version of the Central Limit Theorem is valid. Finally, under some mild additional conditions, we prove that the stationary distribution of assets is Pareto tailed, consistent with the data.

Our study is related to Benhabib et al. (2015), who prove the existence of a heavy-tailed wealth distribution in an infinite horizon heterogeneous agent economy with capital income risk. In the process, they show that households facing a stochastic return on savings possess a unique optimal consumption policy characterized by the (boundary constraint-contingent) Euler equation, and that a unique and unbounded stationary distribution exists for wealth under this consumption policy. They assume isoelastic utility, constant discounting, and mutually independent, iid returns and labor income processes, both supported on bounded closed intervals with strictly positive lower bounds. We relax all of these assumptions. Apart from allowing more general utility and state dependent discounting, this permits such realistic features for household income as positive correlations between labor earnings and wealth returns (an extension that was suggested by Benhabib et al. (2015)), or time varying volatility in returns.

[Footnote: Empirical motivation for these kinds of extensions can be found in numerous studies, including Guvenen and Smith (2014) and Fagereng et al. (2016a,b).]

Another related paper is Chamberlain and Wilson (2000), which studies an income fluctuation problem with stochastic income and asset returns and obtains many significant results on asymptotic properties of consumption. Their study imposes relatively few restrictions on the wealth return and labor income processes. Our paper extends their work by allowing for random discounting, as well as dropping their boundedness restriction on the utility, which prevents their work from being used in many standard settings such as constant relative risk aversion. We also develop a set of new results on stability and ergodicity, as well as asymptotic normality of the wealth process.

Our optimality theory draws on techniques found in Li and Stachurski (2014), who show that the time iteration operator is a contraction mapping with respect to a metric that evaluates consumption differences in terms of marginal utility, while assuming a constant discount factor and constant rate of return on assets. We show that these ideas extend to a setting where both returns and discount rates are stochastic and time varying. Our results on dynamics under the optimal policy have no counterparts in Li and Stachurski (2014).

[Footnote: Coleman (1990) introduced the time iteration operator as a constructive method for solving stochastic growth models. It has since been used in Datta et al. (2002), Morand and Reffett (2003) and many other studies.]

In a similar vein, our work is related to several other papers that treat the standard income fluctuation problems with constant rates of return on assets and constant discount rates, such as Rabault (2002), Carroll (2004) and Kuhn (2013). While Carroll (2004) constructs a weighted supremum norm contraction and works with the Bellman operator, the other two papers focus on time iteration. In particular, Rabault (2002) exploits the monotonicity structure, while Kuhn (2013) applies a version of the Tarski fixed point theorem. Our techniques for studying optimality are close to those in Li and Stachurski (2014), as discussed above.

[Footnote: Our paper is also related to Cao and Luo (2017), who study wealth inequality in a continuous-time framework with heterogeneous returns following a two-state Markov chain. While we do not pursue the connection here, the generality of our setup, including a persistent shock structure to wealth returns, might permit a study of the continuous-time limit that yields the tail results of Cao and Luo (2017) in a general framework.]

The rest of this paper is structured as follows. Section 2 formulates the problem and establishes optimality results. Sufficient conditions for the existence and uniqueness of optimal policies are discussed. Section 3 focuses on stochastic stability. Section 4 discusses our key conditions and how they can be checked. Section 5 provides a set of applications and Section 6 concludes. All proofs are deferred to the appendix. Code that generates our figures can be found at https://github.com/jstac/ifp_public.

2. The Income Fluctuation Problem and Optimality Results

This section formulates the income fluctuation problem we consider, establishes the existence, uniqueness and computability of a solution, and derives its properties.

2.1. Problem Statement.
We consider a general income fluctuation problem, where a household chooses a consumption-asset path $\{(c_t, a_t)\}$ to solve

$$\begin{aligned} \max \; & E_0 \left\{ \sum_{t=0}^{\infty} \left( \prod_{i=0}^{t} \beta_i \right) u(c_t) \right\} \\ \text{s.t.} \; & a_{t+1} = R_{t+1}(a_t - c_t) + Y_{t+1}, \\ & 0 \le c_t \le a_t, \quad (a_0, Z_0) = (a, z) \text{ given.} \end{aligned} \tag{4}$$

Here $u$ is the utility function, $\{\beta_t\}_{t \ge 0}$ is the discount factor process with $\beta_0 = 1$, $\{R_t\}_{t \ge 1}$ is the gross rate of return on wealth, and $\{Y_t\}_{t \ge 1}$ is non-financial income. These stochastic processes obey

$$\beta_t = \beta(Z_t, \varepsilon_t), \quad R_t = R(Z_t, \zeta_t), \quad \text{and} \quad Y_t = Y(Z_t, \eta_t), \tag{5}$$

where $\beta$, $R$ and $Y$ are measurable nonnegative functions and $\{Z_t\}_{t \ge 0}$ is an irreducible time-homogeneous Markov chain taking values in a finite set $\mathsf{Z}$. Let $P(z, \hat z)$ be the probability of transitioning from $z$ to $\hat z$ in one step. The innovation processes $\{\varepsilon_t\}$, $\{\zeta_t\}$ and $\{\eta_t\}$ are iid and independent, and their supports can be continuous and vector-valued.

The function $u$ maps $\mathbb{R}_+$ to $\{-\infty\} \cup \mathbb{R}$, is twice differentiable on $(0, \infty)$, satisfies $u' > 0$ and $u'' < 0$ everywhere on $(0, \infty)$, and satisfies $u'(c) \to \infty$ as $c \to 0$ and $u'(c) < 1$ as $c \to \infty$. We define

$$E_{a,z} := E\left[\, \cdot \mid (a_0, Z_0) = (a, z) \right] \quad \text{and} \quad E_z := E\left[\, \cdot \mid Z_0 = z \right]. \tag{6}$$

[Footnote: The next period value of a random variable $X$ is typically denoted $\hat X$. Expectation without a subscript refers to the stationary process, where $Z_0$ is drawn from its (necessarily unique) stationary distribution.]

2.2. Key Conditions.

Our conditions for optimality are listed below. In what follows, $G_\beta$ is the asymptotic growth rate of the discount process as defined in (1).

Assumption 2.1. The discount factor process satisfies $G_\beta < 1$.

Assumption 2.1 is a natural extension of the standard condition $\beta < 1$ from the constant discount case. If $\beta_t \equiv \beta$ for all $t$, then $G_\beta = \beta$, as follows immediately from the definition. It is weaker than the obvious sufficient condition $\beta_t \le \bar\beta$ with probability one for some constant $\bar\beta < 1$, since in such a setting we have $G_\beta \le \bar\beta < 1$. In fact it cannot be significantly weakened, as the next proposition shows.

Proposition 2.1 (Necessity of the discount condition).
Let $\beta_t$ and $u(Y_t)$ be positive with probability one for all $t$ and all initial states $z$ in $\mathsf{Z}$. If, in this setting, we have $G_\beta \ge 1$, then the objective in (4) is infinite at every initial state $(a, z)$.

The positivity assumed here may or may not hold in applications, but Proposition 2.1 shows that special conditions will have to be imposed on preferences if Assumption 2.1 fails. Put differently, allowing $G_\beta \ge 1$ is tantamount to allowing $\beta \ge 1$ in the case when the discount rate is constant.

Next, we need to ensure that the present discounted value of wealth does not grow too quickly, which requires a joint restriction on asset returns and discounting. When $\{R_t\}$ and $\{\beta_t\}$ are constant at values $R$ and $\beta$, the standard restriction from the existing literature is $\beta R < 1$. A generalization using $G_{\beta R}$ as defined in (2) is

Assumption 2.2. The discount factor and return processes satisfy $G_{\beta R} < 1$.

Finally, we impose routine technical restrictions on non-financial income. The second restriction is needed to exploit first order conditions.

Assumption 2.3. $E\, Y < \infty$ and $E\, u'(Y) < \infty$.

Next we provide one example where Assumptions 2.1–2.3 are easily verified. More complex examples are deferred to Sections 4 and 5.

Example 2.1. Suppose, as in Benhabib et al. (2015), that there is a constant discount factor $\beta < 1$, utility is CRRA with $\gamma \ge 1$, $\{R_t\}$ and $\{Y_t\}$ are iid, mutually independent, supported on bounded closed intervals of strictly positive real numbers, and, moreover,

$$\beta E R_t^{1-\gamma} < 1 \quad \text{and} \quad \left( \beta E R_t^{1-\gamma} \right)^{1/\gamma} E R_t < 1. \tag{7}$$

Assumptions 2.1–2.3 are all satisfied in this case. To see this, observe that $G_\beta = \beta < 1$ in the constant discount case, so Assumption 2.1 holds. Since $x \mapsto x^{1-\gamma}$ is convex when $\gamma \ge 1$, Jensen's inequality implies that $E R_t^{1-\gamma} \ge (E R_t)^{1-\gamma}$. Multiplying both sides of the last inequality by $\beta (E R_t)^{\gamma}$ yields

$$G_{\beta R} = \beta E R_t = \beta (E R_t)^{1-\gamma} (E R_t)^{\gamma} \le \left( \beta E R_t^{1-\gamma} \right) (E R_t)^{\gamma}.$$

By the second condition of (7), the right-hand side is less than one, so Assumption 2.2 holds. Assumption 2.3 also holds because $Y_t$ is restricted to a compact subset of the positive reals.

2.3. Optimality: Definitions and Fundamental Properties.

To consider optimality, we temporarily assume that $a_0 > 0$ and set the asset space to $(0, \infty)$. The state space for $\{(a_t, Z_t)\}_{t \ge 0}$ is then $S_0 := (0, \infty) \times \mathsf{Z}$. A feasible policy is a Borel measurable function $c : S_0 \to \mathbb{R}$ with $0 \le c(a, z) \le a$ for all $(a, z) \in S_0$. A feasible policy $c$ and initial condition $(a, z) \in S_0$ generate an asset path $\{a_t\}_{t \ge 0}$ via (4) when $c_t = c(a_t, Z_t)$ and $(a_0, Z_0) = (a, z)$. The lifetime value of policy $c$ is

$$V_c(a, z) = E_{a,z} \sum_{t=0}^{\infty} \beta_0 \cdots \beta_t \, u\left[ c(a_t, Z_t) \right], \tag{8}$$

where $\{a_t\}$ is the asset path generated by $(c, (a, z))$. In the Appendix we show that $V_c$ is well-defined on $S_0$. A feasible policy $c^*$ is called optimal if $V_c \le V_{c^*}$ on $S_0$ for any feasible policy $c$. A feasible policy is said to satisfy the first order optimality condition if

$$(u' \circ c)(a, z) \ge E_z \hat\beta \hat R \, (u' \circ c)\left( \hat R [a - c(a, z)] + \hat Y, \hat Z \right) \tag{9}$$

for all $(a, z) \in S_0$, and equality holds when $c(a, z) < a$. Noting that $u'$ is decreasing, the first order optimality condition can be compactly stated as

$$(u' \circ c)(a, z) = \max\left\{ E_z \hat\beta \hat R \, (u' \circ c)\left( \hat R [a - c(a, z)] + \hat Y, \hat Z \right), \; u'(a) \right\} \tag{10}$$

for all $(a, z) \in S_0$. A feasible policy is said to satisfy the transversality condition if, for all $(a, z) \in S_0$,

$$\lim_{t \to \infty} E_{a,z} \, \beta_0 \cdots \beta_t \, (u' \circ c)(a_t, Z_t) \, a_t = 0. \tag{11}$$

Theorem 2.1 (Sufficiency of first order and transversality conditions). If Assumptions 2.1–2.3 hold, then every feasible policy satisfying the first order and transversality conditions is an optimal policy.

[Footnote: Assumption 2.3 combined with $u'(0) = \infty$ implies that $P\{Y_t > 0\} = 1$ for all $t \ge 1$. Hence, $P\{a_t > 0\} = 1$ for all $t \ge 1$ and excluding zero from the asset space makes no difference to optimality.]

2.4. Existence and Computability of Optimal Consumption.

Let $\mathscr{C}$ be the space of continuous functions $c : S_0 \to \mathbb{R}$ such that $c$ is increasing in the first argument, $0 < c(a, z) \le a$ for all $(a, z) \in S_0$, and

$$\sup_{(a,z) \in S_0} \left| (u' \circ c)(a, z) - u'(a) \right| < \infty. \tag{12}$$

To compare two consumption policies, we pair $\mathscr{C}$ with the distance

$$\rho(c, d) := \| u' \circ c - u' \circ d \| := \sup_{(a,z) \in S_0} \left| (u' \circ c)(a, z) - (u' \circ d)(a, z) \right|, \tag{13}$$

which evaluates the maximal difference in terms of marginal utility. While elements of $\mathscr{C}$ are not generally bounded, $\rho$ is a valid metric on $\mathscr{C}$. In particular, $\rho$ is finite on $\mathscr{C}$ since $\rho(c, d) \le \| u' \circ c - u' \| + \| u' \circ d - u' \|$, and the last two terms are finite by (12). In Appendix B, we show that $(\mathscr{C}, \rho)$ is a complete metric space.

The following proposition shows that, for any policy in $\mathscr{C}$, the first order optimality condition (10) implies the transversality condition.

Proposition 2.2 (Sufficiency of first order condition). Let Assumptions 2.1–2.3 hold. If $c \in \mathscr{C}$ and the first order optimality condition (10) holds for all $(a, z) \in S_0$, then $c$ satisfies the transversality condition. In particular, $c$ is an optimal policy.

We aim to characterize the optimal policy as the fixed point of the time iteration operator $T$ defined as follows: for fixed $c \in \mathscr{C}$ and $(a, z) \in S_0$, the value of the image $Tc$ at $(a, z)$ is defined as the $\xi \in (0, a]$ that solves

$$u'(\xi) = \psi_c(\xi, a, z), \tag{14}$$

where $\psi_c$ is the function on

$$G := \left\{ (\xi, a, z) \in \mathbb{R}_+ \times (0, \infty) \times \mathsf{Z} : 0 < \xi \le a \right\} \tag{15}$$

defined by

$$\psi_c(\xi, a, z) := \max\left\{ E_z \hat\beta \hat R \, (u' \circ c)\left[ \hat R (a - \xi) + \hat Y, \hat Z \right], \; u'(a) \right\}. \tag{16}$$

The following theorem shows that the time iteration operator is an $n$-step contraction mapping on a complete metric space of candidate policies and its fixed point is the unique optimal policy.

Theorem 2.2 (Existence, uniqueness and computability of optimal policies). If Assumptions 2.1–2.3 hold, then there exists an $n$ in $\mathbb{N}$ such that $T^n$ is a contraction mapping on $(\mathscr{C}, \rho)$. In particular,
(1) $T$ has a unique fixed point $c^* \in \mathscr{C}$.
(2) The fixed point $c^*$ is the unique optimal policy in $\mathscr{C}$.
(3) For all $c \in \mathscr{C}$ we have $\rho(T^k c, c^*) \to 0$ as $k \to \infty$.
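The operator in (14)–(16) and the successive approximation in part (3) can be sketched numerically as follows. This is a minimal illustration and not the authors' implementation (their code is at the repository linked in the introduction). It assumes CRRA utility so that $u'(x) = x^{-\gamma}$, no innovation shocks (so $\beta$, $R$ and $Y$ are deterministic functions of the next-period state), a finite asset grid with linear interpolation, linear extrapolation above the grid, and bisection to solve (14). All function names and parameter values are hypothetical.

```python
import numpy as np

def interp_extrap(x, xp, fp):
    """Linear interpolation on the grid, linear extrapolation above it."""
    y = np.interp(x, xp, fp)
    slope = (fp[-1] - fp[-2]) / (xp[-1] - xp[-2])
    return np.where(x > xp[-1], fp[-1] + slope * (x - xp[-1]), y)

def time_iteration_step(c, a_grid, P, beta, R, Y, gamma, n_bisect=60):
    """One application of T to a policy c stored as an (n_a, n_z) array."""
    n_a, n_z = c.shape
    Tc = np.empty_like(c)
    for j in range(n_z):
        def euler_rhs(xi):
            # E_z[ beta' R' u'(c(R'(a - xi) + Y', Z')) ] at every grid point a
            total = np.zeros(n_a)
            for k in range(n_z):
                a_next = R[k] * (a_grid - xi) + Y[k]
                c_next = interp_extrap(a_next, a_grid, c[:, k])
                total += P[j, k] * beta[k] * R[k] * c_next ** (-gamma)
            return total

        # If u'(a) >= Euler term at xi = a, the max in (16) is u'(a) and Tc = a.
        binds = a_grid ** (-gamma) >= euler_rhs(a_grid)
        lo, hi = np.full(n_a, 1e-12), a_grid.copy()
        for _ in range(n_bisect):
            mid = 0.5 * (lo + hi)
            raise_xi = mid ** (-gamma) > euler_rhs(mid)  # root lies above mid
            lo = np.where(raise_xi, mid, lo)
            hi = np.where(raise_xi, hi, mid)
        Tc[:, j] = np.where(binds, a_grid, 0.5 * (lo + hi))
    return Tc

def rho(c, d, gamma):
    """Distance (13) restricted to the grid: sup |u'(c) - u'(d)|."""
    return float(np.max(np.abs(c ** (-gamma) - d ** (-gamma))))

def solve_policy(a_grid, P, beta, R, Y, gamma, tol=1e-5, max_iter=2000):
    """Iterate T from the initial guess c(a, z) = a until rho falls below tol."""
    c = np.tile(a_grid[:, None], (1, len(beta)))
    for _ in range(max_iter):
        c_new = time_iteration_step(c, a_grid, P, beta, R, Y, gamma)
        if rho(c_new, c, gamma) < tol:
            return c_new
        c = c_new
    return c

# Hypothetical two-state calibration (illustrative numbers only).
a_grid = np.linspace(1e-2, 30.0, 80)
P = np.array([[0.9, 0.1], [0.1, 0.9]])
beta = np.array([0.92, 0.96])
R = np.array([1.00, 1.02])
Y = np.array([0.5, 1.5])
c_star = solve_policy(a_grid, P, beta, R, Y, gamma=2.0)
```

The initial guess $c(a, z) = a$ is used because it satisfies (12) trivially. Convergence is measured in the marginal utility distance (13) rather than in levels of consumption, mirroring the metric in the theorem; the grid, interpolation and extrapolation choices are numerical conveniences of this sketch, not part of the theory.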
Part (3) shows that, under our conditions, the familiar time iteration algorithm is globally convergent, provided one starts with some policy in the candidate class $\mathscr{C}$.

2.5. Properties of Optimal Consumption.

In this section we study the properties of the optimal consumption function obtained in Theorem 2.2. Assumptions 2.1–2.3 are held to be true throughout. The following two propositions show the monotonicity of the consumption function, which is intuitive.

Proposition 2.3 (Monotonicity with respect to wealth). The optimal consumption and savings functions $c^*(a, z)$ and $i^*(a, z) := a - c^*(a, z)$ are increasing in $a$.

Proposition 2.4 (Monotonicity with respect to income). If $\{Y_{1t}\}$ and $\{Y_{2t}\}$ are two income processes satisfying $Y_{1t} \le Y_{2t}$ for all $t$ and $c_1^*$ and $c_2^*$ are the corresponding optimal consumption functions, then $c_1^* \le c_2^*$ pointwise on $S_0$.

Under further assumptions we can show that the optimal policy is concave and asymptotically linear with respect to the wealth level.

Proposition 2.5 (Concavity and asymptotic linearity of consumption function). If for each $z \in \mathsf{Z}$ and $c \in \mathscr{C}$ that is concave in its first argument,

$$x \mapsto (u')^{-1}\left[ E_z \hat\beta \hat R \, (u' \circ c)(\hat R x + \hat Y, \hat Z) \right] \text{ is concave on } \mathbb{R}_+, \tag{17}$$

then
(1) $a \mapsto c^*(a, z)$ is concave, and
(2) there exists $\alpha(z) \in [0, 1]$ such that $\lim_{a \to \infty} [c^*(a, z)/a] = \alpha(z)$.

Remark 2.1. Condition (17) imposes some concavity structure on utility. It holds for the constant relative risk aversion (CRRA) utility function

$$u(c) = \frac{c^{1-\gamma}}{1-\gamma} \text{ if } \gamma > 0, \; \gamma \ne 1, \quad \text{and} \quad u(c) = \log c \text{ if } \gamma = 1, \tag{18}$$

as shown in Appendix B.

Proposition 2.5 states that $c^*(a, z) \approx \alpha(z) a + b(z)$ for some function $b(z)$ when $a$ is large. This provides justification for linearly extrapolating the policy functions when computing them at high wealth levels.

Together, parts (1) and (2) of Proposition 2.5 imply the linear lower bound $c^*(a, z) \ge \alpha(z) a$, although they do not provide a concrete number for $\alpha(z)$. The following proposition establishes an explicit linear lower bound.
Proposition 2.6 (Linear lower bound on consumption). If there exists a nonnegative constant $\bar s$ such that

$$\bar s < 1 \quad \text{and} \quad E_z \hat\beta \hat R \, u'(\hat R \bar s a) \le u'(a) \text{ for all } (a, z) \in S_0, \tag{19}$$

then $c^*(a, z) \ge (1 - \bar s) a$ for all $(a, z) \in S_0$.

[Footnote: We adopt the convention $0 \cdot \infty = 0$, so condition (19) does not rule out the case $P\{R_t = 0 \mid Z_{t-1} = z\} > 0$. Indeed, as shown in the proofs, the conclusions still hold if we replace this condition by the weaker alternative $E_z \hat\beta \hat R \, u'[\hat R \bar s a + (1 - \bar s) \hat Y] \le u'(a)$ for all $(a, z) \in S_0$.]

The second inequality in (19) restricts marginal utility derived from transferring wealth to the next period and then consuming versus consuming wealth today. The value $\bar s$ can be clarified once primitives are specified, as the next example illustrates.

Example 2.2. Suppose that utility is CRRA, as in (18). If we now take

$$\bar s := \max_{z \in \mathsf{Z}} \left( E_z \hat\beta \hat R^{1-\gamma} \right)^{1/\gamma} \tag{20}$$

and $\bar s < 1$, then the conditions of Proposition 2.6 hold. In particular, the second inequality in (19) holds, as follows directly from the definition of $\bar s$ and $u'(x) = x^{-\gamma}$. In the case of Benhabib et al. (2015), where the discount rate is constant and returns are iid, the expression in (20) reduces to $\bar s := (\beta E R_t^{1-\gamma})^{1/\gamma}$. The requirement $\bar s < 1$ then reduces to $\beta E R_t^{1-\gamma} < 1$, which is one of their assumptions (see Example 2.1).

3. Stationarity, Ergodicity, and Tail Behavior

This section focuses on stationarity, ergodicity and tail behavior of wealth under the unique optimal policy $c^*$ obtained in Theorem 2.2. So that this policy exists, Assumptions 2.1–2.3 are always taken to be valid. We extend $c^*$ to $S$ by setting $c^*(0, z) = 0$ for all $z \in \mathsf{Z}$ and consider dynamics of $(a_t, Z_t)$ on $S := \mathbb{R}_+ \times \mathsf{Z}$, the law of motion for which is

$$a_{t+1} = R(Z_{t+1}, \zeta_{t+1}) \left[ a_t - c^*(a_t, Z_t) \right] + Y(Z_{t+1}, \eta_{t+1}), \tag{21a}$$
$$Z_{t+1} \sim P(Z_t, \cdot). \tag{21b}$$

Let $Q$ be the joint stochastic kernel of $(a_t, Z_t)$ on $S$. See Appendix A for this and related definitions.

3.1. Stationarity.
To obtain existence of a stationary distribution we need to restrict the asymptotic growth rate for asset returns $G_R$ defined in (3).

Assumption 3.1. There exists a constant $\bar s$ such that (19) holds and $\bar s G_R < 1$.

Below is one straightforward example of a setting where this holds, with more complex applications deferred to Sections 4–5.

Example 3.1. Assumption 3.1 holds in the setting of Benhabib et al. (2015). As shown in Example 2.2, with $\bar s := (\beta E R_t^{1-\gamma})^{1/\gamma}$ and the assumptions of Benhabib et al. (2015) in force, the conditions of (19) hold. Moreover, in their iid setting we have $G_R = E R_t$, so $\bar s G_R < 1$ reduces to $(\beta E R_t^{1-\gamma})^{1/\gamma} E R_t < 1$. This is one of their conditions, as discussed in Example 2.1.

By Proposition 2.6, the value $\bar s$ in Assumption 3.1 is an upper bound on the rate of savings. $G_R$ is an asymptotic growth rate for each unit of savings invested. If the product of these is less than one, then probability mass contained in the wealth distribution will not drift to $+\infty$, which allows us to obtain the following result.

[Footnote: Assumption 3.1 is weaker than any restriction implying wealth is bounded from above, a common device for compactifying the state space and thereby obtaining a stationary distribution. Indeed, under many specifications of $\{Y_t\}$ and $\{R_t\}$ that fall within our framework, wealth of a given household can and will, over an infinite horizon, exceed any finite bound with probability one. See, for example, Benhabib et al. (2015), Proposition 6.]

Theorem 3.1 (Existence of a stationary distribution). If Assumption 3.1 holds, then $Q$ admits at least one stationary distribution on $S$.

Stationarity of the form obtained in Theorem 3.1 is required to establish existence of stationary recursive equilibria in heterogeneous agent models with idiosyncratic risk, such as Huggett (1993) or Aiyagari (1994).
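Assumption 3.1 can be checked numerically once primitives are specified. The sketch below assumes CRRA utility, so that $\bar s$ can be taken from (20), and assumes that $\beta$ and $R$ are deterministic functions of the Markov state (no innovation shocks). It also assumes that $G_R$ can be computed as the spectral radius of the matrix with entries $P(z, \hat z) R(\hat z)$, by the same matrix argument the paper states for $G_\beta$. The function names and numbers are hypothetical.

```python
import numpy as np

def spectral_radius(M):
    """Largest eigenvalue modulus of a square matrix."""
    return float(np.max(np.abs(np.linalg.eigvals(M))))

def check_assumption_3_1(P, beta, R, gamma):
    """Return (s_bar, G_R, holds) for a finite-state, CRRA specification.

    P     : (N, N) transition matrix of {Z_t}
    beta  : (N,) discount factor in each state (assumed deterministic given Z)
    R     : (N,) gross return in each state (assumed deterministic given Z)
    gamma : CRRA coefficient
    """
    P = np.asarray(P, dtype=float)
    beta = np.asarray(beta, dtype=float)
    R = np.asarray(R, dtype=float)
    # (20): s_bar = max_z ( E_z[ beta' * R'^(1 - gamma) ] )^(1 / gamma)
    s_bar = float(np.max((P @ (beta * R ** (1.0 - gamma))) ** (1.0 / gamma)))
    # G_R as the spectral radius of L_R(z, z') = P(z, z') * R(z')
    G_R = spectral_radius(P * R)
    holds = bool(s_bar < 1.0 and s_bar * G_R < 1.0)
    return s_bar, G_R, holds

# Hypothetical example with persistent high- and low-return states.
P = np.array([[0.9, 0.1],
              [0.1, 0.9]])
beta = np.array([0.92, 0.96])
R = np.array([1.00, 1.06])
s_bar, G_R, holds = check_assumption_3_1(P, beta, R, gamma=2.0)
print(f"s_bar = {s_bar:.4f}, G_R = {G_R:.4f}, s_bar * G_R < 1: {holds}")
```

With constant $\beta$ and $R$ the check collapses to the condition $(\beta R^{1-\gamma})^{1/\gamma} R < 1$, which is the constant-return special case of the second inequality in (7).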
For models with aggregate shocks, such as Krusell and Smith (1998), a fully specified recursive equilibrium requires that households take the wealth distribution as one component of the state in their savings problem, and that stationarity holds for the entire joint distribution (defined over a product space encompassing both the wealth distribution and the exogenous state process). These problems fall outside the scope of Theorem 3.1, since $\{Z_t\}$ is finite-valued. For a careful treatment of stationary recursive equilibrium in Krusell–Smith type models, see Cao (2020).

3.2. Ergodicity. While Assumption 3.1 implies existence of a stationary distribution, it is not in general sufficient for uniqueness or stability. For these additional properties to hold, we must impose sufficient mixing. In doing so, we consider the following two cases:

(Y1) The support of $\{Y_t\}$ is finite.
(Y2) The process $\{Y_t\}$ admits a density representation.

Condition (Y2) means that there exists a function $f$ from $\mathbb R_+\times\mathsf Z$ to $\mathbb R_+$ such that
$$\mathbb P\{Y_t\in A \mid Z_t = z\} = \int_A f(y\mid z)\,dy \tag{22}$$
for all Borel sets $A\subset\mathbb R_+$ and all $z$ in $\mathsf Z$.

Assumption 3.2. There exists a $\bar z$ in $\mathsf Z$ such that $P(\bar z,\bar z) > 0$. Moreover, with $y_\ell \ge 0$ defined as the greatest lower bound of the support of $\{Y_t\}$, either
(Y1) holds and $\mathbb P\{Y_t = y_\ell \mid Z_t = \bar z\} > 0$, or
(Y2) holds and there exists a $\delta > y_\ell$ such that $f(\cdot\mid\bar z) > 0$ on $(y_\ell,\delta)$.

Assumption 3.2 requires that there is a positive probability of receiving low labor income at some relatively persistent state of the world $\bar z$. This is a mixing condition that enforces social mobility. The reason is that $\{Z_t\}$ is already assumed to be irreducible, so $\bar z$ is eventually visited by each household. For any such household, there is a positive probability of low labor income over a long period. Wealth then declines. In other words, currently rich households or dynasties will not be rich forever. This guarantees sufficient social mobility between rich and poor, generating ergodicity.

To state our uniqueness and stability results, let $Q^t$ be the $t$-step stochastic kernel, let $\|\cdot\|_{TV}$ be the total variation norm and let $V(a,z) := a + m_V$, where $m_V$ is a constant to be defined in the proof. For any integrable real-valued function $h$ on $S$, let
$$\bar h(a,z) := h(a,z) - \mathbb E h(a_t,Z_t) \quad\text{and}\quad \sigma^2 := \mathbb E\,\bar h^2(a_0,Z_0) + 2\sum_{t=1}^{\infty}\mathbb E\,\bar h(a_0,Z_0)\,\bar h(a_t,Z_t),$$
where, here and in the theorem below, $\mathbb E$ indicates expectation under stationarity.

Theorem 3.2 (Uniqueness, stability, ergodicity and mixing). If Assumptions 3.1 and 3.2 hold, then

(1) the stationary distribution $\psi_\infty$ of $Q$ is unique and there exist constants $\lambda < 1$ and $M < \infty$ such that
$$\|Q^t((a,z),\cdot) - \psi_\infty\|_{TV} \le \lambda^t M V(a,z) \quad\text{for all } (a,z)\in S.$$
(2) For all $(a,z)\in S$ and real-valued functions $h$ on $S$ such that $\mathbb E|h(a_t,Z_t)| < \infty$,
$$\mathbb P_{a,z}\Big\{\lim_{T\to\infty}\frac{1}{T}\sum_{t=1}^{T}h(a_t,Z_t) = \mathbb E h(a_t,Z_t)\Big\} = 1.$$
(3) $Q$ is $V$-geometrically mixing. Moreover, if $\sigma^2 > 0$ and $h^2/V$ is bounded,
$$\frac{1}{\sigma\sqrt{T}}\sum_{t=1}^{T}\bar h(a_t,Z_t) \xrightarrow{d} N(0,1) \quad\text{as } T\to\infty.$$

Part 1 of Theorem 3.2 states that the stationary distribution is unique and asymptotically attracting at a geometric rate. Part 2 states that the state process is ergodic, and hence long-run sample moments for individual households coincide with cross-sectional moments. The notion of mixing discussed in Part 3 is defined in the appendix. It states that social mobility holds asymptotically and mixing occurs at a geometric rate, although the rate may be arbitrarily slow. This mixing is enough to provide a Central Limit Theorem for the state process, which is the second claim in Part 3.

3.3. Tail Behavior. Having established the stationarity and ergodicity of wealth, we now study the tail behavior of the wealth distribution. We show that the wealth distribution is either bounded or (unbounded and) heavy-tailed under mild conditions.
To prove this result we introduce the following assumption.

Assumption 3.3. The assumptions of Proposition 2.5 are satisfied, so the optimal policy $a\mapsto c^*(a,z)$ is concave and asymptotically linear: $\lim_{a\to\infty}c^*(a,z)/a = \alpha(z)\in[0,1]$. Furthermore, there exists $\bar z\in\mathsf Z$ such that $P(\bar z,\bar z) > 0$ and
$$\mathbb P_{\bar z}\{R(\bar z,\hat\zeta)(1-\alpha(\bar z)) > 1\} > 0. \tag{23}$$

Remark 3.1. Condition (23) implies that wealth grows with nonzero probability when it is large. Indeed, using the law of motion (21a) and noting that $Y_{t+1}\ge 0$, if $Z_t = Z_{t+1} = \bar z$, then by (23) we have
$$\frac{a_{t+1}}{a_t} \ge R(\bar z,\zeta_{t+1})\,[1 - c^*(a_t,\bar z)/a_t] > 1$$
with positive probability if $a_t$ is large enough.

To state our result on tail behavior, we introduce the following notation. For any nonnegative function $A(z,\hat z,\hat\zeta)$, define the $\mathsf Z\times\mathsf Z$ matrix-valued function $M_A$ by
$$(M_A(s))(z,\hat z) = \mathbb E_{z,\hat z}\,A(z,\hat z,\hat\zeta)^s. \tag{24}$$
Elements of $M_A(s)$ are conditional moment generating functions of $\log A$. In the statement below, $\odot$ denotes the Hadamard (entry-wise) product, and $r(\cdot)$ returns the spectral radius of a matrix. Also $a_\infty$ is a random variable with distribution $\psi_\infty(\cdot,\mathsf Z)$.

Theorem 3.3 (Tail behavior). Let Assumptions 3.1–3.3 hold and define
$$G(z,\hat z,\hat\zeta) = R(\hat z,\hat\zeta)(1-\alpha(z)), \tag{25a}$$
$$A(z,\hat z,\hat\zeta) = G(z,\hat z,\hat\zeta)\,\mathbb 1\{G(z,\hat z,\hat\zeta) > 1\}, \quad\text{and} \tag{25b}$$
$$\lambda(s) = r(P\odot M_A(s)). \tag{25c}$$
Then $\lambda$ is convex in $s\ge 0$. Assume that there exists $s > 0$ in the interior of the domain of $\lambda$ such that $1 < \lambda(s) < \infty$ and let
$$\kappa := \inf\{s > 0 \mid \lambda(s) > 1\}. \tag{26}$$
If $a_\infty$ has unbounded support, then it is heavy-tailed. In particular, for any $\varepsilon > 0$,
$$\liminf_{a\to\infty}\,a^{\kappa+\varepsilon}\,\mathbb P\{a_\infty\ge a\} > 0. \tag{27}$$

Remark 3.2. The assumption $1 < \lambda(s) < \infty$ for some $s > 0$ is weak. Because the $(\bar z,\bar z)$-th element of $P\odot M_A(s)$ is
$$P(\bar z,\bar z)\,\mathbb E_{\bar z,\bar z}\,G(\bar z,\bar z,\hat\zeta)^s\,\mathbb 1\{G(\bar z,\bar z,\hat\zeta) > 1\},$$
by the definition of $G$ in (25a) and condition (23), we always have $\lambda(s)\to\infty$ as $s\to\infty$. Hence there exists $s > 0$ such that $\lambda(s)\in(1,\infty)$ if, for example, $\zeta$ has a compact support.
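The objects in (24)–(26) can be evaluated numerically once the shock distribution is finite. The following is a minimal sketch under stated assumptions: the shock $\hat\zeta$ takes finitely many values with probabilities `probs` that do not depend on the state, the return array `R` and the asymptotic consumption rates `alpha` are supplied by the user (in practice $\alpha(z)$ would come from solving the household problem), and $\lambda(s) < 1$ for small $s$ so that the convex function $\lambda$ crosses one exactly once. The function names and the bisection scheme are illustrative choices, not the authors' code.

```python
import numpy as np


def lam(s, P, R, probs, alpha):
    """Evaluate lambda(s) = r(P * M_A(s)) from (24)-(25c), entrywise product.

    P[i, j]  : transition probability from state i to state j
    R[j, k]  : gross return in next-period state j under shock value k
    probs[k] : probability of shock value k (assumed state-independent)
    alpha[i] : asymptotic consumption rate alpha(z_i), taken as given
    """
    G = (1.0 - alpha)[:, None, None] * R[None, :, :]  # G(z, z_hat, zeta), (25a)
    A = np.where(G > 1.0, G, 0.0)                     # G * 1{G > 1}, (25b)
    M = (A ** s) @ probs                              # E A^s over the shock, (24)
    return np.max(np.abs(np.linalg.eigvals(P * M)))   # spectral radius, (25c)


def tail_exponent(P, R, probs, alpha, s_max=1e3, tol=1e-10):
    """Bisection for kappa = inf{s > 0 : lambda(s) > 1} as in (26)."""
    f = lambda s: lam(s, P, R, probs, alpha) - 1.0
    lo, hi = 1e-9, 1.0
    if f(lo) >= 0:
        raise ValueError("lambda(s) >= 1 near s = 0; no interior crossing")
    while f(hi) <= 0:          # expand the bracket until lambda(hi) > 1
        hi *= 2.0
        if hi > s_max:
            raise ValueError("lambda(s) does not exceed 1 on the search range")
    while hi - lo > tol:       # invariant: f(lo) <= 0 < f(hi)
        mid = 0.5 * (lo + hi)
        if f(mid) <= 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical two-state example
P = np.array([[0.9, 0.1], [0.2, 0.8]])
R = np.array([[0.9, 1.1], [0.8, 1.6]])   # rows: next state, columns: shock value
probs = np.array([0.5, 0.5])
alpha = np.array([0.05, 0.05])
print("kappa estimate:", tail_exponent(P, R, probs, alpha))
```

A smaller value of the computed exponent corresponds to a heavier lower bound on the tail in (27).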
Condition (27) implies that for any $\varepsilon > 0$, there exists a constant $C(\varepsilon) > 0$ such that $\mathbb P\{a_\infty\ge a\} \ge C(\varepsilon)\,a^{-\kappa-\varepsilon}$ for large enough $a$, so the upper tail of the wealth distribution is at least Pareto.

Remark 3.3. Toda (2019) constructs an example of a Huggett (1993) economy with Pareto-tailed wealth distribution when discount factors are random. Theorem 3.3 is significantly more general as we allow for stochastic returns and income. Stachurski and Toda (2019) prove that with constant discount factor, constant asset return, and light-tailed income, the wealth distribution is always light-tailed. Theorem 3.3 shows that sufficient heterogeneity in discount factor or returns generates heavy tails.

Example 3.2. The CRRA-iid setting of Benhabib et al. (2015) satisfies the assumptions of Theorem 3.3. When utility is CRRA, by Proposition 5 of Benhabib et al. (2015), condition (23) holds if $R(\bar z,\hat\zeta) > 1/\bar s$ with positive probability, where $\bar s$ is given in Example 2.2. In the iid case, this condition reduces to $\mathbb P\{(\beta\,\mathbb E R_t^{1-\gamma})^{1/\gamma}R_t > 1\} > 0$, which holds under the conditions of Benhabib et al. (2015). Thus, Assumption 3.3 holds. The existence of $s > 0$ with $\lambda(s)\in(1,\infty)$ follows from Remark 3.2 and the assumption that $R_t$ has a compact support.

4. Testing the Growth Conditions

The three key conditions in the paper are the restrictions on the growth rates $G_\beta$, $G_{\beta R}$ and $G_R$, with the first two required for optimality and the last for stationarity (see Assumptions 2.1, 2.2 and 3.1 respectively). In this section we explore the restrictions implied by these conditions. We begin with the following result, which yields a straightforward method for computing these growth rates.

Lemma 4.1 (Long-run growth rates and spectral radii). Let $\varphi_t = \varphi(Z_t,\xi_t)$, where $\varphi$ is a nonnegative measurable function and $\{\xi_t\}$ is an iid sequence with marginal distribution $\pi$.
In this setting we have
$$G_\varphi = r(L_\varphi), \quad\text{where}\quad G_\varphi := \lim_{n\to\infty}\Big(\mathbb E\prod_{t=1}^{n}\varphi_t\Big)^{1/n} \tag{28}$$
and $r(L_\varphi)$ is the spectral radius of the matrix defined by
$$L_\varphi(z,\hat z) = P(z,\hat z)\int\varphi(\hat z,\hat\xi)\,\pi(d\hat\xi). \tag{29}$$

To see that the condition in Example 3.2 holds, note that Benhabib et al. (2015) assume that $\mathbb P\{\beta R_t > 1\} > 0$, so it suffices to show that $(\beta\,\mathbb E R_t^{1-\gamma})^{1/\gamma}\ge\beta$ or, equivalently, $\mathbb E(\beta R_t)^{1-\gamma}\ge 1$. By Jensen's inequality and their restriction $\gamma\ge 1$, the last bound is true whenever $(\mathbb E\,\beta R_t)^{1-\gamma}\ge 1$. But this must hold because, under their conditions, we have $\beta\,\mathbb E R_t < 1$, as shown in Example 2.1.

The matrix $L_\varphi$ is expressed as a function on $\mathsf Z\times\mathsf Z$ in (29) but can be represented in traditional matrix notation by enumerating $\mathsf Z$.

What factors determine the long-run average growth rates embedded in our assumptions, such as $G_\beta$ or $G_R$? Lemma 4.1 tells us how to compute these values for a given specification of dynamics, but how should we understand them intuitively and what factors determine their size?

To address these questions, let us consider an AR(1) discount factor process, which has been adopted in several recent quantitative studies (see, e.g., Hubmer et al. (2018) or Hills and Nakata (2018)). In particular, suppose that the state process follows a discretized version of
$$Z_{t+1} = (1-\rho)\mu + \rho Z_t + (1-\rho^2)^{1/2}\sigma\,\upsilon_{t+1}, \qquad \{\upsilon_t\}\overset{\text{iid}}{\sim}N(0,1), \tag{30}$$
and $\beta_t = Z_t$. (The discretization implies that $\beta_t$ is always positive.) To simplify interpretation, the process (30) is structured so that the stationary distribution of $\{Z_t\}$ is $N(\mu,\sigma^2)$. We use Rouwenhorst (1995)'s method to discretize $\{Z_t\}$ and then calculate $G_\beta$ using Lemma 4.1, studying how $G_\beta$ is affected by the parameters in (30).

Since $\beta_t = Z_t$ for all $t$, the structure of (30) implies that $\mu$ is the long-run unconditional mean of $\{\beta_t\}$. It can therefore be set to a standard calibrated value for the discount factor, such as 0.99 from Krusell and Smith (1998). What we wish to understand is how the remaining parameters $\rho$ and $\sigma$ affect the value of $G_\beta$.
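The computation described above can be sketched in a few lines. The code below discretizes (30) with a standard implementation of the Rouwenhorst method and then applies Lemma 4.1 with $\varphi_t = \beta_t = Z_t$, so that $G_\beta$ is the spectral radius of $P\,\mathrm{diag}(z_1,\dots,z_N)$. The number of grid points `n=15`, the function names and the parameter values in the usage line are assumptions made for illustration; the paper does not report its implementation details.

```python
import numpy as np


def rouwenhorst(n, rho):
    """Rouwenhorst transition matrix for an n-state chain with persistence rho."""
    p = (1.0 + rho) / 2.0
    P = np.array([[p, 1.0 - p], [1.0 - p, p]])
    for m in range(3, n + 1):
        Q = np.zeros((m, m))
        Q[:-1, :-1] += p * P
        Q[:-1, 1:] += (1.0 - p) * P
        Q[1:, :-1] += (1.0 - p) * P
        Q[1:, 1:] += p * P
        Q[1:-1, :] /= 2.0   # interior rows receive two contributions
        P = Q
    return P


def growth_rate_beta(mu, rho, sigma, n=15):
    """Approximate G_beta = r(P D_beta) when beta_t = Z_t follows (30)."""
    P = rouwenhorst(n, rho)
    # The stationary standard deviation of (30) is sigma, so the usual
    # Rouwenhorst grid spans mu +/- sqrt(n - 1) * sigma.
    half_width = np.sqrt(n - 1) * sigma
    z = np.linspace(mu - half_width, mu + half_width, n)
    if z.min() <= 0:
        raise ValueError("grid must be strictly positive when beta_t = Z_t")
    L = P * z   # same as P @ diag(z): column j is scaled by z[j]
    return np.max(np.abs(np.linalg.eigvals(L)))


# Hypothetical parameter values inside the range plotted in Figure 1
print(growth_rate_beta(mu=0.99, rho=0.98, sigma=0.012))
```

With $\rho = 0$ the rows of the transition matrix coincide and the routine returns $\mu$, the iid benchmark; raising $\rho$ or $\sigma$ pushes the computed value upward, in line with the discussion of Figure 1.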
While no closed form expression is available in this case, Figure 1 sheds some light by providing a contour plot of $G_\beta$ over a set of $(\rho,\sigma)$ pairs. The figure shows that $G_\beta$ grows with both the persistence term $\rho$ and volatility term $\sigma$. In particular, the condition $G_\beta < 1$ fails when the persistence and volatility of the discount factor process are sufficiently high. This is because $G_\beta$ is the limit of $(\mathbb E\prod_{t=1}^{n}\beta_t)^{1/n}$ and, for positive random variables, sequences of large outcomes have a strong compounding effect on their product. High volatility and high persistence reinforce this effect.

This discussion has focused on $G_\beta$ but similar intuition applies to both $G_R$ and $G_{\beta R}$. If $\beta_t$ and $R_t$ are both increasing functions of the state process, then these asymptotic growth rates also increase with greater persistence and volatility in the state process, as well as higher unconditional mean. The next section further illustrates these points.

Specifically, if $\mathsf Z := \{z_1,\dots,z_N\}$, then $L_\varphi = PD_\varphi$ where $P$ is, as before, the transition matrix for the exogenous state, and $D_\varphi := \mathrm{diag}(\mathbb E_{z_1}\varphi,\dots,\mathbb E_{z_N}\varphi)$ when $\mathbb E_z\varphi := \mathbb E\varphi(z,\xi)$. In what follows, $D_\beta$, $D_R$ and $D_{\beta R}$ are defined analogously to $D_\varphi$.

Figure 1. Contour plot of $G_\beta$ under AR(1) discounting

5. Application: Stochastic Volatility and Mean Persistence

We showed in Examples 2.1, 2.2 and 3.1 that, in the setting of Benhabib et al. (2015), where the discount factor is constant and returns and labor income are iid, Assumptions 2.1–2.3 and Assumption 3.1 are all satisfied. Hence, by Theorems 2.2 and 3.1, the household optimization problem has a unique optimal policy and the wealth process under this policy has a stationary solution.
If, in addition, the support of $Y_t$ is finite or $Y_t$ has a positive density, say, then the conditions of Theorem 3.2 also hold and the stationary solution is ergodic, geometrically mixing and its time series averages are asymptotically normal.

Let us now bring the model closer to the data by relaxing the iid restrictions on financial and non-financial returns, introducing both mean persistence and time varying volatility in returns on assets. In particular, we set
$$\log R_t = \mu_t + \sigma_t\zeta_t, \tag{31}$$
where $\{\zeta_t\}$ is iid and standard normal and $\{\mu_t\}$ and $\{\sigma_t\}$ are finite-state Markov chains, discretized from
$$\mu_t = (1-\rho_\mu)\bar\mu + \rho_\mu\mu_{t-1} + \delta_\mu\upsilon^\mu_t \quad\text{and}\quad \log\sigma_t = (1-\rho_\sigma)\bar\sigma + \rho_\sigma\log\sigma_{t-1} + \delta_\sigma\upsilon^\sigma_t.$$
Innovations are iid and standard normal. Using the data in Fagereng et al. (2016b) on Norwegian financial returns over 1993–2003, we estimate these AR(1) models to obtain $\bar\mu = 0.0281$, $\rho_\mu = 0.5722$, $\delta_\mu = 0.0067$, $\bar\sigma = -3.2556$, $\rho_\sigma = 0.2895$ and $\delta_\sigma = 0.1896$. Based on this calibration, the stationary mean and standard deviation of $\{R_t\}$ are around 1.03 and 4%, respectively.

The importance of these features for wealth dynamics was highlighted in Fagereng et al. (2016a).

To distinguish the effects of stochastic volatility and mean persistence, we consider two subsidiary models. The first reduces $\{\mu_t\}$ to its stationary mean $\bar\mu$, while the second reduces $\{\sigma_t\}$ to its stationary mean $\tilde\sigma := e^{\bar\sigma + \delta_\sigma^2/(2(1-\rho_\sigma^2))}$. In summary,
$$\log R_t = \bar\mu + \sigma_t\zeta_t \qquad\text{(Model I)}$$
$$\log R_t = \mu_t + \tilde\sigma\zeta_t \qquad\text{(Model II)}$$
We set $\beta = 0.95$ and $\gamma = 1.5$. To test the stability properties of Model I, we explore a neighborhood of the calibrated $(\rho_\sigma,\delta_\sigma)$ values, while in Model II, we do likewise for $(\rho_\mu,\delta_\mu)$ pairs. In each scenario, other parameters are fixed to the benchmark.

The results are shown in Figures 2 and 3. In part (a) of each figure, we see that $G_{\beta R}$ is increasing in the persistence and volatility parameters of the state process. The intuition behind this feature was explained in Section 4 for the case of $G_\beta$ and is similar here.
(Note that $G_{\beta R} = \beta G_R$ in the present case, since $\beta_t$ is a constant, so $G_R$ has the same shape as $G_{\beta R}$ in terms of contours.) The dots in the figures show that $G_{\beta R} < 1$ at the estimated parameter values.

Part (b) of each figure shows the set of parameters under which the model is globally stable and ergodic. The stability threshold is the boundary of the set of parameter pairs that produce $\max\{G_{\beta R},\ \bar s,\ \bar s\,G_R\} < 1$, where $\bar s$ is given by (20). For such pairs, Assumptions 2.2 and 3.1 both hold, so the conditions of Theorems 3.1–3.2 are satisfied. (We are continuing to suppose that the support of $Y_t$ is finite or $Y_t$ has a positive density, so that Assumption 3.2 holds. Assumptions 2.1 and 2.3 are always valid in the current setting.) Observe that the estimated parameter values (dot points) lie inside the stable set.

Figure 2. Stability tests for Model I. (a) Contour plot of $G_{\beta R}$. (b) Range and threshold of stability. The dot marks the estimated pair (0.2895, 0.1896).

Figure 3. Stability tests for Model II. (a) Contour plot of $G_{\beta R}$. (b) Range and threshold of stability. The dot marks the estimated pair (0.5722, 0.0067).

6. Conclusion

We studied an updated version of the income fluctuation problem, the "common ancestor" of modern macroeconomic theory (Ljungqvist and Sargent (2012), p. 3). Working in a setting where returns on financial assets, non-financial income and impatience are all state dependent and fluctuate over time, we obtained conditions under which the household savings problem has a unique solution that can be computed by successive approximations and the wealth process under the optimal savings policy has a unique stationary distribution with Pareto right tail.
We also obtained conditions under which wealth is ergodic and exhibits geometric mixing and asymptotic normality. We investigated the nature of our conditions and provided methods for testing them in applications. While our work was motivated by the desire to better understand the joint distribution of income and wealth, the income fluctuation problem also has applications in asset pricing, life-cycle choice, fiscal policy, monetary policy, optimal taxation, and social security. The ideas contained in this paper should be helpful for those fields after suitable modifications or extensions.

Appendix A. Preliminaries

Given a topological space $\mathsf T$, let $\mathscr B(\mathsf T)$ be the Borel $\sigma$-algebra and $\mathscr P(\mathsf T)$ be the probability measures on $\mathscr B(\mathsf T)$. A stochastic kernel $\Pi$ on $\mathsf T$ is a map $\Pi:\mathsf T\times\mathscr B(\mathsf T)\to[0,1]$ such that $x\mapsto\Pi(x,B)$ is $\mathscr B(\mathsf T)$-measurable for each $B\in\mathscr B(\mathsf T)$ and $B\mapsto\Pi(x,B)$ is a probability measure on $\mathscr B(\mathsf T)$ for each $x\in\mathsf T$. For all $t\in\mathbb N$, $x,y\in\mathsf T$ and $B\in\mathscr B(\mathsf T)$, we define $\Pi^1 := \Pi$ and $\Pi^t(x,B) := \int\Pi^{t-1}(y,B)\,\Pi(x,dy)$. Furthermore, for all $\mu\in\mathscr P(\mathsf T)$, let $(\mu\Pi^t)(B) := \int\Pi^t(x,B)\,\mu(dx)$. $\Pi$ is called Feller if $x\mapsto\int h(y)\,\Pi(x,dy)$ is continuous on $\mathsf T$ whenever $h$ is bounded and continuous on $\mathsf T$. We call $\psi\in\mathscr P(\mathsf T)$ stationary for $\Pi$ if $\psi\Pi = \psi$.

A sequence $\{\mu_n\}\subset\mathscr P(\mathsf T)$ is called tight if, for all $\varepsilon > 0$, there exists a compact $K\subset\mathsf T$ such that $\mu_n(\mathsf T\setminus K)\le\varepsilon$ for all $n$. A stochastic kernel $\Pi$ is called bounded in probability if the sequence $\{\Pi^t(x,\cdot)\}_{t\ge 0}$ is tight for all $x\in\mathsf T$. Given $\mu\in\mathscr P(\mathsf T)$, we define the total variation norm $\|\mu\|_{TV} := \sup_{g:\,|g|\le 1}\left|\int g\,d\mu\right|$. Given any measurable map $V:\mathsf T\to[1,\infty)$, we say that $\Pi$ is $V$-geometrically mixing if there exist constants $M < \infty$ and $\lambda < 1$ such that, for all $x\in\mathsf T$ and $t\ge 0$, the corresponding Markov process $\{X_t\}$ satisfies
$$\sup_{k\ge 0;\ h^2,\,g^2\le V}\big|\mathbb E_x\,g(X_t)h(X_{t+k}) - [\mathbb E_x\,g(X_t)]\,[\mathbb E_x\,h(X_{t+k})]\big| \le \lambda^t M V(x).$$

Below we use $(\Omega,\mathscr F,\mathbb P)$ to denote a fixed probability space on which all random variables are defined. $\mathbb E$ is expectations with respect to $\mathbb P$.
The state process $\{Z_t\}$ and the innovation processes $\{\varepsilon_t\}$, $\{\zeta_t\}$ and $\{\eta_t\}$ introduced in (5) live on this space. In what follows, $\{Z_t\}$ is a stationary version of the chain, where $Z_0$ is drawn from its unique stationary distribution, henceforth denoted $\pi_Z$. The marginal distributions of the innovations are denoted by $\pi_\varepsilon$, $\pi_\zeta$ and $\pi_\eta$ respectively. We let $\{\mathscr F_t\}$ be the natural filtration generated by $\{Z_t\}$ and the three innovation processes. $\mathbb P_z$ conditions on $Z_0 = z$ and $\mathbb E_z$ is expectation under $\mathbb P_z$.

We first prove Lemma 4.1, since its implications will be used immediately below. In the proof, we consider the matrix $L_\varphi$ as a linear operator on $\mathbb R^{\mathsf Z}$ and identify vectors in $\mathbb R^{\mathsf Z}$ with real-valued functions on $\mathsf Z$.

Proof of Lemma 4.1. A proof by induction confirms that, for any function $h\in\mathbb R^{\mathsf Z}$,
$$L_\varphi^n h(z) = \mathbb E_z\prod_{t=1}^{n}\varphi_t\,h(Z_n), \tag{32}$$
where $L_\varphi^n$ is the $n$-th composition of the operator $L_\varphi$ with itself (or, equivalently, the $n$-th power of the matrix $L_\varphi$). The positivity of $L_\varphi$ and Theorem 9.1 of Krasnosel'skii et al. (2012) imply that $r(L_\varphi) = \lim_{n\to\infty}\|L_\varphi^n h\|^{1/n}$ when $\|\cdot\|$ is any norm on $\mathbb R^{\mathsf Z}$ and $h$ is everywhere positive on $\mathsf Z$. With $h\equiv 1$ and $\|f\| = \mathbb E|f(Z_0)|$, this becomes
$$r(L_\varphi) = \lim_{n\to\infty}\big(\mathbb E\,L_\varphi^n\mathbb 1(Z_0)\big)^{1/n} = \lim_{n\to\infty}\Big(\mathbb E\,\mathbb E_{Z_0}\prod_{t=1}^{n}\varphi_t\Big)^{1/n} = \lim_{n\to\infty}\Big(\mathbb E\prod_{t=1}^{n}\varphi_t\Big)^{1/n}, \tag{33}$$
where the second equality is due to (32) and $h\equiv 1$ and the third is by the law of iterated expectations.

Lemma A.1. Let $\{\varphi_t\}$ and $G_\varphi$ be as defined in Lemma 4.1. If $G_\varphi < 1$, then there exists an $N$ in $\mathbb N$ and a $\delta < 1$ such that $\max_{z\in\mathsf Z}\mathbb E_z\prod_{t=1}^{n}\varphi_t < \delta^n$ whenever $n\ge N$.

Proof. Recalling from the proof of Lemma 4.1 that $r(L_\varphi) = \lim_{n\to\infty}\|L_\varphi^n h\|^{1/n}$ when $\|\cdot\|$ is any norm on $\mathbb R^{\mathsf Z}$ and $h$ is everywhere positive on $\mathsf Z$, we can again take $h\equiv 1$ but now switch to $\|f\| = \max_{z\in\mathsf Z}|f(z)|$, so that (33) becomes
$$r(L_\varphi) = \lim_{n\to\infty}\Big(\max_{z\in\mathsf Z}L_\varphi^n\mathbb 1(z)\Big)^{1/n} = \lim_{n\to\infty}\Big(\max_{z\in\mathsf Z}\mathbb E_z\prod_{t=1}^{n}\varphi_t\Big)^{1/n}. \tag{34}$$
Since $r(L_\varphi) = G_\varphi$ and $G_\varphi < 1$, the claim in Lemma A.1 now follows.

Appendix B. Proof of Section 2 Results

Proof of Proposition 2.1. Pick any $a\ge 0$ and $z\in\mathsf Z$.
Since the path $c_t = Y_t$ for all $t$ is dominated by a feasible consumption path, monotonicity of $u$ and the law of iterated expectations give
$$\max\,\mathbb E_{a,z}\sum_{t=0}^{\infty}\prod_{i=0}^{t}\beta_i\,u(c_t) \ge \mathbb E_z\sum_{t=0}^{\infty}\prod_{i=0}^{t}\beta_i\,u(Y_t) = \sum_{t=0}^{\infty}\mathbb E_z\prod_{i=0}^{t}\beta_i\,h(Z_t),$$
where $h(Z_t) := \mathbb E_{Z_t}u(Y)$ and the monotone convergence theorem has been employed to pass the expectation through the sum. In view of (32) and $\beta_0 = 1$, we then have
$$\max\,\mathbb E_{a,z}\sum_{t=0}^{\infty}\prod_{i=0}^{t}\beta_i\,u(c_t) \ge \sum_{t=0}^{\infty}L_\beta^t h(z). \tag{35}$$
By the assumed almost sure positivity of $\beta_t$ and the irreducibility of $P$, the matrix $L_\beta$ is irreducible. Hence, by the Perron–Frobenius theorem, we can choose an everywhere positive eigenfunction $e$ such that $L_\beta e = r(L_\beta)e$. By the everywhere positivity of $u(Y)$, the function $h$ is everywhere positive on $\mathsf Z$, and hence we can choose $\alpha > 0$ such that $e_\alpha := \alpha e$ is less than $h$ pointwise on $\mathsf Z$. We then have
$$\sum_{t=0}^{\infty}L_\beta^t h(z) \ge \sum_{t=0}^{\infty}L_\beta^t e_\alpha(z) = \alpha\sum_{t=0}^{\infty}r(L_\beta)^t e(z).$$
By Lemma 4.1 we know that $r(L_\beta)\ge 1$, and since $\alpha$ and $e$ are positive, this expression is infinite. Returning to (35), we see that the value function is infinite at our arbitrarily chosen pair $(a,z)$.

For the rest of this section we suppose that Assumptions 2.1–2.3 hold.

Lemma B.1. $M_1 := \max_{z\in\mathsf Z}\mathbb E_z\sum_{t=0}^{\infty}\prod_{i=1}^{t}\beta_i$ and $M_2 := \max_{z\in\mathsf Z}\mathbb E_z\sum_{t=0}^{\infty}\prod_{i=1}^{t}\beta_i R_i$ are finite, as are the constants $M_3 = \max_{z\in\mathsf Z}\mathbb E_z Y$ and $M_4 = \max_{z\in\mathsf Z}\mathbb E_z u'(Y)$.

Proof. That $M_1$ and $M_2$ are finite follows directly from Lemma A.1, with $\varphi_t = \beta_t$ and $\varphi_t = \beta_t R_t$ respectively. Regarding $M_3$, Assumption 2.3 states that $\mathbb E Y < \infty$. By the law of iterated expectations, we can write this as $\sum_{z\in\mathsf Z}\mathbb E_z Y\,\pi_Z(z) < \infty$. As $\{Z_t\}$ is irreducible, we know that $\pi_Z$ is positive everywhere on $\mathsf Z$. Hence, $M_3 < \infty$ must hold. The proof of $M_4 < \infty$ is similar.

Lemma B.2. For the maximal asset path $\{\tilde a_t\}$ defined by
$$\tilde a_{t+1} = R_{t+1}\tilde a_t + Y_{t+1} \quad\text{and}\quad (\tilde a_0,\tilde z_0) = (a,z) \text{ given}, \tag{36}$$
we have, for each $(a,z)\in S_0$, that $M(a,z) := \mathbb E_{a,z}\sum_{t=0}^{\infty}\prod_{i=0}^{t}\beta_i\,\tilde a_t < \infty$.

Proof.
Iterating backward on (36), we can show that $\tilde a_t = \prod_{i=1}^{t}R_i\,a + \sum_{j=1}^{t}Y_j\prod_{i=j+1}^{t}R_i$. Taking expectation yields
$$\mathbb E_{a,z}\prod_{i=0}^{t}\beta_i\,\tilde a_t = \mathbb E_z\prod_{i=1}^{t}\beta_i R_i\,a + \sum_{j=1}^{t}\mathbb E_z\prod_{i=j+1}^{t}\beta_i R_i\prod_{k=0}^{j}\beta_k\,Y_j.$$
Then the monotone convergence theorem and the Markov property imply that
$$M(a,z) = \sum_{t=0}^{\infty}\mathbb E_z\prod_{i=1}^{t}\beta_i R_i\,a + \sum_{t=0}^{\infty}\sum_{j=1}^{t}\mathbb E_z\prod_{i=j+1}^{t}\beta_i R_i\prod_{k=0}^{j}\beta_k\,Y_j$$
$$= \sum_{t=0}^{\infty}\mathbb E_z\prod_{i=1}^{t}\beta_i R_i\,a + \sum_{j=1}^{\infty}\sum_{i=0}^{\infty}\mathbb E_z\prod_{k=0}^{j}\beta_k\,Y_j\prod_{\ell=1}^{i}\beta_{j+\ell}R_{j+\ell}$$
$$= \sum_{t=0}^{\infty}\mathbb E_z\prod_{i=1}^{t}\beta_i R_i\,a + \sum_{j=1}^{\infty}\mathbb E_z\prod_{k=0}^{j}\beta_k\,Y_j\,\mathbb E_{Z_j}\sum_{i=0}^{\infty}\prod_{\ell=1}^{i}\beta_\ell R_\ell.$$
By Lemma B.1, we now have, for all $(a,z)\in S_0$,
$$M(a,z) \le M_2\,a + M_2\sum_{t=1}^{\infty}\mathbb E_z\prod_{i=0}^{t}\beta_i\,Y_t = M_2\,a + M_2\sum_{t=1}^{\infty}\mathbb E_z\prod_{i=0}^{t}\beta_i\,\mathbb E_{Z_t}Y.$$
Applying Lemma B.1 again gives $M(a,z) < \infty$, as was to be shown.

Proposition B.1. The value $V_c(a,z)$ in (8) is well-defined in $\{-\infty\}\cup\mathbb R$.

Proof. By the assumptions on the utility function, there exists a constant $B\in\mathbb R$ such that $u(c)\le c + B$, and hence
$$V_c(a,z) \le \mathbb E_{a,z}\sum_{t=0}^{\infty}\prod_{i=0}^{t}\beta_i\,u(\tilde a_t) \le M(a,z) + B\,\mathbb E_z\sum_{t=0}^{\infty}\prod_{i=0}^{t}\beta_i.$$
The last term is finite by Lemma A.1.

Proof of Theorem 2.1. The proof is a long but relatively straightforward extension of Theorem 1 of Benhabib et al. (2015) and thus omitted. A full proof is available from the authors upon request.

Proposition B.2. $(\mathscr C,\rho)$ is a complete metric space.

Proof. The proof is a straightforward extension of Proposition 4.1 of Li and Stachurski (2014) and thus omitted. A full proof is available from the authors upon request.

Proof of Proposition 2.2. Let $c$ be a policy in $\mathscr C$ satisfying (10). To show that any asset path generated by $c$ satisfies the transversality condition (11), observe that, by condition (12), we have
$$c\in\mathscr C \implies \exists M\in\mathbb R_+ \text{ s.t. } u'(a) \le (u'\circ c)(a,z) \le u'(a) + M, \quad\forall(a,z)\in S_0. \tag{37}$$
$$\therefore\ \mathbb E_{a,z}\prod_{i=0}^{t}\beta_i\,(u'\circ c)(a_t,Z_t)\,a_t \le \mathbb E_{a,z}\prod_{i=0}^{t}\beta_i\,u'(a_t)\,a_t + M\,\mathbb E_{a,z}\prod_{i=0}^{t}\beta_i\,a_t. \tag{38}$$
Regarding the first term on the right hand side of (38), fix $A > 0$ and observe that
$$u'(a_t)a_t = u'(a_t)a_t\mathbb 1\{a_t\le A\} + u'(a_t)a_t\mathbb 1\{a_t > A\} \le A\,u'(a_t) + u'(A)\,a_t \le A\,u'(Y_t) + u'(A)\,\tilde a_t$$
with probability one, where $\tilde a_t$ is the maximal path defined in (36). We then have
$$\mathbb E_{a,z}\prod_{i=0}^{t}\beta_i\,u'(a_t)\,a_t \le A\,\mathbb E_z\prod_{i=0}^{t}\beta_i\,u'(Y_t) + u'(A)\,\mathbb E_{a,z}\prod_{i=0}^{t}\beta_i\,\tilde a_t. \tag{39}$$
By Lemma B.1, we have
$$A\,\mathbb E_z\prod_{i=0}^{t}\beta_i\,u'(Y_t) = A\,\mathbb E_z\prod_{i=0}^{t}\beta_i\,\mathbb E_{Z_t}u'(Y) \le M_4\,A\,\mathbb E_z\prod_{i=0}^{t}\beta_i,$$
and the last expression converges to zero as $t\to\infty$ by Lemma A.1. The second term in (39) also converges to zero by Lemma B.2. Hence $\mathbb E_{a,z}\prod_{i=0}^{t}\beta_i\,u'(a_t)a_t\to 0$ as $t\to\infty$, which, combined with (38) and another application of Lemma B.2, gives our desired result.

Proposition B.3. For all $c\in\mathscr C$ and $(a,z)\in S_0$, there exists a unique $\xi\in(0,a]$ that solves (14).

Proof. Fix $c\in\mathscr C$ and $(a,z)\in S_0$. Because $c\in\mathscr C$, the map $\xi\mapsto\psi_c(\xi,a,z)$ is increasing. Since $\xi\mapsto u'(\xi)$ is strictly decreasing, the equation (14) can have at most one solution. Hence uniqueness holds.

Existence follows from the intermediate value theorem provided we can show that
(a) $\xi\mapsto\psi_c(\xi,a,z)$ is a continuous function,
(b) $\exists\,\xi\in(0,a]$ such that $u'(\xi)\ge\psi_c(\xi,a,z)$, and
(c) $\exists\,\xi\in(0,a]$ such that $u'(\xi)\le\psi_c(\xi,a,z)$.

For part (a), it suffices to show that $g(\xi) := \mathbb E_z\,\hat\beta\hat R\,(u'\circ c)[\hat R(a-\xi)+\hat Y,\hat Z]$ is continuous on $(0,a]$. To this end, fix $\xi\in(0,a]$ and $\xi_n\to\xi$. By (37) we have
$$\hat\beta\hat R\,(u'\circ c)[\hat R(a-\xi_n)+\hat Y,\hat Z] \le \hat\beta\hat R\,(u'\circ c)(\hat Y,\hat Z) \le \hat\beta\hat R\,u'(\hat Y) + \hat\beta\hat R\,M. \tag{40}$$
The last term is integrable, as follows easily from Lemma B.1. Hence the dominated convergence theorem applies. From this fact and the continuity of $c$, we obtain $g(\xi_n)\to g(\xi)$. Hence, $\xi\mapsto\psi_c(\xi,a,z)$ is continuous.

Part (b) clearly holds, since $u'(\xi)\to\infty$ as $\xi\to 0$ and $\xi\mapsto\psi_c(\xi,a,z)$ is increasing and always finite (since it is continuous as shown in the previous paragraph). Part (c) is also trivial (just set $\xi = a$).

Proposition B.4. We have $Tc\in\mathscr C$ for all $c\in\mathscr C$.

Proof. Fix $c\in\mathscr C$ and let $g(\xi,a,z) := \mathbb E_z\,\hat\beta\hat R\,(u'\circ c)[\hat R(a-\xi)+\hat Y,\hat Z]$.

Step 1. We show that $Tc$ is continuous. To apply a standard fixed point parametric continuity result such as Theorem B.1.4 of Stachurski (2009), we first show that $\psi_c$ is jointly continuous on the set $G$ defined in (15). This will be true if $g$ is jointly continuous on $G$. For any $\{(\xi_n,a_n,z_n)\}$ and $(\xi,a,z)$ in $G$ with $(\xi_n,a_n,z_n)\to(\xi,a,z)$, we need to show that $g(\xi_n,a_n,z_n)\to g(\xi,a,z)$. To that end, we define
$$h_1(\xi,a,\hat Z,\hat\varepsilon,\hat\zeta,\hat\eta),\ h_2(\xi,a,\hat Z,\hat\varepsilon,\hat\zeta,\hat\eta) := \hat\beta\hat R\,[u'(\hat Y)+M] \pm \hat\beta\hat R\,(u'\circ c)[\hat R(a-\xi)+\hat Y,\hat Z],$$
where $\hat\beta := \beta(\hat Z,\hat\varepsilon)$, $\hat R := R(\hat Z,\hat\zeta)$ and $\hat Y := Y(\hat Z,\hat\eta)$ as defined in (5). Then $h_1$ and $h_2$ are continuous in $(\xi,a,\hat Z)$ by the continuity of $c$ and nonnegative by (40). By Fatou's lemma and Theorem 1.1 of Feinberg et al. (2014),
$$\iiint\sum_{\hat z\in\mathsf Z}h_i(\xi,a,\hat z,\hat\varepsilon,\hat\zeta,\hat\eta)\,P(z,\hat z)\,\pi_\varepsilon(d\hat\varepsilon)\,\pi_\zeta(d\hat\zeta)\,\pi_\eta(d\hat\eta)$$
$$\le \iiint\sum_{\hat z\in\mathsf Z}\liminf_{n\to\infty}h_i(\xi_n,a_n,\hat z,\hat\varepsilon,\hat\zeta,\hat\eta)\,P(z_n,\hat z)\,\pi_\varepsilon(d\hat\varepsilon)\,\pi_\zeta(d\hat\zeta)\,\pi_\eta(d\hat\eta)$$
$$\le \liminf_{n\to\infty}\iiint\sum_{\hat z\in\mathsf Z}h_i(\xi_n,a_n,\hat z,\hat\varepsilon,\hat\zeta,\hat\eta)\,P(z_n,\hat z)\,\pi_\varepsilon(d\hat\varepsilon)\,\pi_\zeta(d\hat\zeta)\,\pi_\eta(d\hat\eta).$$
This implies that
$$\liminf_{n\to\infty}\Big(\pm\,\mathbb E_{z_n}\hat\beta\hat R\,(u'\circ c)[\hat R(a_n-\xi_n)+\hat Y,\hat Z]\Big) \ge \pm\,\mathbb E_z\,\hat\beta\hat R\,(u'\circ c)[\hat R(a-\xi)+\hat Y,\hat Z].$$
The function $g$ is then continuous, since the above inequality is equivalent to the statement $\liminf_{n\to\infty}g(\xi_n,a_n,z_n)\ge g(\xi,a,z)\ge\limsup_{n\to\infty}g(\xi_n,a_n,z_n)$. Hence, $\psi_c$ is continuous on $G$, as was to be shown. Moreover, since $\xi\mapsto\psi_c(\xi,a,z)$ takes values in the closed interval
$$I(a,z) := \big[u'(a),\ u'(a)+\mathbb E_z\,\hat\beta\hat R\,(u'(\hat Y)+M)\big],$$
and the correspondence $(a,z)\mapsto I(a,z)$ is nonempty, compact-valued and continuous, Theorem B.1.4 of Stachurski (2009) then implies that $Tc$ is continuous on $S_0$.

Step 2. We show that $Tc$ is increasing in $a$.
Suppose that for some $z\in\mathsf Z$ and $a_1,a_2\in(0,\infty)$ with $a_1 < a_2$, we have $\xi_1 := Tc(a_1,z) > Tc(a_2,z) =: \xi_2$. Since $c$ is increasing in $a$ by assumption, $\psi_c$ is increasing in $\xi$ and decreasing in $a$. Then $u'(\xi_1) < u'(\xi_2) = \psi_c(\xi_2,a_2,z) \le \psi_c(\xi_1,a_1,z) = u'(\xi_1)$. This is a contradiction.

Step 3. We have shown in Proposition B.3 that $Tc(a,z)\in(0,a]$ for all $(a,z)\in S_0$.

Step 4. We show that $\|u'\circ(Tc) - u'\| < \infty$. Since $u'[Tc(a,z)]\ge u'(a)$, we have
$$|u'[Tc(a,z)] - u'(a)| = u'[Tc(a,z)] - u'(a) \le \mathbb E_z\,\hat\beta\hat R\,(u'\circ c)(\hat R[a - Tc(a,z)]+\hat Y,\hat Z) \le \mathbb E_z\,\hat\beta\hat R\,[u'(\hat Y)+M]$$
for all $(a,z)\in S_0$. The right hand side is easily shown to be finite via Lemma B.1.

To prove Theorem 2.2, let $\mathscr H$ be the set of all continuous functions $h: S_0\to\mathbb R$ that are decreasing in their first argument and such that $(a,z)\mapsto h(a,z) - u'(a)$ is bounded and nonnegative. Given $h\in\mathscr H$, let $\tilde Th$ be the function mapping $(a,z)\in S_0$ into the $\kappa$ that solves
$$\kappa = \max\{\mathbb E_z\,\hat\beta\hat R\,h(\hat R\,[a - (u')^{-1}(\kappa)]+\hat Y,\hat Z),\ u'(a)\}. \tag{41}$$
Moreover, consider the bijection $H:\mathscr C\to\mathscr H$ defined by $Hc := u'\circ c$.

Lemma B.3. The operator $\tilde T$ maps $\mathscr H$ into $\mathscr H$ and satisfies $\tilde TH = HT$ on $\mathscr C$.

Proof. Pick any $c\in\mathscr C$ and $(a,z)\in S_0$. Let $\xi := Tc(a,z)$, then $\xi$ solves
$$u'(\xi) = \max\{\mathbb E_z\,\hat\beta\hat R\,(u'\circ c)[\hat R(a-\xi)+\hat Y,\hat Z],\ u'(a)\}. \tag{42}$$
We need to show that $HTc$ and $\tilde THc$ evaluate to the same number at $(a,z)$. In other words, we need to show that $u'(\xi)$ is the solution to
$$\kappa = \max\{\mathbb E_z\,\hat\beta\hat R\,(u'\circ c)(\hat R\,[a - (u')^{-1}(\kappa)]+\hat Y,\hat Z),\ u'(a)\}.$$
But this is immediate from (42). Hence, we have shown that $\tilde TH = HT$ on $\mathscr C$. Since $H:\mathscr C\to\mathscr H$ is a bijection, we have $\tilde T = HTH^{-1}$. Since in addition $T:\mathscr C\to\mathscr C$ by Proposition B.4, we have $\tilde T:\mathscr H\to\mathscr H$. This concludes the proof.

Lemma B.4. $\tilde T$ is order preserving on $\mathscr H$. That is, $\tilde Th_1\le\tilde Th_2$ for all $h_1,h_2\in\mathscr H$ with $h_1\le h_2$.

Proof. Let $h_1,h_2$ be functions in $\mathscr H$ with $h_1\le h_2$. Suppose to the contrary that there exists $(a,z)\in S_0$ such that $\kappa_1 := \tilde Th_1(a,z) > \tilde Th_2(a,z) =: \kappa_2$.
Since functions in $\mathscr H$ are decreasing in the first argument, we have
$$\kappa_1 = \max\{\mathbb E_z\,\hat\beta\hat R\,h_1(\hat R\,[a-(u')^{-1}(\kappa_1)]+\hat Y,\hat Z),\ u'(a)\}$$
$$\le \max\{\mathbb E_z\,\hat\beta\hat R\,h_2(\hat R\,[a-(u')^{-1}(\kappa_1)]+\hat Y,\hat Z),\ u'(a)\}$$
$$\le \max\{\mathbb E_z\,\hat\beta\hat R\,h_2(\hat R\,[a-(u')^{-1}(\kappa_2)]+\hat Y,\hat Z),\ u'(a)\} = \kappa_2.$$
This is a contradiction. Hence, $\tilde T$ is order preserving.

Lemma B.5. There exists an $n\in\mathbb N$ and $\theta < 1$ such that $\tilde T^n$ is a contraction mapping of modulus $\theta$ on $(\mathscr H,d_\infty)$.

Proof. Since $\tilde T$ is order preserving and $\mathscr H$ is closed under the addition of nonnegative constants, based on Blackwell (1965), it remains to verify the existence of $n\in\mathbb N$ and $\theta < 1$ such that $\tilde T^n(h+\gamma)\le\tilde T^n h+\theta\gamma$ for all $h\in\mathscr H$ and $\gamma\ge 0$. By Lemma A.1 and Assumption 2.2, it suffices to show that for all $k\in\mathbb N$ and $(a,z)\in S_0$, we have
$$\tilde T^k(h+\gamma)(a,z) \le \tilde T^k h(a,z) + \gamma\,\mathbb E_z\prod_{i=1}^{k}\beta_i R_i. \tag{43}$$
Fix $h\in\mathscr H$, $\gamma\ge 0$, and let $h_\gamma(a,z) := h(a,z)+\gamma$. By the definition of $\tilde T$, we have
$$\tilde Th_\gamma(a,z) = \max\{\mathbb E_z\,\hat\beta\hat R\,h_\gamma(\hat R\,[a-(u')^{-1}(\tilde Th_\gamma)(a,z)]+\hat Y,\hat Z),\ u'(a)\}$$
$$\le \max\{\mathbb E_z\,\hat\beta\hat R\,h(\hat R\,[a-(u')^{-1}(\tilde Th_\gamma)(a,z)]+\hat Y,\hat Z),\ u'(a)\} + \gamma\,\mathbb E_z\beta_1R_1$$
$$\le \max\{\mathbb E_z\,\hat\beta\hat R\,h(\hat R\,[a-(u')^{-1}(\tilde Th)(a,z)]+\hat Y,\hat Z),\ u'(a)\} + \gamma\,\mathbb E_z\beta_1R_1.$$
Here, the first inequality is elementary and the second is due to the fact that $h\le h_\gamma$ and $\tilde T$ is order preserving. Hence, $\tilde T(h+\gamma)(a,z)\le\tilde Th(a,z)+\gamma\,\mathbb E_z\beta_1R_1$ and (43) holds for $k = 1$. Suppose (43) holds for arbitrary $k$. It remains to show that it holds for $k+1$. For $z\in\mathsf Z$, define $f(z) := \gamma\,\mathbb E_z\,\beta_1R_1\cdots\beta_kR_k$.
By the induction hypothesis, the monotonicity of $\tilde T$ and the Markov property,
$$\tilde T^{k+1}h_\gamma(a,z) = \max\{\mathbb E_z\,\hat\beta\hat R\,(\tilde T^kh_\gamma)(\hat R\,[a-(u')^{-1}(\tilde T^{k+1}h_\gamma)(a,z)]+\hat Y,\hat Z),\ u'(a)\}$$
$$\le \max\{\mathbb E_z\,\hat\beta\hat R\,(\tilde T^kh+f)(\hat R\,[a-(u')^{-1}(\tilde T^{k+1}h_\gamma)(a,z)]+\hat Y,\hat Z),\ u'(a)\}$$
$$\le \max\{\mathbb E_z\,\hat\beta\hat R\,(\tilde T^kh)(\hat R\,[a-(u')^{-1}(\tilde T^{k+1}h_\gamma)(a,z)]+\hat Y,\hat Z),\ u'(a)\} + \mathbb E_z\beta_1R_1f(Z_1)$$
$$\le \max\{\mathbb E_z\,\hat\beta\hat R\,(\tilde T^kh)(\hat R\,[a-(u')^{-1}(\tilde T^{k+1}h)(a,z)]+\hat Y,\hat Z),\ u'(a)\} + \gamma\,\mathbb E_z\beta_1R_1\,\mathbb E_{Z_1}\beta_1R_1\cdots\beta_kR_k$$
$$= \tilde T^{k+1}h(a,z) + \gamma\,\mathbb E_z\,\beta_1R_1\cdots\beta_{k+1}R_{k+1}.$$
Hence, (43) is verified by induction. This concludes the proof.

Proof of Theorem 2.2. Let $n$ and $\theta$ be as in Lemma B.5. In view of Propositions 2.2, B.2 and B.4, to show that $T^n$ is a contraction and verify claims (1)–(3) of Theorem 2.2, based on the Banach contraction mapping theorem, it suffices to show that $\rho(T^nc,T^nd)\le\theta\rho(c,d)$ for all $c,d\in\mathscr C$. To this end, pick any $c,d\in\mathscr C$. Note that the topological conjugacy result established in Lemma B.3 implies that $\tilde T = HTH^{-1}$. Hence, $\tilde T^n = (HTH^{-1})\cdots(HTH^{-1}) = HT^nH^{-1}$ and $\tilde T^nH = HT^n$. By the definition of $\rho$ and the contraction property established in Lemma B.5,
$$\rho(T^nc,T^nd) = d_\infty(HT^nc,HT^nd) = d_\infty(\tilde T^nHc,\tilde T^nHd) \le \theta\,d_\infty(Hc,Hd) = \theta\rho(c,d).$$
Hence, $T^n$ is a contraction and claims (1)–(3) are verified.

Our next goal is to prove Proposition 2.3. To begin with, we define
$$\mathscr C_0 = \{c\in\mathscr C : a\mapsto a - c(a,z) \text{ is increasing for all } z\in\mathsf Z\}.$$

Lemma B.6. $\mathscr C_0$ is a closed subset of $\mathscr C$, and $Tc\in\mathscr C_0$ for all $c\in\mathscr C_0$.

Proof. To see that $\mathscr C_0$ is closed, for a given sequence $\{c_n\}$ in $\mathscr C_0$ and $c\in\mathscr C$ with $\rho(c_n,c)\to 0$, we need to show that $c\in\mathscr C_0$. This obviously holds since $a\mapsto a - c_n(a,z)$ is increasing for all $n$, and, in addition, $\rho(c_n,c)\to 0$ implies that $c_n(a,z)\to c(a,z)$ for all $(a,z)\in S_0$.

Fix $c\in\mathscr C_0$. We now show that $\xi := Tc\in\mathscr C_0$. Since $\xi\in\mathscr C$ by Proposition B.4, it remains to show that $a\mapsto a - \xi(a,z)$ is increasing.
Suppose the claim is false, then there exist $z\in\mathsf Z$ and $a_1,a_2\in(0,\infty)$ such that $a_1 < a_2$ and $a_1 - \xi(a_1,z) > a_2 - \xi(a_2,z)$. Since $a_1 - \xi(a_1,z)\ge 0$, $a_2 - \xi(a_2,z)\ge 0$ and $\xi(a_1,z)\le\xi(a_2,z)$ by Proposition B.4, we have $\xi(a_1,z) < a_1$ and $\xi(a_1,z) < \xi(a_2,z)$. However, based on the property of the time iteration operator, we then have
$$(u'\circ\xi)(a_1,z) = \mathbb E_z\,\hat\beta\hat R\,(u'\circ c)(\hat R\,[a_1-\xi(a_1,z)]+\hat Y,\hat Z) \le \mathbb E_z\,\hat\beta\hat R\,(u'\circ c)(\hat R\,[a_2-\xi(a_2,z)]+\hat Y,\hat Z) \le (u'\circ\xi)(a_2,z),$$
which implies that $\xi(a_1,z)\ge\xi(a_2,z)$. This is a contradiction. Hence, $a\mapsto a - \xi(a,z)$ is increasing, and $T$ is a self-map on $\mathscr C_0$.

Proof of Proposition 2.3. Since $T$ maps elements of the closed subset $\mathscr C_0$ into itself by Lemma B.6, Theorem 2.2 implies that $c^*\in\mathscr C_0$. Hence, the stated claims hold.

Proof of Proposition 2.4. Let $T_j$ be the time iteration operator for the income process $j$ established in Proposition B.4. It suffices to show $T_1c\le T_2c$ for all $c\in\mathscr C$. To see this, note that by Lemma B.4, we have $T_jc_1\le T_jc_2$ whenever $c_1\le c_2$. Therefore if $T_1c\le T_2c$ for all $c\in\mathscr C$, we obtain $T_1c_1\le T_1c_2\le T_2c_2$. Iterating this starting from any $c\in\mathscr C$, by Theorem 2.2, it follows that $c_1^* = \lim_{n\to\infty}(T_1)^nc \le \lim_{n\to\infty}(T_2)^nc = c_2^*$, completing the proof.

To show that $T_1c\le T_2c$ for any $c\in\mathscr C$, take any $(a,z)\in S_0$ and define $\xi_j = (T_jc)(a,z)$. To show $\xi_1\le\xi_2$, suppose on the contrary that $\xi_1 > \xi_2$. Since $c$ is increasing in $a$ and $u'' < 0$ (hence $u'$ is decreasing), it follows from the definition of the time iteration operator in (14)–(16), $Y_1\le Y_2$, $u'' < 0$ and the monotonicity of $c\in\mathscr C$ that
$$u'(\xi_2) > u'(\xi_1) = \max\{\mathbb E_z\,\hat\beta\hat R\,(u'\circ c)[\hat R(a-\xi_1)+\hat Y_1,\hat Z],\ u'(a)\} \ge \max\{\mathbb E_z\,\hat\beta\hat R\,(u'\circ c)[\hat R(a-\xi_2)+\hat Y_2,\hat Z],\ u'(a)\} = u'(\xi_2),$$
which is a contradiction.

To prove Proposition 2.5, we need several lemmas.

Lemma B.7. For all $c\in\mathscr C_0$, there exists a threshold $\bar a_c(z)$ such that $Tc(a,z) = a$ if and only if $a\le\bar a_c(z)$.
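In particular, there exists a threshold $\bar a(z)$ such that $c^*(a, z) = a$ if and only if $a \le \bar a(z)$.

Proof. Recall that, for all $c \in \mathscr C_0$, $\xi(a, z) := Tc(a, z)$ solves

$$
(u' \circ \xi)(a, z) = \max\{\mathbb E_z \hat\beta \hat R\, (u' \circ c)(\hat R[a - \xi(a, z)] + \hat Y, \hat Z),\; u'(a)\}. \tag{44}
$$

For each $z \in \mathsf Z$ and $c \in \mathscr C_0$, define

$$
\bar a_c(z) := (u')^{-1}\big[\mathbb E_z \hat\beta \hat R\, (u' \circ c)(\hat Y, \hat Z)\big] \quad \text{and} \quad \bar a(z) := \bar a_{c^*}(z). \tag{45}
$$

To prove the first claim, by Lemma B.6, it suffices to show that $\xi(a, z) < a$ implies $a > \bar a_c(z)$. This obviously holds since, in view of (44), the former implies that

$$
u'(a) < \mathbb E_z \hat\beta \hat R\, (u' \circ c)(\hat R[a - \xi(a, z)] + \hat Y, \hat Z) \le \mathbb E_z \hat\beta \hat R\, (u' \circ c)(\hat Y, \hat Z) = u'[\bar a_c(z)],
$$

which then yields $a > \bar a_c(z)$. The second claim follows immediately from the first claim and the fact that $c^* \in \mathscr C_0$ is the unique fixed point of $T$ in $\mathscr C$.

Consider a subset $\mathscr C_1$ defined by $\mathscr C_1 := \{c \in \mathscr C_0 : a \mapsto c(a, z) \text{ is concave for all } z \in \mathsf Z\}$.

Lemma B.8. $\mathscr C_1$ is a closed subset of $\mathscr C$ and $\mathscr C_0$, and $Tc \in \mathscr C_1$ for all $c \in \mathscr C_1$.

Proof. The first claim is immediate because limits of concave functions are concave. To prove the second claim, fix $c \in \mathscr C_1$. We have $Tc \in \mathscr C_0$ by Lemma B.6. It remains to show that $a \mapsto \xi(a, z) := Tc(a, z)$ is concave for all $z \in \mathsf Z$. Given $z \in \mathsf Z$, Lemma B.7 implies that $\xi(a, z) = a$ for $a \le \bar a_c(z)$ and that $\xi(a, z) < a$ for $a > \bar a_c(z)$. Since in addition $a \mapsto \xi(a, z)$ is continuous and increasing, to show the concavity of $\xi$ with respect to $a$, it suffices to show that $a \mapsto \xi(a, z)$ is concave on $(\bar a_c(z), \infty)$.

Suppose there exist some $z \in \mathsf Z$, $\lambda \in [0, 1]$, and $a_1, a_2 \in (\bar a_c(z), \infty)$ such that

$$
\xi((1 - \lambda) a_1 + \lambda a_2, z) < (1 - \lambda)\xi(a_1, z) + \lambda \xi(a_2, z). \tag{46}
$$

Let $h(a, z, \hat\omega) := \hat R[a - \xi(a, z)] + \hat Y$, where $\hat\omega := (\hat R, \hat Y)$. Then by Lemma B.7 and noting that consumption is interior, we have

$$
\begin{aligned}
(u' \circ \xi)((1 - \lambda) a_1 + \lambda a_2, z) &= \mathbb E_z \hat\beta \hat R\, (u' \circ c)\{h[(1 - \lambda) a_1 + \lambda a_2, z, \hat\omega], \hat Z\} \\
&\le \mathbb E_z \hat\beta \hat R\, (u' \circ c)[(1 - \lambda) h(a_1, z, \hat\omega) + \lambda h(a_2, z, \hat\omega), \hat Z].
\end{aligned}
$$

Using condition (17) then yields

$$
\begin{aligned}
\xi((1 - \lambda) a_1 + \lambda a_2, z) &\ge (u')^{-1}\{\mathbb E_z \hat\beta \hat R\, (u' \circ c)[(1 - \lambda) h(a_1, z, \hat\omega) + \lambda h(a_2, z, \hat\omega), \hat Z]\} \\
&\ge (1 - \lambda)(u')^{-1}\{\mathbb E_z \hat\beta \hat R\, (u' \circ c)[h(a_1, z, \hat\omega), \hat Z]\} + \lambda (u')^{-1}\{\mathbb E_z \hat\beta \hat R\, (u' \circ c)[h(a_2, z, \hat\omega), \hat Z]\} \\
&= (1 - \lambda)(u')^{-1}\{(u' \circ \xi)(a_1, z)\} + \lambda (u')^{-1}\{(u' \circ \xi)(a_2, z)\} \\
&= (1 - \lambda)\xi(a_1, z) + \lambda \xi(a_2, z),
\end{aligned}
$$

which contradicts (46). Hence, $a \mapsto \xi(a, z)$ is concave for all $z \in \mathsf Z$.

Proof of Proposition 2.5. By Theorem 2.2, $T : \mathscr C \to \mathscr C$ is a contraction mapping with unique fixed point $c^*$. Since $\mathscr C_1$ is a closed subset of $\mathscr C$ and $T\mathscr C_1 \subset \mathscr C_1$ by Lemma B.8, we know that $c^* \in \mathscr C_1$. The first claim is verified. Regarding the second claim, note that $c^* \in \mathscr C_1$ implies that $a \mapsto c^*(a, z)$ is increasing and concave for all $z \in \mathsf Z$. Hence, $a \mapsto c^*(a, z)/a$ is a decreasing function for all $z \in \mathsf Z$. Since $0 \le c^*(a, z) \le a$ for all $(a, z) \in \mathsf S_0$, $\alpha(z) := \lim_{a \to \infty} c^*(a, z)/a$ is well-defined and $\alpha(z) \in [0, 1]$.

Proof of Remark 2.1. For each $c$ in $\mathscr C$ concave in its first argument, let $h_c(x, \hat\omega) := c(\hat R x + \hat Y, \hat z)$, where $\hat\omega := (\hat R, \hat Y, \hat z)$. Then $x \mapsto h_c(x, \hat\omega)$ is concave. Based on the generalized Minkowski's inequality (see, e.g., Hardy et al. (1952), page 146, Theorem 198), we have

$$
\begin{aligned}
\big[\mathbb E_z \hat\beta \hat R\, h_c(\lambda x_1 + (1 - \lambda) x_2, \hat\omega)^{-\gamma}\big]^{-1/\gamma}
&\ge \big\{\mathbb E_z \hat\beta \hat R\, [\lambda h_c(x_1, \hat\omega) + (1 - \lambda) h_c(x_2, \hat\omega)]^{-\gamma}\big\}^{-1/\gamma} \\
&= \big\{\mathbb E_z [\lambda (\hat\beta \hat R)^{-1/\gamma} h_c(x_1, \hat\omega) + (1 - \lambda)(\hat\beta \hat R)^{-1/\gamma} h_c(x_2, \hat\omega)]^{-\gamma}\big\}^{-1/\gamma} \\
&\ge \big(\mathbb E_z [\lambda (\hat\beta \hat R)^{-1/\gamma} h_c(x_1, \hat\omega)]^{-\gamma}\big)^{-1/\gamma} + \big(\mathbb E_z [(1 - \lambda)(\hat\beta \hat R)^{-1/\gamma} h_c(x_2, \hat\omega)]^{-\gamma}\big)^{-1/\gamma} \\
&= \lambda \big[\mathbb E_z \hat\beta \hat R\, h_c(x_1, \hat\omega)^{-\gamma}\big]^{-1/\gamma} + (1 - \lambda)\big[\mathbb E_z \hat\beta \hat R\, h_c(x_2, \hat\omega)^{-\gamma}\big]^{-1/\gamma}.
\end{aligned}
$$

Since $u'(c) = c^{-\gamma}$, the above inequality implies that condition (17) holds.

To prove Proposition 2.6, let $\bar s$ be as in (19) and define

$$
\mathscr C_2 := \{c \in \mathscr C : c(a, z) \ge (1 - \bar s) a \text{ for all } (a, z) \in \mathsf S_0\}. \tag{47}
$$

Lemma B.9. $\mathscr C_2$ is a closed subset of $\mathscr C$, and $Tc \in \mathscr C_2$ for all $c \in \mathscr C_2$.

Proof. To see that $\mathscr C_2$ is closed, for a given sequence $\{c_n\}$ in $\mathscr C_2$ and $c \in \mathscr C$ with $\rho(c_n, c) \to 0$, we need to verify that $c \in \mathscr C_2$. This obviously holds since $c_n(a, z)/a \ge 1 - \bar s$ for all $n$ and $(a, z) \in \mathsf S_0$, and, on the other hand, $\rho(c_n, c) \to 0$ implies that $c_n(a, z) \to c(a, z)$ for all $(a, z) \in \mathsf S_0$.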
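We next show that $T$ is a self-map on $\mathscr C_2$. Fix $c \in \mathscr C_2$. We have $Tc \in \mathscr C$ since $T$ is a self-map on $\mathscr C$. It remains to show that $\xi := Tc$ satisfies $\xi(a, z) \ge (1 - \bar s) a$ for all $(a, z) \in \mathsf S_0$. Suppose $\xi(a, z) < (1 - \bar s) a$ for some $(a, z) \in \mathsf S_0$. Then

$$
u'((1 - \bar s) a) < (u' \circ \xi)(a, z) = \max\{\mathbb E_z \hat\beta \hat R\, (u' \circ c)(\hat R[a - \xi(a, z)] + \hat Y, \hat Z),\; u'(a)\}.
$$

Since $u'((1 - \bar s) a) > u'(a)$ and $c \in \mathscr C_2$, this implies that

$$
\begin{aligned}
u'((1 - \bar s) a) &< \mathbb E_z \hat\beta \hat R\, (u' \circ c)(\hat R[a - \xi(a, z)] + \hat Y, \hat Z) \\
&\le \mathbb E_z \hat\beta \hat R\, u'\{(1 - \bar s)\hat R[a - \xi(a, z)] + (1 - \bar s)\hat Y\} \\
&\le \mathbb E_z \hat\beta \hat R\, u'[(1 - \bar s)\hat R \bar s a + (1 - \bar s)\hat Y] \le \mathbb E_z \hat\beta \hat R\, u'[\hat R \bar s (1 - \bar s) a],
\end{aligned}
$$

which contradicts (19) since $((1 - \bar s) a, z) \in \mathsf S_0$. As a result, $\xi(a, z) \ge (1 - \bar s) a$ for all $(a, z) \in \mathsf S_0$ and we conclude that $Tc \in \mathscr C_2$.

Proof of Proposition 2.6. We have shown in Theorem 2.2 that $T$ is a contraction mapping on the complete metric space $(\mathscr C, \rho)$, with unique fixed point $c^*$. Since in addition $\mathscr C_2$ is a closed subset of $\mathscr C$ and $T\mathscr C_2 \subset \mathscr C_2$ by Lemma B.9, we know that $c^* \in \mathscr C_2$. The stated claim is verified.

Appendix C. Proof of Section 3 Results

As before, Assumptions 2.1–2.3 are in force. Notice that Assumption 2.2, Assumption 3.1 and Lemma A.1 imply existence of an $n$ in $\mathbb N$ such that

$$
\theta := \max_{z \in \mathsf Z} \mathbb E_z \prod_{t=1}^{n} \beta_t R_t < 1 \quad \text{and} \quad \gamma := \bar s^{\,n} \max_{z \in \mathsf Z} \mathbb E_z \prod_{t=1}^{n} R_t < 1. \tag{48}
$$

Lemma C.1. For all $(a, z) \in \mathsf S$, we have $\sup_{t \ge 0} \mathbb E_{a,z}\, a_t < \infty$.

Proof. Since $c^*(0, z) = 0$, Proposition 2.6 implies that $c^*(a, z) \ge (1 - \bar s) a$ for all $(a, z) \in \mathsf S$. For all $t \ge 1$, we have $t = kn + j$ in general, where the integers $k \ge 0$ and $j \in \{0, 1, \ldots, n - 1\}$. Using these facts and (4), we have:

$$
\begin{aligned}
a_t &\le \bar s^{\,t} R_t \cdots R_1 a + \bar s^{\,t-1} R_t \cdots R_2 Y_1 + \cdots + \bar s R_t Y_{t-1} + Y_t \\
&= \bar s^{\,kn+j} R_{kn+j} \cdots R_1 a + \sum_{\ell=1}^{j} \bar s^{\,kn+j-\ell} R_{kn+j} \cdots R_{\ell+1} Y_\ell \\
&\quad + \sum_{m=1}^{k} \sum_{\ell=1}^{n} \bar s^{\,mn-\ell} R_{kn+j} \cdots R_{(k-m)n+j+\ell+1} Y_{(k-m)n+j+\ell}
\end{aligned}
$$

with probability one.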
Taking expectations of the above while noting that $M_0 := \max_{1 \le \ell \le n,\, z \in \mathsf Z} \mathbb E_z \prod_{t=1}^{\ell} R_t < \infty$ by Assumption 3.1 and Lemma A.1, we have

$$
\begin{aligned}
\mathbb E_{a,z}\, a_t &\le \gamma^k \bar s^{\,j} \mathbb E_z R_j \cdots R_1 a + \sum_{\ell=1}^{j} \gamma^k \bar s^{\,j-\ell} \mathbb E_z R_j \cdots R_{\ell+1} Y_\ell \\
&\quad + \sum_{m=0}^{k-1} \sum_{\ell=1}^{n} \gamma^m \bar s^{\,n-\ell} \mathbb E_z R_{(k-m)n+j} \cdots R_{(k-m-1)n+j+\ell+1} Y_{(k-m-1)n+j+\ell} \\
&\le \gamma^k M_0 a + \gamma^k M_0 \sum_{\ell=1}^{j} \mathbb E_z Y_\ell + \sum_{m=0}^{k-1} \gamma^m M_0 \sum_{\ell=1}^{n} \mathbb E_z Y_{(k-m-1)n+j+\ell} \\
&\le M_0 a + M_0 M_3 n + \sum_{m=0}^{\infty} \gamma^m M_0 M_3 n < \infty
\end{aligned}
$$

for all $(a, z) \in \mathsf S$ and $t \ge 0$. Here we have used $M_3$ in Lemma B.1 and the Markov property. Hence, $\sup_{t \ge 0} \mathbb E_{a,z}\, a_t < \infty$ for all $(a, z) \in \mathsf S$, as was claimed.

A function $w : \mathsf S \to \mathbb R_+$ is called norm-like if all its sublevel sets (i.e., sets of the form $\{x \in \mathsf S : w(x) \le b\}$, $b \in \mathbb R_+$) are precompact in $\mathsf S$ (i.e., any sequence in a given sublevel set has a subsequence that converges to a point of $\mathsf S$).

Proof of Theorem 3.1. Based on Lemma D.5.3 of Meyn and Tweedie (2009), a stochastic kernel $Q$ is bounded in probability if and only if for all $x \in \mathsf S$, there exists a norm-like function $w : \mathsf S \to \mathbb R_+$ such that the $(Q, x)$-Markov process $\{X_t\}_{t \ge 0}$ satisfies $\limsup_{t \to \infty} \mathbb E_x[w(X_t)] < \infty$. Fix $(a, z) \in \mathsf S$. Since $\mathsf Z$ is finite, $P$ is bounded in probability. Hence, there exists a norm-like function $w : \mathsf Z \to \mathbb R_+$ such that $\limsup_{t \to \infty} \mathbb E_z w(Z_t) < \infty$. Then $\bar w : \mathsf S \to \mathbb R_+$ defined by $\bar w(a_0, Z_0) := a_0 + w(Z_0)$ is a norm-like function on $\mathsf S$. The stochastic kernel $Q$ is then bounded in probability since Lemma C.1 implies that

$$
\limsup_{t \to \infty} \mathbb E_{a,z}\, \bar w(a_t, Z_t) \le \sup_{t \ge 0} \mathbb E_{a,z}\, a_t + \limsup_{t \to \infty} \mathbb E_z w(Z_t) < \infty.
$$

Regarding existence of a stationary distribution, since $P$ is Feller (due to the finiteness of $\mathsf Z$), whenever $z_n \to z$, the product measure satisfies $P(z_n, \cdot) \to P(z, \cdot)$. Since in addition $c^*$ is continuous, a simple application of the generalized Fatou's lemma of Feinberg et al. (2014) (Theorem 1.1) shows that the stochastic kernel $Q$ is Feller. Moreover, since $Q$ is bounded in probability, based on the Krylov-Bogolubov theorem (see, e.g., Meyn and Tweedie (2009), Proposition 12.1.3 and Lemma D.5.3), $Q$ admits at least one stationary distribution.

Lemma C.2.
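The borrowing constraint binds in finite time with positive probability. That is, for all $(a, z) \in \mathsf S$, we have $\mathbb P_{a,z}(\cup_{t \ge 0} \{c_t = a_t\}) > 0$.

Proof. The claim holds trivially when $a = 0$. Suppose the claim does not hold on $\mathsf S_0$ (recall that $\mathsf S_0 = \mathsf S \setminus \{0\}$). Then $\mathbb P_{a,z}(\cap_{t \ge 0} \{c_t < a_t\}) = 1$ for some $(a, z) \in \mathsf S_0$, i.e., the borrowing constraint never binds with probability one. Hence,

$$
\mathbb P_{a,z}\big\{(u' \circ c^*)(a_t, Z_t) = \mathbb E\big[\beta_{t+1} R_{t+1} (u' \circ c^*)(a_{t+1}, Z_{t+1}) \,\big|\, \mathscr F_t\big]\big\} = 1
$$

for all $t \ge 0$. Then we have

$$
\begin{aligned}
(u' \circ c^*)(a, z) &= \mathbb E_{a,z}\, \beta_1 R_1 \cdots \beta_t R_t\, (u' \circ c^*)(a_t, Z_t) \\
&\le \mathbb E_{a,z}\, \beta_1 R_1 \cdots \beta_t R_t\, [u'(a_t) + M] \le \mathbb E_z\, \beta_1 R_1 \cdots \beta_t R_t\, [u'(Y_t) + M]
\end{aligned} \tag{49}
$$

for all $t \ge 1$. Let $n$ and $\theta$ be defined by (48). Let $t = kn + 1$. Based on the Markov property and Lemma B.1, as $k \to \infty$,

$$
\begin{aligned}
\mathbb E_z \beta_1 R_1 \cdots \beta_t R_t &= \mathbb E_z \beta_1 R_1 \cdots \beta_{t-1} R_{t-1} \mathbb E_{Z_{t-1}} \beta_1 R_1 \\
&\le \Big(\max_{z \in \mathsf Z} \mathbb E_z \beta_1 R_1\Big) \big(\mathbb E_z \beta_1 R_1 \cdots \beta_{nk} R_{nk}\big) \le \Big(\max_{z \in \mathsf Z} \mathbb E_z \beta_1 R_1\Big) \theta^k \to 0.
\end{aligned}
$$

Similarly, as $k \to \infty$,

$$
\begin{aligned}
\mathbb E_z \beta_1 R_1 \cdots \beta_t R_t\, u'(Y_t) &= \mathbb E_z \beta_1 R_1 \cdots \beta_{t-1} R_{t-1} \mathbb E_{Z_{t-1}} [\beta_1 R_1 u'(Y_1)] \\
&\le \Big(\max_{z \in \mathsf Z} \mathbb E_z \hat\beta \hat R u'(\hat Y)\Big) \mathbb E_z \beta_1 R_1 \cdots \beta_{nk} R_{nk} \le \Big(\max_{z \in \mathsf Z} \mathbb E_z \hat\beta \hat R u'(\hat Y)\Big) \theta^k \to 0.
\end{aligned}
$$

Letting $k \to \infty$, (49) then implies that $(u' \circ c^*)(a, z) \le 0$, which contradicts the fact that $u' > 0$. Thus, we must have $\mathbb P_{a,z}(\cup_{t \ge 0} \{c_t = a_t\}) > 0$ for all $(a, z) \in \mathsf S$.

Our next goal is to prove Theorem 3.2. In the proofs we apply the theory of Meyn and Tweedie (2009). Important definitions (with their locations in the textbook) include: $\psi$-irreducibility (Section 4.2), small set (page 102), strong aperiodicity (page 114), petite set (page 117), Harris chain (page 199), and positivity (page 230).

Recall that $\mathbb R$ paired with its Euclidean topology is a second countable topological space (i.e., its topology has a countable base). Since $\mathbb R_+$ and $\mathsf Z$ are Borel subsets of $\mathbb R$ paired with the relative topologies, they are also second countable. Hence, $\mathsf S := \mathbb R_+ \times \mathsf Z$ satisfies $\mathscr B(\mathsf S) = \mathscr B(\mathbb R_+) \otimes \mathscr B(\mathsf Z)$ (see, e.g., page 149, Theorem 4.44 of Aliprantis and Border (2006)). Recall (22).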
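With slight abuse of notation, in proofs, we use $f$ to denote the density of $\{Y_t\}$ in both cases (Y1) and (Y2) and write $dy = \mu(dy)$, where $\mu$ is the related measure. Specifically, $\mu$ is the Lebesgue measure when (Y2) holds. Moreover, let $\vartheta$ be the counting measure. Recall $\bar z \in \mathsf Z$ and the greatest lower bound $y_\ell \ge 0$ of the support of $\{Y_t\}$ given by Assumption 3.2. Let $\bar p := P(\bar z, \bar z)$. Then $\bar p > 0$ by Assumption 3.2.

Lemma C.3. $\mathbb P_{(a, \bar z)}\big\{\cup_{t \ge 0} \big[\{c_t = a_t\} \cap (\cap_{i=0}^{t} \{Z_i = \bar z\})\big]\big\} > 0$ for all $a \in (0, \infty)$.

Proof. Fix $a \in (0, \infty)$. If $a \le \bar a(\bar z)$, the claim holds trivially by Lemma B.7. Now consider the case $a > \bar a(\bar z)$. Suppose $\mathbb P_{(a, \bar z)}\big\{\cup_{t \ge 0} \big[\{c_t = a_t\} \cap (\cap_{i=0}^{t} \{Z_i = \bar z\})\big]\big\} = 0$. Then, based on De Morgan's law, we have

$$
\begin{aligned}
&\mathbb P_{(a, \bar z)}\big\{\cap_{t \ge 0} \big[\{c_t < a_t\} \cup (\cup_{i=0}^{t} \{Z_i \ne \bar z\})\big]\big\} = 1 \\
\Longrightarrow\; &\mathbb P_{(a, \bar z)}\big\{\{c_t < a_t\} \cup (\cup_{i=0}^{t} \{Z_i \ne \bar z\})\big\} = 1 \text{ for all } t \in \mathbb N \\
\Longrightarrow\; &\mathbb P_{(a, \bar z)}\big\{\{c_k < a_k\} \cup (\cup_{i=0}^{t} \{Z_i \ne \bar z\})\big\} = 1 \text{ for all } k, t \in \mathbb N \text{ with } k \le t \\
\Longrightarrow\; &\mathbb P_{(a, \bar z)}\big\{(\cap_{i=0}^{t} \{c_i < a_i\}) \cup (\cup_{i=0}^{t} \{Z_i \ne \bar z\})\big\} = 1 \text{ for all } t \in \mathbb N.
\end{aligned}
$$

Note that the set $\Delta(t) := (\cap_{i=0}^{t} \{c_i < a_i\}) \cup (\cup_{i=0}^{t} \{Z_i \ne \bar z\})$ can be written as

$$
\Delta(t) = \Delta_1(t) \cup \Delta_2(t), \quad \text{where } \Delta_1(t) \cap \Delta_2(t) = \emptyset,
$$

$$
\Delta_1(t) := (\cap_{i=0}^{t} \{c_i < a_i\}) \cap (\cap_{i=0}^{t} \{Z_i = \bar z\}) \quad \text{and} \quad \Delta_2(t) := \cup_{i=0}^{t} \{Z_i \ne \bar z\}.
$$

Assumption 3.2 then implies that, for all $t \ge 0$,

$$
\mathbb P_{(a, \bar z)}\{\Delta_1(t)\} = 1 - \mathbb P_{\bar z}\{\Delta_2(t)\} = \mathbb P_{\bar z}\big\{\cap_{i=0}^{t} \{Z_i = \bar z\}\big\} = \bar p^{\,t} > 0.
$$

Let $n$ and $\theta$ be defined by (48) and let $t = kn + 1$. Similar to the proof of Lemma B.7, we can show that, with probability $\bar p^{\,t} > 0$,

$$
(u' \circ c^*)(a, \bar z) \le \theta^k \Big[\max_{z \in \mathsf Z} \mathbb E_z \hat\beta \hat R u'(\hat Y) + M \max_{z \in \mathsf Z} \mathbb E_z \hat\beta \hat R\Big]
$$

for some constant $M \in \mathbb R_+$. Since $\theta \in (0, 1)$ and $(u' \circ c^*)(a, \bar z) > 0$, Lemma B.1 implies that there exists $N \in \mathbb N$ such that

$$
\theta^N \Big[\max_{z \in \mathsf Z} \mathbb E_z \hat\beta \hat R u'(\hat Y) + M \max_{z \in \mathsf Z} \mathbb E_z \hat\beta \hat R\Big] < (u' \circ c^*)(a, \bar z).
$$

As a result, we have $(u' \circ c^*)(a, \bar z) < (u' \circ c^*)(a, \bar z)$ with probability $\bar p^{\,Nn+1} > 0$. This is a contradiction. Hence the stated claim is verified.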
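The content of Lemmas C.2 and C.3 can be illustrated numerically. The following Python sketch simulates the law of motion $a_{t+1} = R_{t+1}(a_t - c_t) + Y_{t+1}$ and records the first date at which consumption equals wealth. It is only an illustration under stated assumptions: the policy `consumption` is a hypothetical stand-in (not the optimal policy $c^*$) chosen to have a threshold structure as in Lemma B.7 and to satisfy $(1 - s)a \le c \le a$; the exogenous state is suppressed; and the i.i.d. distributions of `R` and `Y`, together with the parameters `m` and `s`, are arbitrary choices for which `s` times the largest return is below one.

```python
import random


def consumption(a, m=1.0, s=0.8):
    """Hypothetical policy (an assumption, not the optimal c*).

    Consume all wealth when a <= m / s; otherwise save s * a - m.
    The rule satisfies (1 - s) * a <= c <= a and is increasing
    and concave in a.
    """
    return min(a, m + (1.0 - s) * a)


def first_binding_time(a0, horizon, rng, m=1.0, s=0.8):
    """Return the first t <= horizon with c_t = a_t, or None if there is none."""
    a = a0
    for t in range(horizon + 1):
        c = consumption(a, m, s)
        if c >= a:  # the constraint binds: all wealth is consumed
            return t
        # Illustrative i.i.d. shocks (assumed): gross return and income.
        R = rng.choice([0.9, 1.1])
        Y = rng.uniform(0.5, 1.5)
        # Law of motion for wealth: a' = R * (a - c) + Y.
        a = R * (a - c) + Y
    return None


def binding_frequency(a0, horizon, n_paths, seed=0):
    """Share of simulated paths on which the constraint binds by `horizon`."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_paths):
        if first_binding_time(a0, horizon, rng) is not None:
            hits += 1
    return hits / n_paths


if __name__ == "__main__":
    # Start from high wealth and estimate how often the constraint binds.
    print(binding_frequency(a0=50.0, horizon=200, n_paths=1000))
```

A strictly positive frequency from a high initial wealth level is what the lemmas lead one to expect under this stand-in policy; the simulation is a plausibility check and does not replace the proof.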
Let $F(da_{t+1} \mid a_t, Z_t, Z_{t+1})$ be defined such that $\mathbb P\{a_{t+1} \in A \mid (a_t, Z_t, Z_{t+1}) = (a, z, z')\} = \int \mathbb 1\{a' \in A\} F(da' \mid a, z, z')$ at $A \in \mathscr B(\mathbb R_+)$.

Lemma C.4. Let $h : \mathsf S \to \mathbb R$ be an integrable map such that $a \mapsto h(a, z)$ is decreasing for all $z \in \mathsf Z$. Then, for all $t \in \mathbb N$ and $z \in \mathsf Z$, the map $a \mapsto \ell(a, z, t) := \int h(a', z') Q^t((a, z), d(a', z'))$ is decreasing.

Proof. Fix $z \in \mathsf Z$. When $t = 1$, (21a) implies that

$$
\ell(a, z, 1) = \int \Big[\int h(a', z') F(da' \mid a, z, z')\Big] P(z, z') \vartheta(dz').
$$

Since $a \mapsto h(a, z)$ is decreasing, and by Proposition 2.3 and (21a), the optimal asset accumulation path $a_{t+1}$ is increasing in $a_t$ with probability one, we know that $a \mapsto \int h(a', z') F(da' \mid a, z, z')$ is decreasing for all $z' \in \mathsf Z$. Thus, $a \mapsto \ell(a, z, 1)$ is decreasing. The claim holds for $t = 1$. Suppose this claim holds for arbitrary $t$. It remains to show that it holds for $t + 1$. Note that

$$
\begin{aligned}
\ell(a, z, t + 1) &= \iint h(a'', z'') Q^t((a', z'), d(a'', z''))\, Q((a, z), d(a', z')) \\
&= \int \ell(a', z', t)\, Q((a, z), d(a', z')).
\end{aligned}
$$

Since $a' \mapsto \ell(a', z', t)$ is decreasing for all $z' \in \mathsf Z$, based on the induction argument, $a \mapsto \ell(a, z, t + 1)$ is decreasing. The stated claim then follows.

Lemma C.5. The Markov process $\{(a_t, Z_t)\}_{t \ge 0}$ is $\psi$-irreducible.

Proof. Recall $\delta > y_\ell$ given by Assumption 3.2. Let $D \in \mathscr B(\mathsf S)$ be defined by $D := \{y_\ell\} \times \{\bar z\}$ if (Y1) holds and $D := (y_\ell, \delta) \times \{\bar z\}$ if (Y2) holds. We define the measure $\varphi$ on $\mathscr B(\mathsf S)$ by $\varphi(A) := (\mu \times \vartheta)(A \cap D)$ for $A \in \mathscr B(\mathsf S)$. Clearly $\varphi$ is a nontrivial measure. In particular, $\vartheta(\{\bar z\}) = 1$ as $\vartheta$ is the counting measure. Moreover, since $y_\ell$ is the greatest lower bound of the support of $\{Y_t\}$, it must be the case that $\mu(\{y_\ell\}) > 0$ if (Y1) holds and that $\mu((y_\ell, \delta)) > 0$ if (Y2) holds. As a result, $\varphi(\mathsf S) = \mu(\{y_\ell\})\, \vartheta(\{\bar z\}) > 0$ when (Y1) holds and $\varphi(\mathsf S) = \mu((y_\ell, \delta))\, \vartheta(\{\bar z\}) > 0$ when (Y2) holds.

We first show that $\{(a_t, Z_t)\}$ is $\varphi$-irreducible. Let $A$ be an element of $\mathscr B(\mathsf S)$ such that $\varphi(A) > 0$. Fix $(a, z) \in \mathsf S$.
We need to show that $\{(a_t, Z_t)\}$ visits set $A$ in finite time with positive probability.

Since $\{Z_t\}$ is irreducible, $\mathbb P_z\{Z_{N_0} = \bar z\} > 0$ for some integer $N_0 \ge 0$. By Lemma C.1, there exists $\tilde a < \infty$ such that $\mathbb P_{(a,z)}\{a_{N_0} < \tilde a, Z_{N_0} = \bar z\} > 0$. By Lemma C.3, there exists $T \in \mathbb N$ such that $\mathbb P_{(\tilde a, \bar z)}\{c_T = a_T, Z_T = \bar z\} \ge \mathbb P_{(\tilde a, \bar z)}\{c_T = a_T, \cap_{i=0}^{T} \{Z_i = \bar z\}\} > 0$. Lemma B.7 and Lemma C.4 then imply that $\mathbb P_{(a', \bar z)}\{c_T = a_T, Z_T = \bar z\} > 0$ for all $a' \in (0, \tilde a)$. Hence, for $N := N_0 + T$ and $E := \{c_N = a_N, Z_N = \bar z\}$, we have

$$
\mathbb P_{(a,z)}(E) \ge \int_{\{a' \le \tilde a,\; z' = \bar z\}} \mathbb P_{(a', z')}\{c_T = a_T, Z_T = \bar z\}\, Q^{N_0}((a, z), d(a', z')) > 0 \tag{50}
$$

based on the Markov property. By (21a), we have

$$
\begin{aligned}
\mathbb P_{(a,z)}\{(a_{N+1}, Z_{N+1}) \in A\} &\ge \mathbb P_{(a,z)}\{(a_{N+1}, Z_{N+1}) \in A,\; a_N = c_N,\; Z_N = \bar z\} \\
&= \mathbb P_{(a,z)}\{(a_{N+1}, Z_{N+1}) \in A \mid a_N = c_N,\; Z_N = \bar z\}\, \mathbb P_{(a,z)}(E) \\
&= \mathbb P_{(a,z)}\{(Y_{N+1}, Z_{N+1}) \in A,\; a_N = c_N,\; Z_N = \bar z\}.
\end{aligned} \tag{51}
$$

Note that, by Assumption 3.2, $f(y'' \mid z'') P(\bar z, z'') > 0$ whenever $(y'', z'') \in D$. Since in addition $\varphi(A) = (\mu \times \vartheta)(A \cap D) > 0$, we have

$$
\int_A f(y'' \mid z'') P(\bar z, z'') (\mu \times \vartheta)[d(y'', z'')] > 0.
$$

Let $\Delta := \mathbb P_{(a,z)}\{(a_{N+1}, Z_{N+1}) \in A\}$. Then (50) and (51) imply that

$$
\Delta \ge \int_E \Big\{\int_A f(y'' \mid z'') P(z', z'') (\mu \times \vartheta)[d(y'', z'')]\Big\}\, Q^N((a, z), d(a', z')) > 0.
$$

Therefore, we have shown that any measurable subset with positive $\varphi$ measure can be reached in finite time with positive probability, i.e., $\{(a_t, Z_t)\}$ is $\varphi$-irreducible. Based on Proposition 4.2.2 of Meyn and Tweedie (2009), there exists a maximal probability measure $\psi$ on $\mathscr B(\mathsf S)$ such that $\{(a_t, Z_t)\}$ is $\psi$-irreducible.

Lemma C.6. Let the function $\bar a$ be defined as in (45). Then $\bar a(\bar z) \ge y_\ell$ if (Y1) holds, while $\bar a(\bar z) > y_\ell$ if (Y2) holds.

Proof. Suppose (Y1) holds and $\bar a(\bar z) < y_\ell$.
Then, by Lemma B.7, for all $t \in \mathbb N$,

$$
\begin{aligned}
\{c_t = a_t\} \cap \big(\cap_{i=0}^{t} \{Z_i = \bar z\}\big) &= \{a_t \le \bar a(Z_t)\} \cap \big(\cap_{i=0}^{t} \{Z_i = \bar z\}\big) \\
&\subset \{a_t < y_\ell\} \cap \big(\cap_{i=0}^{t} \{Z_i = \bar z\}\big) \subset \{a_t < y_\ell\}.
\end{aligned} \tag{52}
$$

Hence, for all $a \in (0, \infty)$ and $t \in \mathbb N$,

$$
\mathbb P_{(a, \bar z)}\big\{\{c_t = a_t\} \cap \big(\cap_{i=0}^{t} \{Z_i = \bar z\}\big)\big\} \le \mathbb P_{(a, \bar z)}\{a_t < y_\ell\} = 0,
$$

where the last equality follows from (21a), which implies that $a_t \ge Y_t \ge y_\ell$ with probability one. This contradicts Lemma C.3.

Suppose (Y2) holds and $\bar a(\bar z) \le y_\ell$. By definition, $\mathbb P_z\{Y_t \le y_\ell\} = 0$ for all $z \in \mathsf Z$ and $t \in \mathbb N$. Since $a_t \ge Y_t$ with probability one, we have $\mathbb P_{(a,z)}\{a_t \le y_\ell\} = 0$ for all $(a, z) \in \mathsf S$ and $t \in \mathbb N$. Via similar analysis to (52), Lemma B.7 implies that $\big[\{c_t = a_t\} \cap (\cap_{i=0}^{t} \{Z_i = \bar z\})\big] \subset \{a_t \le y_\ell\}$ for all $t \in \mathbb N$. Hence, for all $a \in (0, \infty)$ and $t \in \mathbb N$, we have $\mathbb P_{(a, \bar z)}\big[\{c_t = a_t\} \cap (\cap_{i=0}^{t} \{Z_i = \bar z\})\big] \le \mathbb P_{(a, \bar z)}\{a_t \le y_\ell\} = 0$. Again, this contradicts Lemma C.3.

Lemma C.7. The Markov process $\{(a_t, Z_t)\}_{t \ge 0}$ is strongly aperiodic.

Proof. By the definition of strong aperiodicity, we need to show that there exists a $\nu_1$-small set $D_1$ with $\nu_1(D_1) > 0$, i.e., there exists a nontrivial measure $\nu_1$ on $\mathscr B(\mathsf S)$ and a subset $D_1 \in \mathscr B(\mathsf S)$ such that $\nu_1(D_1) > 0$ and

$$
\inf_{(a,z) \in D_1} Q((a, z), A) \ge \nu_1(A) \quad \text{for all } A \in \mathscr B(\mathsf S). \tag{53}
$$

For $\delta > 0$ given by Assumption 3.2, let $C := (y_\ell, \min\{\delta, \bar a(\bar z)\})$ and let $D_1 := \{y_\ell\} \times \{\bar z\}$ if (Y1) holds and $D_1 := C \times \{\bar z\}$ if (Y2) holds. We now show that $D_1$ satisfies the above conditions. Define $r(a', z') := f(a' \mid z') P(\bar z, z')$ and note that $r(a', z') > 0$ on $D_1$. Define the measure $\nu_1$ on $\mathscr B(\mathsf S)$ by $\nu_1(A) := \int_A r(a', z') (\mu \times \vartheta)[d(a', z')]$. If (Y1) holds, then $\mu(\{y_\ell\}) > 0$ as shown above, and, if (Y2) holds, Lemma C.6 implies that $\mu(C) > 0$. Since in addition $\vartheta(\{\bar z\}) > 0$, it always holds that $(\mu \times \vartheta)(D_1) > 0$. Moreover, since $r(a', z') > 0$ on $D_1$, we have $\nu_1(D_1) > 0$ and $\nu_1$ is a nontrivial measure.
For all $(a, z) \in D_1$ and $A \in \mathscr B(\mathsf S)$, Lemma B.7 implies that

$$
Q((a, z), A) = \int_A r(a', z') (\mu \times \vartheta)[d(a', z')] = \nu_1(A).
$$

Hence, $D_1$ satisfies (53) and $\{(a_t, Z_t)\}_{t \ge 0}$ is strongly aperiodic.

Lemma C.8. The set $[0, d] \times \mathsf Z$ is a petite set for all $d \in \mathbb R_+$.

Proof. Fix $d \in (0, \infty)$ and $z \in \mathsf Z$. Let $B := [0, d] \times \{z\}$. By Lemma C.3,

$$
\mathbb P_{(d,z)}\{c_{N-1} = a_{N-1},\; Z_{N-1} = \bar z\} > 0 \quad \text{for some } N \in \mathbb N. \tag{54}
$$

We start by showing that there exists a nontrivial measure $\nu_N$ on $\mathscr B(\mathsf S)$ such that

$$
\inf_{(a,z) \in B} Q^N((a, z), A) \ge \nu_N(A) \quad \text{for all } A \in \mathscr B(\mathsf S). \tag{55}
$$

In other words, $B$ is a $\nu_N$-small set. Fix $A \in \mathscr B(\mathsf S)$. For all $z' \in \mathsf Z$, define

$$
m(z') := \int \Big[\int \mathbb 1\{(y'', z'') \in A\} f(y'' \mid z'')\, dy''\Big] P(z', z'') \vartheta(dz'').
$$

Note that for all $(a, z) \in B$, Lemma B.7 implies that

$$
\begin{aligned}
Q^N((a, z), A) &\ge \mathbb P_{a,z}\{(Y_N, Z_N) \in A,\; a_{N-1} \le \bar a(Z_{N-1}),\; Z_{N-1} = \bar z\} \\
&= \int m(z') \mathbb 1\{a' \le \bar a(z'),\; z' = \bar z\}\, Q^{N-1}((a, z), d(a', z')).
\end{aligned}
$$

Since $a' \mapsto m(z') \mathbb 1\{a' \le \bar a(z'),\; z' = \bar z\}$ is decreasing for all $z' \in \mathsf Z$, by Lemma C.4,

$$
\begin{aligned}
Q^N((a, z), A) &\ge \int m(z') \mathbb 1\{a' \le \bar a(z'),\; z' = \bar z\}\, Q^{N-1}((d, z), d(a', z')) \\
&= \mathbb P_{d,z}\{(Y_N, Z_N) \in A,\; c_{N-1} = a_{N-1},\; Z_{N-1} = \bar z\} =: \nu_N(A).
\end{aligned}
$$

Note that $\nu_N$ is a nontrivial measure on $\mathscr B(\mathsf S)$ since (54) implies that $\nu_N(\mathsf S) > 0$. Furthermore, since $(a, z)$ is chosen arbitrarily, the above inequality implies that (55) holds. We have shown that $B$ is a $\nu_N$-small set, and hence a petite set. Since a finite union of petite sets is petite for $\psi$-irreducible chains (see, e.g., Proposition 5.5.5 of Meyn and Tweedie (2009)), the set $[0, d] \times \mathsf Z$ must also be petite.

Recall $\bar s \in [0, 1)$ in Assumption 3.1, $n \in \mathbb N$ and $\gamma \in (0, 1)$ in (48). Let $B := [0, d] \times \mathsf Z$.

Lemma C.9. There exist constants $b \in \mathbb R_+$, $\varrho \in (0, 1)$ and a measurable map $V : \mathsf S \to [n/\varrho, \infty)$ that is bounded on $B$, such that, for sufficiently large $d \in \mathbb R_+$ and all $(a, z) \in \mathsf S$, we have $\mathbb E_{a,z} V(a_n, Z_n) - V(a, z) \le -\varrho V(a, z) + b \mathbb 1\{(a, z) \in B\}$.

Proof.
Since $c^*(a, z) \ge (1 - \bar s) a$ by Proposition 2.6 and $M_0 := \max_{1 \le \ell \le n,\, z \in \mathsf Z} \mathbb E_z \prod_{t=1}^{\ell} R_t < \infty$ by Assumption 3.1 and Lemma A.1, by Lemma B.1 and the Markov property,

$$
\begin{aligned}
\mathbb E_{a,z}\, a_n &\le \bar s^{\,n} \mathbb E_z R_n \cdots R_1 a + \sum_{t=1}^{n} \bar s^{\,n-t} \mathbb E_z R_n \cdots R_{t+1} Y_t \\
&\le \gamma a + \sum_{t=1}^{n} \bar s^{\,n-t} \mathbb E_z Y_t \mathbb E_{Z_t} R_{t+1} \cdots R_n \le \gamma a + \sum_{t=1}^{n} \bar s^{\,n-t} M_3 M_0.
\end{aligned}
$$

Define $b_0 := \sum_{t=1}^{n} \bar s^{\,n-t} M_0 M_3$. Note that $b_0 < \infty$. Choose $\varrho \in (0, 1 - \gamma)$, $m_V \ge n/\varrho$ and $d \in \mathbb R_+$ such that $(1 - \gamma - \varrho) d \ge b_0 + \varrho m_V$. Then, for $V(a, z) := a + m_V$,

$$
\begin{aligned}
\mathbb E_{a,z} V(a_n, Z_n) - V(a, z) &\le -(1 - \gamma) a + b_0 = -\varrho a - (1 - \gamma - \varrho) a + b_0 \\
&= -\varrho V(a, z) - (1 - \gamma - \varrho) a + b_0 + \varrho m_V.
\end{aligned} \tag{56}
$$

In particular, if $(a, z) \notin B$, then $a > d$ and (56) implies that

$$
\mathbb E_{a,z} V(a_n, Z_n) - V(a, z) \le -\varrho V(a, z) - (1 - \gamma - \varrho) d + b_0 + \varrho m_V \le -\varrho V(a, z). \tag{57}
$$

Let $b := b_0 + \varrho m_V$. Then the stated claim follows from (56)–(57) and the fact that $V$ is bounded on $B$.

Proof of Theorem 3.2. Claim (1) can be proved by applying Theorem 19.1.3 (or a combination of Proposition 5.4.5 and Theorem 15.0.1) of Meyn and Tweedie (2009). The required conditions in those theorems have been established by Lemmas C.5, C.7, C.8 and C.9 above.

Regarding claim (2), Lemmas C.8 and C.9 imply that $\mathbb E_{a,z} V(a_n, Z_n) - V(a, z) \le -n + b \mathbb 1\{(a, z) \in B\}$ for all $(a, z) \in \mathsf S$, where $B := [0, d] \times \mathsf Z$ is petite. Since in addition $\{(a_t, Z_t)\}$ is $\psi$-irreducible by Lemma C.5, Theorem 19.1.2 of Meyn and Tweedie (2009) implies that $\{(a_t, Z_t)\}$ is a positive Harris chain. Claim (2) then follows from Theorem 17.1.7 of Meyn and Tweedie (2009).

To verify claim (3), since we have shown that $\Phi := \{(a_t, Z_t)\}$ is positive Harris with stationary distribution $\psi_\infty$, based on Theorem 16.1.5 and Theorem 17.5.4 of Meyn and Tweedie (2009), it suffices to show that $Q$ is $V$-uniformly ergodic. Let $\Phi^n$ be the $n$-skeleton of $\Phi$ (see page 62 of Meyn and Tweedie (2009)). Then $\Phi^n$ is $\psi$-irreducible and aperiodic by Proposition 5.4.5 of Meyn and Tweedie (2009).
Theorem 16.0.1 of Meyn and Tweedie (2009) and Lemmas C.8 and C.9 then imply that $\Phi^n$ is $V$-uniformly ergodic, and there exists $N \in \mathbb N$ such that $|||Q^{nN} - 1 \otimes \psi_\infty|||_V < 1$, where $\|\sigma\|_V := \sup_{g : |g| \le V} |\int g \, d\sigma|$ for $\sigma \in \mathscr P(\mathsf S)$ and, for all $t \in \mathbb N$,

$$
|||Q^t - 1 \otimes \psi_\infty|||_V := \sup_{(a,z) \in \mathsf S} \frac{\|Q^t((a, z), \cdot) - \psi_\infty\|_V}{V(a, z)}.
$$

To show that $Q$ is $V$-uniformly ergodic, by Theorem 16.0.1 of Meyn and Tweedie (2009), it remains to verify: $|||Q^t - 1 \otimes \psi_\infty|||_V < \infty$ for $t \le nN$. This obviously holds since, by the proof of Lemma C.9, there exist $L_0, L_1 \in \mathbb R_+$ such that, for all $t \in \mathbb N$,

$$
\begin{aligned}
|||Q^t - 1 \otimes \psi_\infty|||_V &\le \sup_{(a,z) \in \mathsf S} \sup_{|f| \le V} \frac{\int |f(a', z')|\, Q^t((a, z), d(a', z'))}{V(a, z)} + L_0 \\
&\le \sup_{(a,z) \in \mathsf S} \frac{\int V(a', z')\, Q^t((a, z), d(a', z'))}{V(a, z)} + L_0 \le L_1 + L_0 < \infty.
\end{aligned}
$$

Hence, $Q$ is $V$-uniformly ergodic and claim (3) follows. The proof is now complete.

Proof of Theorem 3.3. Take an arbitrarily large constant $k < 1$ such that

$$
P(\bar z, \bar z) > 0 \quad \text{and} \quad \mathbb P\{k G(\bar z, \bar z, \hat\zeta) > 1\} > 0,
$$

which is possible by Assumption 3.3 and the definition of $G$ in (25a). For this $k$, since $\lim_{a \to \infty} c^*(a, z)/a = \alpha(z)$ and $\mathsf Z$ is a finite set, we can take $\bar a > 0$ such that

$$
1 - \frac{c^*(a, z)}{a} \ge k(1 - \alpha(z))
$$

for all $z \in \mathsf Z$ and $a \ge \bar a$. Multiplying both sides by $R(\hat z, \hat\zeta) \ge 0$, it follows from the law of motion (21a), $Y(\hat z, \hat\eta) \ge 0$, and the definition of $G$ in (25a) that for $a \ge \bar a$,

$$
\begin{aligned}
\hat a &= R(\hat z, \hat\zeta)(a - c^*(a, z)) + Y(\hat z, \hat\eta) \\
&\ge R(\hat z, \hat\zeta)(a - c^*(a, z)) = R(\hat z, \hat\zeta)\Big(1 - \frac{c^*(a, z)}{a}\Big) a \\
&\ge R(\hat z, \hat\zeta)\, k(1 - \alpha(z))\, a = k G(z, \hat z, \hat\zeta)\, a.
\end{aligned}
$$

Let $\tilde A(z, \hat z, \hat\zeta) := k G(z, \hat z, \hat\zeta) \mathbb 1\{k G(z, \hat z, \hat\zeta) > 1\}$. Then for all $z, \hat z, \hat\zeta, \hat\eta$ and all $a \ge \bar a$,

$$
\hat a \ge \tilde A(z, \hat z, \hat\zeta)\, a. \tag{58}
$$

Start the wealth accumulation process $a_t$ from $a_0 \ge \bar a$. Consider the following process:

$$
S_{t+1} = \tilde A(Z_t, Z_{t+1}, \zeta_{t+1}) S_t,
$$

where $S_0 = a_0$. We now show that $a_t \ge S_t$ with probability one for all $t$ by induction. Since $S_0 = a_0$, the case $t = 0$ is trivial. Suppose the claim holds up to $t$. Because $a_t \ge 0$ and $S_t$ remains 0 once it becomes 0, without loss of generality we may assume $S_0, \ldots, S_t$ are all positive. Hence $\tilde A_1, \ldots, \tilde A_t > 0$.
By the definition of $\tilde A$, we have $\tilde A > 1$ whenever $\tilde A > 0$. Therefore

$$
S_t = \tilde A_t \cdots \tilde A_1 S_0 \ge S_0 = a_0 \ge \bar a.
$$

Hence applying (58), we get

$$
a_{t+1} \ge \tilde A(Z_t, Z_{t+1}, \zeta_{t+1})\, a_t \ge \tilde A(Z_t, Z_{t+1}, \zeta_{t+1})\, S_t = S_{t+1}.
$$

Now take any $p \in (0, 1)$ and let $T$ be a geometric random variable with mean $1/p$ that is independent of everything. Define

$$
\tilde\lambda(s) = (1 - p)\, r(P \odot M_{\tilde A}(s)),
$$

where $M_{\tilde A}(s)$ is as in (24). Since clearly $\tilde A \le A$ and $p > 0$, we have $\lambda > \tilde\lambda$. By Lemma 3.1 of Beare and Toda (2017), $\lambda, \tilde\lambda$ are convex, and hence continuous in the interior of their domains. Therefore $\lambda(\kappa) = 1$ and $\lambda(s) > 1$ for small enough $s > \kappa$. Hence, for any $\varepsilon > 0$, we can take small enough $p \in (0, 1)$ and large enough $k < 1$ such that $\tilde\lambda(\kappa) < 1 < \tilde\lambda(\kappa + \varepsilon) < \infty$. By Lemma 3.1 of Beare and Toda (2017), there exists a unique $\tilde\kappa \in (\kappa, \kappa + \varepsilon)$ such that $\tilde\lambda(\tilde\kappa) = 1$. Theorem 3.4 of Beare and Toda (2017) then implies that

$$
\liminf_{a \to \infty} a^{\tilde\kappa}\, \mathbb P_{a_0, z_0}\{S_T > a\} > 0
$$

for all $(a_0, z_0) \in \mathsf S$. In particular, for any initial $(a_0, z_0) \in \mathsf S$ with $a_0 \ge \bar a$,

$$
\liminf_{a \to \infty} a^{\kappa + \varepsilon}\, \mathbb P_{a_0, z_0}\{S_T > a\} > 0. \tag{59}
$$

Now suppose that we draw $a_0$ from the ergodic distribution. Then $a_t$ has the same distribution as $a_\infty$, and so does $a_T$. Therefore

$$
\begin{aligned}
\mathbb P\{a_\infty > a\} &= \mathbb P\{a_T > a\} \\
&= \mathbb P\{a_0 < \bar a\}\, \mathbb P\{a_T > a \mid a_0 < \bar a\} + \mathbb P\{a_0 \ge \bar a\}\, \mathbb P\{a_T > a \mid a_0 \ge \bar a\}.
\end{aligned} \tag{60}
$$

If the ergodic distribution of $\{a_t\}$ has unbounded support, then $\mathbb P\{a_0 \ge \bar a\} > 0$. As we have seen above, conditional on $a_0 \ge \bar a$, we have $a_t \ge S_t$ for all $t$. Therefore

$$
\liminf_{a \to \infty} a^{\kappa + \varepsilon}\, \mathbb P\{a_T > a \mid a_0 \ge \bar a\} \ge \liminf_{a \to \infty} a^{\kappa + \varepsilon}\, \mathbb P\{S_T > a \mid a_0 \ge \bar a\} > 0 \tag{61}
$$

by (59), and so (27) follows from (60) and (61).

References

Acemoglu, D. and J. A. Robinson (2002): "The Political Economy of the Kuznets curve," Review of Development Economics, 6, 183–203.
Açıkgöz, Ö. T. (2018): "On the Existence and Uniqueness of Stationary Equilibrium in Bewley Economies with Production," Journal of Economic Theory, 173, 18–55.
Ahn, S., G. Kaplan, B. Moll, T. Winberry, and C.
Wolf (2018): "When Inequality Matters for Macro and Macro Matters for Inequality," NBER Macroeconomics Annual, 32, 1–75.
Aiyagari, S. R. (1994): "Uninsured Idiosyncratic Risk and Aggregate Saving," Quarterly Journal of Economics, 109, 659–684.
Aliprantis, C. D. and K. C. Border (2006): Infinite Dimensional Analysis: A Hitchhiker's Guide, Springer.
Beare, B. K. and A. A. Toda (2017): "Geometrically Stopped Markovian Random Growth Processes and Pareto Tails," Tech. rep., UC San Diego.
Benhabib, J. and A. Bisin (2018): "Skewed Wealth Distributions: Theory and Empirics," Journal of Economic Literature, 56, 1261–1291.
Benhabib, J., A. Bisin, and M. Luo (2017): "Earnings Inequality and Other Determinants of Wealth Inequality," American Economic Review: Papers and Proceedings, 107, 593–597.
Benhabib, J., A. Bisin, and S. Zhu (2011): "The Distribution of Wealth and Fiscal Policy in Economies with Finitely Lived Agents," Econometrica, 79, 123–
Benhabib, J., A. Bisin, and S. Zhu (2015): "The Wealth Distribution in Bewley Economies with Capital Income Risk," Journal of Economic Theory, 159, 489–515.
Benhabib, J., A. Bisin, and S. Zhu (2016): "The Distribution of Wealth in the Blanchard–Yaari Model," Macroeconomic Dynamics, 20, 466–481.
Bhandari, A., D. Evans, M. Golosov, and T. J. Sargent (2018): "Inequality, Business Cycles, and Monetary-Fiscal Policy," Tech. rep., National Bureau of Economic Research.
Blackwell, D. (1965): "Discounted Dynamic Programming," Annals of Mathematical Statistics, 36, 226–235.
Brinca, P., H. A. Holter, P. Krusell, and L. Malafry (2016): "Fiscal Multipliers in the 21st Century," Journal of Monetary Economics, 77, 53–69.
Cao, D. (2020): "Recursive Equilibrium in Krusell and Smith (1998)," Journal of Economic Theory, 186.
Cao, D. and W. Luo (2017): "Persistent heterogeneous returns and top end wealth inequality," Review of Economic Dynamics, 26, 301–326.
Carroll, C. (2004): "Theoretical Foundations of Buffer Stock Saving," Tech. rep., National Bureau of Economic Research.
Carroll, C. D. (1997): "Buffer-stock Saving and the Life Cycle/Permanent Income Hypothesis," Quarterly Journal of Economics, 112, 1–55.
Chamberlain, G. and C. A. Wilson (2000): "Optimal Intertemporal Consumption under Uncertainty," Review of Economic Dynamics, 3, 365–395.
Coleman, II, W. J. (1990): "Solving the Stochastic Growth Model by Policy-Function Iteration," Journal of Business and Economic Statistics, 8, 27–29.
Datta, M., L. J. Mirman, and K. L. Reffett (2002): "Existence and Uniqueness of Equilibrium in Distorted Dynamic Economies with Capital and Labor," Journal of Economic Theory, 103, 377–410.
Davies, J. B. and A. F. Shorrocks (2000): "The Distribution of Wealth," in Handbook of Income Distribution, Elsevier, vol. 1, 605–675.
Deaton, A. and G. Laroque (1992): "On the Behaviour of Commodity Prices," Review of Economic Studies, 59, 1–23.
Epper, T., E. Fehr, H. Fehr-Duda, C. Kreiner, D. Lassen, S. Leth-Petersen, and G. Rasmussen (2018): "Time Discounting and Wealth Inequality," Tech. rep., Working paper.
Fagereng, A., L. Guiso, D. Malacrino, and L. Pistaferri (2016a): "Heterogeneity and Persistence in Returns to Wealth," Tech. rep., National Bureau of Economic Research.
Fagereng, A., L. Guiso, D. Malacrino, and L. Pistaferri (2016b): "Heterogeneity in Returns to Wealth and the Measurement of Wealth Inequality," American Economic Review: Papers and Proceedings, 106, 651–655.
Feinberg, E. A., P. O. Kasyanov, and N. V. Zadoianchuk (2014): "Fatou's Lemma for Weakly Converging Probabilities," Theory of Probability & Its Applications, 58, 683–689.
Gabaix, X., J.-M. Lasry, P.-L. Lions, and B. Moll (2016): "The Dynamics of Inequality," Econometrica, 84, 2071–2111.
Glaeser, E., J. Scheinkman, and A. Shleifer (2003): "The Injustice of Inequality," Journal of Monetary Economics, 50, 199–222.
Gouin-Bonenfant, E. and A. A. Toda (2018): "Pareto Extrapolation: Bridging Theoretical and Quantitative Models of Wealth Inequality," Tech. rep., SSRN.
Guvenen, F. and A. A. Smith (2014): "Inferring Labor Income Risk and Partial Insurance from Economic Choices," Econometrica, 82, 2085–2129.
Hansen, B. E. and K. D. West (2002): "Generalized Method of Moments and Macroeconomics," Journal of Business & Economic Statistics, 20, 460–469.
Hardy, G. H., J. E. Littlewood, and G. Polya (1952): Inequalities, Cambridge University Press.
Hills, T. S. and T. Nakata (2018): "Fiscal Multipliers at the Zero Lower Bound: The Role of Policy Inertia," Journal of Money, Credit and Banking, 50, 155–172.
Hubmer, J., P. Krusell, and A. A. Smith, Jr. (2018): "A Comprehensive Quantitative Theory of the US Wealth Distribution," Tech. rep., Yale.
Huggett, M. (1993): "The Risk-free Rate in Heterogeneous-agent Incomplete-insurance Economies," Journal of Economic Dynamics and Control, 17, 953–969.
Kaymak, B., C. S. Leung, and M. Poschke (2018): "The Determinants of Wealth Inequality and Their Implications for Economic Policy," Tech. rep., Society for Economic Dynamics.
Krasnosel'skii, M. A., G. M. Vainikko, R. Zabreyko, Y. B. Ruticki, and V. V. Stet'senko (2012): Approximate Solution of Operator Equations, Springer Netherlands.
Krusell, P. and A. A. Smith, Jr. (1998): "Income and Wealth Heterogeneity in the Macroeconomy," Journal of Political Economy, 106, 867–896.
Kuhn, M. (2013): "Recursive Equilibria in an Aiyagari-style Economy with Permanent Income Shocks," International Economic Review, 54, 807–835.
Lawrance, E. C. (1991): "Poverty and the Rate of Time Preference: Evidence from Panel Data," Journal of Political Economy, 99, 54–77.
Li, H. and J. Stachurski (2014): "Solving the Income Fluctuation Problem with Unbounded Rewards," Journal of Economic Dynamics and Control, 45, 353–365.
Ljungqvist, L. and T. J. Sargent (2012): Recursive Macroeconomic Theory, MIT Press, 4 ed.
Loewenstein, G. and D. Prelec (1991): "Negative Time Preference," American Economic Review, 81, 347–352.
Loewenstein, G. and N. Sicherman (1991): "Do Workers Prefer Increasing Wage Profiles?" Journal of Labor Economics, 9, 67–84.
Meyn, S. P. and R. L. Tweedie (2009): Markov Chains and Stochastic Stability, Springer Science & Business Media.
Miao, J. (2006): "Competitive Equilibria of Economies with a Continuum of Consumers and Aggregate Shocks," Journal of Economic Theory, 128, 274–298.
Morand, O. F. and K. L. Reffett (2003): "Existence and Uniqueness of Equilibrium in Nonoptimal Unbounded Infinite Horizon Economies," Journal of Monetary Economics, 50, 1351–1373.
Pareto, V. (1896): La Courbe de la Répartition de la Richesse, Lausanne: Imprimerie Ch. Viret-Genton.
Rabault, G. (2002): "When Do Borrowing Constraints Bind? Some New Results on the Income Fluctuation Problem," Journal of Economic Dynamics and Control, 26, 217–245.
Rouwenhorst, K. G. (1995): "Asset Pricing Implications of Equilibrium Business Cycle Models," in Frontiers of Business Cycle Research, ed. by T. F. Cooley, Princeton University Press, chap. 10, 294–330.
Saez, E. and G. Zucman (2016): "Wealth Inequality in the United States since 1913: Evidence from Capitalized Income Tax Data," Quarterly Journal of Economics, 131, 519–578.
Schechtman, J. (1976): "An Income Fluctuation Problem," Journal of Economic Theory, 12, 218–241.
Schechtman, J. and V. L. S. Escudero (1977): "Some Results on 'An Income Fluctuation Problem'," Journal of Economic Theory, 16, 151–166.
Schorfheide, F., D. Song, and A. Yaron (2018): "Identifying Long-Run Risks: A Bayesian Mixed-Frequency Approach," Econometrica, 86, 617–654.
Stachurski, J. (2009): Economic Dynamics: Theory and Computation, MIT Press.
Stachurski, J. and A. A. Toda (2019): "An Impossibility Theorem for Wealth in Heterogeneous-agent Models with Limited Heterogeneity," Journal of Economic Theory, 182, 1–24.
Toda, A. A. (2014): "Incomplete Market Dynamics and Cross-Sectional Distributions," Journal of Economic Theory, 154, 310–348.
Toda, A. A. (2019): "Wealth Distribution with Random Discount Factors," Journal of Monetary Economics, 104, 101–113.
Toda, A. A. and K. Walsh (2015): "The Double Power Law in Consumption and Implications for Testing Euler Equations," Journal of Political Economy, 123, 1177–1200.
Vermeulen, P. (2018): "How Fat Is the Top Tail of the Wealth Distribution?" Review of Income and Wealth, 64, 357–387.

Journal

Economics, arXiv (Cornell University)

Published: May 29, 2019
