
Forecasts of US Short-term Interest Rates: A Flexible Forecast Combination Approach∗

Massimo Guidolin
Federal Reserve Bank of St. Louis - Research Dept.

Allan Timmermann
University of California San Diego

August 21, 2006

Abstract

This paper develops a flexible approach to combine forecasts of future spot rates with forecasts from time-series models or macroeconomic variables. We find empirical evidence that accounting for both regimes in interest rate dynamics and combining forecasts from different models helps improve the out-of-sample forecasting performance for US short-term rates. Imposing restrictions from the expectations hypothesis on the forecasting model is found to help at long forecasting horizons.

1. Introduction

Accurate interest rate forecasts are crucial for investors’ savings and investment decisions as well as in many monetary policy decisions. How best to model and predict the dynamics in interest rates is therefore an issue that has long occupied researchers in finance and econometrics. Three broad approaches can be identified. Since interest rates tend to be highly persistent, autoregressive models are commonly used to produce benchmark forecasts. Alternatively, under the expectations hypothesis, forward rates can be expected to provide optimal and, under additional restrictions, unbiased forecasts of future spot rates. A third class of forecasting models expands the information set and identifies macroeconomic variables as predictors.

Application of these forecasting models is made difficult by evidence of complicated non-linear dynamics in interest rates. This suggests that the relationship between spot and forward rates changes over time in a way that can sometimes involve discrete shifts, possibly due to changes in monetary policies.1 Furthermore, empirical findings suggest that the success of the expectations hypothesis and the ability of forward rates to forecast future spot rates may vary across forecast horizons: forward rates appear to be unbiased predictors of movements in future spot rates at longer horizons but perform worse at short horizons.

∗ We thank Chung-Ming Kuan, two anonymous referees and seminar participants at the SETA 2005 meetings at Academia Sinica for helpful comments.
1 Ait-Sahalia (1996) finds evidence of non-linear mean reversion in interest rates. Hall, Anderson and Granger (1992) find that the expectations hypothesis is rejected during periods that coincide with changes in the underlying monetary policy regime. Other papers documenting non-linearities in short-term interest rates include Hamilton (1988), Sola and Driffill (1994), Gray (1996), Ang and Bekaert (2002) and Bansal, Tauchen, and Zhou (2004).

Building on the insight that the relative precision of different forecasting approaches varies both over time and across forecast horizons, we propose in this paper a flexible forecast combination approach that

does not restrict the forecaster to always choose one econometric model over other alternatives. Instead, the approach allows forecasts from different models to be combined and lets the combination weights vary across a set of underlying states that get identified through the joint process governing spot, forward rates and other conditioning information. Since both the persistence of the states and the covariance between the forecasts can vary across states, the optimal combination weights will depend both on the current state probabilities and on the forecast horizon. In real time, as the underlying state probabilities are updated based on the arrival of new information, the optimal forecast combination weights will also change. In summary, our approach exploits information in short-term spot and forward rates and in a variety of macroeconomic variables to compute combined forecasts that are optimal given the assumed (joint) data generating process. When applied to data on 1-month US interest rates over the period 1950-2003, our analysis reveals a number of interesting findings. Forecast combinations that incorporate information beyond spot rates are found to produce better predictions than traditional benchmarks both at short and long forecast horizons. Indeed, our results suggest that it is important both to combine information embedded in different forecasts and to allow for nonlinear (regime) dynamics in spot and forward rates. Gains in forecast precision can be sizeable both in statistical and economic terms. The paper’s main contributions are the following. First, our paper extends the forecast combination literature by proposing a flexible combination approach that extends earlier work in the combination literature by Bates and Granger (1969), Diebold and Pauly (1987), Deutsch, Granger and Terasvirta (1994), Stock and Watson (2001, 2005) and Aiolfi and Timmermann (2004). 
Our approach can also be viewed as an extension to earlier approaches to forecast combinations under regime switching (Elliott and Timmermann (2005)). Second, we develop a bivariate model that captures nonlinear dynamics in the joint process for spot and forward rates. The nonlinearities appear to be well captured through a regime switching model characterized by four states with very different levels of volatility and long-run (unconditional) interest rates. Third, we conduct an analysis of the out-of-sample predictability of spot rates that compares a wide range of approaches and shows how different sources of information (reflected in forward rates, autoregressive models, theory-based models and macroeconomic models) may be helpful at both short and long forecast horizons. We show that restrictions from the expectations hypothesis, when imposed on the forecasts, can be used to improve the out-of-sample forecasting performance compared to models that do not make use of these restrictions. We also find that model combination provides a useful approach to improve forecast accuracy for interest rates.

The plan of the paper is as follows. Section 2 introduces the multivariate hidden Markov model used in the paper and derives analytical results on optimal forecast combinations. Section 3 presents econometric estimates for the US interest rate data and characterizes the optimal combination weights using these estimates. Section 4 conducts an out-of-sample forecasting experiment and section 5 concludes.

2. A Flexible Forecast Combination Model

A large literature has found empirical evidence that combining forecasts from different models generally leads to improved out-of-sample forecasting performance relative to a strategy of selecting the single best forecasting model.2 The classic Bates-Granger (1969) forecast combination regression takes the form:

\[ y_{t+1} = \mu_y + a_y' \hat{y}_{t,t+1} + \varepsilon_{t+1}, \tag{1} \]

where $y_{t+1}$ is the predicted variable and $\hat{y}_{t,t+1}$ is a vector of forecasts of $y_{t+1}$. This regression assumes that the combination weights are stable through time. However, it is easy to think of situations where models that, on average, generate superior forecasting performance may be slower to adapt in some states of the world than other models that generate higher average loss. Similarly, forecasting models that are superior at short horizons may fail to be so at medium or long horizons. If the parameters of the underlying data generating process are unstable and the instability is sufficiently idiosyncratic (model-specific), it is plausible that there can be gains from combining forecasts from different econometric specifications.

Our paper is concerned with forecast combinations in situations such as these. We use a multivariate regime switching process to capture the existence of common, discrete factors driving both the stochastic process of the variable of interest (the 1-month spot rate) and a related market variable (the 1-month forward rate) that can be construed as a predictor of the target variable. Although the regime switching model only provides a reduced form for the underlying joint process, it can accommodate time-varying parameters and differences between the conditional (short-term) and unconditional (long-term) moments of the data generating process.3

To this end consider the following joint stochastic process for $z_{t+1} \equiv (y_{t+1}\ \hat{y}_{t+1,t+2}')'$, where $\hat{y}_{t+1,t+2}$ is a vector of forecasts of $y_{t+2}$ produced at time $t+1$ so $z_{t+1}$ is adapted to the filtration $\mathcal{F}_{t+1}$:

\[ \begin{pmatrix} y_{t+1} \\ \hat{y}_{t+1,t+2} \end{pmatrix} = \begin{pmatrix} \mu^*_{y,S_{t+1}} \\ \mu^*_{S_{t+1}} \end{pmatrix} + \sum_{j=1}^{p} \begin{pmatrix} a^*_{y,S_{t+1}} \\ A^*_{j,S_{t+1}} \end{pmatrix} \begin{pmatrix} y_{t+1-j} \\ \hat{y}_{t+1-j,t+2-j} \end{pmatrix} + \varepsilon_{t+1}, \]

or

\[ z_{t+1} = \mu_{S_{t+1}} + \sum_{j=1}^{p} A_{j,S_{t+1}} z_{t+1-j} + \varepsilon_{t+1}. \tag{2} \]

The discrete state variable $S_{t+1}$ (which is not known at time $t$, i.e. $S_{t+1} \notin \mathcal{F}_t$) takes integer values between 1 and $k$, $\mu_{S_{t+1}}$ is the intercept in state $S_{t+1}$, $A_{j,S_{t+1}}$ is the VAR(j) matrix in state $S_{t+1}$, and $\varepsilon_{t+1} \sim N(0, \Omega_{S_{t+1}})$ is the vector of innovations in $z_{t+1}$ which has zero mean and state-specific covariance matrix $\Omega_{S_{t+1}}$. To complete the model we assume that $S_{t+1}$ is driven by a first order, homogeneous Markov process with constant transition probability matrix $P$,

\[ P[i,j] = \Pr(S_{t+1} = j \mid S_t = i) = p_{ij}, \qquad i, j = 1, \ldots, k. \tag{3} \]
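As a rough illustration of (2)-(3), the sketch below simulates a regime switching VAR(1) and computes a few standard quantities implied by a transition matrix $P$: the $h$-step-ahead state probabilities $\hat{\pi}_t' P^h$, the long-run (ergodic) probabilities, and the expected regime durations $1/(1-p_{ii})$. The function names and the lower-triangular factors `chol` (with `chol[s]` times its transpose equal to $\Omega_s$) are assumptions made for this example; they are not taken from the paper, and the sketch is not the authors' estimation code.

```python
import random


def predict_state_probs(pi, P, h):
    """Return pi' P^h, the state probabilities h periods ahead."""
    k = len(P)
    probs = list(pi)
    for _ in range(h):
        probs = [sum(probs[i] * P[i][j] for i in range(k)) for j in range(k)]
    return probs


def ergodic_probs(P, iterations=2000):
    """Approximate long-run state probabilities by iterating pi' P.

    Assumes the chain is ergodic, so the iteration converges.
    """
    k = len(P)
    return predict_state_probs([1.0 / k] * k, P, iterations)


def expected_durations(P):
    """Expected number of periods spent in each state: 1 / (1 - p_ii)."""
    return [1.0 / (1.0 - P[i][i]) for i in range(len(P))]


def simulate_ms_var1(mu, A, chol, P, z0, s0, T, seed=0):
    """Simulate z_{t+1} = mu[S] + A[S] z_t + eps, with S = S_{t+1}.

    mu[s]   : intercept vector in state s
    A[s]    : VAR(1) matrix in state s
    chol[s] : lower-triangular L_s with L_s L_s' = Omega_s (hypothetical input)
    P       : transition matrix, P[i][j] = Pr(S_{t+1} = j | S_t = i)
    """
    rng = random.Random(seed)
    k, n = len(P), len(z0)
    z, s = list(z0), s0
    path, states = [], []
    for _ in range(T):
        # Draw the next state from row s of the transition matrix.
        s = rng.choices(range(k), weights=P[s])[0]
        # Gaussian innovation with state-specific covariance.
        u = [rng.gauss(0.0, 1.0) for _ in range(n)]
        eps = [sum(chol[s][i][j] * u[j] for j in range(i + 1)) for i in range(n)]
        z = [mu[s][i] + sum(A[s][i][j] * z[j] for j in range(n)) + eps[i]
             for i in range(n)]
        path.append(z)
        states.append(s)
    return path, states
```

For instance, with a hypothetical two-state matrix `P = [[0.9, 0.1], [0.2, 0.8]]`, `expected_durations(P)` gives durations of roughly ten and five periods, and `ergodic_probs(P)` gives long-run probabilities of about two thirds and one third.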

Both the realized values of the variable of interest, $y_{t+1}$, and the vector of one-step-ahead forecasts, $\hat{y}_{t+1,t+2}$, allow us to extract information about the unobserved (realized) states $\{s_1, \ldots, s_{t+1}\}$. The model assumes that the realization and predictions, aligned in time so they are adapted to a common information set, are driven by a common state variable, $S_{t+1}$. This model nests many interesting special cases.

2 Surveys have been provided by Diebold and Lopez (1996) and Timmermann (2006).
3 Rudebusch and Wu (2004) argue that regimes may occur in term structure data as a consequence of monetary policy shifts.

It can be viewed as a generalization of (1) which emerges as the first equation of (2) in the special case where $p = k = 1$, $A^*_1 = O$, and the first element of $a^*_y$ is zero, i.e. when there is a single state, only the current forecast matters and past values of both the predicted

and actual variable are excluded. Furthermore, (2) provides a flexible parametric representation from which the first and second conditional moments of $z_{t+1}$ can readily be derived. The model allows for conditional heteroskedasticity, skew and kurtosis in the forecast errors and can also facilitate persistent sources of bias.

Our extension of the Bates-Granger framework can be motivated in several ways. First, it is unlikely that the best forecasting model remains the same through time or even that a particular model has a constant bias. By allowing for regimes with different degrees of persistence, the ranking of the models and their usefulness (weight) in a combined forecast will also change over time. Indeed, it seems reasonable to assign different weights to forecasts that embody different sources of information in accordance with the underlying state of the economy or the monetary policy regime. Secondly, the extension of the conditional model of Bates and Granger to a multivariate VAR setting means that forecasts at arbitrary horizons can be generated through forward iteration even if the underlying forecasts use a shorter horizon than desired. Third, the framework readily allows for combining of pure time-series forecasts (reflected in lagged values of $y$) and forecasts that use information from other sources such as macroeconomic models (reflected in $\hat{y}$).

Another interesting special case is a regime-switching, heteroskedastic version of Vasicek’s affine model. To see this, let $y_{t+1}$ and $\hat{y}_{t+1,t+2}$ be the 1-period spot and forward rate, respectively. Setting $p = 1$, $a_{12,S_{t+1}} = 0$, and assuming no contemporaneous correlation between shocks to the spot and forward rates, the first equation of (2) implies

\[ y_{t+1} = \mu_{S_{t+1}} + a_{11,S_{t+1}} y_t + \varepsilon_{y,t+1}, \qquad \varepsilon_{y,t+1} \sim N(0, \sigma^2_{S_{t+1}}). \]

Hence (2) generalizes classic, linear term structure models to incorporate regime shifts and allows for a two-factor specification in which the short rate is also affected by shocks to the forward rate. Finally, when $k = 1$, equation (2) reduces to a simple, homoskedastic VAR(p) model

\[ \begin{pmatrix} y_{t+1} \\ \hat{y}_{t+1,t+2} \end{pmatrix} = \begin{pmatrix} \mu^*_y \\ \mu^* \end{pmatrix} + \sum_{j=1}^{p} A_j \begin{pmatrix} y_{t+1-j} \\ \hat{y}_{t+1-j,t+2-j} \end{pmatrix} + \varepsilon_{t+1}, \qquad \varepsilon_{t+1} \sim N(0, \Omega). \tag{4} \]

2.1. Forecast Combinations

We next turn to the issue of generating forecasts from the econometric model (2). To maintain sufficient generality and to ensure that our approach nests methods in common use and also is practical to implement, it is convenient to distinguish between those forecasts in $\hat{y}_{t,t+h} \equiv (\hat{y}_{1,t,t+h}'\ \hat{y}_{2,t,t+h}')'$ that affect the identification of the regimes (labelled $\hat{y}_{1,t,t+h}$) and those that do not (i.e. traditional Bates-Granger forecasting variables, $\hat{y}_{2,t,t+h}$). We shall assume $n_1$ and $n_2$ of these forecasting variables, respectively, so $n = n_1 + n_2$ is the total number of forecasts. The forecasts in $\hat{y}_{2,t,t+h}$ serve as benchmark ‘testers’, since the last $n_2$ elements of $\hat{y}_{t,t+h}$ should not matter under the assumption that the first $n_1$ predictions are optimal.

Forecast combinations seek to choose the $n \times 1$ vector of weights $\omega_{t,t+h}$ that minimize the average expected loss given current state probabilities, $\hat{\pi}_t$, $E[L(e_{t+h}) \mid \hat{\pi}_t]$ (denoted $E_t[L(e_{t+h})]$), from $h$-period forecast errors defined as:

\[ e_{t+h} \equiv y_{t+h} - \omega_{0,t,t+h} - \omega_{t,t+h}' \hat{y}_{t,t+h}. \]

The constant $\omega_{0,t,t+h}$ adjusts for possible biases in the combined forecast $\omega_{t,t+h}' \hat{y}_{t,t+h}$. Because of the conditional regime switching dynamics, the combination weights will generally be a highly non-linear function of past data. More general combination schemes could be considered, but we shall not do so here.

Formally the forecast combination problem takes the form:

\[ \min_{\omega_{0,t,t+h},\ \omega_{t,t+h}} E_t[L(y_{t+h} - \omega_{0,t,t+h} - \omega_{t,t+h}' \hat{y}_{t,t+h})] \quad \text{s.t. } \omega_{t,t+h} \in C, \]

where $C$ is an admissible region for the weights. When it can be assumed that the individual forecasts are unbiased, it is common to restrict the combination weights to be non-negative, sum to unity and impose that the intercept is zero:

\[ \omega_{0,t,t+h} = 0, \qquad \sum_{i=1}^{n} \omega_{t,t+h}[i] = 1, \qquad C \equiv \times_{i=1}^{n} [0, 1], \]

where $\omega[i]$ is the $i$-th element of $\omega$. Such restrictions may lead to efficiency gains in the estimation of the combination weights. Throughout the paper we follow common practice and assume that loss is quadratic, i.e. $L(e_{t+h}) = e^2_{t+h}$. Thus the objective is to minimize mean squared forecast error (MSFE) loss:

\[ \begin{aligned} E_t[e^2_{t+h}] = {} & Var_t[y_{t+h}] + \omega_{t,t+h}' Var_t[\hat{y}_{t,t+h}] \omega_{t,t+h} - 2\omega_{t,t+h}' Cov_t[y_{t+h}, \hat{y}_{t,t+h}] \\ & + \{E_t[y_{t+h}] - \omega_{0,t,t+h}\}^2 + \{\omega_{t,t+h}' E_t[\hat{y}_{t,t+h}]\}^2 - 2(E_t[y_{t+h}] - \omega_{0,t,t+h}) \omega_{t,t+h}' E_t[\hat{y}_{t,t+h}]. \end{aligned} \tag{5} \]
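The decomposition in (5) can be checked numerically. The sketch below is only an illustration under stated assumptions: it replaces the conditional moments in (5) with plain sample moments (dividing by the number of observations) for two forecasts, and compares the resulting expression with the directly computed average squared error. The function names are invented for this example.

```python
def mean(xs):
    """Sample mean."""
    return sum(xs) / len(xs)


def cov(xs, ys):
    """Sample covariance with divisor N, so that the identity in (5) is exact."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def msfe_direct(y, f1, f2, w0, w):
    """Average squared error of the combined forecast w0 + w[0]*f1 + w[1]*f2."""
    return mean([(yi - w0 - w[0] * a - w[1] * b) ** 2
                 for yi, a, b in zip(y, f1, f2)])


def msfe_decomposed(y, f1, f2, w0, w):
    """Right-hand side of (5), with sample moments standing in for E_t, Var_t, Cov_t."""
    f = [f1, f2]
    var_y = cov(y, y)
    quad = sum(w[i] * w[j] * cov(f[i], f[j]) for i in range(2) for j in range(2))
    cross = sum(w[i] * cov(y, f[i]) for i in range(2))
    ey = mean(y)
    wf = sum(w[i] * mean(f[i]) for i in range(2))
    return (var_y + quad - 2.0 * cross
            + (ey - w0) ** 2 + wf ** 2 - 2.0 * (ey - w0) * wf)
```

For any data and any choice of `w0` and `w`, the two functions should agree up to floating-point rounding, which makes the roles of the variance, covariance and bias terms in (5) explicit.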

It is instructive to consider each of the elements in (5). For concreteness, assume that $n_1 = n_2 = 1$, $h = 1$, $\hat{y}_{1,t,t+1} = f_{t,1}$, the one-month forward rate, and $\hat{y}_{2,t,t+1} = a y_t$ (a simple random walk benchmark when $a = 1$). Let $\hat{\pi}_t = \hat{\pi}(\ddot{z}_{t-1})$ be the conditional state probabilities given the entire history of $z$, $\ddot{z}_{t-1} \equiv \{z_{t-j}\}_{j=1}^{t}$, and define the $3 \times 3$ matrix $V_{t,t+1}$ with generic element $v_{i,j}$:

\[ v_{i,j} = \begin{cases} e_i' \left[ \sum_{s=1}^{k} (\hat{\pi}_t' P e_s) \Omega_s + \sum_{s=1}^{k} (\hat{\pi}_t' e_s) C_s^{1,1} \right] e_j & i, j \le 2 \\ a\, e_1' \left[ \sum_{s=1}^{k} (\hat{\pi}_t' e_s) C_s^{0,1} \right] e_j & i = 3,\ j \le 2 \\ a\, e_i' \left[ \sum_{s=1}^{k} (\hat{\pi}_t' e_s) C_s^{0,1} \right] e_1 & j = 3,\ i \le 2 \\ a^2 e_1' \left[ \sum_{s=1}^{k} (\hat{\pi}_t' P e_s) \Omega_s + \sum_{s=1}^{k} (\hat{\pi}_t' e_s) C_s^{1,1} \right] e_1 & i = j = 3 \end{cases} \tag{6} \]

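where $e_s$ denotes a column vector of suitable dimension with a 1 in the $s$-th position and zeros elsewhere so $\hat{\pi}_t' P e_s$ selects the conditional probability of the $s$-th state at time $t+1$. The relevant conditional moments, conditioned on the state probabilities $\hat{\pi}_t$ and denoted by $E_t[\cdot]$, $Var_t[\cdot]$ and $Cov_t[\cdot]$, respectively, are:

\[ \begin{aligned} E_t[y_{t+1}] &= \sum_{s=1}^{k} (\hat{\pi}_t' P e_s) \left[ \mu^*_{ys} + e_1' \sum_{j=1}^{p} A_{j,s} \begin{pmatrix} y_{t+1-j} \\ \hat{y}_{t+1-j,t+2-j} \end{pmatrix} \right] \\ Var_t[y_{t+1}] &= e_1' V_{t,t+1} e_1 \\ Var_t[\hat{y}_{t,t+1}] &= Q_{n_1+1}' V_{t,t+1} Q_{n_1+1} \\ Cov_t[y_{t+1}, \hat{y}_{t,t+1}] &= e_1' V_{t,t+1} Q_{n_1+1} \\ E_t[\hat{y}_{t,t+1}] &= \begin{pmatrix} \sum_{s=1}^{k} (\hat{\pi}_t' e_s)\, Q_{n_1}' \left( \mu_s + \sum_{j=1}^{p} A_j E_t[z_{t+h-1-j}] \right) \\ a \sum_{s=1}^{k} (\hat{\pi}_t' e_s)\, e_1' \left( \mu_s + \sum_{j=1}^{p} A_j E_t[z_{t+h-1-j}] \right) \end{pmatrix}, \end{aligned} \tag{7} \]

where

\[ \begin{aligned} C_s^{q,h} &\equiv \left( (M' - \mu_{t+q} \otimes \iota_k') + \Delta_{t+q} \right) D_s^{q} P^{h-q} \left( (M - \iota_k \otimes \mu_{t+h}') + \Delta_{t+h}' \right), \qquad h \ge q \ge 0 \\ \Delta_{t+h} &\equiv \begin{pmatrix} \sum_{j=1}^{p} e_1' (A_{j,1} - \bar{\alpha}_j^{h}) z_{t+1-j} & \cdots & \sum_{j=1}^{p} e_1' (A_{j,k} - \bar{\alpha}_j^{h}) z_{t+1-j} \\ \vdots & & \vdots \\ \sum_{j=1}^{p} e_n' (A_{j,1} - \bar{\alpha}_j^{h}) z_{t+1-j} & \cdots & \sum_{j=1}^{p} e_n' (A_{j,k} - \bar{\alpha}_j^{h}) z_{t+1-j} \end{pmatrix} \\ \bar{\alpha}_j^{h} &\equiv (\hat{\pi}_t' \otimes I_n)(P^{h} \otimes I_n) \mathbf{A}_j \\ D_s^{h} &\equiv \mathrm{diag}\{ e_s' P^{h} e_1,\ e_s' P^{h} e_2,\ \ldots,\ e_s' P^{h} e_k \} \\ M &\equiv (\mu_1\ \mu_2\ \cdots\ \mu_k)', \qquad \mu_{t+h}' \equiv \hat{\pi}_t' P^{h} M \\ \mathbf{A}_j &\equiv (A_{j,1}\ A_{j,2}\ \cdots\ A_{j,k})', \qquad Q_n \equiv \begin{bmatrix} 0\ 0\ \cdots\ 0 \\ I_n \end{bmatrix} \end{aligned} \tag{8} \]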

and, for any matrix $B$, $B^q \equiv \prod_{i=1}^{q} B$ and $B^0 = I$. The matrix $C_s^{1,1}$ collects variance terms from discrete shifts in the conditional mean parameters, i.e. differences between the period $t+1$ regime-specific intercepts ($\mu_s$ for $s = 1, \ldots, k$, the rows of $M$), VAR(j) coefficients ($A_{j,s}$ for $s = 1, \ldots, k$) and their predicted values at time $t+1$, $\mu_{t+1}$ and $\{\bar{\alpha}_j^{1}\}_{j=1}^{p}$.4 Similarly, the matrix $C_s^{0,1}$ collects auto-covariance terms from discrete shifts in the conditional mean parameters.5 The diagonal matrix $D_s^{h}$ attaches probability weights to possible shifts in the mean parameters at time $t+h$. Using (7)-(8), the combination weights minimizing (5) emerge from the necessary and sufficient conditions for a minimum that take the form of a linear system in three equations and three unknowns:

\[ \hat{\omega}_{0,t,t+1} - E_t[y_{t+1}] + \hat{\omega}_{t,t+1}' E_t[\hat{y}_{t,t+1}] = 0 \tag{9} \]

\[ \hat{\omega}_{t,t+1}' (Q_n' V_{t,t+1} Q_n) - E_t[y_{t+1}] (E_t[\hat{y}_{t,t+1}])' + \hat{\omega}_{0,t,t+1} (E_t[\hat{y}_{t,t+1}])' + \hat{\omega}_{t,t+1}' E_t[\hat{y}_{t,t+1}] (E_t[\hat{y}_{t,t+1}])' - (e_1' V_{t,t+1} Q_n) = 0. \]

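Clearly $\hat{\omega}_{0,t,t+1}$ corrects for any biases in the combined forecast. The combination weights on $\hat{y}_{1,t,t+1}$ and the tester $\hat{y}_{2,t,t+1}$ depend not only on the usual covariance terms, but also reflect the means of the forecasts within and across each state (through $C_s^{1,1}$ and $C_s^{0,1}$) as well as the current state probabilities, $\hat{\pi}_t$, the state transition probabilities, $P$, and differences in the covariance terms across states.

4 For instance, assuming that $k = 2$ and $p = 1$, we have

\[ \mu_{t+1} = M' P' \pi_t = \begin{bmatrix} \mu_{11} & \mu_{12} \\ \mu_{21} & \mu_{22} \end{bmatrix} \begin{bmatrix} p_{11} & p_{21} \\ p_{12} & p_{22} \end{bmatrix} \begin{bmatrix} \pi_{1t} \\ \pi_{2t} \end{bmatrix} = \begin{bmatrix} \mu_{11}(p_{11}\pi_{1t} + p_{21}\pi_{2t}) + \mu_{12}(p_{12}\pi_{1t} + p_{22}\pi_{2t}) \\ \mu_{21}(p_{11}\pi_{1t} + p_{21}\pi_{2t}) + \mu_{22}(p_{12}\pi_{1t} + p_{22}\pi_{2t}) \end{bmatrix}, \]

\[ \bar{\alpha}_j = (\hat{\pi}_t' \otimes I_2)(P \otimes I_2) \mathbf{A}_j = \begin{bmatrix} \pi_{1t} & 0 & \pi_{2t} & 0 \\ 0 & \pi_{1t} & 0 & \pi_{2t} \end{bmatrix} \begin{bmatrix} p_{11} & 0 & p_{12} & 0 \\ 0 & p_{11} & 0 & p_{12} \\ p_{21} & 0 & p_{22} & 0 \\ 0 & p_{21} & 0 & p_{22} \end{bmatrix} \begin{bmatrix} A_{1,1} \\ A_{1,2} \end{bmatrix} \]

\[ = \begin{bmatrix} (p_{11}\pi_{1t} + p_{21}\pi_{2t}) a_{11,1} + (p_{12}\pi_{1t} + p_{22}\pi_{2t}) a_{11,2} & (p_{11}\pi_{1t} + p_{21}\pi_{2t}) a_{12,1} + (p_{12}\pi_{1t} + p_{22}\pi_{2t}) a_{12,2} \\ (p_{11}\pi_{1t} + p_{21}\pi_{2t}) a_{21,1} + (p_{12}\pi_{1t} + p_{22}\pi_{2t}) a_{21,2} & (p_{11}\pi_{1t} + p_{21}\pi_{2t}) a_{22,1} + (p_{12}\pi_{1t} + p_{22}\pi_{2t}) a_{22,2} \end{bmatrix}, \]

where $a_{ij,1}$ is the $[i,j]$ element of $A_{1,1}$.
5 In general, $Cov(y_t, y_{t+h} \mid \hat{\pi}_t) = \sum_{s=1}^{k} (\hat{\pi}_t' P e_s)\, Cov(y_t, y_{t+h} \mid S_t = s)$, where $Cov(y_t, y_{t+h} \mid S_t = s) = \left( (M' - \mu_t \otimes \iota_k') + \Delta_t \right) D_s^{h} \left( (M - \iota_k \otimes \mu_{t+h}') + \Delta_{t+h}' \right)$.

Compared to the standard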

forecast combination problem, this yields a richer optimization program that captures the idea of including a forecasting model in the combination because it acts as a hedge against model misspecification which takes the form of discrete shifts in the data generating process.

These results are naturally generalized in two ways. First, the tester forecast, $\hat{y}_{2,t,t+1}$, may differ from $a y_t$. Allowing for this, the conditional MSFE is given by (dropping the $h$ subscripts from $\omega_t$)

\[ \begin{aligned} E_t[e^2_{t+1}] = {} & e_1' V_{t,t+1} e_1 + \{E_t[y_{t+1}]\}^2 + (\omega_{0,t})^2 - 2\omega_{0,t} E_t[y_{t+1}] + (\omega_t' \hat{y}_{t,t+1})^2 \\ & - 2 E_t[y_{t+1}] \omega_t' \hat{y}_{t,t+1} + 2\omega_{0,t} \omega_t' \hat{y}_{t,t+1} + \omega_t' (Q_n' V_{t,t+1} Q_n) \omega_t - 2\omega_t' (e_1' V_{t,t+1} Q_n)', \end{aligned} \tag{10} \]

where $V_{t,t+1}$ generalizes (6). When producing $h$-step forecasts for $h \ge 2$, (5) still describes the conditional MSFE, but the expressions for the relevant moments are different. Forecasts are best understood by rewriting (2) as (assuming $p > h$)

\[ z_{t+h} = \mu_{s_{t+h}} + \sum_{j=1}^{h-1} A_{s_{t+h},j} z_{t+h-j} + \sum_{j=h}^{p} A_{s_{t+h},j} z_{t+h-j} + \varepsilon_{t+h}, \]

and using that $z_{t+h-j} \in \mathcal{F}_t$ for $j \ge h$. Only $z_{t+1}, \ldots, z_{t+h}$ need to be predicted. Still, if the autoregressive matrices are state-dependent, the recursive expressions are generally path dependent. This can most easily be seen when $p = 1$, in which case we obtain the following expression:

\[ z_{t+h} = \sum_{j=0}^{h-1} \left( \prod_{i=1}^{j} A_{s_{t+h+1-i}} \right) (\mu_{s_{t+h-j}} + \varepsilon_{t+h-j}) + \left( \prod_{i=0}^{h-1} A_{s_{t+h-i}} \right) z_t, \]

where $\prod_{i=1}^{0} A_{s_{t+h+1-i}} \equiv I_{n_1+1}$. This expression is most easily evaluated by numerical simulation (Granger and Terasvirta (1993)), which is not a problem since the Markovian form of the model makes it ideally suited for this type of method.

When $A$ is not state dependent, the multi-step forecasts simplify considerably. This is an important case to consider since the empirical analysis in the next section finds no evidence of such state dependence. The task is again easier when $n_2 = 1$, $\hat{y}_{t,t+h}[n] = a^h y_t$, so that

\[ E_t[y_{t+h}] = \sum_{s=1}^{k} (\hat{\pi}_t' P^{h} e_s) \left[ \mu^*_{y,s} + e_1' \sum_{j=1}^{p} A_j E_t[z_{t+h-j}] \right] \tag{11} \]

\[ \begin{aligned} Var_t[y_{t+h}] &= e_1' V_{t,t+h} e_1 \\ Var_t[\hat{y}_{t,t+h}] &= Q_{n_1+1}' V_{t,t+h} Q_{n_1+1} \\ Cov_t[y_{t+h}, \hat{y}_{t,t+h}] &= Q_{n_1+1}' V_{t,t+h} e_1 \\ E_t[\hat{y}_{t,t+h}] &= \begin{pmatrix} \sum_{s=1}^{k} (\hat{\pi}_t' P^{h-1} e_s)\, Q_{n_1}' \left( \mu_s + \sum_{j=1}^{p} A_j E_t[z_{t+h-1-j}] \right) \\ a^{h} \sum_{s=1}^{k} (\hat{\pi}_t' P^{h-1} e_s)\, e_1' \left( \mu_s + \sum_{j=1}^{p} A_j E_t[z_{t+h-1-j}] \right) \end{pmatrix}, \end{aligned} \tag{12} \]

where $V_{t,t+h}$ now has generic element $v_{i,j}$ defined as

\[ v_{i,j} = \begin{cases} e_i' \left[ \sum_{s=1}^{k} (\hat{\pi}_t' P^{h} e_s) \Omega_s + \sum_{s=1}^{k} (\hat{\pi}_t' e_s) C_s^{h,h} \right] e_j & i, j \le n_1 \\ a^{h} e_1' \left[ \sum_{s=1}^{k} (\hat{\pi}_t' e_s) C_s^{0,h} \right] e_j & i = n_1 + 1,\ j \le n_1 \\ a^{h} e_i' \left[ \sum_{s=1}^{k} (\hat{\pi}_t' e_s) C_s^{0,h} \right] e_1 & j = n_1 + 1,\ i \le n_1 \\ a^{2h} e_1' \left[ \sum_{s=1}^{k} (\hat{\pi}_t' P^{h} e_s) \Omega_s + \sum_{s=1}^{k} (\hat{\pi}_t' e_s) C_s^{h,h} \right] e_1 & i = j = n_1 + 1 = n \end{cases}, \tag{13} \]

and $E_t[z_{t+i-j}]$ follows a recursive structure:

\[ E_t[z_{t+i-j}] = \begin{cases} z_{t+i-j} & \text{if } i \le j \\ \sum_{s=1}^{k} (\hat{\pi}_t' P^{i-j} e_s) \left[ \mu_s + \sum_{j=1}^{p} A_{j,s} E_t[z_{t+i-j-1}] \right] & \text{if } i > j \end{cases}. \]

The $C_s^{h,h}$ matrix represents the contribution to the variance of $z_{t+h}$ arising from the possibility of switches in the conditional mean parameters between periods $t$ and $t+h$, while $C_s^{0,h}$ collects $h$-step auto-covariances. The objective function in the minimization program now becomes

\[ \begin{aligned} E_t[e^2_{t+h}] = {} & e_1' V_{t,t+h} e_1 + E_t[y_{t+h}]^2 + \omega_{0,t,t+h}^2 - 2\omega_{0,t,t+h} E_t[y_{t+h}] + (\omega_{t,t+h}' E_t[\hat{y}_{t,t+h}])^2 - 2 E_t[y_{t+h}] \omega_{t,t+h}' E_t[\hat{y}_{t,t+h}] \\ & + 2\omega_{0,t,t+h} \omega_{t,t+h}' E_t[\hat{y}_{t,t+h}] + \omega_{t,t+h}' (Q_{n_1+1}' V_{t,t+h} Q_{n_1+1}) \omega_{t,t+h} - 2\omega_{t,t+h}' (Q_{n_1+1}' V_{t,t+h} e_1), \end{aligned} \]

with first order conditions

\[ \begin{aligned} \hat{\omega}_{0,t,t+h} &= E_t[y_{t+h}] - \hat{\omega}_{t,t+h}' E_t[\hat{y}_{t,t+h}] \\ 0 &= (Q_{n_1+1}' V_{t,t+h} Q_{n_1+1}) \hat{\omega}_{t,t+h} - E_t[y_{t+h}] E_t[\hat{y}_{t,t+h}] + \hat{\omega}_{0,t,t+h} E_t[\hat{y}_{t,t+h}] + (\hat{\omega}_{t,t+h}' E_t[\hat{y}_{t,t+h}]) E_t[\hat{y}_{t,t+h}] - Q_{n_1+1}' V_{t,t+h} e_1 \end{aligned} \tag{14} \]
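Substituting the first condition in (14) into the second makes the mean terms cancel, leaving the linear system $Var_t[\hat{y}_{t,t+h}]\, \hat{\omega}_{t,t+h} = Cov_t[\hat{y}_{t,t+h}, y_{t+h}]$, after which the intercept follows from the first condition. The sketch below solves this system for given moments. It takes the moments as inputs rather than deriving them from the regime switching model, and all function and argument names are assumptions made for illustration.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(A[i]) + [b[i]] for i in range(n)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def optimal_weights(mean_y, mean_f, var_f, cov_fy):
    """Unconstrained MSFE-optimal intercept and weights implied by (14).

    mean_y : E_t[y_{t+h}]
    mean_f : E_t[yhat_{t,t+h}] (list of length n)
    var_f  : Var_t[yhat_{t,t+h}] (n x n nested list)
    cov_fy : Cov_t[yhat_{t,t+h}, y_{t+h}] (list of length n)
    """
    w = solve_linear(var_f, cov_fy)
    w0 = mean_y - sum(wi * mi for wi, mi in zip(w, mean_f))
    return w0, w


def expected_loss(var_y, mean_y, mean_f, var_f, cov_fy, w0, w):
    """Conditional MSFE for given weights: variance part plus squared bias."""
    n = len(w)
    quad = sum(w[i] * var_f[i][j] * w[j] for i in range(n) for j in range(n))
    cross = sum(w[i] * cov_fy[i] for i in range(n))
    bias = mean_y - w0 - sum(w[i] * mean_f[i] for i in range(n))
    return var_y + quad - 2.0 * cross + bias ** 2
```

With hypothetical moments, `optimal_weights(5.0, [5.0, 4.8], [[2.0, 1.0], [1.0, 2.0]], [1.5, 1.0])` returns weights close to 2/3 and 1/6, and `expected_loss` evaluated at nearby weights is never lower than at this solution. The restricted problem with $\omega_{t,t+h} \in C$ would need a constrained optimizer instead.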

3. Application to US Interest Rates

Having introduced the regime switching model and characterized the solution to the forecast combination problem, we next provide estimation results for this model applied to US spot and forward rates.

3.1. Data

We use the US Treasury Database from the University of Chicago Center for Research in Security Prices (CRSP). The sample period is January 1950 - December 2003, a total of 648 monthly observations. In addition to one-month T-bill rates, $y_t$, we collect data on the one-month forward yields, $f_{t,\varphi}$, implied by the term structure of spot rates at time $t$ for the period starting at time $t + \varphi$, $\varphi = 0, \ldots, 5, 11$. All rates are extracted from the CRSP 6- and 12-month files. The continuously compounded yields and forward rates are standardized to a 30.4 day basis and calculated according to the formula

\[ y_t \equiv \ln\left( \frac{100}{P_{t,1}} \right) \left( \frac{30.4}{\tau_{t,1}} \right), \]

where $P_{t,1}$ is the average of the bid and ask prices for a one-month T-bill and $\tau_{t,1}$ is the time to expiration in calendar days. Forward rates are calculated using the methodology in Fama (1984). At time $t$, the 1-month forward rate from period $t + \varphi$ to period $t + \varphi + 1$ is computed from the $(\varphi + 1)$- and $\varphi$-period spot yields, $y_{t:t+\varphi+1}$ and $y_{t:t+\varphi}$, as

\[ f_{t,\varphi} = (\varphi + 1) y_{t:t+\varphi+1} - \varphi y_{t:t+\varphi}. \]

3.2. Econometric Estimates

Following the analysis in Section 2, we set $n_1 = 1$ and model the bivariate system composed of the current 1-month spot and forward rates, i.e. $z_t \equiv (y_t\ f_{t,1})'$. Even though theoretical term structure models can be used to constrain equation (2) (see e.g. Sola and Driffill (1994)), we adopt an unconstrained estimation strategy

similar to Ang and Bekaert (2002) and instead test restrictions on the forecast combination weights.6 To select the number of states we considered values of the Schwarz and Hannan-Quinn information criteria for a range of models with different values for $p$ and $k$. Both criteria suggested that a four-state VAR(1) specification is required to fit the data. While the covariance matrix varies strongly across states, the matrix of VAR(1) coefficients, $A$, did not appear to be state-dependent. Hence we simplify the model by imposing that $A_s = A$ across regimes.

Panel A of Table 1 presents parameter estimates for the single-state VAR(1) model. Most of the estimates are statistically significant. Shocks to forward rates are slightly less volatile than those to spot rates and the simultaneous correlation between these shocks is quite high (0.88). The process is highly persistent with moduli of the eigenvalues of the estimated VAR matrix of 0.98 and 0.12.

Panel B presents maximum likelihood estimates for the four state model. The VAR(1) matrix in panel B is similar to the one reported in panel A with a largest modulus of 0.97. The correlation between shocks to the spot and forward rates does not seem to depend on the state and falls in the narrow interval [0.89, 0.92] under regime switching. Volatility levels and unconditional means differ greatly across the states, however, and therefore effectively identify the four regimes.7

Figure 1 plots the smoothed state probabilities. State 1 is associated with low and stable interest rates (the implied unconditional annualized means are 2.23 and 2.21%), as appeared during the early 1950s, part of the 1960s and, more recently, during the 2001-2003 recession. Although this stable state has an ergodic probability of only 0.12, it has an average duration of seven months.

State 2 identifies a regime with intermediate but stable interest rates: annualized means are 5.31 and 5.14 percent, while the regime-specific (unconditional) monthly volatilities are only at three-quarters of their average levels. This state covered long spans of time such as 1953-1960, 1975-1978 and most of the 1990s, in total almost half (42%) of our 54-year long sample. However, the duration of this regime is only six months. This is explained by the tendency of the economy to frequently switch between regimes 2 and 3.

State 3 is associated with higher and relatively volatile interest rates. The mean interest rate in this state is above 10 percent and the volatility is close to its full-sample average. This regime occurs relatively frequently, with an average duration of five and a half months and a long-run probability of 0.36. Most of the 1970s and mid-1980s are best characterized by this regime which re-appears between 1999 and 2000, when the FED pulled the brake on the US economy and interest rates gradually increased.

6 Similarly, Diebold, Rudebusch, and Aruoba (2006) do not impose no-arbitrage restrictions. Ang and Piazzesi (2003) find that such restrictions at the estimation stage may improve forecasting performance.
7 We tested for (regime-switching) ARCH effects using a model of the form

\[ \begin{pmatrix} y_t \\ f_{t,1} \end{pmatrix} = \mu_{S_t} + \sum_{j=1}^{p} A_{j,S_t} z_{t-j} + \varepsilon_t, \qquad \varepsilon_t \sim N(0, \Sigma_{S_t}), \]

\[ \Sigma_{S_t} = K_{S_t} + \Delta_{S_t} \varepsilon_t' \varepsilon_t \Delta_{S_t}', \]

where $K_{S_t}$ is symmetric and positive definite and $\Delta_{S_t}$ captures regime-dependent effects of past shocks on current volatility. Most coefficients failed to be significant and a likelihood ratio test of the restriction $\Delta_s = \Delta$, $s = 1, 2, 3, 4$ failed to reject the null of no ARCH effects.

Finally, state four captures volatile market conditions. The unconditional mean implied by the estimated parameters for this state is very high as is the unconditional volatility. At a first glance, this state seems

to be mostly dominated by the 1979-1982 ‘monetarist experiment’ which is almost entirely captured by this state. Notice, however, that state four does much more than identify two structural breaks in short-term interest rates and is associated with short bursts of volatility in interest rates, as occurred during the Fall of 1984, during the FED contraction after October 1987, and during a few episodes in 1988 and 1989.

The last panel of Table 1 presents estimates of the transition probability matrix. All four states are mildly persistent with probabilities of staying in each state that vary between 0.81 and 0.84. Exits from the first regime are mostly to the second regime. From the second state it is possible to switch to both more volatile and higher interest rates (state 3), or back to state 1. However, the probability of a direct shift to state 4 is very small. From state 3, the economy can revert to lower and more stable interest rates (state 2) and there is also some chance of a switch to the turbulent market conditions associated with state 4. Finally, from state 4 the economy can only switch back to state 3. The transition matrix makes it possible for the economy to cycle for long periods between states 3 and 4 (states with above-average and volatile interest rates), as occurred between 1978 and 1985.

Our analysis applies (2) to the level of US interest rates, rather than to their changes. This is an issue since a unit root is not rejected for the one-month spot rate: an augmented Dickey-Fuller (ADF) test produced a p-value of 0.14. However, the considerable persistence in interest rates may in part be due to shifts in the conditional mean parameters which can induce more persistent behavior.
To investigate if this is indeed the case for our data, we simulated 1,000 time series of interest rates using the estimates of the four-state regime switching model from Table 1, assuming ergodic state probabilities for the initial observation and using the BIC for lag length selection. Figure 2 shows the histogram of the resulting p-values associated with the ADF statistic. 46% of the simulations produced a smaller ADF statistic (i.e. a higher p-value) than that observed in the data and hence weaker evidence against the null of a unit root. This supports our decision to model interest rate levels by means of a persistent Markov switching process.8

8 When intercepts are non-zero, a regime switching specification for changes in interest rates implies persistent trends in the levels of interest rates. This is unlikely to be a plausible specification empirically. See also the discussion in Diebold and Kilian (2000) of the importance for forecasting performance of correctly identifying the order of integration.

3.3. Combination Weights

We next turn to three different but related issues. First, do combination weights depend on the underlying state probabilities? Second, do combination weights depend on the forecast horizon? Third, how big is the potential reduction in the expected loss from using forecast combinations rather than forecasts from the individual models?

To simplify the presentation, we start by considering as the testers in $\hat{y}_{2,t,t+h}$ either (i) a random walk, $\hat{y}_{2,t,t+h} = y_t$, or (ii) AR(p) forecasts $\hat{y}_{2,t,t+h} = \hat{a}_0 + \sum_{j=1}^{p} \hat{a}_j y_{t-j}$, where the coefficients $\{\hat{a}_j\}_{j=0}^{p}$ are estimated by OLS. The random walk is a classical benchmark in the interest rate literature. Duffee (2002) shows that this model is difficult to beat even for relatively flexible and widely used affine models.

Using the full-sample estimates from Table 1, Figure 3 plots optimal combination weights as a function of the forecast horizon for different values of the initial state probability, $\hat{\pi}_t$. The plots assume that the initial states are known but that future states are not and so do not correspond to any one particular point in time in the sample. The figure assumes that $\hat{y}_{2,t,t+h}$ is obtained from an AR(4) model (selected by the

BIC) but very similar results follow for the random walk benchmark and are omitted to save space. The combination weights are strongly dependent both on the state probabilities and on the forecast horizon. At short horizons the weights can be large in absolute value and can either have positive or negative signs ˆ t,t+h [2] > 0 are also characterized by ω vice versa. Assuming a short horizon, states 1 and 3 assign large positive weights to the forward rate forecast and a large negative weight to the random walk forecasts. In state 2 the opposite happens as the weight on the time-series forecast is large and positive while the weight on the forward rate is negative.9 As h grows the weight on the forward rate, ω ˆ t,t+h [1], converges to unity, while the weight on the time-series forecast, ˆ t,t+h [1], is downward ω ˆ t,t+h [2], goes to zero. This means that the optimal weight on the forward rate, ω sloping when starting from state 1 and 3, while it is upward sloping when starting from state 2 or 4. We conclude from these findings that forward rates are particularly important for short-term forecasting of interest rates in regimes 1 and 3. Conversely, more backward-looking time-series forecasts seem to perform well in states 2 and 4. Figure 3 also plots optimal weights when the initial state probabilities are set at their ergodic values — a scenario with high uncertainty about the current regime. At short horizons weights of 1.4 and -0.4 are obtained for the forward rate and AR(4) forecasts, respectively. As h grows, these weights approach unity and zero, respectively. These findings have implications for tests of the expectations hypothesis (EH). Under the EH, long-term spot yields are given by an arithmetic average of one-month expected spot rates and future term premia for the different maturities. Forward rates, corrected for a term premium, should therefore be unbiased predictors of the future spot rate. 
In the regression
$$y_{t+h} = \alpha - T_h + \beta f_{t+h-1,1} + \gamma' \hat{y}_{2,t,t+1} + u_{t+h},$$
this implies that $\hat{\beta} = 1$ (unbiasedness) and $\hat{\gamma} = 0$ (efficiency), while $\hat{\alpha}$ should provide an estimate of the risk premium. In short, only forward rates should be able to forecast future spot rates. Our forecast combination results reveal that for most configurations of the initial state probabilities, the EH is strongly rejected at short horizons since, for small h, $\hat{\omega}_{t,t+h}[1] \neq 1$ and $\hat{\omega}_{t,t+h}[2] \neq 0$, so there are advantages from using a combination of forward rates and time-series forecasts.10 However, as h grows the restrictions implied by the EH become useful for forecasting purposes.

The fact that the EH is rejected at short forecast horizons is also clear from Figure 4, which compares the (in-sample) expected loss under the optimal forecast combination against the expected loss under the separate forecasts from the forward rate or AR(4) model. Differences in expected loss are quite large, especially when comparing the optimal combination to the AR(4) model. The percentage decline in expected loss (relative to the benchmark) obtained by going from the AR(4) forecasts to a combination of time-series and forward rate forecasts exceeds 40% when h ≥ 4, but is typically more modest at shorter horizons. Reductions in losses are smaller, between 5 and 15%, against the forward rate forecasts.

9 Assigning a negative weight to a forecast does not make it useless; on the contrary, it makes the forecast potentially useful for minimizing expected loss through its covariance with other forecasts.

10 Empirical findings have generally been unfavorable to the EH; see, e.g., Campbell and Shiller (1991).

4. Out-of-sample Forecasting Performance

The analysis has so far demonstrated potentially large gains from forecast combinations that account for the sensitivity of the combination weights to the underlying state. However, there is no guarantee that such gains are empirically achievable since the results assumed that the regime switching model was correctly specified. To tackle this issue, we next conduct an out-of-sample forecasting exercise. Both individual forecasts and combinations of forward rates and testers are compared to a wide array of benchmarks from the literature.11

We proceed as follows. For each model, we obtain recursive parameter estimates over expanding samples starting with 1950:01 - 1980:01, 1950:01 - 1980:02, up to 1950:01 - 2003:12-h, where h is the forecast horizon. When h = 1, this gives a sequence of 287 sets of parameter estimates for each of the models. Only information available at the date when the forecast is formed is used. We refer to $\hat{y}^{(m)}_{t,t+h}$ as the h-step forecasts generated by model m and evaluate the accuracy of the forecasts through the forecast errors $e^{(m)}_{t,t+h} \equiv y_{t+h} - \hat{y}^{(m)}_{t,t+h}$ and the associated RMSFE and bias, computed as

$$RMSFE^{(m)}_h \equiv 1200 \times \sqrt{\frac{1}{288-h} \sum_{t=1980:01}^{2003:(12-h)} \left( y_{t+h} - \hat{y}^{(m)}_{t,t+h} \right)^2},$$

$$Bias^{(m)}_h \equiv 1200 \times \frac{1}{288-h} \sum_{t=1980:01}^{2003:(12-h)} \left( y_{t+h} - \hat{y}^{(m)}_{t,t+h} \right).$$
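As a minimal sketch of the two accuracy measures just defined, the Python snippet below computes the scaled RMSFE and bias from aligned arrays of realized rates and forecasts. The function name `rmsfe_and_bias`, its arguments, and the toy numbers are illustrative assumptions rather than anything taken from the paper; only the factor 1200 and the error definition (realized value minus forecast) follow the formulas above.

```python
import numpy as np


def rmsfe_and_bias(actual, forecast, scale=1200.0):
    """Return (RMSFE, bias) for aligned realized values and h-step forecasts.

    Errors are defined as realized minus forecast, so a negative bias means
    that the forecasts exceed the realized values on average.
    `scale` defaults to the factor 1200 used in the definitions above.
    """
    errors = np.asarray(actual, dtype=float) - np.asarray(forecast, dtype=float)
    rmsfe = scale * np.sqrt(np.mean(errors ** 2))
    bias = scale * np.mean(errors)
    return float(rmsfe), float(bias)


# Hypothetical monthly rates (illustrative numbers only).
realized = [0.010, 0.020, 0.030]
predicted = [0.011, 0.019, 0.030]
rmsfe, bias = rmsfe_and_bias(realized, predicted)
print(rmsfe, bias)
```

In this toy case the errors offset each other, so the bias is essentially zero while the RMSFE is strictly positive, which illustrates why the two statistics are reported side by side.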

We analyze the performance of several versions of the VAR regime switching model (2) fitted to the spot and forward rates measured at time t, $z_t \equiv (y_t \; f_{t,1})'$. Therefore $n_1 = 1$ and $\hat{y}_{1,t,t+h}$ corresponds to some function of the (predicted) values of $z_{t+h}$ specified below. Two alternative tester functions are used ($n_2 = 1$), namely the random walk, $\hat{y}_{2,t,t+h} = y_t$, and a recursively estimated AR(4) model, $\hat{y}_{2,t,t+h} = \hat{a}_0 + \sum_{j=1}^{4} \hat{a}_j y_{t-j}$:

1. The first method computes forecasts by using the optimal combination weights on the tester and the conditional forecast of the 1-period forward rate between time t + h − 1 and t + h, $E_t[f_{t+h-1,1}]$: $\hat{y}^{(1)}_{t,t+h} = \hat{\omega}_{0,t,t+h} + \hat{\omega}_{t,t+h}[1] E_t[z'_{t+h-1} e_2] + \hat{\omega}_{t,t+h}[2] \hat{y}_{2,t,t+h}$. Optimal combination weights are found using either the random walk or AR(4) forecasts as testers, $\hat{y}_{2,t,t+h}$.

2. The second method is analogous to the first but restricts the combination weights so that $\hat{\omega}_{t,t+h}[1], \hat{\omega}_{t,t+h}[2] \in [0, 1]$ and $\hat{\omega}'_{t,t+h} \iota_2 = 1$: $\hat{y}^{(2)}_{t,t+h} = \hat{\omega}_{0,t,t+h} + \hat{\omega}_{t,t+h}[1] E_t[z'_{t+h-1} e_2] + (1 - \hat{\omega}_{t,t+h}[1]) \hat{y}_{2,t,t+h}$.

3. The third method imposes the restriction $\hat{\omega}_{t,t+h}[1] = 1$ and $\hat{\omega}_{0,t,t+h} = \hat{\omega}_{t,t+h}[2] = 0$, so $\hat{y}^{(3)}_{t,t+h} = E_t[z'_{t+h-1} e_2]$, the conditional forecast of the 1-step forward rate that will apply between time t + h − 1 and t + h. This is an iterated version of the expectations hypothesis.

4. The fourth method sets $\hat{y}^{(4)}_{t,t+h} = E_t[y_{t+h}] = E_t[z'_{t+h} e_1]$, i.e. the conditional forecast of the future spot rate at t + h. This forecast ignores the direct contribution of forward rates.

5. - 8. Methods five through eight are the single-state (k = 1) VAR versions of methods 1 - 4. For instance, method 5 combines the tester (the random walk or the AR(4) model) and the conditional forecast of the 1-period forward rate between time t + h − 1 and t + h when $z_t$ follows the VAR(1) in (4).

9. Method 9 adopts the ‘pure’ expectations hypothesis, i.e. a model where $f_{t,h-1}$ is taken as an unbiased and efficient forecast of $y_{t+h}$, $\hat{y}^{(9)}_{t,t+h} = f_{t,h-1}$.

10. Method 10 is a modified version of the EH which corrects for possible biases in forward rates:

$$y_{t+h} = -T_h + \beta f_{t,h-1} + u_{t+h}, \qquad h = 1, 4, 12, \qquad (15)$$

where $T_h$ is an h-period term premium that is assumed to be time-invariant. The model is recursively estimated to generate $\hat{y}^{(10)}_{t,t+h} = -\hat{T}_h + \hat{\beta} f_{t,h-1}$. Since risk premia can change as a function of the horizon ($T_h$), we refer to this as the ‘Liquidity Preference’ model.

11. Method 11 is the random walk, $\hat{y}^{(11)}_{t,t+h} = y_t$.

12. Method 12 is a recursively estimated AR(4) specification, $\hat{y}^{(12)}_{t,t+h} = \hat{a}_0 + \sum_{j=1}^{4} \hat{a}_j E_t[y_{t+h-j}]$.

11 See also Egorov, Hong and Li (2005) for comprehensive evidence on the out-of-sample performance of affine term structure models.
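The mechanics behind methods 1, 3 and 12 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `ar_iterated_forecast` and `combine`, the coefficient values and the input forecasts are hypothetical, and the combination weights are taken as given rather than derived from the regime switching model. The first function iterates an AR(p) recursion forward, replacing unknown future values by their own forecasts, as in method 12; the second applies the linear combination used in methods 1 to 3.

```python
import numpy as np


def ar_iterated_forecast(coefs, history, h):
    """Iterate an AR(p) model h steps ahead.

    coefs   : [a0, a1, ..., ap], intercept first (hypothetical values).
    history : observed series, most recent observation last (length >= p).
    Future values that are not yet observed are replaced by their forecasts.
    """
    a0 = float(coefs[0])
    lags = np.asarray(coefs[1:], dtype=float)
    p = len(lags)
    path = [float(v) for v in history[-p:]]
    for _ in range(h):
        recent = path[-p:][::-1]  # most recent first, matching a1, ..., ap
        path.append(a0 + float(np.dot(lags, recent)))
    return path[-1]


def combine(w0, w1, w2, forward_forecast, tester_forecast):
    """Linear combination: intercept + w1 * forward forecast + w2 * tester."""
    return w0 + w1 * forward_forecast + w2 * tester_forecast


# Hypothetical AR(4) tester and forward-rate forecast.
tester = ar_iterated_forecast([1.0, 0.5, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 2.0], h=2)
forward = 5.0

unrestricted = combine(0.0, 1.4, -0.4, forward, tester)  # method-1 style weights
eh_restricted = combine(0.0, 1.0, 0.0, forward, tester)  # method 3: w1 = 1, w0 = w2 = 0
print(tester, unrestricted, eh_restricted)
```

The weights 1.4 and -0.4 are borrowed from the ergodic-probability example in Section 3.3 purely for illustration; with the restriction of method 3 the combined forecast collapses to the forward-rate forecast.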

Table 2 shows results reported in annualized basis points. For example, a monthly RMSFE-value of 0.001 will be reported as an error of 120 basis points (b.p.) per year. A negative bias means that the forecasts on average exceed the realized spot rates. Which method is best depends on the forecast horizon. When h = 1, it is optimal not only to model the presence of regimes in the joint distribution of spot and forward rates, but also to exploit information on the current state to combine forward rate and time-series forecasts. Forecasts from the regime switching model combined with an AR(4) tester generate the lowest RMSFE-value (34 annualized basis points) and a negligible bias of -2.5 b.p. Ignoring the presence of regimes and either directly forecasting off the simpler VAR(1) model or computing optimal combinations assuming it is the data generating process increases the RMSFE to 62 b.p. or higher and hence reduces the forecast accuracy. Interestingly, this deterioration in the out-of-sample performance reflects both higher biases and more volatile forecast errors.12 All the proposed benchmarks fail at the short horizon. Even the best benchmark, the AR(4) model, produces an RMSFE-value more than double the value produced by method 1, and the random walk is not a particularly difficult benchmark to beat. The EH, either in the ‘pure’ form of model 9 ((15) with $\hat{T}_1 = 0$ and $\hat{\beta} = 1$), or in the modified form of method 10, delivers disappointing results, with RMSFE-values in excess of 85 b.p. and biases of almost 25 b.p. We conclude that, at short forecast horizons, imposing the EH directly leads to imprecise and biased forecasts. Utilizing the restrictions from the EH in the context of a regime switching model that allows forward rates to be combined with simpler time-series forecasts produces better forecasts. As the forecast horizon grows, information from the filtered state probabilities becomes less useful.
While there is some value in combining forecasts from (2) with either the random walk or AR(4) testers, at long horizons the minimum RMSFE is achieved by the VAR(1) regime switching model subject to the EH restriction (method 3). This method produces RMSFE-values between one-half and four-fifths of the value achievable through alternative methods. For instance, at h = 12, method 3 produces an RMSFE-value of 131 b.p. against the 183 b.p. of the best alternative benchmark, the AR(4) model; method 1 produces an RMSFE-value of 197 b.p. Even at long horizons, the performance of the pure EH and of the liquidity preference hypothesis (models 9 and 10) remains disappointing.13 Conversely, when restrictions from the EH are applied on the forecasts from the regime switching model (method 3) or from the VAR model (method 7), forecast accuracy seems to improve. This effect is particularly important at the longer horizons. At these longer horizons, the EH restrictions on the combination weights lead to smaller biases and also reduce errors from parameter estimation, whereas at the short horizon (h = 1), the effect on the bias from ignoring the current state probabilities is more severe and so method 1 produces the best forecasts.

12 Imposing restrictions on the combination weights significantly worsens forecast accuracy at the short horizons, since the restrictions increase the resulting biases. Such restrictions affect the regime switching forecasts most adversely.

It is useful to consider the separate effects of combining versus allowing for regime dynamics, and Table 2 allows us to do so. Comparing the combined forecast to the direct forecast of the spot rate from the single-state VAR models (method 5 versus method 8), benefits from combining mainly emerge at long horizons. Conversely, once regimes are introduced, combining seems to improve forecast accuracy at the short but not at the long horizons (method 1 versus method 4). Turning to the effect of allowing for regimes, comparisons of methods 1 and 5 or methods 4 and 8 suggest that the effect of regimes on forecast accuracy is of first order at most of the horizons. Clearly, the performance of regime switching models depends not only on the trade-off between the flexibility of the model and the effect of parameter estimation error, but also on the ability of the regime switching framework to accurately track the latent states.
A well-specified model will not only accurately identify regime shifts in the parameters governing the joint dynamics in spot and forward rates, but should also prove useful in forecasting future spot rates when such switches occur. We therefore proceed to calculate the following measure of precision in regime classification,

$$\rho(\hat{\pi}_t) \equiv 1 - 4 \left( \max_{s=1,...,k} \hat{\pi}_t[s] \right) \left( \max_{s=1,...,k} \left\{ \hat{\pi}_t[s] \setminus \max_{s=1,...,k} \hat{\pi}_t[s] \right\} \right), \qquad k \geq 2.$$

This measure is one minus four times the product of the two highest state probabilities estimated at time t.14 Since $\max_{s=1,...,k} \hat{\pi}_t[s] \times \max_{s=1,...,k} \{ \hat{\pi}_t[s] \setminus \max_{s=1,...,k} \hat{\pi}_t[s] \} = 1/4$ when there is absolute uncertainty about regimes, the scalar 4 acts as a normalizing constant and in this case $\rho(\hat{\pi}_t) = 0$. When any of the elements of $\hat{\pi}_t$ equals one, investors perceive being in one of the regimes with certainty and $\rho(\hat{\pi}_t) = 1$. Also in this respect, the four-state model seems to do a good job at describing the dynamics of short term interest rates, as the average value of the recursive estimates of $\rho(\hat{\pi}_t)$ over the out-of-sample period 1980:01-2003:11 is 0.67. In 72% of the sample $\rho(\hat{\pi}_t)$ exceeds 0.5 and in 39% of these months $\rho(\hat{\pi}_t)$ exceeds 0.9, i.e. the regime is inferred with great precision.15

13 Notice that imposing restrictions on the weights still leads to a deterioration in performance for h = 4, but seems not to matter for h ≥ 12. This is consistent with the previous evidence since $\hat{\omega}_{t,t+h}[1] \approx 1$ and $\hat{\omega}_{t,t+h}[2] \approx 0$ at long horizons.

14 The operator $\max_{s=1,...,k} \{ \hat{\pi}_t[s] \setminus \max_{s=1,...,k} \hat{\pi}_t[s] \}$ picks the second highest probability over s = 1, ..., k.

15 We also calculated correlations between the squares of the recursive, h-step forecast errors from a variety of regime switching models, starting from the one combining forward rate forecasts with AR(4) forecasts, and the indicator variable $I_{\{\rho(\hat{\pi}_t) > 0.9\}}$. This generated correlations of -0.22 for h = 1 and -0.17 for h = 4 and 12 (all values are highly significant). Hence, the forecasting performance of the switching combination model is particularly good when the regime is well identified.
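The regime classification measure is simple to compute once the filtered state probabilities are available. The snippet below is a minimal sketch, assuming the probabilities arrive as a plain sequence that sums to one; the function name `regime_precision` and the example probability vectors are hypothetical.

```python
def regime_precision(state_probs):
    """One minus four times the product of the two largest state probabilities."""
    if len(state_probs) < 2:
        raise ValueError("the measure is defined for k >= 2 states")
    ranked = sorted(state_probs, reverse=True)
    return 1.0 - 4.0 * ranked[0] * ranked[1]


# Hypothetical filtered probabilities for a four-state model.
print(regime_precision([0.5, 0.5, 0.0, 0.0]))  # two equally likely regimes
print(regime_precision([1.0, 0.0, 0.0, 0.0]))  # regime known with certainty
print(regime_precision([0.7, 0.2, 0.1, 0.0]))  # intermediate case
```

The first call returns zero, the case in which the two leading regimes are equally likely with probability one half each, and the second returns one; intermediate probability vectors fall between these bounds.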

4.1. Forecasts with Macroeconomic Information

Recently a number of studies have found that macroeconomic variables can predict future spot rates. Diebold, Rudebusch, and Aruoba (2006) show that inflation shocks predict changes in the level of interest rates, while shocks to the federal funds rate and manufacturing capacity utilization rates forecast changes to the slope of the yield curve. Ang and Piazzesi (2003) identify the factors driving the yield curve in a no-arbitrage framework with the principal components of observable macroeconomic variables and also report that shocks to inflation affect the level of the yield curve, while shocks to real activity variables such as employment and industrial production affect its slope. Dai and Philippon (2004) show that fiscal policies (the ratio of public expenditure over taxes) and indices of the state of the labor market (such as the FREDII series ‘Help Wanted’) do a better job than output or consumption at predicting the dynamics of the US yield curve.

To explore the role of macroeconomic variables in predicting future spot rates, each month during the out-of-sample period, we estimate h-step predictive regression models

$$y_{t+h} = b_{0,h} + \sum_{j=1}^{7} b_{j,h} x_{jt} + \gamma_h \hat{\sigma}_t + \theta_h y_t + u_{t+h}, \qquad (16)$$

where $\hat{\sigma}_t$ is a measure of the 12-month rolling window volatility and $x_{jt}$ is the jth macroeconomic variable. The models differ in terms of which macroeconomic variables they include, as we recursively estimate models by imposing a variety of zero coefficient restrictions. Each month the best h-period forecasting model is selected from a total of $2^9$ models (all possible combinations of the regressors in (16)) by minimizing the BIC criterion. We also recursively estimate prediction models that include all nine regressors in (16). Over the full-sample 1954:01 - 2003:12, we obtain the following estimates of the forecasting model for h = 1 (standard errors are in parentheses):

$$\hat{y}_{t,t+1} = \underset{(8.8e-05)}{0.0004} + \underset{(0.009)}{0.025}\, CapU_t + \underset{(0.033)}{0.579}\, FFR_t - \underset{(0.011)}{0.004}\, CPI\,infl_t + \underset{(0.005)}{0.005}\, RPC_t - \underset{(0.001)}{0.001}\, Help_t$$
$$\qquad + \underset{(0.015)}{0.020}\, EMP_t - \underset{(0.010)}{0.020}\, IP_t - \underset{(0.040)}{0.357}\, \hat{\sigma}_t + \underset{(0.041)}{0.355}\, y_t,$$

$$BIC = -12.99, \qquad \bar{R}^2 = 0.942,$$

where $CapU_t$ is the rate of growth of capacity utilization (measured as the distance of real activity from a non-inflationary trend), $FFR_t$ is the federal funds rate, $CPI\,infl_t$ is the inflation rate, $RPC_t$ is real personal consumption expenditure growth, $Help_t$ is the rate of growth of the Help Wanted Index, $EMP_t$ is the growth rate in employment, and $IP_t$ is industrial production growth. All series are seasonally adjusted and available from FREDII at the Federal Reserve Bank of St. Louis. The BIC selects very parsimonious models and tends to include only $CapU_t$, $FFR_t$, $CPI\,infl_t$, and $\hat{\sigma}_t$ for most of the out-of-sample period. To keep the forecasting model as parsimonious as possible, we combine these h-step recursive forecasts from the macroeconomic model with the yield-based forecasts by taking the macro forecasts as given (i.e. not as part of model (2)) and then minimizing the RMSFE, using the variances and covariance between the forecast errors from the macro and yield models.

The predictive accuracy results in Table 2 suggest that the macroeconomic forecasting models perform quite poorly for h ≤ 12, with RMSFE-values between 100 and 250 basis points. Furthermore, the bias associated with these models is quite sizeable. At the long horizon (h = 24), the performance of the two macro-based models is similar to other benchmarks, although these models remain inferior to the best among the regime switching models (method 3). Macroeconomic forecasts are also dominated by the VAR forecast at the shorter horizons up to 12 months. We also use recursive forecasts from the macroeconomic models (selected using the BIC) as testers. Combining iterated forward rate forecasts with macro-based predictions generates RMSFE-values that are slightly higher than when the simple AR(4) or random walk testers were used (e.g. 93 vs. 90 b.p. at h = 4).

4.2. Statistical Significance

To address whether the out-of-sample performances are sufficiently different to allow us to draw any conclusions on the relative precision of the various forecasting methods, we implement the forecast accuracy comparison test proposed by Diebold and Mariano (1995). Define the differential loss of model m relative to the loss from model n as

$$dif^{(m,n)}_{t,t+h} = \left( e^{(m)}_{t,t+h} \right)^2 - \left( e^{(n)}_{t,t+h} \right)^2.$$

We test the significance of the differences between two sets of forecast errors based on the statistic

$$DM^{(m,n)}_h = \frac{\frac{1}{288-h} \sum_{t=1980:01}^{2003:(12-h)} dif^{(m,n)}_{t,t+h}}{\hat{\sigma}(dif^{(m,n)}_{t,t+h})}, \qquad (17)$$

where $\hat{\sigma}(dif^{(m,n)}_{t,t+h})$ is an estimate of the standard error of the loss function differential. In practice, we use

$$\hat{\sigma}(dif^{(m,n)}_{t,t+h}) = \sum_{j=-h}^{h} \widehat{Cov}\left[ dif^{(m,n)}_{t,t+h}, dif^{(m,n)}_{t+j,t+j+h} \right].$$
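To make the construction of the test concrete, the following Python sketch computes a Diebold-Mariano type statistic from two series of forecast errors. It is an illustration under explicit assumptions, not the authors' code: the function name `diebold_mariano` and the toy error series are hypothetical, and the scaling follows the textbook form of the test (mean loss differential divided by the square root of a truncated long-run variance over the sample size, with autocovariances up to lag `max_lag` playing the role of h), which is an assumption about how the standard error in (17) is normalized.

```python
import math

import numpy as np


def diebold_mariano(errors_m, errors_n, max_lag):
    """DM-type statistic for squared-error loss.

    Negative values indicate that model m has the smaller average squared error.
    Returns NaN when the truncated long-run variance estimate is not positive,
    which can happen with an unweighted (rectangular) window.
    """
    e_m = np.asarray(errors_m, dtype=float)
    e_n = np.asarray(errors_n, dtype=float)
    d = e_m ** 2 - e_n ** 2            # loss differential dif^(m,n)
    n = d.size
    dev = d - d.mean()
    lrv = float(np.dot(dev, dev)) / n  # lag-0 autocovariance
    for j in range(1, max_lag + 1):
        # symmetric sum over j = -max_lag, ..., max_lag
        lrv += 2.0 * float(np.dot(dev[j:], dev[:-j])) / n
    if lrv <= 0.0:
        return math.nan
    return float(d.mean() / math.sqrt(lrv / n))


# Hypothetical forecast errors: model m is more accurate than model n.
errors_m = [0.1, -0.2, 0.1, -0.1, 0.2, -0.1]
errors_n = [0.3, -0.4, 0.5, -0.3, 0.4, -0.6]
print(diebold_mariano(errors_m, errors_n, max_lag=0))
```

Swapping the two error series flips the sign of the statistic, which matches the reading of the tables in which negative entries favor the row model.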

Tables 3 and 4 provide pair-wise test statistics when h = 1 or h = 12. Each cell below the main diagonal reports the value of $DM^{(m,n)}_h$ for the models in row (m) and column (n). Negative (positive) numbers indicate that the row model outperforms (underperforms) the column model. Unsurprisingly, the sharpest conclusions on the relative performance of our methods can be reached for h = 1. At this horizon, method 1 outperforms all other methods by a statistically significant margin, except for method 3. The forecast comparison offers less clear-cut results (higher p-values) for h = 12 months. Even so, both the pure EH and the Liquidity Preference model are systematically outperformed by all other models.

We address possible departures of the small-sample distribution of (17) from normality by block-bootstrapping the distribution of $DM^{(m,n)}_h$ for all pairs of models and h = 1, 12 months. The resulting bootstrapped p-values (using 50,000 trials) are reported above the main diagonal in Tables 3 and 4. At the 1-month horizon Table 3 indicates that forecast combinations based on regime switching models outperform single-state models by a statistically significant margin, with the majority of the p-values between 0.01 and 0.05. The only exception is the pairwise comparison with the AR(4) model. Although the regime-switching combination lowers the RMSFE of the AR(4) model by 37 b.p. (from 71 to 34 b.p.), the loss function differentials $(e^{(1)}_{t,t+1})^2 - (e^{(12)}_{t,t+1})^2$ are sufficiently volatile that the bootstrapped p-values exceed 0.25 for all testers. Single-state forecasts (both combinations and spot and forward predictions) are dominated by the regime switching equivalents. Table 4 confirms the earlier impression that differences are less clear-cut at longer forecast horizons.

4.2.1. A Parametric Bootstrap

Caution should be exercised when interpreting these results. The small sample distribution of the statistic in (17) can show large departures from normality, thus complicating the task of conducting inferences on differential predictive accuracy. When $e^{(m)}_{t,t+h}$ and $e^{(n)}_{t,t+h}$ are generated from models whose parameters are estimated recursively, the test must be based on estimated values, using forecast errors $\hat{e}^{(m)}_{t,t+h}$ and $\hat{e}^{(n)}_{t,t+h}$, and this can affect the sampling distribution of the test. Clark and McCracken (2001) also show that the test depends on whether the forecasting models are nested or non-nested.

A partial solution to the fact that a number of the comparisons in Tables 3 and 4 involved models that are nested is to adopt the parametric bootstrap approach proposed by Kilian (1999). Consider the following VAR(4) representation of (spot and forward) interest rates and the macroeconomic variables that encompasses a number of the earlier forecasting models:

$$q_{t+1} = \mu_q + \sum_{j=1}^{4} \Phi_j q_{t+1-j} + u_{t+1}, \qquad (18)$$

where $q_t \equiv (z'_t \; x'_t)'$, $x_t$ is a vector of macroeconomic factors and $u_{t+1} \sim IIN(0, \Sigma)$. The first equation in (18) is just an extension of (16) to the case where four lags of the macroeconomic variables forecast future spot rates, while the last equations pick up dynamics in $x_t$. Such a process is flexible enough to provide a good description of the data at hand. The parametric bootstrap algorithm proceeds in four steps:

1. Estimate the parameters $\mu_q$, $\{\Phi_j\}_{j=1}^{4}$, and $\Sigma$ in (18) over the full-sample 1954:01 - 2003:12. This yields the vector of residuals $\{\hat{u}_t\}_{t=1954:05}^{2003:12}$.

2. Based on these estimates, generate a sequence of pseudo observations $\{q^b_t\}_{t=1954:05}^{2003:12}$ from (18). We initialize the process at the unconditional means of $q_t$ and discard the first 1,000 transients. The pseudo innovation terms $\{\hat{u}^b_t\}_{t=1954:05}^{2003:12}$ are drawn randomly with replacement from the set of observed residuals $\{\hat{u}_t\}_{t=1954:05}^{2003:12}$. We repeat this step B times.

3. For each of the B bootstrap replications $\{q^b_t\}_{t=1954:05}^{2003:12}$ generated in the previous step, recursively estimate models 1, 3-5, 7-8, and 11-14 over the expanding periods 1950:01 - 1980:01 up to 1950:01 - 2003:12-h, where h is the forecast horizon.16 This gives a sequence of h-step interest rate forecasts for each of the models. Compute the forecast errors $e^{(m)}_{t,t+h} \equiv y_{t+h} - \hat{y}^{(m)}_{t,t+h}$ and proceed to obtain B bootstrap replications of $\{\widehat{DM}^{(m,n)}_{h,b}\}_{b=1}^{B}$.

4. Using the small-sample simulated distribution, $\{\widehat{DM}^{(m,n)}_{h,b}\}_{b=1}^{B}$, the p-value for two-sided tests of the null of zero differential predictive accuracy is given by the percentage of bootstrapped simulations such that $|\widehat{DM}^{(m,n)}_{h,b}| > |DM^{(m,n)}_h|$, where $DM^{(m,n)}_h$ is the sample statistic reported in Tables 3 and 4.
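Steps 2 and 4 of the algorithm above can be sketched as follows. This is a simplified illustration under stated assumptions: the function names `simulate_var` and `bootstrap_pvalue`, the two-variable VAR(1) example and all numerical values are hypothetical, the parameters are taken as already estimated (step 1), and the recursive re-estimation of the forecasting models on each pseudo-sample (step 3) is omitted.

```python
import numpy as np


def simulate_var(mu, phis, residuals, n_obs, burn_in=1000, rng=None):
    """Generate pseudo observations from a VAR(p) by resampling residuals.

    mu        : intercept vector of length k.
    phis      : list of p coefficient matrices (k x k), phis[0] for lag 1.
    residuals : array (T x k) of fitted residuals, drawn with replacement.
    The path starts at the unconditional mean and the first `burn_in`
    simulated values are discarded.
    """
    rng = np.random.default_rng() if rng is None else rng
    mu = np.asarray(mu, dtype=float)
    residuals = np.asarray(residuals, dtype=float)
    k, p = mu.size, len(phis)
    uncond_mean = np.linalg.solve(np.eye(k) - sum(phis), mu)
    path = [uncond_mean.copy() for _ in range(p)]
    draws = residuals[rng.integers(0, len(residuals), size=n_obs + burn_in)]
    for u in draws:
        nxt = mu + sum(phi @ path[-(j + 1)] for j, phi in enumerate(phis)) + u
        path.append(nxt)
    return np.array(path[p + burn_in:])


def bootstrap_pvalue(dm_sample, dm_bootstrap):
    """Share of bootstrap statistics larger in absolute value than the sample one."""
    dm_bootstrap = np.asarray(dm_bootstrap, dtype=float)
    return float(np.mean(np.abs(dm_bootstrap) > abs(dm_sample)))


# Hypothetical two-variable VAR(1) with made-up residuals.
rng = np.random.default_rng(0)
fake_residuals = rng.normal(scale=0.01, size=(200, 2))
pseudo = simulate_var([0.001, 0.002], [0.9 * np.eye(2)], fake_residuals,
                      n_obs=50, rng=rng)
print(pseudo.shape)
print(bootstrap_pvalue(2.0, [-3.0, 1.0, 2.5, 0.5]))
```

In a full implementation each pseudo-sample would be passed through the same recursive estimation and forecasting loop as the actual data to produce one bootstrap statistic per replication; here the list of bootstrap statistics is simply supplied by hand.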

Table 5 reports the results in the form of bootstrap p-values for h = 1 (below the main diagonal) and h = 12 (above the main diagonal). The values of $DM^{(m,n)}_h$ are identical to those appearing in Tables 3 and 4. Symbols J and N (respectively) are used to visualize when a row (column) model outperforms the column (row) model by a statistically significant amount. At h = 1 results remain favorable to the regime switching forecasts, which are found to significantly outperform most of the other models. The dominance of regime switching models over single-state models is unchanged. VAR(1) forecasts prove superior to either the random walk or to the simple use of macroeconomic variables. At the 12-month horizon, pairwise comparisons between single- and multi-state models show that the latter outperform the former.

16 Models 2 and 6 were dropped because of our earlier finding that restricting the combination weights fails to improve the forecast performance. Models 9 and 10 cannot be simulated because they involve forward rates and hence would imply bootstrapping the overall dynamics of the term structure of interest rates, which is beyond the scope of our paper.

5. Conclusion

This paper proposed a four-state model to capture the dynamics in US spot and forward rates. We proposed a flexible approach that combines forecasts of future spot rates with other ‘testers’ that can be viewed as forecasts obtained from alternative model specifications. In an out-of-sample forecasting exercise we found evidence that, particularly at short horizons, combining regime switching forecasts with simpler, univariate time-series forecasts can help reduce the root mean squared forecast error. At longer horizons, we found that imposing theoretical restrictions from the expectations hypothesis linking future spot rates to forward rates helps improve forecasting accuracy. Although the expectations hypothesis is rejected using in-sample tests, it may still be helpful in improving out-of-sample prediction accuracy.

References

[1] Ait-Sahalia, Y., 1996, Testing Continuous-Time Models of the Spot Interest Rate. Review of Financial Studies 9, 385-426.
[2] Aiolfi, M., and A. Timmermann, 2004, Persistence of Forecasting Performance and Combination Strategies. Forthcoming in Journal of Econometrics.
[3] Ang, A., and G. Bekaert, 2002, Short Rate Nonlinearities and Regime Switches. Journal of Economic Dynamics and Control 26, 1243-1274.
[4] Ang, A., and M. Piazzesi, 2003, A No-Arbitrage Vector Autoregression of Term Structure Dynamics with Macroeconomic and Latent Variables. Journal of Monetary Economics 50, 745-787.
[5] Bansal, R., G. Tauchen, and H. Zhou, 2004, Regime-Shifts, Risk Premiums in the Term Structure, and the Business Cycle. Journal of Business and Economic Statistics 22, 396-409.
[6] Bates, J., and C. Granger, 1969, The Combination of Forecasts. Operations Research Quarterly 20, 451-468.
[7] Campbell, J., and R. Shiller, 1991, Yield Spreads and Interest Rate Movements: A Bird’s Eye View. Review of Economic Studies 58, 495-514.
[8] Clark, T., and M. McCracken, 2001, Tests of Forecast Accuracy and Encompassing for Nested Models. Journal of Econometrics 105, 85-110.
[9] Dai, Q., and T. Philippon, 2004, Fiscal Policy and the Term Structure of Interest Rates. NBER working paper No. 11574.
[10] Deutsch, M., C. Granger, and T. Terasvirta, 1994, The Combination of Forecasts using Changing Weights. International Journal of Forecasting 10, 47-57.
[11] Diebold, F., and L. Kilian, 2000, Unit Root Tests Are Useful for Selecting Forecasting Models. Journal of Business and Economic Statistics 18, 265-273.
[12] Diebold, F., and J. Lopez, 1996, Forecast Evaluation and Combination. In Maddala and Rao (eds.), Handbook of Statistics. Elsevier: Amsterdam.
[13] Diebold, F., and R. Mariano, 1995, Comparing Predictive Accuracy. Journal of Business and Economic Statistics 13, 253-263.
[14] Diebold, F., and P. Pauly, 1987, Structural Change and the Combination of Forecasts. Journal of Forecasting 6, 21-40.
[15] Diebold, F., G. Rudebusch, and B. Aruoba, 2006, The Macroeconomy and the Yield Curve: A Dynamic Latent Factor Approach. Journal of Econometrics 131, 309-338.
[16] Duffee, G., 2002, Term Premia and Interest Rate Forecasts in Affine Models. Journal of Finance 57, 405-443.
[17] Egorov, A., Y. Hong, and H. Li, 2005, Forecasting the Joint Probability Density of Bond Yields: Can Affine Models Beat Random Walk? Mimeo, Cornell University.
[18] Elliott, G., and A. Timmermann, 2005, Optimal Forecast Combination Weights Under Regime Switching. International Economic Review 46, 1081-1102.
[19] Fama, E.F., 1984, Forward and Spot Exchange Rates. Journal of Monetary Economics 14, 319-338.
[20] Granger, C., and T. Terasvirta, 1993, Modeling Nonlinear Economic Relationships (Oxford University Press, Oxford).
[21] Gray, S., 1996, Modeling the Conditional Distribution of Interest Rates as a Regime-Switching Process. Journal of Financial Economics 42, 27-62.
[22] Hall, A., H. Anderson, and C. Granger, 1992, A Cointegration Analysis of Treasury Bill Yields. Review of Economics and Statistics 74, 116-126.
[23] Hamilton, J., 1988, Rational Expectations Econometric Analysis of Changes in Regime: An Investigation of the Term Structure of Interest Rates. Journal of Economic Dynamics and Control 12, 385-423.
[24] Kilian, L., 1999, Exchange Rates and Monetary Fundamentals: What Do We Learn from Long-Horizon Regressions? Journal of Applied Econometrics 14, 491-510.
[25] Rudebusch, G., and T. Wu, 2004, A Macro-Finance Model of the Term Structure, Monetary Policy, and the Economy. Mimeo, Federal Reserve Bank of San Francisco.
[26] Sola, M., and J. Driffill, 1994, Testing the Term Structure of Interest Rates Using a Stationary Vector Autoregression with Regime Switching. Journal of Economic Dynamics and Control 18, 601-628.
[27] Stock, J., and M. Watson, 2001, A Comparison of Linear and Nonlinear Univariate Models for Forecasting Macroeconomic Time Series, in R.F. Engle and H. White (eds.), Festschrift in Honour of Clive Granger, 1-44.
[28] Stock, J., and M. Watson, 2005, Combination Forecasts of Output Growth in a Seven-Country Data Set. Journal of Forecasting 23, 405-430.
[29] Timmermann, A., 2006, Forecast Combinations. Pages 135-196 in G. Elliott, C.W.J. Granger and A. Timmermann (eds.), Handbook of Economic Forecasting. Elsevier Press: Amsterdam.

Table 1

Estimates of a Four-State VAR(1) Regime Switching Model

The table shows estimation results for the four-state vector autoregressive regime switching model:

$$z_{t+1} = \mu_{S_{t+1}} + A z_t + \varepsilon_{t+1}$$

where $z_{t+1} = [y_{t+1} \;\; f_{t,1}]'$, $f_{t,1}$ is the one-month forward rate, $\mu_{S_{t+1}}$ is the intercept vector in state $S_{t+1}$, $A$ is a matrix of autoregressive coefficients, and $\varepsilon_{t+1} = [\varepsilon_{1t+1} \;\; \varepsilon_{2t+1}]' \sim N(0, \Sigma^{2}_{S_{t+1}})$. The unobserved state variable $S_{t+1}$ is governed by a first-order Markov chain that can assume four values. The first panel reports estimates for the single-state case k = 1. Asterisks on correlation coefficients refer to covariance estimates. For mean coefficients and transition probabilities, standard errors are reported in parentheses. Data is expressed as percentages (basis points). The sample period is 1950:01 – 2003:12.

Panel A – Single State Model

| | Spot rate | Forward rate |
|---|---|---|
| 1. Intercept | 0.0097 (0.0041) | 0.0059 (0.0033) |
| 2. VAR(1) Matrix | | |
| Spot rate | 0.291 (0.081) | 0.717 (0.085) |
| Forward rate | 0.169 (0.067) | 0.809 (0.070) |
| 3. Correlations/Volatilities | | |
| Spot rate | 0.0501*** | |
| Forward rate | 0.8771*** | 0.0422*** |

Panel B – Four State Model

| | Spot rate | Forward rate |
|---|---|---|
| 1. Intercept | | |
| State 1 (Low/Stable) | 0.0008 (0.0017) | 0.0003 (0.0015) |
| State 2 | 0.0098 (0.0025) | 0.0007 (0.0031) |
| State 3 | 0.0141 (0.0066) | 0.0027 (0.0090) |
| State 4 (High Volatility) | 0.0416 (0.0204) | 0.0077 (0.0116) |
| 2. VAR(1) Matrix | | |
| Spot rate | 0.1653 (0.0715) | 0.8002 (0.0729) |
| Forward rate | 0.0980 (0.0622) | 0.9008 (0.0623) |
| 3. Correlations/Volatilities | | |
| State 1 (Low/Stable): Spot rate | 0.0078*** | |
| State 1 (Low/Stable): Forward rate | 0.9237*** | 0.0063** |
| State 2: Spot rate | 0.0213*** | |
| State 2: Forward rate | 0.8993*** | 0.0170*** |
| State 3: Spot rate | 0.0483*** | |
| State 3: Forward rate | 0.8928*** | 0.0381** |
| State 4 (High Volatility): Spot rate | 0.1257*** | |
| State 4 (High Volatility): Forward rate | 0.9017*** | 0.1095*** |

| 4. Transition probabilities | State 1 | State 2 | State 3 | State 4 |
|---|---|---|---|---|
| State 1 (Low/Stable) | 0.843 | 0.153 | 0.004 | 0.000 |
| State 2 | 0.044 | 0.830 | 0.110 | 0.016 |
| State 3 | 0.000 | 0.149 | 0.818 | 0.033 |
| State 4 (High Volatility) | 0.000 | 0.000 | 0.190 | 0.810 |

* denotes significance at the 10%, ** significance at the 5%, *** significance at the 1% level.
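As a rough illustration of the Markov chain summarized in Table 1, the sketch below computes the long-run (ergodic) state probabilities and the expected regime durations implied by the estimated transition probabilities. It assumes that rows index the current state and columns the next state, a reading of the flattened table that is supported only by the fact that each row then sums to one. The function names `ergodic_probs` and `expected_duration` are hypothetical and not taken from the paper.

```python
# Transition probabilities from Table 1, Panel B.
# Assumed layout: row = state at time t, column = state at time t+1.
P = [
    [0.843, 0.153, 0.004, 0.000],  # State 1 (Low/Stable)
    [0.044, 0.830, 0.110, 0.016],  # State 2
    [0.000, 0.149, 0.818, 0.033],  # State 3
    [0.000, 0.000, 0.190, 0.810],  # State 4 (High Volatility)
]


def ergodic_probs(P, tol=1e-12, max_iter=100_000):
    """Stationary distribution pi solving pi = pi P, by power iteration."""
    k = len(P)
    pi = [1.0 / k] * k  # start from the uniform distribution
    for _ in range(max_iter):
        new = [sum(pi[i] * P[i][j] for i in range(k)) for j in range(k)]
        if max(abs(a - b) for a, b in zip(new, pi)) < tol:
            return new
        pi = new
    return pi


def expected_duration(P, state):
    """Expected number of periods spent in a state once it is entered."""
    return 1.0 / (1.0 - P[state][state])


pi = ergodic_probs(P)
durations = [expected_duration(P, s) for s in range(len(P))]
for s, (p, d) in enumerate(zip(pi, durations), start=1):
    print(f"State {s}: ergodic prob = {p:.3f}, expected duration = {d:.1f} months")
```

Power iteration is used here only to keep the example dependency-free; solving the linear system for the left eigenvector would give the same probabilities.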


Table 2

Out-of-Sample Forecasting Performance

The table reports summary statistics for h-step-ahead recursive forecasts of spot rates under a variety of methods and testers such as random walk (RW), autoregressive (AR) and macro (M) forecasts. Bias and root mean squared forecast error (RMSFE) are expressed in basis points per annum. For each statistic and horizon, the best values are boldfaced.

| Model | Tester | RMSFE h=1 | RMSFE h=4 | RMSFE h=12 | RMSFE h=24 | Bias h=1 | Bias h=4 | Bias h=12 | Bias h=24 |
|---|---|---|---|---|---|---|---|---|---|
| Regime Switching Models | | | | | | | | | |
| 1. Switching VAR(1) Combination | RW | 34.15 | 91.20 | 197.50 | 291.58 | -2.46 | -28.26 | -76.35 | -150.71 |
| | AR | 34.14 | 90.03 | 197.41 | 291.39 | -2.47 | -28.25 | -76.41 | -150.71 |
| | M | 68.18 | 93.48 | 197.29 | 291.30 | -2.24 | -23.84 | -75.64 | -150.06 |
| 2. Restricted Switching VAR(1) Combination | RW | 68.29 | 103.44 | 197.50 | 291.58 | -22.46 | -38.26 | -76.37 | -150.71 |
| | AR | 68.28 | 103.44 | 197.48 | 291.53 | -22.52 | -38.25 | -76.36 | -150.69 |
| | M | 73.80 | 119.65 | 224.69 | 341.33 | -2.77 | -24.28 | -79.40 | -160.34 |
| 3. Switching VAR(1) Forward Forecast | | 34.95 | 52.41 | 130.99 | 208.59 | 4.55 | -11.56 | -50.05 | -124.87 |
| 4. Switching VAR(1) Spot Forecast | | 66.81 | 104.50 | 198.37 | 292.61 | -23.11 | -39.32 | -77.47 | -152.21 |
| Single-State VAR(1) Models | | | | | | | | | |
| 5. VAR(1) Combination | RW | 70.80 | 96.12 | 298.92 | 654.23 | -21.72 | -119.5 | -172.5 | -258.7 |
| | AR | 70.74 | 95.16 | 299.52 | 653.16 | -21.76 | -119.3 | -172.2 | -258.8 |
| | M | 78.82 | 99.94 | 376.09 | 809.15 | -23.43 | -124.4 | -196.9 | -271.7 |
| 6. Restricted VAR(1) Combination | RW | 73.18 | 97.44 | 302.88 | 662.37 | -21.59 | -119.4 | -172.1 | -259.9 |
| | AR | 71.97 | 95.98 | 299.99 | 657.63 | -21.78 | -119.2 | -172.2 | -259.3 |
| | M | 76.56 | 118.75 | 358.99 | 685.92 | -25.22 | -152.9 | -217.8 | -280.5 |
| 7. VAR(1) Forward Forecast | | 62.43 | 86.52 | 270.24 | 595.32 | 4.64 | -49.96 | -164.6 | -246.7 |
| 8. VAR(1) Spot Forecast | | 67.20 | 121.58 | 390.12 | 749.04 | -21.83 | -194.7 | -272.5 | -355.8 |
| Benchmarks | | | | | | | | | |
| 9. Pure EH | | 85.07 | 150.53 | 395.42 | NA | 29.37 | 47.78 | 128.20 | NA |
| 10. Liquidity Preference Hypothesis | | 115.59 | 161.70 | 299.16 | NA | -24.12 | -47.41 | -84.48 | NA |
| 11. Random Walk | | 80.43 | 105.69 | 184.98 | 267.37 | -26.19 | -39.49 | -69.04 | -128.53 |
| 12. AR(4) | | 70.55 | 107.89 | 182.94 | 239.72 | -26.68 | -38.69 | -64.51 | -113.71 |
| Macroeconomic Forecasts | | | | | | | | | |
| 13. Macroeconomic forecasts – BIC criterion | | 118.44 | 155.79 | 239.15 | 254.13 | -34.75 | -47.03 | -77.15 | -133.11 |
| 14. Macroeconomic forecasts – all variables | | 118.75 | 161.57 | 247.71 | 257.06 | -36.41 | -55.27 | -85.50 | -138.41 |
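The two summary statistics in Table 2 can be illustrated with a small sketch for the random walk benchmark, whose h-step forecast is simply the last observed rate. The sign convention (forecast error defined as realized value minus forecast), the toy series and the function names are assumptions made for this example only; the paper's own data and unit conversion to annual basis points are not reproduced.

```python
import math


def random_walk_errors(series, h):
    """h-step errors of the no-change forecast: y[t+h] - y[t].

    Assumed convention: error = realized value minus forecast.
    """
    return [series[t + h] - series[t] for t in range(len(series) - h)]


def bias(errors):
    """Mean forecast error."""
    return sum(errors) / len(errors)


def rmsfe(errors):
    """Root mean squared forecast error."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


# Hypothetical short-rate series (in percent), for illustration only.
series = [5.0, 5.2, 5.1, 4.9, 5.0]

for h in (1, 2):
    errs = random_walk_errors(series, h)
    print(f"h={h}: bias={bias(errs):.4f}, RMSFE={rmsfe(errs):.4f}")
```

In a recursive exercise of the kind the table describes, the same two functions would be applied to the errors collected as the estimation window expands one period at a time.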


Table 3

Comparison of Predictive Accuracy – 1 Month Horizon

Switching VAR(1) Combination Restricted Switching VAR(1) Combination Switching VAR(1) Forward Forecast Switching VAR(1) Spot Forecast

AR 0.845

(3)

(4)

RW 0.546 0.546

AR 0.142 0.843

(8)

(9)

(10)

(11)

(12)

(13)

(14)

0.053 0.049

0.030 0.031

0.008 0.044

0.001 0.001

0.049 0.045

0.284 0.274

0.047 0.044

0.050 0.049

(5)

Pure EH

Macro Macro forecasts AR(4) forecasts – all – BIC variables

0.991 0.546

0.048 0.049

RW 0.029 0.021

0.579 0.044

0.105 0.100

0.276 0.299

0.305 0.279

0.914 0.900

0.856 0.861

0.001 0.051

0.001 0.001

0.050 0.021

0.271 0.270

0.046 0.064

0.052 0.069

0.154

0.015

0.014

0.018

0.012

0.020

0.007

0.010

0.304

0.043

0.047

0.040

0.038

0.037

0.035

0.039

0.001

0.041

0.542

0.045

0.048

0.820

0.098 0.100

0.905 0.785

0.004 0.004

0.001 0.000

0.015 0.013

0.985 0.984

0.049 0.052

0.053 0.055

0.279

0.006

0.000

0.020

0.712

0.038

0.043

0.008

0.002 0.005

0.024 0.652

0.904 0.198

0.042 0.156

0.044 0.354

0.002

0.003

0.085

0.088

0.027

0.077 0.056

0.079 0.061

RW AR

-0.37

(2)

RW AR

1.20 1.96

0.79 0.38

1.20

(3)

0.07

1.06

-0.55

-2.07

(4)

2.27

2.24

2.07

2.02

1.72

2.98 2.97

2.97 2.99

1.20 1.18

1.20 1.19

3.12 3.08

2.76 2.75

-0.16

RW AR

(7) AR 0.028 0.018

VAR(1) Combination

(1)

0.345

Random Walk

(2)

(1) RW

Liquidity Preference Hypothesis

Switching VAR(1) Spot Forecast

VAR(1) Spot Forecast

Restricted Switching VAR(1) Combination

VAR(1) Forward

Switching VAR(1) Combination

Switching VAR Forward

The table reports Diebold-Mariano statistics for comparisons of the MSFE produced by different forecasting methods. The test is applied pairwise to forecast errors from recursive, 1-step forecasts of spot interest rates using a variety of univariate models and two possible tester forecasts, the random walk (RW) and an AR(4) univariate model for the spot rate. Statistics illustrate the comparative forecasting performance of the model in the row vs. the model in the column. Negative (positive) values indicate that the row model out- (under-) performs the column model. Diebold-Mariano statistics are shown below the main diagonal while p-values from a block bootstrap with 50,000 trials are reported above the diagonal.

VAR(1) Combination

(5)

VAR(1) Forward Forecast

(7)

2.11

2.14

-0.06

-0.05

2.86

2.55

-1.72

-1.72

VAR(1) Spot Forecast Pure EH Liquidity Preference Hypothesis Random Walk AR(4) Macroeconomic forecasts – all variables Macroeconomic forecasts – BIC criterion

(8) (9)

2.86 8.43

2.89 8.44

-0.24 8.39

-0.22 8.44

3.30 4.90

2.71 7.06

0.34 5.38

0.41 5.41

0.82 4.88

4.46

(10)

9.88

9.86

9.88

9.85

9.42

9.88

7.05

7.12

8.03

5.65

9.85

(11) (12)

2.86 1.42

2.88 1.40

2.87 1.43

2.82 1.42

3.39 1.26

3.06 0.70

3.85 -0.05

3.88 -0.04

3.33 0.41

3.05 0.24

-1.15 -2.63

-9.86 -9.84

-3.00

(13)

2.59

2.64

2.59

2.35

2.72

2.61

2.36

2.34

2.60

2.43

2.06

2.93

2.19

2.46

(14)

2.49

2.52

2.32

2.18

2.62

2.51

2.21

2.22

2.35

2.37

1.97

2.94

2.10

2.37


0.856 -0.25
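The Diebold-Mariano comparison behind Table 3 can be sketched as follows. The loss differential is the squared error of the row model minus that of the column model, so a negative statistic favors the row model, matching the sign convention in the table note. The long-run variance here uses Bartlett (Newey-West) weights with h - 1 lags, which is an assumption chosen to keep the variance estimate non-negative; the paper does not state its estimator in this section, and its block-bootstrap p-values are not reproduced. The function name and the error series are hypothetical.

```python
import math


def dm_statistic(e_row, e_col, h=1):
    """Diebold-Mariano statistic for equal MSFE of two forecast error series.

    Negative values indicate that the row model has the smaller squared errors.
    The long-run variance uses Bartlett weights with h - 1 lags (an assumption).
    """
    d = [a * a - b * b for a, b in zip(e_row, e_col)]  # loss differential
    T = len(d)
    d_bar = sum(d) / T

    def gamma(k):
        # Sample autocovariance of the loss differential at lag k.
        return sum((d[t] - d_bar) * (d[t - k] - d_bar) for t in range(k, T)) / T

    lrv = gamma(0)
    for k in range(1, h):
        lrv += 2.0 * (1.0 - k / h) * gamma(k)
    return d_bar / math.sqrt(lrv / T)


# Hypothetical forecast errors, for illustration only.
e_row = [0.1, -0.2, 0.1, 0.0, -0.1, 0.2]
e_col = [0.3, -0.4, 0.2, 0.1, -0.3, 0.5]

print(dm_statistic(e_row, e_col, h=1))
```

Swapping the two arguments flips the sign of the statistic, which is why the table can report the statistics on one side of the diagonal only.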

Table 4

Comparison of Predictive Accuracy – 12 Month Horizon

(3)

Restricted Switching VAR(1) Combination Switching VAR(1) Forward Forecast Switching VAR(1) Spot Forecast

RW 0.570 0.489

AR 0.683 0.440

(8)

(9)

(10)

(11)

(12)

(13)

(14)

0.184 0.170

0.045 0.036

0.011 0.013

0.042 0.043

0.161 0.274

0.121 0.145

0.086 0.084

0.138 0.142

(5)

(4)

Pure EH

Macro Macro forecasts AR(4) forecasts – all – BIC variables

0.095 0.093

0.925 0.894

RW 0.105 0.084

0.090 0.079

0.989 0.861

0.064 0.069

0.078 0.074

0.176 0.008

0.028 0.039

0.013 0.023

0.042 0.042

0.277 0.121

0.132 0.149

0.086 0.079

0.138 0.140

0.096

0.041

0.039

0.106

0.008

0.027

0.044

0.332

0.105

0.058

0.075

0.298

0.308

0.300

0.096

0.010

0.039

0.154

0.124

0.107

0.154

0.986

0.410 0.354

0.324 0.319

0.002 0.001

0.988 0.965

0.014 0.013

0.078 0.069

0.335 0.334

0.321 0.357

0.117

0.006

0.210

0.009

0.093

0.511

0.458

0.699

0.077 0.033

0.005 0.001

0.018 0.019

0.159 0.094

0.150 0.071

0.023

0.041

0.005

0.008

0.349

0.013 0.022

0.014 0.030

RW AR

-0.53

(2)

RW AR

0.99 -0.46

1.26 1.51

-1.02

(3)

-1.93

-1.91

-1.96

-1.93

(4)

0.17

0.29

0.03

0.16

1.90

1.98 1.97

2.00 1.96

2.06 2.01

2.05 2.02

2.71 2.72

1.31 1.33

-0.03

RW AR

(7) AR 0.101 0.085

Switching VAR(1) VAR(1) Combination Spot Forecast

(1)

0.347

Random Walk

Switching VAR(1) Combination

AR 0.402

Liquidity Preference Hypothesis

(2)

(1) RW

VAR(1) Spot Forecast

Restricted Switching VAR(1) Combination

VAR(1) Forward

Switching VAR(1) Combination

Switching VAR Forward

The table reports Diebold-Mariano statistics for comparisons of the MSFE produced by different forecasting methods. The test is applied pairwise to forecast errors from recursive, 12-step forecasts of spot interest rates using a variety of univariate models and two possible tester forecasts, the random walk (RW) and an AR(4) univariate model for the spot rate. Statistics illustrate the comparative forecasting performance of the model in the row vs. the model in the column. Negative (positive) values indicate that the row model out- (under-) performs the column model. Diebold-Mariano statistics are shown below the main diagonal while p-values from a block bootstrap with 50,000 trials are reported above the diagonal.

VAR(1) Combination

(5)

VAR(1) Forward Forecast

(7)

1.66

1.69

1.79

1.80

2.98

1.28

-0.98

-1.05

VAR(1) Spot Forecast Pure EH Liquidity Preference Hypothesis Random Walk AR(4) Macroeconomic forecasts – all variables Macroeconomic forecasts – BIC criterion

(8) (9)

2.45 3.58

2.51 3.52

2.58 3.55

2.57 3.62

3.59 3.66

1.98 3.58

1.05 5.38

1.10 5.41

1.69 4.88

0.56

(10)

2.50

2.49

2.50

2.49

2.44

2.47

0.03

0.09

1.33

2.06

-3.54

(11) (12)

-1.06 -1.20

-1.02 -1.14

-1.13 -1.21

-1.08 -1.16

1.22 1.47

-1.07 -1.21

-3.85 -1.89

-3.88 -1.91

-3.33 -1.86

-4.06 -2.76

-3.50 -3.65

-2.95 -2.74

-0.23

(13)

1.92

1.97

1.91

1.93

2.23

1.87

-1.01

-0.98

-0.77

-1.52

-2.91

-3.81

2.45

2.71

(14)

1.64

1.58

1.52

1.44

1.99

1.59

-1.14

-1.00

-0.85

-1.67

-3.06

-3.64

2.15

2.46


0.106 -1.85

Table 5

Predictive Accuracy Comparison Tests – Parametric Bootstrap P-Values

The table reports p-values for the Diebold-Mariano test based on pairwise comparisons of the MSFE produced by different forecasting methods. The test is applied to forecast errors from recursive 1- and 12-step forecasts of spot rates using a variety of univariate models and three possible tester forecasts: the random walk (RW), an AR(4) univariate model for the spot rate, and macro forecasts (M). P-values are obtained by computing the Diebold-Mariano statistic on recursive forecast errors when the data are simulated from a VAR(4) model for spot and forward interest rates extended to include macroeconomic predictor variables. Innovations are drawn from a bivariate regime switching model and the regression model with macroeconomic factors. The bootstrap uses 1,000 independent trials. Statistics illustrate the comparative forecasting performance of the model in the row vs. the model in the column: a negative (positive) value indicates that the row model out- (under-) performs the column model. Below the main diagonal we present results for 1-month forecasts; above the main diagonal, results are for 12-month forecasts.

Switching VAR(1) Combination

0.787 0.680 0.954 0.579 0.351 0.279 0.659 0.675 0.079 0.265 0.204 0.340 50.039 50.033 50.020 50.033 50.024 50.022 50.043 50.038 50.021 50.038 0.063 50.050 50.037 50.044 50.011 50.035 50.029 50.037 50.015 50.010 0.114

0.785 50.035 50.014 50.028 50.016 50.045 50.035 0.079

0.052 50.048 50.045 0.060 50.043 50.003 50.047

(5) (7) AR M 12-month horizon 0.112 0.056 0.100 0.192 0.052 0.086 0.199 30.049 30.039 0.072 30.048 0.140 30.006 30.012 30.020 30.035 0.252 0.300 0.450 0.249 0.919 0.749 0.346 0.816 0.799 0.333 0.759 0.800 0.258 0.115 0.051 0.089 0.896 0.812 0.658 0.275 50.031 50.049 50.033 50.030 0.882 0.819 0.846 0.701

0.068 50.048 50.035

0.073

0.350

50.027 50.016 50.008 50.014

50.031

50.046 50.025

0.059 50.043

0.080

0.462

50.014 50.041 50.009 50.018

50.006

50.016 50.018

Switching VAR Combination Switching VAR Forward Switching VAR Spot VAR Combination

RW (1) AR M (3) (4) RW (5) AR M (7) (8) (11) (12)

VAR Forward VAR Spot Forecast Random Walk AR(4) Macroeconomic forecasts – (13) all variables Macroeconomic forecasts – (14) BIC criterion

1-month horizon

RW

0.080

(1) AR

Switching Switching Macro Macro VAR(1) VAR(1) VAR(1) VAR(1) Random forecasts – VAR(1) Combination Forward AR(4) forecasts – Spot Forward Spot Walk BIC all variables Forecast Forecast Forecast Forecast criterion (3)

(4)

M

(8)

(11)

(12)

(13)

(14)

0.269 0.275 0.246 0.095 0.607 0.329 0.278 0.198 0.526 0.180 0.205 0.146

0.287 0.283 0.350 0.156 0.680 0.339 0.308 0.189 0.440 0.083 0.250 0.168

RW

0.079 0.085 0.073

0.785 0.750 0.698 0.085


30.032 30.003 30.011 30.001 0.100 0.332 0.332 0.210 0.079 50.030 0.927

0.205 0.178 0.066 0.069 0.077 0.076 0.084 0.078 0.055 0.115 50.007 0.096 50.017 0.071 50.041 0.082 50.013 0.070 50.019 50.009 0.569 0.699

0.670 0.945

Figure 1

Smoothed State Probabilities for the Four-State VAR(1) Regime Switching Model

The graphs plot the smoothed state probabilities for the multivariate Markov Switching model for spot and forward rates.

[Figure: four panels, REGIME1, REGIME2, REGIME3 and REGIME4. Vertical axis: smoothed probability, 0.0 to 1.0. Horizontal axis: year, 1950 to 2000.]


Figure 2

Simulated Distribution for Augmented Dickey-Fuller Test P-values

The graph displays the simulated distribution (over 1,000 trials) of the p-values obtained in ADF tests of the null of a unit root in the 1-month US T-bill rate. Time series 648 months long are simulated from the bivariate four-state regime switching model after discarding 100 transients. The ADF test includes a constant while the number of lags is selected by minimizing the BIC for each simulation trial.

[Figure: histogram of simulated ADF p-values. Horizontal axis: ADF p-value, 0.000 to 0.750. Vertical axis ticks: 0 to 200. A marker labeled "Empirical ADF stat." is shown.]

Figure 3

Optimal Forecast Combination Weights as a Function of State Probabilities – AR(4) Benchmark

The graphs plot the values of ω1 (the weight assigned to the forward rate forecasts) and ω2 (the weight assigned to the AR(4) forecasts) that minimize the MSFE of the combined forecast as a function of the forecast horizon (h). Results are shown for different configurations of the initial state probabilities.

[Figure: two panels, "w1 (Forward Rate Optimal Weight)" and "w2 (AR(4) Optimal Weight)". Horizontal axis: Horizon, 0 to 24. Vertical axis: weight, -2.00 to 2.50. Lines: Regime 1, Regime 2, Regime 3, Regime 4, Ergodic probs.]
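The idea of MSFE-minimizing combination weights can be illustrated with a sample analogue: regress realized spot rates on a constant, the forward rate forecast and the AR(4) forecast, so that the fitted coefficients play the roles of ω0, ω1 and ω2. This least-squares version is a stand-in and not the paper's procedure, which derives the weights from the regime switching model as a function of the state probabilities. All data and function names below are hypothetical.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def combination_weights(actual, fwd, ar):
    """Least-squares weights (w0, w1, w2) for actual ~ w0 + w1*fwd + w2*ar."""
    X = [[1.0, f, a] for f, a in zip(fwd, ar)]
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(3)] for i in range(3)]
    Xty = [sum(r[i] * y for r, y in zip(X, actual)) for i in range(3)]
    return solve_linear(XtX, Xty)


# Hypothetical forecasts (percent); the realized rates are constructed so
# that the true weights are known: w0 = 0.1, w1 = 0.7, w2 = 0.3.
fwd = [5.0, 5.5, 4.8, 6.1, 5.2, 4.9]
ar = [5.1, 5.2, 5.0, 5.6, 5.9, 4.5]
actual = [0.1 + 0.7 * f + 0.3 * a for f, a in zip(fwd, ar)]

w0, w1, w2 = combination_weights(actual, fwd, ar)
print(w0, w1, w2)
```

Nothing in this regression restricts the weights to lie between zero and one or to sum to one, which is consistent with the figure's vertical axis running from -2.00 to 2.50.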

Figure 4

Improvement in Forecast Precision Due to Combining Forecasts

The graphs show the percentage decline in root mean squared forecast error obtained by combining forward rates and AR(4) forecasts vs. pure forward forecasts (ω0 = ω2 = 0 and ω1 = 1) and AR(4) forecasts under a number of assumptions for the initial state.

[Figure: two panels, "% Improvement in RMSFE vs. Forward Rate" (vertical axis 0 to 20) and "% Improvement in RMSFE vs. AR(4)" (vertical axis 0 to 80). Horizontal axis: Horizon, 0 to 24. Lines: Regime 1, Regime 2, Regime 3, Regime 4, Ergodic probs.]
