Journal of Applied Economics. Vol XI, No. 2 (November 2008), 399-425

ARBITRAGE AND CONVERGENCE: EVIDENCE FROM MEXICAN ADRS

Samuel Koumkwa and Raul Susmel*
University of Houston

Submitted September 2007; accepted October 2007

Abstract

This paper investigates the convergence between the prices of ADRs and Mexican traded shares using a sample of 21 dually listed shares. Since both markets have similar trading hours, standard arbitrage considerations should make persistent deviations from price parity rare. We use a STAR model, where the dynamics of convergence to price parity are influenced by the size of the deviation from price parity. Based on different tests, we select the ESTAR model. Deviations from price parity tend to die out quickly; for 14 out of 21 pairs it takes less than two days for the deviations from price parity to be reduced by half. The average half-life of a shock to price parity is 3.1 business days, while the median half-life is 1.1 business days. By allowing a non-linear adjustment process, the average half-life is reduced by more than 50% when compared to the standard linear arbitrage model. We find that several liquidity indicators are positively correlated with the speed of convergence to price parity.

JEL classification codes: G14, G15
Key words: ADRs, nonlinear convergence, arbitrage, ESTAR

I. Introduction

In this paper, we study the possible arbitrage opportunities that the American Depositary Receipts (ADRs) market provides. Although trading ADRs in the United States is denominated in U.S. dollars, it should be equivalent to trading the foreign firms’ shares without actually trading them in their respective local markets. In the absence of direct or indirect trading barriers, there should not be significant differences between the return distribution of locally traded shares and that of the U.S. traded ADRs. That is, ADRs and their underlying shares are expected to be perfect substitutes and no arbitrage opportunities should prevail.
If prices between the ADRs and their underlying shares differ substantially, arbitrage opportunities will arise.

* Raúl Susmel (corresponding author): Department of Finance, C. T. Bauer College of Business, University of Houston, Houston TX 77204-6282, rsusmel@uh.edu. Koumkwa: skoum@yahoo.com.

We focus on the price convergence between Mexican ADRs and their underlying shares because the trading hours in Mexico and New York are almost identical. Thus, convergence to price parity should not be affected by possible lead-lag informational impact, as analyzed by Kim, Szakmary and Mathur (2000). The majority of the studies in this area have, implicitly, focused on linear convergence to arbitrage parity, with exceptions in Rabinovitch et al. (2003), Chung, Ho and Wei (2005) and Suarez (2005b), where threshold autoregressive models are used. Given the complexity of rules and the direct and indirect transaction costs, however, non-linear adjustments to price parity deviations are more likely to occur.

We use two popular non-linear models for our adjustment specification: the exponential smooth transition autoregressive (ESTAR) and the logistic smooth transition autoregressive (LSTAR). The ESTAR model allows symmetric adjustments, while the LSTAR model allows for asymmetric adjustments. From our estimation results, first, we reject the linear adjustment model; and, second, based on different tests, we select the ESTAR model.

Using the ESTAR model, we are able to estimate the half-life of different shocks. This allows us to measure the speed of convergence. The faster the convergence, the more efficient the pricing in the ADR and underlying markets, i.e., the faster arbitrage opportunities vanish from both markets. We find that price spreads tend to die out quickly: for 14 out of 21 firms it takes less than 2 days for the ADR-underlying price spread to be reduced by half.
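To make the half-life measure concrete, the following minimal sketch computes it for the linear AR(1) benchmark only, where a deviation decays geometrically at rate rho and the half-life is ln(0.5)/ln(rho). The ESTAR half-lives reported in the paper depend on the size of the shock and are not reproduced by this formula; the function name `half_life_ar1` is an illustrative assumption.

```python
import math


def half_life_ar1(rho: float) -> float:
    """Periods needed for a shock to halve under q_t = rho * q_{t-1}.

    Linear benchmark only: assumes 0 < rho < 1 (hypothetical helper,
    not the shock-dependent ESTAR half-life used in the paper).
    """
    if not 0.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between 0 and 1")
    return math.log(0.5) / math.log(rho)


# A more persistent deviation (rho closer to 1) takes longer to halve.
for rho in (0.5, 0.9):
    print(rho, round(half_life_ar1(rho), 2))
```

Under this benchmark the speed of adjustment is the same for small and large deviations, which is the feature the non-linear specification relaxes.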
These results are consistent with the dynamics of arbitrage in the ADR market. Gagnon and Karolyi (2003) mention that although the process of issuance and cancellation of ADRs can take place on the same day, the process usually occurs on an overnight basis. We find that for four firms, however, the half-life estimates seem very high (seven days or more). Three of these four firms correspond to companies that display very low volume, and thus, arbitrage might be difficult to execute. The average half-life is 3.1 business days and the median half-life is 1.08 days. By allowing non-linear adjustments, the average half-life and the median half-life are reduced by more than 50%, when compared to the standard linear model.

This paper is organized as follows. Section II presents a brief literature review. Section III motivates the STAR model and briefly discusses estimation and testing issues. Section IV presents the data. Section V estimates the non-linear model and analyzes the convergence path to arbitrage parity. Section VI concludes the paper.

Footnote: See for instance Kim et al. (2000) for VAR and SUR approaches to analyze the speed of adjustment of ADR prices; and Gagnon and Karolyi (2003) for a standard AR model.

II. Literature review

There is a growing body of literature that studies the potential arbitrage opportunities that cross-listed shares create. If prices between the local shares and their cross-listed shares differ substantially, arbitrage opportunities will arise. The early studies by Kato, Linn and Schallheim (1991), Miller and Morey (1996) and Karolyi and Stulz (1996) conclude that ADRs do not present any arbitrage opportunities. The only early study that did find some arbitrage opportunities is by Wahab, Lashgari and Cohn (1992). Substantial deviations from arbitrage pricing are consistent with other studies in the literature of dually-listed shares, such as Rosenthal and Young (1990), and, more recently, Froot and Dabora (1999).
Froot and Dabora (1999), studying the pricing of two dual-listed companies, Royal Dutch and Shell, and Unilever N.V. and Unilever PLC, find a large and significant price deviation from arbitrage parity.

As discussed by Gagnon and Karolyi (2003), there are impediments due to market frictions and imperfect information that can seriously limit arbitrage. They quantify sizable price deviations from arbitrage-free pricing between ADRs and their underlying assets, documenting the existence of large price deviations for many of the 581 ADR-underlying pairs they study. They estimate discounts of up to 87% and premia of up to 66%. After taking into account direct and indirect transaction costs, they still find that the price deviations exceed reasonable measures of transaction costs. Still, Gagnon and Karolyi (2003) mention that the complexity of rules in ADR-underlying arbitrage precludes definite conclusions about potential market inefficiencies.

The convergence to price parity has also been recently studied. Gagnon and Karolyi (2003) discuss the mechanics of arbitrage in the ADR market. Arbitrage, which involves the issuance and cancellation of ADRs, can take place on the same day, but it usually occurs on an overnight basis. They report that the average deviation from price parity can persist for up to five days. Some studies, however, find convergence to price parity to be surprisingly slow. For example, De Jong, Rosenthal and van Dijk (2004) find substantial variation in the number of days for which an arbitrageur has to maintain a position before convergence. In some cases, arbitrageurs have to wait for almost 9 years.

Large price deviations from arbitrage-price parity do not necessarily imply that arbitrage profits are possible. Transaction costs, capital control restrictions, conversion rules, and lack of liquidity might make arbitrage very difficult. De Jong et al.
(2003) and Hong and Susmel (2003) attempt to construct realistic arbitrage strategies to see whether arbitrage is possible. De Jong et al. (2003) study 13 dual-listed companies and show that for every individual dual-listed company, deviations from arbitrage price parity are large. They design investment strategies for exploiting these deviations from price parity. They find that some arbitrage strategies in all dual-listed companies produce excess returns of up to 10% per annum on a risk-adjusted basis, after transaction costs and margin requirements. Hong and Susmel (2003) study simple arbitrage profits for ADR-underlying pairs. They find that pairs-trading strategies deliver significant profits. The results are robust to different profit measures and different holding periods. For example, for a conservative investor willing to wait for a one-year period before closing the pairs-trading positions, pairs-trading delivers annualized profits over 33%. Suarez (2005a), using intradaily data for French ADR-underlying pairs, shows that large deviations from the law of one price are present in the data and that an arbitrage rule can be designed to exploit the large deviations from price parity.

A related line of research deals with the price discovery process. Eun and Sabherwal (2003) apply a standard linear error correction model to study price discovery for 62 Canadian shares cross-listed on the NYSE. They find a significant price deviation from arbitrage parity. They find that the price adjustments of U.S. prices to deviations from Canadian prices are significantly larger in absolute value. They also find that trading volume in the U.S. is the most important variable in the determination of the relative information contribution of the two markets.
Using intradaily data and a similar methodology, but for only three German firms, Grammig, Melvin, and Schlag (2001) find that the majority of the price discovery is done at home (Germany), but following a shock to the exchange rate, almost all of the adjustment comes through the New York price. A similar model, but using non-linear adjustment dynamics, is estimated by Rabinovitch et al. (2003). Using a non-linear threshold model for 20 Chilean and Argentine cross-listed stocks, Rabinovitch et al. (2003) estimate transaction costs and show that transaction costs play an important role in the convergence of prices of ADRs and their underlying securities. They find that capital control measures and liquidity significantly affect the price adjustment process, through increasing transaction costs. Melvin (2003) and Auguste et al. (2006) also find that capital movement restrictions can seriously affect the arbitrage price parity, especially during economic and currency crises.

III. Non-linear convergence and arbitrage models

Let $P_t^A$ represent the price of ADRs and $P_t^L$ the price of underlying (locally) traded shares at time t. The relationship between both prices, under the arbitrage-free condition and in the absence of transaction costs, is specified as:

$$P_t^A = B \, S_t \, P_t^L, \tag{1}$$

where $S_t$ denotes the nominal exchange rate at time t, and $B$ the bundling price ratio. Equation (1), price parity, is usually expressed in log form. The deviation from log price parity, $q_t$, is given by

$$q_t \equiv b + s_t + p_t^L - p_t^A, \tag{2}$$

where lowercase letters denote the logs of the variables defined above. Let $\kappa$ measure the transaction costs, as a percentage, faced by arbitrageurs. Provided that $\kappa$ is small, arbitrage will occur when:

$$|q_t| > \kappa. \tag{3}$$

The dynamic behavior of $q_t$, the deviation from price parity between the ADRs and their underlying shares, has been mostly analyzed in a linear framework.
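As a hedged illustration of the parity condition, the sketch below takes logs of equation (1), measures the deviation as the local leg minus the ADR leg, and checks the transaction-cost trigger. The sign convention, the function names and all numbers are assumptions chosen for the example, not values from the paper.

```python
import math


def log_parity_deviation(p_adr, p_local, fx, ratio):
    """Log deviation from parity: log(ratio * fx * p_local) - log(p_adr).

    p_adr  : ADR price in USD
    p_local: local share price in local currency
    fx     : USD per unit of local currency
    ratio  : local shares bundled into one ADR (hypothetical input name)
    """
    return math.log(ratio) + math.log(fx) + math.log(p_local) - math.log(p_adr)


def arbitrage_triggered(q, kappa):
    """True when the absolute deviation exceeds proportional costs kappa."""
    return abs(q) > kappa


# Made-up quotes: 20 shares per ADR, 0.09 USD per peso.
q = log_parity_deviation(p_adr=20.2, p_local=11.0, fx=0.09, ratio=20)
print(round(q, 4), arbitrage_triggered(q, kappa=0.01))
```

With these hypothetical quotes the ADR trades about 2% above its parity value, so a 1% cost band would be exceeded while a 5% band would not.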
This linear framework is counterintuitive since, once arbitrage is triggered, arbitrage opportunities may disappear very slowly and always at the same speed. One way to address this issue is to consider that, under certain conditions, price differences should converge faster to price parity. This can happen when the convergence dynamics are governed by a nonlinear process.

We start by assuming that small deviations from arbitrage-free prices between ADRs and their underlying shares may be too small to generate arbitrage activity, notably when transaction and other related trading costs are not covered by the deviation from price parity. In this case, the deviation from price parity would behave as a near unit root process and would not converge to parity in a linear framework. On the other hand, when deviations from price parity are large, arbitrage activities will create a reversion to the long-run equilibrium price parity. As the ADR-underlying pair moves further away from arbitrage parity, or long-run equilibrium, arbitrage activities will likely increase. Therefore, the dynamics of convergence to price parity should be influenced by the size of the deviation from price parity.

Footnote: See Dumas (1992), Sercu et al. (1995), Obstfeld and Taylor (1997). These articles find that market frictions create an inactive transaction band, within which small deviations from purchasing power parity do not lead the real exchange rate to mean revert. Arbitrage opportunities exist only for large deviations outside the inactive band. Traders have a tendency to postpone entering the market until enormous arbitrage opportunities open up.

A. Modeling nonlinear adjustments

A model that captures this nonlinear adjustment process is the smooth transition autoregressive (STAR) model studied by Granger and Teräsvirta (1993) and Teräsvirta (1994). The STAR model also displays regimes, but the transitions between regimes occur gradually.
In the STAR literature, the Exponential STAR (ESTAR) and the Logistic STAR (LSTAR) are the most popular models used for symmetric and asymmetric adjustments, respectively. The adjustment structure of both models depends on the magnitude of the departure of the underlying process from its equilibrium. A STAR model of order p for the univariate time series $q_t$ can be formulated as:

$$q_t = \mu + \sum_{j=1}^{p} \Psi_{1j} L^j (q_t - \mu) + \left[ \sum_{j=1}^{p} \Psi_{2j} L^j (q_t - \mu) \right] \Phi(z_t; \lambda, \mu) + \varepsilon_t, \quad \lambda > 0, \tag{4}$$

where the error term, $\varepsilon_t$, is independently and identically distributed, with zero mean and constant variance $\sigma^2$. The vector of regressors $x_t$ is defined as $x_t = (1, w_t')'$ with $w_t = (q_{t-1}, q_{t-2}, \ldots, q_{t-p})'$, and $\Psi_i = (\Psi_{i0}, \Psi_{i1}, \ldots, \Psi_{ip})$, $i = 1, 2$, denotes the autoregressive parameter vector of an AR(p); $L$ is the lag operator; $\Phi(z_t; \lambda, \mu)$ is the smooth transition function, which determines the degree of convergence. The ESTAR model uses the exponential function as the transition function:

$$\Phi(z_t; \lambda, \mu) = 1 - \exp\left\{ -\lambda (z_t - \mu)^2 / \hat{\sigma}_z^2 \right\}, \quad \lambda > 0, \tag{5}$$

where $z_t$, the transition variable, is assumed to be a lagged endogenous variable, $z_t = q_{t-d}$, for which d is the delay lag, a nonzero integer (d > 0) that determines the lagged time between a shock and the response by the process. The parameter $\lambda$ determines the speed of transition between regimes, and $\mu$ can be interpreted as the arbitrage parity, or equilibrium, level. Note that, for a given price parity deviation, lower (higher) values of $\lambda$ imply lower (higher) values of $\Phi(\cdot)$ and, thus, slower (faster) regime transitions. The transition function is symmetrical around the equilibrium level (mean).

Footnote: Another popular nonlinear specification is the threshold autoregressive (TAR) model in which regime changes occur abruptly, see Tong (1990). A problem with this approach is that the model has two very distinct regimes: outside the threshold (where arbitrage happens) and inside the threshold (where there is no arbitrage). The change from one regime to the other is abrupt and it presumes the same speed of adjustment outside the threshold. The LSTAR model contains as a special case the single-threshold TAR model, discussed in this section.

Footnote: The sample variance of the transition variable is used to scale the argument of the exponential as suggested by Granger and Teräsvirta (1993, p.124). The scaling improves the stability of the nonlinear least squares estimation algorithm, speeds up convergence, and allows the $\lambda$ estimates to be interpreted and compared across equations in a scale-free environment.

Substituting (5) into (4), the ESTAR model can be written as:

$$q_t = \mu + \sum_{j=1}^{p} \Psi_{1j} L^j (q_t - \mu) + \left[ \sum_{j=1}^{p} \Psi_{2j} L^j (q_t - \mu) \right] \left[ 1 - \exp\left\{ -\lambda (z_t - \mu)^2 / \hat{\sigma}_z^2 \right\} \right] + \varepsilon_t. \tag{6}$$

The transition function is bounded between zero and one. The inner regime is characterized by $q_{t-d} = \mu$, when $\Phi(\cdot) = 0$. The ESTAR model (6) then degenerates to a standard linear AR(p):

$$q_t = \mu + \sum_{j=1}^{p} \Psi_{1j} L^j (q_t - \mu) + \varepsilon_t. \tag{7}$$

The outer regime is characterized by an extreme deviation from price parity, when $\Phi(\cdot) = 1$, in which case model (6) converts to a different AR(p) representation:

$$q_t = \mu + \sum_{j=1}^{p} (\Psi_{1j} + \Psi_{2j}) L^j (q_t - \mu) + \varepsilon_t. \tag{8}$$

The model displays global stability provided $\sum_{j=1}^{p} (\Psi_{1j} + \Psi_{2j}) < 1$, although it is possible that $\sum_{j=1}^{p} \Psi_{1j} \geq 1$, implying that $q_t$ may follow a unit root process or even explode around the arbitrage-free parity level.

The LSTAR model uses the logistic function, instead of an exponential function, to model the transition function $\Phi(\cdot)$. Thus, after substituting in (4), the LSTAR model can be written as:

$$q_t = \Psi_1' x_t + \Psi_2' x_t \left[ 1 + \exp\left\{ -\lambda (z_t - \mu) / \hat{\sigma}_z \right\} \right]^{-1} + \varepsilon_t, \quad \lambda > 0. \tag{9}$$

B. Estimation, testing and model selection

Following Teräsvirta (1994), the starting point in modeling a STAR specification consists of an adequate choice of the autoregressive parameter, p, and of the delay parameter, d. Second, a sequence of tests of the null hypothesis of linearity (AR model) is performed, along with other diagnostic tests. Third, if the null hypothesis of linearity is rejected, the model is specified as ESTAR or LSTAR. The choice of ESTAR or LSTAR model is based on a comparison of p-values for a sequence of LM tests.

Footnote: See the Appendix for details.

The choice of the autoregressive parameter, p, is based on the Akaike information criterion (AIC). However, the AIC tends to under-parameterize an AR model. Thus, we also look at the partial autocorrelation function (PACF) using a 95% confidence interval band.

In order to specify the delay parameter, d, a sequence of linearity tests is carried out for different values of d, with 1 ≤ d ≤ D considered appropriate. If the null hypothesis of linearity is rejected at a pre-specified level for more than one value of d, then d is set to $d = d^*$ such that $d^* = \arg\min_{1 \leq d \leq D} p(d)$, where p(d) denotes the p-value of the selected test. The correct choice of d is important for the test to have maximum power. For this paper, we set the maximum value of d equal to 5 business days, as it seems unreasonable to argue that it would take more than 5 days for the price spread to start adjusting if there is arbitrage activity. Once p and d are selected, a STAR model can be estimated by non-linear least squares.

We test for the presence of nonlinearity in the price spread between the local assets and their corresponding ADRs using the Lagrange Multiplier (LM) tests proposed by Luukkonen et al. (1988); Granger and Teräsvirta (1993) and Teräsvirta (1994) (hereafter, the TP procedure); and Escribano and Jordá (1999) (hereafter, the EJP procedure).
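The role of the exponential transition function in equations (4) and (5) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' estimation code: the function names are hypothetical, the transition variable is taken to be $q_{t-d}$ with d no larger than p, and all parameter values are made up.

```python
import math


def estar_transition(z, lam, mu, sigma_z):
    """Exponential transition: 1 - exp(-lam * (z - mu)^2 / sigma_z^2).

    Equals 0 at z = mu and approaches 1 as |z - mu| grows; symmetric in
    the sign of the deviation.
    """
    if lam <= 0 or sigma_z <= 0:
        raise ValueError("lam and sigma_z must be positive")
    return 1.0 - math.exp(-lam * (z - mu) ** 2 / sigma_z ** 2)


def estar_conditional_mean(lags, psi1, psi2, lam, mu, sigma_z, d=1):
    """Conditional mean of q_t for an ESTAR(p) with z_t = q_{t-d}.

    lags = [q_{t-1}, ..., q_{t-p}]; psi1 and psi2 hold the p coefficients
    of the inner and the additional outer-regime terms (requires d <= p).
    """
    phi = estar_transition(lags[d - 1], lam, mu, sigma_z)
    inner = sum(a * (q - mu) for a, q in zip(psi1, lags))
    outer = sum(b * (q - mu) for b, q in zip(psi2, lags))
    return mu + inner + outer * phi


# Small deviations barely move the transition; large ones push it towards 1.
for z in (0.1, 1.0, 3.0):
    print(z, round(estar_transition(z, lam=1.0, mu=0.0, sigma_z=1.0), 4))
```

Because the deviation enters squared, positive and negative deviations of the same size produce the same adjustment, which is why the ESTAR form suits symmetric arbitrage dynamics.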
For each test, we also compute a heteroskedasticity-consistent version, since neglecting heteroskedasticity can seriously affect the power of LM tests, see Wooldridge (1990, 1991).

Once a nonlinear specification is found adequate, the next task is to choose between the ESTAR and the LSTAR models. Teräsvirta (1994) suggests the following model selection procedure. Let $LM^{EST}$ denote the F-test of the ESTAR null hypothesis, and let $LM^{LST}$ denote the F-test of the LSTAR null hypothesis. The relative strength of the rejection of each hypothesis is then compared. If the minimum p-value corresponds to $LM^{LST}$, the LSTAR model is selected, but if it corresponds to $LM^{EST}$, the selected model is the ESTAR.

Footnote: See Van Dijk et al. (2002) for a survey of the different modeling procedures for STAR models.

Footnote: Van Dijk et al. (1999) develop outlier-robust tests, since they show that in the presence of additive outliers, LM tests for STAR nonlinearity tend to incorrectly reject the null hypothesis of linearity. We used such tests along with the heteroskedasticity tests, but there were no major changes for our sample.

IV. The data

The data analyzed in this paper are the daily prices of twenty-one locally traded firms from Mexico, obtained from Datastream. To be part of our sample, the ADR has to be Level III or Level II. The sample periods differ across firms, depending on the date on which the ADRs of each firm started trading on the U.S. market. Table 1 presents the twenty-one firms and the sample period for each of them.

Table 1. Data description

| ADR issue | Symbol | Ratio | Industry | Eff. date |
|---|---|---|---|---|
| America Movil S.A. de CV- Series ‘L’ | AMX | 1:20 | Wireless Comm. | 8-Feb-01 |
| Cemex S.A. de CV | CX | 1:5 | Building Materials | 1-Sep-99 |
| Coca-Cola Femsa ‘L’ Shares | KOF | 1:10 | Beverage | 1-Sep-93 |
| Corporacion Durango | CDG | 1:2 | Forest Products & Paper | 1-Jul-94 |
| Desc, S.A. de C.V. | DES | 1:20 | Auto Parts & Tires | 20-Jul-94 |
| Empresas Ica, S.A. de C.V. | ICA | 1:6 | Heavy Construction | 1-Apr-92 |
| Fomento Economico Mexicano, S.A. de C.V. | FMX | 1:10 | Beverage | 11-Feb-04 |
| Gruma, S.A. de C.V. - ‘B’ Shares | GMK | 1:4 | Food | 6-Nov-98 |
| Grupo Aeroportuario del Sureste | ASR | 1:10 | Gen. Industrial Svcs | 28-Sep-00 |
| Grupo Imsa | IMY | 1:9 | Industrial Diversified | 10-Dec-96 |
| Grupo Industrial Maseca S.A. de C.V. | MSK | 1:15 | Food | 17-May-94 |
| Grupo Iusacell | CEL | 1:5 | Wireless Comm. | 5-Aug-99 |
| Grupo Radio Centro, S.A. de C.V. | RC | 1:9 | Broadcasting | 9-Jul-93 |
| Grupo Simec ‘B’ Shares | SIM | 1:1 | Mining & Metals | 1-Jun-93 |
| Grupo Televisa, S.A. | TV | 1:20 | Broadcasting | 16-Sep-02 |
| Grupo TMM | TMM | 1:1 | Industrial Transport | 17-Jun-92 |
| Industrias Bachoco | IBA | 1:6 | Food | 26-Sep-97 |
| Internacional de Ceramica | ICM | 1:5 | Building Materials | 15-Dec-94 |
| Telefonos de Mexico S.A. de C.V.-Series ‘L’ | TMX | 1:20 | Fixed Line Comm. | 13-May-91 |
| Tv Azteca, S.A. de C.V. | TZA | 1:16 | Broadcasting | 1-Aug-97 |
| Vitro, S.A. de C.V. | VTO | 1:3 | Industrial Diversified | 19-Nov-91 |

Notes: as to exchange, all ADRs are traded on the NYSE, except for SIM, which is traded on AMEX; as to type, AMX, CX, FMX, GMK, MSK and CEL are listed as Level II ADRs, the rest as Level III ADRs.

Table 2. Market statistics

| Symbol | Volume | MC | Float | Short ratio | Mean(Q_t) | SD(Q_t) | Max/Min | AR(1) | LB(5) |
|---|---|---|---|---|---|---|---|---|---|
| AMX | 1,531,594 | 22.13B | 386.70M | 3.155 | 0.1154 | 1.0538 | 8.27/-9.46 | 0.391 | 207.31 |
| CX | 617,121 | 9.87B | 139.84M | 2.586 | -0.4298 | 0.9613 | 5.44/-6.26 | 0.490 | 844.66 |
| KOF | 179,258 | 3.83B | 27.08M | 1.909 | 0.372 | 1.8516 | 12.79/-9.82 | 0.414 | 615.21 |
| CDG | 23,641 | 55.10M | 2.30M | 5 | 16.575 | 58.1485 | 483.44/-52.00 | 0.988 | 116118.2 |
| DES | 50,501 | 665.14M | 34.66M | 2.667 | -10.55 | 6.6358 | 14.01/-37.29 | 0.917 | 8659.15 |
| ICA | 266,505 | 615.47M | 71.20M | 16.645 | -2.6896 | 2.4973 | 13.66/-17.53 | 0.456 | 1427.63 |
| FMX | 237,435 | 4.78B | 66.26M | 2.851 | -0.1025 | 0.9973 | 7.56/-17.41 | 0.194 | 81.35 |
| GMK | 8,323 | 743.26M | 21.68M | 7.25 | 0.6519 | 3.9613 | 14.19/-81.40 | 0.479 | 758.81 |
| ASR | 130,500 | 587.40M | 10.80M | 0.407 | 0.8898 | 6.8541 | 86.48/-12.32 | 0.810 | 1172.07 |
| IMY | 19,590 | 1.13B | 10.02M | 9.5 | 0.5551 | 2.538 | 28.56/-17.34 | 0.633 | 1842.6 |
| MSK | 33,240 | 397.98M | 4.42M | N/A | 0.5484 | 2.4037 | 14.38/-29.99 | 0.501 | 1539.18 |
| CEL | 150,975 | 116.56M | 4.72M | 18.2 | -0.3614 | 4.4567 | 22.51/-29.53 | 0.667 | 1291.53 |
| RC | 37,263 | 98.50M | 8.68M | 2.042 | 2.7286 | 9.3029 | 64.47/-24.51 | 0.898 | 8233.55 |
| SIM | 11,000 | 298.47M | 16.10M | 1.978 | 1.0113 | 11.17 | 77.66/-59.88 | 0.874 | 8139.64 |
| TV | 692,304 | 5.98B | 105.89M | 2.884 | 0.0915 | 1.1807 | 16.28/-11.14 | 0.267 | 291.30 |
| TMM | 62,341 | 150.38M | 6.30M | 16.968 | -0.659 | 15.3707 | 101.31/-68.87 | 0.963 | 12768.54 |
| IBA | 18,586 | 471.38M | 8.17M | 1.294 | 0.1688 | 2.7932 | 17.90/-12.13 | 0.702 | 2054.45 |
| ICM | 6,945 | 107.68M | 5.54M | N/A | 3.5991 | 13.2577 | 64.95/-49.16 | 0.955 | 9441.14 |
| TMX | 2,520,634 | 19.09B | 392.98M | 8.095 | 0.6146 | 0.971 | 16.53/-10.86 | 0.305 | 730.84 |
| TZA | 440,805 | 1.49B | 72.68M | 7.42 | -0.2084 | 1.4993 | 10.64/-10.26 | 0.309 | 259.07 |
| VTO | 164,141 | 304.66M | 24.70M | 4.636 | 0.3206 | 2.2561 | 14.81/-17.16 | 0.473 | 1819.54 |

Notes: (1) As of May 18, 2004. N/A: not available. MC: market capitalization (in USD); Volume: average daily volume since inception (in USD); Float: number of freely traded shares in the hands of the public. Float is calculated as Shares Outstanding minus Shares Owned by Insiders, 5% Owners, and Rule 144 Shares.
Mean($Q_t$) is the mean of $Q_t$, the deviations from price parity (in %); SD($Q_t$) is the standard deviation of $Q_t$; Max/Min represents the maximum and the minimum of $Q_t$; AR(1) is the AR(1) coefficient of $Q_t$; and LB(5) is the Ljung-Box statistic with 5 lags for $Q_t$.

Table 2 exhibits several statistics for each firm: market capitalization (MC), average daily volume since inception (Volume), the number of freely traded shares in the hands of the public (Float), and the short ratio, which is calculated as the short interest for the current month divided by the average daily volume. In the last five columns of Table 2, we also present summary statistics for the deviations from price parity (in %):

$$Q_t = \left( \frac{B \, S_t \, P_t^L}{P_t^A} - 1 \right) \times 100. \tag{10}$$

Analyzing the statistics for $Q_t$, we observe evidence of autocorrelation. We also tend to observe a negative relation between liquidity and departures from theoretical price parity: the less liquid a stock is, the bigger the departures from price parity, as shown by the mean and the maximum and minimum statistics.

V. Results

The lag selection is based on both the AIC and the partial autocorrelation functions (PACF). For most series, only the first or second autocorrelation coefficients are significant at the 5% level. Therefore, the maximum AR order used is 2, which seems to purge the residual series of serial correlation. As a check, we also estimate models with p > 3, with d = {1,2,…,10}, to test for a higher AR order in $q_t$; but the results are very similar to the ones presented below.

Table 3 reports p-values for the standard and heteroskedasticity-consistent test statistics NLM3 and NLM4 for testing the linearity hypothesis. Table 3 also reports the test statistics NLM2 (an LM test), $LM^{EST}$ and $LM^{LST}$ for choosing between ESTAR and LSTAR (see Appendix for details). Panel A shows all the test results for one firm, TMX. Panel B shows a summary of the test results for all the other firms.
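For readers who want to reproduce summary statistics of the kind shown in Table 2, the sketch below computes the percentage deviation of equation (10) and a Ljung-Box statistic using the standard formula n(n+2) Σ r_k²/(n−k). It is an illustration with made-up inputs and hypothetical function names; the paper does not describe its own implementation.

```python
def pct_parity_deviation(p_adr, p_local, fx, ratio):
    """Percentage deviation from parity: (ratio * fx * p_local / p_adr - 1) * 100."""
    return (ratio * fx * p_local / p_adr - 1.0) * 100.0


def autocorrelation(x, k):
    """Sample autocorrelation of the series x at lag k."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    num = sum((x[t] - mean) * (x[t - k] - mean) for t in range(k, n))
    return num / denom


def ljung_box(x, m=5):
    """Ljung-Box statistic with m lags (requires len(x) > m)."""
    n = len(x)
    return n * (n + 2) * sum(
        autocorrelation(x, k) ** 2 / (n - k) for k in range(1, m + 1)
    )


# Made-up daily deviations (in %) for illustration only.
q_series = [0.4, 0.3, 0.5, -0.2, -0.1, 0.1, 0.6, 0.2, -0.3, 0.0]
print(round(autocorrelation(q_series, 1), 3), round(ljung_box(q_series, 5), 3))
```

A large statistic relative to the chi-squared distribution with m degrees of freedom points to serial correlation in the deviations, which is the pattern the AR(1) and LB(5) columns summarize.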
The second column of Table 3 displays the different values for the delay parameter, d = {1,2,…,5}. Using the results in the first panel of Table 3 for TMX, we select d = 2, as the results indicate the smallest p-values (corresponding to NLM3 and NLM4) for both tests; that is, for d = 2 we obtain the strongest rejection of the AR linear hypothesis. Also, for d = 2, the ESTAR model is selected over the LSTAR model since the p-value of the $LM^{EST}$ test is smaller than the p-value of the $LM^{LST}$ test, for both versions of the test. Note that the p-value of the NLM2 test confirms this selection. We follow this process for the other firms.

Footnote: The results for the other firms can be reported similarly, but are not included to save space. They are available upon request.

Footnote: The tests are performed with values of the delay parameter, d = {1,2,…,10}, yet we report the test statistics for d = {1,2,…,5} since d = {6,…,10} do not alter the choice of d and are less relevant for the convergence of a daily price spread series.

Footnote: We also used as the transition variable, $z_t$, the first lag of the average absolute volatility, $v_{t,k}$, as suggested by LeBaron (1992), where k is the number of days used in the summation of absolute values, with a maximum of 5 business days. The tests selected $v_{t,k}$ as adequate transition variables for six stocks. Overall, our results are unchanged.

Table 3. LM tests for nonlinearity and LM tests for model selection

Columns 3 to 7 report the standard tests and columns 8 to 12 the heteroskedasticity-robust tests. Within each group, NLM3 (TP) and NLM4 (EJP) test AR vs. STAR, while NLM2 (TP), LM^LST (EJP) and LM^EST (EJP) test ESTAR vs. LSTAR.

| Firm | d | NLM3 | NLM4 | NLM2 | LM^LST | LM^EST | NLM3 (robust) | NLM4 (robust) | NLM2 (robust) | LM^LST (robust) | LM^EST (robust) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Panel A | | | | | | | | | | | |
| TMX | 1 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.387 | 0.390 | 0.042 | 0.016 | 0.002 |
| TMX | 2 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.049 | 0.037 | 0.078 | 0.345 | 0.083 |
| TMX | 3 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.008 | 0.601 | 0.138 | 0.098 | 0.092 |
| TMX | 4 | 0.000 | 0.000 | 0.000 | 0.127 | 0.000 | 0.316 | 0.271 | 0.192 | 0.821 | 0.162 |
| TMX | 5 | 0.000 | 0.000 | 0.002 | 0.009 | 0.043 | 0.263 | 0.426 | 0.220 | 0.404 | 0.430 |
| Panel B | | | | | | | | | | | |
| AMX | 4 | 0.000 | 0.000 | 0.001 | 0.001 | 0.000 | 0.021 | 0.037 | 0.002 | 0.035 | 0.017 |
| CX | 2 | 0.000 | 0.029 | 0.000 | 0.002 | 0.000 | 0.200 | 0.569 | 0.169 | 0.101 | 0.113 |
| KOF | 1 | 0.000 | 0.000 | 0.013 | 0.000 | 0.020 | 0.069 | 0.059 | 0.049 | 0.073 | 0.037 |
| CDG | 1 | 0.000 | 0.019 | 0.000 | 0.002 | 0.000 | 0.020 | 0.599 | 0.069 | 0.091 | 0.063 |
| DES | 1 | 0.000 | 0.000 | 0.010 | 0.000 | 0.020 | 0.054 | 0.070 | 0.041 | 0.093 | 0.057 |
| ICA | 1 | 0.000 | 0.000 | 0.015 | 0.000 | 0.000 | 0.000 | 0.000 | 0.576 | 0.519 | 0.530 |
| FMX | 1 | 0.000 | 0.000 | 0.004 | 0.032 | 0.000 | 0.072 | 0.042 | 0.051 | 0.051 | 0.014 |
| GMK | 1 | 0.000 | 0.000 | 0.027 | 0.000 | 0.644 | 0.030 | 0.051 | 0.094 | 0.045 | 0.002 |
| ASR | 1 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.112 | 0.066 | 0.032 | 0.071 | 0.033 |
| IMY | 1 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.202 | 0.034 | 0.053 | 0.054 | 0.032 |
| MSK | 2 | 0.000 | 0.007 | 0.000 | 0.026 | 0.003 | 0.046 | 0.270 | 0.037 | 0.074 | 0.055 |
| CEL | 4 | 0.000 | 0.019 | 0.001 | 0.053 | 0.000 | 0.057 | 0.031 | 0.095 | 0.099 | 0.023 |
| RC | 1 | 0.000 | 0.012 | 0.001 | 0.080 | 0.020 | 0.072 | 0.065 | 0.068 | 0.097 | 0.053 |
| SIM | 1 | 0.000 | 0.000 | 0.120 | 0.007 | 0.067 | 0.002 | 0.048 | 0.029 | 0.034 | 0.051 |
| TV | 1 | 0.000 | 0.000 | 0.000 | 0.001 | 0.000 | 0.045 | 0.055 | 0.032 | 0.107 | 0.092 |
| TMM | 1 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.004 | 0.027 | 0.076 | 0.088 | 0.078 |
| IBA | 2 | 0.000 | 0.001 | 0.010 | 0.001 | 0.016 | 0.032 | 0.019 | 0.164 | 0.026 | 0.014 |
| ICM | 1 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.151 | 0.046 | 0.105 | 0.062 | 0.043 |
| TMX | 2 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.049 | 0.037 | 0.078 | 0.345 | 0.083 |
| TZA | 5 | 0.000 | 0.000 | 0.004 | 0.000 | 0.070 | 0.013 | 0.018 | 0.057 | 0.081 | 0.043 |
| VTO | 2 | 0.000 | 0.000 | 0.268 | 0.025 | 0.082 | 0.080 | 0.067 | 0.063 | 0.081 | 0.052 |

Notes: This table presents the p-values of the Lagrange Multiplier (LM) tests for AR linearity against STAR nonlinearity, denoted AR vs. STAR, and of the LM tests for choosing between the ESTAR and the LSTAR model, denoted ESTAR vs. LSTAR, for the daily price differential between ADRs and their underlying shares. The tests follow two procedures: the Teräsvirta (1994) test (TP), with the corresponding statistics NLM3 and NLM2; and the Escribano and Jordá (1999) test (EJP), with corresponding statistics NLM4, $LM^{LST}$, and $LM^{EST}$. The NLM3 and NLM2 statistics are based on the auxiliary regression model, equation (A1), and the NLM4, $LM^{LST}$, and $LM^{EST}$ statistics are based on equation (A2). For each test, two versions are estimated, the standard test and the heteroskedasticity-consistent test. The first panel shows p-values for all possible choices of d, d = {1,…,5}, only for the firm TMX. The second panel reports the selected p-values and the delay parameter, d, for all the other firms. For each test, the rejection of the null hypothesis, the selection of the delay parameter d, and the resultant model are based on the smallest p-value.

Based on the standard LM test statistics NLM3 and NLM4, reported in Panel B of Table 3 for all the firms, the null hypothesis of linearity can be rejected for any value of d and corresponding transition variable, at the 1% level. For the majority of the firms, we select d = 1, that is, yesterday’s deviation from price parity.

When we use the heteroskedasticity-robust tests, the null hypothesis of linearity is still rejected for the majority of the firms. Using the NLM3 test, and the lag selected by the standard homoskedastic test, we find eighteen firms with a p-value lower than 10%. For example, for FMX the results of the heteroskedasticity-robust test indicate that $z_t = q_{t-1}$, $q_{t-2}$ and $v_{t-1,4}$ are adequate transition variables, since the corresponding p-values are smaller than .10. The NLM3 test for $q_{t-1}$ rejects linearity, showing a p-value of .072. The results from the NLM4 test statistic, computed using the Escribano and Jordá test, confirm the NLM3 selection.
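The delay-selection step can be sketched with the heteroskedasticity-robust NLM3 and NLM4 p-values reported for TMX in Panel A of Table 3. The per-test argmin follows the rule d* = arg min p(d); the joint rule (keep the delays at which both tests reject at 5%) is an assumption of this sketch rather than a rule stated in the paper, and the function names are hypothetical.

```python
# Robust p-values for TMX from Table 3, Panel A: d -> (NLM3, NLM4).
robust_pvalues = {
    1: (0.387, 0.390),
    2: (0.049, 0.037),
    3: (0.008, 0.601),
    4: (0.316, 0.271),
    5: (0.263, 0.426),
}


def argmin_delay(pvalues, test_index):
    """Delay with the smallest p-value for one test (0 = NLM3, 1 = NLM4)."""
    return min(pvalues, key=lambda d: pvalues[d][test_index])


def delays_rejected_by_all(pvalues, alpha=0.05):
    """Delays at which every test rejects linearity at level alpha.

    Assumed joint rule for illustration; not taken from the paper.
    """
    return [d for d in sorted(pvalues) if all(p < alpha for p in pvalues[d])]


print("NLM4 argmin:", argmin_delay(robust_pvalues, 1))
print("rejected by both tests at 5%:", delays_rejected_by_all(robust_pvalues))
```

For these inputs the NLM4 column and the joint rule both return d = 2.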
Finally, the p-values of the NLM2 statistic (standard or heteroskedasticity-consistent) suggest that an ESTAR model is the more appropriate model. Comparing the relative strength of the $LM^{EST}$ and $LM^{LST}$ tests, the minimum p-values correspond to $LM^{EST}$, indicating a choice in favor of the ESTAR model. In most cases, $LM^{EST}$ is significant at the 5% level for d = 1. Thus, based on the decision rules of Teräsvirta (1994), the ESTAR model with a delay d = 1 should be an adequate specification for the FMX return spread. We carry out an identical evaluation for the other firms. With few exceptions, we find the ESTAR to be the most adequate model.

A. Nonlinear estimation results

Following Gallant and White (1988), the resulting ESTAR(p) models, with p = {1, 2}, are estimated by nonlinear least squares. We test the following four restrictions, which are consistent with the application of ESTAR specifications to arbitrage models: $\Psi_{11} + \Psi_{12} = 1$, $\Psi_{2j} = -\Psi_{1j}$ ($j = 1, 2$), and $\mu = 0$. Under the first restriction, the model behaves like a random walk, and thus there is no convergence to equilibrium, when the transition function is equal to 0 (no arbitrage regime). Under the second set of restrictions, $\Psi_{21} = -\Psi_{11}$ and $\Psi_{22} = -\Psi_{12}$, there is full convergence to price parity when the transition function is equal to 1 (full arbitrage regime). The fourth restriction, $\mu = 0$, implies that the equilibrium price parity deviation is zero. The restrictions are tested using likelihood ratio tests. If none of the restrictions can be rejected and all are imposed, the final model is governed by λ, the speed of transition between regimes. The higher the λ, the higher the speed of transition between regimes and, thus, the faster the convergence to parity. When the last restriction cannot be rejected, we impose it and re-estimate the model. The model estimates, the likelihood ratio, and the residual diagnostic statistics are presented in Table 4.
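The two regimes described above can be illustrated with a short sketch. It assumes the textbook exponential transition function, 1 - exp(-λ(z - μ)²), which may differ in scaling from the paper's equation (6); the function names and the numerical inputs are hypothetical and chosen only for illustration. With an ESTAR(1) spread and the restrictions Ψ11 = 1, Ψ21 = -Ψ11, and μ = 0 imposed, the conditional mean is (1 - Φ) times the lagged spread.

```python
import math


def estar_transition(z, lam, mu=0.0):
    """Exponential transition function (assumed form): 1 - exp(-lam * (z - mu)^2).

    It equals 0 when z == mu (no-arbitrage regime) and approaches 1
    as |z - mu| grows (full-arbitrage regime).
    """
    return 1.0 - math.exp(-lam * (z - mu) ** 2)


def restricted_estar_mean(q_prev, z, lam, psi11=1.0):
    """Conditional mean of an ESTAR(1) spread with psi21 = -psi11 and mu = 0.

    q_t = psi11 * q_{t-1} + psi21 * q_{t-1} * Phi(z) + error, with psi21 = -psi11.
    """
    phi = estar_transition(z, lam)
    return psi11 * q_prev - psi11 * phi * q_prev


# Hypothetical values: a tiny deviation is left almost untouched (random walk),
# while a large deviation is almost fully corrected in one step.
small = restricted_estar_mean(q_prev=0.01, z=0.01, lam=2.0)
large = restricted_estar_mean(q_prev=3.0, z=3.0, lam=2.0)
print(small, large)
```

The transition function is symmetric in the sign of the deviation, which is the property that distinguishes the ESTAR from the LSTAR adjustment; a larger λ moves the function toward 1 for smaller deviations.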
In column ten, we report the p-value associated with the likelihood ratio statistic, LR(k). The LR(k) statistics show that at least one of the restrictions cannot be rejected at the standard 5% level for all series. The number of restrictions that cannot be rejected varies from one firm to another. For example, for the firm AMX, the p-value of LR(4) is 0.561; thus, we fail to reject all four restrictions. The failure to reject the first three restrictions indicates that for small deviations from price parity there is no tendency for reversion towards price parity, while for large deviations from price parity there is a full reversion to price parity. Overall, this type of dynamic adjustment for deviations from price parity is typical across firms. The restriction μ = 0 cannot be rejected for the majority of the firms; that is, the long-run deviation from price parity is zero.

In the fourth column of Table 4, we report the estimated λ's, the transition parameters. With only one exception, TMM, the estimates of λ are all significantly different from zero. The estimates of λ range from 0.315 to 2.971. It is worth noticing that firms with a higher estimate of λ tend to have higher average daily volume and market capitalization, whereas firms for which the price spread series exhibits a lower speed of adjustment coefficient, such as ICM (λ = 0.317), GMK (λ = 0.361), and TMM (λ = 0.315), tend to have lower average daily volume and market capitalization. Overall, the estimated values reported in Table 4 support a nonlinear dynamic convergence of the price spread series towards price parity.

We also conduct specification tests for our ESTAR model. The residual diagnostic statistics for the estimated equations are reported in the last two columns of Table 4. Following Eitrheim and Teräsvirta (1996), we calculate $LMNA_{AR(1-6)}$ and $NL_{Max}$. $LMNA_{AR(1-6)}$ is an LM test statistic for testing the null hypothesis of no serial correlation in the residuals of order 1 up to 6.
$NL_{Max}$ represents the maximum LM test statistic of no additive nonlinearity with the delay length in the range from 3 to 6. The associated p-values indicate that we cannot reject those null hypotheses for any firm at the 5% level or better. Therefore, an ESTAR specification seems adequate for the price spread series. Taylor et al. (2001) point out that the significance of the λ estimate based on individual t-ratios should be checked for robustness. Technical problems emerge under the null hypothesis that λ = 0.

Table 4. Nonlinear estimation results for ESTAR model of price spread (equation 6)

Firm  p,d   μ         λ         Ψ11       Ψ12       Ψ21       Ψ22       S      LR(k)          NL_Max   LMNA
AMX   2,4   -         2.745*    0.842*    0.155     -0.837    -0.137    0.453  LR(4) [0.561]  [0.498]  [0.336]
                      (0.019)   (0.106)   (0.082)   (0.547)   (0.544)
CX    2,2   -         2.864*    0.494*    0.259*    -0.470*   -0.258*   0.061  LR(3) [0.754]  [0.502]  [0.452]
                      (0.008)   (0.066)   (0.046)   (0.108)   (0.114)
KOF   1,1   0.045*    1.575*    0.643*    -         -0.612*   -         0.036  LR(1) [0.224]  [0.471]  [0.357]
            (0.001)   (0.025)   (0.049)             (0.077)
CDG   1,1   -         1.981*    0.993*    -         -0.996*   -         0.041  LR(3) [0.582]  [0.211]  [0.405]
                      (0.528)   (0.009)             (0.048)
DES   2,1   0.163*    2.971*    0.921*    0.164*    -0.928*   0.141*    0.028  LR(3) [0.672]  [0.404]  [0.471]
            (0.003)   (0.068)   (0.050)   (0.048)   (0.062)   (0.066)
ICA   2,1   -0.221*   0.992*    -0.765*   -0.213*   0.548*    0.098     0.039  LR(3) [0.423]  [0.397]  [0.545]
            (0.012)   (0.082)   (0.113)   (0.108)   (0.082)   (0.074)
FMX   2,1   -         1.793*    0.865*    0.147*    -0.859*   -0.161    0.027  LR(4) [0.522]  [0.438]  [0.443]
                      (0.014)   (0.011)   (0.075)   (0.070)   (0.094)
GMK   1,1   -         0.361*    0.839*    -         -0.841*   -         0.042  LR(2) [0.252]  [0.399]  [0.562]
                      (0.039)   (0.234)             (0.043)
ASR   1,1   -         1.277*    0.806*    -         -0.809    -         0.021  LR(2) [0.252]  [0.305]  [0.668]
                      (0.027)   (0.122)             (0.439)
IMY   2,2   -         0.642*    0.526*    0.164     -0.535*   0.160     0.048  LR(3) [0.352]  [0.574]  [0.218]
                      (0.030)   (0.121)   (0.113)   (0.109)   (0.114)
MSK   2,2   0.025*    0.582*    0.897*    0.175*    -0.876*   -0.133    0.030  LR(3) [0.571]  [0.327]  [0.525]
            (0.003)   (0.053)   (0.111)   (0.077)   (0.094)   (0.083)
CEL   2,4   -         0.496*    0.611*    0.305*    -0.609*   -0.095    0.041  LR(3) [0.471]  [0.318]  [0.280]
                      (0.038)   (0.071)   (0.066)   (0.123)   (0.132)
RC    1,2   0.211*    0.835*    0.876*    -         -0.950*   -         0.038  LR(1) [0.197]  [0.390]  [0.572]
            (0.012)   (0.104)   (0.031)             (0.043)
SIM   2,1   0.348*    0.514*    0.911*    0.234*    -1.090*   -0.239    0.052  LR(3) [0.458]  [0.795]  [0.489]
            (0.010)   (0.099)   (0.084)   (0.065)   (0.183)   (0.173)
TV    2,1   -         2.289*    0.823*    0.081     -0.824*   -0.078    0.032  LR(4) [0.285]  [0.321]  [0.254]
                      (0.030)   (0.158)   (0.139)   (0.101)   (0.073)
TMM   1,1   -         0.315     0.973*    -         -0.973*   -         0.042  LR(2) [0.628]  [0.244]  [0.145]
                      (0.737)   (0.019)             (0.019)
IBA   2,2   -         1.55*     0.638*    0.296*    -0.641*   -0.278    0.019  LR(4) [0.356]  [0.275]  [0.323]
                      (0.021)   (0.038)   (0.031)   (0.274)   (0.276)
ICM   1,1   -0.053*   0.317*    0.975*    -         -0.846*   -         0.035  LR(1) [0.334]  [0.399]  [0.379]
            (0.014)   (0.094)   (0.010)             (0.039)
TMX   2,2   0.026*    2.853*    0.718*    0.261*    -0.709    -0.131    0.010  LR(3) [0.573]  [0.589]  [0.258]
            (0.001)   (0.018)   (0.065)   (0.057)   (0.456)   (0.173)
TZA   2,5   -         1.387*    0.965*    0.023     -0.978*   0.127     0.031  LR(3) [0.628]  [0.535]  [0.332]
                      (0.013)   (0.042)   (0.042)   (0.149)   (0.135)
VTO   2,2   -0.037*   0.984*    0.771*    0.156*    -0.725*   -0.134*   0.019  LR(4) [0.425]  [0.419]  [0.425]
            (0.005)   (0.093)   (0.092)   (0.058)   (0.065)   (0.048)

Notes: p and d denote the autoregressive order and the number of periods for the delay parameter. The estimates Ψ11, Ψ12, Ψ21, and Ψ22 represent the autoregressive parameters, λ the speed of transition, μ the mean, and S the residual standard error of the models. * denotes significance at the 5% level. Figures reported in square brackets are p-values; numbers in parentheses are heteroskedasticity-consistent standard errors. LR(k) is a likelihood ratio test statistic for k parameter restrictions implicit in the estimated equation against the unrestricted ESTAR model. NL_Max denotes the maximum LM test statistic of no additive nonlinearity with d from 3 to 6. LMNA tests the null hypothesis of no serial correlation in the residuals of order 1, 2, ..., 6.

B. Estimated transition functions

The transition function measures the magnitude of deviations of the price spread from its arbitrage-free level. The estimates of the transition functions are shown in Figure 1 for two selected stocks; they are plotted against the transition variable, $z_t = q_{t-d}$ (Panel A), and against time (Panel B). The estimated transition functions visually support the nonlinear nature of the price spread series and the appropriateness of the ESTAR model, since, in general, observations seem to lie symmetrically above and below parity. Again, we notice a relation between slow convergence and liquidity. For example, in Panel A, for KOF, a firm with good daily volume, a previous day's deviation from parity of the order of plus or minus 2% leaves the transition function at a smaller value (0.5), implying a relatively slow mean reversion, whereas a larger previous day's deviation of around 4% takes the transition function to 1, the regime of full arbitrage, signaling a faster reversion. On the other hand, for TMM, a firm with low daily volume, it takes a 30% spread to bring the transition function to 0.5. In general, most of the transition functions indicate that deviations lower than 5% trigger a full arbitrage regime. For some firms, however, there are few days of full arbitrage (that is, days when the transition function is equal to 1), while for others there are many days of full arbitrage. Again, there seems to be a positive relation between low volume and number of days under the full arbitrage regime.

Figure 1. Estimated transition function for selected firms

Footnote: We included lagged changes in volume in the transition function, but the model did not perform better than our model.

C. Half-lives and convergence to parity

While both the estimated ESTAR models and the transition functions shed light on the nonlinear nature of the reversion of the price spread to parity, more insight into the adjustment mechanism of the models can be gained by estimating the average time it takes for a given shock to die out, also called the speed of convergence to parity. As a measure of the speed of convergence, we calculate the half-life of a shock, defined as the number of periods it takes for shocks to the price spread to dissipate by half.
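For intuition about the half-life definition, the linear benchmark has a closed form. The sketch below assumes a simple AR(1) spread, q_t = ρ q_{t-1} + ε_t, in which a shock decays as ρ^h, so the half-life is ln(0.5)/ln(ρ). This is only an illustration of the concept: it is not the simulation-based calculation used for the nonlinear model, and the function names and the value ρ = 0.9 are hypothetical.

```python
import math


def ar1_half_life(rho):
    """Half-life (in periods) of a shock in an AR(1) process with 0 < rho < 1.

    Solves rho**h = 0.5 for h, giving h = ln(0.5) / ln(rho).
    """
    if not 0.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between 0 and 1")
    return math.log(0.5) / math.log(rho)


def periods_to_halve(rho):
    """Smallest whole number of periods after which a unit shock is at or below 0.5."""
    remaining = 1.0
    periods = 0
    while remaining > 0.5:
        remaining *= rho
        periods += 1
    return periods


# Hypothetical persistence parameter: a daily AR(1) coefficient of 0.9.
print(ar1_half_life(0.9))     # fractional half-life, between 6 and 7 days
print(periods_to_halve(0.9))  # first whole day on which half the shock is gone
```

In a nonlinear model the decay rate depends on the size of the shock and on the history of the series, so no such closed form is available and the half-life has to be read off a simulated impulse response.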
Following Taylor and Peel (2000) and Taylor et al. (2001), we estimate the half-lives of shocks using the generalized impulse response function (GIRF). In a non-linear framework, the half-life is defined as the number of periods it takes the impulse response function to fall below half the initial shock, that is, GIRF < 0.5γ, with $\gamma = \ln(1 + k)$, where k represents the percentage size of the shock. Alternatively, to mitigate differences in the GIRF due to the different variability of the underlying series, shocks can be set as $\gamma = c\,\hat{\sigma}_\varepsilon$, where $\hat{\sigma}_\varepsilon$ denotes the residual standard deviation and c is a scalar. We use this formulation to calculate half-lives. We estimate the half-lives for all price spread series for three sizes of shocks: $1\hat{\sigma}_\varepsilon$, $3\hat{\sigma}_\varepsilon$, and $5\hat{\sigma}_\varepsilon$. For comparison purposes, we also compute half-lives for a linear adjustment.

In the second to fourth columns of Table 5, we report the estimated half-lives for all firms, using the ESTAR model, for three different sizes of shocks. In the last column, we also report the half-life estimates for the standard AR linear adjustment model. All half-life estimates are expressed in business days. From the non-linear estimation, we observe faster adjustments for the majority of firms. The half-life estimates are broadly similar across shock sizes, although a larger shock to the price spread triggers a faster reversion to parity. For the non-linear model, using one residual standard deviation as the shock, the average half-life is 3.1 business days, a reduction of more than half when compared to an average half-life for the linear model of 7.26 business days. That is, we observe for most firms a reduction in the half-life estimates when nonlinearities are incorporated into the arbitrage model. These averages, however, are influenced by a few large observations. The non-linear half-life median is 1.08 business days, also a reduction of more than half when compared to the median half-life for the linear model of 2.29 business days. These nonlinear results are in line with the findings of Gagnon and Karolyi (2003), where the average deviation from price parity can persist for up to five days. Note that for 14 out of 21 firms, using the nonlinear model, it takes less than two days for the ADR-underlying price spread to be reduced by half. The size of the shock to price parity also matters: for 17 firms the half-life is reduced to less than 2.3 days if the shock size is five times the residual standard deviation. Again, these results seem consistent with the discussion in Gagnon and Karolyi (2003), where it is mentioned that although the process of issuance and cancellation of ADRs can take place on the same day, it usually occurs on an overnight basis.

Footnote: Following Koop et al. (1996), the generalized impulse response function is computed using a dynamic stochastic simulation. See also Peel and Venetis (2003) for a similar application to measure the half-lives of real exchange rates.

Footnote: We also compute half-lives for a 1% shock. The results are in line with the results reported in Table 5.

Table 5. Speed of convergence: half-lives

         Nonlinear adjustment (ESTAR) (a)   Linear adjustment (AR) (b)
Firm     1σ̂_ε      3σ̂_ε      5σ̂_ε        q(5)
AMX      0.643     0.588     0.507       0.850
CX       0.516     0.544     0.514       2.318
KOF      0.701     0.634     0.612       0.839
CDG      10.895    10.759    9.661       50.027
DES      2.356     2.267     1.084       12.313
ICA      12.908    12.772    11.674      15.685
FMX      0.542     0.497     0.494       0.505
GMK      1.079     0.968     0.555       0.732
ASR      0.945     0.892     0.712       2.287
IMY      0.986     0.866     0.793       2.057
MSK      1.042     1.045     0.947       1.691
CEL      2.094     2.195     2.183       2.601
RC       3.616     3.527     2.344       7.014
SIM      1.893     1.846     1.008       7.363
TV       0.694     0.664     0.666       0.704
TMM      12.575    12.439    11.341      32.55
IBA      1.995     1.764     1.103       2.344
ICM      7.116     7.027     5.844       7.014
TMX      0.551     0.458     0.475       1.040
TZA      0.606     0.603     0.603       0.792
VTO      1.164     0.976     0.832       1.799
Average  3.10      3.02      2.57        7.26

Notes: All figures are in (business) days. A half-life is defined as the number of periods it takes for shocks to the pricing error to dissipate by a half. In a non-linear framework, it is the number of periods after which the generalized impulse response function falls below half the initial shock, GIRF < 0.5γ. (a) Half-lives for shocks $\delta = i\,\hat{\sigma}_\varepsilon$ ($i = 1, 3, 5$), where $\hat{\sigma}_\varepsilon$ denotes the residual standard deviation. (b) Half-lives computed in a linear framework, using the Augmented Dickey-Fuller (ADF) representation, allowing for a deterministic component which can be a constant, $\mu_0$, or a constant and a time trend, $\mu_0 + \beta t$. The maximum lag length in the ADF specification is set equal to 5 business days. The lag truncation is selected using AIC.

D. Nonparametric tests of association between liquidity and convergence

Some of the high half-life estimates correspond to companies that display very low volume (CDG, ICM, TMM). This finding is similar to the results reported in Rabinovitch et al. (2003), where low volume is associated with higher transaction costs, and in Roll, Schwartz, and Subrahmanyam (2004), where liquidity and lack of arbitrage opportunities are positively related. To formally explore whether popular indicators of a firm's liquidity, such as daily volume, market capitalization, and float, are correlated with a firm's convergence to price parity, a nonparametric Spearman rank correlation test is conducted. The null hypothesis is that a firm's liquidity characteristics are not related to the speed of transition between regimes or the speed of convergence to parity, against the alternative that they are associated. Table 6 shows the ranking of firms' liquidity indicators, while Table 7 shows the Spearman rank correlations. For the non-linear adjustment model, the results indicate that the null hypothesis of no association can be rejected at the 5% level for all liquidity characteristics.
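As a worked check of the rank correlation test, the sketch below recomputes two Spearman coefficients from the firm ranks reported in Table 6. Because the ranks contain no ties, it uses the shortcut formula r_s = 1 - 6 Σd² / (n(n² - 1)); the function and variable names are hypothetical, and the rank lists are transcribed from Table 6 in firm order (AMX, CX, KOF, ..., VTO).

```python
def spearman_from_ranks(rank_x, rank_y):
    """Spearman rank correlation for two untied rank vectors of equal length."""
    n = len(rank_x)
    if n != len(rank_y):
        raise ValueError("rank vectors must have the same length")
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1.0 - 6.0 * sum_d2 / (n * (n * n - 1))


# Ranks transcribed from Table 6, one entry per firm in table order.
speed_of_transition = [4, 2, 8, 6, 1, 12, 7, 19, 11, 15, 16,
                       18, 14, 17, 5, 21, 9, 20, 3, 10, 13]
nonlinear_half_life = [5, 1, 7, 19, 16, 21, 2, 11, 8, 9, 10,
                       15, 17, 13, 6, 20, 14, 18, 3, 4, 12]
average_daily_volume = [2, 4, 8, 16, 13, 6, 7, 20, 11, 17, 15,
                        10, 14, 19, 3, 12, 18, 21, 1, 5, 9]

rho_speed_volume = spearman_from_ranks(speed_of_transition, average_daily_volume)
rho_half_life_volume = spearman_from_ranks(nonlinear_half_life, average_daily_volume)

print(round(rho_speed_volume, 3))      # 0.627
print(round(rho_half_life_volume, 3))  # 0.579
```

Both values agree with the corresponding entries in Table 7 and exceed the 5% critical value of 0.368 quoted in the notes to that table for n = 21.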
The average daily volume, market capitalization, and float are all positively and significantly correlated with the half-life and the speed of transition between regimes calculated using our non-linear estimators. If we consider faster convergence as a sign of higher market liquidity, our non-linear estimates provide a better measure of liquidity than the standard linear estimates. The estimated correlations using the non-linear half-life estimates are substantially higher than the estimated correlations using the linear half-life estimates. For example, the correlations between market capitalization, average daily volume, and float and the non-linear half-life estimates are .83, .58, and .66, respectively, while the correlations between the same liquidity indicators and the linear half-life estimates are .70, .40, and .51, respectively.

Footnote: ICA, the other firm with a high half-life estimate, is seriously affected by a significant change in the premium after December 3, 2003. The average premium changed from 27% to 3%. Besides a significant investment by Mexican investor Carlos Slim, we could not find any information as to why ICA shows such a significant change in premium. ICA's half-life estimates before and after December 3, 2003, are in line with the rest of the firms.

Table 6. Ranks of firms according to market characteristics

Firm  Speed of    Price spread         Price spread      Average daily  Market          Float
      transition  nonlinear half-life  linear half-life  volume         capitalization
AMX   4           5                    6                 2              1               2
CX    2           1                    12                4              3               3
KOF   8           7                    5                 8              6               9
CDG   6           19                   21                16             21              21
DES   1           16                   18                13             10              8
ICA   12          21                   19                6              11              6
FMX   7           2                    1                 7              5               7
GMK   19          11                   3                 20             9               11
ASR   11          8                    11                11             12              13
IMY   15          9                    10                17             8               14
MSK   16          10                   8                 15             14              20
CEL   18          15                   14                10             18              19
RC    14          17                   15                14             20              15
SIM   17          13                   17                19             16              12
TV    5           6                    2                 3              4               4
TMM   21          20                   20                12             17              17
IBA   9           14                   13                18             13              16
ICM   20          18                   16                21             19              18
TMX   3           3                    7                 1              2               1
TZA   10          4                    4                 5              7               5
VTO   13          12                   9                 9              15              10

Table 7. Nonparametric tests of association between firm market characteristics and convergence to parity: Spearman rank correlation coefficient ($r_s$)

                                  Average daily  Price spread         Price spread      Market          Float
                                  volume         nonlinear half-life  linear half-life  capitalization
Speed of transition               0.627*         0.513*               0.223             0.633*          0.651*
Average daily volume                             0.579*               0.404             0.513*          0.777*
Price spread nonlinear half-life                                      0.810*            0.826*          0.655*
Price spread linear half-life                                                           0.702*          0.505*
Market capitalization                                                                                   0.852

Notes: * denotes significance at the 5% level. The Spearman rank statistics indicate that volume, market capitalization, and float are positively and strongly correlated with the price spread half-life and the speed of transition. This implies that the higher the average daily volume, the faster an arbitrage can be executed. This observation remains true for market capitalization and float. Critical values of Spearman's rank correlation coefficient for n = 21 are: 0.368 (5%); 0.438 (2.5%); and 0.521 (1%).

VI. Conclusions

In this paper we study the convergence between the prices of ADRs and Mexican traded shares. We have a sample of 21 dually listed shares, listed in Mexico and in the United States. Since both markets have similar trading hours, standard arbitrage considerations should make persistent deviation from price parity rare. We estimate two different non-linear adjustment models, the LSTAR and ESTAR models, along with a standard linear model, to estimate the convergence of the ADRs and the locally traded shares.
From our estimation results, first, we reject the linear adjustment model; and, second, based on different tests, we select the ESTAR model. Overall, we find that for small deviations from price parity there is no tendency for convergence towards price parity, while for large deviations from price parity there is a full reversion to price parity.

Using the ESTAR model, we are able to estimate the half-life of different shocks to price spreads. We find that price spreads tend to die out quickly in a nonlinear framework. The sample average half-life is 3.1 business days, while the median half-life is 1.08 business days. By allowing non-linear adjustments, the average half-life is reduced by more than 57% when compared to the standard linear model. For 14 out of 21 firms it takes less than 2 days for the ADR-underlying price spread to be reduced by half. Four firms, however, have high half-life estimates (seven days or more) and, in general, correspond to companies that display very low volume, for which arbitrage might be difficult to execute. The results of Spearman correlation tests confirm this finding, as most of the firms' liquidity indicators are positively correlated with the speed of convergence to parity. The size of the shock to price parity also matters: for 17 out of 21 firms the half-life is reduced to less than 2.3 days when the shock size is five times the residual standard deviation.

This work can be easily extended to other markets that have similar trading hours to the U.S., for example, Argentina, Brazil, and Chile. Using these markets, along with Mexico, would allow pooled estimation of half-lives per market and, thus, a comparison of liquidity across these Latin American emerging markets. Data on ADR conversions can also help to understand the dynamics behind the convergence of the ADR-underlying pair.

Appendix

A. Testing and estimation of STAR models

Using equation (4), we can test the null hypothesis of linearity by testing $H_0: \Psi_{2,j} = 0$ for all j against the alternative hypothesis $H_1: \Psi_{2,j} \neq 0$ for at least one $j \in \{0, \ldots, p\}$. However, under the null, the transition function's parameters λ and μ are unidentified. Following Saikkonen and Luukkonen (1988a) and Teräsvirta (1994), a third-order Taylor series expansion of the transition function $\Phi(q_{t-d}; \lambda, \mu)$ around zero is used to overcome non-identification issues. The re-parameterization of equation (4) yields the following artificial regression:

$$q_t = \beta_{00} + \sum_{j=1}^{p} \beta_{0j}\, q_{t-j} + \sum_{j=1}^{p} \beta_{1j}\, q_{t-j}\, q_{t-d} + \sum_{j=1}^{p} \beta_{2j}\, q_{t-j}\, q_{t-d}^{2} + \sum_{j=1}^{p} \beta_{3j}\, q_{t-j}\, q_{t-d}^{3} + v_t \qquad \text{(A1)}$$

where the vectors $\beta_k = (\beta_{k0}, \beta_{k1}, \ldots, \beta_{kp})$, $k = 1, 2, 3$, are functions of the AR coefficient vectors $(\Psi_{i,0}, \Psi_{i,1}, \ldots, \Psi_{i,p})$, $i = 1, 2$, and of the transition function parameters λ and μ. Thus, assuming d is known, the null hypothesis of the linearity test can be written as $H_0: \beta_{1j} = \beta_{2j} = \beta_{3j} = 0$, with $j = 1, 2, \ldots, p$. For large samples, the derived test statistic, NLM3, follows a $\chi^2$ distribution with (p+1) degrees of freedom. We also use the nonlinearity tests developed by Escribano and Jordá (1999), which account for the fourth power of the transition variable. This test tries to overcome the finding that when the variance of the error terms is large, the LSTAR (a nonlinear model) will be wrongly detected by the test more frequently. The underlying auxiliary regression is:

$$q_t = \beta_{00} + \sum_{j=1}^{p} \beta_{0j}\, q_{t-j} + \sum_{j=1}^{p} \beta_{1j}\, q_{t-j}\, q_{t-d} + \sum_{j=1}^{p} \beta_{2j}\, q_{t-j}\, q_{t-d}^{2} + \sum_{j=1}^{p} \beta_{3j}\, q_{t-j}\, q_{t-d}^{3} + \sum_{j=1}^{p} \beta_{4j}\, q_{t-j}\, q_{t-d}^{4} + v_t \qquad \text{(A2)}$$

The null hypothesis of linearity is then $H_0: \beta_{1j} = \beta_{2j} = \beta_{3j} = \beta_{4j} = 0$, with $j = 1, 2, \ldots, p$. The resulting test statistic, denoted NLM4, follows a chi-squared distribution with 4(p+1) degrees of freedom for large samples. The rejection of the null hypothesis indicates the presence of nonlinearity.

B. Model selection: testing ESTAR vs. LSTAR

Once a nonlinear specification is found adequate, the next task is to choose between the ESTAR and the LSTAR models. Teräsvirta (1994) suggests the use of the artificial regression (A1) to perform an LM test of the ESTAR specification against the alternative of the LSTAR specification. In fact, the significance of the cubic terms in equation (A1) will not indicate ESTAR adjustment, because the third-order Taylor expansion of the transition function of an ESTAR model has a quadratic form (U-shape). The cubic terms will rather signal an LSTAR type of adjustment (asymmetry). In other words, the rejection of the null hypothesis $H_{0L}: \beta_{3j} = 0$, with $j = 1, 2, \ldots, p$, leads to the selection of the LSTAR model, whereas the rejection of the null hypothesis $H_{0E}: \beta_{2j} = 0 \mid \beta_{3j} = 0$, with $j = 1, 2, \ldots, p$, leads to the selection of the ESTAR model. The test NLM2 tests $H_{0E}$. Escribano and Jordá (1999) also develop an LM-type test to discriminate between LSTAR and ESTAR using the artificial regression (A2), conditional on prior rejection of linearity. The selection procedure is as follows. Let $LM^{EST}$ denote the F-test of the null hypothesis $H_{0E}: \beta_{2j} = \beta_{4j} = 0$, with $j = 1, 2, \ldots, p$, for ESTAR, and $LM^{LST}$ the F-test of the null hypothesis $H_{0L}: \beta_{1j} = \beta_{3j} = 0$, with $j = 1, 2, \ldots, p$, for LSTAR. The relative strength of the rejection of each hypothesis is then compared. If the minimum p-value corresponds to $LM^{LST}$, LSTAR is selected; if it corresponds instead to $LM^{EST}$, the model selected is ESTAR.

References

Auguste, Sebastián, Kathryn M. E. Dominguez, Herman Kamil and Linda L. Tesar (2006), Cross-border trading as a mechanism for capital flight: ADRs and the Argentine crisis, Journal of Monetary Economics 53: 1259-1295.
Chung, Huimin, Tsung-Wu Ho and Ling-Ju Wei (2005), The dynamic relationship between the prices of ADRs and their underlying stocks: Evidence from the threshold vector error correction model, Applied Economics 37: 2387-2394.
De Jong, Abe, Leonard Rosenthal, and Mathijs van Dijk (2003), The limits of arbitrage: Evidence from dual-listed companies, working paper, Erasmus University.
Dumas, Bernard (1992), Dynamic equilibrium and the real exchange rate in a spatially separated world, Review of Financial Studies 8: 709-742.
Eitrheim, Oyvind and Timo Teräsvirta (1996), Testing the adequacy of smooth transition autoregressive models, Journal of Econometrics 74: 59-75.
Escribano, Alvaro, and Oscar Jordá (1999), Improved testing and specification of smooth transition regression models, in R. Philip, ed., Nonlinear time series analysis of economic and financial data, Boston, MA, Kluwer Academic Publishers.
Eun, Cheol S. and Sanjiv Sabherwal (2003), Cross-border listing and price discovery: Evidence from U.S. listed Canadian stocks, Journal of Finance 58: 549-577.
Froot, Kenneth A., and Emil M. Dabora (1999), How are stock prices affected by the location of trade?, Journal of Financial Economics 53: 189-216.
Gagnon, Louis, and G. Andrew Karolyi (2003), Multi-market trading and arbitrage, working paper, Ohio State University.
Grammig, Joachim, Michael Melvin, and Christian Schlag (2001), Internationally cross-listed stock prices during overlapping trading hours: Price discovery and exchange rate effects, working paper, Arizona State University.
Granger, Clive W. J. and Timo Teräsvirta (1993), Modeling nonlinear economic relationships, Oxford, Oxford University Press.
Hong, Gwangheong and Raul Susmel (2003), Pairs-trading in the Asian ADR market, unpublished manuscript, University of Houston.
Kato, K., S. Linn, and James S. Schallheim (1991), Are there arbitrage opportunities in the market for American depository receipts?, Journal of International Financial Markets, Institutions & Money 1: 73-89.
Koop, Gary, M. Hashem Pesaran, and Simon M. Potter (1996), Impulse response analysis in non-linear multivariate models, Journal of Econometrics 74: 119-147.
LeBaron, Blake (1992), Do moving average trading rule results imply nonlinearities in foreign exchange markets?, working paper, University of Wisconsin-Madison.
Melvin, Michael (2003), A stock market boom during a financial crisis: ADRs and capital outflows in Argentina, Economics Letters 81: 129-136.
Miller, Darius P. and Matthew R. Morey (1996), The intraday pricing behavior of international dually listed securities, Journal of International Financial Markets, Institutions and Money 6: 79-89.
Obstfeld, Maurice and Alan M. Taylor (1997), Nonlinear aspects of goods-market arbitrage and adjustment: Heckscher's commodity points revisited, Journal of the Japanese and International Economies 11: 441-479.
Peel, David A., and Ioannis A. Venetis (2003), Purchasing power parity over two centuries: Trends and non-linearity, Applied Economics 35: 609-617.
Rabinovitch, Ramon, Ana C. Silva and Raul Susmel (2003), Returns on ADRs and arbitrage in emerging markets, Emerging Markets Review 4: 225-328.
Rosenthal, Leonard and Colin Young (1990), The seemingly anomalous price behavior of Royal Dutch/Shell and Unilever N.V./PLC, Journal of Financial Economics 26: 123-41.
Roll, Richard, Eduardo Schwartz, and Avanidhar Subrahmanyam (2004), Liquidity and arbitrage, working paper, UCLA.
Sercu, Piet, Raman Uppal and Cynthia Van Hulle (1995), The exchange rate in the presence of transactions costs: Implications for tests of purchasing power parity, Journal of Finance 50: 1309-1319.
Suarez, E. Dante (2005a), Arbitrage opportunities in the depositary receipts market: Myth or reality?, Journal of International Financial Markets, Institutions and Money 15: 469-480.
Suarez, E. Dante (2005b), Enforcing the law of one price: Nonlinear mean reversion in the ADR market, Managerial Finance 31: 1-17.
Taylor, Nick, Dick van Dijk, Philip H. Franses, and André Lucas (2000), SETS, arbitrage activity, and stock price dynamics, Journal of Banking & Finance 24: 1289-1306.
Taylor, Mark P., and David A. Peel (2000), Nonlinear adjustment, long-run equilibrium and exchange rate fundamentals, Journal of International Money and Finance 19: 33-53.
Teräsvirta, Timo (1994), Specification, estimation, and evaluation of smooth transition autoregressive models, Journal of the American Statistical Association 89: 208-218.
Tong, Howell (1990), Non-linear time series: A dynamical system approach, Oxford, Oxford University Press.
Van Dijk, Dick, Timo Teräsvirta, and Philip H. Franses (2002), Smooth transition autoregressive models: A survey of recent developments, Econometric Reviews 21: 1-47.
Wahab, Mahmoud S., Malek Lashgari, and Richard Cohn (1992), Arbitrage in the American depository receipts market revisited, Journal of International Markets, Institutions and Money 2: 97.
Wooldridge, Jeffrey M. (1990), A unified approach to robust, regression-based specification tests, Econometric Theory 6: 17-43.
Wooldridge, Jeffrey M. (1991), On the application of robust, regression-based diagnostics to models of conditional means and conditional variances, Journal of Econometrics 47: 5-46.
Journal
Journal of Applied Economics
– Taylor & Francis
Published: Nov 1, 2008
Keywords: G14; G15; ADRs; nonlinear convergence; arbitrage; ESTAR