How to deal with missing observations in surveys of professional forecasters
Bürgi, Constantin Rudolf Salomo
2023-12-31 00:00:00
JOURNAL OF APPLIED ECONOMICS 2023, VOL. 26, NO. 1, 2185975
https://doi.org/10.1080/15140326.2023.2185975
RESEARCH ARTICLE
Constantin Rudolf Salomo Bürgi
School of Economics, University College Dublin, Dublin 4, Ireland
ARTICLE HISTORY: Received 30 March 2022; Accepted 22 February 2023
KEYWORDS: Gap; entry; exit; imputation

ABSTRACT
Survey forecasts are prone to entry and exit of forecasters as well as forecasters not contributing every period, leading to gaps. These gaps make it difficult to compare individual forecasters to each other and raise the question of how to deal with the missing observations. This is addressed for the variables GDP, CPI inflation, and unemployment for the US. The theoretically optimal method of filling in missing observations is derived and compared to several competing methods. It is found that not filling in missing observations and taking the previous value do not perform particularly well. For the other methods assessed, there is no clear superior approach for all use cases, but the theoretically optimal one usually performs quite well.

1. Introduction

There is an extensive literature on the use of survey data of professional forecasters and on how to compare and combine individual contributors of these surveys, going back to at least Bates and Granger (1969). One difficulty when working with survey forecasts is the non-response of survey contributors, including the extensive entry and exit of individuals. For example, over the entire 210 quarter history (Q4 1968-Q1 2021) of the Survey of Professional Forecasters (SPF) conducted by the Philadelphia Fed, there are almost 450 individual contributors, and each has contributed 19 forecasts on average. While entry and exit play a large role over such a long period, non-response is also an issue when looking at a shorter time period.
For example, for the four surveys in 2020, there were 49 contributors with a total of 151 forecasts, implying that forecasters on average did not contribute a forecast for around 25% of that year. Moreover, the missing observations are not fully random and can influence the analysis. For example, Bürgi (2017) has shown that the common finding that a majority of individual forecasters appear biased can mainly be attributed to the gaps in the survey, rather than to asymmetric loss functions or sub-optimal forecasting behavior. Due to the potential impact of missing observations on inferences, it is instrumental to address how to best deal with the gaps in forecasting surveys.

CONTACT Constantin Rudolf Salomo Bürgi constantin.burgi@ucd.ie School of Economics, University College Dublin, Belfield, Dublin 4, Ireland
© 2023 The Author(s). Published by Informa UK Limited, trading as Taylor & Francis Group. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

While there is an extensive literature on how to fill in missing observations in general survey data (e.g., see Andridge and Little (2010) or Little and Rubin (2019) for a review), forecasting surveys have some key features that many other surveys do not share. For example, the forecasters are asked to predict the same event across multiple surveys, only a few forecasters are included in the surveys, and a large share of responses is missing. These features can limit options but also open up new avenues to handle non-response and have led to a multitude of methods. For example, one can use a previous response regarding the same event to fill in a missing observation, which is not possible in surveys where people are not asked repeatedly about the same event.
Indeed, under quite general assumptions, the optimal method of filling in missing observations is derived. It is shown that the method proposed in Genre et al. (2013) coincides with this optimal method, and it is compared with the different methods suggested in the literature. These methods for surveys of professional forecasters can be put into four groups: not filling in missing observations, replacing missing observations with the simple average, filling missing observations with a function of previous predictions of the same event made by the same forecaster, and filling gaps with the predictions made by a similar forecaster.

The results presented here also have important implications for methods used in the context of surveys of professional forecasters that are robust against missing observations. Examples of these methods include Mack and Skillings (1980), D’agostino et al. (2012), or Bürgi and Sinclair (2017). While these methods do not explicitly fill in missing observations, they make specific assumptions about the missing observations. These assumptions can be linked to a specific method of explicitly imputing the missing observations, and by applying that method one obtains the same result as by using the robust method directly. For example, the method in Bürgi and Sinclair (2017) implicitly assumes that the forecast performance for missing observations is similar to that for the non-missing observations and would be equivalent to leaving gaps explicitly missing. If it were found that there are superior methods to explicitly fill in missing observations than the one implied by the assumptions of the robust method, it might be worthwhile to first fill in the missing observations using the superior method before applying the robust method. This would also make it easier to compare methods that are robust against missing observations to ones that are not.
In order to provide results that are broadly applicable, the analysis is done for GDP growth, unemployment, and CPI inflation at the quarterly frequency and for various horizons in the Bloomberg survey, and for GDP in the Wall Street Journal survey, from 2002 to 2015.

The remainder of the paper is structured as follows: the next section describes the theoretically optimal approach, followed by the alternative approaches in Section 3. Section 4 runs the simulations, followed by the application to a forecast combination problem. The final section concludes.

[Footnote: Mack and Skillings (1980) implicitly replace missing values with the median, and the robust measure in D’agostino et al. (2012) would also leave gaps missing, as it assumes that missing observations behave the same way as non-missing observations.]

2. Optimal signal extraction

Explicitly filling in the missing observations is typically done in two steps, and the second step is the one of interest here. First, survey participants that made only a few predictions are excluded from the analysis. In the extreme case, only those participants that participated in every survey round are kept in the sample, as in Issler and Lima (2009), which already solves the data gaps issue. However, this extreme measure dramatically reduces the sample, both in the number of forecasters included and in the time period over which forecasts can be assessed. As the missing observations are not necessarily random (e.g., see Bürgi (2017)), this can cause a sample selection bias. Because of this, the extreme case is not often pursued in the literature. Instead, a second step is added where models are used to replace the (few) remaining missing observations. This second step is of main interest here.

In order to obtain a theoretically optimal way to fill in the missing observations, assume that each period, forecasters receive (forward looking) noisy signals about each future date (event) of the underlying variable as in Bürgi (2020).
These signals might mainly include the data releases of the variable of interest as well as those of related variables. They could also include forward looking information like announcements by policymakers which only affect the underlying variable at specific future dates (e.g., a tax change typically only has a transitory effect on inflation at the effective date). The optimal prediction for a specific event in period t then becomes the weighted average of all signals received prior to t and the current signal received in period t. This is a flexible generalization of the standard Kalman filter setup with three key advantages: it does not require any assumptions about the data generating process, it has horizon-specific signals, and there is no need to assume that the underlying variable is unobservable. In this setup, a forecast is made up by

$$\hat{y}_{i,t,t-h} = \alpha \hat{y}_{i,t,t-h-1} + (1-\alpha)\, x_{i,t,t-h} \tag{1}$$

where $\hat{y}_{i,t,t-h}$ is the prediction for variable $y$ in period $t$ made by individual $i$ in period $t-h$. It is the (optimally) weighted average between $x_{i,t,t-h} = y_t + \nu_{i,t,t-h}$, the forward looking signal with noise $\nu_{i,t,t-h} \sim N(0, \sigma^2_{\nu_{t,t-h}})$, and the previous prediction $\hat{y}_{i,t,t-h-1}$ (which in turn is a weighted average of signals). Assuming $\varepsilon_{i,t,t-h}$ is the prediction error made when predicting with $\hat{y}_{i,t,t-h}$, so that $\varepsilon_{i,t,t-h-1}$ is the error of the previous prediction, the optimal $\alpha$ becomes

$$\alpha = \frac{\sigma^2_{\nu_{t,t-h}}}{\sigma^2_{\nu_{t,t-h}} + \sigma^2_{\varepsilon_{i,t,t-h-1}}}.$$

Unfortunately, $x_{i,t,t-h}$ is unobservable and hence the missing predictions cannot be directly constructed. However, if $\nu_{i,t,t-h}$ has a common component across forecasters (e.g., $\nu_{i,t,t-h} = \mu_{i,t,t-h} + \eta_{t,t-h}$), one can use this information to get an estimate for $x_{i,t,t-h}$. Specifically, the simple average can be used to estimate $x_{i,t,t-h}$. Equation 1 can be reformulated for the simple average as

$$\bar{y}_{t,t-h} = \alpha \bar{y}_{t,t-h-1} + (1-\alpha)\, x_{t,t-h} \tag{2}$$

assuming the same weighting is optimal for the aggregate and the individual level.
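As a quick numerical check of the inverse-variance weight, the following sketch compares the closed-form $\alpha$ with a grid search over the error variance of the combined prediction. It assumes that the error of the previous prediction and the signal noise are independent; the function names and the variance values are illustrative and do not come from the paper.

```python
import numpy as np

def optimal_alpha(var_nu, var_eps):
    """Weight on the previous prediction: var_nu / (var_nu + var_eps)."""
    return var_nu / (var_nu + var_eps)

def combined_variance(alpha, var_nu, var_eps):
    """Error variance of alpha * previous prediction + (1 - alpha) * signal,
    assuming the two errors are independent (an assumption of this sketch)."""
    return alpha**2 * var_eps + (1 - alpha)**2 * var_nu

# Illustrative variances (hypothetical values, not estimates from the paper).
var_nu, var_eps = 0.5, 1.5
alpha_star = optimal_alpha(var_nu, var_eps)

# Grid search over candidate weights as an independent check.
grid = np.linspace(0.0, 1.0, 1001)
alpha_grid = grid[np.argmin(combined_variance(grid, var_nu, var_eps))]
print(alpha_star, alpha_grid)
```

The noisier the new signal is relative to the previous prediction, the closer the weight on the previous prediction moves to one.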
Replacing the individual signal with the aggregate signal in equation 1 and subtracting equation 2, this can be rearranged to

$$\hat{y}_{i,t,t-h} - \bar{y}_{t,t-h} = \alpha\left(\hat{y}_{i,t,t-h-1} - \bar{y}_{t,t-h-1}\right) \tag{3}$$

as the forecasted prediction.

[Footnote: If the $\alpha$s for the individual and the aggregate are different, one would estimate a less restricted regression of $(\hat{y}_{i,t,t-h} - \bar{y}_{t,t-h})$ with several coefficients ($\beta_1$, $\beta_2$, ...) and error term $e_{i,t,t-h}$ instead of the expression in Genre et al. (2013); the exact equation is not recoverable from this copy. However, this method performs worse based on simulations than if the same $\alpha$s are assumed. This suggests that the $\alpha$s are relatively close to each other. These results are available upon request.]

If the signal noise for individual $i$ is assumed to be of the form $\nu_{i,t,t-h} = \mu_{i,t,t-h} + \eta_{t,t-h}$ with individual component variances $\sigma^2_{\mu_{t,t-h}}$, the prediction error made with this approach is $(1-\alpha)\mu_{i,t,t-h}$. As $\mu_{i,t,t-h}$ is assumed to be unobserved white noise and $\alpha$ is the optimal (inverse variance) weight, one cannot improve upon this prediction.

This approach has been used in the literature previously but without any theoretical foundations. Specifically, it was introduced by Genre et al. (2013) and also applied by Kenny et al. (2015b), Kenny et al. (2015a) and Diebold and Shin (2019), who estimate the equation

$$\hat{y}_{i,t,t-h} - \bar{y}_{t,t-h} = \beta\left(\hat{y}_{i,t,t-h-1} - \bar{y}_{t,t-h-1}\right) + \varepsilon_{i,t,t-h} \tag{4}$$

This approach is also closely related to the Kalman filter (e.g., as described in Ghysels and Wright (2009) or Grishchenko et al. (2019)). Specifically, one could assume that for each horizon, the (unobservable) true difference between the individual prediction and the simple average follows an AR(1) process and one observes an iid signal of this difference. The optimal prediction then becomes the weighted average between the previous prediction of the state $(\hat{y}_{i,t,t-h-1} - \bar{y}_{t,t-h-1})$ and the signal, resulting in the new optimal prediction $(\hat{y}_{i,t,t-h} - \bar{y}_{t,t-h})$.
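The regression-based fill in equations 3 and 4 can be sketched as follows. This is a minimal illustration, not the author's implementation: it assumes a single $\beta$ pooled across forecasters and events, computes the simple average from the respondents that are available, and uses a hypothetical function name and toy data.

```python
import numpy as np

def impute_from_previous(f_now, f_prev):
    """Fill gaps in f_now with mean_now + beta * (f_prev - mean_prev).

    f_now, f_prev: (events, forecasters) arrays holding forecasts of the same
    events from the current and the previous survey round; NaN marks a gap.
    """
    mean_now = np.nanmean(f_now, axis=1, keepdims=True)
    mean_prev = np.nanmean(f_prev, axis=1, keepdims=True)
    dev_now, dev_prev = f_now - mean_now, f_prev - mean_prev

    # Pooled no-intercept OLS of the current on the previous deviation.
    ok = ~np.isnan(dev_now) & ~np.isnan(dev_prev)
    beta = np.sum(dev_now[ok] * dev_prev[ok]) / np.sum(dev_prev[ok] ** 2)

    # Only gaps with an available previous prediction can be filled.
    gap = np.isnan(f_now) & ~np.isnan(f_prev)
    filled = np.where(gap, mean_now + beta * dev_prev, f_now)
    return filled, beta

# Toy data (hypothetical): two events, three forecasters, one gap.
f_prev = np.array([[1.0, 2.0, 3.0], [2.0, 4.0, 6.0]])
f_now = np.array([[9.5, np.nan, 10.5], [19.0, 20.0, 21.0]])
filled, beta = impute_from_previous(f_now, f_prev)
```

A gap at the very start of the sample has no previous prediction and therefore stays missing in this sketch.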
While this way of motivating the regression leads to the same estimation equation, the assumptions are a bit more stringent: an AR(1) process is assumed, the true state is assumed to be unobservable, and a separate model is assumed for each horizon. [Footnote: Without this last assumption, a more standard noisy information model would result in longer horizons being just an autocorrelation coefficient times the next shorter horizon (e.g., see Coibion and Gorodnichenko (2015)).]

This approach can also be applied to consumer surveys where participants are repeatedly asked about a fixed horizon forecast for a serially correlated variable (e.g., inflation). In this case, the signal reflects the change from one event to another and the weight on the previous prediction is directly related to the serial correlation of the underlying variable. However, the approach might not be optimal anymore as the variables predicted are not necessarily the same for all participants.

[Footnote on the optimal prediction: If $\mu_{i,t,t-h}$ was correlated with $\mu_{j,t,t-h}$ for two forecasters $i$ and $j$, one might be able to improve upon this prediction. However, this would require a two step approach where one needs to find the most correlated forecaster and then estimate the regression. As shown below with the covariance approach, this does not often produce better predictions.]

3. Alternative approaches

In order to assess the performance of the theoretically optimal approach, it is compared to six other individual approaches proposed in the literature as well as the average across all methods.

The first approach to handle missing observations is to leave them missing (e.g., see Capistrán and Timmermann (2009) or Bürgi and Sinclair (2017) for examples). This approach assumes that replacing the missing observations does more harm than good.

The second approach replaces the missing observations with the simple average across all forecasters (e.g., see Capistrán and Timmermann (2009) or Lahiri et al. (2017) for examples). This approach is the only approach discussed here which always fills all missing values. All other approaches might leave some observations missing unless the sample is restricted in a specific way. Due to this property, the simple average is used as a fallback option in a robustness check.

The third approach uses prior predictions for the same event made by the same forecaster to fill in missing observations (e.g., see Poncela et al. (2011) or Conflitti et al. (2015)). For example, a forecaster might contribute to a survey in the first quarter of 2005 but not in the second quarter of the same year. If forecasts for the fourth quarter of 2005 are made in both surveys, one could use the first quarter survey predictions to replace the missing observation. This approach cannot replace missing observations at the beginning of the sample and implicitly assumes that forecasts remain unchanged. The assumption of an unchanged forecast might not necessarily be adequate even in monthly surveys (see Sheng and Wallen (2014), Andrade and Le Bihan (2013), or Bürgi (2020)) and ignores the reduction in uncertainty over time. This is also why the theoretically optimal approach derived above takes into account changes in the simple average to fill in missing observations.

Approaches four and five are variations of this approach. In particular, Lahiri et al. (2017) and Zhao (2020) proposed an approach that can deal with multi-period gaps. They estimate

$$\hat{y}_{i,t,t-h} - \bar{y}_{t,t-h} = \beta_i \frac{1}{4}\sum_{j=1}^{4}\left(\hat{y}_{i,t-(j-1),t-h-j} - \bar{y}_{t-(j-1),t-h-j}\right) + \varepsilon_{i,t,t-h} \tag{5}$$

That is, the past four deviations from the simple average by forecaster $i$ are averaged, and the current deviation is then regressed on this average.
While this approach can handle gaps that are larger than one period, it averages negative and positive deviations, causing a trade-off. As with the previous approach, the estimation uses predictions pooled across horizons.

The fifth approach aims to more explicitly model the signal that agents use to update. Even under the assumption that forecasters base their forecasts on similar data, they might weight different data differently. For example, two forecasters might react differently to a higher than expected payroll release, even if they both watch it closely. In order to capture this, a mixed data sampling (MIDAS) approach is utilized where data surprises from Scotti (2016) are added to the optimal model. Specifically,

$$\hat{y}_{i,t,t-h} - \bar{y}_{t,t-h} = \beta_i\left(\hat{y}_{i,t,t-h-1} - \bar{y}_{t,t-h-1}\right) + \gamma_{i,1}\,\mathrm{SurpM1} + \gamma_{i,2}\,\mathrm{SurpM2} + \varepsilon_{i,t,t-h} \tag{6}$$

is estimated, where SurpM1 is the value of the Scotti (2016) surprise index at the end of the first month of the quarter and SurpM2 the value at the end of the second month of the quarter. A related approach has been used by Ghysels and Wright (2009), where they used daily market returns in a MIDAS regression to increase the survey frequency to daily.

The sixth and last individual approach considered follows the idea in Andridge and Little (2010) that one can replace missing observations with the predictions of a similar forecaster. The specific approach taken here is to replace missing observations of forecaster $i$ with the ones made by the forecaster that has the highest correlation across horizons. Specifically, the forecasts made by the forecaster $j$ whose predictions satisfy

$$\max_{j} \sum_{k=1}^{K} \operatorname{cor}\left(\hat{y}_{i,k}, \hat{y}_{j,k}\right) \tag{7}$$

where $\hat{y}_{i,k}$ are the predictions made by forecaster $i$ with horizon $k$ and $K$ is the number of horizons. As with the previous approaches, using multiple horizons in the model improves the prediction. This approach does not have the restriction for the first period in the sample but leaves random observations missing.
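The donor rule in equation 7 can be sketched as below. This is an illustration under stated assumptions rather than the paper's code: the array layout, the function names, and the `min_overlap` safeguard are hypothetical, and correlations are computed only over dates where both forecasters responded.

```python
import numpy as np

def best_donor(forecasts, i, min_overlap=3):
    """Return the forecaster j != i with the largest sum of correlations.

    forecasts: (horizons, forecasters, dates) array, NaN marks a gap.
    min_overlap is a hypothetical safeguard, not taken from the paper.
    """
    n_horizons, n_forecasters, _ = forecasts.shape
    best_j, best_score = None, -np.inf
    for j in range(n_forecasters):
        if j == i:
            continue
        score, used = 0.0, 0
        for k in range(n_horizons):
            a, b = forecasts[k, i], forecasts[k, j]
            ok = ~np.isnan(a) & ~np.isnan(b)
            if ok.sum() < min_overlap or np.std(a[ok]) == 0 or np.std(b[ok]) == 0:
                continue  # correlation undefined or too little overlap
            score += np.corrcoef(a[ok], b[ok])[0, 1]
            used += 1
        if used and score > best_score:
            best_j, best_score = j, score
    return best_j

def fill_from_donor(forecasts, i):
    """Copy the donor's predictions into forecaster i's gaps."""
    j = best_donor(forecasts, i)
    filled = forecasts.copy()
    if j is not None:
        gap = np.isnan(forecasts[:, i])
        # If the donor is missing too, the cell stays NaN.
        filled[:, i] = np.where(gap, forecasts[:, j], forecasts[:, i])
    return filled, j
```

Cells in which the donor has no prediction either remain NaN after the fill.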
The reason for this is that the most highly correlated forecaster might miss some of the same predictions as the forecaster whose missing predictions are to be replaced.

In addition to these six alternative approaches plus the theoretically optimal one, the simple average across the filled in values is also considered. This average equals the simple average approach if no other approach filled in a specific value, and otherwise is the average of up to six values (one of the seven approaches is to leave the value blank).

4. Simulation

In order to assess how the above approaches compare in filling the missing observations, a simulation is run. The sample over which the performance is assessed consists of the quarterly forecasts made in March, June, September, and December, starting in December 2002 and ending in March 2015; a total of 50 observations. The 2002 start date is the first date when the WSJ survey moved from a semi-annual survey to a monthly survey. The quarterly Bloomberg data is from Bürgi (2017) and ends in March 2015. Due to entry and exit, longer samples are not necessarily better than shorter samples as forecasters become less likely to overlap. In addition, any forecaster with fewer than 16 predictions (equivalent to 4 years) is not included in the simulation. This leaves 75 individual forecasters in the Bloomberg survey and 65 individual forecasters in the WSJ survey.

In order to be able to compare the filled in values to actual values, it is necessary to randomly replace values with missing observations. To this end, the simulation with 100 replications replaces the actual predictions with missing observations for 10% of the date-forecaster pairs. This format of randomly replacing values mimics that forecasters either contribute forecasts for all horizons in a survey round or do not participate. Once the forecasts have been replaced with missing observations, the above methods are used to fill in these missing observations.
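The masking step of one replication can be sketched as follows. The array layout and the function name are assumptions made for illustration; the sketch only reproduces the rule described above, namely that a removed date-forecaster pair loses its forecasts for all horizons.

```python
import numpy as np

def mask_pairs(forecasts, share=0.10, rng=None):
    """Blank out a random share of the observed date-forecaster pairs.

    forecasts: (dates, forecasters, horizons) array, NaN marks a gap.
    Returns the masked copy and a boolean (dates, forecasters) array that
    flags the pairs removed in this replication.
    """
    rng = np.random.default_rng() if rng is None else rng
    # A pair counts as observed if at least one horizon has a forecast.
    observed = ~np.isnan(forecasts).all(axis=2)
    idx = np.argwhere(observed)
    n_drop = int(round(share * len(idx)))
    drop = idx[rng.choice(len(idx), size=n_drop, replace=False)]

    masked = forecasts.copy()
    masked[drop[:, 0], drop[:, 1], :] = np.nan  # all horizons go missing
    removed = np.zeros(observed.shape, dtype=bool)
    removed[drop[:, 0], drop[:, 1]] = True
    return masked, removed
```

Repeating the call with 100 different seeds and storing `removed` each time keeps the actual values available for the later comparison with the filled in values.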
As mentioned above, some of the approaches will leave some observations missing. To keep the results comparable, the missing observations are left missing for one set of simulation results and replaced with the simple average for another set of results. While it is possible to fill in missing values in an iterative approach, this is not chosen for two reasons. First, even under the iterative approach, the regression-based approaches might still leave some observations missing. Second, each iteration will fill the missing observations with inferior predictions, and the goal is to compare the approaches using a best case scenario.

[Footnote: Another approach, used in Steira (2012), proposes to make a linear projection of the quarterly forecast path to fill in missing observations at the longest horizon. Since the main objective is to fill in missing observations due to non-participation in a specific period, this approach is not suitable here.]

[Footnote on the 10% replacement share: The results are similar when 1% or 5% of observations are replaced with missing values. Replacing 20% of the observations causes issues, as some forecasters might randomly not have any observations anymore.]

In order to compare and assess the approaches, a total of six metrics are calculated for the current quarter, one-quarter ahead and two-quarter ahead predictions. [Footnote: While three quarter ahead predictions are also available, these are only used for the approaches that use the previous prediction.] These six measures can be grouped into (root) mean squared measures that strongly penalize large deviations and absolute measures that penalize large deviations to a lesser extent. In order to keep the results compact, mainly the (root) mean squared results are reported in the main text and the absolute results are shown in the appendix.

The first measures compare the predicted value to the actual value.
These are the root mean squared difference (RMSD) and the mean absolute difference (MAD) between the filled values and the actual values. That is,

$$\mathrm{RMSD} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(A_i - F_i\right)^2} \qquad \mathrm{MAD} = \frac{1}{n}\sum_{i=1}^{n}\left|A_i - F_i\right| \tag{8}$$

where $A_i$ is the actual value in the survey and $F_i$ is the value filled in by the different approaches. For the RMSD, only observations that were filled in are included.

The second set of measures compares the pairwise correlation matrix for both the actual and the filled values across all forecasters. They are the root mean squared correlation difference (RMSCD) and the mean absolute correlation difference (MACD). That is RMSCD = (cor(A ; A)