Construction in Serbia, 2000Q1–2024Q4: Part 2 — Dependence, forecasting, and synchronization with GDP
1. Part 2 introduction
Part 1 established the structural anatomy of the series: a strong trend plus durable seasonality. Part 2 asks three applied questions that matter for analysts and policymakers. First, how persistent is construction activity once you look beyond the seasonal pattern? Second, can a standard ARIMA model forecast it credibly at an 8-quarter horizon? Third, does construction move with the GDP cycle in a way that is economically meaningful, or is it a sector that mostly lives by its own timing?
2. Features and autocorrelation
Figure 10 (autocorrelation function) shows high positive autocorrelation at short lags and pronounced seasonal autocorrelation at the seasonal lag of 4 quarters and its multiples, which is the statistical equivalent of saying: this series remembers its recent past, and it especially remembers the same quarter last year. That is consistent with what the seasonal diagnostics already suggested: persistence is not an incidental property here; it is a defining feature of the data-generating process.
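The feature and model tables below carry the naming conventions of R's feasts/fable packages, so an R sketch is a natural companion. As an illustration only (the tsibble name `constr`, its `Quarter` index, and the `value` column are assumptions, not objects from the original analysis), a correlogram in the spirit of Figure 10 could be produced as follows:

```r
library(tsibble)  # quarterly tsibble with a yearquarter index
library(feasts)   # ACF() and its autoplot() method

# `constr` is assumed to be a quarterly tsibble of the construction index
# with index `Quarter` and measurement column `value`.
constr |>
  ACF(value, lag_max = 24) |>  # autocorrelations up to six years of lags
  autoplot()                   # seasonal spikes expected at lags 4, 8, 12, ...
```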

Table 1 provides a battery of numeric “features” that quantify this anatomy. Trend strength is very high (0.977), indicating that a smooth long-run component accounts for most of the systematic variation. Seasonal strength is also very high (0.957), confirming that the within-year pattern is not weak decoration but a major driver. The seasonal peak and trough indicators (seasonal_peak_year = 0, seasonal_trough_year = 1) identify the quarters in which the seasonal component is highest and lowest, consistent with a seasonal pattern anchored in a stable quarterly structure rather than drifting unpredictably. Spikiness is essentially zero (0.000), suggesting the remainder is not dominated by frequent sharp jumps, even if a few larger episodes exist. Linearity (3.407) implies the long-run path is strongly directional, while curvature (0.228) indicates mild curvature: the direction changes over time rather than following a single straight-line growth path.
Table 1. Features of Construction time series
| Feature | Value | Definition |
|---|---|---|
| trend_strength | 0.977 | Strength of trend from STL decomposition (near 1 = strong trend). |
| seasonal_strength_year | 0.957 | Strength of seasonality from STL decomposition (near 1 = strong seasonality). |
| seasonal_peak_year | 0 | Season-of-year (month/quarter) where the seasonal component peaks. |
| seasonal_trough_year | 1 | Season-of-year (month/quarter) where the seasonal component is lowest. |
| spikiness | 0 | Spikiness/abruptness in the STL remainder (implementation-specific summary). |
| linearity | 3.407 | How well a linear trend describes the long-term movement (implementation-specific). |
| curvature | 0.228 | How curved/nonlinear the trend is (implementation-specific). |
| stl_e_acf1 | -0.278 | Lag-1 autocorrelation of the STL remainder. |
| stl_e_acf10 | 0.473 | Sum of squared autocorrelations of the STL remainder over the first 10 lags. |
| acf1 | 0.552 | Lag-1 autocorrelation of the original series. |
| acf10 | 2.464 | Sum of squared autocorrelations of the original series over the first 10 lags. |
| diff1_acf1 | -0.34 | Lag-1 autocorrelation of the first-differenced series. |
| diff1_acf10 | 2.244 | Sum of squared autocorrelations of the first-differenced series over the first 10 lags. |
| diff2_acf1 | -0.532 | Lag-1 autocorrelation of the second-differenced series. |
| diff2_acf10 | 2.729 | Sum of squared autocorrelations of the second-differenced series over the first 10 lags. |
| season_acf1 | 0.861 | Autocorrelation at the first seasonal lag (lag 4 for quarterly data). |
| pacf5 | 1.426 | Sum of squared partial autocorrelations of the original series over the first 5 lags. |
| diff1_pacf5 | 1.323 | Sum of squared partial autocorrelations of the first-differenced series over the first 5 lags. |
| diff2_pacf5 | 1.253 | Sum of squared partial autocorrelations of the second-differenced series over the first 5 lags. |
| season_pacf | 0.787 | Partial autocorrelation at the first seasonal lag. |
| zero_run_mean | 0 | Mean length of consecutive zero runs (zero spells). |
| nonzero_squared_cv | 0.015 | Squared coefficient of variation of the non-zero values. |
| zero_start_prop | 0 | Proportion of leading zeros at the start of the series. |
| zero_end_prop | 0 | Proportion of trailing zeros at the end of the series. |
| lambda_guerrero | 2 | Guerrero-recommended Box-Cox lambda for variance stabilization. |
| kpss_stat | 1.644 | KPSS stationarity test statistic. |
| kpss_pvalue | 0.01 | KPSS test p-value (small => reject stationarity). |
| pp_stat | -4.84 | Phillips-Perron unit-root test statistic. |
| pp_pvalue | 0.01 | Phillips-Perron test p-value (small => reject unit root). |
| ndiffs | 1 | Recommended number of non-seasonal differences to achieve stationarity. |
| nsdiffs | 1 | Recommended number of seasonal differences to remove seasonal unit roots. |
| bp_stat | 30.521 | Box-Pierce portmanteau test statistic. |
| bp_pvalue | 0 | Box-Pierce test p-value (small => autocorrelation remains). |
| lb_stat | 31.446 | Ljung-Box portmanteau test statistic. |
| lb_pvalue | 0 | Ljung-Box test p-value (small => autocorrelation remains). |
| var_tiled_var | 0.092 | Variance of window variances across tiled segments (time-varying volatility). |
| var_tiled_mean | 0.638 | Variance of window means across tiled segments (time-varying level). |
| shift_level_max | 0.377 | Maximum mean shift between adjacent windows. |
| shift_level_index | 2 | Index/location where the maximum mean shift occurs. |
| shift_var_max | 0.208 | Maximum variance shift between adjacent windows. |
| shift_var_index | 21 | Index/location where the maximum variance shift occurs. |
| shift_kl_max | 0.435 | Maximum KL divergence (distribution shift) between adjacent windows. |
| shift_kl_index | 9 | Index/location where the maximum KL shift occurs. |
| spectral_entropy | 0.393 | Entropy of the normalized spectrum (higher => flatter spectrum / weaker dominant periodicity). |
| n_crossing_points | 27 | Number of times the series crosses a reference level (mean/median depending on method). |
| longest_flat_spot | 3 | Length of the longest constant run. |
| coef_hurst | 0.886 | Estimated Hurst exponent (long-range dependence/persistence). |
| stat_arch_lm | 0.641 | ARCH LM test statistic (conditional heteroskedasticity; larger => more ARCH effects). |
The STL remainder autocorrelation summaries (stl_e_acf1 = −0.278 and stl_e_acf10 = 0.473) imply that, once the STL trend and seasonal components are removed, short-run behaviour differs from longer-lag behaviour: lag-1 dependence is mildly negative, while the cumulative 10-lag measure stays well above zero, consistent with some remaining dependence at broader horizons. The raw ACF summaries (acf1 = 0.552 and acf10 = 2.464) indicate meaningful persistence in levels, which is exactly what Figure 10 visualises. After one difference, the first-lag autocorrelation turns negative (diff1_acf1 = −0.340), a common sign that differencing removed a persistent component but may have introduced short-run mean reversion; diff1_acf10 remains high (2.244), indicating that dependence is not fully eliminated at longer lags. After a second difference, diff2_acf1 becomes more negative (−0.532) while diff2_acf10 remains elevated (2.729), which is often what you see when short horizons are “over-differenced” while long-horizon seasonal structure still echoes through.
Seasonal dependence is strong (season_acf1 = 0.861), consistent with the prominent quarterly seasonal spikes. Partial autocorrelation features are also elevated (pacf5 = 1.426; diff1_pacf5 = 1.323; diff2_pacf5 = 1.253; season_pacf = 0.787), reinforcing that the series contains structured dependence not captured by a single-lag story.
Several features confirm that zeros are not an issue here (zero_run_mean = 0.000; zero_start_prop = 0.000; zero_end_prop = 0.000), so the analyst need not worry about intermittency. The nonzero_squared_cv (0.015), the squared coefficient of variation of the non-zero values, indicates low relative dispersion, consistent with a series that varies but is not “explosive” most of the time. The Guerrero lambda (2.000) suggests a power transform could be variance-stabilising, which is a reminder that logs are a convention, not a law, though logs remain useful for interpretability and for multiplicative seasonal structures.
Stationarity tests tell a nuanced story: the KPSS test (kpss_stat = 1.644, kpss_pvalue = 0.010) rejects its null of level stationarity, while the Phillips–Perron test (pp_stat = −4.840, pp_pvalue = 0.010) rejects its null of a unit root. That apparent tension is common in macro data where structural change, strong seasonality, and evolving variance complicate “textbook” stationarity logic. The practical guidance embedded in the feature extraction is clear: one regular difference and one seasonal difference are recommended (ndiffs = 1; nsdiffs = 1). Serial dependence in the raw series is also confirmed by the portmanteau tests: the Box–Pierce statistic is large with a near-zero p-value (bp_stat = 30.521; bp_pvalue = 0.000), and the Ljung–Box result in this features table points the same way (lb_stat = 31.446; lb_pvalue = 0.000). Finally, several change features (shift_level_max, shift_var_max, shift_kl_max with their associated indices) are consistent with the visual impression that some parts of the sample behave differently from others, and the Hurst coefficient (0.886) signals strong persistence and long-memory-like behaviour.
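Because the feature names in Table 1 match those produced by the feasts feature set, the whole dashboard can in principle be regenerated in one call; a minimal sketch under the same assumed `constr` tsibble:

```r
library(fabletools)  # features(), feature_set()
library(feasts)      # supplies the individual feature functions in Table 1

constr |>
  features(value, feature_set(pkgs = "feasts"))
# One row of output containing trend_strength, seasonal_strength_year, acf1,
# kpss_stat, ndiffs, nsdiffs, spectral_entropy, coef_hurst, and the rest.
```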
3. Forecasting construction: ARIMA performance and uncertainty
Figure 11 shows an ARIMA forecast produced on a log-transformed basis, with 80% and 95% prediction intervals that widen with the horizon, as they should. The more demanding test of a forecast is not whether the mean path looks “reasonable,” but whether realised values fall inside the intervals at the expected rate.

Table 2 provides exactly that check for an 8-quarter holdout. For each quarter from 2023Q1 through 2024Q4, the actual value falls inside the 80% interval (and therefore also inside the 95% interval). This is not proof of perfection (eight observations is a small sample), but it is a meaningful sanity check: the model’s uncertainty bands are not absurdly tight, and the realised path is not systematically outside what the model considered plausible.
Table 2. Forecast 8 quarters ahead – ARIMA model
| Date | Actual | Mean | Lower 80% | Upper 80% | Lower 95% | Upper 95% |
|---|---|---|---|---|---|---|
| 2023 Q1 | 68.3 | 63 | 54 | 73 | 50 | 80 |
| 2023 Q2 | 98.3 | 85 | 70 | 103 | 63 | 114 |
| 2023 Q3 | 104.4 | 100 | 80 | 126 | 71 | 141 |
| 2023 Q4 | 121.2 | 117 | 94 | 148 | 83 | 166 |
| 2024 Q1 | 74.8 | 64 | 49 | 82 | 43 | 94 |
| 2024 Q2 | 104.4 | 89 | 68 | 117 | 59 | 135 |
| 2024 Q3 | 108.8 | 106 | 80 | 140 | 68 | 163 |
| 2024 Q4 | 117.3 | 124 | 93 | 166 | 80 | 193 |
Table 3 summarises the estimated model and forecast accuracy. The training sample is 92 quarters with an 8-quarter holdout, and the model is reported as ARIMA with a small residual variance (sigma2 = 0.014) and a Ljung–Box test on the residuals that does not reject the null of no remaining autocorrelation (lb_pvalue = 0.708). In plain language, the residual diagnostics are consistent with “no obvious leftover structure,” which is what you want from a baseline ARIMA.
Table 3. ARIMA model parameters accuracy
| Series | Construction | Group | Description |
|---|---|---|---|
| freq_in | 4 | metadata | Input sampling frequency/seasonal period m used by the script (12 monthly, 4 quarterly). |
| n_train | 92 | metadata | Number of observations used to estimate models (training sample size). |
| n_holdout | 8 | metadata | Number of observations held out for forecast evaluation (holdout/test sample size). |
| holdout_h | 8 quarters | metadata | Human-readable forecast horizon label (e.g., ‘24 months’ or ‘8 quarters’). |
| .model | arima | metadata | Model identifier (e.g., snaive, naive, ets, arima). |
| sigma2 | 0.014 | fit_stats | Estimated innovation/residual variance (model noise level). |
| log_lik | 65.343 | fit_stats | Log-likelihood of the fitted model at estimated parameters. |
| AIC | -114.686 | fit_stats | Akaike Information Criterion (smaller is better). |
| AICc | -112.863 | fit_stats | Small-sample corrected AIC (smaller is better). |
| BIC | -94.867 | fit_stats | Bayesian Information Criterion (smaller is better). |
| lb_stat | 2.945 | diagnostics | Ljung-Box Q statistic testing residual autocorrelation up to selected lag K. |
| lb_pvalue | 0.708 | diagnostics | p-value for Ljung-Box test (small values suggest residual autocorrelation remains). |
| .type | Test | metadata | Accuracy table type indicator (typically ‘Test’ for holdout evaluation). |
| ME | 0.073 | accuracy | Mean Error on holdout: average(y – forecast); signed bias. |
| RMSE | 0.102 | accuracy | Root Mean Squared Error on holdout. |
| MAE | 0.087 | accuracy | Mean Absolute Error on holdout. |
| MPE | 1.625 | accuracy | Mean Percentage Error on holdout (signed). |
| MAPE | 1.928 | accuracy | Mean Absolute Percentage Error on holdout. |
| ACF1 | 0.123 | diagnostics | Lag-1 autocorrelation of residuals (quick diagnostic for remaining serial correlation). |
| estimate__ar1 | 0.807 | parameters | Estimated parameter value for term ‘ar1’. |
| std.error__ar1 | 0.115 | parameters | Standard error of the estimate for term ‘ar1’. |
| statistic__ar1 | 7.043 | parameters | Test statistic for H0: term ‘ar1’ equals 0 (typically estimate/SE). |
| p.value__ar1 | 0 | parameters | p-value associated with the test statistic for term ‘ar1’. |
| estimate__ma1 | -0.01 | parameters | Estimated parameter value for term ‘ma1’. |
| std.error__ma1 | 0.133 | parameters | Standard error of the estimate for term ‘ma1’. |
| statistic__ma1 | -0.073 | parameters | Test statistic for H0: term ‘ma1’ equals 0 (typically estimate/SE). |
| p.value__ma1 | 0.942 | parameters | p-value associated with the test statistic for term ‘ma1’. |
| estimate__ma2 | 0.138 | parameters | Estimated parameter value for term ‘ma2’. |
| std.error__ma2 | 0.136 | parameters | Standard error of the estimate for term ‘ma2’. |
| statistic__ma2 | 1.02 | parameters | Test statistic for H0: term ‘ma2’ equals 0 (typically estimate/SE). |
| p.value__ma2 | 0.31 | parameters | p-value associated with the test statistic for term ‘ma2’. |
| estimate__ma3 | -0.396 | parameters | Estimated parameter value for term ‘ma3’. |
| std.error__ma3 | 0.142 | parameters | Standard error of the estimate for term ‘ma3’. |
| statistic__ma3 | -2.798 | parameters | Test statistic for H0: term ‘ma3’ equals 0 (typically estimate/SE). |
| p.value__ma3 | 0.006 | parameters | p-value associated with the test statistic for term ‘ma3’. |
| estimate__ma4 | 0.404 | parameters | Estimated parameter value for term ‘ma4’. |
| std.error__ma4 | 0.106 | parameters | Standard error of the estimate for term ‘ma4’. |
| statistic__ma4 | 3.816 | parameters | Test statistic for H0: term ‘ma4’ equals 0 (typically estimate/SE). |
| p.value__ma4 | 0 | parameters | p-value associated with the test statistic for term ‘ma4’. |
| estimate__sma1 | -0.815 | parameters | Estimated parameter value for term ‘sma1’. |
| std.error__sma1 | 0.078 | parameters | Standard error of the estimate for term ‘sma1’. |
| statistic__sma1 | -10.427 | parameters | Test statistic for H0: term ‘sma1’ equals 0 (typically estimate/SE). |
| p.value__sma1 | 0 | parameters | p-value associated with the test statistic for term ‘sma1’. |
| estimate__constant | 0.009 | parameters | Estimated parameter value for term ‘constant’. |
| std.error__constant | 0.003 | parameters | Standard error of the estimate for term ‘constant’. |
| statistic__constant | 2.99 | parameters | Test statistic for H0: term ‘constant’ equals 0 (typically estimate/SE). |
| p.value__constant | 0.004 | parameters | p-value associated with the test statistic for term ‘constant’. |
Forecast accuracy measures are also reported (ME, RMSE, MAE, MPE, MAPE). In context, a MAPE of about 1.9% indicates that typical percentage errors over the holdout were small in magnitude. Equally important, the residual ACF1 is low (0.123), consistent with the residual plots in Figure 12 looking broadly well-behaved.
Parameter-by-parameter, the AR(1) term is strongly significant (estimate 0.807 with p-value 0.000), which fits the persistence story. Several moving-average terms are not significant (MA1 and MA2), while others are (MA3 and MA4), and the seasonal MA(1) is strongly significant (estimate −0.815, p-value 0.000), which is exactly what you would expect in a quarterly series with strong seasonal dependence. The constant term is also significant at conventional levels. The right way to read this is not “every coefficient is sacred,” but rather: the model captures persistence and seasonal structure in a statistically coherent way, and the residual checks do not raise red flags.

Figure 12 (residual diagnostics) supports that interpretation visually: residuals fluctuate around zero with a few larger episodes; the residual ACF bars mostly sit within bounds; and the histogram looks broadly unimodal rather than pathological. Put together, the evidence suggests the ARIMA is a defensible short-horizon forecasting device for this series, provided users remember what ARIMA is and is not: it extrapolates patterns embedded in the past (including seasonality and persistence), but it cannot “know” future policy shifts, project pipelines, or financing regime changes unless those shifts are already visible in the data.
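As a hedged sketch of the workflow described in this section (train on 92 quarters, hold out the last 8, fit an automatically selected ARIMA to the log series, then check accuracy and residuals), again using the assumed `constr` tsibble; the lag choice in the Ljung–Box check is illustrative, not taken from the source:

```r
library(tsibble)
library(dplyr)
library(fable)       # ARIMA()
library(fabletools)  # model(), forecast(), accuracy(), features()
library(feasts)      # gg_tsresiduals(), ljung_box

train <- constr |> filter(Quarter <= yearquarter("2022 Q4"))  # 92 quarters: 2000 Q1 - 2022 Q4

fit <- train |> model(arima = ARIMA(log(value)))  # automatic order selection
fc  <- fit |> forecast(h = "8 quarters")          # 2023 Q1 through 2024 Q4

fc  |> accuracy(constr)   # holdout ME, RMSE, MAE, MPE, MAPE as in Table 3
fit |> gg_tsresiduals()   # residual time plot, ACF, histogram (cf. Figure 12)

augment(fit) |>
  features(.innov, ljung_box, lag = 8)  # Ljung-Box check on the innovations
```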
4. Construction and GDP: Do their cycles move together?
Figure 13 compares HP-filtered cycle components of construction and GDP. Two features stand out. First, construction’s cycle is visibly more volatile than GDP’s cycle. That is economically intuitive: construction is a smaller, more project-driven sector, so its deviations from trend can be larger even when the broader economy moves modestly. Second, despite that volatility, there are stretches where the two cycles share the same sign and rough timing, suggesting that construction is not purely idiosyncratic; it does participate in the broader cycle, even if it amplifies it.
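A minimal sketch of how HP-filter cycles like those in Figure 13 can be extracted. The mFilter package and the standard quarterly smoothing parameter (lambda = 1600) are assumptions here, not documented choices of the original analysis:

```r
library(mFilter)

# `constr_ts` and `gdp_ts` are assumed quarterly `ts` objects (e.g., logs of
# the construction index and of GDP) covering the same sample.
constr_hp <- hpfilter(constr_ts, freq = 1600, type = "lambda")
gdp_hp    <- hpfilter(gdp_ts,    freq = 1600, type = "lambda")

constr_cycle <- constr_hp$cycle  # deviations from the HP trend
gdp_cycle    <- gdp_hp$cycle
```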

Table 4 quantifies synchronisation using a concordance index. The concordance index is 0.61 with a p-value of 0.04, indicating that the two series are in the same “phase” more often than not, and that this alignment is unlikely to be pure coincidence under the test used. A 0.61 concordance is not “lockstep,” but it is also not trivial: it fits the visual impression that co-movement exists, but it is partial and occasionally interrupted.
Table 4. Concordance index
| Series1 | Series2 | N | N11 | N00 | Nc | C_index | p_value |
|---|---|---|---|---|---|---|---|
| Construction | GDP | 100 | 27 | 34 | 61 | 0.61 | 0.04 |

Figure 14 uses cross-correlation to probe lead–lag structure. The strongest positive correlation is at lag 0, and nearby lags (notably 1 and 2) are also positive and visibly above the significance threshold marked in the plot. The macro reading is that the construction and GDP cycles are primarily contemporaneous, with some evidence that the relationship extends a couple of quarters in one direction. Because cross-correlation lead–lag interpretation depends on the plotting convention (which series is “x” and which is “y”), the safe statement, supported directly by the figure, is: the link is strongest within a window of about 0–2 quarters, and it fades beyond that. In practical terms, this means construction can be treated as a near-coincident cyclical indicator for GDP, not a long-horizon leading indicator.
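The cross-correlation in Figure 14 can be sketched with base R's `ccf()`, reusing the hypothetical cycle vectors from the HP-filter sketch above; since which series leads at positive lags depends on the argument order, axes should be labelled explicitly when reporting:

```r
# Correlations concentrated around lag 0 and fading within 2-3 quarters would
# match the reading in the text above.
ccf(constr_cycle, gdp_cycle, lag.max = 8,
    main = "Cross-correlation: construction cycle vs GDP cycle")
```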
5. Economic outlook
Forecasts should not be mistaken for prophecies, but the combination of evidence here does justify a disciplined “near-term narrative.” The construction trend is high and has remained elevated into the end of the sample, and seasonality remains strong and predictable. The ARIMA holdout performance suggests that, absent a structural break, the quarterly path over the next couple of years is likely to continue exhibiting strong seasonal swings around a relatively high baseline.
The more economically interesting uncertainty is not statistical but structural. When a sector’s level shifts upward, the key question becomes whether the shift is backed by durable drivers (housing demand, sustained public infrastructure programs, improved financing depth) or whether it is vulnerable to a change in policy stance, external financing conditions, or project timing effects. The cycle evidence implies construction is connected to GDP but not perfectly synchronized, which is both a risk and a buffer: construction can weaken even when GDP is stable (project pauses), but it can also remain resilient when GDP softens (public works smoothing). A balanced near-term view is therefore: construction in Serbia looks structurally strong in level and trend, with predictable seasonality and short-horizon forecastability, but its cyclical volatility means it may amplify macro surprises rather than merely reflect them.
6. Methodological appendix (Part 2): Dependence features and ARIMA forecasting
Table 1’s features are best understood as a compact “diagnostic dashboard.” Trend_strength and seasonal_strength quantify how much of the series is explainable by smooth long-run movement and stable within-year repetition, respectively. Autocorrelation (ACF) and partial autocorrelation (PACF) features measure persistence and help motivate why ARIMA models, built explicitly around lagged dependence, are a natural baseline. Stationarity tests (KPSS, PP) and differencing recommendations (ndiffs, nsdiffs) indicate whether modelling should occur in levels, differences, or seasonally differenced form to avoid spurious dynamics. Structural break features and shift indices warn that “one model for the entire sample” may be an approximation if the economy moved between regimes.
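For the stationarity and differencing checks specifically, features of this kind can be requested on their own with feasts' unit-root feature functions (a sketch, assuming the same `constr` tsibble as above):

```r
library(fabletools)
library(feasts)

constr |>
  features(value, list(unitroot_kpss, unitroot_pp, unitroot_ndiffs, unitroot_nsdiffs))
# Returns kpss_stat / kpss_pvalue, pp_stat / pp_pvalue, ndiffs and nsdiffs,
# matching the corresponding rows of Table 1.
```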
The ARIMA forecasting strategy used here explicitly separates estimation from evaluation by holding out the last 8 quarters. The model is estimated on the training sample, then forecasts are generated for the holdout horizon, producing not just point forecasts but prediction intervals. Forecast accuracy is assessed using standard error measures (ME, RMSE, MAE, MAPE) alongside residual checks. Figure 12 complements this with visual diagnostics: residual time plots (looking for drift or clustering), residual ACF (checking for leftover dependence), and residual distribution shape (checking for extreme non-normality). Together with the Ljung–Box p-value reported in Table 3, these checks aim to establish that the ARIMA has captured the main time dependence well enough to be useful for short-horizon projection, while acknowledging that any future regime change would still break purely statistical extrapolation.
Concordance index (cycle synchronization): Definition, counts, and formula
To quantify how often Construction and GDP are in the same phase of the cycle, we use the concordance index introduced in the business-cycle context by Harding and Pagan (2002).
The starting point is to convert each cyclical series into a binary phase indicator for each quarter $t$. A common operational choice (consistent with the HP-filter cycle plots) is:

$$
S_{x,t} =
\begin{cases}
1, & \text{if series } x \text{ is in expansion at } t \text{ (cycle} > 0\text{)}, \\
0, & \text{if it is in contraction (cycle} < 0\text{)},
\end{cases}
$$

and similarly $S_{y,t}$ for series $y$.

We then count how many quarters the two series share the same phase:

- $N_{11}$: number of quarters where both are in expansion, $N_{11} = \sum_{t} S_{x,t} S_{y,t}$;
- $N_{00}$: number of quarters where both are in contraction, $N_{00} = \sum_{t} (1 - S_{x,t})(1 - S_{y,t})$;
- $N$: the total number of comparable quarters used in the calculation (typically the full sample after aligning the two series and excluding any missing endpoints).

With these counts (the concordant count $N_c = N_{11} + N_{00}$ reported in Table 4), the concordance index is simply the share of time the two series are in the same phase:

$$
C = \frac{N_{11} + N_{00}}{N}.
$$

Equivalently, using the binary indicators directly:

$$
C = \frac{1}{N} \sum_{t=1}^{N} \left[ S_{x,t} S_{y,t} + (1 - S_{x,t})(1 - S_{y,t}) \right].
$$

Interpretation is intuitive:

- $C = 1$ means the cycles are always in the same phase;
- $C = 0$ means they are never in the same phase;
- $C \approx 0.5$ corresponds to “no systematic phase alignment” in a rough practical sense (though formal inference uses a dedicated test).
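A minimal sketch of the concordance calculation under these definitions, reusing the hypothetical HP-filter cycle vectors from Section 4 (the inference behind the reported p-value is not reproduced here):

```r
# Binary phase indicators: 1 = expansion (cycle above zero), 0 = contraction.
s_x <- as.integer(constr_cycle > 0)
s_y <- as.integer(gdp_cycle > 0)

n11 <- sum(s_x == 1 & s_y == 1)  # both in expansion
n00 <- sum(s_x == 0 & s_y == 0)  # both in contraction
n   <- length(s_x)               # comparable quarters after alignment

c_index <- (n11 + n00) / n       # share of quarters spent in the same phase
```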
Reference
Harding, D., & Pagan, A. R. (2002). Dissecting the cycle: A methodological investigation. Journal of Monetary Economics, 49(2), 365–381.
