Construction in Serbia, 2000Q1–2024Q4: Part 2 — Dependence, forecasting, and synchronization with GDP
1. Part 2 introduction
Part 1 established the structural anatomy of the series: a strong trend plus durable seasonality. Part 2 asks three applied questions that matter for analysts and policymakers. First, how persistent is construction activity once you look beyond the seasonal pattern? Second, can a standard ARIMA model forecast it credibly at an 8-quarter horizon? Third, does construction move with the GDP cycle in a way that is economically meaningful, or is it a sector that mostly lives by its own timing?
2. Features and autocorrelation
Figure 10 (autocorrelation function) shows high positive autocorrelation at short lags and pronounced seasonal autocorrelation at the seasonal lag of 4 quarters and its multiples, which is the statistical equivalent of saying: this series remembers its recent past, and it especially remembers the same quarter last year. That is consistent with what the seasonal diagnostics already suggested: persistence is not an incidental property here; it is a defining feature of the data-generating process.
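The feature and model tables below carry the naming conventions of R's feasts/fable packages, so an R sketch is a natural companion. As an illustration only (the tsibble name `constr`, its `Quarter` index, and the `value` column are assumptions, not objects from the original analysis), a correlogram in the spirit of Figure 10 could be produced as follows:

```r
library(tsibble)  # quarterly tsibble with a yearquarter index
library(feasts)   # ACF() and its autoplot() method

# `constr` is assumed to be a quarterly tsibble of the construction index
# with index `Quarter` and measurement column `value`.
constr |>
  ACF(value, lag_max = 24) |>  # autocorrelations up to six years of lags
  autoplot()                   # seasonal spikes expected at lags 4, 8, 12, ...
```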

Table 1 provides a battery of numeric “features” that quantify this anatomy. Trend strength is very high (0.977), indicating that a smooth long-run component accounts for most of the systematic variation. Seasonal strength is also very high (0.957), confirming that the within-year pattern is not weak decoration but a major driver. The seasonal peak and trough indicators (seasonal_peak_year = 0, seasonal_trough_year = 1) identify the quarters in which the seasonal component is highest and lowest, consistent with a seasonal pattern anchored in a stable quarterly structure rather than drifting unpredictably. Spikiness is essentially zero (0.000), suggesting the remainder is not dominated by frequent sharp jumps, even if a few larger episodes exist. Linearity (3.407) implies the long-run path is strongly directional, while curvature (0.228) indicates mild curvature: the direction changes over time rather than following a single straight-line growth path.
Table 1. Features of Construction time series
| Feature | Value | Definition |
|---|---|---|
| trend_strength | 0.977 | Strength of trend from STL decomposition (near 1 = strong trend). |
| seasonal_strength_year | 0.957 | Strength of seasonality from STL decomposition (near 1 = strong seasonality). |
| seasonal_peak_year | 0 | Season-of-year (month/quarter) where the seasonal component peaks. |
| seasonal_trough_year | 1 | Season-of-year (month/quarter) where the seasonal component is lowest. |
| spikiness | 0 | Spikiness/abruptness in the STL remainder (implementation-specific summary). |
| linearity | 3.407 | How well a linear trend describes the long-term movement (implementation-specific). |
| curvature | 0.228 | How curved/nonlinear the trend is (implementation-specific). |
| stl_e_acf1 | -0.278 | Lag-1 autocorrelation of the STL remainder. |
| stl_e_acf10 | 0.473 | Sum of squared autocorrelations of the STL remainder over the first 10 lags. |
| acf1 | 0.552 | Lag-1 autocorrelation of the original series. |
| acf10 | 2.464 | Sum of squared autocorrelations of the original series over the first 10 lags. |
| diff1_acf1 | -0.34 | Lag-1 autocorrelation of the first-differenced series. |
| diff1_acf10 | 2.244 | Sum of squared autocorrelations of the first-differenced series over the first 10 lags. |
| diff2_acf1 | -0.532 | Lag-1 autocorrelation of the second-differenced series. |
| diff2_acf10 | 2.729 | Sum of squared autocorrelations of the second-differenced series over the first 10 lags. |
| season_acf1 | 0.861 | Autocorrelation at the first seasonal lag (lag 4 for quarterly data). |
| pacf5 | 1.426 | Sum of squared partial autocorrelations of the original series over the first 5 lags. |
| diff1_pacf5 | 1.323 | Sum of squared partial autocorrelations of the first-differenced series over the first 5 lags. |
| diff2_pacf5 | 1.253 | Sum of squared partial autocorrelations of the second-differenced series over the first 5 lags. |
| season_pacf | 0.787 | Partial autocorrelation at the first seasonal lag. |
| zero_run_mean | 0 | Mean length of consecutive zero runs (zero spells). |
| nonzero_squared_cv | 0.015 | Squared coefficient of variation of the non-zero values. |
| zero_start_prop | 0 | Proportion of leading zeros at the start of the series. |
| zero_end_prop | 0 | Proportion of trailing zeros at the end of the series. |
| lambda_guerrero | 2 | Guerrero-recommended Box-Cox lambda for variance stabilization. |
| kpss_stat | 1.644 | KPSS stationarity test statistic. |
| kpss_pvalue | 0.01 | KPSS test p-value (small => reject stationarity). |
| pp_stat | -4.84 | Phillips-Perron unit-root test statistic. |
| pp_pvalue | 0.01 | Phillips-Perron test p-value (small => reject unit root). |
| ndiffs | 1 | Recommended number of non-seasonal differences to achieve stationarity. |
| nsdiffs | 1 | Recommended number of seasonal differences to remove seasonal unit roots. |
| bp_stat | 30.521 | Box-Pierce portmanteau test statistic. |
| bp_pvalue | 0 | Box-Pierce test p-value (small => autocorrelation remains). |
| lb_stat | 31.446 | Ljung-Box portmanteau test statistic. |
| lb_pvalue | 0 | Ljung-Box test p-value (small => autocorrelation remains). |
| var_tiled_var | 0.092 | Variance of window variances across tiled segments (time-varying volatility). |
| var_tiled_mean | 0.638 | Variance of window means across tiled segments (time-varying level). |
| shift_level_max | 0.377 | Maximum mean shift between adjacent windows. |
| shift_level_index | 2 | Index/location where the maximum mean shift occurs. |
| shift_var_max | 0.208 | Maximum variance shift between adjacent windows. |
| shift_var_index | 21 | Index/location where the maximum variance shift occurs. |
| shift_kl_max | 0.435 | Maximum KL divergence (distribution shift) between adjacent windows. |
| shift_kl_index | 9 | Index/location where the maximum KL shift occurs. |
| spectral_entropy | 0.393 | Entropy of the normalized spectrum (higher => flatter spectrum / weaker dominant periodicity). |
| n_crossing_points | 27 | Number of times the series crosses a reference level (mean/median depending on method). |
| longest_flat_spot | 3 | Length of the longest constant run. |
| coef_hurst | 0.886 | Estimated Hurst exponent (long-range dependence/persistence). |
| stat_arch_lm | 0.641 | ARCH LM test statistic (conditional heteroskedasticity; larger => more ARCH effects). |
The STL remainder autocorrelation summaries (stl_e_acf1 = −0.278 and stl_e_acf10 = 0.473) imply that, once the STL trend and seasonal components are removed, short-run behaviour differs from longer-lag behaviour: lag-1 dependence is mildly negative, while the cumulative 10-lag measure stays well above zero, consistent with some remaining dependence at broader horizons. The raw ACF summaries (acf1 = 0.552 and acf10 = 2.464) indicate meaningful persistence in levels, which is exactly what Figure 10 visualises. After one difference, the first-lag autocorrelation turns negative (diff1_acf1 = −0.340), a common sign that differencing removed a persistent component but may have introduced short-run mean reversion; diff1_acf10 remains high (2.244), indicating that dependence is not fully eliminated at longer lags. After a second difference, diff2_acf1 becomes more negative (−0.532) while diff2_acf10 remains elevated (2.729), which is often what you see when short horizons are “over-differenced” while long-horizon seasonal structure still echoes through.
Seasonal dependence is strong (season_acf1 = 0.861), consistent with the prominent quarterly seasonal spikes. Partial autocorrelation features are also elevated (pacf5 = 1.426; diff1_pacf5 = 1.323; diff2_pacf5 = 1.253; season_pacf = 0.787), reinforcing that the series contains structured dependence not captured by a single-lag story.
Several features confirm that zeros are not an issue here (zero_run_mean = 0.000; zero_start_prop = 0.000; zero_end_prop = 0.000), so the analyst need not worry about intermittency. The nonzero_squared_cv (0.015), the squared coefficient of variation of the non-zero values, indicates low relative dispersion, consistent with a series that varies but is not “explosive” most of the time. The Guerrero lambda (2.000) suggests a power transform could be variance-stabilising, which is a reminder that logs are a convention, not a law, though logs remain useful for interpretability and for multiplicative seasonal structures.
Stationarity tests tell a nuanced story: the KPSS test (kpss_stat = 1.644, kpss_pvalue = 0.010) rejects its null of level stationarity, while the Phillips–Perron test (pp_stat = −4.840, pp_pvalue = 0.010) rejects its null of a unit root. That apparent tension is common in macro data where structural change, strong seasonality, and evolving variance complicate “textbook” stationarity logic. The practical guidance embedded in the feature extraction is clear: one regular difference and one seasonal difference are recommended (ndiffs = 1; nsdiffs = 1). Serial dependence in the raw series is also confirmed by the portmanteau tests: the Box–Pierce statistic is large with a near-zero p-value (bp_stat = 30.521; bp_pvalue = 0.000), and the Ljung–Box result in this features table points the same way (lb_stat = 31.446; lb_pvalue = 0.000). Finally, several change features (shift_level_max, shift_var_max, shift_kl_max with their associated indices) are consistent with the visual impression that some parts of the sample behave differently from others, and the Hurst coefficient (0.886) signals strong persistence and long-memory-like behaviour.
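Because the feature names in Table 1 match those produced by the feasts feature set, the whole dashboard can in principle be regenerated in one call; a minimal sketch under the same assumed `constr` tsibble:

```r
library(fabletools)  # features(), feature_set()
library(feasts)      # supplies the individual feature functions in Table 1

constr |>
  features(value, feature_set(pkgs = "feasts"))
# One row of output containing trend_strength, seasonal_strength_year, acf1,
# kpss_stat, ndiffs, nsdiffs, spectral_entropy, coef_hurst, and the rest.
```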
3. Forecasting construction: ARIMA performance and uncertainty
Figure 11 shows an ARIMA forecast produced on a log-transformed basis, with 80% and 95% prediction intervals that widen with the horizon, as they should. The more demanding test of a forecast is not whether the mean path looks “reasonable,” but whether realised values fall inside the intervals at the expected rate.

Table 2 provides exactly that check for an 8-quarter holdout. For each quarter from 2023Q1 through 2024Q4, the actual value falls inside the 80% interval (and therefore also inside the 95% interval). This is not proof of perfection (eight observations is a small sample), but it is a meaningful sanity check: the model’s uncertainty bands are not absurdly tight, and the realised path is not systematically outside what the model considered plausible.
Table 2. Forecast 8 quarters ahead – ARIMA model
| Date | Actual | Mean | Lower 80% | Upper 80% | Lower 95% | Upper 95% |
|---|---|---|---|---|---|---|
| 2023 Q1 | 68.3 | 63 | 54 | 73 | 50 | 80 |
| 2023 Q2 | 98.3 | 85 | 70 | 103 | 63 | 114 |
| 2023 Q3 | 104.4 | 100 | 80 | 126 | 71 | 141 |
| 2023 Q4 | 121.2 | 117 | 94 | 148 | 83 | 166 |
| 2024 Q1 | 74.8 | 64 | 49 | 82 | 43 | 94 |
| 2024 Q2 | 104.4 | 89 | 68 | 117 | 59 | 135 |
| 2024 Q3 | 108.8 | 106 | 80 | 140 | 68 | 163 |
| 2024 Q4 | 117.3 | 124 | 93 | 166 | 80 | 193 |
Table 3 summarises the estimated model and forecast accuracy. The training sample is 92 quarters with an 8-quarter holdout, and the model is reported as ARIMA with a small residual variance (sigma2 = 0.014) and a Ljung–Box test on the residuals that does not reject the null of no remaining autocorrelation (lb_pvalue = 0.708). In plain language, the residual diagnostics are consistent with “no obvious leftover structure,” which is what you want from a baseline ARIMA.
Table 3. ARIMA model parameters accuracy
| Series | Construction | Group | Description |
|---|---|---|---|
| freq_in | 4 | metadata | Input sampling frequency/seasonal period m used by the script (12 monthly, 4 quarterly). |
| n_train | 92 | metadata | Number of observations used to estimate models (training sample size). |
| n_holdout | 8 | metadata | Number of observations held out for forecast evaluation (holdout/test sample size). |
| holdout_h | 8 quarters | metadata | Human-readable forecast horizon label (e.g., ‘24 months’ or ‘8 quarters’). |
| .model | arima | metadata | Model identifier (e.g., snaive, naive, ets, arima). |
| sigma2 | 0.014 | fit_stats | Estimated innovation/residual variance (model noise level). |
| log_lik | 65.343 | fit_stats | Log-likelihood of the fitted model at estimated parameters. |
| AIC | -114.686 | fit_stats | Akaike Information Criterion (smaller is better). |
| AICc | -112.863 | fit_stats | Small-sample corrected AIC (smaller is better). |
| BIC | -94.867 | fit_stats | Bayesian Information Criterion (smaller is better). |
| lb_stat | 2.945 | diagnostics | Ljung-Box Q statistic testing residual autocorrelation up to selected lag K. |
| lb_pvalue | 0.708 | diagnostics | p-value for Ljung-Box test (small values suggest residual autocorrelation remains). |
| .type | Test | metadata | Accuracy table type indicator (typically ‘Test’ for holdout evaluation). |
| ME | 0.073 | accuracy | Mean Error on holdout: average(y – forecast); signed bias. |
| RMSE | 0.102 | accuracy | Root Mean Squared Error on holdout. |
| MAE | 0.087 | accuracy | Mean Absolute Error on holdout. |
| MPE | 1.625 | accuracy | Mean Percentage Error on holdout (signed). |
| MAPE | 1.928 | accuracy | Mean Absolute Percentage Error on holdout. |
| ACF1 | 0.123 | diagnostics | Lag-1 autocorrelation of residuals (quick diagnostic for remaining serial correlation). |
| estimate__ar1 | 0.807 | parameters | Estimated parameter value for term ‘ar1’. |
| std.error__ar1 | 0.115 | parameters | Standard error of the estimate for term ‘ar1’. |
| statistic__ar1 | 7.043 | parameters | Test statistic for H0: term ‘ar1’ equals 0 (typically estimate/SE). |
| p.value__ar1 | 0 | parameters | p-value associated with the test statistic for term ‘ar1’. |
| estimate__ma1 | -0.01 | parameters | Estimated parameter value for term ‘ma1’. |
| std.error__ma1 | 0.133 | parameters | Standard error of the estimate for term ‘ma1’. |
| statistic__ma1 | -0.073 | parameters | Test statistic for H0: term ‘ma1’ equals 0 (typically estimate/SE). |
| p.value__ma1 | 0.942 | parameters | p-value associated with the test statistic for term ‘ma1’. |
| estimate__ma2 | 0.138 | parameters | Estimated parameter value for term ‘ma2’. |
| std.error__ma2 | 0.136 | parameters | Standard error of the estimate for term ‘ma2’. |
| statistic__ma2 | 1.02 | parameters | Test statistic for H0: term ‘ma2’ equals 0 (typically estimate/SE). |
| p.value__ma2 | 0.31 | parameters | p-value associated with the test statistic for term ‘ma2’. |
| estimate__ma3 | -0.396 | parameters | Estimated parameter value for term ‘ma3’. |
| std.error__ma3 | 0.142 | parameters | Standard error of the estimate for term ‘ma3’. |
| statistic__ma3 | -2.798 | parameters | Test statistic for H0: term ‘ma3’ equals 0 (typically estimate/SE). |
| p.value__ma3 | 0.006 | parameters | p-value associated with the test statistic for term ‘ma3’. |
| estimate__ma4 | 0.404 | parameters | Estimated parameter value for term ‘ma4’. |
| std.error__ma4 | 0.106 | parameters | Standard error of the estimate for term ‘ma4’. |
| statistic__ma4 | 3.816 | parameters | Test statistic for H0: term ‘ma4’ equals 0 (typically estimate/SE). |
| p.value__ma4 | 0 | parameters | p-value associated with the test statistic for term ‘ma4’. |
| estimate__sma1 | -0.815 | parameters | Estimated parameter value for term ‘sma1’. |
| std.error__sma1 | 0.078 | parameters | Standard error of the estimate for term ‘sma1’. |
| statistic__sma1 | -10.427 | parameters | Test statistic for H0: term ‘sma1’ equals 0 (typically estimate/SE). |
| p.value__sma1 | 0 | parameters | p-value associated with the test statistic for term ‘sma1’. |
| estimate__constant | 0.009 | parameters | Estimated parameter value for term ‘constant’. |
| std.error__constant | 0.003 | parameters | Standard error of the estimate for term ‘constant’. |
| statistic__constant | 2.99 | parameters | Test statistic for H0: term ‘constant’ equals 0 (typically estimate/SE). |
| p.value__constant | 0.004 | parameters | p-value associated with the test statistic for term ‘constant’. |
Forecast accuracy measures are also reported (ME, RMSE, MAE, MPE, MAPE). In context, a MAPE of about 1.9% indicates that typical percentage errors over the holdout were small in magnitude. Equally important, the residual ACF1 is low (0.123), consistent with the residual plots in Figure 12 looking broadly well-behaved.
Parameter-by-parameter, the AR(1) term is strongly significant (estimate 0.807 with p-value 0.000), which fits the persistence story. Several moving-average terms are not significant (MA1 and MA2), while others are (MA3 and MA4), and the seasonal MA(1) is strongly significant (estimate −0.815, p-value 0.000), which is exactly what you would expect in a quarterly series with strong seasonal dependence. The constant term is also significant at conventional levels. The right way to read this is not “every coefficient is sacred,” but rather: the model captures persistence and seasonal structure in a statistically coherent way, and the residual checks do not raise red flags.

Figure 12 (residual diagnostics) supports that interpretation visually: residuals fluctuate around zero with a few larger episodes; the residual ACF bars mostly sit within bounds; and the histogram looks broadly unimodal rather than pathological. Put together, the evidence suggests the ARIMA is a defensible short-horizon forecasting device for this series, provided users remember what ARIMA is and is not: it extrapolates patterns embedded in the past (including seasonality and persistence), but it cannot “know” future policy shifts, project pipelines, or financing regime changes unless those shifts are already visible in the data.
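As a hedged sketch of the workflow described in this section (train on 92 quarters, hold out the last 8, fit an automatically selected ARIMA to the log series, then check accuracy and residuals), again using the assumed `constr` tsibble; the lag choice in the Ljung–Box check is illustrative, not taken from the source:

```r
library(tsibble)
library(dplyr)
library(fable)       # ARIMA()
library(fabletools)  # model(), forecast(), accuracy(), features()
library(feasts)      # gg_tsresiduals(), ljung_box

train <- constr |> filter(Quarter <= yearquarter("2022 Q4"))  # 92 quarters: 2000 Q1 - 2022 Q4

fit <- train |> model(arima = ARIMA(log(value)))  # automatic order selection
fc  <- fit |> forecast(h = "8 quarters")          # 2023 Q1 through 2024 Q4

fc  |> accuracy(constr)   # holdout ME, RMSE, MAE, MPE, MAPE as in Table 3
fit |> gg_tsresiduals()   # residual time plot, ACF, histogram (cf. Figure 12)

augment(fit) |>
  features(.innov, ljung_box, lag = 8)  # Ljung-Box check on the innovations
```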
4. Construction and GDP: Do their cycles move together?
Figure 13 compares HP-filtered cycle components of construction and GDP. Two features stand out. First, construction’s cycle is visibly more volatile than GDP’s cycle. That is economically intuitive: construction is a smaller, more project-driven sector, so its deviations from trend can be larger even when the broader economy moves modestly. Second, despite that volatility, there are stretches where the two cycles share the same sign and rough timing, suggesting that construction is not purely idiosyncratic; it does participate in the broader cycle, even if it amplifies it.
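A minimal sketch of how HP-filter cycles like those in Figure 13 can be extracted. The mFilter package and the standard quarterly smoothing parameter (lambda = 1600) are assumptions here, not documented choices of the original analysis:

```r
library(mFilter)

# `constr_ts` and `gdp_ts` are assumed quarterly `ts` objects (e.g., logs of
# the construction index and of GDP) covering the same sample.
constr_hp <- hpfilter(constr_ts, freq = 1600, type = "lambda")
gdp_hp    <- hpfilter(gdp_ts,    freq = 1600, type = "lambda")

constr_cycle <- constr_hp$cycle  # deviations from the HP trend
gdp_cycle    <- gdp_hp$cycle
```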

Table 4 quantifies synchronisation using a concordance index. The concordance index is 0.61 with a p-value of 0.04, indicating that the two series are in the same “phase” more often than not, and that this alignment is unlikely to be pure coincidence under the test used. A 0.61 concordance is not “lockstep,” but it is also not trivial: it fits the visual impression that co-movement exists, but it is partial and occasionally interrupted.
Table 4. Concordance index
| Series1 | Series2 | N | N11 | N00 | Nc | C_index | p_value |
|---|---|---|---|---|---|---|---|
| Construction | GDP | 100 | 27 | 34 | 61 | 0.61 | 0.04 |

Figure 14 uses cross-correlation to probe lead–lag structure. The strongest positive correlation is at lag 0, and nearby lags (notably 1 and 2) are also positive and visibly above the significance threshold marked in the plot. The macro reading is that the construction and GDP cycles are primarily contemporaneous, with some evidence that the relationship extends a couple of quarters in one direction. Because cross-correlation lead–lag interpretation depends on the plotting convention (which series is “x” and which is “y”), the safe statement, supported directly by the figure, is: the link is strongest within a window of about 0–2 quarters, and it fades beyond that. In practical terms, this means construction can be treated as a near-coincident cyclical indicator for GDP, not a long-horizon leading indicator.
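The cross-correlation in Figure 14 can be sketched with base R's `ccf()`, reusing the hypothetical cycle vectors from the HP-filter sketch above; since which series leads at positive lags depends on the argument order, axes should be labelled explicitly when reporting:

```r
# Correlations concentrated around lag 0 and fading within 2-3 quarters would
# match the reading in the text above.
ccf(constr_cycle, gdp_cycle, lag.max = 8,
    main = "Cross-correlation: construction cycle vs GDP cycle")
```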
5. Economic outlook
Forecasts should not be mistaken for prophecies, but the combination of evidence here does justify a disciplined “near-term narrative.” The construction trend is high and has remained elevated into the end of the sample, and seasonality remains strong and predictable. The ARIMA holdout performance suggests that, absent a structural break, the quarterly path over the next couple of years is likely to continue exhibiting strong seasonal swings around a relatively high baseline.
The more economically interesting uncertainty is not statistical but structural. When a sector’s level shifts upward, the key question becomes whether the shift is backed by durable drivers (housing demand, sustained public infrastructure programs, improved financing depth) or whether it is vulnerable to a change in policy stance, external financing conditions, or project timing effects. The cycle evidence implies construction is connected to GDP but not perfectly synchronized, which is both a risk and a buffer: construction can weaken even when GDP is stable (project pauses), but it can also remain resilient when GDP softens (public works smoothing). A balanced near-term view is therefore: construction in Serbia looks structurally strong in level and trend, with predictable seasonality and short-horizon forecastability, but its cyclical volatility means it may amplify macro surprises rather than merely reflect them.
6. Methodological appendix (Part 2): Dependence features and ARIMA forecasting
Table 1’s features are best understood as a compact “diagnostic dashboard.” Trend_strength and seasonal_strength quantify how much of the series is explainable by smooth long-run movement and stable within-year repetition, respectively. Autocorrelation (ACF) and partial autocorrelation (PACF) features measure persistence and help motivate why ARIMA models, built explicitly around lagged dependence, are a natural baseline. Stationarity tests (KPSS, PP) and differencing recommendations (ndiffs, nsdiffs) indicate whether modelling should occur in levels, differences, or seasonally differenced form to avoid spurious dynamics. Structural break features and shift indices warn that “one model for the entire sample” may be an approximation if the economy moved between regimes.
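For the stationarity and differencing checks specifically, features of this kind can be requested on their own with feasts' unit-root feature functions (a sketch, assuming the same `constr` tsibble as above):

```r
library(fabletools)
library(feasts)

constr |>
  features(value, list(unitroot_kpss, unitroot_pp, unitroot_ndiffs, unitroot_nsdiffs))
# Returns kpss_stat / kpss_pvalue, pp_stat / pp_pvalue, ndiffs and nsdiffs,
# matching the corresponding rows of Table 1.
```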
The ARIMA forecasting strategy used here explicitly separates estimation from evaluation by holding out the last 8 quarters. The model is estimated on the training sample, then forecasts are generated for the holdout horizon, producing not just point forecasts but prediction intervals. Forecast accuracy is assessed using standard error measures (ME, RMSE, MAE, MAPE) alongside residual checks. Figure 12 complements this with visual diagnostics: residual time plots (looking for drift or clustering), residual ACF (checking for leftover dependence), and residual distribution shape (checking for extreme non-normality). Together with the Ljung–Box p-value reported in Table 3, these checks aim to establish that the ARIMA has captured the main time dependence well enough to be useful for short-horizon projection, while acknowledging that any future regime change would still break purely statistical extrapolation.
Concordance index (cycle synchronization): Definition, counts, and formula
To quantify how often Construction and GDP are in the same phase of the cycle, we use the concordance index introduced in the business-cycle context by Harding and Pagan (2002).
The starting point is to convert each cyclical series into a binary phase indicator for each quarter $t$. A common operational choice (consistent with the HP-filter cycle plots) is:

$$
S_{x,t} =
\begin{cases}
1, & \text{if series } x \text{ is in expansion at } t \text{ (cycle} > 0\text{)}, \\
0, & \text{if it is in contraction (cycle} < 0\text{)},
\end{cases}
$$

and similarly $S_{y,t}$ for series $y$.

We then count how many quarters the two series share the same phase:

- $N_{11}$: number of quarters where both are in expansion, $N_{11} = \sum_{t} S_{x,t} S_{y,t}$;
- $N_{00}$: number of quarters where both are in contraction, $N_{00} = \sum_{t} (1 - S_{x,t})(1 - S_{y,t})$;
- $N$: the total number of comparable quarters used in the calculation (typically the full sample after aligning the two series and excluding any missing endpoints).

With these counts (the concordant count $N_c = N_{11} + N_{00}$ reported in Table 4), the concordance index is simply the share of time the two series are in the same phase:

$$
C = \frac{N_{11} + N_{00}}{N}.
$$

Equivalently, using the binary indicators directly:

$$
C = \frac{1}{N} \sum_{t=1}^{N} \left[ S_{x,t} S_{y,t} + (1 - S_{x,t})(1 - S_{y,t}) \right].
$$

Interpretation is intuitive:

- $C = 1$ means the cycles are always in the same phase;
- $C = 0$ means they are never in the same phase;
- $C \approx 0.5$ corresponds to “no systematic phase alignment” in a rough practical sense (though formal inference uses a dedicated test).
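A minimal sketch of the concordance calculation under these definitions, reusing the hypothetical HP-filter cycle vectors from Section 4 (the inference behind the reported p-value is not reproduced here):

```r
# Binary phase indicators: 1 = expansion (cycle above zero), 0 = contraction.
s_x <- as.integer(constr_cycle > 0)
s_y <- as.integer(gdp_cycle > 0)

n11 <- sum(s_x == 1 & s_y == 1)  # both in expansion
n00 <- sum(s_x == 0 & s_y == 0)  # both in contraction
n   <- length(s_x)               # comparable quarters after alignment

c_index <- (n11 + n00) / n       # share of quarters spent in the same phase
```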
Reference
Harding, D., & Pagan, A. R. (2002). Dissecting the cycle: A methodological investigation. Journal of Monetary Economics, 49(2), 365–381.
