Chapter overview and Task

Chapter 4

Code

# Loading R packages
if (!require("pacman")) install.packages("pacman")
pacman::p_load(tidyverse,
               tsibble, fable,
               feasts, tsibbledata,
               fable.prophet,
               patchwork,
               lubridate,
               rio,
               ggplot2, knitr,
               kableExtra
               )

This is the overview of the chapter 4. I will add info from the index file here. I will filter from there to what will go here. My goal is to first have the website have a map of the concepts in time series. I want to first avoid doing it by chapters since it seperates topics and limit my learning to that lesson. Bigger picture first.

Adding this from homework_4_0 chapter notes.qmd file in old repo. This are notes intended to connect time series concepts

Code

# Load necessary library

# Define the checklist table
checklist_data <- data.frame(
  Topic = c("Trend", "Seasonality", "Stationarity", "White Noise", "Serial Correlation", "Drift"),
  Description = c(
    "Presence of an upward or downward movement over time.",
    "Repeating patterns at regular intervals.",
    "Mean, variance, and autocorrelation are constant over time.",
    "Residuals are independent, identically distributed with mean zero.",
    "Past values are correlated with current values.",
    "Consistent positive or negative change over time."
  ),
  White_Noise = c("", "", "✔", "✔", "", ""),
  Random_Walk = c("", "", "", "", "✔", ""),
  RW_with_Drift = c("", "", "", "", "✔", "✔"),
  AR_p = c("✔", "", "✔", "", "✔", ""),
  ARIMA = c("", "", "✔", "", "✔", ""),
  Holt_Winters = c("✔", "✔", "✔", "", "✔", "")
)

# Render the table
kable(checklist_data, format = "html", escape = F) %>%
  kable_styling(full_width = FALSE, bootstrap_options = c("striped", "hover"))

Topic	Description	White_Noise	Random_Walk	RW_with_Drift	AR_p	ARIMA	Holt_Winters
Trend	Presence of an upward or downward movement over time.				✔		✔
Seasonality	Repeating patterns at regular intervals.						✔
Stationarity	Mean, variance, and autocorrelation are constant over time.	✔			✔	✔	✔
White Noise	Residuals are independent, identically distributed with mean zero.	✔
Serial Correlation	Past values are correlated with current values.		✔	✔	✔	✔	✔
Drift	Consistent positive or negative change over time.			✔

Time Series Model Checklist

Topic	Description	White Noise	Random Walk	RW with Drift	AR(p)	ARIMA	Holt-Winters
Trend	Presence of an upward or downward movement over time.				✔		✔
Seasonality	Repeating patterns at regular intervals.						✔
Stationarity	Mean, variance, and autocorrelation are constant over time.	✔			✔	✔	✔
White Noise	Residuals are independent, identically distributed with mean zero.	✔
Serial Correlation	Past values are correlated with current values.		✔	✔	✔	✔	✔
Drift	Consistent positive or negative change over time.			✔

spacer

Code

temps_ts <- rio::import("https://byuistats.github.io/timeseries/data/global_temparature.csv") |>
  as_tsibble(index = year)

temps_ts |> autoplot(.vars = change) +
    labs(
      x = "Year",
      y = "Temperature Change (Celsius)",
      title = paste0("Change in Mean Annual Global Temperature (", min(temps_ts$year), "-", max(temps_ts$year), ")")
    ) +
    theme_minimal() +
    theme(
      plot.title = element_text(hjust = 0.5)
    )

Code

# second r chunk
pacf(temps_ts$change)

Code

# 3rd r chunk
global_ar <- temps_ts |>
    model(AR(change ~ order(1:9)))
tidy(global_ar)

# A tibble: 7 × 6
  .model                  term     estimate std.error statistic  p.value
  <chr>                   <chr>       <dbl>     <dbl>     <dbl>    <dbl>
1 AR(change ~ order(1:9)) constant   0.0190   0.00881     2.15  3.30e- 2
2 AR(change ~ order(1:9)) ar1        0.656    0.0841      7.80  1.40e-12
3 AR(change ~ order(1:9)) ar2       -0.0662   0.100      -0.659 5.11e- 1
4 AR(change ~ order(1:9)) ar3        0.140    0.0988      1.42  1.58e- 1
5 AR(change ~ order(1:9)) ar4        0.265    0.0995      2.67  8.58e- 3
6 AR(change ~ order(1:9)) ar5       -0.163    0.102      -1.60  1.11e- 1
7 AR(change ~ order(1:9)) ar6        0.206    0.0863      2.38  1.85e- 2

Code

# 4th r chunk
alphas <- global_ar |> coefficients() |> tail(-1) |> dplyr::select(estimate) |> pull()
cat(
  "0 = 1", 
        "- (", alphas[1], ") * x",
        "- (", alphas[2], ") * x^2",
        "- (", alphas[3], ") * x^3",
        "\n     ",
        "- (", alphas[4], ") * x^4",
        "- (", alphas[5], ") * x^5",
        "- (", alphas[6], ") * x^6"
)

0 = 1 - ( 0.6559292 ) * x - ( -0.06617426 ) * x^2 - ( 0.140204 ) * x^3 
      - ( 0.2653744 ) * x^4 - ( -0.1627911 ) * x^5 - ( 0.2056924 ) * x^6

Code

alphas

[1]  0.65592918 -0.06617426  0.14020403  0.26537444 -0.16279106  0.20569242

Code

# 5th r chunk

temps_forecast <- global_ar |> forecast(h = "50 years")
temps_forecast |>
  autoplot(temps_ts, level = 95) +
  geom_line(aes(y = .fitted, color = "Fitted"),
    data = augment(global_ar)) +
  scale_color_discrete(name = "") +
  labs(
    x = "Year",
    y = "Temperature Change (Celsius)",
    title = paste0("Change in Mean Annual Global Temperature (", min(temps_ts$year), "-", max(temps_ts$year), ")"),
    subtitle = paste0("50-Year Forecast Based on our AR(", tidy(global_ar) |> as_tibble() |> dplyr::select(term) |> tail(1) |> stringr::str_sub(1), ") Model")
  ) +
  theme_minimal() +
  theme(
    plot.title = element_text(hjust = 0.5),
    plot.subtitle = element_text(hjust = 0.5)
  )

Code

# Import and prepare the data
temps_ts <- rio::import("https://byuistats.github.io/timeseries/data/global_temparature.csv") |>
  as_tsibble(index = year)

# Fit an AR model to the 'change' variable
ar_fit <- ar(temps_ts$change, method = "mle", na.action = na.omit)

# Extract and display the order and coefficients
order <- ar_fit$order
coefficients <- ar_fit$ar

cat("Order of the fitted AR model: ", order, "\n")

Order of the fitted AR model:  4

Code

cat("Coefficients of the AR model: ", coefficients, "\n")

Coefficients of the AR model:  0.6770191 -0.03582926 0.1560533 0.1929323

Code

# Visualization of the original data
temps_ts |> autoplot(.vars = change) +
    labs(
      x = "Year",
      y = "Temperature Change (Celsius)",
      title = paste0("Change in Mean Annual Global Temperature (", min(temps_ts$year), "-", max(temps_ts$year), ")")
    ) +
    theme_minimal() +
    theme(
      plot.title = element_text(hjust = 0.5)
    )

Book section

4.5 Autoregressive models

4.5.1 defition

The series $\{x_t\}$ is an autoregressive process of order $p$, abbreviated to AR($p$), if

\[ x_t = \alpha_1 x_{t-1} + \alpha_2 x_{t-2} + \dots + \alpha_p x_{t-p} + w_t \tag{e4.15} \]

where $\{w_t\}$ is white noise and the $\alpha_i$ are the model parameters with $\alpha_p \neq 0$ for an order $p$ process.Equation (4.15) can be expressed as a polynomial of order $p$ in terms of the backward shift operator (e4.16):

\[ \theta_p(B)x_t = (1 - \alpha_1 B - \alpha_2 B^2 - \dots - \alpha_p B^p)x_t = w_t \tag{e4.16} \] The following points should be noted:

The random walk is the special case AR(1) with $\alpha_1 = 1$ (see Equation (4.4)).
The exponential smoothing model is the special case $\alpha_i = \alpha(1 - \alpha)^i$ for $i = 1, 2, \dots$ and $p \to \infty$.
The model is a regression of $x_t$ on past terms from the same series; hence the use of the term “autoregressive.”
A prediction at time $t$ is given by (e4.17)

\[ \hat{x}_t = \alpha_1 x_{t-1} + \alpha_2 x_{t-2} + \dots + \alpha_p x_{t-p} \tag{e4.17} \]

The model parameters can be estimated by minimizing the sum of squared errors.

4.5.2 Stationary and non-stationary AR processes

The equation $\theta_p(B) = 0$, where $B$ is formally treated as a number (real or complex), is called the characteristic equation. The roots of the characteristic equation (i.e., the polynomial $\theta_p(B)$ from Equation (4.16)) must all exceed unity in absolute value for the process to be stationary. Notice that the random walk has $\theta = 1 - B$ with root $B = 1$ and is non-stationary. The following four examples illustrate the procedure for determining whether an AR process is stationary or non-stationary:

The AR(1) model $x_t = \frac{1}{2}x_{t-1} + w_t$ is stationary because the root of $1 - \frac{1}{2}B = 0$ is $B = 2$, which is greater than 1.
The AR(2) model $x_t = x_{t-1} - \frac{1}{4}x_{t-2} + w_t$ is stationary. The proof of this result is obtained by first expressing the model in terms of the backward shift operator: $\frac{1}{4}(B^2 - 4B + 4)x_t = w_t$ i.e., $\frac{1}{4}(B - 2)^2x_t = w_t$. The roots of the polynomial are given by solving $\theta(B) = \frac{1}{4}(B - 2)^2 = 0$ and are therefore obtained as $B = 2$. As the roots are greater than unity, this AR(2) model is stationary.
The model $x_t = \frac{1}{2}x_{t-1} + \frac{1}{2}x_{t-2} + w_t$ is non-stationary because one of the roots is unity. To prove this, first express the model in terms of the backward shift operator: $-\frac{1}{2}(B^2 + B - 2)x_t = w_t$; i.e., $-\frac{1}{2}(B - 1)(B + 2)x_t = w_t.$ The polynomial $\theta(B) = -\frac{1}{2}(B - 1)(B + 2)$ has roots $B = 1, -2$. As there is a unit root ($B = 1$), the model is non-stationary. Note that the other root ($B = -2$) exceeds unity in absolute value, so only the presence of the unit root makes this process non-stationary.
The AR(2) model $x_t = -\frac{1}{4}x_{t-2} + w_t$ is stationary because the roots of $1 + \frac{1}{4}B^2 = 0$ are $B = \pm 2i$, which are complex numbers with $i = \sqrt{-1}$, each having an absolute value of 2, exceeding unity.

The R function polyroot finds zeros of polynomials and can be used to find the roots of the characteristic equation to check for stationarity.

4.5.3 Second-order properties of an AR(1) model

From Equation (4.15), the AR(1) process is given by

\[ x_t = \alpha x_{t-1} + w_t \tag{e4.18} \]

where $\{w_t\}$ is a white noise series with mean zero and variance $\sigma^2$. It can be shown (§4.5.4) that the second-order properties follow as:

\[ \left.\begin{matrix} \mu_x = 0 \\ \gamma_k = \frac{\alpha^k \sigma^2}{1 - \alpha^2} \\ \end{matrix}\right\} \tag{e4.19} \]

4.5.4 Derivation of second-order properties for an AR(1) process*

Using $B$, a stable AR(1) process ($|\alpha| < 1$) can be written as

\[ \left.\begin{matrix} (1 - \alpha B)x_t = w_t \\ \Rightarrow x_t = (1 - \alpha B)^{-1}w_t \\ \Rightarrow x_t = w_t + \alpha w_{t-1} + \alpha^2 w_{t-2} + \dots = \sum_{i=0}^\infty \alpha^i w_{t-i} \\ \end{matrix}\right\} \tag{e4.20} \]

Hence, the mean is given by

\[ E(x_t) = E\left(\sum_{i=0}^\infty \alpha^i w_{t-i}\right) = \sum_{i=0}^\infty \alpha^i E(w_{t-i}) = 0 \]

and the autocovariance follows as

\[ \gamma_k = \text{Cov}(x_t, x_{t+k}) = \text{Cov} \left(\sum_{i=0}^\infty \alpha^i w_{t-i}, \sum_{j=0}^\infty \alpha^j w_{t+k-j}\right) \]

\[ = \sum\nolimits_{j=k+i} \alpha^i \alpha^j \text{Cov}(w_{t-i}, w_{t+k-j}) \]

\[ = \alpha^k \sigma^2 \sum\nolimits_{i=0}^\infty \alpha^{2i} = \frac{\alpha^k \sigma^2}{1 - \alpha^2} \]

using Equation (e2.15) and (e4.2)

4.5.5 Correlogram of an AR(1) process

From Equation (4.19), the autocorrelation function follows as

\[ \rho_k = \alpha^k \quad (k \geq 0) \tag{4.21} \]

where $|\alpha| < 1$. Thus, the correlogram decays to zero more rapidly for small $\alpha$. The following example gives two correlograms for positive and negative values of $\alpha$, respectively (Fig. 4.11):

Fig. 4.11. Example correlograms for two autoregressive models:

$x_t = 0.7x_{t-1} + w_t$
$x_t = -0.7x_{t-1} + w_t$

Code

# book code
# rho <- function(k, alpha) alpha^k
# layout(1:2)
# plot(0:10, rho(0:10, 0.7), type = "b")
# plot(0:10, rho(0:10, -0.7), type = "b")

rho <- function(k, alpha) alpha^k
dat <- tibble(x = 0:10, a7 = rho(x, .7), am7 = rho(x, -.7)) |>
    pivot_longer(a7:am7, names_to = "param", values_to = "value") 

ggplot(dat, aes(x = x, y = value)) + geom_line() + geom_point() + facet_wrap(~param, ncol = 1, labeller = as_labeller(c(a7 = "Fig. 4.11a Alpha = 0.7", am7 = "Fig. 4.11b Alpha = -0.7"))) +
    labs(x = "lag k", y = "p_k")

Try experimenting using other values for $\alpha$. For example, use a small value of $\alpha$ to observe a more rapid decay to zero in the correlogram.

4.5.6 Partial Autocorrelation

From Equation (4.21), the autocorrelations are non-zero for all lags even though in the underlying model $x_t$ only depends on the previous value $x_{t-1}$ (Equation (4.18)). The partial autocorrelation at lag $k$ is the correlation that results after removing the effect of any correlations due to the terms at shorter lags. For example, the partial autocorrelation of an AR(1) process will be zero for all lags greater than 1. In general, the partial autocorrelation at lag $k$ is the $k$th coefficient of a fitted AR($k$) model; if the underlying process is AR($p$), then the coefficients $\alpha_k$ will be zero for all $k > p$. Thus, an AR($p$) process has a correlogram of partial autocorrelations that is zero after lag $p$. Hence, a plot of the estimated partial autocorrelations can be useful when determining the order of a suitable AR process for a time series. In R, the function pacf can be used to calculate the partial autocorrelations of a time series and produce a plot of the partial autocorrelations against lag (the “partial correlogram”).

4.5.7 Simulation

An AR(1) process can be simulated in R as follows:

 > set.seed(1)
 > x <- w <- rnorm(100)
 > for (t in 2:100) x[t] <- 0.7 * x[t- 1] + w[t]
 > plot(x, type = "l")
 > acf(x)
 > pacf(x)

Fig. 4.12. A simulated AR(1) process, $x_t = 0.7x_{t-1} + w_t$. Note that in the partial correlogram (c) only the first lag is significant, which is usually the case when the underlying process is AR(1).

Code

set.seed(1)
x <- w <- rnorm(100)
for (t in 2:100) x[t] <- 0.7 * x[t - 1] + w[t]

# Custom titles
plot_title <- "fig. 12a Time Series Plot"
x_axis_title <- "Time"

# Plot with custom titles
plot(x, type = "l", main = plot_title, xlab = x_axis_title)

Code

acf(x, main = "fig 12b Correlogram: Sample correlation against lag")

Code

pacf(x, main = "fig 12c Partial correlogram: Sample partial correlation against lag")

The resulting plots of the simulated data are shown in Figure 4.12 and give one possible realisation of the model. The partial correlogram has no significant correlations except the value at lag 1, as expected (Fig. 4.12c – note that the pacf starts at lag 1, whilst the acf starts at lag 0).The difference between the correlogram of the underlying model (Fig. 4.11a) and the sample correlogram of the simulated series (Fig. 4.12b) shows discrepancies that have arisen due to sampling variation. Try repeating the commands above several times to obtain a range of possible sample correlograms for an AR(1) process with underlying parameter $\alpha = 0.7$.

You are asked to investigate an AR(2) process in Exercise 4.

4.6 Fitted models

4.6.1 Model fitted to simulated series

An AR(p) model can be fitted to data in R using the ar function. In the code below, the autoregressive model x.ar is fitted to the simulated series of the last section and an approximate 95% confidence interval for the underlying parameter is given, where the (asymptotic) variance of the parameter estimate is extracted using x.ar$asy.var:

 > x.ar <- ar(x, method = "mle")
 > x.ar$order

Code

set.seed(1)
x <- w <- rnorm(100)
for (t in 2:100) x[t] <- 0.7 * x[t - 1] + w[t]
x.ar <- ar(x, method = "mle")
x.ar$order

[1] 1

 > x.ar$ar

Code

x.ar$ar

[1] 0.6009459

 > x.ar$ar + c(-2, 2) * sqrt(x.ar$asy.var)

Code

x.ar$ar + c(-2, 2) * sqrt(x.ar$asy.var)[1] # slight change

[1] 0.4404031 0.7614886

The method “mle” used in the fitting procedure above is based on maximizing the likelihood function (the probability of obtaining the data given the model) with respect to the unknown parameters. The order $p$ of the process is chosen using the Akaike Information Criterion (AIC; Akaike, 1974), which penalizes models with too many parameters:

\[ \text{AIC} = -2 \times \text{log-likelihood} + 2 \times \text{number of parameters} \tag{e4.22} \]

In the function ar, the model with the smallest AIC is selected as the best-fitting AR model. Note that, in the code above, the correct order ($p = 1$) of the underlying process is recovered.

The parameter estimate for the fitted AR(1) model is $\hat{\alpha} = 0.60$. Whilst this is smaller than the underlying model value of $\alpha = 0.7$, the approximate 95% confidence interval does contain the value of the model parameter as expected, giving us no reason to doubt the implementation of the model.

4.6.2 Exchange rate series: Fitted AR model

An AR(1) model is fitted to the exchange rate series, and the upper bound of the confidence interval for the parameter includes 1. This indicates that there would not be sufficient evidence to reject the hypothesis $\alpha = 1$, which is consistent with the earlier conclusion that a random walk provides a good approximation for this series.

However, simulated data from models with values of $\alpha > 1$, formally included in the confidence interval below, exhibit exponentially unstable behavior and are not credible models for the New Zealand exchange rate.

 > Z.ar <- ar(Z.ts)
 > mean(Z.ts)
 [1] 2.823
 > Z.ar$order
 [1] 1
 > Z.ar$ar
 [1] 0.8903
 > Z.ar$ar + c(-2, 2) * sqrt(Z.ar$asy.var)
 [1] 0.7405 1.0400
 > acf(Z.ar$res[-1])

Figure 4.13 Fig. 4.13. The correlogram of residual series for the AR(1) model fitted to the exchange rate data.

Code

Z <- read.table("../data/pounds_nz.dat", header = T)
Z.ts <- ts(Z, st = 1991, fr = 4)
Z.ar <- ar(Z.ts)
mean(Z.ts)

[1] 2.823251

Code

Z.ar$order

[1] 1

Code

Z.ar$ar

[1] 0.890261

Code

Z.ar$ar + c(-2, 2) * sqrt(Z.ar$asy.var)[1]

[1] 0.7405097 1.0400123

Code

acf(Z.ar$res[-1])

In the code above, a “−1” is used in the vector of residuals to remove the first item from the residual series (Fig. 4.13). For a fitted AR(1) model, the first item has no predicted value because there is no observation at $t = 0$; in general, the first $p$ values will be “not available” (NA) in the residual series of a fitted AR($p$) model.

By default, the mean is subtracted before the parameters are estimated, so a predicted value $\hat{z}_t$ at time $t$ based on the output above is given by

\[ \hat{z}_t = 2.8 + 0.89(z_{t-1} - 2.8) \tag{e4.23} \]

4.6.3 Global temperature series: Fitted AR model

The global temperature series was introduced in §1.4.5, where it was apparent that the data exhibited an increasing trend after 1970, which may be due to the “greenhouse effect.” Sceptics may claim that the apparent increasing trend can be dismissed as a transient stochastic phenomenon. For their claim to be consistent with the time series data, it should be possible to model the trend without the use of deterministic functions.

Consider the following AR model fitted to the mean annual temperature series (figure 4.14):

Code

# resource code 4.6.3
Global = scan("../data/global.dat")
Global.ts = ts(Global, st = c(1856, 1), end = c(2005, 12),fr = 12)
Global.ar <- ar(aggregate(Global.ts, FUN = mean), method = "mle")
mean(aggregate(Global.ts, FUN = mean))

[1] -0.1382628

Code

Global.ar$order

[1] 4

Code

Global.ar$ar

[1] 0.58762026 0.01260254 0.11116731 0.26763656

Code

acf(Global.ar$res[-(1:Global.ar$order)], lag = 50)

figure Fig. 4.14. The correlogram of the residual series for the AR(4) model fitted to the annual global temperature series. The correlogram is approximately white noise so that, in the absence of further information, a simple stochastic model can ‘explain’ the correlation and trends in the series.

Based on the output above, a predicted mean annual temperature $\hat{x}_t$ at time $t$ is given by

\[ \hat{x}_t = -0.14 + 0.59(x_{t-1} + 0.14) + 0.013(x_{t-2} + 0.14) \\ + 0.11(x_{t-3} + 0.14) + 0.27(x_{t-4} + 0.14) \tag{e4.24} \]

The correlogram of the residuals has only one (marginally) significant value at lag 27, so the underlying residual series could be white noise (Fig. 4.14). Thus, the fitted AR(4) model (Equation (e4.24)) provides a good fit to the data.

As the AR model has no deterministic trend component, the trends in the data can be explained by serial correlation and random variation, implying that it is possible that these trends are stochastic (or could arise from a purely stochastic process).

Again, we emphasize that this does not imply that there is no underlying reason for the trends. If a valid scientific explanation is known, such as a link with the increased use of fossil fuels, then this information would clearly need to be included in any future forecasts of the series.