Fitted AR Models

Chapter 4: Lesson 4

Code

# Loading R packages
if (!require("pacman")) install.packages("pacman")
pacman::p_load(tidyverse,
               tsibble, fable,
               feasts, tsibbledata,
               fable.prophet,
               patchwork,
               lubridate,
               rio,
               ggplot2,
               kableExtra
               )

Lesson 4.4 Fitted AR Models

Simulate an AR(1) Time Series

In this simulation, we first simulate data from the $AR(1)$ model \[ x_t = 0.75 ~ x_{t-1} + w_t \] where $w_t$ is a white noise process with variance 1.

Show the code

set.seed(123)
n_rep <- 1000
alpha1 <- 0.75

dat_ts <- tibble(w = rnorm(n_rep)) |>
  mutate(
    index = 1:n(),
    x = purrr::accumulate2(
      lag(w), w, 
      \(acc, nxt, w) alpha1 * acc + w,
      .init = 0)[-1]) |>
  tsibble::as_tsibble(index = index)

dat_ts |> 
  autoplot(.vars = x) +
    labs(
      x = "Time",
      y = "Simulated Time Series",
      title = "Simulated Values from an AR(1) Process"
    ) +
    theme_minimal() +
    theme(
      plot.title = element_text(hjust = 0.5)
    )

The R command mean(dat_ts$x) gives the mean of the $x_t$ values as 0.067.

Fit an $AR(1)$ Model with Zero Mean

Show the code

# Fit the AR(1) model
fit_ar <- dat_ts |>
  model(AR(x ~ order(1)))
tidy(fit_ar)

# A tibble: 1 × 6
  .model           term  estimate std.error statistic   p.value
  <chr>            <chr>    <dbl>     <dbl>     <dbl>     <dbl>
1 AR(x ~ order(1)) ar1      0.720    0.0220      32.8 2.07e-160

The estimate of the parameter $\alpha_1$ (i.e. the fitted value of the parameter $\alpha_1$) is $\hat \alpha_1 = 0.72$.

When R fits an AR model, the mean of the time series is subtracted from the data before the parameter values are estimated. If R detects that the mean of the time series is not significantly different from zero, it is omitted from the output.

Because the mean is subtracted from the time series before the parameter values are estimated, R is using the model \[ z_t = \alpha_1 ~ z_{t-1} + w_t \] where $z_t = x_t - \mu$ and $\mu$ is the mean of the time series.

Things to do

Check Your Understanding

Answer the following questions with your partner.

Use the expression for $z_t$ above to solve for $x_t$ in terms of $x_{t-1}$, $\mu$, $\alpha_1$, and $w_t$.
What does your model reduce to when $\mu = 0$?
Explain to your partner why this correctly models a time series with mean $\mu$.

WHat do I know

For what type of Time Series will a fitted AR model be use?
Is the sample above a stationary or non-stationary time series?
Stochastic vs deterministic time series?
What does this model tell us about the time series?

We replace the parameter $\mu$ with its estimator $\hat \mu = \bar x$. We also replace $\alpha_1$ with the fitted value from the output $\hat \alpha_1$. This gives us the fitted model: \[ \hat x_t = \bar x + \hat \alpha_1 ~ (x_{t-1} - \bar x) \]

The fitted model can be expressed as:

\[\begin{align*} \hat x_t &= 0.067 + 0.72 \left( x_{t-1} - 0.067 \right) \\ &= 0.067 - 0.72 ~ (0.067) + 0.72 ~ \left( x_{t-1} \right) \\ &= 0.019 + 0.72 ~ x_{t-1} \end{align*}\]

Even though R does not report the parameter for the mean of the process, $\hat \mu = 0.019$, it is not significantly different from zero. One could argue that we should not use a model that contains the mean and instead focus on a simple fitted model that has only one parameter:

\[ \hat x_t = 0.72 ~ x_{t-1} \]

Confidence Interval for the Model Parameter

The P-value given above tests the hypothesis that $\alpha_1=0$. This is not helpful in this context. We are interested in the plausible values for $\alpha_1$, not whether or not it is different from zero. For this reason, we consider a confidence interval and disregard the P-value.

We can compute an approximate 95% confidence interval for $\alpha_1$ as: \[ \left( \hat \alpha_1 - 2 \cdot SE_{\hat \alpha_1} , ~ \hat \alpha_1 + 2 \cdot SE_{\hat \alpha_1} \right) \] where $\hat \alpha_1$ is our parameter estimate and $SE_{\hat \alpha_1}$ is the standard error of the estimate. Both of these values are given in the R output.

Show the code

ci_summary <- tidy(fit_ar) |>
    mutate(
        lower = estimate - 2 * std.error,
        upper = estimate + 2 * std.error
    )

So, our 95% confidence interval for $\alpha_1$ is: \[ \left( 0.72 - 2 \cdot 0.022 , ~ 0.72 + 2 \cdot 0.022 \right) \] or \[ \left( 0.676 , ~ 0.764 \right) \] Note that the confidence interval contains $\alpha_1 = 0.75$, the value of the parameter we used in our simulation. The process of estimating the parameter worked well. In practice, we will not know the value of $\alpha_1$, but the confidence interval gives us a reasonable estimate of the value.

Residuals

For an $AR(1)$ model where the mean of the time series is not statistically significantly different from 0, the residuals are computed as \[\begin{align*} r_t &= x_t - \hat x_t \\ &= x_t - \left[ 0.72 ~ x_{t-1} \right] \end{align*}\]

Code

# had include false, and eval false.


# Computing the residuals manually
dat_ts |>
  # Zero mean model
  mutate(resid0 = x - ( (tidy(fit_ar) |> select(estimate) |> pull()) * lag(x) ) ) |>
  # Non-zero mean model
  mutate(resid1 = x - (mean(x) + (tidy(fit_ar) |> select(estimate) |> pull()) * (lag(x) - mean(x)) ) )

# A tsibble: 1,000 x 5 [1]
         w index      x resid0  resid1
     <dbl> <int>  <dbl>  <dbl>   <dbl>
 1 -0.560      1 -0.560 NA     NA     
 2 -0.230      2 -0.651 -0.247 -0.266 
 3  1.56       3  1.07   1.54   1.52  
 4  0.0705     4  0.874  0.103  0.0842
 5  0.129      5  0.784  0.156  0.137 
 6  1.72       6  2.30   1.74   1.72  
 7  0.461      7  2.19   0.531  0.512 
 8 -1.27       8  0.376 -1.20  -1.22  
 9 -0.687      9 -0.405 -0.675 -0.694 
10 -0.446     10 -0.749 -0.458 -0.477 
# ℹ 990 more rows

We can easily obtain these residual values in R:

Show the code

fit_ar |> residuals()

# A tsibble: 1,000 x 3 [1]
# Key:       .model [1]
   .model           index .resid
   <chr>            <int>  <dbl>
 1 AR(x ~ order(1))     1 NA    
 2 AR(x ~ order(1))     2 -0.247
 3 AR(x ~ order(1))     3  1.54 
 4 AR(x ~ order(1))     4  0.103
 5 AR(x ~ order(1))     5  0.156
 6 AR(x ~ order(1))     6  1.74 
 7 AR(x ~ order(1))     7  0.531
 8 AR(x ~ order(1))     8 -1.20 
 9 AR(x ~ order(1))     9 -0.675
10 AR(x ~ order(1))    10 -0.458
# ℹ 990 more rows

The variance of the residuals is $0.982$. This is very close to the actual value used in the simulation: $\sigma^2 = 1$.

Fitting a Simulated AR(1) Part 1

Fitting a Simulated AR(1) Model with Non-Zero Mean

Fit an AR(1) Model with Non-Zero Mean

Same as above? I guess in this one we use R to fit an AR(1) model to the time series data.

Confidence Interval for the Model Parameters

We can compute approximate 95% confidence intervals for $\alpha_0$ and $\alpha_1$:
- I need to review this
Review the two columns

Residuals

Looks like this sections is one full that can be summarize in one for fitting a simulated AR(1) model with non-zero mean???

Repeat Part 1: Class Activity

This section is a class activity but we just repeat part 1 (above section)

Fitting a Simulated AR(2) Model

Seems to be same as part 1 above, definetly need to understand what this code is doing before I tried solving it.

Fit an AR(2) Model

Confidence Interval for the Model Parameters

Residuals

We can compute the residuals in the same manner as we did for the other models.

Activity: Global Warming

Using the PACF to Choose p for an AR(p) Process

In the previous lesson, we noted that the partial correlogram can be used to assess the number of parameters in an AR model. Here is a partial correlogram for the change in the mean annual global temperature.

Fitting Models (Dynamic Number of Parameters)

Stationary of the AR(p) Model

Forecasting with an AR(p) Model

Clas Activity: Forecasting with an AR(p) Model

Comparison to results in Section 4.6.3 of the Book

Class Activity: Comparison to the Results in Section 4.6.3 of the Book (5 min)

search for words for lesson 4.4

conceive - white noise process - variance - estimate for the parameter for the constant -

Lesson 4.4 Fitted AR Models

Simulate an AR(1) Time Series

Fit an \(AR(1)\) Model with Zero Mean

Confidence Interval for the Model Parameter

Residuals

Fitting a Simulated AR(1) Part 1

Fit an AR(1) Model with Non-Zero Mean

Confidence Interval for the Model Parameters

Residuals

Repeat Part 1: Class Activity

Fitting a Simulated AR(2) Model

Fit an AR(2) Model

Confidence Interval for the Model Parameters

Residuals

Activity: Global Warming

Using the PACF to Choose p for an AR(p) Process

Fitting Models (Dynamic Number of Parameters)

Stationary of the AR(p) Model

Forecasting with an AR(p) Model

Comparison to results in Section 4.6.3 of the Book

search for words for lesson 4.4