---
title: "Applied Time Series Analysis Notebook"
subtitle: "MATH 494R"
format: 
  html:
    page-layout: full
    error: false
    message: false
    warning: false
    embed-resources: true
    toc: true
    code-fold: true  
    math: katex # katex, mathjax, none
    code-tools: # this is to show source code
      source: true 
      toggle: false
      caption: none
---


Just testing the panel tabset layout below.


::: {.columns}
:::: {.column width="50%"}
### Visual Render

::: {.panel-tabset}
#### Add
$$
x_t = m_t + s_t + z_t 
$$

#### MULT
$$
x_t = m_t * s_t + z_t
$$

#### log add
$$
log(x_t) = m_t + s_t + z_t
$$
:::

::::
:::: {.column width="50%"}


### Source Code


::: {.panel-tabset}
## Add
```
$$
x_t = m_t + s_t + z_t 
$$
```

## MULT
```
$$
x_t = m_t * s_t + z_t
$$
```

## log add
```
$$
log(x_t) = m_t + s_t + z_t
$$
```
:::

::::
:::



::: {.panel-tabset}
## Add

$$
x_t = m_t + s_t + z_t 
$$

## MULT

$$
x_t = m_t * s_t + z_t
$$

## log add

$$
log(x_t) = m_t + s_t + z_t
$$

:::


I'm first going to work out this index file to get the whole picture; then I will branch out based on the information here. As I branch out, the information here will be compressed and linked to the lessons and samples.


# AR models



## AR(1) model lesson 4.3

The $AR(1)$ model is expressed as $$
  x_t = \alpha x_{t-1} + w_t
$$ where $\{w_t\}$ is a white noise series with mean zero and variance $\sigma^2$.
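
As a quick illustration (not part of the lesson), an AR(1) series can be simulated directly from the defining recursion. The seed, series length, and $\alpha = 0.7$ below are illustrative choices.

```r
# Minimal sketch: simulate x_t = alpha * x_{t-1} + w_t with Gaussian white noise.
set.seed(42)
n     <- 500
alpha <- 0.7
w     <- rnorm(n)          # w_t ~ N(0, 1)
x     <- numeric(n)
x[1]  <- w[1]
for (t in 2:n) {
  x[t] <- alpha * x[t - 1] + w[t]
}
plot(x, type = "l", main = "Simulated AR(1) with alpha = 0.7")
```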


# Stationary vs Non-Stationary TS

Look at 4.3.6 (The Difference Operator) and 4.5.2 (Stationary and Non-Stationary AR Processes) in the book to further develop this concept.

**Converting a non-stationary time series to a stationary one:** In Chapter 4 Lesson 2, we established that taking the difference of a non-stationary time series with a stochastic trend can convert it to a stationary time series.

## Stationary Time Series

-   **Stationarity of Linear Models Lesson 5.1**
    -   Linear models for time series are non-stationary when they include functions of time.
-   Differencing can often transform a non-stationary series with a deterministic trend to a stationary series (4.1).
-   In many cases, differencing sequential terms of a non-stationary process can lead to a stationary process of differences (4.2).

## Non-Stationary Time Series

-   **Non-Stationary Time Series Lesson 5.2**
    -   A time series with a stochastic trend is non-stationary.
    -   A time series with a deterministic trend is non-stationary.
    -   A time series with a seasonal component is non-stationary.
    -   A time series with a unit root is non-stationary.

# Stochastic vs Deterministic Trend

# White Noise

-   White noise consists of independent, identically distributed variables with mean 0, finite constant variance, and no correlation, representing purely random fluctuations over time.

<!-- ------------------------------- LESSONS ------------------------------- ------------------------------------------------>

# Lessons

## Lesson 4.1 White Noise and Random Walks

### Stochastic Process 4.1

Stochastic processes, or random processes, evolve over time. They are characterized by the fact that future values of the process cannot be predicted with certainty from past values. The random walk is a classic example of a stochastic process.

### Discrete White Noise DWN 4.1

-   4.2: Computing the difference between successive terms of a random walk leads to a discrete white noise series.

A time series $\{w_t: t = 1, 2, \ldots, n\}$ is a **discrete white noise (DWN)** if the variables $w_1, w_2, \ldots, w_n$ are independent and identically distributed with mean 0.

#### Second-Order Properties of DWN

-   When we refer to the second-order properties of a time series, we are talking about its variance and covariance. The variance of a DWN is constant, and the covariance between any two distinct observations is zero.

-   The mean is a first-order property; the covariance is a second-order property.

#### Discrete White Noise Process

-   A DWN process will have the following properties:

    -   The observations are discrete.
    -   The mean of the observations is zero.
    -   The variance of the observations is finite.
    -   Successive observations are uncorrelated.

### Random Walk

-   **Random Walks Lesson 4.1**
    -   A random walk is a stochastic process in which the difference between successive observations is a white noise process; it is a non-stationary time series. (see def)
    -   $w_t$ is a DWN and is often modeled as GWN; however, $w_t$ could be as simple as a coin toss (**random walk**).

#### Properties of a Random Walk

First-Order Properties of a Random Walk: the mean of a random walk series is 0.

Look at the Shiny code for this.

#### Second-Order Properties of a Random Walk

-   **Covariance:** $cov(x_t,x_{t+k})$:\
    The covariance between two values of the series depends on $t$:\
    $$
    \text{cov}(x_t, x_{t+k}) = t \sigma^2
    $$

-   **Correlation Function** $\rho_k$:

    The correlation for lag $k$ is:\
    $$
    \rho_k = \frac{1}{\sqrt{1 + \frac{k}{t}}}
    $$

-   **Non-Stationarity**:\
    The variance of the series increases with $t$, making the random walk non-stationary.

-   **Correlogram Characteristics**:\
    The correlogram of a random walk typically shows:

    -   Positive autocorrelations starting near 1.
    -   A slow decrease as $k$ increases.
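
A small sketch of these properties: the random walk below is just the cumulative sum of Gaussian white noise, and its correlogram starts near 1 and decays slowly (the seed and length are arbitrary).

```r
set.seed(123)
w <- rnorm(1000)      # Gaussian white noise
x <- cumsum(w)        # random walk: x_t = x_{t-1} + w_t
plot(x, type = "l", main = "Simulated random walk")
acf(x, main = "Correlogram of a random walk")   # autocorrelations start near 1 and decay slowly
```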

### Gaussian White Noise GWN 4.1

-   If the variables are normally distributed, i.e. $w_i \sim N(0,\sigma^2)$, the DWN is called a **Gaussian white noise** process. The normal distribution is also known as the Gaussian distribution, after Carl Friedrich Gauss.

### White Noise Time Series

-   **White Noise Time Series Lesson 4.1**
    -   A white noise time series is a sequence of random variables that are uncorrelated and have a mean of zero.
    -   A white noise time series has a constant variance.
    -   A white noise time series has a constant mean.
    -   A white noise time series has a constant autocorrelation of zero for all lags except when the lag is zero.

### Correlogram 4.1

-   **Correlogram Lesson 4.1**
    -   A correlogram is a plot of the autocorrelation function (ACF) of a time series.
    -   Each correlogram lag tests for correlation significance, increasing the chance of Type I error, resulting in potentially misleading conclusions about significant relationships.

### Fitting the White Noise Model
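
This section is still a stub. A minimal sketch, assuming that "fitting" a white noise model amounts to estimating the constant variance and confirming that the correlogram shows no significant autocorrelation:

```r
set.seed(1)
w <- rnorm(200, mean = 0, sd = 2)   # simulated GWN with sigma = 2 (illustrative values)
acf(w, main = "Correlogram of white noise")   # expect roughly 5% of lags outside the bounds by chance
var(w)                              # sample estimate of sigma^2; should be near 4
```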

### Backward Shift Operator

We define the **backward shift operator** or the **lag operator**, $\mathbf{B}$, as: $$
  \mathbf{B} x_t = x_{t-1}
$$ where $\{x_t\}$ is any time series.

We can apply this operator repeatedly. We will use exponential notation to indicate this.

$$
  \mathbf{B}^2 x_t = \mathbf{B} \mathbf{B} x_t = \mathbf{B} ( \mathbf{B} x_t ) = \mathbf{B} x_{t-1} = x_{t-2}
$$

#### Properties of the Backshift Operator

The backwards shift operator is a linear operator. So, if $a$, $b$, $c$, and $d$ are constants, then $$
(a \mathbf{B} + b)x_t = a \mathbf{B} x_t + b x_t
$$ The distributive property also holds. \begin{align*}
(a \mathbf{B} + b)(c \mathbf{B} + d) x_t 
  &= c (a \mathbf{B} + b) \mathbf{B} x_t  + d(a \mathbf{B} + b) x_t \\
  &= a \mathbf{B} (c \mathbf{B} + d) x_t + b (c \mathbf{B} + d) x_t \\
  &= \left( ac \mathbf{B}^2 + (ad+bc) \mathbf{B} + bd \right) x_t \\
  &= ac \mathbf{B}^2 x_t + (ad+bc) \mathbf{B} x_t + (bd) x_t
\end{align*}


### search words for lesson 4.1

Gaussian white noise - GWN - discrete white noise - dwn - variance - covariance - correlation - correlogram - Type I error - histogram - backward shift operator - backshift operator

<!-- --------------------------------- 4.2 ----------------------------- 4.2 -------- -->

## Lesson 4.2 White Noise and Random Walks - Part 2

### Differencing a Time Series

Why do we difference a time series? Differencing a time series can help us to remove the trend and make the series stationary.

-   Computing the difference between successive terms of a random walk leads to a discrete white noise series.

\begin{align*}
x_t &= x_{t-1} + w_t \\
x_t - x_{t-1} &= w_t
\end{align*}

In many cases, differencing sequential terms of a non-stationary process can lead to a stationary process of differences.
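
A sketch of this idea in R (the values are arbitrary): differencing a simulated random walk recovers the underlying white noise series, and its correlogram shows no significant lags.

```r
set.seed(7)
w <- rnorm(500)
x <- cumsum(w)     # non-stationary random walk
d <- diff(x)       # d_t = x_t - x_{t-1} = w_t (the first observation is dropped)
acf(d, main = "Correlogram of the differenced random walk")
```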

### Correlograms & Histogram

When do we use a correlogram and what do we look for?

**Correlogram**

-   A correlogram is a plot of the autocorrelation function (ACF) of a time series. The ACF is a measure of the correlation between the time series and a lagged version of itself.
-   Notice that the values in the correlogram of the stock prices start at 1 and slowly decay as $k$ increases. There are no significant autocorrelations in the differenced values. This is exactly what we would expect from a random walk.

**Histogram**

-   Figure 5 is a histogram of the differences. This is a simple measure of volatility of the stock, or in other words, how much the price changes in a day.

### Difference Operator

-   Differencing a non-stationary time series often leads to a stationary series, so we will define a formal operator to express this process.

::: {.callout-note icon="false" title="Definition of the Difference Operator"}
The **difference operator**, $\nabla$, is defined as:

$$\nabla x_t = x_t - x_{t-1} = (1-\mathbf{B}) x_t$$

Higher-order differencing can be denoted

$$\nabla^n x_t = (1-\mathbf{B})^n x_t$$
:::

**Things to do** - Do the Excel workout and link to it here on the website. This lesson will be another tab. Maybe add the option to download the Excel sheet.

**Computing Differences** (small group activity): The difference operator can be helpful in identifying the functional underpinnings of a trend. If a function **is linear**, then the first differences of equally-spaced values will be constant. If a function **is quadratic**, then the second differences of equally-spaced values will be constant. If a function **is cubic**, then the third differences of equally-spaced values will be constant, and so on.
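
A quick check of this pattern with the `diff()` function; the polynomials below are made-up examples.

```r
tt <- 1:10
diff(3 * tt + 2)                 # linear: first differences are constant (all 3)
diff(tt^2, differences = 2)      # quadratic: second differences are constant (all 2)
diff(tt^3, differences = 3)      # cubic: third differences are constant (all 6)
```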

**Differencing Stock Prices**: do this group activity

**Integrated Autoregressive Model**: do this group activity

**Class Activity: Random Walk Drift**: do this class activity

# Function is Linear vs Quadratic vs Cubic

This goes at the top, not in lessons

-   **Linear Function**
    -   If a function is linear, then the first differences of equally-spaced values will be constant. (4.2)
-   **Quadratic Function**
    -   If a function is quadratic, then the second differences of equally-spaced values will be constant. (4.2)
-   **Cubic Function**
    -   If a function is cubic, then the third differences of equally-spaced values will be constant, and so on. (4.2)

### Search for words for lesson 4.2

correlogram - successive stock prices -




<!-- --------------------------------- 4.3 ---------------------------------------------------4.3 -------- -->


## Lesson 4.3 Autoregressive (AR) Models

::: {.callout-note icon="false" title="Definition of an Autoregressive (AR) Model"}
The time series $\{x_t\}$ is an **autoregressive process of order** $p$, denoted as $AR(p)$, if 
$$
  x_t = \alpha_1 x_{t-1} + \alpha_2 x_{t-2} + \alpha_3 x_{t-3} + \cdots + \alpha_{p-1} x_{t-(p-1)} + \alpha_p x_{t-p} + w_t ~~~~~~~~~~~~~~~~~~~~~~~ (4.15)
$$

where $\{w_t\}$ is white noise and the $\alpha_i$ are the model parameters with $\alpha_p \ne 0$.
:::


### Properties of an AR(p) Stochastic Process

**Autoregressive Properties of an AR model**

-  The mean of an AR model is a constant.
-  The variance of an AR model is finite.
-  The covariance of an AR model is a function of the lag.
-  The autocorrelation of an AR model is a function of the lag.


### Exploring AR(1) Models

**Definition**
Recall that an $AR(p)$ model is of the form $$
  x_t = \alpha_1 x_{t-1} + \alpha_2 x_{t-2} + \alpha_3 x_{t-3} + \cdots + \alpha_{p-1} x_{t-(p-1)} + \alpha_p x_{t-p} + w_t
$$ So, an $AR(1)$ model is expressed as $$
  x_t = \alpha x_{t-1} + w_t
$$ where $\{w_t\}$ is a white noise series with mean zero and variance $\sigma^2$.




### Second-Order Properties of an AR(1) Model

::: {.callout-note icon="false" title="Second-Order Properties of an $AR(1)$ Model"}
If $\{x_t\}_{t=1}^n$ is an $AR(1)$ process, then its first- and second-order properties are summarized below.

$$
\begin{align*}
  \mu_x &= 0 \\  
  \gamma_k = cov(x_t, x_{t+k}) &= \frac{\alpha^k \sigma^2}{1-\alpha^2}
\end{align*}
$$

:::
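
These properties can be checked by simulation. The sketch below uses `arima.sim()` with an assumed $\alpha = 0.6$ and $\sigma = 1$ and compares the sample variance with $\sigma^2 / (1 - \alpha^2)$.

```r
set.seed(99)
alpha <- 0.6
x <- arima.sim(model = list(ar = alpha), n = 10000, sd = 1)
var(x)                  # sample variance of the simulated AR(1) series
1 / (1 - alpha^2)       # theoretical variance sigma^2 / (1 - alpha^2) = 1.5625
```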


### Correlogram of an AR(1) Model

-  The autocorrelation function of an $AR(1)$ model is a function of the lag.

::: {.callout-note icon="false" title="Correlogram of an AR(1) Process"}
The autocorrelation function for an AR(1) process is

$$
  \rho_k = \alpha^k ~~~~~~ (k \ge 0)
$$ where $|\alpha| < 1$.
:::
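
A sketch comparing the sample correlogram of a simulated AR(1) series with the theoretical $\rho_k = \alpha^k$; the seed, $\alpha = 0.8$, and series length are arbitrary.

```r
set.seed(2024)
alpha <- 0.8
x <- arima.sim(model = list(ar = alpha), n = 2000)
sample_acf <- acf(x, lag.max = 10, plot = FALSE)$acf[-1]   # drop the lag-0 value
round(cbind(lag = 1:10, sample = sample_acf, theory = alpha^(1:10)), 3)
```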


**Things to do** 

- Do group activity: Simulation of an AR(1) process


### Partial Autocorrelation

::: {.callout-note icon="false" title="Definition: Partial Autocorrelation"}
The **partial autocorrelation** at lag $k$ is defined as the portion of the correlation that is not explained by shorter lags.
:::

For example, the partial correlation for lag 4 is the correlation not explained by lags 1, 2, or 3.


::: {.callout-tip icon="false" title="Check Your Understanding"}
-   What is the value of the partial autocorrelation function for an $AR(2)$ process for all lags greater than 2? answer: 0
:::



### Partial Autocorrelation Plots of Various AR(p) Processes

**Look at the lesson Shiny code**
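
In place of the Shiny app, a minimal sketch of the same idea: the PACF of a simulated AR(p) process cuts off after lag $p$. The coefficients below are arbitrary stationary choices.

```r
set.seed(5)
x1 <- arima.sim(model = list(ar = 0.7), n = 2000)            # AR(1)
x2 <- arima.sim(model = list(ar = c(0.5, 0.3)), n = 2000)    # AR(2)
par(mfrow = c(1, 2))
pacf(x1, main = "PACF of an AR(1) process")   # only lag 1 clearly significant
pacf(x2, main = "PACF of an AR(2) process")   # lags 1 and 2 significant, near 0 afterwards
par(mfrow = c(1, 1))
```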



### Stationary and Non-Stationary AR Processes

::: {.callout-note icon="false" title="Definition of the Characteristic Equation"}
Treating the symbol $\mathbf{B}$ formally as a number (either real or complex), the polynomial

$$
  \theta_p(\mathbf{B}) x_t = \left( 1 - \alpha_1 \mathbf{B} - \alpha_2 \mathbf{B}^2 - \cdots - \alpha_p \mathbf{B}^p \right) x_t
$$

is called the **characteristic polynomial** of an AR process.

If we set the characteristic polynomial to zero, we get the **characteristic equation**:

$$
  \theta_p(\mathbf{B}) = \left( 1 - \alpha_1 \mathbf{B} - \alpha_2 \mathbf{B}^2 - \cdots - \alpha_p \mathbf{B}^p \right) = 0
$$
:::


::: {.callout-note icon="false" title="Identifying Stationary Processes"}
An AR process will be **stationary** if the absolute values of the solutions of the characteristic equation are all strictly greater than 1.

-   To prove that $y_t$ is stationary, see HW 4.3 and Lesson 4.3 (HW 4.3 Q2.C).
:::
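
A sketch of this check using the `polyroot()` function; the AR(2) coefficients are illustrative. For $x_t = 0.5 x_{t-1} + 0.3 x_{t-2} + w_t$, the characteristic polynomial is $1 - 0.5\mathbf{B} - 0.3\mathbf{B}^2$.

```r
roots <- polyroot(c(1, -0.5, -0.3))   # coefficients in increasing powers of B
Mod(roots)                            # absolute values of the (possibly complex) roots
all(Mod(roots) > 1)                   # TRUE here, so this AR(2) process is stationary
```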



### Absolute Value in the Complex Plane


::: {.callout-note icon="false" title="Definition of the Absolute Value in the Complex Plane"}
Let $z = a+bi$ be any complex number. It can be represented by the point $(a,b)$ in the complex plane. We define the absolute value of $z$ as the distance from the origin to the point:

$$
  |z| = \sqrt{a^2 + b^2}
$$
:::


**This section: check for this.** We will now practice assessing whether an AR process is stationary using the characteristic equation.


co-pilot notes

-  **Stationary and Non-Stationary AR Processes Lesson 4.3**
    -   An AR process will be stationary if the absolute values of the solutions of the characteristic equation are all strictly greater than 1.
    -   The characteristic equation of an AR process is the polynomial $\theta_p(\mathbf{B}) = 1 - \alpha_1 \mathbf{B} - \alpha_2 \mathbf{B}^2 - \cdots - \alpha_p \mathbf{B}^p$.
    -   The roots of the characteristic polynomial are the solutions of the characteristic equation.
    -   The absolute value of the roots of the characteristic polynomial must be greater than 1 for the AR process to be stationary.

co-pilot notes end










### Questions

* What is an exponential smoothing model?




### Search for words for lesson 4.3 

exponential smoothing model - polyroot function - 



<!-- --------------------------------- 5.1 ---------------------------------------------------5.1 -------- -->

## Lesson 5.1 unassigned sections

### Generalized Least Squares (GLS)

-   The autocorrelation in the data makes ordinary least squares estimation inappropriate. What caped superhero comes to our rescue? None other than Captain GLS – the indomitable Generalized Least Squares algorithm!

-   **Generalized Least Squares (GLS) Lesson 5.1**

    -   Generalized Least Squares (GLS) is a method for estimating the unknown parameters in a linear regression model.
    -   GLS is used when the errors in a regression model are correlated.
    -   GLS is used when the errors in a regression model are heteroskedastic.
    -   GLS is used when the errors in a regression model are autocorrelated.
    -   GLS is used when the errors in a regression model are non-normal.
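
A minimal sketch of a GLS fit, assuming the `nlme` package and an AR(1) error structure; the simulated data, trend slope, and `corAR1()` choice are all illustrative, not from the lesson.

```r
library(nlme)

set.seed(10)
n  <- 120
tt <- 1:n
z  <- arima.sim(model = list(ar = 0.6), n = n)   # autocorrelated (AR(1)) errors
y  <- 0.05 * tt + z                              # linear trend plus autocorrelated noise

fit <- gls(y ~ tt, correlation = corAR1(form = ~ tt))   # GLS with an AR(1) error structure
summary(fit)
```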

### Additive Seasonal Indicator Variables

-   An additive model with seasonal indicator variables can be viewed like other additive models with a seasonal component:

$$
  x_t = m_t + s_t + z_t
$$

where $$
  s_t = 
    \begin{cases}
      \beta_1, & t ~\text{falls in season}~ 1 \\
      \beta_2, & t ~\text{falls in season}~ 2 \\
      ⋮~~~~ & ~~~~~~~~~~~~⋮ \\
      \beta_s, & t ~\text{falls in season}~ s 
    \end{cases}
$$ and $s$ is the number of seasons in one cycle/period, and $n$ is the number of observations, so $t = 1, 2, \ldots, n$ and $i = 1, 2, \ldots, s$, and $z_t$ is the residual error series, which can be autocorrelated.

It is important to note that $m_t$ does not need to be a constant; it can be, for example, a linear trend in $t$.

#### Seasonal indicator variable

-   lesson5.1
    -   We will create a linear model that includes a constant term for each month. This constant monthly term is called a **seasonal indicator variable.**
    -   This name is derived from the fact that each variable indicates (either as 1 or 0) whether a given month is represented.
    -   Indicator variables are also called dummy variables.
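
A sketch of a seasonal-indicator regression, using the built-in `AirPassengers` series purely for illustration (the lesson's own data set may differ); the monthly factor supplies the indicator variables and `time()` supplies the linear trend.

```r
x     <- log(AirPassengers)        # log so the seasonal effect is roughly additive
trend <- as.numeric(time(x))       # m_t: linear trend term
month <- factor(cycle(x))          # s_t: seasonal indicator (dummy) variables, 1 = Jan, ..., 12 = Dec
fit   <- lm(x ~ trend + month)     # OLS here; gls() could be substituted if z_t is autocorrelated
head(coef(fit))                    # intercept, trend slope, and the first seasonal effects
```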

# Definitions

Stochastic processes: random processes that evolve over time. They are characterized by the fact that future values of the process cannot be predicted with certainty from past values. The random walk is a classic example of a stochastic process.

Discrete white noise (DWN): a sequence of uncorrelated random variables with a mean of zero and a constant variance. The autocorrelation function of white noise is zero for all lags except when the lag is zero.

Type I Errors: 4.1 Suppose we will conduct a hypothesis test with a level of significance equal to $\alpha = 0.05$. If the null hypothesis is true, there is a probability of 0.05 that we will reject the null hypothesis. Due to sampling variation, we will reject a true null hypothesis 5% of the time. We refer to this as making a **Type I Error**.

Random Walk: a stochastic process in which the difference between successive observations is a white noise process. A random walk is a non-stationary time series: it has a stochastic trend and a unit root, its mean is constant, but its variance increases with time and its autocorrelations start near one and decay slowly with the lag.

Let $\{x_t\}$ be a time series. Then, $\{x_t\}$ is a **random walk** if it can be expressed as $$
  x_{t} = x_{t-1} + w_{t}
$$ where $\{w_t\}$ is a random process.

The value $x_t$ can be considered as the cumulative summation of the first $t$ values of the $w_t$ series. In many cases, $w_t$ is a discrete white noise series, and it is often modeled as a Gaussian white noise series. However, $w_t$ could be as simple as a coin toss.


Absolute Value in the Complex Plane

::: {.callout-note icon="false" title="Definition of the Absolute Value in the Complex Plane"}
Let $z = a+bi$ be any complex number. It can be represented by the point $(a,b)$ in the complex plane. We define the absolute value of $z$ as the distance from the origin to the point:

$$
  |z| = \sqrt{a^2 + b^2}
$$
:::




**BOOK 1.4.3** 

An interesting view on time series analysis:

However, it is important to realise that *correlation* does not imply *causation*. In this case, it is not plausible that higher numbers of air passengers in the United States cause, or are caused by, higher electricity production in Australia. A reasonable explanation for the correlation is that the increasing prosperity and technological development in both countries over this period accounts for the increasing trends. The two time series also happen to have similar seasonal variations. For these reasons, it is usually appropriate to *remove trends and seasonal effects* before comparing multiple series. This is often achieved by working with the residuals of a regression model that has deterministic terms to represent the trend and seasonal effects (Chapter 5).


***book 1.5.2*** 

Additive vs. multiplicative models: using an additive model when some multiplicative variation is present.

If the seasonal effect tends to increase as the trend increases, a multiplicative model may be more appropriate: $x_t = m_t * s_t + z_t$. If the random variation is modelled by a multiplicative factor and the variable is positive, an additive decomposition model for $\log(x_t)$ can be used: $\log(x_t) = m_t + s_t + z_t$.
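
A sketch of the two options using base R's `decompose()` on the built-in `AirPassengers` series (chosen only because its seasonal swings grow with the trend); note that `decompose(type = "multiplicative")` treats the random term multiplicatively as well, which differs slightly from the model written above.

```r
plot(decompose(AirPassengers, type = "multiplicative"))   # multiplicative decomposition
plot(decompose(log(AirPassengers)))                       # additive decomposition of log(x_t)
```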


**1.5.2 Models** book

Models dominated by trend and seasonal effects.

As the first two examples showed, many series are dominated by a trend and/or seasonal effects, so the models in this section are based on these components. A simple additive decomposition model is given by $x_t = m_t + s_t + z_t$ ($z_t = r_t$)



**2.2.4 book Variance Function**

If we assume the model is stationary in the variance, this constant population variance, $\sigma^2$, can be estimated from the sample variance.

Eduardo Ramirez 2024©