Applied Time Series Analysis Notebook
MATH 494R
Just testing the panel tabset.
Visual Render
\[ x_t = m_t + s_t + z_t \]
\[ x_t = m_t * s_t + z_t \]
\[ \log(x_t) = m_t + s_t + z_t \]
Source Code
$$
x_t = m_t + s_t + z_t
$$
$$
x_t = m_t * s_t + z_t
$$
$$
\log(x_t) = m_t + s_t + z_t
$$
I’m first going to work out this index file to have the whole picture; then I will branch out based on the information here. As I branch out, the information here will be compressed and will link to the lessons and samples.
AR models
AR(1) model lesson 4.3
An \(AR(1)\) model is expressed as \[ x_t = \alpha x_{t-1} + w_t \] where \(\{w_t\}\) is a white noise series with mean zero and variance \(\sigma^2\).
Stationary vs Non-Stationary TS
Look at 4.3.6 (The Difference Operator) in the book to further develop this concept, and 4.5.2 (Stationary and Non-Stationary AR Processes).
Converting non-stationary to stationary: In Chapter 4 Lesson 2, we established that taking the difference of a non-stationary time series with a stochastic trend can convert it to a stationary time series.
Stationary Time Series
- Stationarity of Linear Models Lesson 5.1
- Linear models for time series are non-stationary when they include functions of time.
- Differencing can often transform a non-stationary series with a deterministic trend to a stationary series (4.1).
- In many cases, differencing sequential terms of a non-stationary process can lead to a stationary process of differences (4.2).
Non-Stationary Time Series
- Non-Stationary Time Series Lesson 5.2
- A time series with a stochastic trend is non-stationary.
- A time series with a deterministic trend is non-stationary.
- A time series with a seasonal component is non-stationary.
- A time series with a unit root is non-stationary.
Stochastic vs Deterministic Trend
White Noise
- White noise consists of independent, identically distributed variables with mean 0, finite constant variance, and no correlation, representing purely random fluctuations over time.
Lessons
Lesson 4.1 White Noise and Random Walks
Stochastic Process 4.1
Stochastic processes, or random processes, evolve over time. They are characterized by the fact that their future values cannot be predicted with certainty from past values. The random walk is a classic example of a stochastic process.
Discrete White Noise DWN 4.1
- 4.2: Computing the difference between successive terms of a random walk leads to a discrete white noise series.
A time series \(\{w_t: t = 1, 2, \ldots, n\}\) is a discrete white noise (DWN) if the variables \(w_1, w_2, \ldots, w_n\) are independent and identically distributed with mean 0 and a finite constant variance \(\sigma^2\).
Second-Order Properties of DWN
When we refer to the second-order properties of a time series, we are talking about its variance and covariance. The variance of a DWN is constant, and the covariance between any two distinct observations is zero.
The mean is a first-order property, the covariance is a second-order property.
Discrete White Noise Process
A DWN process will have the following properties:
- The observations are discrete.
- The mean of the observations is zero.
- The variance of the observations is finite.
- Successive observations are uncorrelated.
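A minimal base-R sketch of these properties, simulating a Gaussian DWN (the series length and variance below are illustrative choices, not from the lesson):

```r
set.seed(42)                        # reproducible draws
w <- rnorm(1000, mean = 0, sd = 1)  # Gaussian white noise: iid N(0, 1)

mean(w)  # close to 0
var(w)   # close to 1: finite, constant variance
acf(w)   # sample autocorrelations near 0 at every nonzero lag
```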
Random Walk
- Random Walks Lesson 4.1
- A random walk is a stochastic process in which the difference between successive observations is a white noise process; it is a non-stationary time series. (see def)
- \(w_t\) is a DWN and is often modeled as a GWN; however, \(w_t\) could be as simple as a coin toss (random walk).
Properties of a Random Walk
First-Order Properties of a Random Walk: The mean of a random walk series is 0.
Look at the Shiny code for this.
Second-Order Properties of a Random Walk
Covariance \(\text{cov}(x_t, x_{t+k})\):
The covariance between two values of the series depends on \(t\):
\[ \text{cov}(x_t, x_{t+k}) = t \sigma^2 \]
Correlation Function \(\rho_k\):
The correlation for lag \(k\) is:
\[ \rho_k = \frac{1}{\sqrt{1 + \frac{k}{t}}} \]
Non-Stationarity:
The variance of the series increases with \(t\), making the random walk non-stationary.
Correlogram Characteristics:
The correlogram of a random walk typically shows:
- Positive autocorrelations starting near 1.
- A slow decrease as \(k\) increases.
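As an illustration of these second-order properties, a random walk can be simulated in base R as the cumulative sum of white noise; its correlogram should start near 1 and decay slowly (seed and length are arbitrary):

```r
set.seed(1)
w <- rnorm(500)  # discrete white noise
x <- cumsum(w)   # random walk: x_t = x_{t-1} + w_t

plot.ts(x)       # wandering, non-stationary path
acf(x)           # correlogram: starts near 1, decays slowly as k grows
```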
Gaussian White Noise GWN 4.1
- If the variables are normally distributed, i.e. \(w_i \sim N(0,\sigma^2)\), the DWN is called a Gaussian white noise process. The normal distribution is also known as the Gaussian distribution, after Carl Friedrich Gauss.
White Noise Time Series
- White Noise Time Series Lesson 4.1
- A white noise time series is a sequence of random variables that are uncorrelated and have a mean of zero.
- A white noise time series has a constant variance.
- A white noise time series has a constant mean.
- A white noise time series has an autocorrelation of zero for all lags except lag zero, where it is one.
Correlogram 4.1
- Correlogram Lesson 4.1
- A correlogram is a plot of the autocorrelation function (ACF) of a time series.
- Each lag shown in a correlogram involves its own significance test, so examining many lags inflates the chance of a Type I error and can produce misleading conclusions about significant relationships.
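A quick base-R sketch of that Type I error point: for pure white noise, roughly 5% of correlogram lags will fall outside the 95% bounds purely by chance (the seed, length, and lag count below are illustrative):

```r
set.seed(7)
w <- rnorm(1000)
r <- acf(w, lag.max = 40)  # expect roughly 2 of the 40 lags outside the dashed bounds

# Fraction of nonzero lags outside the approximate 95% bounds +/- 1.96/sqrt(n)
mean(abs(r$acf[-1]) > 1.96 / sqrt(length(w)))
```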
Fitting the White Noise Model
Backward Shift Operator
We define the backward shift operator or the lag operator, \(\mathbf{B}\), as: \[ \mathbf{B} x_t = x_{t-1} \] where \(\{x_t\}\) is any time series.
We can apply this operator repeatedly. We will use exponential notation to indicate this.
\[ \mathbf{B}^2 x_t = \mathbf{B} \mathbf{B} x_t = \mathbf{B} ( \mathbf{B} x_t ) = \mathbf{B} x_{t-1} = x_{t-2} \]
Properties of the Backshift Operator
The backwards shift operator is a linear operator. So, if \(a\), \(b\), \(c\), and \(d\) are constants, then \[ (a \mathbf{B} + b)x_t = a \mathbf{B} x_t + b x_t \] The distributive property also holds. \[\begin{align*} (a \mathbf{B} + b)(c \mathbf{B} + d) x_t &= c (a \mathbf{B} + b) \mathbf{B} x_t + d(a \mathbf{B} + b) x_t \\ &= a \mathbf{B} (c \mathbf{B} + d) x_t + b (c \mathbf{B} + d) x_t \\ &= \left( ac \mathbf{B}^2 + (ad+bc) \mathbf{B} + bd \right) x_t \\ &= ac \mathbf{B}^2 x_t + (ad+bc) \mathbf{B} x_t + (bd) x_t \end{align*}\]
Search words for Lesson 4.1
Gaussian white noise - GWN - discrete white noise - dwn - variance - covariance - correlation - correlogram - Type I error - histogram - backward shift operator - backshift operator
Lesson 4.2 White Noise and Random Walks - Part 2
Differencing a Time Series
Why do we difference a time series? Differencing a time series can help us to remove the trend and make the series stationary.
- Computing the difference between successive terms of a random walk leads to a discrete white noise series.
\[\begin{align*} x_t &= x_{t-1} + w_t \\ x_t - x_{t-1} &= w_t \end{align*}\]
In many cases, differencing sequential terms of a non-stationary process can lead to a stationary process of differences.
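A small sketch of this fact in base R: differencing a simulated random walk recovers the underlying white noise exactly (seed and length are arbitrary):

```r
set.seed(3)
w <- rnorm(300)      # the underlying white noise
x <- cumsum(w)       # random walk built from it

d <- diff(x)         # first differences: x_t - x_{t-1}
all.equal(d, w[-1])  # TRUE: differencing recovers the white noise terms
acf(d)               # no significant autocorrelations remain
```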
Correlograms & Histogram
When do we use a correlogram and what do we look for?
Correlogram
- A correlogram is a plot of the autocorrelation function (ACF) of a time series. The ACF is a measure of the correlation between the time series and a lagged version of itself.
- Notice that the values in the correlogram of the stock prices start at 1 and slowly decay as k increases. There are no significant autocorrelations in the differenced values. This is exactly what we would expect from a random walk.
Histogram
- Figure 5 is a histogram of the differences. This is a simple measure of volatility of the stock, or in other words, how much the price changes in a day.
Difference Operator
- Differencing a non-stationary time series often leads to a stationary series, so we define a formal operator, the difference operator, to express this process: \[ \nabla x_t = x_t - x_{t-1} = (1 - \mathbf{B}) x_t \]
Things to do: Do the Excel workout and link to it here on the website; this lesson will be another tab. Maybe add the option to download the Excel sheet.
Computing Differences (small group activity): The difference operator can be helpful in identifying the functional underpinnings of a trend. If a function is linear, then the first differences of equally-spaced values will be constant. If a function is quadratic, then the second differences of equally-spaced values will be constant. If a function is cubic, then the third differences of equally-spaced values will be constant, and so on. (See the R sketch after the list below.)
Differencing Stock Prices: do this group activity.
Integrated Autoregressive Model: do this group activity.
Class Activity: Random Walk Drift: do this class activity.
Linear vs Quadratic vs Cubic Functions
This goes at the top, not in lessons
- Linear Function
- If a function is linear, then the first differences of equally-spaced values will be constant. (4.2)
- Quadratic Function
- If a function is quadratic, then the second differences of equally-spaced values will be constant. (4.2)
- Cubic Function
- If a function is cubic, then the third differences of equally-spaced values will be constant, and so on. (4.2)
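A minimal check of these statements with `diff()` and its `differences` argument (the particular polynomials are made up for illustration):

```r
t <- 1:10
diff(2 * t + 3)              # linear: first differences are all 2
diff(t^2, differences = 2)   # quadratic: second differences are all 2
diff(t^3, differences = 3)   # cubic: third differences are all 6
```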
Search words for Lesson 4.2
correlogram - successive stock prices
Lesson 4.3 Autoregressive (AR) Models
Properties of an AR(p) Stochastic Process
Autoregressive Properties of an AR model
- The mean of an AR model is a constant.
- The variance of an AR model is finite.
- The covariance of an AR model is a function of the lag.
- The autocorrelation of an AR model is a function of the lag.
Exploring AR(1) Models
Definition: Recall that an \(AR(p)\) model is of the form \[ x_t = \alpha_1 x_{t-1} + \alpha_2 x_{t-2} + \alpha_3 x_{t-3} + \cdots + \alpha_{p-1} x_{t-(p-1)} + \alpha_p x_{t-p} + w_t \] So, an \(AR(1)\) model is expressed as \[ x_t = \alpha x_{t-1} + w_t \] where \(\{w_t\}\) is a white noise series with mean zero and variance \(\sigma^2\).
Second-Order Properties of an AR(1) Model
Correlogram of an AR(1) Model
- The autocorrelation function of an \(AR(1)\) model is a function of the lag.
Things to do
- Do the group activity: Simulation of an AR(1) process (a minimal sketch follows below).
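A minimal base-R sketch of that simulation using `arima.sim()`; the coefficient \(\alpha = 0.7\), seed, and series length are arbitrary illustrative choices:

```r
set.seed(11)
alpha <- 0.7                                       # AR(1) coefficient; |alpha| < 1
x <- arima.sim(model = list(ar = alpha), n = 500)  # simulate x_t = alpha*x_{t-1} + w_t

plot.ts(x)  # stationary-looking, mean-reverting path
acf(x)      # autocorrelations decay geometrically: rho_k = alpha^k
```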
Partial Autocorrelation
For example, the partial correlation for lag 4 is the correlation not explained by lags 1, 2, or 3.
Partial Autocorrelation Plots of Various AR(p) Processes
Look at the lesson's Shiny code for this; a minimal `pacf()` sketch follows.
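A hedged sketch comparing partial autocorrelation plots of simulated AR(1) and AR(2) processes: for an AR(p) process, the pacf should cut off after lag \(p\) (the coefficients are illustrative):

```r
set.seed(5)
x1 <- arima.sim(model = list(ar = 0.6), n = 500)          # AR(1)
x2 <- arima.sim(model = list(ar = c(0.5, 0.3)), n = 500)  # AR(2)

pacf(x1)  # one significant spike, at lag 1
pacf(x2)  # significant spikes at lags 1 and 2, then a cutoff
```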
Stationary and Non-Stationary AR Processes
Absolute Value in the Complex Plane
Check this section: We will now practice assessing whether an AR process is stationary using the characteristic equation.
co-pilot notes
- Stationary and Non-Stationary AR Processes Lesson 4.3
- An AR process will be stationary if the absolute value of the solutions of the characteristic equation are all strictly greater than 1.
- The characteristic polynomial of an AR process is \(\theta_p(\mathbf{B}) = 1 - \alpha_1 \mathbf{B} - \alpha_2 \mathbf{B}^2 - \cdots - \alpha_p \mathbf{B}^p\); the characteristic equation sets this polynomial equal to zero, treating \(\mathbf{B}\) as a variable.
- The roots of the characteristic polynomial are the solutions of the characteristic equation.
- The absolute value of the roots of the characteristic polynomial must be greater than 1 for the AR process to be stationary.
co-pilot notes end
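A minimal sketch of that stationarity check with R's `polyroot()`, using an invented AR(2) process \(x_t = 0.5 x_{t-1} + 0.3 x_{t-2} + w_t\); `Mod()` returns the absolute value of the (possibly complex) roots:

```r
# Characteristic polynomial 1 - 0.5 B - 0.3 B^2, coefficients in increasing powers
roots <- polyroot(c(1, -0.5, -0.3))
Mod(roots)           # absolute values of the roots in the complex plane
all(Mod(roots) > 1)  # TRUE: this AR(2) process is stationary
```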
Questions
- What is an exponential smoothing model?
Search words for Lesson 4.3
exponential smoothing model - polyroot function
Lesson 5.1 Unassigned Sections
Generalized Least Squares (GLS)
The autocorrelation in the data makes ordinary least squares estimation inappropriate. What caped superhero comes to our rescue? None other than Captain GLS – the indomitable Generalized Least Squares algorithm!
Generalized Least Squares (GLS) Lesson 5.1
- Generalized Least Squares (GLS) is a method for estimating the unknown parameters in a linear regression model.
- GLS is used when the errors in a regression model are correlated.
- GLS is used when the errors in a regression model are heteroskedastic.
- GLS is used when the errors in a regression model are autocorrelated.
- GLS addresses correlated or heteroskedastic errors; it is not a remedy for non-normal errors.
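As a hedged sketch (assuming the `gls()` function from the `nlme` package, which is a common choice for this), a regression with AR(1) errors might be fit as follows; the data and formula are invented for illustration:

```r
library(nlme)

# Simulated regression with AR(1) errors (invented data, not from the lesson)
set.seed(9)
t <- 1:100
z <- arima.sim(model = list(ar = 0.6), n = 100)  # autocorrelated error series
x <- 2 + 0.5 * t + z

# OLS would ignore the autocorrelation; gls() models it via corAR1
fit <- gls(x ~ t, correlation = corAR1(form = ~ t))
summary(fit)
```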
Additive Seasonal Indicator Variables
- An additive model with seasonal indicator variables can be perceived similarly to other additive models with a seasonal component:
\[ x_t = m_t + s_t + z_t \]
where \[ s_t = \begin{cases} \beta_1, & t ~\text{falls in season}~ 1 \\ \beta_2, & t ~\text{falls in season}~ 2 \\ \vdots & \vdots \\ \beta_s, & t ~\text{falls in season}~ s \end{cases} \] and \(s\) is the number of seasons in one cycle/period, \(n\) is the number of observations, so \(t = 1, 2, \ldots, n\) and \(i = 1, 2, \ldots, s\), and \(z_t\) is the residual error series, which can be autocorrelated.
It is important to note that \(m_t\) does not need to be a constant; it can be, for example, a linear trend in \(t\).
Seasonal indicator variable
- Lesson 5.1
- We will create a linear model that includes a constant term for each month. This constant monthly term is called a seasonal indicator variable.
- This name is derived from the fact that each variable indicates (either as 1 or 0) whether a given month is represented.
- Indicator variables are also called dummy variables.
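A minimal sketch of seasonal indicator variables in base R: `factor()` plus `lm()` creates one 0/1 dummy per season (the monthly layout and the added linear trend term are illustrative assumptions):

```r
set.seed(2)
n <- 120                                    # ten years of monthly data (illustrative)
t <- 1:n
month <- factor(rep(1:12, length.out = n))  # seasonal indicator (dummy) variable
seas <- rnorm(12)                           # true seasonal effects beta_1..beta_12
x <- 0.05 * t + seas[month] + rnorm(n, sd = 0.5)

# "0 +" drops the intercept so every month gets its own coefficient beta_i
fit <- lm(x ~ 0 + month + t)
coef(fit)
```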
Definitions
Stochastic processes: random processes that evolve over time. They are characterized by the fact that their future values cannot be predicted with certainty from past values. The random walk is a classic example of a stochastic process.
Discrete white noise DWN: is a sequence of uncorrelated random variables with a mean of zero and a constant variance. The autocorrelation function of white noise is zero for all lags except when the lag is zero.
Type I Errors: 4.1 Suppose we will conduct a hypothesis test with a level of significance equal to \(\alpha = 0.05\). If the null hypothesis is true, there is a probability of 0.05 that we will reject the null hypothesis. Due to sampling variation, we will reject a true null hypothesis 5% of the time. We refer to this as making a Type I Error.
Random Walk: a stochastic process in which the difference between successive observations is a white noise process. A random walk is a non-stationary time series: it has a stochastic trend and a unit root; its mean is constant, but its variance increases over time, and its autocorrelations stay close to one and decay only slowly with the lag.
Let \(\{x_t\}\) be a time series. Then, \(\{x_t\}\) is a random walk if it can be expressed as \[ x_{t} = x_{t-1} + w_{t} \] where \(\{w_t\}\) is a random process.
The value \(x_t\) can be considered as the cumulative summation of the first \(t\) values of the \(w_t\) series. In many cases, \(w_t\) is a discrete white noise series, and it is often modeled as a Gaussian white noise series. However, \(w_t\) could be as simple as a coin toss.
Absolute Value in the Complex Plane
BOOK 1.4.3
Interesting view for time series analysis
However, it is important to realise that correlation does not imply causation. In this case, it is not plausible that higher numbers of air passengers in the United States cause, or are caused by, higher electricity production in Australia. A reasonable explanation for the correlation is that the increasing prosperity and technological development in both countries over this period accounts for the increasing trends. The two time series also happen to have similar seasonal variations. For these reasons, it is usually appropriate to remove trends and seasonal effects before comparing multiple series. This is often achieved by working with the residuals of a regression model that has deterministic terms to represent the trend and seasonal effects (Chapter 5).
Book 1.5.2
Additive models when some multiplicative variation is present
Additive vs Multiplicative Models
If the seasonal effect tends to increase as the trend increases, a multiplicative model may be more appropriate: \(x_t = m_t * s_t + z_t\). If the random variation is also modelled by a multiplicative factor and the variable is positive, an additive decomposition model for \(\log(x_t)\) can be used: \(\log(x_t) = m_t + s_t + z_t\).
1.5.2 Models book
Models dominated by trend and seasonal effects.
As the first two examples showed, many series are dominated by a trend and/or seasonal effects, so the models in this section are based on these components. A simple additive decomposition model is given by \(x_t = m_t + s_t + z_t\) (where \(z_t = r_t\)).
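A brief sketch of this model with base R's `decompose()`; using `log(AirPassengers)` (a built-in series) ties in the \(\log(x_t) = m_t + s_t + z_t\) form from above:

```r
# Additive decomposition of log(AirPassengers): log(x_t) = m_t + s_t + z_t
dec <- decompose(log(AirPassengers), type = "additive")
plot(dec)  # panels: observed, trend m_t, seasonal s_t, random z_t
```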
2.2.4 book Variance Function
If we assume the model is stationary in the variance, this constant population variance, \(\sigma^2\), can be estimated from the sample variance: \[ \hat{\sigma}^2 = \frac{1}{n-1} \sum_{t=1}^{n} \left( x_t - \bar{x} \right)^2 \]