## 5.3 Dickey-Fuller and Augmented Dickey-Fuller tests

### 5.3.1 Dickey-Fuller test

The Dickey-Fuller test tests whether \(\phi=1\) (a unit root) in this model of the data:
\[y_t = \alpha + \beta t + \phi y_{t-1} + e_t\]
which is rewritten as
\[\Delta y_t = y_t-y_{t-1}= \alpha + \beta t + \gamma y_{t-1} + e_t\]
where \(y_t\) is your data and \(\gamma = \phi - 1\). It is written this way so we can do a linear regression of \(\Delta y_t\) against \(t\) and \(y_{t-1}\) and test if \(\gamma\) is different from 0. If \(\gamma=0\) (that is, \(\phi=1\)), then we have a random walk process. If not and \(-1<1+\gamma<1\), then we have a stationary process.
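The regression described above can be sketched in base R. This is a minimal illustration, not the implementation used by any package: the helper `df_regression()` and the two simulated series are assumptions made for the example. Keep in mind that the t-value `lm()` reports for the lagged term has to be compared against Dickey-Fuller critical values, not the usual t distribution.

```r
# Hypothetical helper: fit the Dickey-Fuller regression
# Delta y_t = alpha + beta * t + gamma * y_{t-1} + e_t with lm()
df_regression <- function(y) {
  n <- length(y)
  dy <- diff(y)     # Delta y_t for t = 2, ..., n
  ylag <- y[-n]     # y_{t-1}
  tt <- 2:n         # time index
  lm(dy ~ tt + ylag)
}

set.seed(123)
rw <- cumsum(rnorm(200))                               # random walk, phi = 1
ar1 <- as.numeric(arima.sim(list(ar = 0.5), n = 200))  # stationary AR(1), phi = 0.5

gamma_rw <- unname(coef(df_regression(rw))["ylag"])
gamma_ar1 <- unname(coef(df_regression(ar1))["ylag"])

# The implied AR coefficient is phi = 1 + gamma
c(rw = 1 + gamma_rw, ar1 = 1 + gamma_ar1)
```

For the random walk the estimated \(\gamma\) should be close to 0, while for the stationary AR(1) series it should be clearly negative.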

### 5.3.2 Augmented Dickey-Fuller test

The Augmented Dickey-Fuller test allows for higher-order autoregressive processes by including lagged differences \(\Delta y_{t-1}, \dots, \Delta y_{t-p}\) in the model. But our test is still whether \(\gamma = 0\).
\[\Delta y_t = \alpha + \beta t + \gamma y_{t-1} + \delta_1 \Delta y_{t-1} + \delta_2 \Delta y_{t-2} + \dots + \delta_p \Delta y_{t-p} + e_t\]
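As a rough sketch of how the augmented regression can be set up by hand, the base R code below builds the lagged differences with `embed()` and fits the model with `lm()`. The helper name `adf_regression()`, the column labels, and the simulated series are illustrative assumptions.

```r
# Hypothetical helper: ADF regression with p lagged differences
adf_regression <- function(y, p) {
  n <- length(y)
  dy <- diff(y)                    # Delta y_t, length n - 1
  # embed() puts Delta y_t in column 1 and its lags 1..p in columns 2..(p + 1)
  D <- embed(dy, p + 1)
  dy_t <- D[, 1]
  ylag <- y[(p + 1):(n - 1)]       # y_{t-1}, aligned with dy_t
  tt <- (p + 2):n                  # time index of each row
  if (p > 0) {
    dlags <- D[, -1, drop = FALSE] # Delta y_{t-1}, ..., Delta y_{t-p}
    colnames(dlags) <- paste0("dlag", 1:p)
    lm(dy_t ~ tt + ylag + dlags)
  } else {
    lm(dy_t ~ tt + ylag)
  }
}

set.seed(1)
y <- cumsum(rnorm(200))            # simulated random walk (illustrative)
fit <- adf_regression(y, p = 2)

# Estimate and t-value for gamma (the coefficient on y_{t-1})
coef(summary(fit))["ylag", c("Estimate", "t value")]
```

Each added lag costs one observation at the start of the series, which is one reason to keep `p` modest for short time series.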

The null hypothesis for both tests is that the data are non-stationary. We want to REJECT the null hypothesis for this test, so we want a p-value of less than 0.05 (or a smaller threshold).

### 5.3.3 ADF test using `adf.test()`

The `adf.test()` function from the **tseries** package will do an Augmented Dickey-Fuller test (Dickey-Fuller if we set lags equal to 0) with a trend and an intercept. Use `?adf.test` to read about this function. The function is

`adf.test(x, alternative = c("stationary", "explosive"), k = trunc((length(x)-1)^(1/3)))`

`x` are your data. `alternative="stationary"` means that \(-2<\gamma<0\) (\(-1<\phi<1\)) and `alternative="explosive"` means that \(\gamma\) is outside these bounds. `k` is the number of \(\delta\) lags. For a Dickey-Fuller test, so only up to AR(1) time dependency in our stationary process, we set `k=0` so we have no \(\delta\)’s in our test. Being able to control the lags in our test allows us to avoid a stationarity test that is too complex to be supported by our data.
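To get a feel for the default `k`, the formula from the function signature can be evaluated for a few series lengths. This is a small sketch; the helper name `default_k()` and the chosen lengths are illustrative.

```r
# Default lag order, copied from the adf.test() signature shown above
default_k <- function(n) trunc((n - 1)^(1/3))

n <- c(30, 100, 500, 1000)
data.frame(n = n, k = default_k(n))
```

The default grows slowly with the length of the series: 3 lags for 30 observations, 4 for 100, 7 for 500 and 9 for 1000.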

#### 5.3.3.1 Test on white noise

Let’s start by doing the test on data that we know are stationary: white noise. We will use an Augmented Dickey-Fuller test where we use the default number of lags (amount of time-dependency) in our test. For a time series of length 100, this is 4.

```r
TT <- 100
wn <- rnorm(TT) # white noise
tseries::adf.test(wn)
```

```
Warning in tseries::adf.test(wn): p-value smaller than printed p-value

	Augmented Dickey-Fuller Test

data:  wn
Dickey-Fuller = -4.8309, Lag order = 4, p-value = 0.01
alternative hypothesis: stationary
```

The null hypothesis is rejected.

Try a Dickey-Fuller test. Here the stationary alternative has only AR(1) time dependency, whereas with the default `k` the alternative allowed higher-order time dependency (4 lagged differences). In both cases the null hypothesis is non-stationarity.

`tseries::adf.test(wn, k = 0)`

```
Warning in tseries::adf.test(wn, k = 0): p-value smaller than printed p-value

	Augmented Dickey-Fuller Test

data:  wn
Dickey-Fuller = -10.122, Lag order = 0, p-value = 0.01
alternative hypothesis: stationary
```

Notice that the test statistic is smaller (more negative). This is a more restrictive test, and we could reject the null at a more stringent significance level.

#### 5.3.3.2 Test on white noise with trend

Try the test on white noise with a trend and intercept.

```r
intercept <- 1
wnt <- wn + 1:TT + intercept
tseries::adf.test(wnt)
```

```
Warning in tseries::adf.test(wnt): p-value smaller than printed p-value

	Augmented Dickey-Fuller Test

data:  wnt
Dickey-Fuller = -4.8309, Lag order = 4, p-value = 0.01
alternative hypothesis: stationary
```

The null hypothesis is still rejected. `adf.test()` uses a model that allows an intercept and trend.
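The test statistic printed for the series with a trend is identical to the one for plain white noise (-4.8309). That is expected: when the regression already contains an intercept and a linear trend, adding an intercept and linear trend to the data does not change the estimate of \(\gamma\) or its t-value. The base R sketch below checks this with a hand-built regression; `adf_stat()` is a hypothetical helper written for this illustration, and the seed is arbitrary.

```r
# Hypothetical helper: t-value on y_{t-1} in a regression with
# intercept, linear trend and k lagged differences
adf_stat <- function(y, k) {
  n <- length(y)
  D <- embed(diff(y), k + 1)       # column 1: Delta y_t; others: its lags
  dy_t <- D[, 1]
  ylag <- y[(k + 1):(n - 1)]       # y_{t-1}, aligned with dy_t
  tt <- (k + 2):n                  # time index
  fit <- if (k > 0) {
    lm(dy_t ~ tt + ylag + D[, -1, drop = FALSE])
  } else {
    lm(dy_t ~ tt + ylag)
  }
  coef(summary(fit))["ylag", "t value"]
}

set.seed(1)
TT <- 100
wn <- rnorm(TT)                    # white noise
wnt <- wn + 1:TT + 1               # same noise plus trend and intercept

stat_wn <- adf_stat(wn, k = 4)
stat_wnt <- adf_stat(wnt, k = 4)
c(wn = stat_wn, wnt = stat_wnt)    # the two values agree up to rounding error
```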

#### 5.3.3.3 Test on random walk

Let’s try the test on a random walk (nonstationary).

```r
rw <- cumsum(rnorm(TT)) # random walk: cumulative sum of white noise
tseries::adf.test(rw)
```

```

	Augmented Dickey-Fuller Test

data:  rw
Dickey-Fuller = -2.3038, Lag order = 4, p-value = 0.4508
alternative hypothesis: stationary
```

The null hypothesis is NOT rejected as the p-value is greater than 0.05.

Try a Dickey-Fuller test.

`tseries::adf.test(rw, k = 0)`

```

	Augmented Dickey-Fuller Test

data:  rw
Dickey-Fuller = -1.7921, Lag order = 0, p-value = 0.6627
alternative hypothesis: stationary
```

Notice that the test statistic is larger (less negative) and the null hypothesis is again not rejected.

#### 5.3.3.4 Test the anchovy data

`tseries::adf.test(anchovyts)`

```

	Augmented Dickey-Fuller Test

data:  anchovyts
Dickey-Fuller = -1.6851, Lag order = 2, p-value = 0.6923
alternative hypothesis: stationary
```

The p-value is greater than 0.05. We cannot reject the null hypothesis. The null hypothesis is that the data are non-stationary.

### 5.3.4 ADF test using `ur.df()`

The `ur.df()` Augmented Dickey-Fuller test in the **urca** package gives us a bit more information on and control over the test.

`ur.df(y, type = c("none", "drift", "trend"), lags = 1, selectlags = c("Fixed", "AIC", "BIC")) `

The `ur.df()` function allows us to specify whether to test stationarity around a zero-mean with no trend, around a non-zero mean with no trend, or around a trend with an intercept. This can be useful when we know that our data have no trend, for example if you have removed the trend already. `ur.df()` allows us to specify the lags or select them using model selection.
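To make the three `type` options concrete, here is a hedged base R sketch of the regressions they correspond to when no lagged differences are included. The variable names `z.diff`, `z.lag.1` and `tt` follow the labels that `ur.df()` prints in its summary; the simulated data, the seed and the hand-built `lm()` calls are assumptions for illustration, not the package's code.

```r
set.seed(42)
y <- rnorm(100)          # simulated white noise (illustrative)
n <- length(y)
z.diff <- diff(y)        # Delta y_t
z.lag.1 <- y[-n]         # y_{t-1}
tt <- 2:n                # time index

fits <- list(
  none  = lm(z.diff ~ z.lag.1 - 1),       # zero mean, no trend
  drift = lm(z.diff ~ z.lag.1 + 1),       # non-zero mean, no trend
  trend = lm(z.diff ~ z.lag.1 + 1 + tt)   # intercept and linear trend
)

# t-value on z.lag.1 under each specification
stats <- sapply(fits, function(f) coef(summary(f))["z.lag.1", "t value"])
stats
```

The t-value on `z.lag.1` plays the role of the test statistic in each case, but the critical values it is compared against differ between the three specifications.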

#### 5.3.4.1 Test on white noise

Let’s first do the test on data we know are stationary: white noise. We have to choose the `type` and `lags`. If you have no particular reason to not include an intercept and trend, then use `type="trend"`. This allows both intercept and trend. When might you have a particular reason not to use `"trend"`? When you have removed the trend and/or intercept.

Next you need to choose the `lags`. We will use `lags=0` to do the Dickey-Fuller test. Note the number of lags you can test will depend on the amount of data that you have. `adf.test()` used a default of `trunc((length(x)-1)^(1/3))` for the lags, but `ur.df()` requires that you pass in a value or use a fixed default of 1.

`lags=0` is fitting the following model to the data:

`z.diff = gamma * z.lag.1 + intercept + trend * tt`

`z.diff` means \(\Delta y_t\) and `z.lag.1` is \(y_{t-1}\). You are testing if the effect for `z.lag.1` is 0.

When you use `summary()` for the output from `ur.df()`, you will see the estimated values for \(\gamma\) (denoted `z.lag.1`), intercept and trend. If you see `***` or `**` on the coefficients list for `z.lag.1`, it suggests that the effect of `z.lag.1` is significantly different from 0 and this supports the assumption of stationarity. However, the significance level shown assumes independent data, not time series data. The correct test levels (critical values) are shown at the bottom of the summary output.

```r
wn <- rnorm(TT)
test <- urca::ur.df(wn, type = "trend", lags = 0)
urca::summary(test)
```

```

############################################### 
# Augmented Dickey-Fuller Test Unit Root Test # 
############################################### 

Test regression trend 


Call:
lm(formula = z.diff ~ z.lag.1 + 1 + tt)

Residuals:
    Min      1Q  Median      3Q     Max 
-2.2170 -0.6654 -0.1210  0.5311  2.6277 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept)  0.0776865  0.2037709   0.381    0.704    
z.lag.1     -1.0797598  0.1014244 -10.646   <2e-16 ***
tt           0.0004891  0.0035321   0.138    0.890    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.004 on 96 degrees of freedom
Multiple R-squared:  0.5416,	Adjusted R-squared:  0.532 
F-statistic: 56.71 on 2 and 96 DF,  p-value: < 2.2e-16


Value of test-statistic is: -10.646 37.806 56.7083 

Critical values for test statistics: 
      1pct  5pct 10pct
tau3 -4.04 -3.45 -3.15
phi2  6.50  4.88  4.16
phi3  8.73  6.49  5.47
```

Note `urca::` in front of `summary()` is needed if you have not loaded the urca package with `library(urca)`.

We need to look at the information at the bottom of the summary output for the test statistics and critical values. That is the part that looks like this:

```
Value of test-statistic is: #1 #2 #3

Critical values for test statistics: 
      1pct  5pct 10pct
tau3   xxx   xxx   xxx
...
```

The first test statistic number is for \(\gamma=0\) and will be labeled `tau1`, `tau2` or `tau3`, depending on whether `type` is `"none"`, `"drift"` or `"trend"`.

In our example with white noise, notice that the test statistic (-10.646) is LESS than (more negative than) the critical value for `tau3` at 5 percent (-3.45). This means the null hypothesis is rejected at \(\alpha=0.05\), a standard level for significance testing.
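The comparison can be written out as a small sketch using the numbers from the summary output above. The object names are illustrative. If you are working with the fitted object itself, the same numbers should be available in its `teststat` and `cval` slots, but treat that as an assumption to confirm in the urca documentation.

```r
# Values copied from the white noise summary output above
tau3_stat <- -10.646
tau3_crit <- c("1pct" = -4.04, "5pct" = -3.45, "10pct" = -3.15)

# Reject the null of non-stationarity when the statistic is
# below (more negative than) the critical value
reject <- tau3_stat < tau3_crit
reject
```

Here the null is rejected at all three levels, since -10.646 is far below even the 1 percent critical value.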

#### 5.3.4.2 When you might want to use `ur.df()`

If you remove the trend (and/or level) from your data, the `ur.df()` test allows you to increase the power of the test by removing the trend and/or level from the model (`type="drift"` or `type="none"`).