# A Complete Introduction To Time Series Analysis (with R):: Durbin-Levinson Algorithm

In the last article, we saw how we could find the form of the best linear predictor of X_{n+h} using all previous observations up to n has the form

where the coefficients satisfy

We also saw that in practice it is very cumbersome to invert the Gamma matrix on the left. Therefore, we would like to have a way to find the coefficients without actually having to inverse the matrix by hand.

Dubin-Levinson Algorithm

Let {X_{t}} be a stationary time series and zero-mean , then for n=0,1,.. , the Dubin-Levinson algorithm computes coefficients

Let

Then, the Dubinson algorithm is given by

for n=2,… Although it might seem a bit complex, it turns out that by executing this algorithm, we recursively obtain the coefficients of the BLP at each step, for as many steps as we need. These coefficients are produced along the process! Further, notice that althugh we can only predict one step ahead at the time,

we can actually use these predictions to compute further observations, recursively. Pretty cool eh! Since we don’t need to invert any matrix, it turns out that this algorithm runs in O(n²) time, one polynomial level below the naive matrix inversion! Further, we also obtain the MSE of the prediction at each time step, as

Proof of correctness

This part is actually quite tedious, and not for the weak of the heart. I leave it here for the mathematically inclined, but else, feel free to skip to the “Partial autocorrelation function” section. First, let’s define

That is, the first one is just the Gamma matrix scaled by the variance (or zero-covariance). We also define to the right two vectors: one that is a collection of all correlations from 1 to n, and the other one which is the same vector, but backwards. Now note that

The first equation holds from what we have previously derived in the last article, that is

. Here, we simply give another name to the coefficients. Let’s now examine why this algorithm works, by using mathematical induction.

Indeed, this is the correct coefficient, as we had previously shown that

minimizes the MSE when using only the last observation.

Note that we essentially want to show that this linear combination correctly produces the series of partial autocorrelations , which are the coefficients of the best linear predictor!

Let’s see what’s going on here:

1. First, we simply put the definition of the matrix, and realize that its elements are precisely given that way (remember that this is a symmetric matrix!), and also given to the recursive step. The rightmost vector is given by the definitions of the algorithm.
2. We the proceed to perrom matrix multiplication like in the second step (you can verify this).
3. In the third step, we distribute the terms, and realize that the first term in the equation can be replaced by definition , so that we can re-arange terms and factorize in both upper and lower entries.
4. In the last part, we see that the second term of the first entry is zero, by pluggin in the definition by induction.

Although this might seem very confusing at first, I highly recommend you go though it by yourself to fully understand it. Now, after the last step, we know that

Now, one thing that we have to show is that

Which follows as we have previously shown the MSE of the BLP has the vectorial form above in the last article. In the last equation, we simply factorized the ACV(0). We can use these facts for the second entry of the last vector in the equation above to obtain

Thus, using these results we have shown that

Finally, we need to show that

We can do this as

For the first two lines, we are just using the facts that we found before, and then we are simply applying the recursive definitions of the algorithm. After some algebraic manipulations, we arrive at the desired answer.

Partial Autocorrelation Function

More explicitly,

Indeed, what this is telling us is, the autocorrelation given by the last timsetmp and the first one, ignoring the timestamps in between. The interesting thing is that thanks to the Duvin-Levinson algorithm, we already obtain these coefficients, which also happen to be the PACF! Remember that very early in this series, we actually presented the PACF briefly but didn’t talk much about how it came to be. Now you know how :). Make sure to take a look back at the Classical Decomposition and Differencing articles to see how to interpret the PACF!

Note: In practice, we do not know the actual ACV values of interest. However, we can simply use the sample ACV from the data as estimates!

## How to R

We will present a very brief example of using the Durbin-Levinson algorithm. For this, we wil need the R package `FitAR` . First we construct an AR(4) model, that is, a model that has dependence on the 4 last observations, i.e.

Above, we first set a random seat, then set a vector of the true values of the AR(4) model. We then convert this into a list, and use the `arima.sim` function to simulate some data accordingly. Then, we define the sample autocorrelation function for the whole series using the built-in `acv` function finally the `FitAR::PacfDL` function to calculate the partial autocorrelation as follows:

We specify a lag of 4 because we know that our series comes from the AR(4) process. If we didn’t know, we would have to first fit a bunch of models to determine this (we will see how later). We can also compare this to the built-in function for autocorrelations:

Finally, we can visualize it with `ggPacf` :

Notice something interesting? It is not a coincidence that exactly four lags fall out of bounds, and they are proportional to the magnitude of the AR(4) model coefficients. We will see why this is the case in a later article.

Note: Example adapted from the FitAR package documentation.

## Next time

And that’s it for today! As you could see, inverting the Gamma matrix can become rather bothersome. In the next article, we will see how we can instead use a clever algorithm that will save us from doing this, and still obtaining the desired coefficients. Stay tuned, and see you next time!

## Last time

Best Linear Predictor (Part I)

## Main page

Data Scientist & Data Engineer at Cisco, Canada. McGill University CS, Stats & Linguistics graduate. Polyglot.

## More from Hair Parra

Data Scientist & Data Engineer at Cisco, Canada. McGill University CS, Stats & Linguistics graduate. Polyglot.

## Why Is 1 + 2 + 4 + 8 + … = -1?

Get the Medium app