Least Square Method in Time Series

18/04/2020 By indiafreenotes

In time series analysis we come across many variables, some of which depend on others, and it is often necessary to find the relationship between two or more of them. Least squares is a method for finding the best fit to a set of data points: it minimizes the sum of the squared residuals of the points from the fitted curve, and it gives the trend line of best fit to a time series. It is the most widely used method in time series analysis.

Method of Least Squares 

Each point on the fitted curve gives the value of the dependent variable that the model predicts for a known value of the independent variable.

In general, the least squares method fits a straight line through the given points; this is known as linear or ordinary least squares. The fitted line is called the line of best fit because it minimizes the sum of the squared distances of the points from the line.

The result is usually expressed as an equation with certain parameters. The method of least squares determines the parameter values that minimize the sum of the squared deviations, or errors, between the observed values and the values given by the equation.

The least squares method is used mostly for data fitting. The best-fit result minimizes the sum of squared errors, or residuals, which are the differences between the observed (experimental) values and the corresponding values fitted by the model. There are two basic kinds of least squares methods: ordinary (linear) least squares and nonlinear least squares.
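As a minimal illustration of ordinary least squares (a sketch, not part of the original article; the data points below are made up), the following Python snippet fits a straight line with NumPy and reports the sum of squared residuals.

    import numpy as np

    # Hypothetical data points, made up purely for illustration.
    x = np.array([1, 2, 3, 4, 5], dtype=float)
    y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

    # Ordinary least squares fit of y = a + b*x via the design matrix.
    A = np.column_stack([np.ones_like(x), x])        # columns: intercept, slope
    (a, b), *_ = np.linalg.lstsq(A, y, rcond=None)

    residuals = y - (a + b * x)
    print(f"a = {a:.3f}, b = {b:.3f}, SSE = {np.sum(residuals**2):.4f}")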

Mathematical Representation

It is a mathematical method that gives a fitted trend line for the set of data in such a manner that the following two conditions are satisfied.

  1. The sum of the deviations of the actual values of Y from the computed values of Y is zero.
  2. The sum of the squares of these deviations is the least possible.

This method gives the line of best fit. It can be used to fit either a straight-line trend or a parabolic trend.
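As a quick check of the two conditions above (a sketch, not from the original article; the data are made up), the snippet below fits a line with NumPy's polyfit, confirms that the residuals sum to essentially zero, and shows that perturbing the slope only increases the sum of squared deviations.

    import numpy as np

    x = np.array([0, 1, 2, 3, 4], dtype=float)       # hypothetical time codes
    y = np.array([1.0, 2.2, 2.9, 4.1, 5.0])          # hypothetical observations

    b, a = np.polyfit(x, y, 1)                       # least squares line Y = a + b*X
    resid = y - (a + b * x)

    print(round(resid.sum(), 10))                    # condition 1: sum of deviations ~ 0
    sse = np.sum(resid**2)
    sse_perturbed = np.sum((y - (a + (b + 0.1) * x))**2)
    print(sse < sse_perturbed)                       # condition 2: SSE is the minimum -> True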

The method of least squares as studied in time series analysis is used to find the trend line of best fit to a time series data.

Secular Trend Line

The secular trend line (Y) is defined by the following equation:

Y = a + b X

Where, Y = predicted value of the dependent variable

a = Y-axis intercept, i.e. the height of the line above the origin (when X = 0, Y = a)

b = slope of the line (the rate of change in Y for a given change in X)

When b is positive the slope is upward; when b is negative, the slope is downward.

X = independent variable (in this case it is time)

To estimate the constants a and b, the following two equations have to be solved simultaneously:

ΣY = na + b ΣX

ΣXY = aΣX + bΣX²
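To make the normal equations concrete, here is a small sketch (not from the article; the data are hypothetical) that solves them directly as a 2×2 linear system.

    import numpy as np

    x = np.array([1, 2, 3, 4, 5], dtype=float)       # hypothetical time codes
    y = np.array([3, 5, 6, 8, 11], dtype=float)      # hypothetical observations

    n = len(x)
    # Normal equations:  ΣY  = n*a  + b*ΣX
    #                    ΣXY = a*ΣX + b*ΣX²
    M = np.array([[n,       x.sum()],
                  [x.sum(), (x**2).sum()]])
    rhs = np.array([y.sum(), (x * y).sum()])
    a, b = np.linalg.solve(M, rhs)
    print(f"Y = {a:.2f} + {b:.2f} X")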

To simplify the calculations, if the midpoint of the time series is taken as the origin, the negative values of X in the first half of the series balance the positive values in the second half, so that ΣX = 0. In this case, the above two normal equations reduce to:

ΣY = na

ΣXY = bΣX²

In such a case the values of a and b can be calculated as under:

Since ΣY = na,

a = ΣY / n

Since ΣXY = bΣX²,

b = ΣXY / ΣX²
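Assuming the time codes have been centred so that ΣX = 0, these two formulas can be applied directly. The sketch below (not from the article; the series values are made up) computes a and b this way for an odd-length series.

    import numpy as np

    # Hypothetical odd-length series; time codes are centred on the middle period.
    y = np.array([12.0, 15.0, 14.0, 18.0, 21.0])
    x = np.arange(len(y)) - len(y) // 2              # -2, -1, 0, 1, 2  ->  ΣX = 0

    a = y.sum() / len(y)                             # a = ΣY / n
    b = (x * y).sum() / (x**2).sum()                 # b = ΣXY / ΣX²
    print(f"Y = {a:.2f} + {b:.2f} X")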

Example

Fit a straight-line trend to the following data using the Least Squares Method.

Period (year)  1996  1997  1998  1999  2000  2001  2002  2003  2004
Y                 4     7     7     8     9    11    13    14    17

Solution:

There are 9 observations in total, so the origin is taken at the middle year, 2000, for which X is taken as 0.

PERIOD (YEAR)   Y        X        XY        X²        REMARK
1996            4        -4       -16       16        NEGATIVE REGION
1997            7        -3       -21       9
1998            7        -2       -14       4
1999            8        -1       -8        1
2000            9        0        0         0         ORIGIN
2001            11       1        11        1         POSITIVE REGION
2002            13       2        26        4
2003            14       3        42        9
2004            17       4        68        16
Total (Σ)       ΣY = 90  ΣX = 0   ΣXY = 88  ΣX² = 60

From the table we find that n = 9, ΣY = 90, ΣX = 0, ΣXY = 88 and ΣX² = 60.

Substituting these values in the two equations:

a = 90 / 9 = 10

b = 88 / 60 ≈ 1.47

The trend equation is: Y = 10 + 1.47 X
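The result can be cross-checked with a short Python sketch (not part of the original solution) that applies the simplified formulas to the example data; np.polyfit gives the same intercept and slope.

    import numpy as np

    years = np.arange(1996, 2005)
    y = np.array([4, 7, 7, 8, 9, 11, 13, 14, 17], dtype=float)
    x = years - 2000                      # origin at 2000, so ΣX = 0

    a = y.sum() / len(y)                  # 90 / 9  = 10.0
    b = (x * y).sum() / (x**2).sum()      # 88 / 60 ≈ 1.4667
    print(f"Y = {a:.0f} + {b:.2f} X")     # Y = 10 + 1.47 X
    print(np.polyfit(x, y, 1))            # same slope and intercept, highest power first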