Time Series Analysis: Utility of Time Series
9th February 2020
A time series is a series of data points indexed (or listed or graphed) in time order. Most commonly, a time series is a sequence taken at successive equally spaced points in time. Thus it is a sequence of discrete-time data. Examples of time series are heights of ocean tides, counts of sunspots, and the daily closing value of the Dow Jones Industrial Average.
Time series are very frequently plotted via line charts. Time series are used in statistics, signal processing, pattern recognition, econometrics, mathematical finance, weather forecasting, earthquake prediction, electroencephalography, control engineering, astronomy, communications engineering, and largely in any domain of applied science and engineering which involves temporal measurements.
Time series analysis comprises methods for analyzing time series data in order to extract meaningful statistics and other characteristics of the data. Time series forecasting is the use of a model to predict future values based on previously observed values. While regression analysis is often employed in such a way as to test theories that the current values of one or more independent time series affect the current value of another time series, this type of analysis of time series is not called “time series analysis”, which focuses on comparing values of a single time series or multiple dependent time series at different points in time. Interrupted time series analysis is the analysis of interventions on a single time series.
Time series data have a natural temporal ordering. This makes time series analysis distinct from cross-sectional studies, in which there is no natural ordering of the observations (e.g. explaining people’s wages by reference to their respective education levels, where the individuals’ data could be entered in any order). Time series analysis is also distinct from spatial data analysis where the observations typically relate to geographical locations (e.g. accounting for house prices by the location as well as the intrinsic characteristics of the houses). A stochastic model for a time series will generally reflect the fact that observations close together in time will be more closely related than observations further apart. In addition, time series models will often make use of the natural one-way ordering of time so that values for a given period will be expressed as deriving in some way from past values, rather than from future values.
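The claim that nearby observations are more closely related than distant ones can be checked with the sample autocorrelation. The sketch below is only an illustration on simulated data: the first-order autoregressive series, the coefficient 0.8, and the function names are assumptions chosen for the example, not taken from the text.

```python
import random


def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at a positive integer `lag`."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    numer = sum(
        (series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag)
    )
    return numer / denom


def simulate_ar1(n, phi, seed=0):
    """Simulate x[t] = phi * x[t-1] + noise with standard normal noise (assumed model)."""
    rng = random.Random(seed)
    x = [0.0]
    for _ in range(n - 1):
        x.append(phi * x[-1] + rng.gauss(0.0, 1.0))
    return x


series = simulate_ar1(n=2000, phi=0.8)
acf = {lag: autocorrelation(series, lag) for lag in (1, 5, 20)}
for lag, r in acf.items():
    print(f"lag {lag:2d}: autocorrelation = {r:.3f}")
```

For a series of this kind the autocorrelation should shrink as the lag grows, which is the pattern a stochastic time series model is expected to capture.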
Time series analysis is useful in many fields of human interest, such as business, economics, sociology, politics, and administration. It is also very useful in the physical and natural sciences. Some of its uses are briefly described below:
(i) It helps in studying the behaviour of a variable.
In a time series, the past data relating to a variable over a period of time are arranged in an orderly manner. By simple observation of such a series, one can understand the nature of the change that takes place in the variable over time. Further, by applying the technique of isolation to the series, one can study the tendency (trend) of the variable, as well as its seasonal, cyclical, and irregular or accidental changes.
(ii) It helps in forecasting
The analysis of a time series reveals how the value of a variable changes over time. This helps us forecast the future value of the variable after a certain period. Thus, with the help of such a series, we can make future plans for matters such as production, sales, and profits. This is why, in a planned economy, all plans for future development are based on the analysis of time series of the relevant data.
(iii) It helps in evaluating performance.
Evaluating actual performance against predetermined targets is necessary to judge the efficiency, or otherwise, of the progress of a given piece of work. For example, the achievements of our Five-Year Plans are evaluated by determining the annual rate of growth in the gross national product. Similarly, our policy of controlling inflation and price rises is evaluated with the help of various price indices. All of these are facilitated by analysis of the time series of the relevant variables.
(iv) It helps in making comparative study
Comparative study of data relating to two or more periods, regions, or industries reveals a lot of valuable information that guides management in taking the proper course of action for the future. A time series, per se, provides a scientific basis for comparing two or more related sets of data, because in such a series the data are arranged chronologically and the effects of its various components are gradually isolated and unraveled.
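The isolation of components mentioned under point (i) can be sketched as a classical additive decomposition. This is a minimal illustration on synthetic monthly data, not a method taken from the text: the function names, the 12-month period, and the simulated series are all assumptions. The cyclical component is not separated here; in this simple sketch it stays inside the moving-average trend.

```python
import math
import random


def centered_moving_average(series, period):
    """Trend estimate from a centered moving average.

    For an even period the two end points of the window get half weight
    (the usual "2 x period" average). Positions where the window does not
    fit are returned as None.
    """
    n = len(series)
    half = period // 2
    trend = [None] * n
    for t in range(half, n - half):
        window = series[t - half : t + half + 1]
        if period % 2 == 0:
            total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        else:
            total = sum(window)
        trend[t] = total / period
    return trend


def additive_decompose(series, period):
    """Split a series into trend + seasonal + irregular parts (additive model)."""
    trend = centered_moving_average(series, period)
    # Average the detrended values for each position within the cycle.
    buckets = [[] for _ in range(period)]
    for t, (y, tr) in enumerate(zip(series, trend)):
        if tr is not None:
            buckets[t % period].append(y - tr)
    raw = [sum(b) / len(b) for b in buckets]
    centre = sum(raw) / period
    index = [s - centre for s in raw]  # seasonal indices now sum to zero
    seasonal = [index[t % period] for t in range(len(series))]
    irregular = [
        None if tr is None else y - tr - s
        for y, tr, s in zip(series, trend, seasonal)
    ]
    return trend, seasonal, irregular


# Synthetic six-year monthly series: linear trend + yearly cycle + noise.
rng = random.Random(3)
series = [
    100 + 0.5 * t + 10 * math.sin(2 * math.pi * t / 12) + rng.gauss(0.0, 1.0)
    for t in range(72)
]
trend, seasonal, irregular = additive_decompose(series, period=12)
for t in (30, 31, 32):
    print(
        f"t={t}: value={series[t]:.1f} trend={trend[t]:.1f} "
        f"seasonal={seasonal[t]:.1f} irregular={irregular[t]:.1f}"
    )
```

The series needs to cover at least two full periods so that every position in the cycle has a detrended value to average.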
The central logical problem in forecasting is that the “cases” (that is, the time periods) which you use to make predictions never form a random sample from the same population as the time periods about which you make the predictions. This point is vividly illustrated by the 508-point plunge in the Dow Jones Industrial Average on October 19, 1987. Even in percentage terms, no one-day drop in the previous 40 years (comprising some 8000 trading days) had ever been more than a fraction of that size. Thus this drop (which occurred in the absence of any dramatic news developments) could never have been forecast just from a time-series study of stock prices over the previous 40 years, no matter how detailed. It is now widely believed that a major cause of the drop was the newly introduced, widespread use of personal computers programmed to sell stock options whenever stock prices dropped slightly; this created a snowball effect. Thus the stock market’s history of the previous 40 or even 100 years was irrelevant to predicting its volatility in October 1987, because nearly all that history had occurred in the absence of this new factor.
The same general point arises in nearly all forecasting work. If you have records of monthly sales in a department store for the last 10 years, and are asked to project those sales into the future, those statistics will not reflect the fact that as you work, a new discount store is opening a few blocks away, or the city has just changed the street in front of your store to a one-way street, making it harder for customers to reach your store.
A second problem that arises in time-series forecasting is that you rarely know the true shape of the distribution with which you are working. Workers in other areas often assume normal distributions while knowing that assumption is not fully accurate. However, such a simplification may produce more serious errors in time-series work than in other areas. In much statistical work the problem of non-normal distributions is greatly ameliorated by the fact that you are really concerned with sample means, and the Central Limit Theorem asserts that means are often approximately normally distributed even when the underlying scores are not. However, in time-series forecasting you are often concerned with just one time period–the period for which you want to forecast. Thus the Central Limit Theorem has no chance to operate, and the assumption of normality may lead to seriously wrong conclusions. Even if your forecast is just one of a series of forecasts which you update after each new time period, the forecasts are made one at a time, so that a single seriously wrong forecast may bankrupt your company or lead to your dismissal, and nobody will ever learn that your next 50 forecasts would have been within the range predicted by a normal distribution. Some stock market speculators, who had previously been quite successful, were bankrupted or driven into retirement by the stock market plunge of 1987. That’s very different from the situation in which a company hires, all at once, 50 workers identified by a competence test. If one out of the 50 is a spectacular failure, the company (and you the forecaster) will survive because at the very same time the other 49 were turning out well.
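The contrast between a single period and an average of many can be illustrated with a small simulation. The sketch below assumes, purely for illustration, that forecast errors follow a heavy-tailed Student t distribution with 3 degrees of freedom, rescaled to unit variance; nothing in the text specifies this distribution, and the names and sample sizes are likewise assumptions. It counts how often an error exceeds three standard deviations, first for single periods and then for means of 50 periods, and compares both with the frequency a normal distribution predicts.

```python
import math
import random
from statistics import NormalDist

rng = random.Random(42)
DF = 3  # degrees of freedom of the assumed heavy-tailed error distribution
SCALE = math.sqrt(DF / (DF - 2))  # standard deviation of a Student t with DF > 2


def heavy_tailed_error():
    """One unit-variance draw from a Student t with DF degrees of freedom."""
    z = rng.gauss(0.0, 1.0)
    chi2 = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(DF))
    return z / math.sqrt(chi2 / DF) / SCALE


GROUPS, GROUP_SIZE, THRESHOLD = 5000, 50, 3.0
draws = [[heavy_tailed_error() for _ in range(GROUP_SIZE)] for _ in range(GROUPS)]

# Single periods: how often is one error more than 3 standard deviations out?
singles = [x for group in draws for x in group]
single_tail = sum(abs(x) > THRESHOLD for x in singles) / len(singles)

# Means of 50 periods: the standard deviation of such a mean is 1/sqrt(50),
# so rescale each mean before comparing it with the same threshold.
means = [sum(group) / GROUP_SIZE for group in draws]
mean_tail = sum(abs(m) * math.sqrt(GROUP_SIZE) > THRESHOLD for m in means) / GROUPS

# What a normal distribution would predict for the same threshold.
normal_tail = 2 * (1 - NormalDist().cdf(THRESHOLD))

print(f"normal prediction : {normal_tail:.4f}")
print(f"single periods    : {single_tail:.4f}")
print(f"means of 50       : {mean_tail:.4f}")
```

Under this assumed distribution, extreme single-period errors should turn up several times more often than the normal prediction, while the averaged errors sit closer to it.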
Analyzing the Impact of Single Events
When you try to assess the impact of a single event, the major problem is that there are always many events occurring at any one time. Suppose you are trying to assess the effect of a new toll on bridge A on traffic across bridge B, but a new store opened near bridge B the same day the toll was introduced, permanently increasing traffic on bridge B. When critics remind us that “correlation does not imply causation”, they are mostly talking about the possible effects of overlooked variables. But in these time-series examples we are talking about the possible effects of overlooked events. It’s difficult to say which type of problem is more intractable, but they do seem to be two different types of problem.
Analyzing Causal Patterns
When scientists use time series to study the effects of one variable on another, they usually have at least two time series–one for the independent variable and one for the dependent–as in our earlier example on the relation between unemployment and crime. The problems in analyzing causal patterns are difficult but not impossible.
One problem with such research is that because the observations within each series are not independent of each other, the probability of finding a high correlation between the two series may be higher than is suggested by standard formulas. Later we describe a solution to this problem.
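The size of this problem can be seen in a short simulation. The sketch below only illustrates the problem, not the solution referred to above. It draws many pairs of series that are independent by construction and counts how often their correlation passes the usual 5% significance threshold. The random-walk model, the series length of 100, and the approximate critical value 0.197 are assumptions made for the example.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def white_noise(n, rng):
    """Independent observations: no relation between successive values."""
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def random_walk(n, rng):
    """Strongly dependent observations: each value is the previous one plus noise."""
    walk, level = [], 0.0
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)
        walk.append(level)
    return walk


def rejection_rate(make_series, n=100, trials=2000, critical_r=0.197, seed=1):
    """Share of independent pairs whose |r| exceeds the standard 5% critical value.

    critical_r = 0.197 is the approximate two-sided 5% threshold for n = 100
    when the observations within each series are independent.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        x, y = make_series(n, rng), make_series(n, rng)
        if abs(pearson(x, y)) > critical_r:
            hits += 1
    return hits / trials


noise_rate = rejection_rate(white_noise)
walk_rate = rejection_rate(random_walk)
print(f"independent observations : {noise_rate:.1%} 'significant'")
print(f"random walks             : {walk_rate:.1%} 'significant'")
```

With independent observations the rate should stay near the nominal 5%, whereas for the dependent series it should be far higher even though no pair is truly related.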
A second problem is that it is rarely reasonable to assume that the time sequence of the causal patterns matches the time periods in the study. Thus if increased unemployment typically produced an increase in crime exactly six months later but not five months later, then it would be fairly easy to discover that relationship by correlating monthly changes in unemployment with monthly changes in crime six months later. However, it is much more plausible to assume that increased unemployment in January produces a slight rise in crime during February, a further slight rise during March, and so on for several months. Such effects can be much more difficult to detect, though later we do suggest a solution to this problem.
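To see why a spread-out effect is harder to detect, consider the simulation below. It is a hypothetical setup, not the solution mentioned above: the change in `x` in each month is assumed to raise `y` slightly in each of the following four months, with equal weights of 0.25 and unit-variance noise. The sketch compares the correlation of `y` with each single lag of `x` against its correlation with the sum of the four lags.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(7)
N = 600                    # number of simulated months (assumed)
EFFECT_LAGS = (1, 2, 3, 4)  # months over which the effect is spread (assumed)
WEIGHT = 0.25              # size of the effect at each lag (assumed)

# x: monthly changes in the cause; y: monthly changes in the outcome.
x = [rng.gauss(0.0, 1.0) for _ in range(N)]
y = [
    sum(WEIGHT * x[t - k] for k in EFFECT_LAGS if t - k >= 0) + rng.gauss(0.0, 1.0)
    for t in range(N)
]

start = max(EFFECT_LAGS)  # skip the first months, which lack a full lag history
y_used = y[start:]

# Correlation of y with x at each single lag.
single_lag_r = {
    k: pearson([x[t - k] for t in range(start, N)], y_used) for k in EFFECT_LAGS
}

# Correlation of y with the accumulated change in x over all four lags.
cumulative_x = [sum(x[t - k] for k in EFFECT_LAGS) for t in range(start, N)]
combined_r = pearson(cumulative_x, y_used)

for k, r in single_lag_r.items():
    print(f"lag {k}: r = {r:.3f}")
print(f"lags 1-4 combined: r = {combined_r:.3f}")
```

Each single-lag correlation captures only a quarter of the assumed effect, so it is weak on its own; pooling the lags recovers more of the relationship.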
A third problem in analyzing causal patterns is the familiar problem that correlation does not imply causation. As in ordinary regression problems, it helps to be able to control statistically for covariates. Later we describe one way to do this in time-series problems.