Introduction:
Time Series : A time series is a sequence of values recorded at successive points in time, e.g. the daily BSE Sensex closing value, weekly sales or the monthly profit of a company.
Typically, in a time series it is assumed that the value at any given point in time is a result of its historical values. This assumption is the basis of time series analysis. The ARIMA technique exploits auto-correlation (the correlation of an observation with its lags) for forecasting.
Mathematically,
Vt = p(Vt-n) + e
This means the value (V) at time "t" is a function (p) of the value "n" periods earlier, plus an error term (e). The value at time "t" can depend on a single lag or on several lags of different orders.
Example :
Suppose Mr. X started his job in 2010 with a starting salary of $5,000 per month. He is appraised every year, and his salary reached $20,000 per month in 2014. His salary over the years can be considered a time series, and it is clear that every year's salary is a function of the previous year's salary (here the function is the appraisal rating).
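The salary example can be sketched in a few lines of Python. This is only an illustration under a strong assumption that is not in the example itself: the appraisal is taken to be the same percentage raise every year, chosen so that $5,000 in 2010 grows to $20,000 in 2014. The names `raise_factor` and `salary` are hypothetical, and the error term e is set to zero.

```python
# Assumption: a constant yearly raise factor plays the role of the function p.
start_year, end_year = 2010, 2014
start_salary, end_salary = 5000.0, 20000.0

steps = end_year - start_year
raise_factor = (end_salary / start_salary) ** (1 / steps)

salary = {start_year: start_salary}
for year in range(start_year + 1, end_year + 1):
    # V_t = p(V_{t-1}) with lag n = 1 and no error term
    salary[year] = raise_factor * salary[year - 1]

for year, value in salary.items():
    print(year, round(value, 2))
```

Each year's value is computed only from the previous year's value, which is exactly the lag-1 dependence described above.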
Components of a Time Series :
1. Trend
The series could be constantly increasing or decreasing, or first increasing for a considerable time period and then decreasing. This trend is identified and then removed from the time series in the ARIMA forecasting process.
2. Seasonality
A repeating pattern with a fixed period.
Example - Sales in festive seasons. In the US, sales of candies peak every October and sales of chocolates peak every December, because Halloween and Christmas fall in those months. The time series should be de-seasonalized in the ARIMA forecasting process.
3. Random Variation (Irregular Component / Residual)
This is the unexplained variation in the time series, which is totally random: erratic movements that are not predictable because they do not follow a pattern. It is also known as the residual.
Example - Earthquake
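To make the three components concrete, here is a minimal sketch that builds a synthetic monthly series as trend + seasonality + random variation. Everything in it is made up for illustration: the additive form, the slope of 2 per month, the December peak of 15 and the noise level are all assumptions, not figures from a real dataset.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

n_months = 36
period = 12  # assumed seasonal period: 12 months

# 1. Trend: a steadily increasing level
trend = [100 + 2 * t for t in range(n_months)]
# 2. Seasonality: a peak in every 12th month (e.g. December)
seasonal = [15 if t % period == period - 1 else 0 for t in range(n_months)]
# 3. Random variation: unpredictable noise around zero
noise = [random.gauss(0, 1) for _ in range(n_months)]

series = [tr + s + e for tr, s, e in zip(trend, seasonal, noise)]

# Removing the trend and the seasonal part leaves only the residual
residual = [y - tr - s for y, tr, s in zip(series, trend, seasonal)]
print([round(r, 2) for r in residual[:5]])
```

In practice the trend and seasonal parts are not known in advance and have to be estimated, but the idea of stripping them out to leave a residual is the same.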
Terminologies related to Time Series
1. Stationary Series
A stationary series is one whose mean and variance are constant over time.
The series has to be stationary before building a time series model with ARIMA. Most time series are non-stationary. If a series is non-stationary, we need to make it stationary with detrending, differencing etc.
Why Stationary?
To calculate the expected value, we generally take a mean across time intervals. The mean across many time intervals makes sense only when the expected value is the same across those time periods. If the mean and variance change over time, there is no point in estimating them by taking an average across time.
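As a rough illustration of why differencing helps, the sketch below builds a made-up trending series (an assumed slope of 3 per period plus noise) and compares the mean of its first and second halves before and after first differencing. The helper `difference` is a hypothetical name written for this example.

```python
import random
from statistics import mean

random.seed(1)  # fixed seed so the sketch is reproducible

# Assumed non-stationary series: linear trend plus noise
series = [5 + 3 * t + random.gauss(0, 1) for t in range(200)]

def difference(values, lag=1):
    """Return values[i] - values[i - lag] for every valid i."""
    return [values[i] - values[i - lag] for i in range(lag, len(values))]

diffed = difference(series)

# The mean of the raw series depends heavily on which half we look at
print(round(mean(series[:100]), 1), round(mean(series[100:]), 1))
# After differencing, both halves have a mean close to the slope (3)
print(round(mean(diffed[:100]), 2), round(mean(diffed[100:]), 2))
```

The raw series has no single meaningful average, while the differenced series does, which is the property ARIMA relies on.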
2. White Noise
A white noise process is one with a constant mean and variance and no correlation between its values at different times. A white noise series exhibits very erratic, jumpy, unpredictable behavior. Since the values are uncorrelated, previous values do not help us to forecast future values.
White noise series themselves are quite uninteresting from a forecasting standpoint (they are not linearly forecastable).
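A small experiment can show what "previous values do not help" means. The sketch below, under the assumption of Gaussian white noise with mean 0 and variance 1, compares two naive forecasts: always predicting the overall mean, and predicting that the next value equals the previous one. The variable names are hypothetical.

```python
import random
from statistics import mean

random.seed(2)  # fixed seed so the sketch is reproducible

noise = [random.gauss(0, 1) for _ in range(5000)]
overall_mean = mean(noise)

# Forecast A: always predict the overall mean
mse_mean = mean([(x - overall_mean) ** 2 for x in noise[1:]])
# Forecast B: predict that the next value equals the previous value
mse_prev = mean([(noise[i] - noise[i - 1]) ** 2 for i in range(1, len(noise))])

print(round(mse_mean, 3), round(mse_prev, 3))
```

Using the previous value roughly doubles the squared error compared with simply using the mean, because the previous value carries no information about the next one and only adds its own randomness.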
3. Autocorrelation
Autocorrelation refers to the correlation of a time series with its own past and future values. Autocorrelation is also sometimes called “lagged correlation” or “serial correlation”.
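One common way to compute a sample autocorrelation at a given lag is sketched below. The function name `autocorrelation` and the small repeating series are made up for this illustration; statistical packages may use slightly different normalizations.

```python
from statistics import mean

def autocorrelation(values, lag):
    """Sample autocorrelation of `values` at a positive lag."""
    avg = mean(values)
    denominator = sum((v - avg) ** 2 for v in values)
    numerator = sum(
        (values[i] - avg) * (values[i - lag] - avg)
        for i in range(lag, len(values))
    )
    return numerator / denominator

# Made-up series that repeats every 4 periods
series = [1, 3, 5, 3] * 6

for lag in (1, 2, 4):
    print(lag, round(autocorrelation(series, lag), 3))
```

For this series the correlation is strongly positive at lag 4 (the period of the pattern) and strongly negative at lag 2 (half the period), which is how autocorrelation reveals repeating structure.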
4. Random Walk
A random walk is defined as a process where the current value of a variable is composed of the past value plus an error term defined as white noise (a normal variable with zero mean and variance one). Algebraically, a random walk is represented as follows: yt = yt−1 + e
The implication of a process of this type is that the best prediction of y for the next period is the current value; in other words, the process does not allow us to predict the change (yt − yt−1). That is, the change in y is completely random. It can be shown that the mean of a random walk process is constant but its variance is not. Therefore a random walk process is non-stationary, and its variance increases with t.
[Figure: Time Series : Random Walk]
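The claim that a random walk has a constant mean but a variance that grows with t can be checked with a simulation. This is a sketch under the same assumptions as the definition above (normal errors with zero mean and variance one); the number of walks, the number of steps and the helper `across_walks` are arbitrary choices made for this example.

```python
import random
from statistics import mean, pvariance

random.seed(3)  # fixed seed so the sketch is reproducible

n_walks, n_steps = 2000, 100

walks = []
for _ in range(n_walks):
    y, path = 0.0, []
    for _ in range(n_steps):
        y = y + random.gauss(0, 1)  # y_t = y_{t-1} + e
        path.append(y)
    walks.append(path)

def across_walks(t):
    """Values of all simulated walks at time t (1-based)."""
    return [path[t - 1] for path in walks]

for t in (10, 50, 100):
    values = across_walks(t)
    print(t, round(mean(values), 2), round(pvariance(values), 2))
```

Across the simulated walks the mean stays near zero at every t, while the variance grows roughly in proportion to t.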
ARIMA (Box-Jenkins Approach)
ARIMA stands for Auto-Regressive Integrated Moving Average. It is also known as the Box-Jenkins approach. It is one of the most popular techniques used for time series analysis and forecasting.
We will cover ARIMA in a series of posts: introduction, theory and finally the process of performing ARIMA in SAS.
Coming back to ARIMA: as its full form indicates, it involves two components (the "Integrated" part refers to the differencing used to make the series stationary):
- Auto-regressive component
- Moving average component
1. Auto-regressive Component
It implies a relationship between the value of a series at a point in time and its own previous values. Such a relationship can exist with any order of lag.
Lag -
A lag is basically the value at a previous point in time. It can have various orders, as shown in the table below. It hints toward a pointed relationship.
[Figure: Time Series : Lag]
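Since lags of different orders are just shifted copies of the series, they are easy to build by hand. The sketch below uses made-up monthly values and a hypothetical helper `lag`; it is meant to mirror the kind of table described above, not to reproduce its actual numbers.

```python
values = [10, 12, 15, 14, 18, 21]  # made-up monthly values

def lag(series, k):
    """Shift `series` forward by k periods (k >= 1), padding the start with None."""
    return [None] * k + series[:-k]

lag1 = lag(values, 1)  # value one period earlier
lag2 = lag(values, 2)  # value two periods earlier

print("value, lag 1, lag 2")
for row in zip(values, lag1, lag2):
    print(row)
```

Each row pairs the current value with its lag-1 and lag-2 values; the first rows contain None because no earlier observation exists.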
2. Moving Average Component
It implies that the current deviation from the mean depends on previous deviations. Such a relationship can exist with any number of lags, which decides the order of the moving average.
Moving Average -
A moving average is the average of consecutive values at various time periods. It can have various orders, as shown in the table below. It hints toward a distributed relationship, as the moving average itself is derived from various lags.
[Figure: Moving Average Explanation]
The moving average is itself considered one of the most rudimentary methods of forecasting. If you drag the average formula in Excel one row further (beyond Dec-15), it gives you the forecast for the next month.
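The Excel trick of dragging the average formula one row further can be sketched in Python as follows. The sales figures and the function name `moving_average_forecast` are made up for illustration, and the window of 3 periods is an arbitrary choice.

```python
def moving_average_forecast(values, window):
    """Forecast the next value as the mean of the last `window` observations."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    recent = values[-window:]
    return sum(recent) / window

sales = [20, 22, 25, 24, 27, 30]  # made-up monthly sales

forecast = moving_average_forecast(sales, 3)
print(forecast)  # mean of 24, 27 and 30
```

A larger window gives a smoother but slower-reacting forecast, while a smaller window follows recent values more closely.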
The ARIMA technique uses the auto-regressive (lag based) and moving average components together to forecast a time series.
Next, we will jump directly to the ARIMA process in SAS.
Part 2 : Time Series Forecasting : ARIMA
About the Author -
This article was originally written by Rajat Agarwal; Deepanshu later gave the post its final touches. Rajat is an analytics professional with more than 8 years of work experience in diverse business domains. He has gained expert knowledge in Excel and SAS. He loves to create innovative and imaginative dashboards with Excel. He is the founder, lead author and editor at Ask Analytics.