Time Series Forecasting - ARIMA [Part 3]

This is the final part of the Time Series Forecasting - ARIMA series. If you have not yet read the previous two articles in the series, please go through them first.

1. Time Series Forecasting - ARIMA [Part 1]

2. Time Series Forecasting - ARIMA [Part 2]

In the earlier parts we checked the series for volatility and stationarity, and transformed it to make it non-volatile and stationary. We also divided the dataset into two parts: training and validation. Now we are ready to perform ARIMA modeling on the training dataset.

Next Step : Model Identification

The order of an ARIMA (autoregressive integrated moving-average) model is usually denoted ARIMA(p,d,q), which can be read as AR(p), I(d), MA(q):

p = Order of Autoregression
d = Order of differencing (number of times the data must be differenced to become stationary)
q = Order of Moving Average

Many simple time series models are special cases of the ARIMA model:

Simple Exponential Smoothing: ARIMA(0,1,1)
Holt's Exponential Smoothing: ARIMA(0,2,2)
White noise: ARIMA(0,0,0)
Random walk: ARIMA(0,1,0) with no constant
Random walk with drift: ARIMA(0,1,0) with a constant
Autoregression: ARIMA(p,0,0)
Moving average: ARIMA(0,0,q)

We can do the model identification in two ways:

1. Using ACF and PACF functions

2. Using the Minimum Information Criteria (MINIC) matrix (recommended)

Autocorrelation Function (ACF)

Autocorrelation is a correlation coefficient. However, instead of measuring the correlation between two different variables, it measures the correlation between two values of the same variable, xt and xt-h, observed h time periods (lags) apart. The ACF gives this correlation for each lag h.
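To make the definition concrete, here is a minimal sketch of the sample autocorrelation in plain Python. It is only an illustration outside SAS: the function name acf and the toy series are assumptions made for this example, and PROC ARIMA prints its own ACF as part of the IDENTIFY output.

```python
def acf(series, max_lag):
    """Sample autocorrelation r_h for h = 0..max_lag (illustrative helper)."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]          # deviations from the mean
    c0 = sum(d * d for d in dev)              # lag-0 sum of squares
    if c0 == 0:
        raise ValueError("constant series: autocorrelation is undefined")
    # r_h = sum_t dev[t] * dev[t-h] / c0, pairing each value with the one h steps back
    return [
        sum(dev[t] * dev[t - h] for t in range(h, n)) / c0
        for h in range(max_lag + 1)
    ]


toy = [1.0, 2.0, 3.0, 4.0, 5.0]               # toy data, not the airline series
r = acf(toy, 2)
print(r)
```

The lag-0 value is always 1, because a series is perfectly correlated with itself.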

Partial Autocorrelation Function (PACF)

For a time series, the partial autocorrelation between xt and xt-h is defined as the conditional correlation between xt and xt-h, conditional on xt-h+1, ... , xt-1, the set of observations that come between the time points t−h and t. In other words, it is the correlation that remains at lag h after the effect of the shorter lags has been removed.
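One common way to compute partial autocorrelations from the autocorrelations is the Durbin-Levinson recursion. The sketch below uses it in plain Python. This is an assumption about method chosen for illustration, not a statement of how PROC ARIMA computes its PACF, and the helper names acf and pacf are made up for this example.

```python
def acf(series, max_lag):
    """Sample autocorrelation r_h for h = 0..max_lag (illustrative helper)."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    c0 = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t - h] for t in range(h, n)) / c0
        for h in range(max_lag + 1)
    ]


def pacf(series, max_lag):
    """Partial autocorrelations for lags 0..max_lag via Durbin-Levinson."""
    r = acf(series, max_lag)
    result = [1.0]
    phi_prev = []                              # coefficients of the order k-1 fit
    for k in range(1, max_lag + 1):
        num = r[k] - sum(phi_prev[j] * r[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(phi_prev[j] * r[j + 1] for j in range(k - 1))
        phi_kk = num / den                     # partial autocorrelation at lag k
        # update the lower-order coefficients for the next step
        phi_prev = [
            phi_prev[j] - phi_kk * phi_prev[k - 2 - j] for j in range(k - 1)
        ] + [phi_kk]
        result.append(phi_kk)
    return result


toy = [1.0, 2.0, 3.0, 4.0, 5.0]               # toy data, not the airline series
p = pacf(toy, 2)
print(p)
```

At lag 1 the partial autocorrelation equals the ordinary autocorrelation, since there are no intermediate observations to condition on.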

ARIMA Procedure

identify var=VariableY(PeriodsOfDifferencing);
estimate p=OrderOfAutoregression q=OrderOfMovingAverage;

where VariableY is modeled as ARIMA(p,d,q) with p = OrderOfAutoregression, d = the order of differencing (determined from PeriodsOfDifferencing), and q = OrderOfMovingAverage.

Using the p and q values identified from the ACF and PACF, we run the ARIMA model:

/* Fit p=1, q=1 on the training data; Log_Air(1,12) differences at lags 1 and 12 */
PROC ARIMA DATA=Training;
    IDENTIFY VAR=Log_Air(1,12);
    ESTIMATE P=1 Q=1 OUTSTAT=stats;
    FORECAST LEAD=12 INTERVAL=month ID=date OUT=result;
RUN;
QUIT;
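The Log_Air(1,12) specification in the code above asks for differencing at lag 1 and then at lag 12. As a rough illustration of what that does, here is a plain-Python sketch. The helper names and the toy series are assumptions for this example, and a seasonal period of 4 is used instead of 12 only to keep the example short.

```python
def difference(series, lag):
    """Return series[t] - series[t - lag] for every t where both exist."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def difference_all(series, lags):
    """Apply differencing at each lag in turn, e.g. lags=(1, 12)."""
    out = list(series)
    for lag in lags:
        out = difference(out, lag)
    return out


# Toy series (made up): a linear trend plus a repeating pattern of period 4
pattern = [0, 5, 2, -3]
toy = [t + pattern[t % 4] for t in range(12)]

d = difference_all(toy, (1, 4))
print(d)
```

Lag-1 differencing removes the linear trend and the seasonal-lag differencing removes the repeating pattern, so nothing is left of this toy series. Each differencing step also shortens the series by its lag.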

We strongly recommend the Minimum Information Criteria (MINIC) matrix approach over reading p and q off the ACF and PACF.

Minimum Information Criteria Matrix approach

A MINIC table is constructed from BIC(m,j), where m = pmin, ..., pmax and j = qmin, ..., qmax. Each cell holds the information criterion for a tentative model with AR order m and MA order j.

ARIMA Orders

We first run the following code to get the MINIC table:

PROC ARIMA DATA=Training;
    IDENTIFY VAR=Log_Air(1,12) MINIC;
RUN;
QUIT;

This gives you the MINIC matrix. Find the cell with the minimum (most negative) value in the matrix.

Here MINIC suggests p = 3 and q = 0. We take the larger of the two, max(3,0) = 3, and then fit ARIMA models for every combination of p = 0 to 3 and q = 0 to 3 (except p = 0, q = 0).
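The selection rule described above can be sketched in plain Python as follows. The numbers in the table are HYPOTHETICAL values invented for this illustration, not the MINIC output for the airline data. They are arranged only so that the minimum falls at p = 3, q = 0, as in the text.

```python
# HYPOTHETICAL MINIC-style table: rows are AR orders 0..3, columns are MA orders 0..3.
# These values are made up for illustration and are not real PROC ARIMA output.
minic = [
    [-6.10, -6.25, -6.30, -6.28],   # AR 0
    [-6.40, -6.38, -6.35, -6.33],   # AR 1
    [-6.45, -6.42, -6.39, -6.36],   # AR 2
    [-6.60, -6.50, -6.44, -6.41],   # AR 3
]

# Locate the cell with the minimum (most negative) value
cells = [(p, q) for p in range(len(minic)) for q in range(len(minic[p]))]
best_p, best_q = min(cells, key=lambda pq: minic[pq[0]][pq[1]])

# Search every order up to the larger of the two suggested values
upper = max(best_p, best_q)
grid = [
    (p, q)
    for p in range(upper + 1)
    for q in range(upper + 1)
    if (p, q) != (0, 0)             # skip the model with no AR or MA terms
]

print(best_p, best_q, upper, len(grid))
```

The SAS macro that follows loops over the same 0 to 3 range for p and q.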

%macro top_models;

    /* Fit every (p,q) combination from 0 to 3 on the training data */
    %do p = 0 %to 3;
        %do q = 0 %to 3;

            PROC ARIMA DATA=Training;
                IDENTIFY VAR=Log_Air(1,12);
                ESTIMATE P=&p. Q=&q. OUTSTAT=stats_&p._&q.;
                FORECAST LEAD=12 INTERVAL=month ID=date OUT=result_&p._&q.;
            RUN;
            QUIT;

            /* Tag the fit statistics with the p and q that produced them */
            data stats_&p._&q.;
                set stats_&p._&q.;
                p = &p.;
                q = &q.;
            run;

            /* Tag the forecasts in the same way */
            data result_&p._&q.;
                set result_&p._&q.;
                p = &p.;
                q = &q.;
            run;

        %end;
    %end;

    /* Stack the fit statistics of all models into one dataset */
    data final_stats;
        set
        %do p = 0 %to 3;
            %do q = 0 %to 3;
                stats_&p._&q.
            %end;
        %end;
        ;
    run;

    /* Stack the forecasts of all models into one dataset */
    data final_results;
        set
        %do p = 0 %to 3;
            %do q = 0 %to 3;
                result_&p._&q.
            %end;
        %end;
        ;
    run;

%mend top_models;
%top_models

/* Then calculate the mean of AIC and SBC for each model */

proc sql;
    create table final_stats_1 as
    select p, q, sum(_VALUE_)/2 as mean_aic_sbc
    from final_stats
    where _STAT_ in ('AIC','SBC')
    group by p, q
    order by mean_aic_sbc;
quit;

Save the AIC and SBC values of all the iterations and choose the top 5-7 models with the lowest average of AIC and SBC.

Now, for the models shortlisted using the AIC and SBC average, we calculate MAPE on the validation data: the forecasts produced for each selected p and q are compared with the actual values in the validation dataset.

Calculate the Mean Absolute Percentage Error (MAPE) for each model:

MAPE = Mean( Abs(Actual – Predicted) / Actual ) * 100

Use the following code:

Proc SQL;
create table final_results_1 as select a.p, a.q, a.date,a.forecast, b.log_air
from final_results as a join validation as b
on a.date = b.date;
quit;

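/* Absolute percentage error for each forecast month */
Data Mape_Ind;
set final_results_1 ;
Ind_Mape = abs(log_air - forecast)/ log_air;
Run;

/* Average by model; the model with the lowest MAPE comes first */
Proc Sql;
create table Mape as select p, q, mean(Ind_Mape) as mape from Mape_Ind
group by p, q
order by mape ;
quit;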
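The MAPE calculation above can be cross-checked outside SAS with a few lines of plain Python. This is a minimal sketch: the function name mape and the sample numbers are assumptions made for illustration, not values from the airline data. Unlike the SAS code above, which leaves the error as a fraction, this version multiplies by 100 to report a percentage.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent (illustrative helper)."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    errors = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


# Made-up numbers: each forecast is off by 10% of the actual value
score = mape([100.0, 200.0], [110.0, 180.0])
print(score)
```

Because the percentage error divides by the actual value, MAPE cannot be used when any actual value is zero, and it becomes unstable when actual values are close to zero.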

Results:

The model with the lowest MAPE is your final model, which in this case is p = 0, q = 3.
