Forecasters use all kinds of software to do their job: software that comes integrated with enterprise systems, specialised statistical packages, or even Excel. Here is a short guide to forecasting with the open source statistical environment R.
What is R?
R is a language and environment for statistical computing and graphics. It is a GNU project which is similar to the S language and environment which was developed at Bell Laboratories by John Chambers and colleagues. R can be considered as a different implementation of S. There are some important differences, but much code written for S runs unaltered under R.
R provides a wide variety of statistical (linear and nonlinear modelling, classical statistical tests, time-series analysis, classification, clustering, …) and graphical techniques, and is highly extensible.
Install the R software
To install R, download the latest precompiled binary distribution from a mirror near you via:
http://cran.r-project.org/mirrors.html
After the download completes, run the setup program for a full install. You should then be able to start R from the Start Menu in Windows, or its corresponding location if you use another operating system.
Install the forecast package
The next step is to install the forecast package. First select a mirror from the GUI menu at:
Packages > Set CRAN mirror
Select a location nearby to speed up downloads. Next, select the forecast package from the menu “Packages > Install package(s)”. The list is alphabetical so you should scroll down a bit. Select forecast, press OK and the package will install itself and all packages it needs.
Before using the forecast package, you should load it into the current R session. To do so, go to the menu “Packages > Load package” and select forecast again. Press OK and it will load forecast and all the packages it requires. Another way is to type in the console:
library(forecast)
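The installation can also be done from the console instead of the menus. The following is a minimal sketch: ensure_package is a hypothetical helper name, not part of R or the forecast package, and the repository URL is just one possible CRAN mirror.

```r
# Hypothetical helper: install a package only if it is not available yet.
# The default repos value is an assumption; any CRAN mirror will do.
ensure_package <- function(pkg, repos = "https://cloud.r-project.org") {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg, repos = repos)
  }
  # TRUE if the package can now be loaded, FALSE otherwise.
  requireNamespace(pkg, quietly = TRUE)
}
```

Calling ensure_package("forecast") and then library(forecast) should have the same effect as the menu steps above.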
Import data into R
The next step is to import data and forecast. R has many import functions for files, other statistical programs and databases. We use a simple comma-separated file here, but if you have another source of data, take a look at the manual at:
http://cran.r-project.org/doc/manuals/R-data.pdf
To import data from a csv file, you can use the read.csv() function. If the file is named “mydata.csv”, has a header row and is separated by commas, the command to type in the console is:
Data1 <- read.csv(file="mydata.csv", header=TRUE, sep=",")
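To try the import step without a real data file, the sketch below first writes a small made-up file and then reads it back. The file contents, the temporary path and the column names are assumptions for illustration only.

```r
# Write a tiny example file to a temporary location (made-up values).
path <- tempfile(fileext = ".csv")
writeLines(c("Month,Col1",
             "2009-01,112",
             "2009-02,118",
             "2009-03,132"), path)

# Read it back: header = TRUE uses the first line as column names.
Data1 <- read.csv(file = path, header = TRUE, sep = ",")

# Check that the import produced the expected columns and types.
str(Data1)
print(head(Data1))
```

Looking at the output of str() before going further is a quick way to catch a column that was read as text instead of numbers.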
The next step is to define a time series, for example based on a column named Col1. Here frequency=12 declares monthly data and start=2009 sets the first observation to January 2009:
Series1 <- ts(Data1$Col1, frequency=12, start=2009)
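It can help to check that the series covers the intended dates before fitting anything. A minimal sketch, assuming 24 made-up monthly observations in place of Data1$Col1:

```r
# 24 made-up monthly values standing in for Data1$Col1.
values <- c(112, 118, 132, 129, 121, 135, 148, 148, 136, 119, 104, 118,
            115, 126, 141, 135, 125, 149, 170, 170, 158, 133, 114, 140)

# frequency = 12 declares monthly data; start = 2009 means January 2009.
Series1 <- ts(values, frequency = 12, start = 2009)

print(start(Series1))      # first period as c(year, month)
print(end(Series1))        # last period as c(year, month)
print(frequency(Series1))  # observations per year
print(Series1)             # the series laid out as a year-by-month table
```

If the first or last period is not what you expect, adjust start or frequency before moving on.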
Fit a model and forecast
The forecast package has an automatic exponential smoothing algorithm, ets(), that performs well in practice. It fits a range of exponential smoothing models and selects one of them for you. Although there is a lot of computation involved, it can be handled remarkably quickly on modern computers.
To fit a model to the time series:
fit <- ets(Series1)
Display a summary of the fitted model:
summary(fit)
Plot a graph of the forecast 4 periods out:
plot(forecast(fit,h=4))
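Besides the plot, the forecast object holds the numbers themselves. The sketch below uses the AirPassengers data set that ships with R instead of Series1, so that it runs without a data file; that substitution is an assumption for illustration. It also assumes the forecast package is already installed.

```r
library(forecast)

# AirPassengers is a monthly series from R's built-in datasets package.
fit <- ets(AirPassengers)
print(fit$method)            # name of the model that ets() selected

fc <- forecast(fit, h = 4)   # forecast 4 periods ahead
print(fc$mean)               # point forecasts
print(fc$lower)              # lower bounds of the prediction intervals
print(fc$upper)              # upper bounds of the prediction intervals
print(fc$level)              # confidence levels of those intervals
```

The same components are available for a model fitted to Series1, which makes it easy to export the forecasts to a file or another system.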
More information
To read more on the forecast package for R:
http://www.jstatsoft.org/v27/i03/paper
The author of the R forecast package:
http://robjhyndman.com/