The main idea in Time Series Analysis is that of serial correlation. Briefly, in terms of daily
trading prices, serial correlation describes how strongly today’s asset prices are correlated with
previous days’ prices. Understanding the structure of this correlation helps us to build sophisticated
models that can help us interpret the data and predict future values. The concept of asset
momentum, and of the trading strategies derived from it, is based on positive serial correlation of
asset returns.
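As a rough illustration of serial correlation, the sketch below estimates the lag-1 sample autocorrelation of two synthetic return series. It is written in Python using only the standard library, purely for illustration. The function name `autocorrelation`, the AR(1) coefficient `phi` and the noise scale are assumptions made for this example and are not taken from the text. The first series is built so that each return depends positively on the previous one (a crude stand-in for momentum), while the second is independent noise.

```python
import random


def autocorrelation(series, lag=1):
    """Sample autocorrelation of `series` at the given lag."""
    n = len(series)
    if not 0 < lag < n:
        raise ValueError("lag must be between 1 and len(series) - 1")
    mean = sum(series) / n
    variance = sum((x - mean) ** 2 for x in series)
    if variance == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    # Sum of products of deviations that are `lag` steps apart.
    covariance = sum(
        (series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n)
    )
    return covariance / variance


rng = random.Random(42)

# Hypothetical "momentum-like" returns: an AR(1) process with phi > 0,
# so each return is positively related to the previous one.
phi = 0.6
momentum_returns = [0.0]
for _ in range(5000):
    momentum_returns.append(phi * momentum_returns[-1] + rng.gauss(0.0, 0.01))

# Independent returns with no serial correlation by construction.
noise_returns = [rng.gauss(0.0, 0.01) for _ in range(5000)]

lag1_momentum = autocorrelation(momentum_returns, lag=1)
lag1_noise = autocorrelation(noise_returns, lag=1)

print(f"lag-1 autocorrelation, AR(1) returns: {lag1_momentum:.3f}")
print(f"lag-1 autocorrelation, white noise:   {lag1_noise:.3f}")
```

For the first series the estimate should lie close to the chosen `phi`, and for the second it should lie close to zero. Real asset returns typically show far weaker serial correlation than this synthetic example.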
Time Series Analysis can be thought of as a more rigorous approach to understanding the
behaviour of financial asset prices than is provided via "technical analysis".
While technical analysis has basic "indicators" for trends, mean reverting behaviour and
volatility determination, Time Series Analysis brings with it the full power of statistical inference.
This includes hypothesis testing, goodness-of-fit tests and model selection, all of which serve
to help rigorously determine asset behaviour, with the eventual aim of increasing the profitability
of systematic strategies. Trends, seasonality, long-memory effects and volatility clustering can all
be understood in much more detail.
To carry out Time Series Analysis in this book the R statistical programming environment,
along with its many external libraries, will be utilised.
1.2.3 Machine Learning
Machine Learning is another subset of statistical learning that applies modern statistical models
to vast data sets, whether they have a temporal component or not. Machine Learning is part
of the broader "data science" and quant ecosystem. In essence it is a fusion of computational
methods, mainly optimisation techniques, within a rigorous probabilistic framework. It provides
the ability to "learn a model from data".
Machine Learning is generally subdivided into three separate categories: Supervised Learning,
Unsupervised Learning and Reinforcement Learning.
Supervised Learning makes use of labelled "training data" to train, or supervise, an algorithm to
detect patterns in data. Unsupervised Learning differs in that the data carry no labels with which
to supervise the algorithm (hence the "unsupervised"). Unsupervised algorithms act solely on the
data without being penalised or rewarded for correct answers. This generally makes it a harder
problem, since there are no known answers against which to check the results. Both of these
techniques will be studied at length in this book and applied to quant trading strategies.
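The distinction can be sketched on a toy data set. The example below assumes that NumPy and Scikit-Learn are installed, and uses two well-separated synthetic groups of points in place of real financial features. The choice of a linear Support Vector Machine and of k-means clustering, together with all variable names, is illustrative only. The supervised model is given the labels when it is fitted, whereas the clustering algorithm sees only the feature values.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Two synthetic, well-separated groups of two-dimensional feature vectors.
group_a = rng.normal(loc=[-2.0, -2.0], scale=0.5, size=(100, 2))
group_b = rng.normal(loc=[2.0, 2.0], scale=0.5, size=(100, 2))
X = np.vstack([group_a, group_b])
y = np.array([0] * 100 + [1] * 100)  # labels, used by the supervised model only

# Shuffle, then hold out the last 50 observations for testing.
order = rng.permutation(len(X))
X, y = X[order], y[order]
X_train, y_train = X[:150], y[:150]
X_test, y_test = X[150:], y[150:]

# Supervised: the classifier is fitted on features together with labels.
classifier = SVC(kernel="linear")
classifier.fit(X_train, y_train)
accuracy = classifier.score(X_test, y_test)

# Unsupervised: the clustering algorithm receives features only.
clusterer = KMeans(n_clusters=2, n_init=10, random_state=0)
cluster_ids = clusterer.fit_predict(X)

# Cluster numbering is arbitrary, so compare against both possible labelings.
agreement = max(np.mean(cluster_ids == y), np.mean(cluster_ids != y))

print(f"held-out classification accuracy: {accuracy:.2f}")
print(f"cluster agreement with the hidden labels: {agreement:.2f}")
```

Both approaches do well here only because the groups were constructed to be far apart. With noisy financial data neither result should be expected to look this clean, and the clustering output has no built-in notion of which cluster is "correct".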
Reinforcement Learning has gained significant popularity over the last few years due to
the famous results of firms such as Google DeepMind[3], including their work on Atari 2600
videogames[70] and the AlphaGo contest[4]. Unfortunately, Reinforcement Learning is a vast
area of academic research and, as such, is outside the scope of the book.
In this book Machine Learning techniques such as Support Vector Machines and Random
Forests will be used to find more complicated relationships between different sets of financial
data. If these patterns can be successfully validated then they can be used to infer structure in
the data and thus make predictions about future data points. Such tools are highly useful in
alpha generation and risk management.
To carry out Machine Learning in this book the Python Scikit-Learn and Pandas libraries
will be utilised.