Algorithmic Trading in Practice: Time Series Analysis and Machine Learning with Python and R


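"Advanced Algorithmic Trading" (2016) is an in-depth book on algorithmic trading. It focuses on time series analysis, machine learning and Bayesian statistics, implemented in the open-source Python and R programming languages, as a means of building advanced trading strategies and improving their profitability. The book is divided into several chapters, each covering a different aspect of algorithmic trading.

In the introduction the author sets out the core ideas of advanced algorithmic trading: the importance of pursuing excess returns (alpha), why time series analysis, Bayesian statistics and machine learning were chosen as tools, and how these tools are applied to trading strategies. The book also states the mathematics and programming background that readers need, compares itself with other algorithmic trading books, and explains how to install the required software environments such as Python and R.

The second part introduces Bayesian statistics. Starting from the basic concepts, it contrasts the frequentist and Bayesian approaches through examples, explains how to use Bayes' rule for inference, and illustrates the Bayesian method with a coin-flip example. It also discusses Bayesian inference on a binomial proportion, covering the assumptions of the method, the application of Bayes' rule and the calculation of the likelihood function, in particular the part involving the Bernoulli distribution.

The later parts likely go deeper into machine learning and time series analysis, including but not limited to forecasting models, regression analysis, neural networks, support vector machines and decision trees as applied to trading strategies, and how to implement these techniques with Python and R libraries such as scikit-learn, pandas and tidyverse. They also cover backtesting and simulated trading, for example with the QSTrader software mentioned in the book, and how to evaluate and optimise strategy performance using metrics such as the Sharpe ratio, the information ratio and maximum drawdown.

The book is aimed at finance practitioners who already have some programming experience, especially in Python and R, and who want to use data-driven methods to improve the accuracy and efficiency of their trading decisions. For readers who want to understand and practise algorithmic trading in depth, "Advanced Algorithmic Trading" offers a comprehensive learning framework that combines theory with practice.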


Chapter 1

Introduction To Advanced Algorithmic Trading

1.1 The Hunt for Alpha

The goal of the quantitative trading researcher is to seek out what is termed alpha (new streams of uncorrelated risk-adjusted returns) and then exploit these returns via a systematic trading model and execution infrastructure.

Alpha is difficult to find because, by definition, once it is well-known it decays and ceases to be an uncorrelated source of returns. Instead it gradually becomes a risk factor and thus loses its risk-adjusted profitability.

This book concentrates on three major areas of mathematical modelling (Bayesian Statistics, Time Series Analysis and Machine Learning) that will augment your quantitative trading research process in order to help you discover sources of alpha.

Many of these techniques are in use at some of the largest global asset managers and quantitative hedge funds. In the following chapters these techniques will be described and applied to financial data in order to develop testable systematic trading strategies.

1.2 Why Time Series Analysis, Bayesian Statistics and Machine Learning?

In the last few years there has been a significant increase in the availability of software for carrying out statistical analysis at large scales, the so-called "big data" era.

Much of this software is completely free, open source, extremely well-tested and straightforward to use. The prevalence of free software coupled with the availability of financial data, as provided by services such as Yahoo Finance, Google Finance, Quandl and DTN IQ Feed, has led to a sharp increase in individuals deciding to become quant traders.

Unfortunately many of these individuals never get past learning basic "technical analysis". They avoid important topics such as risk management, portfolio construction and algorithmic execution, all of which are given significant attention in institutional environments. In addition "retail" traders often neglect more effective means of generating alpha, such as those provided by detailed statistical analysis.

The aim of this book is to provide the "next step" for those who have already begun their quantitative trading career or are looking to try more advanced methods. In particular the book will discuss techniques that are currently in deployment at some of the large quantitative hedge funds and asset management firms.

Our main area of study will be that of rigorous statistical analysis. This may sound like a dry topic, but rest assured that it is not only extremely interesting when applied to real world data, but also provides a solid "mental framework" for how to think about future trading methods and approaches.

Statistical analysis is a huge field of academic interest and only a fraction of the field can be considered within this book. Trying to distill the topics important for quantitative trading is difficult. However, three main areas have been chosen for discussion:

• Bayesian Statistics

• Time Series Analysis

• Machine Learning

Each of these three areas is extremely useful for quantitative trading research.

1.2.1 Bayesian Statistics

Bayesian Statistics is an alternative way of thinking about probability. The more traditional "frequentist" approach considers probabilities as the end result of many trials, for instance, the fairness of a coin being flipped many times. Bayesian Statistics takes a different approach and instead considers probability as a measure of belief. That is, opinions are used to create probability distributions on which an assessment of the fairness of the coin can be based.

While this may sound highly subjective it is often an extremely effective method in practice. As new data arrives beliefs can be updated in a rational manner using the famous Bayes’ Rule. Bayesian Statistics has found uses in many fields, including engineering reliability, searching for lost nuclear submarines and controlling spacecraft orientation. However, it is also extremely applicable to quantitative trading problems.
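As a minimal sketch of how beliefs about a coin can be updated with Bayes’ Rule as flips arrive, consider the following Python snippet. It is not taken from the book: the three candidate biases, the equal starting weights and the sequence of flips are all made-up assumptions, and the function name update_beliefs is hypothetical.

```python
def update_beliefs(prior, likelihoods):
    """Apply Bayes' Rule: posterior is proportional to prior * likelihood."""
    unnormalised = [p * l for p, l in zip(prior, likelihoods)]
    evidence = sum(unnormalised)  # total probability of the observation
    return [u / evidence for u in unnormalised]


# Hypothetical candidate values for the probability of heads.
biases = [0.25, 0.5, 0.75]
# Starting belief (an assumption): every candidate is equally plausible.
beliefs = [1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0]

# Made-up observations: six heads and two tails.
flips = "HHTHHHTH"
for flip in flips:
    # Likelihood of this single flip under each candidate bias.
    likelihoods = [b if flip == "H" else 1.0 - b for b in biases]
    beliefs = update_beliefs(beliefs, likelihoods)

for bias, belief in zip(biases, beliefs):
    print(f"P(bias = {bias}) = {belief:.3f}")
```

After these flips most of the weight has moved to the candidate that favours heads, while the belief in the tails-favouring candidate has almost vanished. Starting from different initial weights would give a different posterior, which is the sense in which opinions enter the analysis.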

Bayesian Inference is the application of Bayesian Statistics to making inference and predictions about data. Within this book the main goal will be to study financial asset prices in order to predict future values or understand why they change. The Bayesian framework provides a modern, sophisticated mathematical toolkit with which to carry this out.

Time Series Analysis and Machine Learning make heavy use of Bayesian Inference for the design of some of their algorithms. Hence it is essential that the basics of how Bayesian Statistics is carried out are discussed first.

To carry out Bayesian Inference in this book a "probabilistic programming" tool written in Python will be used, called PyMC3.

1.2.2 Time Series Analysis

Time Series Analysis provides a set of "workhorse" techniques for analysing financial time series. Most professional quants will begin their analysis of financial data using basic time series methods. By applying the tools in time series analysis it is possible to make elementary assessments of financial asset behaviour.

The main idea in Time Series Analysis is that of serial correlation. Briefly, in terms of daily trading prices, serial correlation describes how strongly today’s asset prices are correlated with previous days’ prices. Understanding the structure of this correlation helps us to build sophisticated models that can help us interpret the data and predict future values. The concept of asset momentum, and the trading strategies derived from it, is based on positive serial correlation of asset returns.
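To make serial correlation concrete, the sketch below computes the sample autocorrelation of a series at a given lag in plain Python. The book itself carries out Time Series Analysis in R, so this is only an illustrative stand-in: the function name autocorrelation and both return series are hypothetical, made-up inputs.

```python
def autocorrelation(series, lag=1):
    """Sample autocorrelation of a series at the given lag."""
    n = len(series)
    mean = sum(series) / n
    # Sum of squared deviations over the whole series.
    variance = sum((x - mean) ** 2 for x in series)
    # Sum of products of deviations that are `lag` steps apart.
    covariance = sum(
        (series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n)
    )
    return covariance / variance


# Made-up daily returns that drift steadily upwards (persistence).
persistent = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
# Made-up daily returns that flip sign every day (reversal).
reverting = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]

print(autocorrelation(persistent, lag=1))  # positive serial correlation
print(autocorrelation(reverting, lag=1))   # negative serial correlation
```

A positive value at lag one is the pattern a momentum strategy hopes to find in returns, whereas a negative value points towards reversal.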

Time Series Analysis can be thought of as a more rigorous approach to understanding the behaviour of financial asset prices than is provided via "technical analysis".

While technical analysis has basic "indicators" for trends, mean-reverting behaviour and volatility determination, Time Series Analysis brings with it the full power of statistical inference. This includes hypothesis testing, goodness-of-fit tests and model selection, all of which serve to help rigorously determine asset behaviour and thus eventually increase profitability of systematic strategies. Trends, seasonality, long-memory effects and volatility clustering can all be understood in much more detail.

To carry out Time Series Analysis in this book the R statistical programming environment, along with its many external libraries, will be utilised.

1.2.3 Machine Learning

Machine Learning is another subset of statistical learning that applies modern statistical models to vast data sets, whether they have a temporal component or not. Machine Learning is part of the broader "data science" and quant ecosystem. In essence it is a fusion of computational methods (mainly optimisation techniques) within a rigorous probabilistic framework. It provides the ability to "learn a model from data".

Machine Learning is generally subdivided into three separate categories: Supervised Learning, Unsupervised Learning and Reinforcement Learning.

Supervised Learning makes use of "training data" to train, or supervise, an algorithm to detect patterns in data. Unsupervised Learning differs in that there is no concept of labelled training data (hence the "unsupervised"). Unsupervised algorithms act solely on the data without being penalised or rewarded for correct answers. This makes it a far harder problem. Both of these techniques will be studied at length in this book and applied to quant trading strategies.
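The difference between the two settings can be sketched with a toy example in plain Python. This is an illustrative assumption rather than the book's own code, which relies on Scikit-Learn: the supervised part is a simple nearest-class-mean classifier, the unsupervised part is a two-cluster K-Means on one-dimensional data, and the function names, returns and labels are all hypothetical.

```python
def fit_class_means(values, labels):
    """Supervised: use the provided labels to compute one mean per class."""
    means = {}
    for label in set(labels):
        members = [v for v, l in zip(values, labels) if l == label]
        means[label] = sum(members) / len(members)
    return means


def predict(means, value):
    """Assign a value to the class whose mean is closest."""
    return min(means, key=lambda label: abs(value - means[label]))


def two_means(values, iterations=10):
    """Unsupervised: split values into two clusters without any labels."""
    centres = [min(values), max(values)]  # simple starting guess
    for _ in range(iterations):
        clusters = [[], []]
        for v in values:
            nearest = 0 if abs(v - centres[0]) <= abs(v - centres[1]) else 1
            clusters[nearest].append(v)
        # Move each centre to the mean of its cluster (keep it if empty).
        centres = [
            sum(c) / len(c) if c else centres[i] for i, c in enumerate(clusters)
        ]
    return centres


# Made-up daily returns (in percent) with made-up "down"/"up" labels.
daily_returns = [-2.1, -1.8, -2.4, 1.9, 2.2, 2.0]
labels = ["down", "down", "down", "up", "up", "up"]

class_means = fit_class_means(daily_returns, labels)
print(predict(class_means, 1.5))   # classified using knowledge of the labels
print(two_means(daily_returns))    # cluster centres found without labels
```

On data this cleanly separated both approaches find the same two groups. The unsupervised version, however, cannot say which group is "up" and which is "down", since it never saw the labels.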

Reinforcement Learning has gained significant popularity over the last few years due to the famous results of firms such as Google DeepMind[3], including their work on Atari 2600 videogames[70] and the AlphaGo contest[4]. Unfortunately Reinforcement Learning is a vast area of academic research and as such is outside the scope of the book.

In this book Machine Learning techniques such as Support Vector Machines and Random Forests will be used to find more complicated relationships between differing sets of financial data. If these patterns can be successfully validated then they can be used to infer structure in the data and thus make predictions about future data points. Such tools are highly useful in alpha generation and risk management.

To carry out Machine Learning in this book the Python Scikit-Learn and Pandas libraries will be utilised.

1.3 How Is The Book Laid Out?

The book is broadly laid out in four sections. The first three are theoretical in nature and teach the basics of Bayesian Statistics, Time Series Analysis and Machine Learning, with many references presented for further research. The fourth section applies all of the previous theory to the backtesting of quantitative trading strategies using the QSTrader open-source backtesting engine.

The book begins with a discussion on the Bayesian philosophy of statistics. The binomial model is presented as a simple example with which to apply Bayesian concepts such as conjugate priors and posterior sampling via Markov Chain Monte Carlo.
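To illustrate what a conjugate prior buys in the binomial model, the sketch below uses the standard result that a Beta prior combined with binomial data gives a Beta posterior, so the update is just addition of counts. The function names, the uniform Beta(1, 1) prior and the counts of successes and failures are hypothetical choices for illustration, not values from the book.

```python
def beta_binomial_update(alpha, beta, successes, failures):
    """Conjugate update: Beta(alpha, beta) prior -> Beta posterior."""
    return alpha + successes, beta + failures


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


# Assumed prior: Beta(1, 1), which is uniform over the unknown proportion.
prior = (1.0, 1.0)
# Made-up data: 7 successes and 3 failures in 10 trials.
posterior = beta_binomial_update(*prior, successes=7, failures=3)

print(posterior)              # parameters of the posterior Beta distribution
print(beta_mean(*prior))      # prior estimate of the proportion
print(beta_mean(*posterior))  # posterior estimate after seeing the data
```

Because the posterior has a closed form, no sampling is needed in this case. Markov Chain Monte Carlo becomes necessary for models where no such conjugate pair is available.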

It then explores Bayesian statistics as related to quantitative finance, discussing a Bayesian approach to stochastic volatility. Such a model is eligible for use within a regime detection mechanism in a risk management setting.

In Time Series Analysis the discussion begins with the concept of serial correlation, applying it to simple models such as White Noise and the Random Walk. From these two models more sophisticated linear approaches can be built up to explain serial correlation, culminating in the Autoregressive Integrated Moving Average (ARIMA) family of models.
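The relationship between these two simple models can be sketched in a few lines of Python: a Random Walk is the running sum of White Noise, and taking first differences of the walk recovers the noise, which is the idea behind the "Integrated" part of ARIMA. The book works through these models in R, so the code below is only an assumed illustration. The function names, the seed and the series length are arbitrary choices.

```python
import random


def white_noise(n, seed=42):
    """Draw n independent standard normal shocks (seeded for repeatability)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def random_walk(noise):
    """Accumulate the shocks: each value is the previous value plus a shock."""
    walk, level = [], 0.0
    for shock in noise:
        level += shock
        walk.append(level)
    return walk


def difference(series):
    """First difference: the change from one observation to the next."""
    return [series[t] - series[t - 1] for t in range(1, len(series))]


noise = white_noise(500)
walk = random_walk(noise)
recovered = difference(walk)

# Differencing the walk gives back the shocks (from the second one onwards).
largest_gap = max(abs(r - e) for r, e in zip(recovered, noise[1:]))
print(largest_gap < 1e-9)
```

The first shock cannot be recovered because differencing shortens the series by one observation.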

The book then considers volatility clustering, or conditional heteroskedasticity, motivating the famous Generalised Autoregressive Conditional Heteroskedastic (GARCH) family of models.

Subsequent to ARIMA and GARCH the book introduces the concept of cointegration (used heavily in pairs trading) and state space models, including Hidden Markov Models and Kalman Filters.

These time series methods are all applied to current financial data as they are introduced. Their inferential and predictive performance is also assessed.

In the Machine Learning section a rigorous definition of supervised and unsupervised learning is presented utilising the notation and methodology of statistical machine learning. The humble linear regression will be presented in a probabilistic fashion, which allows introduction of machine learning ideas in a familiar setting.

The book then introduces the more advanced non-linear methods such as Decision Trees, Support Vector Machines and Random Forests. It then discusses unsupervised techniques such as K-Means Clustering.

Many of the above-mentioned techniques are applied to asset price prediction, natural language processing and sentiment analysis. Subsequently full code is provided for systematic strategy backtesting implementations within QSTrader.

The book provides plenty of references on where to head next. There are many potential academic topics of interest to pursue subsequent to this book, including Non-Linear Time Series Methods, Bayesian Nonparametrics and Deep Learning using Neural Networks. Unfortunately, these exciting methods will need to wait for an additional book to be given the proper treatment they deserve!

1.4 Required Technical Background

Advanced Algorithmic Trading is a definite step up in complexity from the previous QuantStart book Successful Algorithmic Trading. Unfortunately it is difficult to carry out any statistical inference without utilising some mathematics and programming.

1.4.1 Mathematics

To get the most out of this book it will be necessary to have taken introductory undergraduate classes in Mathematical Foundations, Calculus, Linear Algebra and Probability, which are often taught in university degrees in Mathematics, Physics, Engineering, Economics, Computer Science or similar.

Thankfully it is unnecessary to have completed a university education in order to make good use of this book. There are plenty of fantastic resources for learning these topics on the internet. Some useful suggestions include:

• Khan Academy - https://www.khanacademy.org

• MIT Open Courseware - http://ocw.mit.edu/index.htm

• Coursera - https://www.coursera.org

• Udemy - https://www.udemy.com

However, it should be noted that Bayesian Statistics, Time Series Analysis and Machine Learning are quantitative subjects. There is no avoiding the fact that some intermediate level mathematics will be needed to quantify our ideas.

The following courses are extremely useful for getting up to speed with the required mathematics:

• Linear Algebra by Gilbert Strang - http://ocw.mit.edu/courses/mathematics/18-06sc-linear-algebra-fall-2011/index.htm

• Single Variable Calculus by David Jerison - http://ocw.mit.edu/courses/mathematics/18-01-single-variable-calculus-fall-2006

• Multivariable Calculus by Denis Auroux - http://ocw.mit.edu/courses/mathematics/18-02-multivariable-calculus-fall-2007

• Probability by Santosh Venkatesh - https://www.coursera.org/course/probability

1.4.2 Programming

Since this book is fundamentally about programming quantitative trading strategies, it will be necessary to have some exposure to programming languages.

While it is not necessary to be an expert programmer or software developer, it is helpful to have used a language similar to C++, C#, Java, Python, R or MATLAB.

Many will likely have programmed in VB Script or VB.NET through Excel. However, taking an introductory Python or R programming course is strongly recommended. There are many such courses available online:

• Programming for Everybody - https://www.coursera.org/learn/python

• R Programming - https://www.coursera.org/course/rprog
