Deep Learning Networks for Stock Market Analysis and High-Frequency Prediction

This paper examines the potential of deep learning networks for stock market analysis and prediction. Deep learning is attractive for high-frequency stock market prediction because it can extract features automatically from large amounts of raw data, without requiring predictors to be specified in advance. The study focuses on the many design choices in deep learning algorithms (network architecture, activation functions, and model parameter selection) and how they affect performance, with particular attention to the effect of the data representation on the results. Three unsupervised feature-extraction methods are investigated as inputs to the networks: principal component analysis (PCA), the autoencoder, and the restricted Boltzmann machine (RBM). Using high-frequency intraday returns as input data, the authors examine how these representations affect the ability to predict future market behavior. The experiments show that a deep neural network can extract additional information from the residuals of an autoregressive model and thereby improve prediction accuracy, whereas applying an autoregressive model to the network's residuals yields no clear benefit. The study also finds that covariance estimation improves noticeably when the prediction network is applied to covariance-based market structure analysis. The empirical results provide an objective assessment of how deep learning networks can be used effectively in stock market analysis, along with practical guidance for investors, financial analysts, and machine learning researchers. While deep learning shows advantages in some settings, the study also reveals challenges and limitations, and it suggests that future research explore alternative network architectures, optimization algorithms, and additional data sources to improve stock market prediction accuracy.
AE is a neural network model characterized by a structure in which the model parameters are calibrated by minimizing the reconstruction error. Let h^l = δ^l(W^l h^{l−1} + b^l) be the network function of the lth layer with input h^{l−1} and output h^l. Although δ^l can differ across layers, a sigmoid function, δ(z) = 1/(1 + exp(−z)), applied elementwise, is typically used for all layers, which we also adopt in our research.
Regarding h^l as a function of the input, the representation of x can be written as u = φ(x) = h^L ∘ ··· ∘ h^1(x) for an L-layer AE (h^0 = x). Then the reconstruction of the data can be similarly defined: x_rec = ψ(u) = h^{2L} ∘ ··· ∘ h^{L+1}(u), and the model can be calibrated by minimizing the reconstruction error over a training dataset, {x_n}_{n=1}^N. We adopt the following learning criterion:

    min_θ (1/N) Σ_{n=1}^N ||x_n − ψ ∘ φ(x_n)||²,    (8)

where θ = {W_i, b_i}, i = 1, ..., 2L. W_{L+i} is often set as the transpose of W_{L+1−i}, in which case only W_i, i = 1, ···, L, need to be estimated. In this paper, we consider a single-layer AE and estimate both W_1 and W_2.
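A minimal sketch of the single-layer AE described above, trained by plain gradient descent on the reconstruction criterion in Equation (8). This is an illustrative NumPy implementation with manual backpropagation, not the authors' code; the layer sizes, learning rate, and epoch count are assumptions. Both W1 and W2 are estimated, with no weight tying, as in the paper.

```python
import numpy as np

def sigmoid(z):
    # The sigmoid is applied elementwise, as in the paper.
    return 1.0 / (1.0 + np.exp(-z))

def train_autoencoder(X, d, lr=0.5, epochs=500, seed=0):
    """Single-layer AE: u = sigmoid(x W1 + b1), x_rec = sigmoid(u W2 + b2).
    Trained by minimizing the mean squared reconstruction error, Eq. (8)."""
    rng = np.random.default_rng(seed)
    N, D = X.shape
    W1 = rng.normal(0, 0.1, (D, d)); b1 = np.zeros(d)
    W2 = rng.normal(0, 0.1, (d, D)); b2 = np.zeros(D)
    losses = []
    for _ in range(epochs):
        H = sigmoid(X @ W1 + b1)        # representation u = phi(x)
        Xrec = sigmoid(H @ W2 + b2)     # reconstruction psi(u)
        losses.append(np.mean(np.sum((X - Xrec) ** 2, axis=1)))
        # Backpropagate the (averaged) squared reconstruction error.
        G2 = 2.0 * (Xrec - X) / N * Xrec * (1 - Xrec)
        G1 = (G2 @ W2.T) * H * (1 - H)
        W2 -= lr * H.T @ G2; b2 -= lr * G2.sum(axis=0)
        W1 -= lr * X.T @ G1; b1 -= lr * G1.sum(axis=0)
    return (W1, b1, W2, b2), losses
```

On toy data the reconstruction loss should fall steadily; in practice a stochastic optimizer would be used instead of full-batch descent.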
(iii) Restricted Boltzmann Machine, RBM
RBM (Hinton, 2002) has the same network structure as a single-layer autoencoder, but it uses a different learning method. RBM treats the input and output variables, x and u, as random, and defines an energy function, E(x, u), from which the joint probability density function of x and u is determined by the formula

    p(x, u) = exp(−E(x, u)) / Z,    (9)

where Z = Σ_{x,u} exp(−E(x, u)) is the partition function. In most cases, u is assumed to be a d-dimensional binary variable, i.e., u ∈ {0, 1}^d, and x is assumed to be either binary or real-valued. When x is a real-valued variable, the energy function has the following form (Cho, Ilin, & Raiko, 2011):

    E(x, u) = (1/2)(x − b)^T Σ^{−1}(x − b) − c^T u − u^T W Σ^{−1/2} x,    (10)

where Σ, W, b, c are model parameters. We set Σ to be the identity matrix; this makes learning simpler with little performance sacrifice (Taylor, Hinton, & Roweis, 2006). From Equations (9) and (10), the conditional distributions are obtained as follows:
    p(u_j = 1 | x) = δ(c_j + W_{(j,:)} x),    j = 1, ···, d,    (11)

    p(x_i | u) = N(b_i + u^T W_{(:,i)}, 1),    i = 1, ···, D,    (12)

where δ(·) is the sigmoid function (applied elementwise), and W_{(j,:)} and W_{(:,i)} are the jth row and the ith column of W, respectively. This type of RBM is denoted the Gaussian-Bernoulli RBM. The input data is then represented and reconstructed in a probabilistic way using the conditional distributions. Given an input dataset {x_n}_{n=1}^N, maximum log-likelihood learning is formulated as the following optimization:

    max_θ [ L = Σ_{n=1}^N log p(x_n; θ) ],    (13)
where θ = {W, b, c} are the model parameters, and u is marginalized out (i.e., integrated out
via expectation). This problem can be solved via standard gradient descent. However, due to
the computationally intractable partition function Z, an analytic formula for the gradient is usually unavailable. The model parameters are instead estimated with contrastive divergence (CD) learning (Carreira-Perpinan & Hinton, 2005); we refer the reader to Hinton (2002) for details on learning with RBMs.
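A compact sketch of Gaussian-Bernoulli RBM learning with one step of contrastive divergence (CD-1), under the simplification above (Σ = I). This is an illustrative NumPy implementation, not the authors' code; the learning rate, epoch count, and hidden size are assumptions, and the negative phase uses the conditional mean of the visibles rather than a Gaussian sample, a common simplification.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_gbrbm_cd1(X, d, lr=0.05, epochs=300, seed=0):
    """Gaussian-Bernoulli RBM with Sigma = I, trained by CD-1.
    Conditionals (Eqs. 11-12): p(u_j=1|x) = sigmoid(c_j + W[j,:] x),
                               p(x_i|u)   = N(b_i + (u^T W)_i, 1)."""
    rng = np.random.default_rng(seed)
    N, D = X.shape
    W = rng.normal(0, 0.01, (d, D))
    b = np.zeros(D)   # visible (Gaussian) bias
    c = np.zeros(d)   # hidden (Bernoulli) bias
    for _ in range(epochs):
        # Positive phase: hidden probabilities and a binary sample.
        U0 = sigmoid(X @ W.T + c)
        U0_sample = (rng.random(U0.shape) < U0).astype(float)
        # Negative phase: reconstruct visibles (conditional mean), re-infer hiddens.
        X1 = b + U0_sample @ W
        U1 = sigmoid(X1 @ W.T + c)
        # CD-1 updates: difference of data-driven and model-driven statistics.
        W += lr * (U0.T @ X - U1.T @ X1) / N
        b += lr * (X - X1).mean(axis=0)
        c += lr * (U0 - U1).mean(axis=0)
    recon = b + sigmoid(X @ W.T + c) @ W   # deterministic reconstruction
    return W, b, c, recon
```

The updates approximate the log-likelihood gradient of Equation (13) without ever computing Z, which is what makes CD practical.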
3 Data Specification
We construct a deep neural network using stock returns from the KOSPI market, the major stock market
in South Korea. We first choose the fifty largest stocks in terms of market capitalization at the beginning
of the sample period, and keep only the stocks which have a price record over the entire sample period.
This leaves 38 stocks in the sample, which are listed in Table II. The stock prices are collected every five
minutes during the trading hours of the sample period (04-Jan-2010 to 30-Dec-2014), and five-minute
logarithmic returns are calculated using the formula r_t = ln(S_t / S_{t−∆t}), where S_t is the stock price at time t, and ∆t is five minutes. We only consider intraday prediction, i.e., the first ten five-minute returns (i.e., lagged returns with g = 10) each day are used only to construct the raw level input R_t, and are not included in the target data. The sample contains a total of 1,239 trading days and 73,041 five-minute returns (excluding the first ten returns each day) for each stock.
The training set consists of the first 80% of the sample (from 04-Jan-2010 to 24-Dec-2013), which contains 58,421 (N_1) stock returns, while the remaining 20% (from 26-Dec-2013 to 30-Dec-2014), with 14,620 (N_2) returns, is used as the test set:

    Training set: {R_t^n, r_{i,t+1}^n}_{n=1}^{N_1},    Test set: {R_t^n, r_{i,t+1}^n}_{n=1}^{N_2},    i = 1, ···, M.
To avoid over-fitting during training, the last 20% of the training set is further separated as a validation
set.
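The chronological split above can be sketched as follows. The helper is a generic, hypothetical illustration of a time-ordered 80/20 split with the last 20% of the training portion held out for validation; the paper's actual counts (N_1 = 58,421 and N_2 = 14,620) come from calendar-date boundaries rather than an exact fraction.

```python
def chronological_split(n_obs, train_frac=0.8, val_frac=0.2):
    """Split the index range [0, n_obs) chronologically into train,
    validation, and test ranges. The validation set is the last
    val_frac of the training portion, as in the paper."""
    n_train_total = int(n_obs * train_frac)
    n_val = int(n_train_total * val_frac)
    train_idx = range(0, n_train_total - n_val)
    val_idx = range(n_train_total - n_val, n_train_total)
    test_idx = range(n_train_total, n_obs)
    return train_idx, val_idx, test_idx
```

Keeping the split strictly chronological avoids look-ahead bias, which a random shuffle would introduce with time-series data.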
All stock returns are normalized using the training set mean and standard deviation, i.e., for the mean µ_i and the standard deviation σ_i of r_{i,t} over the training set, the normalized return is defined as (r_{i,t} − µ_i)/σ_i. Henceforth, for notational convenience we will use r_{i,t} to denote the normalized return.

At each time t, we use ten lagged returns of the stocks in the sample to construct the raw level input:

    R_t = [r_{1,t}, ···, r_{1,t−9}, ···, r_{38,t}, ···, r_{38,t−9}]^T.
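The data pipeline above can be sketched end to end: five-minute log returns, normalization by training-set statistics, and the stacked lagged-return input R_t. A minimal NumPy illustration, with g = 10 lags and 38 stocks as in the paper; the helper names and array layout (rows = time, columns = stocks) are assumptions.

```python
import numpy as np

def log_returns(prices):
    """Five-minute log returns r_t = ln(S_t / S_{t-dt}) per stock (column)."""
    return np.log(prices[1:] / prices[:-1])

def normalize(returns, mu, sigma):
    """Normalize column-wise with training-set mean and standard deviation."""
    return (returns - mu) / sigma

def raw_input_vector(returns, t, g=10):
    """Raw level input R_t: for each stock i, the g most recent returns
    [r_{i,t}, ..., r_{i,t-g+1}], concatenated stock by stock."""
    window = returns[t - g + 1 : t + 1][::-1]   # rows: r_t, r_{t-1}, ..., r_{t-g+1}
    return window.T.reshape(-1)                 # stack per stock, then flatten
```

With 38 stocks and ten lags, R_t is a 380-dimensional vector, which matches the raw-level input dimension implied by the construction above.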
3.1 Evidence of Predictability in the Korean Stock Market
As a motivating example, we carry out a simple experiment to see whether past returns have predictive power for future returns. We first divide the returns of each stock into two groups according to the mean or variance of ten lagged returns: if the mean of the lagged returns, M(10), is greater than some threshold η, the return is assigned to one group; otherwise, it is assigned to the other group. Similarly, by comparing the variance of the lagged returns, V(10), with a threshold, the returns are divided into two groups.
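The grouping rule can be sketched as follows. This is an illustrative helper, not the authors' procedure; the threshold value is a placeholder (the paper's η is chosen separately), and we assume the g lagged returns are r_{t−1}, ..., r_{t−g}.

```python
import numpy as np

def split_by_lagged_stat(returns, stat="mean", g=10, threshold=0.0):
    """Assign each return r_t (t >= g) to group 1 if the chosen statistic
    of its g lagged returns exceeds the threshold, else to group 0."""
    fn = np.mean if stat == "mean" else np.var
    groups = np.empty(len(returns) - g, dtype=int)
    for k, t in enumerate(range(g, len(returns))):
        lagged = returns[t - g:t]          # r_{t-1}, ..., r_{t-g}
        groups[k] = int(fn(lagged) > threshold)
    return groups
```

If past returns carry no information, the two groups should have indistinguishable return distributions; a systematic difference in their means is evidence of predictability.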