Deep Learning for Event-Driven Stock Prediction
Xiao Ding†∗, Yue Zhang‡, Ting Liu†, Junwen Duan†
†Research Center for Social Computing and Information Retrieval
Harbin Institute of Technology, China
{xding, tliu, jwduan}@ir.hit.edu.cn
‡Singapore University of Technology and Design
yue zhang@sutd.edu.sg
Abstract
We propose a deep learning method for event-driven stock market prediction. First, events are extracted from news text, and represented as dense vectors, trained using a novel neural tensor network. Second, a deep convolutional neural network is used to model both short-term and long-term influences of events on stock price movements. Experimental results show that our model can achieve nearly 6% improvements on S&P 500 index prediction and individual stock prediction, respectively, compared to state-of-the-art baseline methods. In addition, market simulation results show that our system is more capable of making profits than previously reported systems trained on S&P 500 stock historical data.
1 Introduction
It has been shown that the financial market is “informationally efficient” [Fama, 1965]: stock prices reflect all known information, and price movements respond to news or events. As web information grows, recent work has applied Natural Language Processing (NLP) techniques to explore financial news for predicting market volatility.
Pioneering work mainly uses simple features from news documents, such as bags-of-words, noun phrases, and named entities [Kogan et al., 2009; Schumaker and Chen, 2009]. Although useful, these features do not capture structured relations, which limits their potential. For example, when the event “Microsoft sues Barnes & Noble.” is represented using only the term-level features {“Microsoft”, “sues”, “Barnes”, “Noble”}, it can be difficult to accurately predict the price movements of Microsoft Inc. and Barnes & Noble Inc., respectively, because the unstructured terms cannot differentiate the accuser (“Microsoft”) from the defendant (“Barnes & Noble”).
∗This work was done while the first author was visiting Singapore University of Technology and Design.
Figure 1: Example news influence of Google Inc.
Recent advances in computing power and NLP technology enable more accurate models of events with structures. Using open information extraction (Open IE) to obtain structured event representations, we find that the actor and object of events can be better captured [Ding et al., 2014]. For example, a structured representation of the event above can be (Actor = Microsoft, Action = sues, Object = Barnes & Noble). That work reports improvements on stock market prediction using structured representations instead of words as features.
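The difference between term-level features and a structured event can be illustrated with a small sketch. The `Event` type and the `bag_of_terms` helper below are hypothetical names introduced only for illustration; they are not part of any Open IE system. The sketch shows that swapping the accuser and the defendant leaves the bag of terms unchanged, while the structured tuples differ.

```python
from typing import FrozenSet, NamedTuple


class Event(NamedTuple):
    """Hypothetical structured event: (Actor, Action, Object)."""
    actor: str
    action: str
    obj: str


def bag_of_terms(event: Event) -> FrozenSet[str]:
    """Term-level features: the unordered set of words, with roles discarded."""
    text = f"{event.actor} {event.action} {event.obj}"
    # Treat "&" as a separator so that only the word terms remain.
    return frozenset(text.replace("&", " ").split())


original = Event("Microsoft", "sues", "Barnes & Noble")
reversed_roles = Event("Barnes & Noble", "sues", "Microsoft")

# The two events are indistinguishable as bags of terms ...
same_terms = bag_of_terms(original) == bag_of_terms(reversed_roles)
# ... but the structured tuples keep the accuser and the defendant apart.
same_events = original == reversed_roles

print(same_terms, same_events)
```

Under these assumptions, a classifier fed only `bag_of_terms` receives identical input for both lawsuits, whereas one fed the tuples can in principle treat the two companies differently.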
One disadvantage of structured representations of events is that they lead to increased sparsity, which potentially limits the predictive power. We propose to address this issue by representing structured events using event embeddings, which are dense vectors. Embeddings are trained such that similar events, such as (Actor = Nvidia fourth quarter results, Action = miss, Object = views) and (Actor = Delta profit, Action = didn’t reach, Object = estimates), have similar vectors, even if they do not share common words. In theory, embeddings are appropriate for achieving good results with a density estimator (e.g., convolutional neural network), which can misbehave in high dimensions [Bengio et al., 2005]. We train event embeddings using a novel neural tensor network (NTN), which can learn the semantic compositionality over event arguments by combining them multiplicatively instead of only implicitly, as with standard neural networks.
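To make "combining them multiplicatively" concrete, the following is a minimal sketch of a generic bilinear tensor layer. It is an illustration under stated assumptions, not the architecture used in this paper: the function names, the `tanh` nonlinearity, and the toy parameter values are all assumptions chosen for the example. Each output unit adds a bilinear term, in which every entry of one argument vector multiplies every entry of the other, to the usual linear term over the concatenated vectors.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def bilinear_form(x: Vector, m: Matrix, y: Vector) -> float:
    """Return x^T M y: each entry of x is multiplied with each entry of y."""
    return sum(x[i] * m[i][j] * y[j]
               for i in range(len(x)) for j in range(len(y)))


def tensor_layer(x: Vector, y: Vector, tensor: List[Matrix],
                 weights: Matrix, bias: Vector) -> Vector:
    """Compute tanh(x^T T[k] y + W[k] . [x; y] + bias[k]) for each slice k."""
    concat = x + y  # list concatenation, i.e. the stacked vector [x; y]
    outputs = []
    for m_k, w_k, b_k in zip(tensor, weights, bias):
        multiplicative = bilinear_form(x, m_k, y)            # explicit interaction
        additive = sum(w * v for w, v in zip(w_k, concat))   # standard linear term
        outputs.append(math.tanh(multiplicative + additive + b_k))
    return outputs


# Toy 2-dimensional argument vectors (made-up values, not learned embeddings).
actor = [1.0, 0.0]
action = [0.0, 1.0]
tensor = [[[0.0, 2.0], [0.0, 0.0]]]   # a single slice, so one output unit
weights = [[0.0, 0.0, 0.0, 0.0]]      # linear term switched off for clarity
bias = [0.0]

composed = tensor_layer(actor, action, tensor, weights, bias)
print(composed)
```

With the linear weights set to zero, the only signal in `composed` comes from the product of the first entry of `actor` and the second entry of `action`, which is the kind of pairwise interaction a purely additive layer can only approximate indirectly.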
For the predictive model, we propose to use deep learning [Bengio, 2009] to capture the influence of news events over a history that is longer than a day. Research shows that the effects of reported events on stock market volatility diminish over time. For example, Xie et al. [2013], Tetlock et al. [2008] and Ding et al. [2014] show that the performance of daily prediction is better than that of weekly and monthly prediction. As shown in Figure 1, the influences of three actual events for Google
Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence (IJCAI 2015)