and particle swarm optimization (PSO) together with ELM
for gene selection and classification. Huang et al. [32]
apply ELM to the traffic sign detection task.
Many works combine deep learning and ELM to achieve
better performance on their data sets. Using an auto-encoder
and ELM, Kasan et al. [33] report results that outperform
many other state-of-the-art deep learning methods on the
MNIST OCR data set. In [34], Tang et al. use deep
neural networks and ELM for ship detection in spaceborne
images, where deep neural networks are used for
higher-level feature representation and ELM is used for
decision-making. In [35], Li et al. apply ELM to both market
news and prices to make predictions of price movements.
2.3 Market impact analysis
Market news and stock prices are two of the most important
sources of market information used for market
impact analysis. In [36], Seo et al. follow text mining
approaches and build a multi-agent system for intelligent
portfolio management, which assesses companies' risk
levels by analyzing textual news features. In [1, 37],
Schumaker and Chen propose the AZFinText system,
which is based on a statistical model of news terms, to
predict future market price movements. In [38], a
multi-document summarization algorithm is first applied to
news, and more accurate predictions are generated from
the summaries than from the original full text. Besides the
works on news, there have been great efforts to analyze
market impact based on market prices. In [39], Gestel et al.
apply support vector regression to prices and make predictions
on volatility. In [40–44], Tay and Cao improve their
previous works, in which the objective function of the support
vector machine is modified to adapt to non-stationary price
time series. In [45], Huang et al. propose a support vector
machine-based system to predict the price movements of
the NIKKEI 225 index.
In [46], a market-making trading strategy is proposed,
which places orders in the market order book based on
signals generated from market quote ticks. In [47], Li et al.
enhance the market impact prediction accuracy by integrating
news and price information sources.
In this paper, we take advantage of the deep learned
representations reviewed in Sect. 2.1 and apply them to
market news and stock tick prices in order to obtain a better
feature representation than the human-engineered one. In
addition, we exploit the good classification performance
of the extreme learning machine, as stated in
Sect. 2.2, and set up a market impact analysis system
that places the extreme learning machine on top of the deep
learned representations. We design several different system
configurations to compare the performance of the proposed
system with many benchmarks. The empirical results
indicate that the proposed system produces convincing
outputs and outperforms the benchmarks in most of the
experimental cases.
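As a point of reference for the classifier placed on top of the learned representations, the following is a minimal sketch of a basic extreme learning machine, assuming the standard formulation: a randomly initialized hidden layer that is never trained, and output weights obtained by least squares. The function names, the hidden layer size, the sigmoid activation and the toy data are illustrative assumptions, not the configuration used in this paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_elm(X, T, n_hidden=50, seed=0):
    """Fit a basic ELM. X: (n_samples, n_features), T: one-hot targets."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))  # random input weights, never trained
    b = rng.normal(size=n_hidden)                # random hidden biases, never trained
    H = sigmoid(X @ W + b)                       # hidden layer output matrix
    beta = np.linalg.pinv(H) @ T                 # least-squares output weights
    return W, b, beta

def predict_elm(X, W, b, beta):
    """Return the index of the largest output unit for each sample."""
    return (sigmoid(X @ W + b) @ beta).argmax(axis=1)

# Toy stand-in for learned features: two well-separated clusters (assumption).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-2.0, 0.5, size=(40, 5)),
               rng.normal(2.0, 0.5, size=(40, 5))])
y = np.array([0] * 40 + [1] * 40)
T = np.eye(2)[y]                                 # one-hot encoding of the labels

W, b, beta = train_elm(X, T)
train_accuracy = (predict_elm(X, W, b, beta) == y).mean()
```

In a system of the kind described here, `X` would hold the deep learned representation of each news article rather than toy features, and the labels would be the discretized returns.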
3 Deep learned architecture
The architecture of the market impact analysis system is
shown in Fig. 1. In this section, we explain the processing
pipeline of the system step by step. The whole system
consists of two parts: (1) unsupervised deep learned
representation and (2) supervised classification. The first part
uses multiple layers of auto-encoders to abstract
the input data, and the second part uses several different
machine learning models to make market impact predictions
based on the features generated in the first part.
3.1 Preprocessing of prices and news
Some news articles are filtered out because of
two constraints: (1) market trading hours and (2) overlapping
prediction horizons. Take the Hong Kong Stock Exchange as an
example: trading runs from 9:30 to 12:00 in the
morning and from 1:00 to 4:00 in the afternoon. Since the
impact of overnight news is believed to be absorbed in the morning
auction hour, the system in this paper keeps only the news
articles whose time stamps fall within the trading hours. The
second issue is the overlap of prediction horizons. As
illustrated in Fig. 2, if two news articles $d_1$ and $d_2$ on
the same company are released within a time window $\Delta$, it is
hard to determine whether the market impact at time $t_{+\Delta}$ is
generated by $d_1$, or $d_2$, or both. To avoid this situation, $d_1$ is
purposely eliminated in the preprocessing. Following the
approach to news preprocessing in text mining [48], news
texts are first preprocessed by Chinese word segmentation and
stop word filtering. In the second step, each word is
considered a shallow feature, and TF-IDF (term frequency
and inverse document frequency) [49] is calculated as the
weight of the features.
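The preprocessing steps above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the implementation used in this paper: the function names, the tickers and the toy tokens are hypothetical, session boundaries are treated as inclusive, "within the window" is read as a strict inequality, and the TF-IDF variant (relative term frequency times log(N/df)) is one common choice, since the exact formula is not specified here.

```python
import math
from collections import Counter
from datetime import datetime, time, timedelta

# Hong Kong Stock Exchange sessions as given in the text (boundaries inclusive: assumption).
SESSIONS = [(time(9, 30), time(12, 0)), (time(13, 0), time(16, 0))]

def in_trading_hours(ts):
    return any(start <= ts.time() <= end for start, end in SESSIONS)

def filter_news(articles, window):
    """articles: list of (company, timestamp); window: timedelta for the horizon."""
    kept = sorted((a for a in articles if in_trading_hours(a[1])),
                  key=lambda a: (a[0], a[1]))
    result = []
    for i, (company, ts) in enumerate(kept):
        nxt = kept[i + 1] if i + 1 < len(kept) else None
        # Drop the earlier article d_1 when a later article d_2 on the
        # same company is released within the window.
        if nxt is not None and nxt[0] == company and nxt[1] - ts < window:
            continue
        result.append((company, ts))
    return result

def tfidf(docs):
    """docs: list of token lists, already segmented and stop-word filtered."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))  # document frequency
    return [{t: (c / len(doc)) * math.log(n / df[t])
             for t, c in Counter(doc).items()} for doc in docs]

# Hypothetical tickers and time stamps.
articles = [
    ("0005.HK", datetime(2019, 1, 2, 8, 45)),   # before the open: dropped
    ("0005.HK", datetime(2019, 1, 2, 10, 0)),   # followed within 20 min: dropped
    ("0005.HK", datetime(2019, 1, 2, 10, 10)),
    ("0005.HK", datetime(2019, 1, 2, 14, 0)),
    ("0700.HK", datetime(2019, 1, 2, 10, 5)),
    ("0700.HK", datetime(2019, 1, 2, 12, 30)),  # lunch break: dropped
]
kept = filter_news(articles, timedelta(minutes=20))

# Toy English tokens standing in for segmented Chinese news text.
weights = tfidf([["profit", "rises", "profit"], ["profit", "falls"]])
```

With this TF-IDF variant, a term that appears in every document (here "profit") receives weight zero, which is why variants with smoothing are often preferred in practice.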
To the best of our knowledge, it is hard to theoretically
determine a prediction horizon for each news piece. In
order to cover more cases, 5, 10, 15, 20, 25 and
30 min after the news release are used to mimic different
prediction horizons. The snapshot prices at these time points
are extracted and converted into simple returns, which are
further discretized and used as labels of the news articles.
Formally, assume one piece of news is released at $t_0$, and the
snapshot price at $t_0$ is $p_0$. According to the prediction
horizons above, the future 5-, 10-, 15-, 20-, 25- and 30-min
prices are extracted, denoted as $p_{+5}$, $p_{+10}$, $p_{+15}$,
$p_{+20}$, $p_{+25}$, and $p_{+30}$, respectively, and the simple
return is calculated by Eq. (1),
Neural Computing and Applications (2019) 31:5989–6000 5991