B. B. Cao
10.4236/jdaip.2019.74011 175 Journal of Data Analysis and Information Processing
sion of Internet information makes it increasingly difficult for people to
extract useful information. If investors are unable to obtain timely and accurate
information about events that lead to financial market volatility, the resulting
losses can be incalculable. Therefore, quickly finding valuable topics and
events in a large volume of Internet data is particularly important.
Over time, numerous methods for event discovery have been put forward
[1]-[11]. However, most of these methods discover events from either text data
[1]-[11] or time series data [12]-[19] alone. To the best of our knowledge, few
scholars have combined the characteristics of financial time series data and
text data in their research [20] [21] [22]. As a record of actual behavior in
financial markets, time series data such as stock trading data and market data
are often affected by events and can better reflect changes before and after
events. Therefore, this paper studies the discovery of financial events by
combining network text information and financial time series data, so as to
help investors quickly identify hot events and correctly grasp market
dynamics.
2. Post’s Influence of Network Public Opinion Space
2.1. Definition and Quantification of Post’s Activity
In web forums, netizens can express their concern for specific information by
posting, reading and replying. This degree of attention is an important
external feature of the emotional tendency of network public opinion. In this
paper, we call it post’s activity. To quantify users’ attention to topic
information intuitively, we calculate it from the readings amount and the
comments amount of the posts. The readings amount of a post reflects how widely
the information it contains has spread, and represents users’ instinctive
concern. The comments amount of a post reflects the attention paid to the
information it contains; it shows the importance users attach to interacting on
the topic, and carries stronger emotional intensity. We therefore choose the
readings amount and the comments amount as indicators of post’s activity. The
specific definitions are as follows:
Definition 2-1 Post’s activity: Assume that within a period of time $t$, a total
of $N$ posts are posted in the online public opinion space, denoted
$p_1, p_2, \ldots, p_N$. The readings amount of the $i$-th post is $r_i$, and the
comments amount is $c_i$. Then the total readings amount of the $N$ posts in the
time period $t$ is $R = \sum_{i=1}^{N} r_i$, and the average readings amount of
each post is $\bar{r} = R / N$. We define the propagation coefficient $\alpha_i$
of $p_i$ as the ratio of the readings amount of $p_i$ in time period $t$ to the
average readings amount of each post in the same time period. The formula of
$\alpha_i$ is $\alpha_i = r_i / \bar{r} = N r_i / \sum_{j=1}^{N} r_j$.
Similarly, the attention coefficient $\beta_i$ of $p_i$ is defined as the ratio
of the comments amount of $p_i$ in time period $t$ to the average comments amount of