cial media such as Twitter or Weibo [6,7]. However, these sources are not always reliable
because they sometimes contain rumors. 2) Most previous work did not pay enough attention
to the date unit. Without a precise date unit, the extracted events can be confusing,
because a described event may have happened several times and we cannot tell exactly
which occurrence is being referred to. Nor can we organize these events by time and
display them on a timeline. 3) Extracted events are sometimes ordered arbitrarily,
making it difficult for users either to obtain the events related to an entity or to
explore historical events retrospectively. Consequently, it is hard to shed light on
potential connections between events and entities.
In this paper, we propose a Chinese event extraction system that first extracts event
triples from Chinese online encyclopedias. Compared with other text sources, Chinese
online encyclopedias are authoritative and updated in a timely manner. Moreover, they
allow us to extract ample events with complete temporal expressions and to organize
events by date or by corresponding entity. We also address the three major challenges
in our event validation process to harvest high-quality event triples. To handle the
complexity of Chinese temporal expressions, we design an event fusion component to
address challenge C1 and a date unit correction and normalization component to address
challenges C2 and C3.
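To make the idea of organizing event triples by date or by entity concrete, the following is a minimal sketch and not the system's actual implementation. It assumes a triple of the form (entity, date, description), which this section does not spell out. The names `EventTriple`, `timeline`, and `by_entity` are hypothetical, and the date is stored as separate year, month, and day fields so that events with a coarser date unit can still be ordered.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EventTriple:
    """Hypothetical event triple: (entity, date, description)."""
    entity: str
    year: int                      # negative values denote years BC (assumption)
    month: Optional[int] = None    # None when the source gives only a year
    day: Optional[int] = None      # None when the source gives no day
    description: str = ""

    def sort_key(self):
        # Within the same year (or month), events with a missing
        # month (or day) are placed before events with a known one.
        return (self.year, self.month or 0, self.day or 0)


def timeline(events):
    """Return the events in chronological order."""
    return sorted(events, key=EventTriple.sort_key)


def by_entity(events):
    """Group events by entity, each group in chronological order."""
    groups = defaultdict(list)
    for event in timeline(events):
        groups[event.entity].append(event)
    return dict(groups)


# Placeholder data for illustration only, not taken from the event base.
events = [
    EventTriple("Entity A", 2018, 5, 1, "placeholder event"),
    EventTriple("Entity B", -221, description="placeholder event"),
    EventTriple("Entity A", 2018, 3, description="placeholder event"),
]
ordered = timeline(events)
grouped = by_entity(events)
```

Keeping the date unit explicit (year only, year and month, or a full date) is what allows the same records to be sorted on a global timeline and also grouped per entity. Placing a coarser date before a finer one within the same year is an arbitrary choice made for this sketch.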
In summary, we have extracted a total of 7,250,838 events from the articles of
2,624,974 entities, covering over 6,000 years of history from 5,000 BC to 2018. In
addition, each event description retains its hyperlinks to the pages of other related
entities, yielding the largest Chinese structured event base.
Another main contribution of this paper is that our system facilitates event
exploration for users who are interested in the historical facts of entities. An event
timeline is an effective way to present the stories of entities and provide context to
users. Our Timeline system thus implements an event timeline web portal that not only
provides a succinct chronological overview of the events for each entity, but also
provides a historical event timeline that enables users to explore important historical
events retrospectively. Our system also provides APIs, giving users the option to obtain
the complete structured event triples.
To the best of our knowledge, the Timeline system is the first of its kind, providing
knowledge of an entity in the form of event triples and encoding events along a timeline.
These structured event triples supplement existing Chinese event bases, and the timeline
visualization makes the exploration of events more enjoyable and effective.
2. Related Work
In this section, we discuss relevant previous work, including event extraction from
unstructured text and visualization methods for events.
2.1. Event Extraction from Unstructured Text
Researchers increasingly leverage natural language processing methods to harvest events
from all kinds of text sources. Although much work has been done on English documents
(e.g., [10,11]), considerably less event extraction work has focused on Chinese
documents [12]. Most previous work focuses on extracting events from