Drain: An Online Log Parsing Approach with Fixed
Depth Tree
Pinjia He∗, Jieming Zhu∗, Zibin Zheng†, and Michael R. Lyu∗
∗Computer Science and Engineering Department, The Chinese University of Hong Kong, China
{pjhe, jmzhu, lyu}@cse.cuhk.edu.hk
†Key Laboratory of Machine Intelligence and Advanced Computing (Sun Yat-sen University), Ministry of Education
School of Data and Computer Science, Sun Yat-sen University, China
zhzibin@mail.sysu.edu.cn
Abstract—Logs, which record valuable system runtime infor-
mation, have been widely employed in Web service management
by service providers and users. A typical log analysis based Web
service management procedure is to first parse raw log messages,
because of their unstructured format, and then apply data mining
models to extract critical system behavior information, which can
assist Web service management.
methods focus on offline, batch processing of logs. However, as
the volume of logs increases rapidly, model training of offline
log parsing methods, which employs all existing logs after log
collection, becomes time-consuming. To address this problem,
we propose an online log parsing method, namely Drain, that
can parse logs in a streaming and timely manner. To accelerate
the parsing process, Drain uses a fixed depth parse tree, which
encodes specially designed rules for parsing. We evaluate Drain
on five real-world log data sets with more than 10 million raw
log messages. The experimental results show that Drain has the
highest accuracy on four data sets, and comparable accuracy
on the remaining one. In addition, Drain achieves a 51.85%∼81.47%
improvement in running time over the state-of-the-art
online parser. We also conduct a case study on an anomaly
detection task that uses Drain in the parsing step, which
demonstrates the effectiveness of Drain in log analysis.
Index Terms—Log parsing; Online algorithm; Log analysis;
Web service management;
I. INTRODUCTION
The prevalence of cloud computing, which enables on-
demand service delivery, has made Service-oriented Architec-
ture (SOA) a dominant architectural style. Nowadays, more
and more developers leverage existing Web services to build
their own systems because of their rich functionality and
“plug-and-play” property. Although developing Web service
based systems is convenient and lightweight, Web service
management is a significant challenge for both service providers
and users. Specifically, service providers (e.g., Amazon EC2
[1]) are expected to provide services with no failures or SLA
(service-level agreement) violations to a large number of users.
Similarly, service users need to effectively and efficiently
manage the adopted services, a need that has been discussed in
many recent works (e.g., Web service monitoring [2]). In this
context, log analysis based service management techniques,
which employ service logs to achieve automatic or semi-
automatic service management, have been widely studied.
Logs are usually the only data resource available that
records service runtime information. In general, a log message
is a line of text printed by logging statements (e.g., printf(),
logging.info()) written by developers. Thus, log analysis
techniques, which apply data mining models to gain insights
into system behaviors, are in widespread use for service management.
For service providers, there are studies in anomaly detection
[3], [4], fault diagnosis [5], [6] and performance improvement
[7]. For service users, typical examples include business model
mining [8], [9] and user behavior analysis [10], [11].
Most of the data mining models used in these log analysis
techniques require structured input (e.g., an event list or a
matrix). However, raw log messages are usually unstructured,
because developers are allowed to write free-text log messages
in source code. Thus, the first step of log analysis is log
parsing, where unstructured log messages are transformed into
structured events. An unstructured log message, as in the
following example, usually contains various forms of system
runtime information: a timestamp (records the occurrence time
of an event), a verbosity level (indicates the severity of an
event, e.g., INFO), and the raw message content (a free-text
description of a service operation).
081109 204655 556 INFO dfs.DataNode$PacketResponder:
Received block blk_3587508140051953248 of size
67108864 from /10.251.42.84
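As a rough illustration, the header fields of a message like the one above can be separated from the free-text content with a simple pattern. The regex and field names below are assumptions tailored to this HDFS-style layout, not part of Drain; real systems use many different log layouts.

```python
import re

# Assumed pattern for the HDFS-style layout shown above:
# date, time, pid, verbosity level, component, then free-text content.
LOG_PATTERN = re.compile(
    r"(?P<date>\d{6}) (?P<time>\d{6}) (?P<pid>\d+) "
    r"(?P<level>[A-Z]+) (?P<component>\S+): (?P<content>.*)"
)

def split_log_message(line):
    """Split a raw log message into header fields and message content."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None

raw = ("081109 204655 556 INFO dfs.DataNode$PacketResponder: "
       "Received block blk_3587508140051953248 of size 67108864 "
       "from /10.251.42.84")
fields = split_log_message(raw)
# fields["level"] is "INFO"; fields["content"] holds the free-text part
```

Header fields follow a fixed layout and are easy to extract this way; it is the free-text content that log parsing must turn into structured events.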
Traditionally, log parsing relies heavily on regular expres-
sions [12], which are designed and maintained manually by
developers. However, this manual method is not suitable for
logs generated by modern services for the following three
reasons. First, the volume of logs is increasing rapidly, which
makes the manual method prohibitive. For example, a large-
scale service system can generate 50 GB logs (120∼200
million lines) per hour [13]. Second, as open-source platforms
(e.g., Github) and Web services become popular, a system often
consists of components written by hundreds of developers
globally [3]. Thus, people in charge of the regular expressions
may not know the original logging purpose, which makes
manual management even harder. Third, logging statements
in modern systems update frequently (e.g., hundreds of new
logging statements every month [14]). In order to maintain
a correct regular expression set, developers need to check all
logging statements regularly, which is tedious and error-prone.
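To make the maintenance burden concrete, a hand-crafted rule for just the single event type shown earlier might look like the following sketch. The regex and the `<*>` wildcard convention are illustrative assumptions; every new or changed logging statement forces a developer to write or revise another such rule.

```python
import re

# Illustrative hand-written rule for one event type: abstract the
# variable fields (block id, size, IP address) into wildcards.
RECEIVED_BLOCK = re.compile(
    r"Received block blk_-?\d+ of size \d+ from /[\d.]+"
)

def to_template(content):
    """Map a message's content to its event template, if a rule matches."""
    if RECEIVED_BLOCK.fullmatch(content):
        return "Received block <*> of size <*> from <*>"
    return None  # no rule matched: a developer must write a new one

content = ("Received block blk_3587508140051953248 "
           "of size 67108864 from /10.251.42.84")
```

A production system would need hundreds of such rules, each tied to a logging statement that may change in the next release, which is exactly the maintenance problem automatic log parsers aim to remove.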
Log parsing has been widely studied to parse raw log messages
automatically. Most existing log parsers focus on offline,
batch processing. For example, Xu et al. [3] design a method
2017 IEEE 24th International Conference on Web Services
978-1-5386-0752-7/17 $31.00 © 2017 IEEE
DOI 10.1109/ICWS.2017.13