Graph Convolutional Networks for Text Classification
Liang Yao, Chengsheng Mao, Yuan Luo∗
Northwestern University
Chicago, IL 60611
{liang.yao, chengsheng.mao, yuan.luo}@northwestern.edu
Abstract
Text classification is an important and classical problem in
natural language processing. There have been a number of
studies that applied convolutional neural networks (convolution on a regular grid, e.g., a sequence) to classification. However, only a limited number of studies have explored the more flexible graph convolutional neural networks (convolution on a non-grid structure, e.g., an arbitrary graph) for the task. In this work, we propose to use graph convolutional networks for text classification. We build a single text graph for a corpus based on word co-occurrence and document-word relations, then learn
a Text Graph Convolutional Network (Text GCN) for the cor-
pus. Our Text GCN is initialized with one-hot representation
for word and document, it then jointly learns the embeddings
for both words and documents, as supervised by the known
class labels for documents. Our experimental results on mul-
tiple benchmark datasets demonstrate that a vanilla Text GCN
without any external word embeddings or knowledge outper-
forms state-of-the-art methods for text classification. On the
other hand, Text GCN also learns predictive word and document embeddings. In addition, experimental results show that the improvement of Text GCN over state-of-the-art comparison methods becomes more prominent as we lower the percentage of training data, suggesting the robustness of Text GCN to limited training data in text classification.
Introduction
Text classification is a fundamental problem in natural lan-
guage processing (NLP). There are numerous applications
of text classification such as document organization, news
filtering, spam detection, opinion mining, and computa-
tional phenotyping (Aggarwal and Zhai 2012; Zeng et al.
2018). An essential intermediate step for text classification
is text representation. Traditional methods represent text
with hand-crafted features, such as sparse lexical features
(e.g., bag-of-words and n-grams). Recently, deep learning
models have been widely used to learn text representa-
tions, including convolutional neural networks (CNN) (Kim
2014) and recurrent neural networks (RNN) such as long
short-term memory (LSTM) (Hochreiter and Schmidhuber
1997). As CNN and RNN prioritize locality and sequential-
ity (Battaglia et al. 2018), these deep learning models can
∗Corresponding Author
Copyright © 2019, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
capture semantic and syntactic information in local consecutive word sequences well, but may ignore global word co-occurrence in a corpus, which carries non-consecutive and long-distance semantics (Peng et al. 2018).
Recently, a new research direction called graph neural
networks or graph embeddings has attracted wide atten-
tion (Battaglia et al. 2018; Cai, Zheng, and Chang 2018).
Graph neural networks have been effective on tasks thought to have rich relational structure and can preserve the global structural information of a graph in graph embeddings.
In this work, we propose a new graph neural network-
based method for text classification. We construct a single
large graph from an entire corpus, which contains words and
documents as nodes. We model the graph with a Graph Con-
volutional Network (GCN) (Kipf and Welling 2017), a sim-
ple and effective graph neural network that captures high-order neighborhood information. Edges between two word nodes are built from word co-occurrence information, and edges between a word node and a document node are built using the word's frequency in the document and the word's document frequency. We then turn the text classification problem into a node classification problem. The method can achieve strong classification performance with a small proportion of labeled documents and learns interpretable word and document node embeddings; a concrete sketch of this construction follows the contribution list below. Our source code is available at https://github.com/yao8839836/text_gcn. To summarize, our contributions are as follows:
• We propose a novel graph neural network method for text
classification. To the best of our knowledge, this is the
first study to model a whole corpus as a heterogeneous
graph and jointly learn word and document embeddings with graph neural networks.
• Results on several benchmark datasets demonstrate that
our method outperforms state-of-the-art text classifica-
tion methods, without using pre-trained word embeddings
or external knowledge. Our method also learns predictive word and document embeddings automatically.
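To make the construction concrete, below is a minimal Python sketch (using NumPy and scikit-learn) of the graph building and of a single GCN propagation step. The helper names and the specific weighting choices, TF-IDF for document-word edges and positive pointwise mutual information (PMI) over sliding windows for word-word edges, are shown here for illustration; the exact definitions we adopt are given in the Method section.

```python
import numpy as np
from collections import Counter
from itertools import combinations
from sklearn.feature_extraction.text import TfidfVectorizer


def build_text_graph(docs, window=20):
    """Build the adjacency matrix over [documents + words] nodes."""
    tfidf = TfidfVectorizer()
    doc_word = tfidf.fit_transform(docs).toarray()   # TF-IDF document-word weights
    widx = tfidf.vocabulary_                         # term -> column index
    n_doc, n_word = doc_word.shape

    # Slide a fixed-size window over each document to count co-occurrences.
    win_count, pair_count, n_win = Counter(), Counter(), 0
    for doc in docs:
        toks = [t for t in doc.lower().split() if t in widx]
        for s in range(max(1, len(toks) - window + 1)):
            win = set(toks[s:s + window])
            n_win += 1
            win_count.update(win)
            pair_count.update(combinations(sorted(win), 2))

    n = n_doc + n_word
    A = np.eye(n)                                    # self-loops on all nodes
    A[:n_doc, n_doc:] = doc_word                     # document-word edges
    A[n_doc:, :n_doc] = doc_word.T
    for (wi, wj), c in pair_count.items():           # word-word edges: keep PMI > 0
        pmi = np.log(c * n_win / (win_count[wi] * win_count[wj]))
        if pmi > 0:
            i, j = n_doc + widx[wi], n_doc + widx[wj]
            A[i, j] = A[j, i] = pmi
    return A


def gcn_layer(A, H, W):
    """One propagation step: ReLU(D^{-1/2} A D^{-1/2} H W)."""
    d = A.sum(axis=1)                                # degrees (>= 1 via self-loops)
    A_norm = A / np.sqrt(np.outer(d, d))             # symmetric normalization
    return np.maximum(A_norm @ H @ W, 0.0)
```

Stacking two such layers over one-hot (identity) input features and applying a softmax over the document rows gives the node-classification setup described above.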
Related Work
Traditional Text Classification
Traditional text classification studies mainly focus on fea-
ture engineering and classification algorithms. For feature