Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, pages 1201–1207
August 1–6, 2021. ©2021 Association for Computational Linguistics
Fusing Context Into Knowledge Graph for Commonsense Question Answering
Yichong Xu∗, Chenguang Zhu∗, Ruochen Xu, Yang Liu, Michael Zeng, Xuedong Huang
Microsoft Cognitive Services Research Group
{yicxu,chezhu,ruox,yaliu10,nzeng,xdh}@microsoft.com
Abstract
Commonsense question answering (QA) requires a model to grasp commonsense and factual knowledge to answer questions about world events. Many prior methods couple language modeling with knowledge graphs (KG). However, although a KG contains rich structural information, it lacks the context needed for a more precise understanding of the concepts. This creates a gap when fusing knowledge graphs into language modeling, especially when there is insufficient labeled data. Thus, we propose to employ external entity descriptions to provide contextual information for knowledge understanding. We retrieve descriptions of related concepts from Wiktionary and feed them as additional input to pre-trained language models. The resulting model achieves state-of-the-art results on the CommonsenseQA dataset and the best results among non-generative models on OpenBookQA.
1 Introduction
One critical aspect of human intelligence is the ability to reason over everyday matters based on observation and knowledge. This capability is usually shared by most people as a foundation for communication and interaction with the world. Therefore, commonsense reasoning has emerged as an important task in natural language understanding, with various datasets and models proposed in this area (Ma et al., 2019; Talmor et al., 2018; Wang et al., 2020; Lv et al., 2020).
While massive pre-trained models (Devlin et al., 2018; Liu et al., 2019) are effective in language understanding, they lack modules to explicitly handle knowledge and commonsense. Also, structured data such as knowledge graphs are much more efficient at representing commonsense than unstructured text. Therefore, there have been multiple
∗Equal contribution
methods coupling language models with various forms of knowledge graphs (KG) for commonsense reasoning, including knowledge bases (Sap et al., 2019; Yu et al., 2020b), relational paths (Lin et al., 2019), graph relation networks (Feng et al., 2020) and heterogeneous graphs (Lv et al., 2020). These methods combine the merits of language modeling and structural knowledge, improving performance on commonsense reasoning and question answering.
However, there is still a non-negligible gap between the performance of these models and humans. One reason is that, although a KG can encode topological information between concepts, it lacks rich contextual information. For instance, for a graph node representing the entity “Mona Lisa”, the graph depicts its relations to multiple other entities. But even given this neighborhood information, it is still hard to infer that it is a painting. On the other hand, we can retrieve the precise definition of “Mona Lisa” from external sources; e.g., its definition in Wiktionary is “A painting by Leonardo da Vinci, widely considered as the most famous painting in history”. To represent structured data so that it can be seamlessly integrated into language models, we need to provide a panoramic view of each concept in the knowledge graph, including its neighboring concepts, its relations to them, and a definitive description of it.
Thus, we propose the DEKCOR model, i.e., DEscriptive Knowledge for COmmonsense question answeRing, to tackle multiple-choice commonsense questions. Given a question and a choice, we first extract the contained concepts. Then, we extract the edge between the question concept and the choice concept in ConceptNet (Speer et al., 2017). If such an edge does not exist, we compute a relevance score for each knowledge triple (subject, relation, object) containing the choice concept, and select the one with the highest score. Next, we