An Iterative Polishing Framework based on Quality Aware Masked Language
Model for Chinese Poetry Generation
Liming Deng,1 Jie Wang,1 Hangming Liang,1 Hui Chen,1 Zhiqiang Xie,3∗ Bojin Zhuang,1 Shaojun Wang,1 Jing Xiao2
1 Ping An Technology
2 Ping An Insurance (Group) Company of China
3 University of Science and Technology of China
dengliming777@pingan.com.cn, photonicsjay@163.com
Abstract
Owing to its unique literary and aesthetic characteristics, automatic
generation of Chinese poetry remains challenging in Artificial Intelligence
and can hardly be realized straightforwardly by end-to-end methods. In this
paper, we propose a novel iterative polishing framework for high-quality
Chinese poetry generation. In the first stage, an encoder-decoder structure
is utilized to generate a poem draft. Afterwards, our proposed Quality-Aware
Masked Language Model (QA-MLM) is employed to polish the draft towards
higher quality in terms of linguistics and literariness. Based on a
multi-task learning scheme, QA-MLM is able to determine from the poem draft
whether polishing is needed. Furthermore, QA-MLM is able to localize
improper characters in the poem draft and substitute them with newly
predicted ones. Benefiting from the masked language model structure, QA-MLM
incorporates global context information into the polishing process, which
yields more appropriate polishing results than unidirectional sequential
decoding. Moreover, the iterative polishing process is terminated
automatically once QA-MLM regards the processed poem as a qualified one.
Both human and automatic evaluations have been conducted, and the results
demonstrate that our approach effectively improves the performance of the
encoder-decoder structure.
Introduction
Chinese poetry, which originated from people's production and daily life,
has a long history. It developed from poems with few characters and vague
rules into forms with a fixed number of characters and lines and with stable
rules. Rules such as the tonal pattern and rhyme scheme make poems easy to
read and remember. Great poems, which touch the hearts of millions of people
across space and time, unify concise form, refined language and rich
content, and this guarantees their long-term prosperity. Writing great poems
is not easy: it requires poets to have a strong desire to express their
feelings, views or thoughts, and then to choose characters and build
sentences carefully.
Poets are often regarded as geniuses with great talent who are well trained
in writing poems. It is hard for ordinary people to write a poem, let alone
for computers. Although many works (Gervás 2001; Ghazvininejad et al. 2016;
Yi et al. 2018; Li et al. 2018) have addressed automatic poetry generation,
and poetic rules and forms can be partially learned, large gaps remain in
the meaningfulness and coherence of generated poems.
∗ This work was done when Zhiqiang Xie was at Ping An Technology.
In this paper, we focus on automatic Chinese poetry generation and aim to
fill these gaps. We notice that poets first write a poem draft and then
polish the draft many times until it is perfected. There is a popular story
about poem polishing by Dao Jia, a famous poet of the Tang Dynasty, who has
influenced many later poets to polish their poems intensively. Motivated by
this writing process, we aim to imitate it and improve the coherence and
meaningfulness of primitive poems. However, it is challenging for computer
algorithms to automatically polish a poem draft into an excellent one.
Unlike poets, who choose characters and sentences with intuition and a
comprehensive understanding of the characters, computer algorithms are only
good at calculating the probabilities of characters and picking those with
the maximum probability from the vocabulary. There are three key issues to
be addressed by a polishing framework.
• Does the text need to be polished, and when should the iterative
polishing process stop?
• Which characters in the text are improper and need to be replaced with
better ones?
• How can the better characters be obtained?
To address these key issues and further improve the quality of generated
poems, we propose a Quality-Aware Masked Language Model (QA-MLM) to
implement an iterative polishing process. To the best of our knowledge, this
is the first work to solve the three key issues of a polishing framework
with a single model.
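The three questions above can be read as a simple control loop. The following Python sketch is only an illustration of that loop, not the QA-MLM implementation: the names `polish`, `locate_bad`, `predict_masked` and `max_iters` are hypothetical, the iteration cap is an added safeguard that the text does not mention, and the two toy functions merely stand in for the quality-aware and masked-language-model predictions so that the example runs.

```python
from typing import Callable, List, Optional

MASK = "[MASK]"  # placeholder token for the position being re-predicted


def polish(
    draft: str,
    locate_bad: Callable[[List[str]], Optional[int]],
    predict_masked: Callable[[List[str], int], str],
    max_iters: int = 10,  # assumption: safety cap, not part of the paper's text
) -> str:
    """Iteratively replace one low-quality character at a time.

    locate_bad returns the index of an improper character, or None when the
    text is judged good enough (issues 1 and 2). predict_masked proposes a
    replacement for the masked position using the full context (issue 3).
    """
    chars = list(draft)
    for _ in range(max_iters):
        pos = locate_bad(chars)
        if pos is None:  # quality judged sufficient: stop polishing
            break
        masked = chars[:pos] + [MASK] + chars[pos + 1:]
        chars[pos] = predict_masked(masked, pos)
    return "".join(chars)


# Toy stand-ins (hypothetical): a fixed reference string plays the role of
# "what a good text looks like" so the loop can be exercised without a model.
TARGET = "abcde"


def toy_locate_bad(chars: List[str]) -> Optional[int]:
    for i, ch in enumerate(chars):
        if ch != TARGET[i]:
            return i
    return None


def toy_predict_masked(masked: List[str], pos: int) -> str:
    return TARGET[pos]


if __name__ == "__main__":
    print(polish("abXdY", toy_locate_bad, toy_predict_masked))
```

In a real system both callables would be backed by the same trained model; the sketch only fixes the order of the decisions: judge quality, locate a character, re-predict it from context, and repeat until no position is flagged.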
Our idea originates from BERT (Devlin et al. 2018) and its two-task
learning scheme. We modify the tasks so that the model is aware of text
quality and can obtain appropriate characters to replace the low-quality
characters in the text. With these two tasks, we can polish the generated
poem draft iteratively, and the polishing process is terminated
automatically. The main contributions of this paper are summa-
arXiv:1911.13182v1 [cs.CL] 29 Nov 2019