ICLR 2017: Neuro-Symbolic Program Synthesis for the Program Induction Problem


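The ICLR 2017 (International Conference on Learning Representations) paper "MSR2017-Neuro Symbolic Program Synthesis" puts forward an important research direction: Neuro-Symbolic Program Synthesis. In recent years, deep learning has been widely applied to program induction, that is, learning from input-output examples a general mapping that carries over to new test inputs, and neural architectures have shown impressive performance on it. These methods nevertheless have several key limitations:

1. **Computational cost and training complexity**: conventional neural program induction models often need large amounts of compute and time to train, which limits their efficiency in practice.
2. **Task specificity**: every task (that is, every distinct program) needs its own separately trained model, so existing models cannot be reused and development and deployment become more complex.
3. **Interpretability and verifiability**: because the learned mapping lives inside a neural network, these models are hard to understand and verify, a serious obstacle for applications that must guarantee correctness, such as safety-critical systems.

Neuro-Symbolic Program Synthesis aims to address these problems by combining neural networks with symbolic logic. Once trained, the approach automatically constructs consistent, interpretable computer programs in a domain-specific language. The combination lets the model draw on the generalization ability of neural networks while keeping the clear structure and verifiability of symbolic logic, which improves efficiency, reduces task specificity, and makes the model more transparent and reliable. The paper's main contributions may include:

- **Model architecture design**: a new framework that combines neural networks and symbolic logic, possibly by using neural networks to represent and learn part of the functionality while a symbolic reasoning module handles the structure and logic of the program.
- **Training and optimization**: effective training algorithms that preserve model performance while keeping compute cost and training time under control, for example through reinforcement learning or joint optimization.
- **Interpretability and verification**: ways to tie the network's decisions to symbolic rules so that the generated programs are easier to understand and audit, for example through visualization or readable descriptions of the program logic.
- **Evaluation and benchmarks**: the paper may provide experimental evidence of the new method's performance on different tasks and its advantages over conventional neural program induction methods.

In summary, "MSR2017-Neuro Symbolic Program Synthesis" is a potential game changer: it seeks a better balance in program generation, combining neural and symbolic methods to overcome current challenges and make program learning more practical and interpretable.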

Under review as a conference paper at ICLR 2017

(a)

| # | Input v | Output |
|---|---------|--------|
| 1 | William Henry Charles | Charles, W. |
| 2 | Michael Johnson | Johnson, M. |
| 3 | Barack Rogers | Rogers, B. |
| 4 | Martha D. Saunders | Saunders, M. |
| 5 | Peter T Gates | Gates, P. |

(b)

    String e      := Concat(f_1, ..., f_n)
    Substring f   := ConstStr(s) | SubStr(v, p_l, p_r)
    Position p    := (r, k, Dir) | ConstPos(k)
    Direction Dir := Start | End
    Regex r       := s | T_1 | ... | T_n

Figure 1: (a) An example FlashFill task for transforming names to the last name followed by the initial of the first name, and (b) the DSL for regular-expression-based string transformations.

The syntax and semantics of the DSL for string transformations are shown in Figure 1(b) and Figure 7, respectively. The DSL corresponds to a large subset of the FlashFill DSL (except conditionals), and allows for a richer class of substring operations than FlashFill. A DSL program takes as input a string v and returns an output string o. The top-level string expression e is a concatenation of a finite list of substring expressions f_1, ..., f_n. A substring expression f can either be a constant string s or a substring expression, which is defined using two position logics p_l (left) and p_r (right). A position logic corresponds to a symbolic expression that evaluates to an index in the string. A position logic p can either be a constant position k or a token match expression (r, k, Dir), which denotes the Start or End of the k-th match of token r in input string v. A regex token can either be a constant string s or one of 8 regular expression tokens: p (ProperCase), C (CAPS), l (lowercase), d (Digits), α (Alphabets), αn (Alphanumeric), ^ (StartOfString), and $ (EndOfString). The semantics of the DSL programs is described in the appendix.

A DSL program for the name transformation task shown in Figure 1(a) that is consistent with the examples is: Concat(f_1, ConstStr(“, ”), f_2, ConstStr(“.”)), where f_1 ≡ SubStr(v, (“ ”, −1, End), ConstPos(−1)) and f_2 ≡ SubStr(v, ConstPos(0), ConstPos(1)). The program concatenates the following 4 strings: i) the substring between the end of the last whitespace and the end of the string, ii) the constant string “, ”, iii) the first character of the input string, and iv) the constant string “.”.
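To make the example concrete, the following is a minimal Python sketch of an interpreter for a fragment of the DSL, run on the name transformation program. It is not the paper's implementation. The class and function names (`ConstPos`, `TokenPos`, `eval_pos`, `run_program`) are chosen here for illustration, only constant-string tokens are supported, and two semantic details are assumptions made so that the program reproduces Figure 1(a): substrings are half-open ranges [left, right), and a negative constant position counts from the end, with −1 denoting the end of the string. The exact semantics are the ones given in the paper's appendix.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class ConstPos:
    k: int


@dataclass(frozen=True)
class TokenPos:
    r: str          # constant-string token only; regex token classes are omitted
    k: int          # 1-based match index; negative values count from the end
    direction: str  # "Start" or "End"


@dataclass(frozen=True)
class ConstStr:
    s: str


@dataclass(frozen=True)
class SubStr:
    left: Union[ConstPos, TokenPos]
    right: Union[ConstPos, TokenPos]


def eval_pos(p: Union[ConstPos, TokenPos], v: str) -> int:
    """Evaluate a position logic to an index into v."""
    if isinstance(p, ConstPos):
        # Assumption: negative k counts from the end, -1 meaning end of string.
        return p.k if p.k >= 0 else len(v) + p.k + 1
    if p.k == 0:
        raise ValueError("match index k must be non-zero")
    starts = []
    i = v.find(p.r)
    while i != -1:
        starts.append(i)
        i = v.find(p.r, i + 1)
    start = starts[p.k - 1] if p.k > 0 else starts[p.k]
    return start if p.direction == "Start" else start + len(p.r)


def eval_sub(f: Union[ConstStr, SubStr], v: str) -> str:
    if isinstance(f, ConstStr):
        return f.s
    # Assumption: half-open slice [left, right).
    return v[eval_pos(f.left, v):eval_pos(f.right, v)]


def run_program(program: List[Union[ConstStr, SubStr]], v: str) -> str:
    """Concat(f_1, ..., f_n): evaluate each substring expression and join."""
    return "".join(eval_sub(f, v) for f in program)


# Concat(f_1, ConstStr(", "), f_2, ConstStr("."))
NAME_PROGRAM = [
    SubStr(TokenPos(" ", -1, "End"), ConstPos(-1)),  # f_1: last name
    ConstStr(", "),
    SubStr(ConstPos(0), ConstPos(1)),                # f_2: first initial
    ConstStr("."),
]

print(run_program(NAME_PROGRAM, "William Henry Charles"))  # Charles, W.
```

Under these assumptions the same program also maps the other four inputs in Figure 1(a) to their listed outputs.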

3 OVERVIEW OF OUR APPROACH

We now present an overview of our approach. Given a DSL L, we learn a generative model of programs in the DSL L that is conditioned on input-output examples to efficiently search for consistent programs. The workflow of our system is shown in Figure 2; the system is trained end-to-end using a large training set of programs in the DSL together with their corresponding input-output examples. To generate a large training set, we uniformly sample programs from the DSL and then use a rule-based strategy to compute well-formed input strings that satisfy the pre-conditions of the programs. The corresponding output strings are obtained by running the programs on the input strings.

A DSL can be considered a context-free grammar with a start symbol S and a set of non-terminals with corresponding expansion rules. The (partial) grammar derivations or trees correspond to (partial) programs. A naïve way to perform a search over the programs in a DSL is to start from the start symbol S and then randomly choose non-terminals to expand with randomly chosen expansion rules until reaching a derivation with only terminals. We, instead, learn a generative model over partial derivations in the DSL that assigns probabilities to different non-terminals in a partial derivation and corresponding expansions to guide the search for complete derivations.
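The naïve search described above can be sketched in a few lines of Python. This is only an illustration of the unguided baseline, not the learned model. The grammar table is a hand-written token-level encoding of Figure 1(b); the helper non-terminal `fs` (for the variable-length argument list of Concat), the function name `naive_random_derivation`, and the `max_expansions` guard are assumptions introduced here.

```python
import random

# Non-terminal -> list of expansion rules; each rule is a sequence of symbols.
# Any symbol that is not a key of GRAMMAR is treated as a terminal.
GRAMMAR = {
    "e": [["Concat(", "fs", ")"]],
    "fs": [["f"], ["f", ", ", "fs"]],  # hypothetical helper for f_1, ..., f_n
    "f": [["ConstStr(s)"], ["SubStr(v, ", "p", ", ", "p", ")"]],
    "p": [["(", "r", ", k, ", "Dir", ")"], ["ConstPos(k)"]],
    "Dir": [["Start"], ["End"]],
    "r": [["s"], ["T"]],
}


def naive_random_derivation(rng, start="e", max_expansions=1000):
    """Expand randomly chosen non-terminals with randomly chosen rules
    until the derivation contains only terminals."""
    derivation = [start]
    expansions = 0
    while True:
        open_slots = [i for i, sym in enumerate(derivation) if sym in GRAMMAR]
        if not open_slots:
            return derivation
        if expansions >= max_expansions:
            raise RuntimeError("derivation did not terminate")
        i = rng.choice(open_slots)                  # random non-terminal
        rule = rng.choice(GRAMMAR[derivation[i]])   # random expansion rule
        derivation[i:i + 1] = rule
        expansions += 1


tokens = naive_random_derivation(random.Random(0))
print("".join(tokens))
```

A guided search in the spirit of the text would replace the two `rng.choice` calls with draws from a learned distribution over non-terminals and expansions, conditioned on the input-output examples.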

Our generative model uses a Recursive-Reverse-Recursive Neural Network (R3NN) to encode partial trees (derivations) in L, where each node in the partial tree encodes global information about every other node in the tree. The model assigns a vector representation for every symbol and every expansion rule in the grammar. Given a partial tree, the model first assigns a vector representation to each leaf node, and then performs a recursive pass going up in the tree to assign a global tree representation to the root. It then performs a reverse-recursive pass starting from the root to assign a global tree representation to each node in the tree.
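The two passes can be illustrated with a small pure-Python sketch. It shows only the data flow described above, under several assumptions that go beyond this overview: each expansion rule gets one untrained, randomly initialized `tanh` layer per pass, the embedding size `DIM` is arbitrary, conditioning on input-output examples is left out, and the names `Node`, `R3NNSketch`, `concat2`, and `conststr` are hypothetical.

```python
import math
import random

DIM = 4  # embedding size (assumed)


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def tanh_layer(W, x):
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W]


class Node:
    def __init__(self, symbol, rule=None, children=()):
        self.symbol = symbol
        self.rule = rule              # expansion rule used at this node, if any
        self.children = list(children)
        self.up = None                # set by the recursive (bottom-up) pass
        self.down = None              # set by the reverse-recursive pass


class R3NNSketch:
    def __init__(self, symbols, rules, seed=0):
        # rules maps a rule name to its number of right-hand-side symbols.
        rng = random.Random(seed)
        self.symbol_vec = {s: [rng.uniform(-1, 1) for _ in range(DIM)]
                           for s in symbols}
        self.up_w = {r: rand_matrix(DIM, DIM * n, rng) for r, n in rules.items()}
        self.down_w = {r: rand_matrix(DIM * n, DIM, rng) for r, n in rules.items()}

    def recursive_pass(self, node):
        """Leaves take their symbol vector; inner nodes combine their children."""
        if not node.children:
            node.up = list(self.symbol_vec[node.symbol])
        else:
            concat = [x for c in node.children for x in self.recursive_pass(c)]
            node.up = tanh_layer(self.up_w[node.rule], concat)
        return node.up

    def reverse_pass(self, node, vec):
        """Push the root's global representation back down to every node."""
        node.down = vec
        if node.children:
            out = tanh_layer(self.down_w[node.rule], vec)
            for i, child in enumerate(node.children):
                self.reverse_pass(child, out[i * DIM:(i + 1) * DIM])

    def encode(self, root):
        self.reverse_pass(root, self.recursive_pass(root))


def make_tree(leaf_symbol):
    # Partial tree: e -> (f -> leaf_symbol), f   where the second f is unexpanded.
    expanded = Node("f", rule="conststr", children=[Node(leaf_symbol)])
    unexpanded = Node("f")
    return Node("e", rule="concat2", children=[expanded, unexpanded]), unexpanded


model = R3NNSketch(symbols=["e", "f", "s", "v"],
                   rules={"concat2": 2, "conststr": 1})
root_a, open_a = make_tree("s")
root_b, open_b = make_tree("v")
model.encode(root_a)
model.encode(root_b)
# Same local (bottom-up) vector for the unexpanded leaf in both trees,
# but its global (top-down) vector depends on the rest of the tree.
print(open_a.up == open_b.up, open_a.down != open_b.down)
```

In this sketch the unexpanded leaf has the same bottom-up vector in both trees, while its top-down vector changes when a different leaf elsewhere in the tree changes, which is the sense in which each node comes to carry information about the whole tree.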
