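Write a word-reading routine that recognizes every independently meaningful word (token) in an input source program and classifies it into one of five categories: reserved words, identifiers, constants, operators, and separators. Output each token's internal code and its literal value, in order. (On an error, display "Error", then skip the erroneous part and continue.) Assume the source program is written in C. Given this input: main() { int a,b; a = 10; b = a + 20; } the required output is as shown in the figure on the right: (2,"main") (5,"(") (5,")") (5,"{") (1,"int") (2,"a") (5,",") (2,"b") (5,";") (2,"a") (4,"=") (3,"10") (5,";") (2,"b") (4,"=") (2,"a") (4,"+") (3,"20") (5,";") (5,"}")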
Time: 2023-12-03 17:01:58
Compiler Principles 1
Below is an implementation of the word-reading routine. It prints each token's internal code and literal value, as the problem requires.
```python
import re

# Internal codes for the five token classes
RESERVED_WORD = 1
IDENTIFIER = 2
CONSTANT = 3
OPERATOR = 4
SEPARATOR = 5

# Reserved-word table (C keywords)
RESERVED_WORDS = {
    'auto', 'break', 'case', 'char', 'const', 'continue', 'default', 'do', 'double', 'else', 'enum', 'extern',
    'float', 'for', 'goto', 'if', 'int', 'long', 'register', 'return', 'short', 'signed', 'sizeof', 'static',
    'struct', 'switch', 'typedef', 'union', 'unsigned', 'void', 'volatile', 'while'
}

# Operator table (includes '=' for assignment)
OPERATORS = {
    '+', '-', '*', '/', '%', '++', '--', '==', '!=', '>', '<', '>=', '<=', '&&', '||', '!', '='
}

# Separator table
SEPARATORS = {
    '(', ')', '{', '}', '[', ']', ',', ';'
}

# Try longer symbols first so that '++' or '>=' is not split into two tokens
SYMBOLS = sorted(OPERATORS | SEPARATORS, key=len, reverse=True)
TOKEN_RE = re.compile(
    r'[A-Za-z_][A-Za-z0-9_]*|[0-9]+|' + '|'.join(re.escape(sym) for sym in SYMBOLS)
)


def read_word(s):
    """Read one token from the start of s and return (code, word).

    code is None when s does not start with a valid token; word is then
    the single offending character.
    """
    match = TOKEN_RE.match(s)
    if not match:
        return None, s[0]
    word = match.group(0)
    # Classify the token and return its internal code and literal value
    if word in RESERVED_WORDS:
        return RESERVED_WORD, word
    elif word[0].isalpha() or word[0] == '_':
        return IDENTIFIER, word
    elif word[0].isdigit():
        return CONSTANT, word
    elif word in OPERATORS:
        return OPERATOR, word
    else:
        return SEPARATOR, word


# Test the word-reading routine
s = 'main() { int a,b; a = 10; b = a + 20; }'.lstrip()
while s:
    code, word = read_word(s)
    if code is None:
        # Report the error, then skip the offending character and continue
        print('Error')
    else:
        print('(%d, "%s")' % (code, word))
    s = s[len(word):].lstrip()
```
The output is:
```
(2, "main")
(5, "(")
(5, ")")
(5, "{")
(1, "int")
(2, "a")
(5, ",")
(2, "b")
(5, ";")
(2, "a")
(4, "=")
(3, "10")
(5, ";")
(2, "b")
(4, "=")
(2, "a")
(4, "+")
(3, "20")
(5, ";")
(5, "}")
```
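The problem also asks the scanner to print "Error" and keep going when it meets something it cannot recognize, and C has two-character operators such as `>=` and `++` that must not be split. The sketch below is one possible way to handle both and is not part of the original answer. The names `tokenize`, `TOKEN_SPEC`, and `KEYWORDS` are hypothetical, the keyword set is deliberately partial, and treating a single unrecognized character as "the erroneous part" is an assumption.

```python
import re

# Hypothetical, deliberately partial keyword set for this sketch
KEYWORDS = {'int', 'if', 'else', 'while', 'return'}

# (class code, pattern) pairs; codes follow the problem statement:
# 1 reserved word, 2 identifier, 3 constant, 4 operator, 5 separator
TOKEN_SPEC = [
    (3, re.compile(r'[0-9]+')),
    (2, re.compile(r'[A-Za-z_][A-Za-z0-9_]*')),
    # Two-character operators are listed before one-character ones
    (4, re.compile(r'\+\+|--|==|!=|>=|<=|&&|\|\||[-+*/%=<>!]')),
    (5, re.compile(r'[(){}\[\],;]')),
]


def tokenize(source):
    """Return a list of (code, text) pairs; code is None for a skipped error."""
    tokens = []
    pos = 0
    while pos < len(source):
        if source[pos].isspace():
            pos += 1
            continue
        for code, pattern in TOKEN_SPEC:
            m = pattern.match(source, pos)
            if m:
                text = m.group(0)
                # Identifiers found in the keyword set become reserved words
                if code == 2 and text in KEYWORDS:
                    code = 1
                tokens.append((code, text))
                pos = m.end()
                break
        else:
            # Assumption: the "erroneous part" is one unrecognized character
            tokens.append((None, source[pos]))
            pos += 1
    return tokens


for code, text in tokenize('a @ b >= 10;'):
    print('Error' if code is None else '(%d, "%s")' % (code, text))
```

For the input `a @ b >= 10;` this prints `(2, "a")`, `Error`, `(2, "b")`, `(4, ">=")`, `(3, "10")`, `(5, ";")`, one per line. Matching at an offset with `pattern.match(source, pos)` avoids re-slicing the string on every token.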