Write your code between Begin and End. Do not change the program structure, and do not delete or modify the existing code. The candidate directory contains a Chinese e-text of the novel 《红楼梦》 (Dream of the Red Chamber) as a txt file. Using the jieba library, count how many times each character appears in the novel, excluding the 18 words "什么", "一个", "我们", "你们", "起来", "这里", "出来", "那里", "怎么", "不是", "只见", "没有", "知道", "两个", "一面", "如今", "自己", "说道". Output the ten most frequent entries, dropping single-character tokens from the segmentation. Save the result to the file 结果.txt.

    import jieba
    excludes = {"什么","一个","我们","你们","起来","这里","出来","那里","怎么","不是","只见","没有","知道","两个","一面","如今","自己","说道"}
    fi = open("红楼梦.txt", "r", encoding='gb18030')
    fo = open("结果.txt","w")
    txt = fi.read()
    ######begin#####

    ######end#####
    items = list(counts.items())
    items.sort(key=lambda x:x[1], reverse=True)
    fo.write("人物 出场次数\n")
    for i in range(10):
        word, count = items[i]
        fo.write("{0:<10}{1:>5}\n".format(word, count))
    fi.close()
    fo.close()
Posted: 2023-06-29 11:18:28
Here is a simple Python implementation:
```python
import jieba
excludes = {"什么","一个","我们","你们","起来","这里","出来","那里","怎么","不是","只见","没有","知道","两个","一面","如今","自己","说道"}
fi = open("红楼梦.txt", "r", encoding='gb18030')
fo = open("结果.txt","w")
txt = fi.read()
######begin#####
# Segment the full text into a list of words.
words = jieba.lcut(txt)
counts = {}
for word in words:
    # Skip single-character tokens and the excluded words.
    if len(word) == 1 or word in excludes:
        continue
    counts[word] = counts.get(word, 0) + 1
######end#####
# Sort by count, descending, and write the top ten.
items = list(counts.items())
items.sort(key=lambda x:x[1], reverse=True)
fo.write("人物 出场次数\n")
for i in range(10):
    word, count = items[i]
    fo.write("{0:<10}{1:>5}\n".format(word, count))
fi.close()
fo.close()
```
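This program segments the text of 《红楼梦》 with jieba's `lcut`, counts every token of two or more characters except the 18 excluded words listed in the task, and writes the ten most frequent tokens with their counts to 结果.txt. Note that it counts all remaining words, not only character names, so the top ten can still include a frequent non-name word that is not on the exclusion list.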
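The counting step can also be written with `collections.Counter` from the standard library. The following is a minimal sketch of that alternative, not the form the exercise template expects: the helper name `top_words` and the sample token list are hypothetical, and the hard-coded list stands in for the output of `jieba.lcut(txt)` so the snippet runs without the novel's text.

```python
from collections import Counter


def top_words(tokens, excludes, n=10):
    """Return the n most frequent tokens as (word, count) pairs,
    skipping one-character tokens and anything in excludes."""
    counts = Counter(
        tok for tok in tokens
        if len(tok) > 1 and tok not in excludes
    )
    return counts.most_common(n)


# Hypothetical, already-segmented tokens standing in for jieba.lcut(txt).
sample_tokens = ["宝玉", "说道", "黛玉", "宝玉", "的", "什么", "贾母", "宝玉", "黛玉"]
sample_excludes = {"什么", "说道"}

top = top_words(sample_tokens, sample_excludes, n=2)
for word, count in top:
    # Same column layout as the exercise template.
    print("{0:<10}{1:>5}".format(word, count))
```

Unlike the `for i in range(10)` loop in the template, `most_common(n)` simply returns fewer pairs when the text yields fewer than `n` distinct words, so a short input does not raise an `IndexError`.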