`word_counts = {} for words in positivewords: if words not in word_counts: word_counts[words] = 0 else: word_counts[words] += 1` What does this code do?
Posted: 2024-02-12 10:03:14 Views: 71
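It tallies the words in the list or set `positivewords` and stores the result in the dictionary `word_counts`. The first time a word appears, it is added to `word_counts` with an initial counter of 0; each later occurrence increments `word_counts[words]` by 1. Because the first occurrence is recorded as 0 rather than 1, each stored value is one less than the number of times the word actually appears. To store true occurrence counts, the initial value should be 1.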
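As a minimal sketch of the same counting pattern with the first occurrence recorded as 1, so that the stored values equal the actual occurrence counts. The function name `count_words` and the sample list are assumptions made for illustration; they are not part of the question.

```python
from collections import Counter


def count_words(words):
    """Return a dict mapping each word to the number of times it occurs."""
    counts = {}
    for word in words:
        # get() returns 0 for an unseen word, so the first occurrence becomes 1.
        counts[word] = counts.get(word, 0) + 1
    return counts


# Hypothetical sample data standing in for `positivewords`.
positivewords = ["good", "great", "good", "nice", "good"]
word_counts = count_words(positivewords)
print(word_counts)  # {'good': 3, 'great': 1, 'nice': 1}

# collections.Counter from the standard library produces the same tallies.
assert word_counts == dict(Counter(positivewords))
```

In practice `Counter(positivewords)` alone is usually enough. The explicit loop is shown only because it mirrors the structure of the code in the question.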
Related questions
import jieba with open('123.txt', 'r', encoding='utf-8') as f: text = f.read() words = jieba.lcut(text) word_counts = {} for word in words: if len(word) < 2: continue if word in word_counts: word_counts[word] += 1 else: word_counts[word] = 1 sorted_words = sorted(word_counts.items(), key=lambda x: x[1], reverse=True) nouns = [] for word, count in sorted_words: pos = jieba.lcut(word)[0].split('/')[1] if pos == 'n': nouns.append(word) if len(nouns) == 10: break print(nouns)
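This code is intended to find the 10 most frequent nouns in a text file. Name the text file "123.txt" and place it in the same directory as the script before running it. Note that, as written, the part-of-speech check does not work: `jieba.lcut` returns plain strings without part-of-speech tags, so `jieba.lcut(word)[0].split('/')[1]` raises an `IndexError`. Part-of-speech tags come from the `jieba.posseg` module instead. The code depends on the jieba word segmentation library, so make sure it is installed. If it is not, it can be installed with the following command: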
```
pip install jieba
```
Also, before running the code, replace the contents of the text file with the actual text you want to analyze.
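One way to do the noun filtering is sketched below, assuming part-of-speech tags are taken from `jieba.posseg`, whose `lcut` returns pairs with `.word` and `.flag` attributes. The helper names `top_nouns` and `tag_file` are hypothetical. The counting step is kept separate from the tagging step so it can be tried without jieba installed.

```python
from collections import Counter


def top_nouns(tagged_words, limit=10):
    """Return the `limit` most frequent nouns from (word, pos_flag) pairs."""
    counts = Counter(
        word
        for word, flag in tagged_words
        # Same filters as the question: skip single characters, keep flag "n".
        if len(word) >= 2 and flag == "n"
    )
    return [word for word, _ in counts.most_common(limit)]


def tag_file(path):
    """Segment a UTF-8 text file with jieba and return (word, flag) pairs."""
    # Imported here so top_nouns() remains usable without jieba installed.
    import jieba.posseg as pseg

    with open(path, "r", encoding="utf-8") as f:
        text = f.read()
    return [(pair.word, pair.flag) for pair in pseg.lcut(text)]


# Hand-written sample pairs (an assumption, not real jieba output).
sample = [("苹果", "n"), ("喜欢", "v"), ("苹果", "n"), ("香蕉", "n"), ("水", "n")]
print(top_nouns(sample))  # ['苹果', '香蕉']
```

With jieba installed, `top_nouns(tag_file("123.txt"))` would run the whole pipeline on the file from the question. Matching `flag == "n"` follows the original check; jieba also uses more specific noun tags such as `nr` and `ns`, which this sketch deliberately leaves out.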
import jieba import wordcloud def takeSecond(elem): return elem[1] def createWordCloud(text): w=wordcloud.WordCloud (font_path="msyh.ttf",width=1000,height=500,background_color="white").generate(text) w.to_file("西游记词云图.jpg") def main(): path = "西游记.txt" file = open(path,"r",encoding="utf-8") text=file.read() file.close() words = jieba.lcut(text) counts = {} for word in words: if len(word) == 1: continue elif word == "大圣" or word=="老孙" or word=="行者" or word=="孙大圣" or word=="孙行者" or word=="猴王" or word=="悟空" or word=="齐天大圣" or word=="猴子": rword = "孙悟空" elif word == "师父" or word == "三藏" or word=="圣僧": rword = "唐僧" elif word == "呆子" or word=="八戒" or word=="老猪": rword = "猪八戒" elif word=="沙和尚": rword="沙僧" elif word == "妖精" or word=="妖魔" or word=="妖道": rword = "妖怪" elif word=="佛祖": rword="如来" elif word=="三太子": rword="白马" else: rword = word counts[rword] = counts.get(rword,0) + 1 file = open("excludes.txt","r",encoding="utf-8") excludes =file.read().split(",") file.close for delWord in excludes: try: del counts[delWord] except: continue items = list(counts.items()) items.sort(key = takeSecond,reverse=True) for i in range(20): item=items[i] keyWord =item[0] count=item[1] print("{0:<10}{1:>5}".format(keyWord,count)) createWordCloud(str(items[0:20])) main()
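The code contains the following errors:
1. Line 1: the statement should stay on a single line, with no line break.
2. Line 2: a space needs to be added.
3. Line 4: a space needs to be added.
4. Line 18: a space needs to be added.
5. Line 27: `file.close` needs parentheses, i.e. `file.close()`; without them the method is only referenced and never called.
6. Line 37: `file.close` likewise needs parentheses.
7. Line 51: `range(20)` raises an `IndexError` when fewer than 20 words remain; it should be changed to `range(min(20, len(items)))` so the loop never runs past the end of `items`.
8. Line 54: parentheses need to be added.
Here is the corrected code:
```
import jieba
import wordcloud


def takeSecond(elem):
    # Sort key: the count in a (word, count) pair.
    return elem[1]


def createWordCloud(text):
    # font_path must point to a font that contains Chinese glyphs.
    w = wordcloud.WordCloud(font_path="msyh.ttf", width=1000, height=500,
                            background_color="white").generate(text)
    w.to_file("西游记词云图.jpg")


def main():
    path = "西游记.txt"
    file = open(path, "r", encoding="utf-8")
    text = file.read()
    file.close()
    words = jieba.lcut(text)
    counts = {}
    for word in words:
        # Skip single characters, then merge aliases into one canonical name.
        if len(word) == 1:
            continue
        elif word == "大圣" or word == "老孙" or word == "行者" or word == "孙大圣" or word == "孙行者" or word == "猴王" or word == "悟空" or word == "齐天大圣" or word == "猴子":
            rword = "孙悟空"
        elif word == "师父" or word == "三藏" or word == "圣僧":
            rword = "唐僧"
        elif word == "呆子" or word == "八戒" or word == "老猪":
            rword = "猪八戒"
        elif word == "沙和尚":
            rword = "沙僧"
        elif word == "妖精" or word == "妖魔" or word == "妖道":
            rword = "妖怪"
        elif word == "佛祖":
            rword = "如来"
        elif word == "三太子":
            rword = "白马"
        else:
            rword = word
        counts[rword] = counts.get(rword, 0) + 1
    # excludes.txt holds a comma-separated list of words to drop.
    file = open("excludes.txt", "r", encoding="utf-8")
    excludes = file.read().split(",")
    file.close()
    for delWord in excludes:
        try:
            del counts[delWord]
        except KeyError:
            # The excluded word never appeared in the text.
            continue
    items = list(counts.items())
    items.sort(key=takeSecond, reverse=True)
    # Print at most the top 20 entries without running past the end of items.
    for i in range(min(20, len(items))):
        item = items[i]
        keyWord = item[0]
        count = item[1]
        print("{0:<10}{1:>5}".format(keyWord, count))
    createWordCloud(str(items[0:20]))


main()
```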
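The long `elif` chain in the code above can also be expressed as a lookup table. The sketch below is one possible variant, not the original author's approach: the names `ALIASES` and `count_names` are hypothetical, the table holds only a few of the aliases from the chain, and `collections.Counter` replaces the hand-written dictionary and sort.

```python
from collections import Counter

# Hypothetical alias table; a subset of the aliases from the elif chain.
ALIASES = {
    "大圣": "孙悟空", "老孙": "孙悟空", "悟空": "孙悟空",
    "师父": "唐僧", "三藏": "唐僧",
    "呆子": "猪八戒", "八戒": "猪八戒",
}


def count_names(words, excludes=(), top=20):
    """Count words after alias merging and return the `top` most common."""
    counts = Counter(
        ALIASES.get(word, word)  # fall back to the word itself
        for word in words
        if len(word) > 1         # skip single characters
    )
    for word in excludes:
        counts.pop(word, None)   # no error if the word is absent
    return counts.most_common(top)


# Hand-written sample tokens standing in for the output of jieba.lcut.
sample_words = ["悟空", "老孙", "师父", "的", "八戒", "一个", "悟空"]
result = count_names(sample_words, excludes=["一个"])
print(result)
```

`most_common(top)` returns at most `top` pairs, so a text with fewer than 20 distinct words does not need a separate bounds check.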