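    from collections import Counter

    import jieba
    import matplotlib.pyplot as plt
    from wordcloud import WordCloud


    def countWords(self, df):
        # Load the stopword list, one word per line
        with open('stopwords.txt', 'r', encoding='utf-8') as stopwords_file:
            stopwords = set(stopwords_file.read().splitlines())
        # Strip bracketed spans like [xxx], then segment each comment with jieba
        # (regex=True is required: recent pandas treats the pattern literally by default)
        tokens = df['评论内容'].str.replace(r'\[.*?\]', '', regex=True).apply(jieba.lcut)
        lst = [x for y in tokens.tolist() for x in y if len(x) >= 2 and x not in stopwords]
        # Count word frequencies and print the 30 most common words
        counts = Counter(lst)
        for i, word in enumerate(counts.most_common(30)):
            print('Rank: {}, word: {}, count: {}'.format(i + 1, word[0], word[1]))
        # Draw the word cloud
        wc = WordCloud(width=1000, height=700, font_path="simhei.ttf", max_words=30, background_color="white")
        wc.generate_from_frequencies(counts)
        plt.axis('off')
        plt.imshow(wc)
        plt.savefig('词云图.png', dpi=300)
        plt.show()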

Time: 2024-03-30 12:39:58 Views: 73

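This code defines a `countWords` method that segments the comment text, counts word frequencies, and draws a word cloud. It takes one parameter, `df`, the DataFrame to process; the comments are read from its `评论内容` ("comment content") column. The method first loads the stopword list, removes bracketed spans matching `\[.*?\]` from each comment, segments the text with `jieba.lcut`, and keeps only tokens that are at least two characters long and not in the stopword list. It then counts the remaining tokens with `Counter` and prints the 30 most frequent ones via `most_common`. Finally, it builds the word cloud with `WordCloud.generate_from_frequencies`, renders it with `imshow`, saves it as `词云图.png` with `savefig`, and displays it with `plt.show`.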
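The filtering and counting steps described above can be tried without pandas or jieba. The following is a minimal sketch using only the standard library; the helper names `strip_bracket_tags` and `count_tokens`, and the pre-segmented sample comments, are hypothetical stand-ins for the real DataFrame column and for `jieba.lcut` output.

```python
import re
from collections import Counter

# Same pattern as in the method: a non-greedy match of anything in square brackets.
BRACKET_TAG = re.compile(r'\[.*?\]')


def strip_bracket_tags(text):
    """Remove every [...] span from a comment (hypothetical helper)."""
    return BRACKET_TAG.sub('', text)


def count_tokens(token_lists, stopwords, min_len=2):
    """Flatten token lists, drop short tokens and stopwords, count the rest."""
    kept = [
        tok
        for tokens in token_lists
        for tok in tokens
        if len(tok) >= min_len and tok not in stopwords
    ]
    return Counter(kept)


# Hypothetical, already-segmented comments standing in for jieba.lcut output.
sample_tokens = [
    ['这个', '视频', '真的', '好看'],
    ['视频', '好看', '的', '推荐'],
]
sample_stopwords = {'这个', '真的'}

counts = count_tokens(sample_tokens, sample_stopwords)
for rank, (word, freq) in enumerate(counts.most_common(2), start=1):
    print(f'Rank: {rank}, word: {word}, count: {freq}')
```

Here the single-character token `的` is dropped by the length filter and `这个` and `真的` by the stopword set, which is the same filtering the list comprehension in `countWords` performs.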

    Exception in thread "main" java.io.FileNotFoundException: java (The system cannot find the file specified.)
        at java.base/java.io.FileInputStream.open0(Native Method)
        at java.base/java.io.FileInputStream.open(FileInputStream.java:216)
        at java.base/java.io.FileInputStream.<init>(FileInputStream.java:157)
        at java.base/java.io.FileInputStream.<init>(FileInputStream.java:111)
        at java.base/java.io.FileReader.<init>(FileReader.java:60)
        at CountWords.CountWords.main(CountWords.java:8)

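This error means the program could not find the file it was asked to open. The text after `FileNotFoundException:` is the path that was tried, so here the program attempted to open a file literally named `java`; the `FileReader` is created at line 8 of `CountWords.java`, which is where to look first. Either the file name is wrong or the file does not exist. Check the following:

1. Make sure the file name is correct. A relative name is resolved against the program's working directory, so either pass the full path or make sure the working directory is the one you expect.
2. Make sure the file actually exists. Look for it in the file system or check with a file browser.
3. Make sure the program is allowed to access the file. On some operating systems a program needs special permissions for certain files or directories. You can try running the program with administrator rights, or move the file to a directory the program can read.

If none of this solves the problem, please provide more information and I will do my best to help.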
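The three checks can also be scripted. The sketch below is written in Python rather than Java, purely as an illustration of what to inspect; `describe_path` is a hypothetical helper, and in the Java program the corresponding calls would be `File.getAbsolutePath()`, `File.exists()` and `File.canRead()` on the path passed to `FileReader`.

```python
import os


def describe_path(path):
    """Collect the facts that explain why opening `path` may fail (hypothetical helper)."""
    absolute = os.path.abspath(path)  # relative paths resolve against the working directory
    return {
        'cwd': os.getcwd(),
        'absolute': absolute,
        'exists': os.path.exists(absolute),
        'is_file': os.path.isfile(absolute),
        'readable': os.access(absolute, os.R_OK),
    }


# 'java' is the path named in the exception message.
for key, value in describe_path('java').items():
    print(f'{key}: {value}')
```

Printing the absolute path usually makes the cause obvious: it shows which directory the relative name was resolved against and whether anything exists there.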

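7. Write bash scripts that provide the required functionality. Generate the test data:

    mkdir tstdir; for i in {1..3}; do echo file$i > ./tstdir/file$i.txt; echo "line 2 hello " >> ./tstdir/file$i.txt; done

Write scripts that do the following:

1) Write a script `countfile` that counts the files under ./tstdir:

    sh countfile

2) Write a script `countwords` that counts the words on each line of ./tstdir/file1.txt:

    sh countwords

3) Write a script `changecontent` that changes lowercase hello to uppercase HELLO in the contents of every file under ./tstdir:

    cat ./tstdir/file1.txt
    sh changecontent
    cat ./tstdir/file1.txt

4) Write a script `changename` that changes the extension of every file under ./tstdir to uppercase:

    ls ./tstdir -l
    sh changename
    ls ./tstdir -l

5) Write a script `changemode` that sets the permissions of every file in ./tstdir to read-only for the owner and no access for anyone else:

    ls ./tstdir -l
    sh changemode
    ls ./tstdir -l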

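Here are the bash scripts:

1. countfile.sh:

```
#!/bin/bash
cd tstdir || exit 1
# Count regular files only, so subdirectories are not included
count=$(find . -maxdepth 1 -type f | wc -l)
echo "Number of files in tstdir is: $count"
```

2. countwords.sh:

```
#!/bin/bash
cd tstdir || exit 1
file=./file1.txt
if [ ! -f "$file" ]
then
    echo "$file not found"
else
    # IFS= and -r keep each line exactly as written
    while IFS= read -r line
    do
        count=$(printf '%s\n' "$line" | wc -w)
        echo "Number of words in '$line' is: $count"
    done < "$file"
fi
```

3. changecontent.sh:

```
#!/bin/bash
cd tstdir || exit 1
for file in *
do
    if [ -f "$file" ]
    then
        # -i edits the file in place (GNU sed syntax)
        sed -i 's/hello/HELLO/g' "$file"
    fi
done
echo "Content of all files in tstdir updated"
```

4. changename.sh:

```
#!/bin/bash
cd tstdir || exit 1
for file in *.*
do
    if [ -f "$file" ]
    then
        # Uppercase whatever extension the file has instead of hard-coding TXT
        ext=$(printf '%s' "${file##*.}" | tr '[:lower:]' '[:upper:]')
        new="${file%.*}.$ext"
        [ "$file" != "$new" ] && mv "$file" "$new"
    fi
done
echo "Extensions of all files in tstdir changed to uppercase"
```

5. changemode.sh:

```
#!/bin/bash
cd tstdir || exit 1
for file in *
do
    if [ -f "$file" ]
    then
        # 400: read for the owner, nothing for group and others
        chmod 400 "$file"
    fi
done
echo "Permissions of all files in tstdir changed to read-only for owner"
```

Note: all of these scripts must be run from the parent directory of tstdir. The exercise invokes them as `sh countfile`, `sh countwords` and so on, so either save each script without the `.sh` suffix or run it as `sh countfile.sh`.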
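To sanity-check what `countfile` and `countwords` print, the expected values can be computed independently. The sketch below rebuilds the exercise's test data in a temporary directory and derives those values in Python; `make_test_data` and `expected_results` are hypothetical helper names, and the file contents assume the test data was generated with plain ASCII double quotes around `line 2 hello `.

```python
import tempfile
from pathlib import Path


def make_test_data(root):
    """Recreate the exercise's test data under root/tstdir (hypothetical helper)."""
    tstdir = Path(root) / 'tstdir'
    tstdir.mkdir()
    for i in range(1, 4):
        # Mirrors: echo file$i > file$i.txt; echo "line 2 hello " >> file$i.txt
        (tstdir / f'file{i}.txt').write_text(f'file{i}\nline 2 hello \n')
    return tstdir


def expected_results(tstdir):
    """Compute what countfile and countwords should report (hypothetical helper)."""
    file_count = sum(1 for p in tstdir.iterdir() if p.is_file())
    lines = (tstdir / 'file1.txt').read_text().splitlines()
    # str.split() with no argument splits on runs of whitespace, like wc -w
    words_per_line = [len(line.split()) for line in lines]
    return file_count, words_per_line


with tempfile.TemporaryDirectory() as root:
    file_count, words_per_line = expected_results(make_test_data(root))
    print(file_count, words_per_line)  # prints: 3 [1, 3]
```

So `countfile` should report 3 files, and `countwords` should report 1 word for the first line of file1.txt and 3 words for the second.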
