Why did I get an empty file? import csv import spacy_pkuseg as pkuseg import re # Create the segmenter seg = pkuseg.pkuseg(model_name="mixed") # Read the CSV file with open('/Users/rachel_lrq/Desktop/浙江分词.csv', 'r', encoding='utf-8') as file: csv_reader = csv.reader(file) data = [] for row in csv_reader: data.extend(row) # Segment the text seg_list = seg.cut(' '.join(data)) # Set up the stop words content = open('/Users/rachel_lrq/Desktop/实习/哈工大停用词表.txt',encoding="gbk") stop_words = [] for c in content: c = re.sub('\n|\r','',c) stop_words.append(c) with open('/Users/rachel_lrq/Desktop/data.csv', 'w', newline='', encoding='utf-8') as file: csv_writer = csv.writer
Time: 2024-04-16 19:26:51
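Your code already imports the csv module, so a missing import is not the cause. The snippet as posted ends at `csv_writer = csv.writer`: if the real script also stops there, `open(..., 'w')` creates the output file but no row is ever written, so the file stays empty. You need to pass the file to `csv.writer(file)` and call `writerow` for every token you want to keep. The revised code is shown below: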
```python
import csv
import re

import spacy_pkuseg as pkuseg

# Create the segmenter
seg = pkuseg.pkuseg(model_name="mixed")

# Read every cell of the input CSV into one flat list
with open('/Users/rachel_lrq/Desktop/浙江分词.csv', 'r', encoding='utf-8') as file:
    csv_reader = csv.reader(file)
    data = []
    for row in csv_reader:
        data.extend(row)

# Segment the combined text
seg_list = seg.cut(' '.join(data))

# Load the stop words, one per line, stripping line endings
stop_words = set()
with open('/Users/rachel_lrq/Desktop/实习/哈工大停用词表.txt', encoding="gbk") as content:
    for c in content:
        stop_words.add(re.sub('\n|\r', '', c))

# Write each token that is not a stop word on its own row
with open('/Users/rachel_lrq/Desktop/data.csv', 'w', newline='', encoding='utf-8') as file:
    csv_writer = csv.writer(file)
    for word in seg_list:
        if word not in stop_words:
            csv_writer.writerow([word])
```
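This code reads the input CSV with the csv module, segments the text with spacy_pkuseg, loads the stop word list, and writes every token that is not a stop word to a new CSV file, one token per row. Make sure to replace the file paths in the code with your own.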
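To check the writing step on its own, without the segmenter or the real files, a minimal sketch along these lines may help. It assumes the tokens are already available as a list; `write_filtered_tokens`, `sample_tokens`, and `sample_stop_words` are hypothetical names used only for illustration, and the sample data stands in for the output of `seg.cut(...)`. The blank-token check is an extra assumption, not part of the code above.

```python
import csv
import os
import tempfile


def write_filtered_tokens(tokens, stop_words, path):
    """Write each non-blank token that is not a stop word on its own CSV row.

    Returns the number of rows written, so an empty result is easy to spot.
    """
    written = 0
    with open(path, 'w', newline='', encoding='utf-8') as file:
        csv_writer = csv.writer(file)
        for word in tokens:
            if word.strip() and word not in stop_words:
                csv_writer.writerow([word])
                written += 1
    return written


# Hypothetical pre-segmented tokens standing in for the output of seg.cut(...)
sample_tokens = ['浙江', '的', '经济', ' ', '发展']
sample_stop_words = {'的'}

out_dir = tempfile.mkdtemp()

# Case 1: the writer is created but writerow is never called, so the file is empty
empty_path = os.path.join(out_dir, 'empty.csv')
with open(empty_path, 'w', newline='', encoding='utf-8') as file:
    csv.writer(file)
empty_size = os.path.getsize(empty_path)

# Case 2: writerow is called for every token that survives the filter
filled_path = os.path.join(out_dir, 'filled.csv')
rows_written = write_filtered_tokens(sample_tokens, sample_stop_words, filled_path)
with open(filled_path, newline='', encoding='utf-8') as file:
    rows = list(csv.reader(file))

print(empty_size, rows_written, rows)
```

If the row count comes back as 0 on real data, the likely suspects are an empty input file or a stop word list that matches every token, rather than the writing step itself.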