Splitting the title and content and saving them to a CSV file: parts[1] raises "list index out of range"
Date: 2023-07-04 17:28:12
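Sorry, my code had an error. The title and content should be split on the half-width colon `:`, not the full-width Chinese colon `:`. The revised code is below: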
```python
import csv
import re

import jieba

# Punctuation to keep: commas, full stops, and question marks
# (both half-width and full-width forms are accepted here)
PUNCT = ',,。??'
SEPTENARY = re.compile('^[\u4e00-\u9fa5]{7}[' + PUNCT + ']')
UNWANTED = re.compile('[^\u4e00-\u9fa5' + PUNCT + ']')

# Read the file, split each line into title and content,
# and keep only the seven-character poems
titles = []
contents = []
with open('poetry.txt', 'r', encoding='utf-8') as f:
    for line in f:
        # Split on the first colon only. Lines without a colon are skipped,
        # so parts[1] can never raise "list index out of range".
        parts = line.strip().split(':', 1)
        if len(parts) < 2:
            continue
        title, content = parts[0].strip(), parts[1].strip()
        # Filter on the content, not the whole line, because the line starts with the title
        if not SEPTENARY.match(content):
            continue
        # Clean after splitting: cleaning first would strip the colon itself
        titles.append(UNWANTED.sub('', title))
        contents.append(UNWANTED.sub('', content))

# Save the titles and contents to a CSV file
with open('poetry.csv', 'w', encoding='gbk', newline='') as f:
    writer = csv.writer(f)
    writer.writerows(zip(titles, contents))

# Tokenize each title with jieba and collect the results in a list
title_words_list = [list(jieba.cut(title)) for title in titles]

# Prefix every character of the content with the special marker <STA>
content_words_list = [['<STA>' + char for char in content] for content in contents]

# Save the tokenized titles and the marked contents to a CSV file
with open('poetry_words.csv', 'w', encoding='gbk', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(['title_words', 'content_words'])
    for title_words, content_words in zip(title_words_list, content_words_list):
        writer.writerow([','.join(title_words), ','.join(content_words)])
```
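The "list index out of range" error appears whenever `split` returns a single-element list, that is, whenever a line contains no separator at the point where it is split. As a minimal sketch of a defensive split, the helper below returns `None` for such lines. The name `split_title_content` and the sample lines are assumptions made for illustration, and accepting both colon forms is a design choice that may not fit your data:

```python
import re


def split_title_content(line):
    """Split a line into (title, content) at the first colon.

    Both the half-width ':' and the full-width ':' are treated as
    separators. Returns None when the line contains neither, so the
    caller never has to index into a one-element list.
    """
    parts = re.split('[::]', line.strip(), maxsplit=1)
    if len(parts) < 2:
        return None
    return parts[0], parts[1]


# Hypothetical sample lines: two with a separator, one without
samples = [
    '登高:风急天高猿啸哀,渚清沙白鸟飞回。',
    '登高:风急天高猿啸哀,渚清沙白鸟飞回。',
    '没有分隔符的一行',
]
for sample in samples:
    result = split_title_content(sample)
    if result is None:
        print('skipped:', sample)
    else:
        print('title =', result[0], '| content =', result[1])
```

Printing the skipped lines makes it easy to see whether the input file uses a different separator than expected.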
Thank you for the feedback. I hope the code runs correctly now.