# Sample data reviews = [ {"reviewerID": "AO1Z63O9CTT95", "asin": "B000H29TXU", "reviewerName": "James Westervelt", "helpful": [0, 0], "reviewText": "he's OK. His humor consists mainly of varying between a Vietnamese accent and a smooth articulate presentation of ironic situations.", "overall": 2.0, "summary": "same routine he did on last comic standing", "unixReviewTime": 1394064000, "reviewTime": "03 6, 2014"}, # add more review data... ] How do I change this code to read the data from Excel instead?
Time: 2023-08-09 13:10:54
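Reading Excel data
If your review data is stored in an Excel file, you can use the pandas library to read the file and extract the review text. Here is a modified version of the example code: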
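```python
import re
from collections import Counter

import pandas as pd


def count_word_frequency(reviews):
    """Count how often each word appears across all review texts."""
    word_freq = Counter()
    for review in reviews:
        review_text = review["reviewText"]
        # Empty Excel cells are read as NaN (a float), so skip non-string values
        if not isinstance(review_text, str):
            continue
        # Clean the text: keep only letters and whitespace, then lowercase
        review_text = re.sub(r"[^a-zA-Z\s]", "", review_text)
        review_text = review_text.lower()
        # Split the text into words
        words = review_text.split()
        # Update the word counts
        word_freq.update(words)
    return word_freq


# Read the Excel file
df = pd.read_excel("your_file_path.xlsx")
# Convert the rows to a list of dictionaries
reviews = df.to_dict(orient="records")
# Count word frequencies
word_frequency = count_word_frequency(reviews)
# Print the 10 most common words
most_common_words = word_frequency.most_common(10)
for word, freq in most_common_words:
    print(f"{word}: {freq}")
```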
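Be sure to replace "your_file_path.xlsx" with your actual file path. This example assumes that your Excel file has the same columns and field names as the sample data, in particular a `reviewText` column; if yours differ, adjust the code accordingly.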
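Also make sure the required libraries are installed, for example with `pip install pandas openpyxl`. pandas reads `.xlsx` files through openpyxl; xlrd 2.0 and later only supports the legacy `.xls` format.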
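If your spreadsheet uses different column headers, one way to make the adjustment mentioned above is to rename the keys of each record before counting. The sketch below is only an illustration: the helper `rename_record_keys` and the headers `Review` and `Rating` are hypothetical, and the small `raw_records` list stands in for the output of `df.to_dict(orient="records")`. With a DataFrame in hand, `df.rename(columns=column_map)` would achieve the same result before the conversion.

```python
def rename_record_keys(records, column_map):
    """Return new records whose keys are renamed according to column_map.

    Keys that do not appear in column_map are kept unchanged.
    """
    return [
        {column_map.get(key, key): value for key, value in record.items()}
        for record in records
    ]


# Hypothetical spreadsheet headers mapped to the names the counting code expects.
column_map = {"Review": "reviewText", "Rating": "overall"}

# Stand-in for df.to_dict(orient="records"); these rows are made up for illustration.
raw_records = [
    {"Review": "Great sound, great price", "Rating": 5.0},
    {"Review": "Stopped working after a week", "Rating": 1.0},
]

reviews = rename_record_keys(raw_records, column_map)
print(reviews[0]["reviewText"])
```

The resulting `reviews` list can then be passed to `count_word_frequency` unchanged, since each record now exposes a `reviewText` key.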