
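Using Python's jieba library to count English word frequencies in Hamlet and character appearances in Harry Potter. Count the frequency of whichever words you want.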

Time: 2024-10-16 20:05:31 Views: 34

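In Python, the jieba library handles Chinese word segmentation, and `Counter` from the collections module counts word frequencies. jieba is designed mainly for Chinese, so for English text it is enough to convert the text to lowercase, split it into a list of words with split(), and then count. First, import the required libraries:

```python
import jieba
from collections import Counter
```

Next, suppose we have an English text, for example a passage about Harry Potter:

```python
harry_potter_text = """ Harry Potter is a series of fantasy novels written by British author J.K. Rowling... """
```

Convert the text to lowercase and split it into words:

```python
# Lowercase the English text and split it on whitespace
text_list = harry_potter_text.lower().split()
```

Note that split() only splits on whitespace, so punctuation stays attached to words: "potter." and "potter" are counted as two different entries.

Then use Counter to compute the frequency of each word:

```python
word_freq = Counter(text_list)
```

To look up the frequency of a specific English word, such as "potter":

```python
# Counter returns 0 for words that never appear
potter_freq = word_freq['potter']
print(f"The frequency of 'potter' is: {potter_freq}")
```

For Chinese text, jieba segments the text into words, which can then be counted the same way:

```python
# harry_potter_text should hold Chinese text here (e.g. the Chinese edition of the novel)
chinese_words = jieba.lcut(harry_potter_text)  # segment into a list of words
chinese_word_freq = Counter(chinese_words)
for word, count in chinese_word_freq.items():
    print(f"'{word}' appears {count} times.")
```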
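Because a plain split() keeps punctuation attached to words, an English text such as Hamlet is often tokenized with a regular expression instead. The following is a minimal sketch of that idea using only the standard library; the function name `english_word_freq`, the stopword set, and the sample sentence are assumptions made for illustration and are not part of the original answer.

```python
import re
from collections import Counter

def english_word_freq(text, stopwords=frozenset()):
    """Count lowercase English words, ignoring punctuation and optional stopwords."""
    # [a-z']+ keeps contractions such as "don't" together;
    # strip("'") drops quote marks left at the edges of a token.
    words = (w.strip("'") for w in re.findall(r"[a-z']+", text.lower()))
    return Counter(w for w in words if w and w not in stopwords)

# Illustrative sample; in practice, read the full text from a file instead.
sample = "To be, or not to be, that is the question."
freq = english_word_freq(sample, stopwords={"the", "is", "or"})

# Print the most frequent words with their counts
for word, count in freq.most_common(5):
    print(f"{word}: {count}")
```

Passing a larger stopword set (articles, pronouns, auxiliary verbs) keeps the top of the ranking from being dominated by function words.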
