
Using Python to find the most frequent word in a CSV column where each cell holds a variable-length list

Time: 2023-06-18 08:02:08 Views: 81

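You can read the CSV file with pandas and count the words with `collections.Counter`. One caveat: `pd.read_csv` loads every cell as a string, so a cell such as `"['a', 'b']"` has to be parsed back into a list before flattening. Otherwise the loop iterates over the string and counts individual characters instead of words. The steps below assume each cell holds a Python-style list literal.

1. Import pandas, `Counter`, and `ast` (used to parse the cell strings):

```python
import ast
import pandas as pd
from collections import Counter
```

2. Read the CSV file with pandas, drop empty cells, and parse each remaining cell of the target column into a list:

```python
df = pd.read_csv('your_csv_file.csv')
# Each cell is a string like "['a', 'b']"; literal_eval turns it into a real list
col_list = df['your_column_name'].dropna().apply(ast.literal_eval).tolist()
```

3. Merge the per-cell lists into one flat list:

```python
merged_list = [item for sublist in col_list for item in sublist]
```

4. Count with `Counter` and take the most frequent word:

```python
word_counts = Counter(merged_list)
most_common_word = word_counts.most_common(1)[0][0]
```

Here `most_common(1)` returns a one-element list containing the most frequent word together with its count, and `[0][0]` extracts the word itself. If the column contains no words at all, `most_common(1)` returns an empty list and the `[0][0]` lookup raises `IndexError`.

The complete code:

```python
import ast
import pandas as pd
from collections import Counter

df = pd.read_csv('your_csv_file.csv')
# Cells arrive as strings; skip empty ones and parse the rest into lists
col_list = df['your_column_name'].dropna().apply(ast.literal_eval).tolist()
# Flatten the list of lists into a single list of words
merged_list = [item for sublist in col_list for item in sublist]
word_counts = Counter(merged_list)
most_common_word = word_counts.most_common(1)[0][0]
print(most_common_word)
```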
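The same count can also be sketched with only the standard library, which may help when pandas is not installed or when cells are stored as delimiter-separated text rather than list literals. Everything named here is an assumption for illustration: the helpers `parse_cell` and `count_words`, the `tags` column, the comma separator, and the in-memory sample data standing in for a real file.

```python
import ast
import csv
import io
from collections import Counter


def parse_cell(cell, sep=","):
    """Turn one raw CSV cell into a list of words (hypothetical helper)."""
    text = (cell or "").strip()
    if not text:
        return []  # empty or missing cell contributes nothing
    if text.startswith("["):
        try:
            value = ast.literal_eval(text)
            if isinstance(value, (list, tuple)):
                return [str(item) for item in value]
        except (ValueError, SyntaxError):
            pass  # not a valid list literal; fall back to splitting
    # Assumed fallback format: words separated by `sep`
    return [part.strip() for part in text.split(sep) if part.strip()]


def count_words(rows, column, sep=","):
    """Count every word found in `column` across all rows."""
    counts = Counter()
    for row in rows:
        counts.update(parse_cell(row.get(column), sep))
    return counts


# Made-up sample data; replace with open('your_csv_file.csv', newline='')
sample = io.StringIO('''id,tags
1,"['python', 'csv']"
2,"['python']"
3,
4,"pandas, python"
''')

counts = count_words(csv.DictReader(sample), "tags")
print(counts.most_common(1))  # [('python', 3)]
```

Using `counts.most_common(n)` with a larger `n` returns the top `n` words with their counts, and checking that the result is non-empty before indexing avoids an `IndexError` on a column with no words.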
