MapReduce word frequency in Python
Posted: 2023-08-15 09:13:34 Views: 71
Counting word frequencies with the MapReduce pattern is straightforward in Python. Here is a simple example:
```python
from collections import Counter
from multiprocessing import Pool


def mapper(text):
    # Map step: split one line of text on whitespace and count its words.
    words = text.split()
    return Counter(words)


def reducer(counters):
    # Reduce step: merge the per-line counters into a single Counter.
    return sum(counters, Counter())


def map_reduce(data, num_processes):
    # Run the mapper across worker processes, then reduce in the parent process.
    pool = Pool(num_processes)
    mapped_data = pool.map(mapper, data)
    reduced_data = reducer(mapped_data)
    pool.close()
    pool.join()
    return reduced_data


if __name__ == "__main__":
    # Some sample text data
    data = [
        "Hello world",
        "Hello CSDN",
        "CSDN is great",
        "Hello python",
    ]

    # Count word frequencies with MapReduce
    word_count = map_reduce(data, 2)

    # Print the results
    for word, count in word_count.items():
        print(f"{word}: {count}")
```
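Running the code above prints the number of times each word occurs. Note that `text.split()` splits on whitespace only and is case-sensitive, so "Hello" and "hello" would be counted as different words. This is a simple MapReduce implementation that you can modify and extend to suit your needs.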
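As one possible extension, here is a minimal sketch that normalizes each line before counting and merges the partial results with `functools.reduce`. The names `normalizing_mapper`, `merge_counters`, and `word_frequencies` are hypothetical and not part of the original example, and the regular expression is only an assumption about what should count as a word (ASCII letters, digits, and apostrophes).

```python
import re
from collections import Counter
from functools import reduce
from multiprocessing import Pool


def normalizing_mapper(text):
    # Map step (hypothetical variant): lowercase the line and keep only runs of
    # ASCII letters, digits, and apostrophes, so "Hello" and "hello," match.
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def merge_counters(left, right):
    # Reduce step: add one partial count into the running total in place.
    left.update(right)
    return left


def word_frequencies(lines, num_processes=2):
    # The with-block shuts the pool down once the map results are collected.
    with Pool(num_processes) as pool:
        partial_counts = pool.map(normalizing_mapper, lines)
    return reduce(merge_counters, partial_counts, Counter())


if __name__ == "__main__":
    sample = ["Hello, world!", "hello CSDN", "CSDN is great."]
    for word, count in word_frequencies(sample).most_common():
        print(f"{word}: {count}")
```

Merging in place with `Counter.update` avoids building a new `Counter` for every addition, which `sum(counters, Counter())` does; for a handful of lines the difference is negligible, but it may matter for large inputs.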