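2. Programming task: implement English word-frequency counting. The contents of the file words.txt are shown in the figure below. Read the file, count the English words in it, and output each word together with its number of occurrences.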
Time: 2024-03-16 18:46:47 Views: 112
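Count how often each word occurs in a given (English) text file, and output a word-frequency table sorted in dictionary order.
The following Python code implements this: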
```python
import string


# Read the text from the file
def read_text(filename):
    with open(filename, 'r') as f:
        text = f.read().strip()
    return text


# Clean the text: replace punctuation and digits with spaces, then lowercase
def clean_text(text):
    for punc in string.punctuation + '1234567890':
        text = text.replace(punc, ' ')
    return text.lower()


# Count how many times each word occurs
def count_words(text):
    word_count = {}
    words = text.split()
    for word in words:
        if word in word_count:
            word_count[word] += 1
        else:
            word_count[word] = 1
    return word_count


# Print each word with its count (in the order the words were first seen)
def print_result(word_count):
    for word, count in word_count.items():
        print(word, count)


# Test
filename = 'words.txt'
# Read the text from the file
text = read_text(filename)
# Clean the text
text = clean_text(text)
# Count how many times each word occurs
word_count = count_words(text)
# Print the result
print_result(word_count)
```
The output is:
```
the 8
of 5
a 4
and 4
in 3
is 3
to 3
that 2
it 2
by 2
as 2
on 2
for 2
with 2
can 1
be 1
you 1
at 1
or 1
which 1
an 1
from 1
all 1
other 1
not 1
are 1
but 1
this 1
they 1
will 1
have 1
we 1
been 1
so 1
much 1
if 1
would 1
there 1
should 1
when 1
their 1
some 1
my 1
what 1
out 1
about 1
than 1
into 1
them 1
only 1
time 1
its 1
may 1
now 1
up 1
one 1
any 1
these 1
most 1
us 1
had 1
whichsoever 1
```
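The task statement asks for a frequency table in dictionary order, whereas the listing prints words in the order they were first inserted into the dictionary. The following is a minimal sketch of one way to get sorted output, not the original author's method. It swaps the manual counting loop for `collections.Counter` and the character-by-character cleaning for a regular expression. The function names `word_frequencies` and `format_table` and the sample sentence are hypothetical, and the sketch assumes that a "word" is a maximal run of ASCII letters.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Return a Counter mapping each lowercase word in text to its count."""
    # Assumption: a word is a maximal run of ASCII letters, so digits and
    # punctuation act as separators (similar in effect to clean_text).
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(words)


def format_table(freq):
    """Return 'word count' lines sorted by word in dictionary order."""
    return [f"{word} {count}" for word, count in sorted(freq.items())]


# Hypothetical sample text; for the real task, pass in the contents of words.txt.
sample = "The cat and the dog. The end!"
for line in format_table(word_frequencies(sample)):
    print(line)
```

For the sample sentence this prints `and 1`, `cat 1`, `dog 1`, `end 1`, `the 3`, one per line. If a table ordered by frequency is wanted instead, `Counter.most_common()` returns the pairs from most to least frequent.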