Building 2-gram and 3-gram language models in Python.
Date: 2023-05-31 18:03:29 Views: 105
2-gram language model:
```python
text = "I love natural language processing"
tokens = text.split()

# Build the 2-grams by sliding a window of size n over the tokens
n = 2
n_grams = []
for i in range(len(tokens) - n + 1):
    n_grams.append(tuple(tokens[i:i + n]))

# Count how often each 2-gram occurs
freq = {}
for gram in n_grams:
    if gram in freq:
        freq[gram] += 1
    else:
        freq[gram] = 1

# Estimate each 2-gram's probability as its relative frequency
prob = {}
for gram in freq:
    prob[gram] = freq[gram] / len(n_grams)

print(prob)
```
Output:
```
{('I', 'love'): 0.25, ('love', 'natural'): 0.25, ('natural', 'language'): 0.25, ('language', 'processing'): 0.25}
```
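The probabilities above are relative frequencies of each 2-gram among all 2-grams. A language model is usually stated in terms of the conditional probability of a word given the preceding word, P(w2 | w1) = count(w1, w2) / count(w1). The following is a minimal sketch of that maximum-likelihood estimate, not part of the original answer. The longer sample sentence is an assumption, chosen so that a word ("love") is followed by two different words; `cond_prob` and the other names are illustrative.

```python
from collections import Counter

# Assumed sample text (not from the original answer): "love" has two successors.
text = "I love natural language processing and I love Python"
tokens = text.split()

# Count bigrams, and count each word as a context (every token except the last).
bigram_counts = Counter(zip(tokens, tokens[1:]))
context_counts = Counter(tokens[:-1])

# Maximum-likelihood estimate: P(w2 | w1) = count(w1, w2) / count(w1)
cond_prob = {
    (w1, w2): count / context_counts[w1]
    for (w1, w2), count in bigram_counts.items()
}

print(cond_prob[("I", "love")])        # "I" is always followed by "love"
print(cond_prob[("love", "natural")])  # "love" is followed by "natural" half the time
```

With no smoothing, any bigram that never occurs in the text gets no entry at all, so this sketch is only suitable for illustrating the counting step.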
3-gram language model:
```python
text = "I love natural language processing"
tokens = text.split()

# Build the 3-grams by sliding a window of size n over the tokens
n = 3
n_grams = []
for i in range(len(tokens) - n + 1):
    n_grams.append(tuple(tokens[i:i + n]))

# Count how often each 3-gram occurs
freq = {}
for gram in n_grams:
    if gram in freq:
        freq[gram] += 1
    else:
        freq[gram] = 1

# Estimate each 3-gram's probability as its relative frequency
prob = {}
for gram in freq:
    prob[gram] = freq[gram] / len(n_grams)

print(prob)
```
Output:
```
{('I', 'love', 'natural'): 0.3333333333333333, ('love', 'natural', 'language'): 0.3333333333333333, ('natural', 'language', 'processing'): 0.3333333333333333}
```
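The two listings differ only in the value of `n`, so the same steps can be wrapped in one function. The following is a sketch under that assumption; the helper name `ngram_probs` is hypothetical, and it uses `collections.Counter` in place of the manual dictionary counting.

```python
from collections import Counter


def ngram_probs(tokens, n):
    """Return the relative frequency of each n-gram found in tokens."""
    # Slide a window of size n over the token list.
    grams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    counts = Counter(grams)
    total = len(grams)
    # If there are fewer than n tokens, grams is empty and so is the result.
    return {gram: count / total for gram, count in counts.items()}


tokens = "I love natural language processing".split()
bigram_probs = ngram_probs(tokens, 2)
trigram_probs = ngram_probs(tokens, 3)
print(bigram_probs)
print(trigram_probs)
```

For the sample sentence this should give the same dictionaries as the two listings above, since every 2-gram and 3-gram occurs exactly once.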