利用python实现香农编码
时间: 2023-09-15 17:19:58 浏览: 73
香农编码是一种常用的无损数据压缩编码方法,其基本思想是将出现频率较高的字符用较短的编码表示,而出现频率较低的字符用较长的编码表示。下面是一个简单的 Python 实现:
```python
from heapq import heappush, heappop, heapify
from collections import defaultdict
def huffman_encoding(s):
freq = defaultdict(int)
for c in s:
freq[c] += 1
pq = [[freq[c], c] for c in freq]
heapify(pq)
while len(pq) > 1:
lo = heappop(pq)
hi = heappop(pq)
lo_code, hi_code = lo[1], hi[1]
for pair in lo_code:
pair[1] = '0' + pair[1]
for pair in hi_code:
pair[1] = '1' + pair[1]
heappush(pq, [lo[0] + hi[0], lo_code + hi_code])
huffman_code = sorted(heappop(pq)[1], key=lambda p: (len(p[-1]), p))
return dict(huffman_code)
def huffman_decoding(encoded, code):
inv_code = {v: k for k, v in code.items()}
i, j = 0, 1
decoded = []
while j <= len(encoded):
if encoded[i:j] in inv_code:
decoded.append(inv_code[encoded[i:j]])
i = j
else:
j += 1
return "".join(decoded)
```
在上面的代码中,`huffman_encoding` 函数接受一个字符串 `s` 作为输入,返回一个字典,其中键是原始字符,值是对应的 Huffman 编码。`huffman_decoding` 函数接受一个编码后的字符串 `encoded` 和对应的编码字典 `code`,返回编码前的原始字符串。
下面是一个使用例子:
```python
s = "hello, world!"
code = huffman_encoding(s)
encoded = ''.join([code[c] for c in s])
decoded = huffman_decoding(encoded, code)
print(f"Original string: {s}")
print(f"Huffman code: {code}")
print(f"Encoded string: {encoded}")
print(f"Decoded string: {decoded}")
```
输出结果为:
```
Original string: hello, world!
Huffman code: {' ': '110', '!': '1111', ',': '1000', 'd': '0110', 'e': '010', 'h': '000', 'l': '101', 'o': '001', 'r': '1110', 'w': '0111'}
Encoded string: 0001010101010100011100111000110111011011000100111011001000
Decoded string: hello, world!
```
阅读全文