Huffman python
时间: 2023-10-28 08:01:58 浏览: 122
Huffman 编码是一种常用的数据压缩算法,它通过构建最优前缀编码来实现数据压缩。在 Python 中,你可以使用以下代码实现 Huffman 编码:
```python
import heapq
from collections import defaultdict
def build_frequency_table(data):
frequency_table = defaultdict(int)
for char in data:
frequency_table[char] += 1
return frequency_table
def build_huffman_tree(frequency_table):
heap = [[weight, [char, ""]] for char, weight in frequency_table.items()]
heapq.heapify(heap)
while len(heap) > 1:
lo = heapq.heappop(heap)
hi = heapq.heappop(heap)
for pair in lo[1:]:
pair[1] = '0' + pair[1]
for pair in hi[1:]:
pair[1] = '1' + pair[1]
heapq.heappush(heap, [lo[0] + hi[0]] + lo[1:] + hi[1:])
return heap[0]
def build_codewords(huffman_tree):
codewords = {}
for pair in huffman_tree[1:]:
char = pair[0]
code = pair[1]
codewords[char] = code
return codewords
def huffman_encoding(data):
frequency_table = build_frequency_table(data)
huffman_tree = build_huffman_tree(frequency_table)
codewords = build_codewords(huffman_tree)
encoded_data = ""
for char in data:
encoded_data += codewords[char]
return encoded_data, huffman_tree
def huffman_decoding(encoded_data, huffman_tree):
decoded_data = ""
current_node = huffman_tree
for bit in encoded_data:
if bit == '0':
current_node = current_node[1] # left child
else:
current_node = current_node[2] # right child
if len(current_node) == 2:
char = current_node[0]
decoded_data += char
current_node = huffman_tree
return decoded_data
# 示例用法
data = "Hello, world!"
encoded_data, huffman_tree = huffman_encoding(data)
decoded_data = huffman_decoding(encoded_data, huffman_tree)
print("Encoded data:", encoded_data)
print("Decoded data:", decoded_data)
```
以上代码演示了 Huffman 编码的基本实现。你可以传入需要压缩的数据,然后得到压缩后的编码和解码结果。
注意:Huffman 编码是一种基于字符频率进行压缩的算法,对于一些特定的数据集,可能不会得到明显的压缩效果。
阅读全文