Huffman树与Huffman编码
时间: 2024-06-13 08:06:15 浏览: 114
Huffman树与HUFFMAN编码
哈夫曼树是一种带权路径长度最短的树,也称为最优二叉树。它的构建过程是通过贪心算法,将权值较小的节点放在左子树,权值较大的节点放在右子树,最终构建出一棵树。哈夫曼编码是一种可变长度编码,它是通过哈夫曼树来实现的。在哈夫曼编码中,每个字符都对应着一条从根节点到叶子节点的路径,路径上的0和1分别表示左子树和右子树。对于同一棵哈夫曼树,每个字符的编码都是唯一的,且满足任意一个字符的编码都不是另一个字符编码的前缀,这种编码方式称为前缀编码。
下面是一个构建哈夫曼树和编码的Python示例代码:
```python
import heapq
from collections import defaultdict
def huffman_encoding(data):
# 统计字符出现的频率
freq = defaultdict(int)
for c in data:
freq[c] += 1
# 构建哈夫曼树
heap = [[weight, [char, ""]] for char, weight in freq.items()]
heapq.heapify(heap)
while len(heap) > 1:
lo = heapq.heappop(heap)
hi = heapq.heappop(heap)
for pair in lo[1:]:
pair[1] = '0' + pair[1]
for pair in hi[1:]:
pair[1] = '1' + pair[1]
heapq.heappush(heap, [lo[0] + hi[0]] + lo[1:] + hi[1:])
# 生成编码
encoding = {}
for pair in heap[0][1:]:
encoding[pair[0]] = pair[1]
# 对数据进行编码
encoded_data = ''.join([encoding[c] for c in data])
return encoded_data, encoding
def huffman_decoding(encoded_data, encoding):
# 反转编码字典
decoding = {v: k for k, v in encoding.items()}
# 解码数据
decoded_data = ""
code = ""
for bit in encoded_data:
code += bit
if code in decoding:
decoded_data += decoding[code]
code = ""
return decoded_data
# 示例
data = "hello world"
encoded_data, encoding = huffman_encoding(data)
decoded_data = huffman_decoding(encoded_data, encoding)
print("Encoded data:", encoded_data)
print("Decoded data:", decoded_data)
print("Encoding table:", encoding)
```
阅读全文