用python写一个赫夫曼编码
时间: 2023-06-19 21:10:17 浏览: 52
赫夫曼编码是一种基于字符频率构建可变长度编码的压缩算法。以下是用 Python 实现赫夫曼编码的示例代码:
```python
import heapq
from collections import defaultdict
def huffman_encoding(data):
# Step 1: Determine character frequencies
freq = defaultdict(int)
for c in data:
freq[c] += 1
# Step 2: Build Huffman tree
heap = [[wt, [sym, ""]] for sym, wt in freq.items()]
heapq.heapify(heap)
while len(heap) > 1:
lo = heapq.heappop(heap)
hi = heapq.heappop(heap)
for pair in lo[1:]:
pair[1] = '0' + pair[1]
for pair in hi[1:]:
pair[1] = '1' + pair[1]
heapq.heappush(heap, [lo[0] + hi[0]] + lo[1:] + hi[1:])
# Step 3: Generate Huffman codes
huffman_dict = dict(heapq.heappop(heap)[1:])
encoded_data = ''.join([huffman_dict[c] for c in data])
return encoded_data, huffman_dict
def huffman_decoding(encoded_data, huffman_dict):
# Step 1: Invert Huffman dictionary
inv_dict = {v: k for k, v in huffman_dict.items()}
# Step 2: Decode encoded data
decoded_data = ""
i = 0
while i < len(encoded_data):
j = i + 1
while encoded_data[i:j] not in inv_dict:
j += 1
decoded_data += inv_dict[encoded_data[i:j]]
i = j
return decoded_data
# Example usage
data = "hello world"
encoded_data, huffman_dict = huffman_encoding(data)
print("Encoded data:", encoded_data)
decoded_data = huffman_decoding(encoded_data, huffman_dict)
print("Decoded data:", decoded_data)
```
这段代码首先通过统计每个字符的出现频率来确定 Huffman 树。然后,它从 Huffman 树中构建了一个编码字典,其中每个字符都有一个唯一的二进制编码。接下来,它使用该字典将输入数据编码为二进制字符串。最后,它使用该字典将编码数据解码为原始数据。