'utf-8' codec can't decode byte 0x8a in position 0: invalid start byte
Time: 2023-10-16 22:07:38 Views: 190
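Converting downloaded HTML with `str(html_bytes, encoding='utf-8')`, where `html_bytes` is the raw bytes fetched by the crawler, raises `UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8a in position 0: invalid start byte`. This error means the data contains a byte sequence that is not valid UTF-8, and the message shows that decoding already fails at the very first byte. One common cause is that the response body is gzip-compressed, so the bytes are binary data rather than text. (A gzip stream starts with the bytes `0x1f 0x8b`, so for gzip data the message usually reports byte `0x8b` in position 1.) In that case the received bytes have to be decompressed before they are decoded.
The fix is to decompress the bytes with the `gzip` module and then decode them. For example: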
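To see why compressed data triggers this kind of error, here is a minimal sketch that needs no network access. The sample string is an arbitrary assumption; the code only shows that decoding a gzip stream as UTF-8 fails on its magic bytes, and that decompressing first makes the decode succeed.

```python
import gzip

# Compress some sample text the way a server using gzip would.
payload = gzip.compress("hello, 世界".encode("utf-8"))

# Every gzip stream begins with the two magic bytes 0x1f 0x8b.
print(payload[:2].hex())

try:
    payload.decode("utf-8")
except UnicodeDecodeError as exc:
    # 0x1f is valid ASCII, so decoding fails on the second byte.
    failed_at = exc.start
    bad_byte = payload[exc.start]
    print(f"byte {bad_byte:#x} at position {failed_at}: {exc.reason}")

# Decompressing first makes the decode succeed.
text = gzip.decompress(payload).decode("utf-8")
print(text)
```

If the reported byte and position do not match this pattern, the data may be compressed or encoded in some other way, so it is worth inspecting the first few bytes before assuming gzip.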
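```
from urllib import request
from io import BytesIO
import gzip


class Spider:
    url = 'https://www.douyu.com/'

    def __fetch_content(self):
        # Download the raw response body as bytes.
        r = request.urlopen(Spider.url)
        htmls = r.read()
        # Decompress only if the body starts with the gzip magic bytes 0x1f 0x8b;
        # GzipFile raises an error when given uncompressed data.
        if htmls[:2] == b'\x1f\x8b':
            buff = BytesIO(htmls)
            f = gzip.GzipFile(fileobj=buff)
            htmls = f.read()
        # Decode the (now uncompressed) bytes as UTF-8 text.
        htmls = htmls.decode('utf-8')
        print(htmls)

    def go(self):
        self.__fetch_content()


spider = Spider()
spider.go()
```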
The code above first uses the `gzip` module to decompress the bytes and then decodes the decompressed bytes as UTF-8, which avoids the `UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8a in position 0: invalid start byte` error. [1][2][3]
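The same decompress-then-decode step can be wrapped in a small helper so that it also works for responses that are not compressed. This is only a sketch: `decode_body`, `GZIP_MAGIC`, and the parameter names are hypothetical and not part of the original code. It assumes the caller passes in the raw bytes and, optionally, the value of the response's `Content-Encoding` header.

```python
import gzip

# The two bytes that open every gzip stream.
GZIP_MAGIC = b"\x1f\x8b"


def decode_body(raw, content_encoding=None, charset="utf-8"):
    """Return text from raw response bytes, decompressing gzip data first."""
    header_says_gzip = (content_encoding or "").strip().lower() == "gzip"
    if header_says_gzip or raw.startswith(GZIP_MAGIC):
        # gzip.decompress raises an error if the data is not valid gzip,
        # for example when the header is wrong.
        raw = gzip.decompress(raw)
    return raw.decode(charset)


# Works for both compressed and plain bodies.
print(decode_body(gzip.compress("斗鱼".encode("utf-8"))))
print(decode_body(b"plain text"))
```

With `urllib`, this could be called as `decode_body(r.read(), r.headers.get("Content-Encoding"))`, where `r` is the object returned by `request.urlopen`.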
#### References
- *1* [UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd5 in position 2: invalid continuation byte-...](https://download.csdn.net/download/weixin_42204453/15589184)
- *2* [Transferring files between Windows and Linux over FTP: UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8a in position ...](https://blog.csdn.net/qq_38163931/article/details/126351263)
- *3* [UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte](https://blog.csdn.net/lijuncheng963375877/article/details/122232914)