How can I add exception handling to the code above?
Time: 2024-04-14 20:04:07 Views: 18
You can use a try-except block to handle exceptions. The following example adds exception handling to the code:
```python
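import re
import requests

headers = {'User-Agent':
           "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36"}


def baidu(company, page):
    """Fetch one page of Baidu news results; return four empty lists on failure."""
    num = (page - 1) * 10  # value of the pn parameter: 0 for page 1, 10 for page 2, ...
    url = 'http://www.baidu.com/s?rtt=4&bsst=1&cl=2&tn=news&rsv_dl=ns_pc&word=' + company + '&pn=' + str(num)
    # These patterns depend on the page markup and will match nothing if it changes.
    p_href = re.compile('<h3 class="news-title_1YtI1 "><a href="(.*?)"', re.S)
    p_title = re.compile('class="news-title-font_1xS-F" aria-label="(.*?)"', re.S)
    p_date = re.compile('class="c-color-gray2 c-font-normal c-gap-right-xsmall" aria-label="(.*?)"', re.S)
    p_source = re.compile('<span class="c-color-gray" aria-label="(.*?)"', re.S)
    try:
        # The request is the step most likely to fail, so it must sit inside the try block.
        response = requests.get(url, headers=headers, timeout=10)
        response.raise_for_status()  # raise on HTTP 4xx/5xx status codes
        res = response.text
        href = re.findall(p_href, res)
        title = re.findall(p_title, res)
        date = re.findall(p_date, res)
        source = re.findall(p_source, res)
    except requests.RequestException as e:
        print("Exception while crawling:", str(e))
        return [], [], [], []
    return href, title, date, source


companys = ['海康威视', '京东', '顺丰', '阿里巴巴', '海通证券', '恒大集团']
for company in companys:
    for b in range(3):
        href, title, date, source = baidu(company, b + 1)
        if not href:
            # Empty result: the request failed or the patterns no longer match the page.
            print(company + ' page ' + str(b + 1) + ': no results')
            continue
        print(company + ' page ' + str(b + 1) + ' crawled successfully')
        # zip() stops at the shortest list, so mismatched lengths cannot raise IndexError.
        for h, t, d, s in zip(href, title, date, source):
            print("Href:", h)
            print("Title:", t)
            print("Date:", d)
            print("Source:", s)
            print()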
```
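In the code above, a try-except block catches exceptions. When one is caught, the function prints the exception message and returns four empty lists, so the loop moves on to the next page instead of stopping. Note that only statements inside the try block are protected: the network request has to be placed inside it for connection errors and timeouts to be handled, since `re.findall` on a string rarely raises.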
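Returning empty lists skips a failed page entirely. A common extension is to retry a few times before giving up. The following is a minimal sketch of that idea, not part of the original answer: `with_retries` is a hypothetical helper (it is not a `requests` function), and the `flaky` function only simulates a failing request so the example runs without network access.

```python
import time


def with_retries(func, retries=3, delay=1.0, exceptions=(Exception,)):
    """Call func() up to `retries` times; return its result or re-raise the last error.

    Hypothetical helper for illustration only.
    """
    if retries < 1:
        raise ValueError("retries must be at least 1")
    last_error = None
    for attempt in range(1, retries + 1):
        try:
            return func()
        except exceptions as e:
            last_error = e
            print(f"Attempt {attempt}/{retries} failed: {e}")
            if attempt < retries:
                time.sleep(delay)  # brief pause before the next attempt
    raise last_error


# Simulated request: fails twice, then succeeds (stands in for a real network call).
calls = {"count": 0}


def flaky():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("simulated network failure")
    return "ok"


result = with_retries(flaky, retries=3, delay=0, exceptions=(ConnectionError,))
print(result)
```

In the crawler, one could pass a function that performs the `requests.get` call and set `exceptions=(requests.RequestException,)`, so that only network-related errors trigger a retry while programming errors still surface immediately.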