codes = soup.find("table",id="oTable").tbody.find_all("td","bzdm")
This line finds the table with id "oTable" on the page, then collects every `td` element with class "bzdm" inside that table's `tbody`, and assigns the resulting list to the variable `codes`.
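For illustration, here is a minimal runnable sketch of the same lookup against an inline HTML fixture; the table structure and cell values are invented for the demo, since the original page isn't shown:
```python
from bs4 import BeautifulSoup

# Inline fixture mimicking the assumed page structure
# (the real page is not shown in the question).
html = """
<table id="oTable">
  <tbody>
    <tr><td class="bzdm">600887</td><td>伊利股份</td></tr>
    <tr><td class="bzdm">002027</td><td>分众传媒</td></tr>
  </tbody>
</table>
"""

soup = BeautifulSoup(html, "html.parser")
# find() returns the first matching Tag (or None); the bare string
# "bzdm" is BeautifulSoup's shorthand for a CSS-class filter.
codes = soup.find("table", id="oTable").tbody.find_all("td", "bzdm")
print([td.text for td in codes])  # ['600887', '002027']
```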
Related questions:
1. Which web page is this code meant to scrape?
2. What do "oTable" and "bzdm" refer to in the code?
```python
import requests
from bs4 import BeautifulSoup

codes = ['600887', '002027']  # the stock code of 伊利股份 is 600887, and of 分众传媒 is 002027
for code in codes:
    url = f'http://www.sse.com.cn/assortment/stock/list/info/quote/index.shtml?COMPANY_CODE={code}'
    response = requests.get(url)
    soup = BeautifulSoup(response.content, 'html.parser')
    name = soup.find('div', class_='company_info').find('h2').text
    price = soup.find('span', class_='last').text
    print(f'{name} 的股价为 {price}')
```
Running the code above raises the following error:
```
File "D:\xbro\Blili\kk2.py", line 12, in <module>
    name = soup.find('div', class_='company_info').find('h2').text
AttributeError: 'NoneType' object has no attribute 'find'
```
Please fix the code based on this error.
According to the error, `soup.find('div', class_='company_info')` found no matching `div` element and returned None; None has no `find` method, which raises the AttributeError. Note that the None check must happen on the `div` itself, before chaining `.find('h2')` onto the result.
One fix is to guard each lookup: if an element is missing, skip that iteration and move on to the next code:
```python
import requests
from bs4 import BeautifulSoup

codes = ['600887', '002027']
for code in codes:
    url = f'http://www.sse.com.cn/assortment/stock/list/info/quote/index.shtml?COMPANY_CODE={code}'
    response = requests.get(url)
    soup = BeautifulSoup(response.content, 'html.parser')
    # Look up the container first, so a missing div no longer raises.
    company_div = soup.find('div', class_='company_info')
    if not company_div:
        continue
    name_elem = company_div.find('h2')
    if not name_elem:
        continue
    name = name_elem.text
    price_elem = soup.find('span', class_='last')
    if not price_elem:
        continue
    price = price_elem.text
    print(f'{name} 的股价为 {price}')
```
With the container checked before each chained lookup, a missing element now skips the iteration instead of raising an error.
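If this check-then-extract pattern repeats, it can be factored into a small helper. A sketch, where `find_text` is a hypothetical name and not part of the BeautifulSoup API:
```python
from bs4 import BeautifulSoup

def find_text(root, *path):
    """Follow a chain of (tag, class) lookups from root and return the
    final element's text, or None if any step matches nothing.
    A hypothetical helper, not a BeautifulSoup API."""
    elem = root
    for name, class_ in path:
        elem = elem.find(name, class_=class_) if class_ else elem.find(name)
        if elem is None:
            return None
    return elem.text

soup = BeautifulSoup('<div class="company_info"><h2>伊利股份</h2></div>',
                     'html.parser')
print(find_text(soup, ('div', 'company_info'), ('h2', None)))  # 伊利股份
print(find_text(soup, ('span', 'last')))                       # None
```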
```python
import requests
from bs4 import BeautifulSoup

codes = ['600887', '002027']
for code in codes:
    url = f'https://www.sse.com.cn/home/search/index.shtml?webswd={code}'
    response = requests.get(url)
    soup = BeautifulSoup(response.content, 'html.parser')
    company_div = soup.find('div', class_='company_info')
    if not company_div:
        continue
    name_elem = company_div.find('h2')
    if not name_elem:
        continue
    name = name_elem.text
    price_elem = soup.find('span', class_='last')
    if not price_elem:
        continue
    price = price_elem.text
    print(f'{name} 的股价为 {price}')
```
Running the code above raises:
```
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='www.sse.com.cn', port=443): Max retries exceeded with url: /home/search/index.shtml?webswd=600887 (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x00000278705B05B0>, 'Connection to www.sse.com.cn timed out. (connect timeout=None)'))
```
Please optimize the code based on this error.
The error shows the connection timed out. Add a timeout parameter and catch the exception so one failing request doesn't crash the whole run:
```python
import requests
from bs4 import BeautifulSoup

codes = ['600887', '002027']
for code in codes:
    url = f'https://www.sse.com.cn/home/search/index.shtml?webswd={code}'
    try:
        response = requests.get(url, timeout=5)  # fail fast instead of hanging
        soup = BeautifulSoup(response.content, 'html.parser')
        company_div = soup.find('div', class_='company_info')
        if not company_div:
            continue
        name_elem = company_div.find('h2')
        if not name_elem:
            continue
        name = name_elem.text
        price_elem = soup.find('span', class_='last')
        if not price_elem:
            continue
        price = price_elem.text
        print(f'{name} 的股价为 {price}')
    except requests.exceptions.RequestException as e:
        print(f'{url} 爬取失败,原因为:{e}')
```
Here the request is wrapped in a try-except block to catch any requests exception, and `timeout=5` makes a request that gets no response within 5 seconds raise instead of hanging indefinitely. On failure, the failing URL and the reason are printed, which makes debugging easier.
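If the host is reachable but flaky, automatic retries can complement the timeout. A sketch using requests' HTTPAdapter with urllib3's Retry; the retry count, backoff factor, and status codes are illustrative choices, not requirements:
```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

# Retry failed GETs a few times with exponential backoff before
# giving up and raising, as a plain requests.get() would.
retry = Retry(total=3, backoff_factor=0.5,
              status_forcelist=(500, 502, 503, 504))
session = requests.Session()
session.mount('https://', HTTPAdapter(max_retries=retry))

url = 'https://www.sse.com.cn/home/search/index.shtml?webswd=600887'
response = session.get(url, timeout=5)
print(response.status_code)
```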