Optimize this code so that it can scrape all of the computer books:

```python
import requests
from bs4 import BeautifulSoup

url = 'http://search.dangdang.com/?key=%BC%C6%CB%E3%BB%FA&act=input'
response = requests.get(url)
page = response.text
soup = BeautifulSoup(page, 'lxml')
books = soup.find('ul', class_='bigimg')
for book in books.find_all('li'):
    title = book.find('a', class_='pic').get('title')
    author = book.find('p', class_='search_book_author').text
    price = book.find('p', class_='price').find('span', class_='search_now_price').text
    print(title, author, price)
```
Posted: 2023-12-13 10:03:28
Scraping books with BeautifulSoup (a must-read for beginners)
The following optimizations can be added:
1. Use a `requests.Session` to reuse the connection and reduce network latency
2. Use exception handling so one malformed entry does not crash the program
3. Use CSS selectors instead of chained `find` calls for cleaner lookups

Here is the optimized code:
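Point 3 can be illustrated offline with a small HTML snippet. The markup below is a simplified, hypothetical stand-in for Dangdang's real result structure (the book title and author are made-up sample data), showing that a single CSS selector reaches the same node as a chain of `find` calls:

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup mimicking Dangdang's result list structure
html = '''
<ul class="bigimg">
  <li>
    <a class="pic" title="Python Crash Course"></a>
    <p class="search_book_author">Eric Matthes</p>
    <p class="price"><span class="search_now_price">&yen;49.00</span></p>
  </li>
</ul>
'''

soup = BeautifulSoup(html, 'html.parser')

# find-style chain: one Python-level call per nesting level
li = soup.find('ul', class_='bigimg').find('li')
title_find = li.find('a', class_='pic').get('title')

# CSS selector: the whole path expressed in one string
title_css = soup.select_one('ul.bigimg li a.pic').get('title')

print(title_find == title_css)  # both yield 'Python Crash Course'
```

Both approaches return the same element; the selector version is simply more compact and easier to keep in sync with the page structure.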
```python
import requests
from bs4 import BeautifulSoup
url = 'http://search.dangdang.com/?key=%BC%C6%CB%E3%BB%FA&act=input'
# 1. Use a session to keep the connection alive
session = requests.Session()
response = session.get(url)
page = response.text
soup = BeautifulSoup(page, 'lxml')
# 3. A CSS selector replaces the chained find calls
books = soup.select('ul.bigimg li')
for book in books:
    # 2. Exception handling: skip entries missing an expected element
    try:
        title = book.select_one('a.pic').get('title')
        author = book.select_one('p.search_book_author').text
        price = book.select_one('p.price span.search_now_price').text
        print(title, author, price)
    except AttributeError:
        print('Error occurred while parsing book information')
```
Note: if the crawl runs into CAPTCHAs or other anti-scraping measures, further handling (e.g. request headers, delays, proxies) is needed.
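The original question asks for *all* computer books, but the code above only fetches the first page of search results. The sketch below extends it with pagination; it assumes Dangdang's search accepts a `page_index` query parameter and that an empty result list marks the end (both are observed conventions of the site, not a documented API, so verify before relying on them):

```python
import requests
from bs4 import BeautifulSoup

BASE_URL = 'http://search.dangdang.com/?key=%BC%C6%CB%E3%BB%FA&act=input'

def page_url(page_index):
    # Assumption: Dangdang search paginates via a page_index parameter
    return f'{BASE_URL}&page_index={page_index}'

def crawl_all(max_pages=100):
    session = requests.Session()  # reuse one connection across all pages
    for page in range(1, max_pages + 1):
        response = session.get(page_url(page), timeout=10)
        soup = BeautifulSoup(response.text, 'lxml')
        items = soup.select('ul.bigimg li')
        if not items:  # empty page: assume we have run past the last page
            break
        for book in items:
            try:
                title = book.select_one('a.pic').get('title')
                author = book.select_one('p.search_book_author').text
                price = book.select_one('p.price span.search_now_price').text
                print(title, author, price)
            except AttributeError:
                print('Error occurred while parsing book information')
```

Calling `crawl_all()` walks the result pages in order until one comes back empty. The `max_pages` cap is a safety limit so a change in the site's markup (which would make every page look "non-empty" or loop forever) cannot turn this into an unbounded crawl.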