`total_page` not found in `for num in range(1, total_page + 1):` after matching `data-totalpage` on https://jn.lianjia.com/zufang/pg1/#contentList

With the code below, the loop `for num in range(1, total_page + 1):` fails because `total_page` cannot be found:

    url1 = 'https://jn.lianjia.com/zufang/pg1/#contentList'
    response = requests.get(url1, headers=header)
    html = response.text
    match = re.search(r'data-totalpage="(\d+)"', html)
    if match:
        total_page = int(match.group(1))
    else:
        print('No match found')
    for num in range(1, total_page + 1):

Posted: 2024-04-01 07:35:01 · Views: 54

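`total_page` is only assigned inside the `if match:` branch. When the regular expression does not match (for example, because the site returned a different page than expected), the name is never defined, and the `for` loop raises a `NameError`. You need to get the total page count before the loop, make sure it was actually obtained, and only then start looping. Here is an example:

```python
import re
import sys

import requests

url = 'https://jn.lianjia.com/zufang/pg1/#contentList'
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 '
                  '(KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36 Edge/16.16299'
}

# Get the total number of pages
response = requests.get(url, headers=headers)
html = response.text
match = re.search(r'data-totalpage="(\d+)"', html)
if match:
    total_page = int(match.group(1))
    print(total_page)
else:
    print('No match found')
    sys.exit(1)  # total_page is undefined here, so the loop below must not run

# Loop over every page and fetch its data
for num in range(1, total_page + 1):
    url = f'https://jn.lianjia.com/zufang/pg{num}/#contentList'
    response = requests.get(url, headers=headers)
    html = response.text
    # Data extraction code
    # ...
```

This example gets the total page count first, stops if it could not be found, and otherwise loops over every page. Adapt the data extraction code to your own needs.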
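One way to make the "no match" case explicit is to separate parsing from the network request, so the page-count logic can be checked without hitting the site. The following is a minimal sketch under stated assumptions: the helper names `parse_total_page` and `build_page_urls` are hypothetical, and the sample HTML strings are made up for illustration, not copied from the real page.

```python
import re
from typing import List, Optional

# Same pattern and URL scheme as in the question; everything else is illustrative.
TOTAL_PAGE_RE = re.compile(r'data-totalpage="(\d+)"')
PAGE_URL = 'https://jn.lianjia.com/zufang/pg{num}/#contentList'


def parse_total_page(html: str) -> Optional[int]:
    """Return the page count found in html, or None if the attribute is absent."""
    match = TOTAL_PAGE_RE.search(html)
    return int(match.group(1)) if match else None


def build_page_urls(total_page: Optional[int]) -> List[str]:
    """Build one URL per page; an unknown or zero page count yields an empty list."""
    if not total_page:
        return []
    return [PAGE_URL.format(num=num) for num in range(1, total_page + 1)]


# Made-up sample inputs: one with the attribute, one without it.
sample_ok = '<div data-totalpage="3"></div>'
sample_missing = '<html><body>no pagination here</body></html>'

print(build_page_urls(parse_total_page(sample_ok)))
print(build_page_urls(parse_total_page(sample_missing)))
```

Because `parse_total_page` always returns a value (`None` when nothing matches), the loop never touches an undefined name; in a real script you would pass `response.text` to it and log or retry when it returns `None`.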
