I want to use the requests library to scrape https://www.maersk.com.cn/schedules/pointToPoint?from=2IW9P6J7XAW72&to=0XOP5ISJZK0HR&containerIsoCode=42G1&fromServiceMode=CY&toServiceMode=CY&numberOfWeeks=4&dateType=D&date=2023-06-13&vesselFlag= and extract the port of departure, departure time, destination port, arrival time, and vessel name from that page

Time: 2023-07-23 20:45:21 Views: 175

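Sure. You can send the request with Python's requests library and then parse the returned HTML with BeautifulSoup to extract the fields you need. Here is a simple example:

```python
import requests
from bs4 import BeautifulSoup

url = "https://www.maersk.com.cn/schedules/pointToPoint"
params = {
    "from": "2IW9P6J7XAW72",
    "to": "0XOP5ISJZK0HR",
    "containerIsoCode": "42G1",
    "fromServiceMode": "CY",
    "toServiceMode": "CY",
    "numberOfWeeks": "4",
    "dateType": "D",
    "date": "2023-06-13",
    "vesselFlag": "",
}

# A timeout keeps the script from hanging; raise_for_status surfaces HTTP errors.
response = requests.get(url, params=params, timeout=10)
response.raise_for_status()
soup = BeautifulSoup(response.text, "html.parser")


def cell_text(row, selector):
    """Return the stripped text of the first match, or "" if nothing matches."""
    node = row.select_one(selector)
    return node.get_text(strip=True) if node else ""


# The class names below are placeholders. Inspect the real page markup
# in your browser's developer tools and adjust the selectors to match.
for row in soup.select(".table-row"):
    departure_port = cell_text(row, ".departure-port")
    departure_time = cell_text(row, ".departure-time")
    arrival_port = cell_text(row, ".arrival-port")
    arrival_time = cell_text(row, ".arrival-time")
    vessel_name = cell_text(row, ".vessel-name")
    print(f"Vessel: {vessel_name}, port of departure: {departure_port}, "
          f"departure time: {departure_time}, destination port: {arrival_port}, "
          f"arrival time: {arrival_time}")
```

The code first defines the target URL and the query parameters, then sends a GET request with requests. It parses the response with BeautifulSoup, selects each result row with a CSS selector, and extracts the required fields. Finally it prints the data; you can save it to a file instead, depending on your needs.

Note that the CSS class names are placeholders rather than the site's actual markup. If the page builds its schedule table with JavaScript after loading, the HTML returned by requests will not contain those rows and the loop will print nothing. The site may also have anti-scraping measures, so to avoid having your IP banned you should set appropriate request headers and, if necessary, a proxy. If you need to scrape several searches, you can loop over multiple search conditions as described in the earlier answer.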
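The advice about request headers, proxies, and looping over several searches could be sketched roughly as follows. This is only an illustration under stated assumptions: the helper names `build_session`, `schedule_params`, and `fetch_pages` are hypothetical, the User-Agent string and proxy address are placeholders, and nothing here is known to get past this particular site's protections.

```python
import time

import requests

BASE_URL = "https://www.maersk.com.cn/schedules/pointToPoint"


def build_session(proxy=None):
    """Return a requests.Session with browser-like headers and an optional proxy."""
    session = requests.Session()
    session.headers.update({
        # Placeholder value; substitute the string your own browser sends.
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
        "Accept-Language": "zh-CN,zh;q=0.9,en;q=0.8",
    })
    if proxy:
        # Hypothetical proxy URL, for example "http://127.0.0.1:8080".
        session.proxies.update({"http": proxy, "https": proxy})
    return session


def schedule_params(origin, destination, date):
    """Build the query parameters for one origin/destination/date search."""
    return {
        "from": origin,
        "to": destination,
        "containerIsoCode": "42G1",
        "fromServiceMode": "CY",
        "toServiceMode": "CY",
        "numberOfWeeks": "4",
        "dateType": "D",
        "date": date,
        "vesselFlag": "",
    }


def fetch_pages(session, searches, delay=2.0):
    """Yield (search, html) pairs, pausing between requests to stay polite."""
    for origin, destination, date in searches:
        response = session.get(
            BASE_URL,
            params=schedule_params(origin, destination, date),
            timeout=10,
        )
        response.raise_for_status()
        yield (origin, destination, date), response.text
        time.sleep(delay)
```

To use it, create one session with `build_session()`, pass a list of `(origin, destination, date)` tuples to `fetch_pages`, and hand each returned HTML string to the BeautifulSoup parsing step. Reusing one session keeps the headers and cookies consistent across requests, and the delay between requests lowers the chance of being rate-limited.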
