Open http://www.gov.cn/zhengce/xxgk/index.htm, scrape the title, document number, date written, publication date, and URL of the first 30 policies, and print them.

Posted: 2024-01-11 11:03:13

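Sure, here is an implementation in Python:

```python
import requests
from bs4 import BeautifulSoup
from urllib.parse import urljoin

url = 'http://www.gov.cn/zhengce/xxgk/index.htm'
r = requests.get(url, timeout=10)
r.raise_for_status()  # stop early on HTTP errors
r.encoding = 'utf-8'  # decode the response body as UTF-8
soup = BeautifulSoup(r.text, 'html.parser')

# Assumption: each policy is an <li> containing <a>, <span>, <em> and <i>.
# Check this against the real page markup and adjust the tags if needed.
count = 0
for policy in soup.find_all('li'):
    tags = (policy.a, policy.span, policy.em, policy.i)
    if any(tag is None for tag in tags) or not policy.a.get('href'):
        continue  # skip <li> items (menus etc.) that lack a field
    title = policy.a.get_text().strip()      # policy title
    fwzh = policy.span.get_text().strip()    # document number
    cwdate = policy.em.get_text().strip()    # date written
    fbdate = policy.i.get_text().strip()     # publication date
    link = urljoin(url, policy.a['href'])    # absolute policy URL
    print(title, fwzh, cwdate, fbdate, link)
    count += 1
    if count == 30:  # keep only the first 30 policies
        break
```

The code uses the requests and BeautifulSoup libraries to fetch and parse the page, then extracts the policy title, document number, date written, publication date, and policy URL. Only the first 30 policy entries are kept, and each one is printed with print. The tag choices (`a`, `span`, `em`, `i`) are a guess at the page structure, so list items that do not contain all four are skipped rather than raising an AttributeError.

Note: the code already sets r.encoding to 'utf-8'. If the output is still garbled, try `r.encoding = r.apparent_encoding` instead.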
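If the results need to be saved rather than only printed, the output step could be sketched as below. This is a minimal illustration under a few assumptions: the five fields have already been extracted into dictionaries keyed by the same names used in the loop (`title`, `fwzh`, `cwdate`, `fbdate`, `link`), and the helper names `normalize` and `to_csv` and the sample record are hypothetical, not something taken from the page.

```python
import csv
import io
from urllib.parse import urljoin

BASE_URL = 'http://www.gov.cn/zhengce/xxgk/index.htm'
# Same field names as the variables in the scraping loop.
FIELDS = ['title', 'fwzh', 'cwdate', 'fbdate', 'link']


def normalize(record, base_url=BASE_URL):
    """Strip whitespace from every field and make a non-empty link absolute."""
    clean = {key: (record.get(key) or '').strip() for key in FIELDS}
    if clean['link']:
        clean['link'] = urljoin(base_url, clean['link'])
    return clean


def to_csv(records, limit=30):
    """Return the first `limit` records as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator='\n')
    writer.writeheader()
    for record in records[:limit]:
        writer.writerow(normalize(record))
    return buffer.getvalue()


# Hypothetical record, made up purely for illustration.
sample = [{
    'title': ' Example policy title ',
    'fwzh': 'Example No. 1',
    'cwdate': '2024-01-01',
    'fbdate': '2024-01-02',
    'link': '../content/example.htm',
}]
print(to_csv(sample))
```

To use it with the scraper, append a dictionary per policy inside the loop instead of calling print, then write the returned text to a file opened with `encoding='utf-8-sig'` so that spreadsheet programs detect the encoding of the Chinese text.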