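Use the methods and techniques of the `re` library to extract the hyperlinks from a web page that interests you, and show both the process and the resulting output.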
Time: 2024-09-22 18:05:05 Views: 35
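In Python, the `re` library (regular expressions) can be combined with the `requests` library to fetch a web page and extract the hyperlinks it contains. A simple walkthrough follows:
First, install the `requests` library. If it is not installed yet, run `pip install requests`.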
```python
import re
import requests

# Fetch the page content
def get_html(url):
    try:
        # A timeout keeps the script from hanging on an unresponsive server
        response = requests.get(url, timeout=10)
        response.raise_for_status()  # Raise an exception for 4xx/5xx responses
        return response.text
    except requests.exceptions.RequestException as e:
        print(f"Error occurred while fetching the page: {e}")
        return None

# Extract hyperlinks with a regular expression
def extract_links(html):
    # Captures the double-quoted href value of each <a> tag
    pattern = r'<a\s+(?:[^>]*?\s+)?href="([^"]*)"[^>]*>'
    links = re.findall(pattern, html)
    return links

# Example: fetch and print the hyperlinks on the Python homepage
url = "https://www.python.org/"
html_content = get_html(url)
if html_content:
    extracted_links = extract_links(html_content)
    print("Hyperlinks extracted from the Python homepage:")
    for link in extracted_links:
        print(link)
else:
    print("Could not retrieve the page content")
```
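The raw `href` values returned by the regular expression are often relative paths, in-page anchors, or duplicates. The following is a minimal sketch of one way to post-process them, run on a small inline HTML string so that the process and the output can be checked without network access. The function name `extract_absolute_links` and the sample markup are hypothetical and used only for illustration; the pattern is the same one used in `extract_links`.

```python
import re
from urllib.parse import urljoin

# Same pattern as in extract_links: captures the double-quoted href of an <a> tag.
LINK_PATTERN = re.compile(r'<a\s+(?:[^>]*?\s+)?href="([^"]*)"[^>]*>')

def extract_absolute_links(html, base_url):
    """Return unique absolute links in the order they appear (illustrative helper)."""
    seen = set()
    result = []
    for href in LINK_PATTERN.findall(html):
        # Skip in-page anchors and schemes that do not point to a page.
        if href.startswith(("#", "javascript:", "mailto:")):
            continue
        # Resolve relative paths against the URL of the page they came from.
        absolute = urljoin(base_url, href)
        if absolute not in seen:
            seen.add(absolute)
            result.append(absolute)
    return result

# Hypothetical sample markup, not taken from a real page.
sample_html = """
<a href="/downloads/">Downloads</a>
<a class="nav" href="https://docs.python.org/3/">Docs</a>
<a href="#content">Skip to content</a>
<a href='/about/'>About</a>
<a href="/downloads/">Downloads again</a>
"""

links = extract_absolute_links(sample_html, "https://www.python.org/")
for link in links:
    print(link)
```

Run on the sample markup, this prints `https://www.python.org/downloads/` followed by `https://docs.python.org/3/`. The `/about/` link is missed because its `href` uses single quotes, which the pattern does not match; this is a known limitation of this particular expression and a reason to inspect the output by hand on real pages.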