
Write a function extract-title(html) that uses the BeautifulSoup library to parse the given HTML source and return the page's title

Date: 2024-12-14 07:15:39 Views: 14

To write a function named `extract_title`, first install the `beautifulsoup4` and `lxml` libraries (if you have not installed them yet). Because the HTML source is passed in as a string, the function can parse it directly and does not need a network request. It can be implemented as follows:

```python
from bs4 import BeautifulSoup


def extract_title(html):
    # Parse the HTML string directly, using lxml as the parser
    soup = BeautifulSoup(html, 'lxml')
    # Find the first <title> tag in the document (normally inside <head>)
    title_tag = soup.find('title')
    # If a title was found, return its stripped text; otherwise return None
    return title_tag.get_text(strip=True) if title_tag else None


# Example usage
html_source = "<html><head><title>Example Page Title</title></head><body>...</body></html>"
title = extract_title(html_source)
if title:
    print("Title:", title)
```

This function tries to extract the content of the `<title>` tag from the HTML. If no `<title>` tag is found, it returns `None`.
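If installing `lxml` is not an option, the same lookup can be sketched with Python's built-in `html.parser` backend, which BeautifulSoup supports without extra packages. The function name `extract_title_builtin` below is hypothetical and chosen only for illustration. The sketch also shows the behavior on inputs that have no `<title>` tag.

```python
from bs4 import BeautifulSoup


def extract_title_builtin(html):
    """Return the stripped <title> text, or None if there is no <title> tag.

    Hypothetical variant for illustration: it uses the standard library's
    'html.parser' backend, so only beautifulsoup4 needs to be installed.
    """
    soup = BeautifulSoup(html, "html.parser")
    # soup.title is shorthand for soup.find('title'); it is None when absent
    title_tag = soup.title
    if title_tag is None:
        return None
    # get_text(strip=True) removes leading and trailing whitespace
    return title_tag.get_text(strip=True)


if __name__ == "__main__":
    with_title = "<html><head><title>  Example Page Title  </title></head></html>"
    without_title = "<html><body><p>No title here</p></body></html>"
    print(extract_title_builtin(with_title))
    print(extract_title_builtin(without_title))
```

The two parsers can treat badly malformed markup differently, so results may vary on broken pages. For well-formed documents like the example above, either backend should locate the `<title>` tag.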
