Use XPath to parse the content under the "The Zen of Python" heading on the page https://peps.python.org/pep-0020/#the-zen-of-python and save it to the text file zen.txt
Time: 2024-10-18 19:07:39 Views: 25
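To parse the official Python PEP 20 (The Zen of Python) page with XPath and save the result to `zen.txt`, you can fetch the page with the `requests` library, parse the HTML with `lxml`, and then write the extracted text to a file. A simple example: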
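```python
import requests
from lxml import etree

# Request the page
url = 'https://peps.python.org/pep-0020/#the-zen-of-python'
response = requests.get(url, timeout=10)

# Check whether the request succeeded
if response.status_code == 200:
    # Parse the HTML
    parser = etree.HTMLParser()
    tree = etree.fromstring(response.text, parser)
    # Select the <pre> block under the "The Zen of Python" heading.
    # Assumption: the section element carries the id from the URL fragment;
    # adjust the XPath if the page markup differs.
    matches = tree.xpath('//*[@id="the-zen-of-python"]//pre')
    if matches:
        # Plain etree elements have no text_content(); join the text nodes instead
        zen_text = ''.join(matches[0].itertext()).strip()
        with open('zen.txt', 'w', encoding='utf-8') as f:
            f.write(zen_text)
    else:
        print("No matching element found; check the XPath against the page source.")
else:
    print(f"Failed to fetch the page. Status code: {response.status_code}")
```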
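The extraction step can also be tried offline, without fetching the page. The sketch below is a minimal illustration under stated assumptions: `SAMPLE_HTML` is a hypothetical, simplified stand-in for the PEP 20 markup (not the real page source), and `extract_pre_text` is a helper name introduced here for illustration. It parses with `lxml.html` rather than `lxml.etree`, since elements built by `lxml.html` provide `text_content()`.

```python
from lxml import html

# Hypothetical, simplified stand-in for the PEP 20 page markup (an assumption,
# not the real page source).
SAMPLE_HTML = """
<html><body>
<section id="the-zen-of-python">
<h2>The Zen of Python</h2>
<div class="highlight"><pre>Beautiful is better than ugly.
Explicit is better than <span>implicit</span>.
</pre></div>
</section>
</body></html>
"""


def extract_pre_text(page_source: str, section_id: str) -> str:
    """Return the stripped text of the first <pre> under the element with the given id."""
    tree = html.fromstring(page_source)
    # lxml lets XPath variables ($sid) be passed as keyword arguments.
    matches = tree.xpath('//*[@id=$sid]//pre', sid=section_id)
    if not matches:
        raise ValueError(f"no <pre> found under id={section_id!r}")
    # text_content() gathers the text of the element and all its descendants.
    return matches[0].text_content().strip()


if __name__ == "__main__":
    print(extract_pre_text(SAMPLE_HTML, "the-zen-of-python"))
```

Checking for an empty match list before indexing avoids an `IndexError` when the page layout changes, which is the most likely way a scraper like this breaks.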