
Parsing documents in Python

Date: 2023-11-20 15:27:09 Views: 151

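How you parse a document in Python depends on the document type. Common document types include:

1. XML: use the standard-library `xml` package (for example `xml.etree.ElementTree`) or a third-party library such as `lxml`.
2. JSON: use the `json` module or a third-party library such as `ujson`.
3. CSV: use the `csv` module.
4. Excel: use a third-party library such as `openpyxl`, `xlrd`, or `pandas` (note that `xlrd` 2.0 and later reads only the legacy `.xls` format).
5. HTML: use a third-party library such as `BeautifulSoup` or `lxml`.

The following example parses an HTML document with `BeautifulSoup`:

```python
from bs4 import BeautifulSoup
import requests

# Send the request and fetch the HTML document
response = requests.get('https://www.example.com', timeout=10)
response.raise_for_status()  # fail early on HTTP errors
html = response.text

# Parse the HTML document with BeautifulSoup
soup = BeautifulSoup(html, 'html.parser')

# Get the page title (None if the page has no <title> element)
title = soup.title.get_text(strip=True) if soup.title else None

# Collect the href of every link that has one
links = [link['href'] for link in soup.find_all('a', href=True)]
```

The code above uses the `requests` library to send the request and fetch the HTML document, then uses `BeautifulSoup` to parse it and extract the page title and links.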
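For the formats in the list that the standard library covers (XML, JSON, and CSV), parsing can be sketched roughly as follows. This is a minimal illustration, not a definitive approach: the sample strings and the helper names `parse_xml_titles`, `parse_json`, and `parse_csv_rows` are hypothetical and were made up for this example. Only `xml.etree.ElementTree`, `json`, and `csv` from the standard library are used.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Hypothetical sample data, invented purely for illustration
XML_TEXT = (
    "<books>"
    "<book id='1'><title>Dune</title></book>"
    "<book id='2'><title>Emma</title></book>"
    "</books>"
)
JSON_TEXT = '{"name": "Alice", "tags": ["a", "b"]}'
CSV_TEXT = "name,age\nAlice,30\nBob,25\n"


def parse_xml_titles(text):
    """Return the text of every <title> found under a <book> element."""
    root = ET.fromstring(text)
    return [book.findtext("title") for book in root.iter("book")]


def parse_json(text):
    """Decode a JSON string into Python dicts, lists, strings, and numbers."""
    return json.loads(text)


def parse_csv_rows(text):
    """Return each CSV row as a dict keyed by the header line."""
    return list(csv.DictReader(io.StringIO(text)))


xml_titles = parse_xml_titles(XML_TEXT)
json_data = parse_json(JSON_TEXT)
csv_rows = parse_csv_rows(CSV_TEXT)
```

Note that `csv.DictReader` returns every field as a string, so numeric columns such as `age` need an explicit conversion (for example `int(row["age"])`). When reading from files rather than in-memory strings, `ET.parse(path)`, `json.load(file)`, and `csv.DictReader(file)` play the same roles.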
