html = """
<!DOCTYPE html>
<html>
<head>
    <title>xpath test</title>
</head>
<body>
<div price="99.8">
    <div>
        <ul>
            <li>时间</li>
            <li>地点</li>
            <li>任务</li>
        </ul>
    </div>
    <div id='testid' data-h="first">
        <h2>这里是个小标题</h2>
        <ol>
            <li data="one">1</li>
            <li data="two">2</li>
            <li data="three">3</li>
        </ol>
        <ul>
            <li code="84">84</li>
            <li code="104">104</li>
            <li code="223">223</li>
        </ul>
    </div>
    <div>
        <h3>这里是H3的内容
            <a href="http://www.baidu.com">百度一下</a>
            <ul>
                <li>test1</li>
                <li>test2</li>
            </ul>
        </h3>
    </div>
    <div id="go">
        <ul>
            <li>1</li>
            <li>2</li>
            <li>3</li>
            <li>4</li>
            <li>5</li>
            <li>6</li>
            <li>7</li>
            <li>8</li>
            <li>9</li>
            <li>10</li>
        </ul>
        <h3> 这里是H3 -2的内容 </h3>
    </div>
</div>
</body>
</html>
"""
III. Experiment steps
Use both CSS and XPath syntax to extract the following from the document above:
(1) Get the content of the title tag
(2) Get the content of all li tags
(3) Get the content of the first li tag among the descendants of id="go"
(4) Get the content of the last li tag among the descendants of id="go"
(5) Get the content of the fifth li tag among the descendants of id="go"
(6) Get all descendant nodes of the head tag
(7) Get all sibling nodes of the head tag
(8) Get all tags that have an id attribute
(9) Get all tags with id='testid'
(10) Get all tags that have an attribute value containing 't'
Time: 2023-06-01 15:03:08 Views: 152
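(1) CSS syntax (assuming soup = BeautifulSoup(html, 'lxml') from bs4):
title_content = soup.select_one('title').text
XPath syntax (assuming tree = etree.HTML(html) from lxml):
title_content = tree.xpath('//title/text()')[0]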
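The snippets in this answer use `soup` and `tree` without showing where they come from. A minimal sketch of one plausible setup follows, assuming `soup` is a BeautifulSoup object (bs4) and `tree` is an lxml element; the shortened `html` string here is a stand-in for the full document in the question.

```python
from bs4 import BeautifulSoup
from lxml import etree

# Shortened stand-in for the question's `html` string (assumption).
html = """<!DOCTYPE html>
<html>
<head><title>xpath test</title></head>
<body><div id="go"><ul><li>1</li><li>2</li></ul></div></body>
</html>"""

# `soup` backs the CSS examples; "html.parser" is Python's built-in parser.
soup = BeautifulSoup(html, "html.parser")
# `tree` backs the XPath examples; etree.HTML returns the root <html> element.
tree = etree.HTML(html)

css_title = soup.select_one("title").text
xpath_title = tree.xpath("//title/text()")[0]
print(css_title, xpath_title)
```

Both variables hold the same title text, so either library can serve as a cross-check for the other.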
(2) CSS syntax:
li_contents = [li.text for li in soup.select('li')]
XPath syntax:
li_contents = tree.xpath('//li/text()')
(3) CSS syntax:
first_li_content = soup.select_one('#go li:first-child').text
XPath syntax:
first_li_content = tree.xpath('//*[@id="go"]/ul/li[1]/text()')[0]
(4) CSS syntax:
last_li_content = soup.select_one('#go li:last-child').text
XPath syntax:
last_li_content = tree.xpath('//*[@id="go"]/ul/li[last()]/text()')[0]
(5) CSS syntax:
fifth_li_content = soup.select_one('#go li:nth-child(5)').text
XPath syntax:
fifth_li_content = tree.xpath('//*[@id="go"]/ul/li[5]/text()')[0]
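Note that `li:first-child`, `li:nth-child(5)` and `li[5]` count positions among the children of one parent. That matches tasks (3) to (5) only because every li under id="go" sits in a single ul. The sketch below shows a variant that counts over the whole set of descendant li elements instead; the generated `html` fragment is a stand-in for the id="go" block of the question, and the parser choices are assumptions.

```python
from bs4 import BeautifulSoup
from lxml import etree

# Stand-in for the id="go" block: ten <li> items followed by an <h3>.
items = "".join(f"<li>{n}</li>" for n in range(1, 11))
html = f"<div id='go'><ul>{items}</ul><h3>H3</h3></div>"

soup = BeautifulSoup(html, "html.parser")
tree = etree.HTML(html)

# CSS: select() returns matches in document order, so list indexing
# picks the first, fifth and last descendant <li>.
go_items = soup.select("#go li")
css_picks = [go_items[0].text, go_items[4].text, go_items[-1].text]

# XPath: the parentheses make the position apply to the whole
# descendant node-set rather than to each parent separately.
xpath_picks = [
    tree.xpath('(//*[@id="go"]//li)[1]/text()')[0],
    tree.xpath('(//*[@id="go"]//li)[5]/text()')[0],
    tree.xpath('(//*[@id="go"]//li)[last()]/text()')[0],
]
print(css_picks, xpath_picks)
```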
(6) CSS syntax:
head_descendants = [descendant.name for descendant in soup.select('head *')]
XPath syntax:
head_descendants = tree.xpath('//head//*')
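(7) CSS syntax:
head_siblings = [sibling.name for sibling in soup.select('head ~ *')]
XPath syntax:
head_siblings = tree.xpath('//head/following-sibling::*')
Note: both forms return only the siblings that follow head. That is sufficient here because head is the first child of html.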
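The `~` combinator and the `following-sibling` axis only look forward in the document. For an element that also has siblings before it, a sketch along the following lines collects both directions. The helper names `css_siblings` and `xpath_siblings` are hypothetical, and the shortened `html` is a stand-in for the question's document.

```python
from bs4 import BeautifulSoup
from lxml import etree

html = (
    "<html><head><title>xpath test</title></head><body>"
    "<div id='testid'><h2>heading</h2><ol><li>1</li></ol><ul><li>84</li></ul></div>"
    "</body></html>"
)
soup = BeautifulSoup(html, "html.parser")
tree = etree.HTML(html)

def css_siblings(tag):
    """Names of all sibling tags, in document order (hypothetical helper)."""
    # find_previous_siblings() yields nearest-first, so reverse it.
    previous = [t.name for t in reversed(tag.find_previous_siblings(True))]
    following = [t.name for t in tag.find_next_siblings(True)]
    return previous + following

def xpath_siblings(name):
    """Same result via the union of both sibling axes (hypothetical helper)."""
    query = f"//{name}/preceding-sibling::* | //{name}/following-sibling::*"
    return [el.tag for el in tree.xpath(query)]

head_css = css_siblings(soup.select_one("head"))
head_xpath = xpath_siblings("head")
ol_following_only = [t.name for t in soup.select("ol ~ *")]
ol_css = css_siblings(soup.select_one("ol"))
ol_xpath = xpath_siblings("ol")
print(head_css, ol_following_only, ol_css)
```

For head the result is the same as the forward-only query, while for ol the forward-only selector misses the preceding h2.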
(8) CSS syntax:
id_tags = [tag.name for tag in soup.select('[id]')]
XPath syntax:
id_tags = tree.xpath('//*[@id]')
(9) CSS syntax:
testid_tags = [tag.name for tag in soup.select('#testid')]
XPath syntax:
testid_tags = tree.xpath('//*[@id="testid"]')
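(10) CSS syntax (CSS has no wildcard for attribute names, so '[*="t"]' is not a valid selector; filter the attribute values in Python instead):
t_tags = [tag.name for tag in soup.find_all(lambda tag: any('t' in str(value) for value in tag.attrs.values()))]
XPath syntax (contains(@*, "t") would test only the first attribute of each element, so the predicate is applied to every attribute):
t_tags = tree.xpath('//*[@*[contains(., "t")]]')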
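To illustrate task (10), here is a small sketch that checks every attribute value of every element for the letter 't'. The fixture is a shortened stand-in for the question's document, and the `data-h="last"` attribute on the id="go" div is an assumption added to cover an element whose matching value is not its first attribute. The helper name `has_t` is hypothetical.

```python
from bs4 import BeautifulSoup
from lxml import etree

html = """<div price="99.8">
<div id="testid" data-h="first"><ol><li data="one">1</li><li data="two">2</li></ol></div>
<a href="http://www.baidu.com">link</a>
<div id="go" data-h="last"></div>
</div>"""
soup = BeautifulSoup(html, "html.parser")
tree = etree.HTML(html)

def has_t(tag):
    """True if any attribute value of the tag contains 't' (hypothetical helper)."""
    # bs4 stores multi-valued attributes such as class as lists, so join them.
    values = (" ".join(v) if isinstance(v, list) else v for v in tag.attrs.values())
    return any("t" in v for v in values)

# bs4: find_all() accepts a function that receives each tag.
bs4_names = [tag.name for tag in soup.find_all(has_t)]

# XPath: @*[contains(., "t")] applies the test to each attribute in turn.
xpath_names = [el.tag for el in tree.xpath('//*[@*[contains(., "t")]]')]
print(bs4_names, xpath_names)
```

Under XPath 1.0 rules, `contains(@*, "t")` converts the attribute node-set to the string value of its first node only, which is why the nested predicate is the safer form.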