Selenium scraping: getting the href attribute of every a tag under a ul tag in HTML

Date: 2024-03-25 10:41:32

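To get the `href` attribute of every `a` tag under a `ul` tag in HTML, use Selenium's `find_elements()` method with an XPath locator, then call `get_attribute()` on each element. The older `find_element_by_xpath()` and `find_elements_by_xpath()` helpers were removed in Selenium 4.3, so the example uses the `By.XPATH` form instead:

```python
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get("http://example.com")

# Locate the ul tag (find_element returns the first match on the page)
ul = driver.find_element(By.XPATH, "//ul")

# Find every a tag nested anywhere inside that ul
links = ul.find_elements(By.XPATH, ".//a")

# Loop over the a tags and read each href attribute
for link in links:
    href = link.get_attribute("href")
    print(href)

# Close the browser
driver.quit()
```

The code first uses `find_element()` to locate the `ul` tag, then calls `find_elements()` on that element with the relative XPath `.//a`, which matches `a` tags among all of its descendants, not only its direct children. It then loops over the returned elements, reads each `href` value with `get_attribute()`, and prints it. Finally, `quit()` closes the browser.

If the `ul` or `a` tags you are looking for have a specific class name or ID, add an `@class` or `@id` predicate to the XPath expression to narrow the search. For example, `//ul[@class='menu']//a` selects all `a` tags under a `ul` tag whose `class` attribute is exactly `menu`.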
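The `@class` / `@id` narrowing described above can be wrapped in a small helper. This is only a sketch under a few assumptions: the names `build_link_xpath` and `collect_hrefs` are hypothetical, the class and ID values are assumed to contain no single quotes, and the locator strategy is passed as the plain string `"xpath"`, which is the value of Selenium's `By.XPATH` constant, so that the helper itself needs no imports.

```python
def build_link_xpath(ul_class=None, ul_id=None):
    """Build an XPath that selects every a tag under a matching ul tag.

    Hypothetical helper; assumes the values contain no single quotes.
    """
    predicates = []
    if ul_class is not None:
        predicates.append(f"@class='{ul_class}'")
    if ul_id is not None:
        predicates.append(f"@id='{ul_id}'")
    condition = f"[{' and '.join(predicates)}]" if predicates else ""
    return f"//ul{condition}//a"


def collect_hrefs(driver, ul_class=None, ul_id=None):
    """Return the non-empty href values of all links under the matching ul."""
    # "xpath" is the string value of selenium's By.XPATH constant
    links = driver.find_elements("xpath", build_link_xpath(ul_class, ul_id))
    hrefs = [link.get_attribute("href") for link in links]
    # get_attribute returns None when an a tag has no href, so drop those
    return [href for href in hrefs if href]
```

With a live session, this could be called as `collect_hrefs(driver, ul_class="menu")` after `driver.get(...)` and before `driver.quit()`. Issuing one combined XPath from the driver collects links from every matching `ul` on the page, whereas the two-step version looks only inside the first `ul` it finds.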
