How can I extract all the href attribute values from this code? <ul data-target="area"> <li data-id="0" data-type="bizcircle" class="filter__item--level3 filter__item--aside strong"> <a href="/zufang/yizhuangkaifaqu/rt200600000001/">不限</a> </li> <li class="filter__item--title">d</li> <li data-id="611100395" data-type="bizcircle" class="filter__item--level3 "> <a href="/zufang/daxingqita11/rt200600000001/">大兴其它</a> </li> <li class="filter__item--title">m</li> <li data-id="18335778" data-type="bizcircle" class="filter__item--level3 "> <a href="/zufang/majuqiao1/rt200600000001/">马驹桥</a> </li> <li class="filter__item--title">t</li> <li data-id="611100405" data-type="bizcircle" class="filter__item--level3 "> <a href="/zufang/tongzhouqita11/rt200600000001/">通州其它</a> </li> <li class="filter__item--title">y</li> <li data-id="18335750" data-type="bizcircle" class="filter__item--level3 "> <a href="/zufang/yizhuang1/rt200600000001/">亦庄</a> </li> <li data-id="611100595" data-type="bizcircle" class="filter__item--level3 "> <a href="/zufang/yizhuangkaifaquqita1/rt200600000001/">亦庄开发区其它</a> </li> </ul>
Time: 2023-10-05 08:05:24 Views: 157
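You can parse the HTML with the BeautifulSoup library, select every a tag with the find_all method, and then read the href attribute value from each one.
For example, suppose the HTML is as follows: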
```html
<ul data-target="area">
<li data-id="0" data-type="bizcircle" class="filter__item--level3 filter__item--aside strong">
<a href="/zufang/yizhuangkaifaqu/rt200600000001/">不限</a>
</li>
<li class="filter__item--title">d</li>
<li data-id="611100395" data-type="bizcircle" class="filter__item--level3 ">
<a href="/zufang/daxingqita11/rt200600000001/">大兴其它</a>
</li>
<li class="filter__item--title">m</li>
<li data-id="18335778" data-type="bizcircle" class="filter__item--level3 ">
<a href="/zufang/majuqiao1/rt200600000001/">马驹桥</a>
</li>
<li class="filter__item--title">t</li>
<li data-id="611100405" data-type="bizcircle" class="filter__item--level3 ">
<a href="/zufang/tongzhouqita11/rt200600000001/">通州其它</a>
</li>
<li class="filter__item--title">y</li>
<li data-id="18335750" data-type="bizcircle" class="filter__item--level3 ">
<a href="/zufang/yizhuang1/rt200600000001/">亦庄</a>
</li>
<li data-id="611100595" data-type="bizcircle" class="filter__item--level3 ">
<a href="/zufang/yizhuangkaifaquqita1/rt200600000001/">亦庄开发区其它</a>
</li>
</ul>
```
To get the href attribute value of every a tag, you can use the following code:
```python
from bs4 import BeautifulSoup
html = '''
<ul data-target="area">
<li data-id="0" data-type="bizcircle" class="filter__item--level3 filter__item--aside strong">
<a href="/zufang/yizhuangkaifaqu/rt200600000001/">不限</a>
</li>
<li class="filter__item--title">d</li>
<li data-id="611100395" data-type="bizcircle" class="filter__item--level3 ">
<a href="/zufang/daxingqita11/rt200600000001/">大兴其它</a>
</li>
<li class="filter__item--title">m</li>
<li data-id="18335778" data-type="bizcircle" class="filter__item--level3 ">
<a href="/zufang/majuqiao1/rt200600000001/">马驹桥</a>
</li>
<li class="filter__item--title">t</li>
<li data-id="611100405" data-type="bizcircle" class="filter__item--level3 ">
<a href="/zufang/tongzhouqita11/rt200600000001/">通州其它</a>
</li>
<li class="filter__item--title">y</li>
<li data-id="18335750" data-type="bizcircle" class="filter__item--level3 ">
<a href="/zufang/yizhuang1/rt200600000001/">亦庄</a>
</li>
<li data-id="611100595" data-type="bizcircle" class="filter__item--level3 ">
<a href="/zufang/yizhuangkaifaquqita1/rt200600000001/">亦庄开发区其它</a>
</li>
</ul>
'''
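# Parse the markup with Python's built-in HTML parser
soup = BeautifulSoup(html, 'html.parser')
# Collect every <a> tag in the document
links = soup.find_all('a')
for a in links:
    href = a['href']  # read the href attribute of this tag
    print(href)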
```
The output is:
```
/zufang/yizhuangkaifaqu/rt200600000001/
/zufang/daxingqita11/rt200600000001/
/zufang/majuqiao1/rt200600000001/
/zufang/tongzhouqita11/rt200600000001/
/zufang/yizhuang1/rt200600000001/
/zufang/yizhuangkaifaquqita1/rt200600000001/
```
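The extracted values are site-relative paths, so they usually need to be joined with the site's root before they can be requested. A minimal sketch using the standard library's `urljoin` follows; `BASE_URL` is a hypothetical placeholder (the original question does not name the site), and only two of the paths are repeated here for brevity.

```python
from urllib.parse import urljoin

# Hypothetical base URL for illustration only; replace it with the real site root.
BASE_URL = "https://example.com"

# Two of the relative paths printed above
relative_hrefs = [
    "/zufang/yizhuangkaifaqu/rt200600000001/",
    "/zufang/majuqiao1/rt200600000001/",
]

# urljoin resolves each relative path against the base URL
absolute_urls = [urljoin(BASE_URL, href) for href in relative_hrefs]
for url in absolute_urls:
    print(url)
```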
Here, find_all selects every matching element; its argument is the tag name as a string. In this case we select all a tags, iterate over them, read the href attribute value from each one, and print it.
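On a full page, `find_all('a')` would also pick up links outside the list, and `a['href']` raises a `KeyError` for an a tag that has no href. The sketch below shows two possible refinements: passing `href=True` to `find_all`, and using a CSS selector via `select` to stay inside the `ul`. The HTML here is a shortened, hypothetical variant of the snippet from the question, with one a tag lacking an href and one link outside the list added to show the difference.

```python
from bs4 import BeautifulSoup

# Shortened, hypothetical variant of the HTML from the question
html = '''
<ul data-target="area">
  <li class="filter__item--title">d</li>
  <li data-type="bizcircle"><a href="/zufang/daxingqita11/rt200600000001/">大兴其它</a></li>
  <li data-type="bizcircle"><a>no link here</a></li>
</ul>
<a href="/elsewhere/">outside the list</a>
'''

soup = BeautifulSoup(html, 'html.parser')

# href=True keeps only the <a> tags that actually carry an href attribute
hrefs_all = [a['href'] for a in soup.find_all('a', href=True)]

# CSS selector: only <a> tags with an href inside the target <ul>
hrefs_scoped = [a['href'] for a in soup.select('ul[data-target="area"] a[href]')]

print(hrefs_all)
print(hrefs_scoped)
```

The first list still contains `/elsewhere/`, while the scoped selector returns only the link inside the list.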