Integrate this code: text_array = [] boxes = soup.find_all('div', class_=lambda x: x and 'fmJAkJ' in x.split()) for i, box in enumerate(boxes): if i >= 2: break text_array.append(box.get_text()) print(text_array)
Time: 2024-04-28 12:19:52 Views: 122
Sure, here is the complete code:
```
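from typing import List

from bs4 import BeautifulSoup
from playwright.sync_api import Playwright, sync_playwright


def run(playwright: Playwright) -> List[str]:
    # Open the page in Chromium and grab the rendered HTML.
    browser = playwright.chromium.launch(headless=False)
    context = browser.new_context()
    page = context.new_page()
    page.goto("https://www.baidu.com")
    html = page.content()
    browser.close()

    # Collect the text of the first two divs whose class list includes 'fmJAkJ'.
    soup = BeautifulSoup(html, 'html.parser')
    text_array = []
    boxes = soup.find_all('div', class_=lambda x: x and 'fmJAkJ' in x.split())
    for i, box in enumerate(boxes):
        if i >= 2:
            break
        text_array.append(box.get_text())
    return text_array


# sync_playwright() supplies the Playwright instance that run() expects.
with sync_playwright() as playwright:
    print(run(playwright))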
```
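This code loads the Baidu home page with Playwright and parses its HTML with BeautifulSoup. It finds every `div` element whose `class` attribute contains `fmJAkJ`, then appends the text of the first two matches to the `text_array` list, which the function returns. The last part of the code calls the function and prints the result, so you can check whether the text was retrieved.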
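To check the matching logic without opening a browser, the same lookup can be run against a fixed HTML string. The following is only a minimal sketch: `SAMPLE_HTML` and the helper `first_texts` are hypothetical names made up for illustration, and the sample markup is not taken from the real page. It also swaps the lambda and the `enumerate` loop for a plain class string and the `limit` argument of `find_all`, which should select the same elements.

```python
from typing import List

from bs4 import BeautifulSoup

# Hypothetical sample markup for illustration; the real page's HTML will differ.
SAMPLE_HTML = """
<div class="fmJAkJ card">first</div>
<div class="other">skipped</div>
<div class="card fmJAkJ">second</div>
<div class="fmJAkJ">third</div>
"""


def first_texts(html: str, class_name: str, limit: int = 2) -> List[str]:
    """Return the text of the first `limit` divs whose class list has `class_name`."""
    soup = BeautifulSoup(html, "html.parser")
    # A string passed to class_ is compared against each CSS class of the tag;
    # limit stops the search once enough matches have been found.
    boxes = soup.find_all("div", class_=class_name, limit=limit)
    return [box.get_text(strip=True) for box in boxes]


print(first_texts(SAMPLE_HTML, "fmJAkJ"))  # ['first', 'second']
```

If this prints an empty list for your real page, the class name is the likely culprit: names such as `fmJAkJ` look auto-generated and may change between page loads or builds.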