Please optimize the following code: import requests from bs4 import BeautifulSoup from lxml import html url = "https://www.555dy1.com/voddetail/58397.html" # Send the request and fetch the page content response = requests.get(url) # Parse the HTML and extract the play addresses soup = BeautifulSoup(response.text, 'html.parser') links = soup.find_all('a', class_='module-play-list-link') # Print the play address of each link for link in links: videolist = 'https://www.555dy1.com/'+ link['href'] print(videolist) Output only the entries that contain -4-
Posted: 2024-02-26 09:51:42
You can change the final `for` loop to the following:
```
for link in links:
    videolist = 'https://www.555dy1.com/' + link['href']
    # Print only the addresses that contain "-4-"
    if '-4-' in videolist:
        print(videolist)
```
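This prints only the links that contain `-4-`. You may also want to add exception handling around `requests.get()` so that a failed request does not crash the program.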
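The exception handling mentioned above could look roughly like the sketch below. The helper names `fetch_html` and `filter_play_links`, the 10-second timeout, and the use of `urljoin` to build absolute URLs are assumptions made for illustration; they are not part of the original code.

```python
import requests
from bs4 import BeautifulSoup
from urllib.parse import urljoin

BASE_URL = "https://www.555dy1.com/"


def fetch_html(url, timeout=10):
    """Return the page text, or None if the request fails (hypothetical helper)."""
    try:
        response = requests.get(url, timeout=timeout)
        response.raise_for_status()  # treat 4xx/5xx responses as errors
    except requests.RequestException as exc:
        # RequestException is the base class of the errors raised by requests
        print(f"Request failed: {exc}")
        return None
    return response.text


def filter_play_links(page_html, marker="-4-"):
    """Return absolute play URLs whose address contains `marker` (hypothetical helper)."""
    soup = BeautifulSoup(page_html, "html.parser")
    links = soup.find_all("a", class_="module-play-list-link")
    # urljoin avoids a doubled slash when href already starts with "/"
    urls = [urljoin(BASE_URL, link["href"]) for link in links if link.get("href")]
    return [u for u in urls if marker in u]
```

To use it, call `fetch_html(BASE_URL + "voddetail/58397.html")` and pass the result to `filter_play_links` only when it is not `None`. Keeping the download and the filtering in separate functions also lets the filter be checked against a saved HTML string without touching the network.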
Related questions
import requests from lxml import etree url="https://www.80142.com/txt/82776.html" html=requests.get(url) print(html) gives "not get"
Your code has a syntax error (as posted, all statements are on a single line). The corrected code should be:
```python
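import requests
from lxml import etree

url = "https://www.80142.com/txt/82776.html"
html = requests.get(url)
# Prints the Response object (for example <Response [200]>), not the page body;
# use html.text to get the page content
print(html)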
```
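If you still get the `not get` error, check your network connection and whether the site is reachable. For example, try opening the site in a browser to see whether it loads.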
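To narrow down whether the problem is the network, the URL, or the server's response, a small diagnostic along these lines may help. The function name `diagnose`, the message wording, and the 10-second timeout are illustrative assumptions.

```python
import requests


def diagnose(url, timeout=10):
    """Return a short description of what happened when requesting `url`."""
    try:
        response = requests.get(url, timeout=timeout)
    except requests.exceptions.Timeout:
        return "timeout: the server did not answer in time"
    except requests.exceptions.ConnectionError:
        return "connection error: check the network or the host name"
    except requests.exceptions.RequestException as exc:
        # Any other requests error, such as a malformed URL
        return f"request error: {exc}"
    if response.ok:  # True for status codes below 400
        return f"ok: HTTP {response.status_code}, {len(response.text)} characters"
    return f"http error: HTTP {response.status_code}"
```

Calling `print(diagnose("https://www.80142.com/txt/82776.html"))` then reports which of these cases applies.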
import requests from bs4 import BeautifulSoup url = "https://www.555dy1.com/voddetail/58397.html" # Send the request and fetch the page content response = requests.get(url) # Parse the HTML and extract the play addresses soup = BeautifulSoup(response.text, 'html.parser') links = soup.find_all('a', class_='module-play-list-link') # print(links) # Print the play address of each link for link in links: print(link['href']) videolist = 'https://www.555dy1.com/'+ link['href'] print (videolist)
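You could use a set to remove duplicate play addresses, and the string method `startswith()` to check whether a link begins with a specific prefix, so that you can pick out the different players. Here is an example: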
```
import requests
from bs4 import BeautifulSoup

url = "https://www.555dy1.com/voddetail/58397.html"
response = requests.get(url)
soup = BeautifulSoup(response.text, 'html.parser')
links = soup.find_all('a', class_='module-play-list-link')

# A set stores each play address only once
playlists = set()
for link in links:
    href = link['href']
    # Keep only links whose path starts with /play/
    if href.startswith('/play/'):
        playlist = 'https://www.555dy1.com' + href
        playlists.add(playlist)

for playlist in playlists:
    print(playlist)
```
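In the code above, we first create an empty set, `playlists`, to hold the play addresses of the different players. While iterating over the links, we use `startswith()` to check whether each link begins with `/play/`; if it does, we turn it into a full play address and add it to the set. Finally, we iterate over the set and print every address. Because a set cannot contain duplicate elements, repeated addresses are dropped automatically.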
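One caveat: a set does not preserve the order in which addresses were added, so the printed order may differ from the order on the page. If that order matters, an alternative is to deduplicate with `dict.fromkeys`, as in the sketch below. The helper name `dedupe_keep_order` and the sample hrefs are hypothetical and only illustrate the idea.

```python
def dedupe_keep_order(hrefs, prefix="/play/", base="https://www.555dy1.com"):
    """Return unique absolute URLs for hrefs starting with `prefix`, in first-seen order."""
    full_urls = (base + href for href in hrefs if href.startswith(prefix))
    # dict keys are unique and keep insertion order (Python 3.7+)
    return list(dict.fromkeys(full_urls))


# Hypothetical hrefs, standing in for [link['href'] for link in links]
sample_hrefs = [
    "/play/58397-1-1.html",
    "/play/58397-2-1.html",
    "/play/58397-1-1.html",  # duplicate, dropped
    "/about.html",           # wrong prefix, dropped
]
for playlist in dedupe_keep_order(sample_hrefs):
    print(playlist)
```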