Please optimize the following code:

    import requests
    from bs4 import BeautifulSoup
    from lxml import html

    url = "https://www.555dy1.com/voddetail/58397.html"

    # Send the request and fetch the page content
    response = requests.get(url)

    # Parse the HTML and extract the playback links
    soup = BeautifulSoup(response.text, 'html.parser')
    links = soup.find_all('a', class_='module-play-list-link')

    # Print the playback URL of each link
    for link in links:
        videolist = 'https://www.555dy1.com/' + link['href']
        print(videolist)

Only output the entries that contain `-4-`.

Time: 2024-02-26 15:51:42 Views: 125

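You can change the final `for` loop to the following:

```
for link in links:
    videolist = 'https://www.555dy1.com/' + link['href']
    if '-4-' in videolist:
        print(videolist)
```

This prints only the links that contain `-4-`. It is also worth adding exception handling around `requests.get()` so that a failed request does not crash the program.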
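The filtering and the suggested exception handling could be combined roughly as follows. This is a minimal sketch, not a definitive implementation: the helper names `fetch_html` and `filter_play_urls`, the 10-second timeout, and the sample hrefs are all assumptions made for illustration. It uses `urljoin` instead of string concatenation so that an href beginning with `/` does not produce a double slash after the domain.

```python
from urllib.parse import urljoin

BASE_URL = "https://www.555dy1.com/"


def fetch_html(url, timeout=10):
    """Return the page body as text, or None if the request fails."""
    # Imported here so the pure helper below also works without requests installed.
    import requests

    try:
        response = requests.get(url, timeout=timeout)
        response.raise_for_status()  # raise on 4xx/5xx status codes
    except requests.RequestException as exc:
        print(f"Request failed: {exc}")
        return None
    return response.text


def filter_play_urls(hrefs, marker="-4-", base=BASE_URL):
    """Join each href onto base and keep only the URLs that contain marker."""
    urls = (urljoin(base, href) for href in hrefs)
    return [url for url in urls if marker in url]


# Hypothetical hrefs, standing in for the link['href'] values on the real page.
sample_hrefs = [
    "/vodplay/58397-3-1.html",
    "/vodplay/58397-4-1.html",
    "/vodplay/58397-4-2.html",
]

for url in filter_play_urls(sample_hrefs):
    print(url)
```

With the original script, the hrefs would come from `[link['href'] for link in links]`, and the parsing step would be skipped whenever `fetch_html` returns `None`.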

    import requests
    from lxml import etree

    url = "https://www.80142.com/txt/82776.html"
    html = requests.get(url)
    print(html)

This produces "not get".

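If the code is written on a single line as posted, it is a syntax error. Each statement needs its own line:

```python
import requests
from lxml import etree

url = "https://www.80142.com/txt/82776.html"
html = requests.get(url)
print(html)
```

If you still get the `not get` error, check your network connection and whether the site is reachable. You can try opening the site in a browser to see whether it loads. Also note that `print(html)` prints the `Response` object itself (for example `<Response [200]>`), not the page content; use `html.text` to get the body.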
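To see what actually came back from the server, it can help to print the status code and the start of the body rather than the `Response` object. The sketch below is only an illustration: `summarize_response` and `fetch` are hypothetical helper names, and the `User-Agent` header is an assumption that may or may not matter for this particular site.

```python
def summarize_response(response, preview_chars=60):
    """Return a one-line summary: status code plus the start of the body."""
    body = response.text[:preview_chars].replace("\n", " ")
    return f"status={response.status_code} body={body!r}"


def fetch(url, timeout=10):
    """Send a GET request with a browser-like User-Agent (an assumption)."""
    # Imported here so summarize_response can be used without requests installed.
    import requests

    headers = {"User-Agent": "Mozilla/5.0"}
    return requests.get(url, headers=headers, timeout=timeout)
```

A call such as `print(summarize_response(fetch(url)))` then shows whether the server answered with an error status or with the expected HTML.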

    import requests
    from bs4 import BeautifulSoup

    url = "https://www.555dy1.com/voddetail/58397.html"

    # Send the request and fetch the page content
    response = requests.get(url)

    # Parse the HTML and extract the playback links
    soup = BeautifulSoup(response.text, 'html.parser')
    links = soup.find_all('a', class_='module-play-list-link')
    # print(links)

    # Print the playback URL of each link
    for link in links:
        print(link['href'])
        videolist = 'https://www.555dy1.com/' + link['href']
        print(videolist)

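You could use a set to remove duplicate playback URLs, and the string method `startswith()` to check whether a link begins with a specific prefix, which lets you pick out the different players. Here is an example:

```
import requests
from bs4 import BeautifulSoup

url = "https://www.555dy1.com/voddetail/58397.html"
response = requests.get(url)
soup = BeautifulSoup(response.text, 'html.parser')
links = soup.find_all('a', class_='module-play-list-link')

# Collect full playback URLs; the set drops duplicates
playlists = set()
for link in links:
    href = link['href']
    if href.startswith('/play/'):
        playlist = 'https://www.555dy1.com' + href
        playlists.add(playlist)

for playlist in playlists:
    print(playlist)
```

In the code above, we first create an empty set `playlists` to hold the playback URLs of the different players. While iterating over the links, `startswith()` checks whether each href begins with `/play/`; if it does, the href is turned into a full playback URL and added to the set. Finally, we iterate over the set and print each URL. Because a set cannot contain duplicate elements, repeated URLs are dropped automatically. Note that a set does not preserve insertion order, so the URLs may print in a different order than they appear on the page.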
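If the page order of the links matters, one alternative to a set is `dict.fromkeys()`, which removes duplicates while keeping the first occurrence of each item in order. This is a sketch under assumptions: `unique_play_urls` is a hypothetical helper name and the sample hrefs are made up for illustration.

```python
def unique_play_urls(hrefs, prefix="/play/", base="https://www.555dy1.com"):
    """Return full URLs for hrefs starting with prefix, without duplicates,
    in the order they first appear."""
    full_urls = (base + href for href in hrefs if href.startswith(prefix))
    # dict keys are unique and keep insertion order (Python 3.7+).
    return list(dict.fromkeys(full_urls))


# Hypothetical hrefs, standing in for the link['href'] values on the real page.
sample_hrefs = [
    "/play/58397-1-1.html",
    "/play/58397-2-1.html",
    "/play/58397-1-1.html",   # duplicate, dropped
    "/voddetail/58397.html",  # wrong prefix, skipped
]

for url in unique_play_urls(sample_hrefs):
    print(url)
```

In the original script the input would be `[link['href'] for link in links]`.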



