Use a regular expression to extract “https://v.cdnlz2.com/20231210/27345_07b4cdb4” from “https://v.cdnlz2.com/20231210/27345_07b4cdb4/index.m3u8”
Time: 2024-01-23 09:03:18 Views: 269
A regular expression can pull a specific part out of a string. To extract “https://v.cdnlz2.com/20231210/27345_07b4cdb4” from “https://v.cdnlz2.com/20231210/27345_07b4cdb4/index.m3u8”, you can use the following regular expression:
```python
import re

url = "https://v.cdnlz2.com/20231210/27345_07b4cdb4/index.m3u8"
# Escape the dots so they match a literal "." rather than any character.
pattern = r"(https://v\.cdnlz2\.com/20231210/27345_07b4cdb4)"
result = re.search(pattern, url)
if result:
    extracted_url = result.group(1)
    print(extracted_url)  # Output: https://v.cdnlz2.com/20231210/27345_07b4cdb4
```
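The pattern wraps the literal prefix "https://v.cdnlz2.com/20231210/27345_07b4cdb4" in a capture group. `re.search()` scans the string for the first match; if one is found, `result.group(1)` returns the text captured by that group. Note that an unescaped `.` in a regular expression matches any character, so for a strict match the dots in the host name should be written as `\.`. Because the pattern spells out the whole target string, it only works for this one URL.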
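Since the pattern above hard-codes the answer, a more general variant may be useful. The following is a minimal sketch, assuming the goal is to strip the final `*.m3u8` file name from any playlist URL; the names `M3U8_BASE` and `playlist_base` are hypothetical and not part of the original answer.

```python
import re
from typing import Optional

# Assumed URL shape: <scheme>://<anything>/<file>.m3u8, optionally followed
# by a query string or fragment. Group 1 captures everything before the
# last "/" that precedes the playlist file name.
M3U8_BASE = re.compile(r"^(https?://.+)/[^/?#]+\.m3u8(?:[?#].*)?$")


def playlist_base(url: str) -> Optional[str]:
    """Return the URL without its trailing .m3u8 file name, or None if it does not match."""
    match = M3U8_BASE.match(url)
    return match.group(1) if match else None


print(playlist_base("https://v.cdnlz2.com/20231210/27345_07b4cdb4/index.m3u8"))
# Output: https://v.cdnlz2.com/20231210/27345_07b4cdb4
```

For this particular URL, `url.rsplit("/", 1)[0]` gives the same result without a regular expression; the regex version is mainly worth it when non-playlist URLs need to be rejected.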
Related questions

    import time

    for page in range(0, 10):  # the pages follow a regular pattern, so build them in a for loop
        page = page * 10
        time.sleep(10)
        # Build the actual URL for this page
        url = f'https://www.maoyan.com/board/4?timeStamp=20&offset={page}'
        headers = {
            'Cookie': '__mta=250911417.1684852683551.1684857857025.1684857868530.6; ci=59%2C%E6%88%90%E9%83%BD; ci.sig=6-eKn999I8699yCUqmUVkEAA3RA; featrues=[object Object]; featrues.sig=KbQquuOrr42L3kMHbtKc319ems8; _lxsdk_cuid=188490b5844c8-0dc2de72154915-3e604809-144000-188490b5845c8; Hm_lvt_703e94591e87be68cc8da0da7cbd0be2=1684852660; uuid_n_v=v1; uuid=6A080B90F97711EDB1C0EF0226805F3B350AFA999F724C6B9013F8578B9E816D; _csrf=c33def8c4b7063982594d90e32c67030753d41029d577dd8c0b7300fd842a5fd; _lx_utm=utm_source%3DBaidu%26utm_medium%3Dorganic; _lxsdk=6A080B90F97711EDB1C0EF0226805F3B350AFA999F724C6B9013F8578B9E816D; __mta=250911417.1684852683551.1684852690037.1684856768008.4; Hm_lpvt_703e94591e87be68cc8da0da7cbd0be2=1684857868; _lxsdk_s=188494a0542-07b-504-ec2%7C%7C8',
            'Host': 'www.maoyan.com',
            'Referer': 'https://www.maoyan.com/board?timeStamp=1684853042241&channelId=40011&index=1&signKey=683c07ec69bec48fc2589b25eec30cf8&sVersion=1&webdriver=false',  # anti-hotlinking: tells the server which page the request came from
            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.198 Safari/537.36',
        }

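This code builds the request headers, including Cookie, Host, Referer and User-Agent. These help the request look like it comes from a browser, which makes it less likely that the site blocks it. The url variable is built from the site's paging pattern: the for loop walks through the pages, and each page holds 10 films. time.sleep(10) spaces the requests out so that the IP is not banned for hitting the site too often.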
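The paging logic described above can be separated from the request itself. The following is a minimal sketch that only generates the page URLs and sends no requests; the function name `board_urls` and its parameters are hypothetical, and the page size of 10 is taken from the explanation above.

```python
from typing import Iterator

# Base address and fixed query parameter copied from the snippet in the question.
BASE_URL = "https://www.maoyan.com/board/4"


def board_urls(pages: int = 10, page_size: int = 10) -> Iterator[str]:
    """Yield one URL per page; offset advances by page_size each time."""
    for page in range(pages):
        yield f"{BASE_URL}?timeStamp=20&offset={page * page_size}"


for url in board_urls(pages=3):
    print(url)
```

Keeping URL generation in its own function avoids reassigning the loop variable and makes the offsets easy to check before any delay or request is added.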