Disguising Selenium

Date: 2024-05-01 15:15:43 Views: 153


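### Setting a Proxy and Headers in Selenium (PhantomJS, Chrome, Firefox)

Selenium is a powerful tool for automated testing and web scraping. It drives a browser through actions such as opening pages, filling in forms, and clicking buttons. In some scenarios the browser has to reach a site through a proxy server, or send modified request headers so that its traffic resembles that of a real user. The sections below show how to set a proxy and request headers for PhantomJS, Chrome, and Firefox. The header examples override the User-Agent header only. The PhantomJS examples require Selenium 3 or earlier, because `webdriver.PhantomJS` was removed in Selenium 4.

#### PhantomJS: Proxy and Headers

**Setting a proxy**

PhantomJS supports two main ways of setting a proxy:

1. **Pass a `service_args` list**:

```python
from selenium import webdriver

ip_html = '192.168.0.28:808'  # proxy as IP:port (example value)

service_args = [
    '--proxy=%s' % ip_html,      # proxy as IP:port
    '--proxy-type=http',         # proxy type: http, socks5, or none
    '--load-images=no',          # do not load images (optional)
    '--disk-cache=yes',          # enable the disk cache (optional)
    '--ignore-ssl-errors=true',  # ignore HTTPS errors (optional)
]
driver = webdriver.PhantomJS(service_args=service_args)
```

2. **Use the `webdriver.Proxy()` class**:

```python
from selenium import webdriver
from selenium.webdriver.common.proxy import ProxyType

# PATH_PHANTOMJS is the path to the phantomjs executable, defined elsewhere
browser = webdriver.PhantomJS(PATH_PHANTOMJS)

proxy = webdriver.Proxy()
proxy.proxy_type = ProxyType.MANUAL
proxy.http_proxy = '1.9.171.51:800'
# Write the proxy into the capabilities, then start a new session that uses them
proxy.add_to_capabilities(webdriver.DesiredCapabilities.PHANTOMJS)
browser.start_session(webdriver.DesiredCapabilities.PHANTOMJS)
browser.get('http://1212.ip138.com/ic.asp')
```

To drop the proxy and connect directly, switch the proxy type to `ProxyType.DIRECT` and start a new session. (`ProxyType.DIRECT` means no proxy at all; `ProxyType.SYSTEM` is the value that follows the system proxy settings.)

```python
proxy.proxy_type = ProxyType.DIRECT
proxy.add_to_capabilities(webdriver.DesiredCapabilities.PHANTOMJS)
browser.start_session(webdriver.DesiredCapabilities.PHANTOMJS)
browser.get('http://1212.ip138.com/ic.asp')
```

**Setting headers**

For PhantomJS, the User-Agent is set by modifying `DesiredCapabilities`. The example below also fetches a proxy address from a proxy service and passes it through `service_args`:

```python
import json

import requests
from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities


def proxies():
    """Fetch a proxy IP from the proxy service and return it as IP:port."""
    r = requests.get("http://120.26.166.214:9840/JProxy/update/proxy/scoreproxy")
    rr = json.loads(r.text)
    hh = rr['ip'] + ":" + "8907"
    print(hh)
    return hh


ips = proxies()

# Copy the default capabilities so the shared class-level dict is not modified
dcap = dict(DesiredCapabilities.PHANTOMJS)
dcap["phantomjs.page.settings.userAgent"] = (
    "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/57.0.2987.98 Safari/537.36 LBBROWSER"
)
service_args = [
    '--proxy=%s' % ips,          # proxy as IP:port (for example 192.168.0.28:808)
    '--ssl-protocol=any',        # accept any SSL/TLS protocol version
    '--load-images=no',          # do not load images (optional)
    '--disk-cache=yes',          # enable the disk cache (optional)
    '--ignore-ssl-errors=true',  # ignore HTTPS errors (optional)
]
driver = webdriver.PhantomJS(desired_capabilities=dcap, service_args=service_args)
```

#### Chrome: Proxy and Headers

**Setting a proxy**

For Chrome, create an `Options` object (`ChromeOptions`) and add a `--proxy-server` argument. Pass the object through the `options=` keyword; the older `chrome_options=` keyword is deprecated.

```python
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

ip_html = '192.168.0.28:808'  # proxy as IP:port (example value)

chrome_options = Options()
chrome_options.add_argument('--proxy-server=%s' % ip_html)
driver = webdriver.Chrome(options=chrome_options)
```

**Setting headers**

The User-Agent is set through the same `Options` object.

```python
# Reuses the imports from the proxy example above
chrome_options = Options()
chrome_options.add_argument("--user-agent=Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.50 Safari/537.36")
driver = webdriver.Chrome(options=chrome_options)
```

#### Firefox: Proxy and Headers

**Setting a proxy**

Setting a proxy in Firefox is also straightforward: configure the proxy server on a `FirefoxProfile` and attach the profile to the browser options.

```python
from selenium import webdriver
from selenium.webdriver.firefox.firefox_profile import FirefoxProfile
from selenium.webdriver.firefox.options import Options

profile = FirefoxProfile()
profile.set_preference('network.proxy.type', 1)  # 1 = manual proxy configuration
profile.set_preference('network.proxy.http', '1.9.171.51')
profile.set_preference('network.proxy.http_port', 800)  # the port must be an int

options = Options()
options.profile = profile
driver = webdriver.Firefox(options=options)
```

**Setting headers**

For Firefox, the User-Agent can likewise be overridden through a `FirefoxProfile`.

```python
# Reuses the imports from the proxy example above
profile = FirefoxProfile()
profile.set_preference("general.useragent.override", "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.50 Safari/537.36")
options = Options()
options.profile = profile
driver = webdriver.Firefox(options=options)
```

### Summary

This article covered how to set a proxy and a User-Agent header in Selenium for PhantomJS, Chrome, and Firefox. These techniques help a script resemble a real user more closely, which supports more complex automated testing and scraping tasks. In practice, choose the configuration method that fits the browser and Selenium version in use.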
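The proxy and User-Agent settings above differ between browsers mainly in format: Chrome takes command-line arguments with the proxy as a single `IP:port` string, while Firefox takes preferences with the host and an integer port stored separately. The following is a minimal sketch of that mapping in plain Python. The helper names `split_proxy`, `chrome_arguments`, and `firefox_preferences` are hypothetical and are not part of Selenium.

```python
from typing import Dict, List, Tuple, Union


def split_proxy(address: str) -> Tuple[str, int]:
    """Split an 'IP:port' string into (host, port), with the port as an int."""
    host, sep, port = address.rpartition(":")
    if not sep or not host or not port.isdigit():
        raise ValueError("expected a proxy in IP:port form, got %r" % address)
    return host, int(port)


def chrome_arguments(proxy: str, user_agent: str) -> List[str]:
    """Build the command-line arguments to pass to Options.add_argument()."""
    split_proxy(proxy)  # validate the format only; Chrome takes the full string
    return [
        "--proxy-server=%s" % proxy,
        "--user-agent=%s" % user_agent,
    ]


def firefox_preferences(proxy: str, user_agent: str) -> Dict[str, Union[str, int]]:
    """Build the preferences to pass to FirefoxProfile.set_preference()."""
    host, port = split_proxy(proxy)
    return {
        "network.proxy.type": 1,          # 1 = manual proxy configuration
        "network.proxy.http": host,
        "network.proxy.http_port": port,  # stored as an int, not a string
        "general.useragent.override": user_agent,
    }
```

Each argument returned by `chrome_arguments` would be passed to `chrome_options.add_argument(...)`, and each key/value pair from `firefox_preferences` to `profile.set_preference(key, value)`, as in the browser sections above.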

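Selenium is an automated testing tool that simulates a person operating a browser. Automated runs often need to disguise the browser so that its behavior resembles that of a human visitor. Common techniques include:

1. Modify the User-Agent: override the browser's User-Agent string so that the site sees a different browser.
2. Use a proxy: send requests to the target site through a proxy server, which hides the real IP address and allows anonymous access.
3. Add random waits: pause for a random interval when requesting pages, mimicking the pacing of a human visitor so the site is less likely to flag the traffic as automated.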
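The random-wait technique in the list above can be sketched with the standard library alone. The function name `human_pause` and the default bounds are assumptions chosen for illustration; suitable intervals depend on the site being tested.

```python
import random
import time


def human_pause(min_seconds: float = 1.0, max_seconds: float = 3.0) -> float:
    """Sleep for a random interval within the given bounds and return it."""
    if min_seconds < 0 or max_seconds < min_seconds:
        raise ValueError("bounds must satisfy 0 <= min_seconds <= max_seconds")
    delay = random.uniform(min_seconds, max_seconds)
    time.sleep(delay)
    return delay
```

A script would call `human_pause()` between consecutive `driver.get(...)` calls or before clicking an element, so that requests do not arrive at a fixed machine-like rhythm.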
