Fetching a Proxy List with Apache HttpClient
Date: 2024-02-25 07:45:13
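Apache HttpClient is a widely used Java HTTP client library for sending HTTP requests and processing responses. If you need to send requests through a proxy server, you can use HttpClient to download a list of proxies from a public listing page. The following example fetches such a page with HttpClient and parses it with jsoup: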
```java
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

import org.apache.http.HttpHost;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.client.utils.URIBuilder;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClientBuilder;
import org.apache.http.util.EntityUtils;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class ProxyUtil {

    /** Downloads the proxy listing page and extracts one HttpHost per table row. */
    public static List<HttpHost> getProxyList() throws Exception {
        List<HttpHost> proxyList = new ArrayList<>();
        URI uri = new URIBuilder()
                .setScheme("http")
                .setHost("www.xicidaili.com")
                .setPath("/nn/")
                .setParameter("wt", "1")
                .build();
        HttpGet request = new HttpGet(uri);

        // try-with-resources closes the response and the client, releasing the connection.
        try (CloseableHttpClient client = HttpClientBuilder.create().build();
             CloseableHttpResponse response = client.execute(request)) {
            // UTF-8 is used only if the response does not declare a charset.
            String html = EntityUtils.toString(response.getEntity(), StandardCharsets.UTF_8);
            Document doc = Jsoup.parse(html);
            Elements rows = doc.select("table#ip_list tr");
            // Row 0 is the table header, so start at 1.
            for (int i = 1; i < rows.size(); i++) {
                Element row = rows.get(i);
                // :eq(n) is zero-based: the second cell holds the IP, the third the port.
                String ip = row.select("td:eq(1)").text();
                String port = row.select("td:eq(2)").text();
                if (ip.isEmpty() || !port.matches("\\d{1,5}")) {
                    continue; // skip rows without a usable address
                }
                proxyList.add(new HttpHost(ip, Integer.parseInt(port)));
            }
        }
        return proxyList;
    }

    public static void main(String[] args) {
        try {
            for (HttpHost proxy : getProxyList()) {
                System.out.println(proxy);
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
```
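This example fetches the proxy list from a free proxy site (www.xicidaili.com), parses the HTML document to extract each proxy's IP address and port, wraps each pair in an `HttpHost` object, and adds it to the list. To use a different proxy site, adjust the URL and the CSS selectors to match that site's page structure.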
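Some proxy sources publish a plain-text list with one `ip:port` entry per line instead of an HTML table. As a minimal sketch of adapting the parsing step to such a source, the hypothetical `PlainTextProxyParser` below uses only the JDK. The class name, the method name, and the input format are assumptions made for illustration and are not part of the original example.

```java
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.List;

public class PlainTextProxyParser {

    /** Parses lines of the form "ip:port"; blank or malformed lines are skipped. */
    public static List<InetSocketAddress> parse(String body) {
        List<InetSocketAddress> result = new ArrayList<>();
        for (String line : body.split("\\R")) {
            String entry = line.trim();
            int colon = entry.lastIndexOf(':');
            if (colon <= 0 || colon == entry.length() - 1) {
                continue; // missing host or missing port
            }
            String host = entry.substring(0, colon);
            String portText = entry.substring(colon + 1);
            if (!portText.matches("\\d{1,5}")) {
                continue; // port is not a short run of digits
            }
            int port = Integer.parseInt(portText);
            if (port < 1 || port > 65535) {
                continue; // outside the valid TCP port range
            }
            // createUnresolved avoids a DNS lookup while parsing.
            result.add(InetSocketAddress.createUnresolved(host, port));
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumed sample input; a real list would come from the HTTP response body.
        String sample = "10.0.0.1:8080\nnot-a-proxy\n10.0.0.2:3128\n";
        for (InetSocketAddress address : parse(sample)) {
            System.out.println(address.getHostString() + ":" + address.getPort());
        }
    }
}
```

Each parsed entry can then be converted for use with HttpClient via `new HttpHost(address.getHostString(), address.getPort())`.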
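Once a list of `HttpHost` entries is available, a request can be routed through one of them. The following sketch assumes HttpClient 4.3 or later and sets the proxy per request with `RequestConfig`. The class name `ProxyRequestExample`, the timeout values, the proxy address, and the target URL are placeholders chosen for illustration.

```java
import org.apache.http.HttpHost;
import org.apache.http.client.config.RequestConfig;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClientBuilder;

public class ProxyRequestExample {

    /** Builds a GET request configured to go through the given proxy. */
    public static HttpGet buildRequest(String url, HttpHost proxy) {
        RequestConfig config = RequestConfig.custom()
                .setProxy(proxy)
                .setConnectTimeout(5000) // milliseconds; assumed value
                .setSocketTimeout(5000)  // milliseconds; assumed value
                .build();
        HttpGet request = new HttpGet(url);
        request.setConfig(config);
        return request;
    }

    public static void main(String[] args) throws Exception {
        // Placeholder address; in practice take an entry from the fetched proxy list.
        HttpHost proxy = new HttpHost("127.0.0.1", 8080);
        HttpGet request = buildRequest("http://example.com/", proxy);
        try (CloseableHttpClient client = HttpClientBuilder.create().build();
             CloseableHttpResponse response = client.execute(request)) {
            System.out.println(response.getStatusLine());
        }
    }
}
```

Short timeouts are a reasonable choice here because entries on free proxy lists are frequently slow or unreachable, so a caller would typically try the next proxy when a request fails.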