Write a web crawler in Java to fetch images
Posted: 2023-05-12 16:06:42 Views: 73
Sure. Here is a simple Java web crawler example that fetches the images referenced by a web page:
```java
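import java.io.*;
import java.net.*;
import java.nio.charset.StandardCharsets;
import java.util.regex.*;

public class WebCrawler {
    // Matches the src attribute of <img> tags, quoted with either " or '.
    private static final Pattern IMG_SRC = Pattern.compile(
            "<img\\b[^>]*?\\ssrc\\s*=\\s*[\"']([^\"']+)[\"']", Pattern.CASE_INSENSITIVE);

    public static void main(String[] args) {
        String url = "https://www.example.com";
        String html = getHtml(url);
        Matcher m = IMG_SRC.matcher(html);
        while (m.find()) {
            try {
                // Resolve relative links (e.g. "/img/a.png") against the page URL.
                String imgUrl = new URL(new URL(url), m.group(1)).toString();
                downloadImage(imgUrl);
            } catch (MalformedURLException e) {
                // Skips values that are not fetchable URLs, such as data: URIs.
                e.printStackTrace();
            }
        }
    }

    // Downloads the page and returns its HTML as a single string.
    private static String getHtml(String url) {
        StringBuilder html = new StringBuilder();
        // try-with-resources closes the reader even if reading fails.
        // Assumption: the page is UTF-8 encoded.
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                new URL(url).openConnection().getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                // Keep line breaks so text on adjacent lines does not run together.
                html.append(line).append('\n');
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
        return html.toString();
    }

    // Saves one image into the current working directory.
    private static void downloadImage(String imgUrl) {
        try {
            URL url = new URL(imgUrl);
            // Use the path only, so a query string such as "?v=2" is not part of the file name.
            String path = url.getPath();
            String fileName = path.substring(path.lastIndexOf('/') + 1);
            if (fileName.isEmpty()) {
                return; // the URL ends with "/", so there is no usable file name
            }
            try (InputStream is = url.openStream();
                 FileOutputStream fos = new FileOutputStream(fileName)) {
                byte[] buffer = new byte[1024];
                int length;
                while ((length = is.read(buffer)) != -1) {
                    fos.write(buffer, 0, length);
                }
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}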
```
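This program fetches the HTML of the given URL, uses a regular expression to find the image links in it, and saves each image to the current working directory. You can modify it as needed to suit different websites and image formats.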
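
As one way to adapt the crawler to specific image formats, the sketch below filters a list of image links by file extension before anything is downloaded. The class `ImageLinkFilter`, its methods, and the `DEFAULT_EXTENSIONS` set are hypothetical names introduced here for illustration; they are not part of the program above. It assumes Java 9 or later (for `Set.of` and `List.of`) and that each link is a syntactically valid URI.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of the crawler above.
final class ImageLinkFilter {

    // Assumed default set of extensions; adjust it for the target site.
    static final Set<String> DEFAULT_EXTENSIONS = Set.of("jpg", "jpeg", "png", "gif", "webp");

    // Returns the last path segment of the URL, ignoring any query string or fragment.
    // URI.create throws IllegalArgumentException if the link is not a valid URI.
    static String fileNameOf(String url) {
        String path = URI.create(url).getPath();
        if (path == null) {
            return ""; // opaque URIs such as data: have no path
        }
        return path.substring(path.lastIndexOf('/') + 1);
    }

    // Returns the lower-cased extension without the dot, or "" if there is none.
    static String extensionOf(String url) {
        String name = fileNameOf(url);
        int dot = name.lastIndexOf('.');
        return dot < 0 ? "" : name.substring(dot + 1).toLowerCase(Locale.ROOT);
    }

    // Keeps only the links whose extension is in the allowed set, preserving order.
    static List<String> keepAllowed(List<String> urls, Set<String> allowed) {
        List<String> kept = new ArrayList<>();
        for (String url : urls) {
            if (allowed.contains(extensionOf(url))) {
                kept.add(url);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> found = List.of(
                "https://www.example.com/img/logo.PNG?v=3",
                "https://www.example.com/img/banner.svg",
                "https://www.example.com/photos/cat.jpg");
        // The .svg link is dropped because "svg" is not in DEFAULT_EXTENSIONS.
        System.out.println(keepAllowed(found, DEFAULT_EXTENSIONS));
    }
}
```

In the crawler, a check like `DEFAULT_EXTENSIONS.contains(ImageLinkFilter.extensionOf(imgUrl))` could run before each download. Filtering by extension is only a heuristic: some sites serve images from URLs with no extension, in which case checking the response's `Content-Type` header would be a more reliable test.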