Parsing HTML in Spring Boot
Time: 2023-11-17 09:04:14 Views: 150
You can use the Jsoup library to parse HTML. The following example parses HTML with Jsoup inside a Spring Boot controller:
```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;
import org.springframework.stereotype.Controller;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.ResponseBody;

@Controller
public class HtmlParserController {

    @GetMapping("/parseHtml")
    @ResponseBody
    public String parseHtml() {
        // Sample HTML document to parse.
        String html = "<html><head><title>Jsoup Example</title></head>"
                + "<body><p>Jsoup is a Java library for working with real-world HTML. It provides a very convenient API for extracting and manipulating data, using the best of DOM, CSS, and jquery-like methods.</p>"
                + "<p><a href=\"https://jsoup.org/\">jsoup</a> is a Java library for working with real-world HTML. It provides a very convenient API for extracting and manipulating data, using the best of DOM, CSS, and jquery-like methods.</p>"
                + "</body></html>";

        // Parse the HTML string into a Document.
        Document doc = Jsoup.parse(html);

        // Select every <p> element with a CSS selector.
        Elements paragraphs = doc.select("p");

        // Append the text of each paragraph, one per line.
        StringBuilder sb = new StringBuilder();
        for (Element paragraph : paragraphs) {
            sb.append(paragraph.text()).append("\n");
        }
        return sb.toString();
    }
}
```
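In the example above, `Jsoup.parse()` parses the HTML string into a `Document` object. `doc.select()` then selects all `<p>` elements, and `Element.text()` returns the text content of each one. Finally, the text of all paragraphs is concatenated, one per line, and returned.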
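The same `select()` approach works for attributes as well as text. As a minimal sketch, assuming you want the link in the sample HTML rather than the paragraph text, the selector `a[href]` matches anchors that have an `href` attribute, and `Element.attr()` reads its value. The class name `LinkExtractor` and the method `extractLinks` are hypothetical names used only for illustration; they are not part of Jsoup or of the controller above.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, used only to illustrate attribute extraction.
public class LinkExtractor {

    // Returns one "text -> href" entry for every <a> element that has an href attribute.
    public static List<String> extractLinks(String html) {
        Document doc = Jsoup.parse(html);
        List<String> links = new ArrayList<>();
        for (Element link : doc.select("a[href]")) {
            links.add(link.text() + " -> " + link.attr("href"));
        }
        return links;
    }

    public static void main(String[] args) {
        String html = "<html><body><p><a href=\"https://jsoup.org/\">jsoup</a>"
                + " is a Java library for working with real-world HTML.</p></body></html>";
        // Prints: jsoup -> https://jsoup.org/
        extractLinks(html).forEach(System.out::println);
    }
}
```

Inside a Spring Boot application, a method like this could be called from a controller such as `HtmlParserController` in the same way the paragraph text is built there.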