Parsing from a string

String html = "<html><head><title>Java Crawler</title></head>"
				+ "<body>Content section</body></html>";
Document doc = Jsoup.parse(html);
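
Once the string is parsed, the Document can be queried. Here is a minimal sketch, assuming jsoup is on the classpath. The class name ParseStringExample and its two helper methods are hypothetical names chosen for illustration; title() and body().text() are jsoup calls.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class ParseStringExample {
    // Hypothetical helper: parse an HTML string and return the text of its <title>.
    static String titleOf(String html) {
        Document doc = Jsoup.parse(html);
        return doc.title();
    }

    // Hypothetical helper: parse an HTML string and return the text inside <body>.
    static String bodyTextOf(String html) {
        Document doc = Jsoup.parse(html);
        return doc.body().text();
    }

    public static void main(String[] args) {
        String html = "<html><head><title>Java Crawler</title></head>"
                + "<body>Content section</body></html>";
        System.out.println(titleOf(html));
        System.out.println(bodyTextOf(html));
    }
}
```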

Simple fetch from a URL

Document doc = Jsoup.connect("http://example.com/").get();

Fetching from a URL with request parameters

Document doc = Jsoup.connect("http://example.com")
				  .data("query", "Java")
				  .userAgent("Mozilla")
				  .cookie("auth", "token")
				  .timeout(3000)
				  .post();

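Note: add whichever parameters you need. None of the parameters above is required; they only show that request settings can be configured this way, and which ones to use depends on your situation. post() can be replaced with get(). More advanced usage is not covered here.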
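
To make the get()/post() choice concrete, here is a rough sketch that builds the same request and selects the HTTP method with a flag. It assumes jsoup is on the classpath. The class name ConnectExample and the helper buildRequest are hypothetical. Nothing is sent until execute() is called, so the request can be inspected first.

```java
import java.io.IOException;

import org.jsoup.Connection;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class ConnectExample {
    // Hypothetical helper: configure a request without sending it.
    static Connection buildRequest(String url, boolean usePost) {
        return Jsoup.connect(url)
                .data("query", "Java")      // request parameter
                .userAgent("Mozilla")       // User-Agent header
                .cookie("auth", "token")    // request cookie
                .timeout(3000)              // timeout in milliseconds
                .method(usePost ? Connection.Method.POST : Connection.Method.GET);
    }

    public static void main(String[] args) throws IOException {
        Connection conn = buildRequest("http://example.com/", false);
        // execute() sends the request; parse() turns the response body into a Document.
        Document doc = conn.execute().parse();
        System.out.println(doc.title());
    }
}
```

Calling conn.get() or conn.post() directly, as in the snippet above, is the shorter form when the method is known in advance.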

Loading from a file

File file = new File("F:\\test.html");
Document doc = Jsoup.parse(file, "UTF-8");
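
If you want to try file loading without depending on a fixed path such as F:\test.html, one option is to write a temporary file first. This is only a sketch, assuming jsoup is on the classpath. The class name ParseFileExample and both helpers are hypothetical, and IOException is wrapped in UncheckedIOException purely to keep the example short.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class ParseFileExample {
    // Hypothetical helper: parse a UTF-8 HTML file and return its <title> text.
    static String titleFromFile(File file) {
        try {
            Document doc = Jsoup.parse(file, "UTF-8");
            return doc.title();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Hypothetical helper: write html to a temp file, parse it, then delete the file.
    static String titleViaTempFile(String html) {
        try {
            Path tmp = Files.createTempFile("jsoup-demo", ".html");
            try {
                Files.write(tmp, html.getBytes(StandardCharsets.UTF_8));
                return titleFromFile(tmp.toFile());
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(titleViaTempFile(
                "<html><head><title>From file</title></head><body></body></html>"));
    }
}
```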

Original post: https://blog.csdn.net/weixin_45792450/article/details/104102868