
Java Word to HTML, Java PDF to HTML

Date: 2023-05-05 02:26:35  Views: 222902  Author: 3171

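Step 1: Add the Jsoup, commons-lang, and commons-lang3 dependencies:

Jsoup is an HTML parser.
The commons-lang and commons-lang3 packages provide the utility classes used in the conversion (StringEscapeUtils and StringUtils).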

<dependency>
    <groupId>org.jsoup</groupId>
    <artifactId>jsoup</artifactId>
    <version>1.11.3</version>
</dependency>
<dependency>
    <groupId>commons-lang</groupId>
    <artifactId>commons-lang</artifactId>
    <version>2.6</version>
</dependency>
<dependency>
    <groupId>org.apache.commons</groupId>
    <artifactId>commons-lang3</artifactId>
    <version>3.4</version>
</dependency>

Step 2: Use it directly:

import org.apache.commons.lang.StringEscapeUtils;
import org.apache.commons.lang3.StringUtils;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.safety.Whitelist;

/**
 * Converts an HTML string to plain text while keeping line breaks.
 *
 * @author 碧蓝的大白
 */
public class Html2PlainText {

    public static String convert(String html) {
        if (StringUtils.isEmpty(html)) {
            return "";
        }

        Document document = Jsoup.parse(html);
        // Disable pretty-printing so html() keeps whitespace exactly as written.
        Document.OutputSettings outputSettings = new Document.OutputSettings().prettyPrint(false);
        document.outputSettings(outputSettings);

        // Mark every line break with a literal backslash-n placeholder.
        document.select("br").append("\\n");
        document.select("p").prepend("\\n");
        document.select("p").append("\\n");

        // Turn the placeholders into real newline characters.
        String newHtml = document.html().replaceAll("\\\\n", "\n");

        // Strip all tags; Whitelist.none() keeps text nodes only.
        String plainText = Jsoup.clean(newHtml, "", Whitelist.none(), outputSettings);

        // Decode entities such as &amp; back into plain characters.
        return StringEscapeUtils.unescapeHtml(plainText.trim());
    }
}

Usage test:
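The page does not include the test code itself. As a stand-in, the following is a minimal sketch (not the author's own test) that runs the conversion on a few inputs. The class name Html2PlainTextDemo and the sample strings are hypothetical, and the convert method repeats the same steps as Html2PlainText.convert only so the file compiles on its own. It assumes the three dependencies from Step 1 are on the classpath.

```java
import org.apache.commons.lang.StringEscapeUtils;
import org.apache.commons.lang3.StringUtils;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.safety.Whitelist;

// Hypothetical demo class; the name and the sample inputs are assumptions.
class Html2PlainTextDemo {

    // Same steps as Html2PlainText.convert, repeated so this sketch is self-contained.
    static String convert(String html) {
        if (StringUtils.isEmpty(html)) {
            return "";
        }
        Document.OutputSettings settings = new Document.OutputSettings().prettyPrint(false);
        Document document = Jsoup.parse(html);
        document.outputSettings(settings);

        // Literal backslash-n placeholders mark where line breaks belong.
        document.select("br").append("\\n");
        document.select("p").prepend("\\n");
        document.select("p").append("\\n");

        // Replace the placeholders with real newlines, then strip every tag.
        String marked = document.html().replaceAll("\\\\n", "\n");
        String plain = Jsoup.clean(marked, "", Whitelist.none(), settings);

        // Decode HTML entities left in the cleaned text.
        return StringEscapeUtils.unescapeHtml(plain.trim());
    }

    public static void main(String[] args) {
        String[] samples = {
            "<p>first</p><p>second &amp; third</p>",
            "line one<br>line two",
            ""
        };
        for (String sample : samples) {
            // Brackets make leading/trailing whitespace and empty results visible.
            System.out.println("[" + convert(sample) + "]");
        }
    }
}
```

One limitation of the placeholder approach: input text that already contains a literal backslash followed by n is also turned into a newline. On newer jsoup releases where Whitelist has been replaced by Safelist, the import and the Whitelist.none() call would need adjusting.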
