
c# - Extract all content using MUPDF.net - Stack Overflow


Is there a way to extract all content from a PDF using MuPDF.NET? For example, the following code uses the GetText() method to extract all text in HTML format:

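using MuPDF.NET;

var document = new Document("path-to-doc.pdf");
for (int i = 0; i < document.PageCount; i++) {
    // Load page i before extracting from it.
    var page = document.LoadPage(i);
    // Extract the page's text as HTML.
    var htmlContent = page.GetText("html");
}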

This will not necessarily include form fields, vector graphics, etc. How would I get all of these, along with their relative positions within the PDF?
