java - Converting Docx file to PDF and sending to ServletResponse outputStream - Stack Overflow

        try {
            response.setHeader("Content-Disposition", "inline; filename=\"" + URLEncoder.encode(file.getFileName(), StandardCharsets.UTF_8) + ".pdf\"");
        } catch (Exception e) {
            log.error("Happened error when setting headers Content-Disposition: ", e);
        }
        response.setCharacterEncoding("UTF-8");
        response.setContentType("application/pdf");
        try (InputStream inputStream = minioClient.getObject(GetObjectArgs.builder()
                .bucket(bucketName)
                .object(file.getS3Id())
                .build())) {

            if (file.getFileExtension() == FileExtension.PDF) {
                IOUtils.copy(inputStream, response.getOutputStream());
                response.getOutputStream().flush();
            } else {
                XWPFDocument doc = new XWPFDocument(inputStream);
                PdfOptions options = PdfOptions.create();
                options.fontEncoding("UTF-8");
                options.fontProvider((familyName, encoding, size, style, color) -> {
                    try {
                        BaseFont baseFont = BaseFont.createFont(
                                "classpath:fonts/Times_New_Roman.ttf", BaseFont.IDENTITY_H, BaseFont.EMBEDDED
                        );
                        Font font = new Font(baseFont, size, style, color);
                        if (familyName != null)
                            font.setFamily( familyName );
                        return font;
                    }
                    catch (Exception e) {
                        e.printStackTrace();
                        return null;
                    }
                });
                PdfConverter.getInstance().convert(doc, response.getOutputStream(), options);
                response.getOutputStream().flush();
                doc.close();
            }
        }

I am getting an OutOfMemoryError:

java.lang.OutOfMemoryError: Java heap space
    at java.base/java.util.Arrays.copyOf(Arrays.java:3537) ~[na:na]
    at java.base/java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:100) ~[na:na]
    at java.base/java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:130) ~[na:na]
    at java.base/java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:81) ~[na:na]
    at java.base/java.io.BufferedOutputStream.write(BufferedOutputStream.java:127) ~[na:na]
    at com.lowagie.text.pdf.OutputStreamCounter.write(Unknown Source) ~[itext-2.1.7.jar!/:na]
    at java.base/java.io.ByteArrayOutputStream.writeTo(ByteArrayOutputStream.java:161) ~[na:na]
    at com.lowagie.text.pdf.PdfStream.toPdf(Unknown Source) ~[itext-2.1.7.jar!/:na]
    at com.lowagie.text.pdf.PdfIndirectObject.writeTo(Unknown Source) ~[itext-2.1.7.jar!/:na]
    at com.lowagie.text.pdf.PdfWriter$PdfBody.add(Unknown Source) ~[itext-2.1.7.jar!/:na]
    at com.lowagie.text.pdf.PdfWriter$PdfBody.add(Unknown Source) ~[itext-2.1.7.jar!/:na]
    at com.lowagie.text.pdf.PdfWriter$PdfBody.add(Unknown Source) ~[itext-2.1.7.jar!/:na]
    at com.lowagie.text.pdf.PdfWriter.addToBody(Unknown Source) ~[itext-2.1.7.jar!/:na]
    at com.lowagie.text.pdf.PdfWriter.add(Unknown Source) ~[itext-2.1.7.jar!/:na]
    at com.lowagie.text.pdf.PdfDocument.newPage(Unknown Source) ~[itext-2.1.7.jar!/:na]
    at com.lowagie.text.pdf.PdfDocument.addPTable(Unknown Source) ~[itext-2.1.7.jar!/:na]
    at com.lowagie.text.pdf.PdfDocument.add(Unknown Source) ~[itext-2.1.7.jar!/:na]
    at com.lowagie.text.Document.add(Unknown Source) ~[itext-2.1.7.jar!/:na]
    at fr.opensagres.xdocreport.itext.extension.ExtendedDocument.add(ExtendedDocument.java:114) ~[fr.opensagres.xdocreport.itext.extension-2.1.0.jar!/:2.1.0]
    at fr.opensagres.poi.xwpf.converter.pdf.internal.elements.StylableDocument.flushTable(StylableDocument.java:374) ~[fr.opensagres.poi.xwpf.converter.pdf-2.1.0.jar!/:2.1.0]
    at fr.opensagres.poi.xwpf.converter.pdf.internal.elements.StylableDocument.pageBreak(StylableDocument.java:141) ~[fr.opensagres.poi.xwpf.converter.pdf-2.1.0.jar!/:2.1.0]
    at fr.opensagres.poi.xwpf.converter.pdf.internal.elements.StylableDocument.columnBreak(StylableDocument.java:120) ~[fr.opensagres.poi.xwpf.converter.pdf-2.1.0.jar!/:2.1.0]
    at fr.opensagres.poi.xwpf.converter.pdf.internal.elements.StylableDocument.addElement(StylableDocument.java:101) ~[fr.opensagres.poi.xwpf.converter.pdf-2.1.0.jar!/:2.1.0]
    at fr.opensagres.poi.xwpf.converter.pdf.internal.PdfMapper.endVisitParagraph(PdfMapper.java:458) ~[fr.opensagres.poi.xwpf.converter.pdf-2.1.0.jar!/:2.1.0]
    at fr.opensagres.poi.xwpf.converter.pdf.internal.PdfMapper.endVisitParagraph(PdfMapper.java:122) ~[fr.opensagres.poi.xwpf.converter.pdf-2.1.0.jar!/:2.1.0]
    at fr.opensagres.poi.xwpf.converter.core.XWPFDocumentVisitor.visitParagraph(XWPFDocumentVisitor.java:412) ~[fr.opensagres.poi.xwpf.converter.core-2.1.0.jar!/:2.1.0]
    at fr.opensagres.poi.xwpf.converter.core.XWPFDocumentVisitor.visitBodyElements(XWPFDocumentVisitor.java:264) ~[fr.opensagres.poi.xwpf.converter.core-2.1.0.jar!/:2.1.0]
    at fr.opensagres.poi.xwpf.converter.core.XWPFDocumentVisitor.start(XWPFDocumentVisitor.java:216) ~[fr.opensagres.poi.xwpf.converter.core-2.1.0.jar!/:2.1.0]
    at fr.opensagres.poi.xwpf.converter.pdf.PdfConverter.doConvert(PdfConverter.java:57) ~[fr.opensagres.poi.xwpf.converter.pdf-2.1.0.jar!/:2.1.0]
    at fr.opensagres.poi.xwpf.converter.pdf.PdfConverter.doConvert(PdfConverter.java:39) ~[fr.opensagres.poi.xwpf.converter.pdf-2.1.0.jar!/:2.1.0]
    at fr.opensagres.poi.xwpf.converter.core.AbstractXWPFConverter.convert(AbstractXWPFConverter.java:42) ~[fr.opensagres.poi.xwpf.converter.core-2.1.0.jar!/:2.1.0]

Some of the DOCX files have a complex structure, but none is larger than 500 KB. How can I fix this, or what other libraries can I use? I tried docx4j, but its dependencies use old versions of javax.bind and I could not resolve the conflicts. -Xmx is currently set to 1g; I also tried 2g, which did not help.

Asked Feb 5 at 14:40 by Meetra
Comment: Do Java arrays have a maximum size? Does your code comply? – trashgod, Feb 6 at 20:47

1 Answer

1. Increase the Java heap size further. You've tried 2 GB, but complex DOCX files with heavy tables, images, or embedded objects might require more memory. Try setting:

-Xms512m -Xmx4g
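
Before raising the limit again, it may be worth confirming that the flag actually reaches the JVM that serves the request (a container or launcher script can silently override it). The following is a minimal sketch using only the standard `Runtime` API; the class name `HeapCheck` is made up for illustration.

```java
public class HeapCheck {

    /** Returns the maximum heap the JVM will attempt to use, in MiB. */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024 * 1024);
    }

    public static void main(String[] args) {
        // Run with the same flags as the application, e.g. java -Xmx4g HeapCheck
        System.out.println("Max heap: " + maxHeapMiB() + " MiB");
    }
}
```

If the printed value is well below what was configured, the setting is not being applied where expected. Logging the same value once at application startup gives the same information without a separate run.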

2. Stream instead of buffering. Write the PDF directly to the response stream rather than collecting it in memory first, and release the DOCX as soon as conversion finishes so that both representations are not retained longer than needed:

try (InputStream inputStream = minioClient.getObject(GetObjectArgs.builder()
        .bucket(bucketName)
        .object(file.getS3Id())
        .build());
     // XWPFDocument is Closeable, so it is released even if conversion fails
     XWPFDocument doc = new XWPFDocument(inputStream);
     OutputStream outStream = response.getOutputStream()) {

    PdfOptions options = PdfOptions.create();

    // Same font provider as in the question; it only supplies the font
    // and does not by itself change how much memory the conversion uses
    options.fontProvider((familyName, encoding, size, style, color) -> {
        try {
            BaseFont baseFont = BaseFont.createFont(
                    "classpath:fonts/Times_New_Roman.ttf", BaseFont.IDENTITY_H, BaseFont.EMBEDDED
            );
            return new Font(baseFont, size, style, color);
        } catch (Exception e) {
            log.error("Could not create font: ", e);
            return null;
        }
    });

    PdfConverter.getInstance().convert(doc, outStream, options);
    outStream.flush();
}
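
A related variant is to spool the converted PDF to a temporary file and only then copy it to the response. In the posted trace the allocation that fails is the growth of a `ByteArrayOutputStream`, so part of the pressure appears to come from output held in memory. Spooling will not reduce whatever the converter keeps internally, but it keeps the finished PDF off the heap and avoids sending a half-written PDF when conversion fails. The sketch below uses only the JDK; `SpoolToTempFile`, `Converter`, and `spoolAndSend` are hypothetical names, and the lambda in `main` stands in for the real `PdfConverter.getInstance().convert(doc, out, options)` call.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class SpoolToTempFile {

    /** Hypothetical stand-in for the real conversion call. */
    @FunctionalInterface
    interface Converter {
        void convertTo(OutputStream out) throws IOException;
    }

    /**
     * Runs the conversion into a temporary file, then copies that file to the
     * target stream. Returns the number of bytes produced by the conversion.
     */
    static long spoolAndSend(Converter converter, OutputStream target) throws IOException {
        Path tmp = Files.createTempFile("converted-", ".pdf");
        try {
            // Conversion output goes to disk, not to a growing in-memory buffer.
            try (OutputStream fileOut = Files.newOutputStream(tmp)) {
                converter.convertTo(fileOut);
            }
            long size = Files.size(tmp);
            // Only reached if conversion succeeded; the copy is done in chunks.
            Files.copy(tmp, target);
            target.flush();
            return size;
        } finally {
            Files.deleteIfExists(tmp);
        }
    }

    public static void main(String[] args) throws IOException {
        // A ByteArrayOutputStream plays the role of the servlet response here.
        ByteArrayOutputStream fakeResponse = new ByteArrayOutputStream();
        long sent = spoolAndSend(
                out -> out.write("%PDF-demo".getBytes(StandardCharsets.US_ASCII)),
                fakeResponse);
        System.out.println(sent + " bytes sent");
    }
}
```

In a servlet, the returned size could also be used to set the Content-Length header before copying, provided the header is set before the first byte is written to the response.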