I have followed a quick tutorial on how to implement Mozilla's PDF viewer with React. I have made a CodeSandbox here. I would like to know whether it is possible to implement this by importing the pdfjs node module instead of downloading the package into the public folder, as I do now:
export default class PDFJs {
  init = (source, element) => {
    const iframe = document.createElement("iframe");
    iframe.src = `/pdfjs-2.5.207-dist/web/viewer.html?file=${source}`;
    iframe.width = "100%";
    iframe.height = "100%";
    element.appendChild(iframe);
  };
}
Also, this kind of setup doesn't work when the PDF's source is a URL. If I do that, I get an error:
PDF.js v2.5.207 (build: 0974d6052) Message: file origin does not match viewer's
I have commented out the part of the code that checks the file's origin in pdfjs-2.5.207-dist/web/viewer.js:
//if (origin !== viewerOrigin && protocol !== "blob:") {
// throw new Error("file origin does not match viewer's");
//}
But, then I got an error:
PDF.js v2.5.207 (build: 0974d6052) Message: Failed to fetch
How can I fix this? Is it possible to import this package as a module into a React component, and how can I use it for PDFs from external sources by URL?
asked Jan 13, 2017 at 9:03, edited Jan 21, 2021 at 8:09, by Ludwig
- This seems to be an issue with the iframe or the browser. I am aware of such iframe issues where Firefox blocks such iframes. Why not try a simple div and present it there? I really don't see a need for a viewport complication. Did you actually try this? pdftron./blog/react/how-to-build-a-react-pdf-viewer – Gary Commented Jan 21, 2021 at 8:15
- Yes, I followed along all the way to Implementing with Webviewer, because I just want the mozilla's pdf viewer and there it is also iframe that is being used. – Ludwig Commented Jan 21, 2021 at 9:03
- I am a little apprehensive on the iframe. I know Mozilla Firefox has started blocking insecure includes/html – Gary Commented Jan 21, 2021 at 9:06
- is the PDF hosted on your server ? – Mohamed Ramrami Commented Jan 23, 2021 at 1:48
4 Answers
Here is a working codesandbox with Mozilla's viewer and your pdf.
Things to note :
- Your pdf must be served over HTTPS, otherwise you get this error :
Mixed Content: The page at 'https://codesandbox.io/' was loaded over HTTPS, but requested an insecure resource 'http://www.africau.edu/images/default/sample.pdf'. This request has been blocked; the content must be served over HTTPS.
- The server hosting the pdf should allow your app domain using Access-Control-Allow-Origin, or be in the same origin, otherwise you get this error :
Access to fetch at 'https://www.adobe./support/products/enterprise/knowledgecenter/media/c4611_sample_explain.pdf' from origin 'https://lchyv.csb.app' has been blocked by CORS policy: No 'Access-Control-Allow-Origin' header is present on the requested resource. If an opaque response serves your needs, set the request's mode to 'no-cors' to fetch the resource with CORS disabled.
- For demo purposes, I used https://cors-anywhere.herokuapp./<URL_TO_PDF>, which sets Access-Control-Allow-Origin: * for you, but it should not be used in production!
So in conclusion, your pdf didn't load because of the browser's restrictions. Importing pdfjs directly in your app and building a viewer from scratch (which is a lot of work) won't solve those problems.
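The two restrictions above can be checked before a URL is handed to the viewer. The following is a minimal sketch, not part of PDF.js: `classifyPdfUrl` is a hypothetical helper name and `/files/sample.pdf` is a made-up path. It only compares protocol and origin, so it cannot tell whether the remote server actually sends `Access-Control-Allow-Origin`.

```javascript
// Hypothetical helper (not a PDF.js API): predicts which browser restriction,
// if any, applies when a page at `pageUrl` tries to fetch `pdfUrl`.
function classifyPdfUrl(pageUrl, pdfUrl) {
  const page = new URL(pageUrl);
  const pdf = new URL(pdfUrl, pageUrl); // resolves relative paths against the page

  if (page.protocol === "https:" && pdf.protocol === "http:") {
    return "mixed-content"; // an HTTPS page requesting an HTTP resource
  }
  if (page.origin !== pdf.origin) {
    return "cross-origin"; // works only if the PDF host allows it via CORS
  }
  return "same-origin"; // no extra headers needed
}

console.log(
  classifyPdfUrl(
    "https://lchyv.csb.app/",
    "http://www.africau.edu/images/default/sample.pdf"
  )
);
console.log(classifyPdfUrl("https://lchyv.csb.app/", "/files/sample.pdf"));
```

A result of "cross-origin" means the request can still succeed, but only when the host of the PDF opts in with the header described above.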
Referrer Policy: strict-origin-when-cross-origin / Usage with external sources
The pdf should be located on the same host (including same protocol). Hosting the pdf on the same url as your app/website, should solve this problem.
Allowing a pdf to be loaded in other pages can lead to various security risks.
If you want to show an up-to-date version of an external pdf on your own homepage, there are basically two options.
Hosting PDF on your server
Running a server script (cron) which downloads the pdf and hosts it on your own server.
Allow cross-origin
If you have access to the server hosting the pdf you can send headers to allow cross-origin.
Access-Control-Allow-Origin: *
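Sending `*` opens the file to every site. A narrower variant, sketched below with only Node's built-in `http` module, sends the header only for origins on an allow-list. The origin list, the port and the empty response body are illustrative assumptions, and `corsHeadersFor` is a hypothetical helper name.

```javascript
const http = require("http");

// Assumed allow-list; replace with the origin(s) of your own app.
const ALLOWED_ORIGINS = ["https://lchyv.csb.app"];

// Returns the CORS headers to send for a given request Origin header.
function corsHeadersFor(origin, allowed = ALLOWED_ORIGINS) {
  if (origin && allowed.includes(origin)) {
    // Vary tells caches that the response depends on the Origin header.
    return { "Access-Control-Allow-Origin": origin, Vary: "Origin" };
  }
  return {}; // no header: the browser will block cross-origin reads
}

const server = http.createServer((req, res) => {
  res.writeHead(200, {
    "Content-Type": "application/pdf",
    ...corsHeadersFor(req.headers.origin),
  });
  res.end(); // a real server would stream the PDF bytes here
});

// server.listen(8080); // uncomment to run; the port is arbitrary
```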
How to use pdfjs with yarn/npm
Documentation on this is really bad, but they have a repository pdfjs-dist and some related docs.
Installation
npm install pdfjs-dist
Usage (from DOC)
import * as pdfjsLib from 'pdfjs-dist';

var url = 'https://raw.githubusercontent./mozilla/pdf.js/ba2edeae/examples/learning/helloworld.pdf';

// The workerSrc property shall be specified.
pdfjsLib.GlobalWorkerOptions.workerSrc = '//mozilla.github.io/pdf.js/build/pdf.worker.js';

// Asynchronous download of PDF
var loadingTask = pdfjsLib.getDocument(url);
loadingTask.promise.then(function(pdf) {
  console.log('PDF loaded');

  // Fetch the first page
  var pageNumber = 1;
  pdf.getPage(pageNumber).then(function(page) {
    console.log('Page loaded');

    var scale = 1.5;
    var viewport = page.getViewport({scale: scale});

    // Prepare canvas using PDF page dimensions
    var canvas = document.getElementById('the-canvas');
    var context = canvas.getContext('2d');
    canvas.height = viewport.height;
    canvas.width = viewport.width;

    // Render PDF page into canvas context
    var renderContext = {
      canvasContext: context,
      viewport: viewport
    };
    var renderTask = page.render(renderContext);
    renderTask.promise.then(function () {
      console.log('Page rendered');
    });
  });
}, function (reason) {
  // PDF loading error
  console.error(reason);
});
Web Worker
You do need the worker script - pdfjs does not work without it, so neither does react-pdf. (It is a Web Worker, not a Service Worker.)
If you use CRA and do not want to load it from a CDN, you can perform the following steps:
1) Copy worker to public folder
cp ./node_modules/pdfjs-dist/build/pdf.worker.js public/scripts
2) Point PDF.js at the worker
pdfjsLib.GlobalWorkerOptions.workerSrc = `${process.env.PUBLIC_URL}/scripts/pdf.worker.js`
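Step 1 can also be done from Node, which avoids `cp` failing when `public/scripts` does not exist yet and also works on Windows. This is only a sketch: the function name `copyPdfWorker` and the idea of calling it from a small script are assumptions, and the worker file name may differ in other `pdfjs-dist` releases.

```javascript
const fs = require("fs");
const path = require("path");

// Copies the PDF.js worker from node_modules into public/scripts.
// `projectRoot` is the folder that contains node_modules and public.
function copyPdfWorker(projectRoot) {
  const source = path.join(
    projectRoot, "node_modules", "pdfjs-dist", "build", "pdf.worker.js"
  );
  const targetDir = path.join(projectRoot, "public", "scripts");
  const target = path.join(targetDir, "pdf.worker.js");

  fs.mkdirSync(targetDir, { recursive: true }); // no error if it already exists
  fs.copyFileSync(source, target); // throws if the worker file is missing
  return target;
}

// Example call from the project root:
// copyPdfWorker(process.cwd());
```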
I changed your example so that it accepts a URL.
My code is below:
import pdfjsWorker from "pdfjs-dist/build/pdf.worker.entry";

const pdfjsLib = import("pdfjs-dist/build/pdf");

export default class PDFJs {
  init = (source, element) => {
    pdfjsLib.then((pdfjs) => {
      pdfjs.GlobalWorkerOptions.workerSrc = pdfjsWorker;
      var loadingTask = pdfjs.getDocument(`${source}`);
      loadingTask.promise.then((pdf) => {
        pdf.getPage(1).then((page) => {
          var scale = 1.5;
          var viewport = page.getViewport({ scale: scale });
          var canvas = document.createElement("canvas");
          var context = canvas.getContext("2d");
          canvas.height = viewport.height;
          canvas.width = viewport.width;
          element.appendChild(canvas);
          var renderContext = {
            canvasContext: context,
            viewport: viewport
          };
          page.render(renderContext);
        });
      });
    });
  };
}
You can see the result here
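The class above draws only the first page. A possible extension, sketched here, loops over `pdf.numPages` and appends one canvas per page. `renderAllPages` and its `doc` parameter are assumptions and not part of PDF.js; `doc` defaults to the global `document` and exists only so the function can be exercised outside a browser.

```javascript
// Renders every page of a loaded PDF.js document into `element`.
// `pdf` is the object resolved by `pdfjs.getDocument(...).promise`.
async function renderAllPages(pdf, element, scale = 1.5, doc = document) {
  for (let pageNumber = 1; pageNumber <= pdf.numPages; pageNumber++) {
    const page = await pdf.getPage(pageNumber);
    const viewport = page.getViewport({ scale });

    // One canvas per page, sized to the scaled page dimensions.
    const canvas = doc.createElement("canvas");
    canvas.height = viewport.height;
    canvas.width = viewport.width;
    element.appendChild(canvas);

    // Wait for each page so the canvases are filled in order.
    await page.render({
      canvasContext: canvas.getContext("2d"),
      viewport,
    }).promise;
  }
  return pdf.numPages;
}
```

Inside `init`, calling `loadingTask.promise.then((pdf) => renderAllPages(pdf, element))` would take the place of the `pdf.getPage(1)` call.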
Note: As others have already said, using just React (or any client-side library), it is not possible to fetch an external resource (a PDF in your case) without solving the CORS issue. You will need some kind of server-side tech to resolve it (unless you own or have access to the external resource's server).
Looking at the sandbox code you have provided, it seems you are already using Node.js, but the solution applies to any server stack. Basically, you ask your server to fetch the file for you and return the file as the response payload, e.g. a Node server that listens for requests on fetchPdf and returns the file itself as the response:
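const crypto = require('crypto');
const fs = require('fs');
const http = require('http');
const https = require('https');
const path = require('path');
const express = require('express');

const app = express();
app.use(express.json()); // needed so that req.body.url is populated

// asyncMiddleware was not shown in the original answer; this common
// definition is assumed. It forwards rejected promises to Express.
const asyncMiddleware = (fn) => (req, res, next) =>
  Promise.resolve(fn(req, res, next)).catch(next);

app.post('/fetchPdf', asyncMiddleware(async (req, res, next) => {
  const pdfPath = await downloadFile(req.body.url);
  if (pdfPath) {
    res.type('application/pdf');
    res.sendFile(pdfPath);
    // Delete the temporary file once the response has been sent.
    res.on('finish', function () {
      try {
        fs.unlinkSync(pdfPath);
      } catch (e) {
        console.error(e);
        console.log(`Unable to delete file ${pdfPath}`);
      }
    });
  } else {
    res.status(404).send('Not found');
  }
}));

// Downloads `url` into a randomly named file under public/ and resolves
// with its absolute path, or with null if the download fails.
function downloadFile(url) {
  return new Promise((resolve, reject) => {
    const absoluteFilePath = path.join(__dirname, `public/${crypto.randomBytes(20).toString('hex')}.pdf`);
    const file = fs.createWriteStream(absoluteFilePath);
    console.log(`Requested url ${url}`);
    // http.get rejects https: URLs, so pick the client matching the protocol.
    const client = url.startsWith('https:') ? https : http;
    client.get(url, function (downloadResponse) {
      downloadResponse.pipe(file).on('finish', () => {
        resolve(absoluteFilePath);
      });
    }).on('error', function (err) {
      // fs.unlink requires a callback in current Node versions.
      fs.unlink(absoluteFilePath, () => resolve(null));
    });
  });
}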
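On the client, the component would then ask this endpoint for the bytes instead of passing the external URL to PDF.js. A rough sketch follows; `fetchPdfViaProxy` and its `fetchImpl` parameter are hypothetical names, and the endpoint is assumed to be reachable at `/fetchPdf` on the same origin as the app.

```javascript
// Asks the server to download `externalUrl` and returns the PDF bytes.
// `fetchImpl` defaults to the global fetch; it is a parameter only so the
// function can be tried out with a stub.
async function fetchPdfViaProxy(externalUrl, fetchImpl = fetch) {
  const response = await fetchImpl("/fetchPdf", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ url: externalUrl }),
  });
  if (!response.ok) {
    throw new Error(`Proxy responded with status ${response.status}`);
  }
  return new Uint8Array(await response.arrayBuffer());
}

// Usage with PDF.js, which accepts raw bytes through the `data` option:
// const data = await fetchPdfViaProxy(source);
// const pdf = await pdfjs.getDocument({ data }).promise;
```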
Note: For educational and learning purposes this will work, but there are various security issues with deploying your code to production this way.
First, your server must be able to make requests to any site on the Internet.
Second, without some kind of authentication, your site will become a hotspot for anyone wishing to download an external resource blocked by CORS.
(Similar to [https://cors-anywhere.herokuapp.])
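One way to narrow these risks, sketched here rather than taken from the original answer, is to validate the requested URL against an allow-list before downloading it. `isAllowedPdfUrl` and the host list are illustrative assumptions.

```javascript
// Hosts the proxy is willing to fetch from (assumed; adjust to your needs).
const ALLOWED_HOSTS = ["www.africau.edu"];

// Returns true only for http(s) URLs whose host is on the allow-list.
function isAllowedPdfUrl(rawUrl, allowedHosts = ALLOWED_HOSTS) {
  let parsed;
  try {
    parsed = new URL(rawUrl);
  } catch (e) {
    return false; // not a valid absolute URL
  }
  const isHttp = parsed.protocol === "http:" || parsed.protocol === "https:";
  return isHttp && allowedHosts.includes(parsed.hostname);
}
```

In the `/fetchPdf` handler, a request whose `req.body.url` fails this check could be answered with a 400 status before `downloadFile` is called.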
As for your second question, yes, it is possible to use the pdfjs library with react & npm.
You can refer to yurydelendik's repo, taken from official pdf.js mozilla repository.
I have also created a fork of the same here demonstrating above said server-side solution.