We receive a local file (typically a PDF, PNG or JPG) by drag and drop (using dropzone.js). At this stage it is held in a variable as base64, prefixed with the characters that specify the file type. We encrypt it (now it's binary) into a JavaScript variable. We then create a Blob from that variable and upload it to a server running PHP. (See our finding out how to send a JS variable to PHP $_FILES.)
We are finding that the .size of the Blob is about 50% larger than the .length of the file we are uploading. (We had been uploading by converting to base64 and sending it as JSON, but one reason we want to change is to avoid the roughly 33% increase in size that base64 causes.)
The Blob is consistently about 50% larger, for moderate and large files alike. As a small test, we created a Blob using 120 chars as input and found the Blob.size to be 210. (We normally use the correct file.type; image/png was just to have it be interpreted as binary data that didn't need encoding.) From actual use in our code: we uploaded a 900K PDF file. The type was something like 'application/pdf'. The resulting Blob was about 1,400K. We also tried with a PNG.
I would think that the Blob should be about the same size as the input, no? What might we be doing wrong?
new Blob(["123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890"], {type:"image/png"});
asked Apr 29, 2015 at 18:31 by Mark Kasson; edited May 23, 2017 at 12:00 by Community Bot
5 Comments
- I haven't done the math, but Blob length is equivalent to blob.size, the size in bytes of the Blob, whereas String.length is the count of chars in the string, and one char = 2 bytes. I understand that by this logic the blob.size would be twice that of the string, so I'll do a little googling and get back to you on that one, but this has to be the long and short of it. – TechnicalChaos Commented Apr 29, 2015 at 19:05
- Sorry, .size is the correct property name. I'll correct it above. We don't usually use text; I just did that as a quick test. (I'm not sure about the encoding using 2 bytes/char by default, but I'll take your word for it.) Usually our files are things like PDF or PNG. I'll be clearer about that as well. – Mark Kasson Commented Apr 29, 2015 at 19:19
- They're equivalent so matters little. However when I started running some tests on your sample string, it is indeed 210 characters in length and I can't replicate the issue on a shorter string. x = new Blob(["12345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890"]) Blob {type: "", size: 110, slice: function} – TechnicalChaos Commented Apr 29, 2015 at 19:23
- My research led me to this post: stackoverflow.com/questions/23795034/… – TechnicalChaos Commented Apr 29, 2015 at 19:50
- Homing in on the solution. This one is in a similar direction but uses forge (which we are using). stackoverflow.com/questions/28585353/… – Mark Kasson Commented Apr 30, 2015 at 0:02
2 Answers
There were three factors that led to the increase in size.
Our first issue was that we were reading the file using FileReader's readAsDataURL. This reads a file and encodes it in base64, which results in a roughly 33% increase in size. We changed to readAsArrayBuffer and read into a Uint8Array (an array of 8 bit bytes).
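The switch to readAsArrayBuffer could look roughly like the following. This is a minimal sketch that assumes a browser environment and a File object coming from the drop handler; `readFileAsBytes` is a hypothetical helper name, not something from our code.

```javascript
// Hypothetical helper: read a File (or Blob) into a Uint8Array of raw bytes.
// Browser-only, because it relies on FileReader.
function readFileAsBytes(file) {
  return new Promise((resolve, reject) => {
    const reader = new FileReader();
    // With readAsArrayBuffer, reader.result is an ArrayBuffer.
    reader.onload = () => resolve(new Uint8Array(reader.result));
    reader.onerror = () => reject(reader.error);
    reader.readAsArrayBuffer(file);
  });
}
```

In the browser, `readFileAsBytes(file).then((bytes) => ...)` should yield an array whose length equals `file.size`, with no base64 overhead.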
We pass the file to the encryption library forge.js, which only accepts data as a string, so we had to convert the binary ArrayBuffer to a string. We used the more performant solution here. This reference is more thorough and refers to the relatively new TextEncoder/TextDecoder APIs. We haven't gotten to using them yet. I'd guess they perform better as they're purely native.
Once forge does the encryption, we have to convert to a Blob, so see this on how to convert ArrayBuffer to and from Blob.
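One way the bytes-to-string step might be written is sketched below. It is not necessarily the linked solution: `bytesToBinaryString` and the chunk size are assumptions made for illustration. Each byte becomes one character with a code from 0 to 255.

```javascript
// Convert raw bytes to a "binary string": one character per byte (code 0-255).
// Chunking keeps each String.fromCharCode.apply call below engine argument limits.
function bytesToBinaryString(bytes) {
  const CHUNK = 0x8000; // 32768 bytes per call; an arbitrary, conservative size
  let result = "";
  for (let i = 0; i < bytes.length; i += CHUNK) {
    result += String.fromCharCode.apply(null, bytes.subarray(i, i + CHUNK));
  }
  return result;
}

const sampleBytes = new Uint8Array([0, 65, 127, 128, 255]);
const sampleString = bytesToBinaryString(sampleBytes);
console.log(sampleString.length); // 5
```

The string has the same length as the byte array; the size problem only appears later, when such a string is handed directly to the Blob constructor.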
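Second, as @TechnicalChaos pointed to, we were using a binary string in JavaScript. JavaScript strings are made of 2-byte (UTF-16) code units, and when a string is passed to the Blob constructor it is encoded as UTF-8, so every character above 0x7F takes two bytes. With roughly half of the byte values in encrypted data falling above 0x7F, that accounts for an increase of about 50%.
The Blob could then be attached to a form to be uploaded to our PHP server into $_FILES.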
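The size increase from a binary string can be reproduced in a few lines. This sketch uses a made-up input containing every byte value once, and compares a Blob built from the string with one built from the same data as a Uint8Array.

```javascript
// Build a binary string holding every byte value 0-255 exactly once.
let allByteValues = "";
for (let code = 0; code < 256; code++) {
  allByteValues += String.fromCharCode(code);
}

// From a string: the Blob constructor encodes it as UTF-8, so the 128
// characters above 0x7F take two bytes each.
const blobFromString = new Blob([allByteValues]);

// From bytes: the data is stored as-is.
const sameDataAsBytes = Uint8Array.from(allByteValues, (ch) => ch.charCodeAt(0));
const blobFromBytes = new Blob([sameDataAsBytes]);

console.log(allByteValues.length, blobFromString.size, blobFromBytes.size); // 256 384 256
```

The string-based Blob is 1.5 times the size of the input, which is in line with the roughly 50% increase described in the question.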
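Attaching the Blob to a form might look like the sketch below. The field name "file", the file name "upload.bin" and the "/upload.php" URL are placeholders chosen for illustration, not values from our code.

```javascript
// Hypothetical field and file names; PHP would expose the upload under $_FILES["file"].
function buildUploadForm(blob) {
  const form = new FormData();
  form.append("file", blob, "upload.bin");
  return form;
}

const uploadPayload = new Blob([new Uint8Array([1, 2, 3])], {
  type: "application/octet-stream",
});
const uploadForm = buildUploadForm(uploadPayload);

// In a browser, the form could then be sent with, for example:
// fetch("/upload.php", { method: "POST", body: uploadForm });
```

Because the Blob was built from a Uint8Array, the uploaded part carries exactly the bytes that were put in.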
Now our uploads are approximately the same size as the files we encrypt.
I had a similar issue with putting binary data into a Javascript blob - turns out Blob was assuming UTF-8 encoding and so some of the raw data bytes ended up as multibyte characters.
The solution was to put each byte of binary data into a Uint8Array and pass that to Blob instead.
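A minimal sketch of that approach, assuming the binary data arrives as a string with one character per byte; `binaryStringToBytes` is a hypothetical helper name and the sample string stands in for real data.

```javascript
// Copy each character code (0-255) of a binary string into one byte.
function binaryStringToBytes(str) {
  const out = new Uint8Array(str.length);
  for (let i = 0; i < str.length; i++) {
    out[i] = str.charCodeAt(i) & 0xff;
  }
  return out;
}

// Stand-in for real binary data held in a string.
const rawData = "\x00\x7f\x80\xff";
const byteBlob = new Blob([binaryStringToBytes(rawData)], {
  type: "application/octet-stream",
});
console.log(byteBlob.size); // 4
```

Passing `rawData` to the Blob constructor directly would instead give a size of 6, because the two characters above 0x7F are each encoded as two UTF-8 bytes.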