Problem
I am trying to upload a file to AWS S3 (for testing, I am using a Dockerized MinIO instance).
- I am calculating the checksum of the file like this:
  const fileContent = await readFile(fileName, { encoding: "binary" });
  const checksum = checksums.crc32(fileContent);
- Then I create a multipart upload (this part of my code).
- I store the UploadId for future use (here).
- Then I upload each chunk inside a for loop.
- I also store all the part responses in an array (here).
- Finally, I try to complete the whole file upload by sending a CompleteMultipartUploadCommand.
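The steps above read the file as a "binary" string and checksum it in one go. As a minimal sketch of an alternative, assuming the file is read as raw bytes (Buffers) in chunks, a full-object CRC32 can be folded in chunk by chunk while the parts are uploaded. The helper names crc32Update and crc32OfChunks are hypothetical and are not part of the original code or of the AWS SDK.

```javascript
// Lookup table for CRC-32 (reflected polynomial 0xEDB88320, as used by zlib).
const CRC_TABLE = new Uint32Array(256).map((_, n) => {
  let c = n;
  for (let bit = 0; bit < 8; bit++) {
    c = c & 1 ? 0xedb88320 ^ (c >>> 1) : c >>> 1;
  }
  return c;
});

// Hypothetical helper: folds one chunk of bytes into a running CRC-32.
// Start with crc = 0 and pass the previous result for each later chunk.
function crc32Update(crc, chunk) {
  let c = ~crc;
  for (const byte of chunk) {
    c = CRC_TABLE[(c ^ byte) & 0xff] ^ (c >>> 8);
  }
  return ~c >>> 0; // force an unsigned 32-bit result
}

// Hypothetical helper: CRC-32 over any (async) iterable of Buffers,
// for example a read stream opened without an encoding.
async function crc32OfChunks(chunks) {
  let crc = 0;
  for await (const chunk of chunks) {
    crc = crc32Update(crc, chunk);
  }
  return crc;
}

// Feeding the data in two pieces gives the same value as a single pass.
const whole = Buffer.from("123456789", "ascii");
const inPieces = crc32Update(crc32Update(0, whole.subarray(0, 4)), whole.subarray(4));
console.log(inPieces === crc32Update(0, whole)); // true
```

With a real file, something like crc32OfChunks(fs.createReadStream(fileName)) should work, since a read stream without an encoding yields Buffers. The result is still a number, so it has to be Base64-encoded before it is sent to S3.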
If I remove lines 77 and 78:
ChecksumType: "FULL_OBJECT",
ChecksumCRC32: checksum.toString(),
the file uploads, but that is not what I want.
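One thing that stands out in those two lines is that checksum.toString() produces a decimal string, while the ChecksumCRC32 field expects the Base64 encoding of the 4-byte CRC32 value. The following is only a sketch of how the completion input could be assembled and checked before it is passed to new CompleteMultipartUploadCommand(...). The helpers isBase64Crc32 and buildCompleteInput are hypothetical, and the bucket, key, and ETag values are placeholders. It also assumes the upload was created with ChecksumAlgorithm "CRC32" and ChecksumType "FULL_OBJECT".

```javascript
// Hypothetical check: a CRC-32 sent to S3 is the Base64 form of exactly 4 bytes.
function isBase64Crc32(value) {
  if (typeof value !== "string") return false;
  const bytes = Buffer.from(value, "base64");
  return bytes.length === 4 && bytes.toString("base64") === value;
}

// Hypothetical helper: builds the plain input object for
// CompleteMultipartUploadCommand from the stored upload state.
function buildCompleteInput({ bucket, key, uploadId, parts, checksumCRC32 }) {
  if (!isBase64Crc32(checksumCRC32)) {
    throw new Error(`ChecksumCRC32 must be Base64 of 4 bytes, got "${checksumCRC32}"`);
  }
  return {
    Bucket: bucket,
    Key: key,
    UploadId: uploadId,
    MultipartUpload: {
      // S3 expects the parts listed in ascending PartNumber order.
      Parts: [...parts].sort((a, b) => a.PartNumber - b.PartNumber),
    },
    ChecksumType: "FULL_OBJECT",
    ChecksumCRC32: checksumCRC32,
  };
}

// Placeholder values, for illustration only.
const exampleInput = buildCompleteInput({
  bucket: "example-bucket",
  key: "example.bin",
  uploadId: "example-upload-id",
  parts: [
    { PartNumber: 2, ETag: '"etag-2"' },
    { PartNumber: 1, ETag: '"etag-1"' },
  ],
  checksumCRC32: "DHf2Eg==",
});
console.log(exampleInput.MultipartUpload.Parts.map((p) => p.PartNumber)); // [ 1, 2 ]
```

Passing the decimal string "209188370" to this helper throws, which surfaces the mistake locally instead of as a rejected request.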
Desired Outcome
- I want to send the calculated checksum of the entire file (from the first byte to the last byte) to AWS S3 when I send the CompleteMultipartUploadCommand.
- That way, AWS S3 can check the data integrity of the uploaded parts when they are assembled back into one object.
Side Notes
- I know that there are other forms of data integrity checks, like composite checksums, but that is not what I am trying to accomplish here.
- I also tried to make sense of the Upload class exported from @aws-sdk/lib-storage, but I could not work out how it performs a FULL_OBJECT checksum check.
- I read these docs too, but none of them was really useful:
- Tutorial: Upload an object through multipart upload and verify its data integrity.
- Checking object integrity in Amazon S3.
- Uploading and copying objects using multipart upload in Amazon S3.
Questions
First, thanks in advance for your answer.
If possible, please add a link to a repo or share an example.
Explain what I do not know about this holy grail of checksums and data integrity checks.
I am not exactly familiar with how AWS S3 generates those CRC32 checksums, since the ones I was able to generate are all numbers and look nothing like what AWS S3 returns as the checksum. You can look at the logs of Parts; here is one of them:
ChecksumCRC32: 'VG/A4w=='
Whereas the one I generate from the entire file is 209188370, a number!
So maybe someone out there knows how I can generate the same CRC32 in NodeJS as the AWS folks do, since I feel like my code is broken somehow.
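The two values above are the same kind of checksum in different encodings: S3 reports the 32-bit CRC as four big-endian bytes encoded in Base64, while a CRC32 library typically returns the plain integer. A minimal sketch of the conversion in both directions, using only Node's Buffer (the function names are hypothetical):

```javascript
// Hypothetical helper: 32-bit CRC as a number -> Base64 string in S3's format.
function crc32ToBase64(crc) {
  const bytes = Buffer.alloc(4);
  bytes.writeUInt32BE(crc >>> 0, 0); // big-endian, unsigned
  return bytes.toString("base64");
}

// Hypothetical helper: Base64 string from S3 -> 32-bit CRC as a number.
function base64ToCrc32(encoded) {
  return Buffer.from(encoded, "base64").readUInt32BE(0);
}

console.log(crc32ToBase64(209188370)); // "DHf2Eg=="
console.log(base64ToCrc32("VG/A4w==")); // 1416610019
```

So 209188370 (0x0C77F612) would be sent as "DHf2Eg==", and the part checksum 'VG/A4w==' from the logs corresponds to the number 1416610019.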
- From the documentation about the checksum header: "This header specifies the Base64 encoded, 32-bit CRC-32 checksum of the object." – Anon Coward Commented Feb 14 at 18:09
- @AnonCoward thanks to your comment and some other resources, I was able to fix the checksum part. You can see how I am doing it in NodeJS in this Stack Overflow Q&A. – Mohammad Jawad Barati Commented Feb 15 at 5:21
1 Answer
So here is the thing:
- I was generating the checksum correctly but not encoding it properly: S3 expects the Base64-encoded 32-bit value, not the number as a decimal string. I've explained here how it should be done.
- As of 16.02.2025, MinIO has an internal issue with checksums for the entire object, as I've explained here. The same code works just fine against AWS S3.
And make sure to watch my YouTube video about this: https://youtu.be/Pgl_NmbxPUo.