
javascript - Upload via presigned request - 403 forbidden for Unicode Filename - Stack Overflow


I have just run into a weird issue with S3 blocking requests (403 Forbidden, no response body) if the filename contains non-ASCII Unicode codepoints.

Backend code that generates the presigned URL:

use Aws\Credentials\CredentialProvider;
use Aws\S3\S3Client;

$userId = ...; // Irrelevant.
$filename = ...; // Comes from POST data.
$safeName = trim(preg_replace('/[^a-z0-9\-_.]/i', '-', $filename), '-'); // Restrict the key to characters that are safe in S3 keys.
$key = sprintf('user-documents/%s/%s', $userId, $safeName);

$metadata = [
    'type'     => 'USER_DOCUMENT',
    'userId'   => $userId,
    'filename' => $filename, // The raw one from POST.
];
$s3 = new S3Client([
    'region'      => getenv('AWS_REGION'),
    'version'     => 'latest',
    'credentials' => CredentialProvider::env(),
]);
$uploadUrl = $s3->createPresignedRequest(
    $s3->getCommand('PutObject', [
        'Bucket'   => getenv('AWS_BUCKET_USER_DATA'),
        'Key'      => $key,
        'Metadata' => $metadata,
    ]),
    '+1 hour',
)->getUri();

$response = [
    'uploadUrl' => $uploadUrl,
    'metadata'  => $metadata,
];

Frontend code that uploads the files to S3:

const file = fileInput.files[0];
const response = await getUploadUrl(file.name); // This is where the POST filename comes from.
await fetch(response.uploadUrl, {
    method: 'PUT',
    headers: {
        'x-amz-meta-type': response.metadata.type,
        'x-amz-meta-userid': response.metadata.userId,
        'x-amz-meta-filename': response.metadata.filename
    },
    body: file,
}).then(resp => {
    if (!resp.ok) {
throw new Error('File upload failed: ' + resp.status + ' ' + resp.statusText);
    }
});

This code works completely fine if the filenames are ASCII, but if a filename contains a Unicode letter, e.g. ä, then the uploadUrl request gives me a 403 Forbidden response without a response body, turning debugging into guessing. I only tried changing ä to a because another Stack Overflow question had answers mentioning filename URL encoding, and that change made the upload work.

So the question is: what do I need to change in this code to avoid this issue? I'm not even sure where the problem is, because uploadUrl contains the URL-encoded original filename, which AWS presumably decodes on their end (it's a query parameter, so of course it should be URL-decoded; it's their own SDK that encodes it!), and the metadata headers also contain the original, non-encoded filename.
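
For what it's worth, the presigned URL itself lists which headers are covered by the signature, which is one way to see why the header values matter. A quick inspection sketch (my own, assuming the $uploadUrl from above; the exact header list depends on the SDK):

// Sketch: show the headers included in the SigV4 signature. With
// Metadata set, the x-amz-meta-* headers are typically signed, so the
// browser's PUT must reproduce their values byte-for-byte.
parse_str((string) parse_url((string) $uploadUrl, PHP_URL_QUERY), $query);
echo $query['X-Amz-SignedHeaders'] ?? '(none)';
// e.g. "host;x-amz-meta-filename;x-amz-meta-type;x-amz-meta-userid"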

aws/aws-sdk-php, if that changes anything.


asked Feb 3 at 14:30 by jurchiks, edited Feb 5 at 15:37 by hakre

  • "getUploadUrl(file.name); // This is where the POST filename comes from." That in turn should mean that body: file will still contain the original file name. But shouldn't the key and the name of the file you are actually uploading match? – C3roe Feb 3 at 14:39
  • No, the key can be anything. – jurchiks Feb 3 at 14:40

2 Answers


By following the information about RFC 2047 in @hakre's answer, I found a package that encodes strings into the RFC 2047 "Q" encoding, but it depends on another package. IMO this is too much for a few characters in rare cases, but it might be useful to other people, so I'm leaving it here for them. Using that library, one would only need to escape the header values in the PUT request.
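
For reference, PHP's bundled mbstring extension can also produce RFC 2047 encoded-words, so a dependency-free sketch of that escaping (my own addition, not part of the packages mentioned above) could look like this. Note that mb_encode_mimeheader() folds long values across lines, which is not valid in an HTTP header, so treat it as a sketch rather than a drop-in:

// Sketch: RFC 2047 "Q" encoding via PHP's built-in mbstring function.
// Assumes the input is UTF-8; long inputs get folded with line breaks
// and would need extra handling before use in an HTTP header.
$encoded = mb_encode_mimeheader($filename, 'UTF-8', 'Q');
// e.g. "tähti.pdf" -> "=?UTF-8?Q?t=C3=A4hti.pdf?="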

Instead of adding multiple new dependencies, I chose to just transliterate those characters to ASCII using the Slugify package that I'm already using in the backend.

The required changes then look like this:

use Cocur\Slugify\Slugify; // Assuming the cocur/slugify package.

$pathinfo = pathinfo($filename);
// Slugify only the base name, then re-attach the extension.
$safeName = (new Slugify())->slugify($pathinfo['filename']) . '.' . $pathinfo['extension'];
...
$metadata = [
    ...
    'filename' => $safeName, // Changed from `$filename`.
];
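
One edge case the snippet above doesn't cover: pathinfo() omits the 'extension' entry for names without a dot, which would trigger an undefined-index warning and leave a trailing dot. A slightly more defensive sketch:

$pathinfo = pathinfo($filename);
$safeName = (new Slugify())->slugify($pathinfo['filename']);
if (isset($pathinfo['extension'])) {
    // Slugify the extension too, in case it also contains odd characters.
    $safeName .= '.' . (new Slugify())->slugify($pathinfo['extension']);
}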

Everything else stays the same. This has the unfortunate drawback that the original filenames are not preserved anywhere (AWS does not allow Unicode characters in the key, and metadata headers are not URL-decoded on AWS's end), which could be a problem if I were dealing with filenames in Arabic or something like that, but I'm not, so in my case it's close enough (Extended Latin to ASCII).
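
If preserving the original name mattered, one workaround (my own sketch, not from the setup above) would be to percent-encode the raw name into the metadata; the signed header value stays ASCII but remains reversible:

// Sketch: store a percent-encoded copy of the original filename.
// The frontend must then send this exact encoded string in the
// x-amz-meta-filename header, and readers recover the original
// with rawurldecode().
$metadata = [
    'type'     => 'USER_DOCUMENT',
    'userId'   => $userId,
    'filename' => rawurlencode($filename), // "ä.pdf" -> "%C3%A4.pdf"
];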

It's a damn shame that fetch() throws an error instead of just encoding the headers in Q encoding like they're supposed to be encoded...

Using x-amz-meta-filename suggests that the user-provided metadata is transferred via HTTP request headers.

Those values first require encoding per RFC 2047, MIME (Multipurpose Internet Mail Extensions) Part Three: Message Header Extensions for Non-ASCII Text (ietf.org).

Explanation and examples can be found under User-defined object metadata in the Amazon S3 documentation.
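
For completeness, the same encoded-word envelope can be produced on the PHP side before signing (a sketch on my part; if the backend signs this value in Metadata, the frontend has to send the identical string so the signature matches; see also the note at the end about AWS not accepting it):

// Sketch: RFC 2047 "B" encoded-word built by hand. PHP strings from
// POST data are already UTF-8 bytes, so base64_encode() is enough.
$metadata['filename'] = '=?UTF-8?B?' . base64_encode($filename) . '?=';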


One idea was to use Base64 encoding and do the wrapping/quoting/envelope with string concatenation:

{
    method: 'PUT',
    headers: {
        'x-amz-meta-type': response.metadata.type,
        'x-amz-meta-userid': response.metadata.userId,
        // Build an RFC 2047 "B" encoded-word: UTF-8 bytes, Base64-encoded.
        'x-amz-meta-filename': `=?UTF-8?B?${btoa(Array.from(
            new TextEncoder().encode(response.metadata.filename),
            b => String.fromCodePoint(b)
        ).join(''))}?=`
    },
    body: file,
}

But this does not work with AWS (per a comment; just noting it here).
