I am posting documents into paperless-ngx via its REST API. For some PDF documents the API reliably responds with
{"detail":"Multipart form parse error - Request max total header size exceeded."}
I checked one of the offending documents and found it to be a normal, valid PDF of about 180 kB. There should not be too much fuss about it, yet I get that error.
Now I am wondering where this error might come from and how to get around it. Does it come from Gunicorn, Django or maybe the application itself?
asked Jan 19 at 21:37 by queeg
- This needs a minimal reproducible example – Kraigolas Commented Jan 19 at 21:57
- It comes from Django. – Vegard Commented Jan 19 at 22:10
- @Kraigolas This would definitely require me to post the application, the client and the data. I am not willing to share the data. – queeg Commented Jan 20 at 6:09
- @Vegard Good hint! ...and that likely gets called from github/django/django/blob/main/django/http/…. So I am now investigating where paperless-ngx sets MAX_TOTAL_HEADER_SIZE – queeg Commented Jan 20 at 6:52
- @queeg please read minimal reproducible example for future reference because it does not mean that remotely. – Kraigolas Commented Jan 21 at 1:59
1 Answer
As Vegard already commented, the error stems from Django's MultiPartParser, which raises it when it detects that the headers are larger than the maximum passed in: https://github/django/django/blob/main/django/http/multipartparser.py#L693
That method is called with the maximum taken from MAX_TOTAL_HEADER_SIZE:
https://github/django/django/blob/main/django/http/multipartparser.py#L754
There also seems to be no easy way to configure a bigger value, as it is hardcoded: https://github/django/django/blob/main/django/http/multipartparser.py#L45
Edit:
I learned that MAX_TOTAL_HEADER_SIZE is not the problem. The whole error message is misleading, since the headers were never too big. I figured out the root cause and can now explain it.
My client application was POSTing files into a REST API. Most of the time the API consumed the documents successfully, but sometimes it failed with one of two different errors, and I assumed they needed to be followed up separately. Here are the two errors I got:
- Request max total header size exceeded
- No document submitted
The root cause was that in the multipart form-encoded request the filename was encoded in UTF-8 but not marked as such. As long as the filename contained only ASCII characters, everything was fine. But as soon as it contained multibyte character sequences, things became spooky.
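To make the ASCII versus multibyte distinction concrete, here is a small Python sketch. The helper name `describe_filename` and the sample filenames are made up for illustration; they are not from my client code. It only shows how a filename with non-ASCII characters grows when encoded as UTF-8.

```python
def describe_filename(name: str) -> dict:
    """Report how a filename looks once it is encoded as UTF-8.

    Hypothetical helper, for illustration only.
    """
    raw = name.encode("utf-8")
    return {
        "characters": len(name),       # number of Unicode code points
        "utf8_bytes": len(raw),        # number of bytes on the wire
        "ascii_only": name.isascii(),  # True if every character is ASCII
    }


# Made-up sample filenames: one pure ASCII, one with umlaut and sharp s.
print(describe_filename("invoice.pdf"))
print(describe_filename("Größe.pdf"))
```

For a pure ASCII name the character and byte counts match. For a name like the second one they differ, and that kind of filename is where my requests started to fail.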
So it was the client application that was sending malformed requests, and I fixed it by URL-encoding the filenames. Not a good fix, and I'd like to see RFC 6266 in effect, but it is currently the best I could do.
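The workaround could look roughly like the following Python sketch. It is not my actual client code: the helper names, the form field name `document` and the sample filename are assumptions for illustration. It percent-encodes the filename with `urllib.parse.quote` so that only ASCII ends up in the part's Content-Disposition header.

```python
from urllib.parse import quote, unquote


def ascii_safe_filename(name: str) -> str:
    """Percent-encode a filename so that it contains only ASCII characters.

    quote() encodes the string as UTF-8 by default; safe="" also escapes "/".
    """
    return quote(name, safe="")


def content_disposition(field: str, filename: str) -> str:
    """Build the Content-Disposition value for one multipart/form-data part.

    Hypothetical helper; `field` is whatever form field name the API expects.
    """
    return f'form-data; name="{field}"; filename="{ascii_safe_filename(filename)}"'


# Made-up sample filename and assumed field name.
header = content_disposition("document", "Größe.pdf")
print(header)

# The original name can be recovered by whoever knows about the encoding.
print(unquote(ascii_safe_filename("Größe.pdf")))
```

The drawback is that the receiving side sees the percent-encoded name unless it decodes it again, which is why I do not consider this a good fix.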