Can a data URL safely encode binary using a character set beyond base64 without requiring percent-encoding?

The context is for a data URI (now known as data URL) to be dropped into an Apple API. It is not for a data URI that would be consumed by a browser.

I am interested in using base91 or base85 to get tighter encoding than base64, or at least some kind of Base80.

I suspect I can use data:,SOME_ENCODED_DATA where SOME_ENCODED_DATA is plain ASCII but uses a wider character set than base64.
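To put a number on the possible saving, here is a minimal Python sketch that compares the length of a bare `data:,` URL for the same bytes under base64 and under the base85 variant that ships in Python's standard library (`base64.b85encode`). The helper names `data_url` and `encoded_sizes` are made up for illustration, and the sketch only measures size; it says nothing about whether a given consumer accepts the wider alphabet.

```python
import base64


def data_url(payload: str) -> str:
    """Prefix already-encoded ASCII text with the bare `data:,` header."""
    return "data:," + payload


def encoded_sizes(raw: bytes) -> dict:
    """Return the data URL length for each candidate encoding of `raw`."""
    candidates = {
        "base64": base64.b64encode(raw).decode("ascii"),
        "base85": base64.b85encode(raw).decode("ascii"),
    }
    return {name: len(data_url(text)) for name, text in candidates.items()}


# Deterministic 1200-byte sample so the comparison is repeatable.
sample = bytes(i % 256 for i in range(1200))
sizes = encoded_sizes(sample)
```

For this 1200-byte sample the base64 URL is 1606 characters and the base85 URL is 1506, i.e. 4 characters per 3 bytes versus 5 characters per 4 bytes, plus the 6-character `data:,` prefix.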

This question is close to the 15-year-old "What's valid and what's not in a URI query?" and, especially, "What is the smallest URL friendly encoding?", but restricted purely to the scope of a data URL.

The Mozilla docs say

If the data contains characters defined in RFC 3986 as reserved characters, or contains space characters, newline characters, or other non-printing characters, those characters must be percent-encoded.

but then go on to say

Base64 uses the characters + and /, which may have special meanings in URLs. Because Data URLs have no URL path segments or query parameters, this encoding is safe in this context.

That comment relaxing restrictions on + and / seems to violate the 1998 RFC2397, which defines the data URL syntax as

  • dataurl := "data:" [ mediatype ] [ ";base64" ] "," data where
  • data := *urlchar and
  • "urlchar" is imported from RFC2396 and only allows the alphanumeric plus -_.!~*'() without %-encoding.
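Under that strict reading, the cost of a wider alphabet can be checked mechanically. The Python sketch below treats only the RFC2396 unreserved set (alphanumerics plus `-_.!~*'()`) as safe and percent-encodes everything else; `outside_unreserved` and `strict_escape` are hypothetical helper names. It is an exploration aid, not a statement of what any particular API actually enforces.

```python
import base64
import string
from urllib.parse import quote

# RFC 2396 "unreserved" = alphanumerics plus the "mark" characters.
MARKS = "-_.!~*'()"
UNRESERVED = set(string.ascii_letters + string.digits + MARKS)


def outside_unreserved(text: str) -> set:
    """Characters in `text` that the strict reading would percent-encode."""
    return set(text) - UNRESERVED


def strict_escape(text: str) -> str:
    """Percent-encode everything outside the RFC 2396 unreserved set."""
    return quote(text, safe=MARKS)


# Compare raw and strictly escaped lengths for the same bytes.
raw = bytes(range(256))
b64 = base64.b64encode(raw).decode("ascii")
b85 = base64.b85encode(raw).decode("ascii")
report = {
    "base64": (len(b64), len(strict_escape(b64))),
    "base85": (len(b85), len(strict_escape(b85))),
}
```

Each escaped character triples in length, so an alphabet that leans on punctuation can lose its advantage once escaping is applied. Python's base85 alphabet also contains `%` and `#`, which are awkward inside any URL regardless of how the data URL rules are read.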

Looking further for relaxed rules, the Wikipedia entry says:

The characters permitted within the data part include ASCII upper and lowercase letters, digits, and many ASCII punctuation and special characters. Note that this may include characters, such as colon, semicolon, and comma which are delimiters in the URI components preceding the data part.

That seems to match the experience reported in What is the smallest URL friendly encoding? (12 years ago!).

I'm intending to add a code sample to explore this but thought it was worth putting the question out there for people to report experiences or things I've missed.

Note that RFC2397 does not require you to use base64 to encode binary. If no media type is specified, it assumes you conform to the default text/plain;charset=US-ASCII.
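As a rough illustration of that grammar, here is a small Python parser for the `data:[mediatype][;base64],data` form that falls back to the default media type when none is given. `parse_data_url` is a hypothetical helper written for this question, not any library's API, and it deliberately ignores finer points such as parameter parsing.

```python
import base64
from urllib.parse import unquote_to_bytes

DEFAULT_MEDIATYPE = "text/plain;charset=US-ASCII"


def parse_data_url(url: str):
    """Split a data URL into (mediatype, payload bytes)."""
    scheme, sep, rest = url.partition(":")
    if scheme.lower() != "data" or not sep:
        raise ValueError("not a data URL")
    # Everything before the first comma is the optional header.
    header, comma, data = rest.partition(",")
    if not comma:
        raise ValueError("missing ',' before the data part")
    is_base64 = header.lower().endswith(";base64")
    if is_base64:
        header = header[: -len(";base64")]
    payload = unquote_to_bytes(data)  # resolves any %XX escapes
    if is_base64:
        payload = base64.b64decode(payload)
    return (header or DEFAULT_MEDIATYPE), payload


mediatype, payload = parse_data_url("data:,Hello%2C%20World")
```

Here `mediatype` comes back as the default and `payload` is `b"Hello, World"`. A non-base64 payload passes through untouched apart from `%XX` escapes, which is the behaviour a custom encoding would rely on.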
