swift - Converting string to data - What happens when the wrong encoding is used? - Stack Overflow

Let's say I've got a string containing characters that don't exist in ASCII.

When I use the correct encoding everything works fine.

let example = "Testing, ÜÄÖ ?ß 123 ..."
let data = example.data(using: .utf8)
let example2 = String(decoding: data!, as: UTF8.self)
print(example2) // Testing, ÜÄÖ ?ß 123 ...

When I change the encoding to 'String.Encoding.ascii', nil is returned. But what happens in the background? Is it that no bit combination can be found for the character?

How is each character transformed to data, and what happens if the transformation fails? Can someone explain it in simple terms?

Asked Mar 2 at 6:48 by cluster1
  • For more insight, you can refer to: developer.apple/documentation/swift/string/utf8view – Lovina Hajirawala, commented Mar 3 at 6:50

1 Answer


As the Wikipedia article on UTF-8 states:

It was designed for backward compatibility with ASCII: the first 128 characters of Unicode, which correspond one-to-one with ASCII, are encoded using a single byte with the same binary value as ASCII, so that a UTF-8-encoded file using only those characters is identical to an ASCII file.
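To make the quoted property concrete, here is a minimal sketch that prints the UTF-8 bytes of an ASCII character and of a non-ASCII one. The variable names are only illustrative; `utf8` and `isASCII` are standard library members.

```swift
// "A" is within the first 128 Unicode code points, "Ü" (U+00DC) is not.
let asciiLetter = "A"
let umlaut = "\u{00DC}" // Ü, written as an escape to pin down the exact scalar

// The utf8 view exposes the encoded bytes of a string.
let asciiBytes = Array(asciiLetter.utf8)
let umlautBytes = Array(umlaut.utf8)

print(asciiBytes)   // [65]       one byte, same value as in ASCII
print(umlautBytes)  // [195, 156] two bytes, both above 127

// A string fits into ASCII only if every scalar is an ASCII scalar.
let umlautIsASCII = umlaut.unicodeScalars.allSatisfy { $0.isASCII }
print(umlautIsASCII) // false
```

Every byte of a multi-byte UTF-8 sequence has its high bit set, so none of them can be mistaken for a 7-bit ASCII value.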

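Basically, ASCII is a subset of UTF-8: it only covers the 128 code points U+0000 to U+007F. Any character outside that range, which UTF-8 encodes with more than one byte, has no ASCII representation, so the conversion fails and data(using: .ascii) returns nil.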
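As a rough illustration of the failure case, the sketch below converts one string with both encodings and checks up front whether the conversion can succeed. The sample string is made up for the example; `data(using:allowLossyConversion:)` and `canBeConverted(to:)` come from Foundation.

```swift
import Foundation

// Sample string with scalars outside the ASCII range (Ü, Ä, Ö, ß).
let sample = "Testing, \u{00DC}\u{00C4}\u{00D6} \u{00DF} 123"

// UTF-8 can encode every Unicode scalar, so this conversion succeeds.
let utf8Data = sample.data(using: .utf8)

// The strict ASCII conversion has no byte for Ü, Ä, Ö or ß and returns nil.
let asciiData = sample.data(using: .ascii)

// Checking first avoids having to deal with the nil afterwards.
let fitsInASCII = sample.canBeConverted(to: .ascii)

// A lossy conversion returns data anyway; unrepresentable characters
// are replaced or dropped, so the original text cannot be recovered from it.
let lossyData = sample.data(using: .ascii, allowLossyConversion: true)

print(utf8Data != nil)  // true
print(asciiData == nil) // true
print(fitsInASCII)      // false
print(lossyData != nil) // true
```

Whether to fail or to convert lossily depends on the use case; for round-tripping text, staying with UTF-8 avoids the problem altogether.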
