Julia has a function codepoint which converts a character to its Unicode codepoint value, a UInt32. For ASCII text, this will be a value smaller than 128. It can be used to convert a String containing many Chars (Unicode codepoints) to an array of integer values.

How does one convert in the other direction? For example, if I have some numerical values (e.g. UInt8 values), how do I convert one of these values to a String?

There may be two paths to do this:
- Convert UInt8 to Char (32 bits in Julia) and then convert many Chars to a String?
- Convert UInt8 to String and then concatenate many length-1 strings into a larger String?
2 Answers
If you have UInt32, you can convert to Char and then String.
julia> codepoint.(collect("xyë"))
3-element Vector{UInt32}:
0x00000078
0x00000079
0x000000eb
julia> String(Char.(ans))
"xyë"
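As a minimal sketch of the codepoint route, the conversion can be wrapped in a small function that rejects values that are not valid Unicode codepoints before building the string. The name codepoints_to_string is hypothetical (it is not part of Base), and the validity check is an added assumption about what a caller would want rather than something the answer above requires.

```julia
# Hypothetical helper (not part of Base): build a String from codepoint values.
function codepoints_to_string(cps::AbstractVector{<:Unsigned})
    # isvalid(Char, x) is false for surrogates (0xD800-0xDFFF) and values above 0x10FFFF.
    all(cp -> isvalid(Char, cp), cps) ||
        throw(ArgumentError("input contains a value that is not a valid codepoint"))
    # Same conversion as in the answer: integers -> Char -> String.
    return String(Char.(cps))
end

cps = codepoint.(collect("xyë"))   # Vector{UInt32}
s = codepoints_to_string(cps)

# A single value works the same way: one integer -> one Char -> one-character String.
single = string(Char(0x41))
```

Because every UInt8 value is also a valid codepoint, the same function accepts a Vector{UInt8}, in which case each byte is read as one character rather than as part of a UTF-8 sequence.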
If you have UInt8 values that are UTF-8 code units (as returned by codeunits), you can go directly to String.
julia> UInt8.(codeunits("xyë"))
4-element Vector{UInt8}:
0x78
0x79
0xc3
0xab
julia> String(ans)
"xyë"
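The two readings of a UInt8 vector give different results once the text leaves ASCII, so here is a short sketch contrasting them. The variable names are illustrative only. One caveat worth hedging: String applied to a Vector{UInt8} may take ownership of the vector's memory and leave the vector empty, so the sketch copies first on the assumption that the bytes are still needed afterwards.

```julia
bytes = UInt8[0x78, 0x79, 0xc3, 0xab]   # the UTF-8 encoding of "xyë"

# Reading 1: the bytes are UTF-8 code units. Copy first, since String(bytes)
# may reuse the vector's memory and truncate it to length zero.
s_utf8 = String(copy(bytes))

# Reading 2: each byte is its own codepoint (a Latin-1 style reading).
# 0xc3 and 0xab then become two separate characters instead of one 'ë'.
s_bytewise = String(Char.(bytes))

# The second path from the question: make one-character strings, then join them.
s_joined = join(string(Char(b)) for b in UInt8[0x78, 0x79])
```

For pure ASCII input both readings agree; they diverge only for bytes at or above 0x80.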
Julia prior to version 0.7 used to have methods for handling non-UTF-8 strings, but that functionality was removed from the language and given to packages to implement. The official documentation mentions LegacyStrings.jl to emulate functions like utf32, which will convert a Vector{UInt32} to a UTF32String for you, and from there you can convert it to a String:
using LegacyStrings
hello = "Hello,