String from_utf8
In a Rust program we may have a vector of u8 values (bytes), but want a String. This conversion can be done with String::from_utf8 or from_utf8_lossy.
With just a slice (not an entire Vec) we can get a str (not a String) by calling str::from_utf8. And from_utf8_unchecked is an unsafe function that skips validation of the byte data.
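To illustrate the difference between the borrowing and owning conversions, here is a minimal sketch. The helper names str_view_shares_memory and string_reuses_buffer are hypothetical and exist only for this illustration. It compares pointers to suggest that str::from_utf8 returns a view of the original bytes, while String::from_utf8 takes over the vector's buffer rather than copying it.

```rust
use std::str;

/// Hypothetical helper: true if the &str view points at the same
/// memory as the input slice (no bytes were copied).
fn str_view_shares_memory(bytes: &[u8]) -> bool {
    match str::from_utf8(bytes) {
        Ok(text) => text.as_ptr() == bytes.as_ptr(),
        Err(_) => false,
    }
}

/// Hypothetical helper: true if the String reuses the Vec's heap buffer.
fn string_reuses_buffer(bytes: Vec<u8>) -> bool {
    // Remember where the vector's data lives before it is moved.
    let original_ptr = bytes.as_ptr();
    match String::from_utf8(bytes) {
        Ok(text) => text.as_ptr() == original_ptr,
        Err(_) => false,
    }
}

fn main() {
    let data = b"cats";
    println!("{}", str_view_shares_memory(data));
    println!("{}", string_reuses_buffer(data.to_vec()));
}
```

Both lines should print true for valid input, which is why these two functions are cheap: each one validates the bytes but does not duplicate them.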
This program creates a byte array (containing the bytes for the letters "cats") and then creates 2 Vecs from that array. We use Vec::from() and clone() to create them.
Part 1: We call the String::from_utf8 method in an if-let Ok statement. This gives us access to "value", which is the String.
Part 2: We use from_utf8 on str, which receives a byte slice (not a byte Vec). This gives us a str (not a String).
Part 3: We call str::from_utf8 with a different byte slice: we take a range-based slice of the byte array.
Part 4: With from_utf8_lossy, we get a Cow (a copy-on-write) string. We can use the Cow directly, like we would a string.
Part 5: To get an actual String struct after calling from_utf8_lossy, we can call to_string(), which copies the data into a new String (into_owned() would skip the copy when the Cow already owns its data).
Part 6: We call from_utf8_unchecked in an unsafe block. This skips validation, which is faster, but the String may be corrupt if the bytes are not valid UTF-8.

    use std::str;

    fn main() {
        let bytes = *b"cats";
        let bytes_vec = Vec::from(&bytes);
        let bytes_vec2 = bytes_vec.clone();

        // Part 1: use String from_utf8 with vector of bytes.
        if let Ok(value) = String::from_utf8(bytes_vec) {
            println!("{} {} {}", value, value.len(), value.to_uppercase());
        }

        // Part 2: use str from_utf8 on a byte slice.
        if let Ok(value) = str::from_utf8(&bytes) {
            println!("{} {} {}", value, value.len(), value.to_uppercase());
        }

        // Part 3: use str from_utf8 on another slice.
        if let Ok(value) = str::from_utf8(&bytes[0..3]) {
            println!("{} {} {}", value, value.len(), value.to_uppercase());
        }

        // Part 4: use String from_utf8_lossy on a byte slice.
        let temp = String::from_utf8_lossy(&bytes);
        println!("{} {} {}", temp, temp.len(), temp.to_uppercase());

        // Part 5: get actual String from from_utf8_lossy.
        let temp2 = String::from_utf8_lossy(&bytes).to_string();
        println!("{} {} {}", temp2, temp2.len(), temp2.to_uppercase());

        // Part 6: use String from_utf8_unchecked on vector of bytes.
        // Safety: the bytes are the ASCII letters "cats", which are valid UTF-8.
        unsafe {
            let value = String::from_utf8_unchecked(bytes_vec2);
            println!("{} {} {}", value, value.len(), value.to_uppercase());
        }
    }

    cats 4 CATS
    cats 4 CATS
    cat 3 CAT
    cats 4 CATS
    cats 4 CATS
    cats 4 CATS
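The program above only exercises the Ok path. As a rough sketch of what the failure path can look like, the following example passes an invalid byte (0xff) to String::from_utf8. The describe function is a hypothetical helper written for this illustration; the methods it calls on the error value (utf8_error, valid_up_to, into_bytes) are part of the standard library.

```rust
/// Hypothetical helper: summarizes the result of a byte-to-String conversion.
fn describe(bytes: Vec<u8>) -> String {
    match String::from_utf8(bytes) {
        Ok(text) => format!("valid: {}", text),
        Err(error) => {
            // valid_up_to() is the length of the longest valid UTF-8 prefix.
            let valid = error.utf8_error().valid_up_to();
            // into_bytes() hands the original vector back to the caller.
            let original = error.into_bytes();
            format!("invalid after {} of {} bytes", valid, original.len())
        }
    }
}

fn main() {
    println!("{}", describe(b"cats".to_vec()));
    // 0xff never appears in valid UTF-8, so this conversion fails.
    println!("{}", describe(vec![b'c', b'a', 0xff, b's']));
}
```

Because the error returns the original bytes, a caller can fall back to another strategy (for example from_utf8_lossy) without having cloned the vector up front.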
For the clearest syntax and the safest code, from_utf8() is probably the best choice. With from_utf8_lossy, we can avoid copies in some cases, and the end string will always be valid, because any invalid byte sequences are replaced with the U+FFFD replacement character.
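To make the "avoid copies in some cases" point concrete, here is a small sketch that inspects which Cow variant from_utf8_lossy returns. The lossy_kind function is a hypothetical helper used only for this example.

```rust
use std::borrow::Cow;

/// Hypothetical helper: reports whether from_utf8_lossy had to allocate.
fn lossy_kind(bytes: &[u8]) -> &'static str {
    match String::from_utf8_lossy(bytes) {
        // Valid input: the Cow just borrows the original bytes.
        Cow::Borrowed(_) => "borrowed",
        // Invalid input: a new String with replacement characters was built.
        Cow::Owned(_) => "owned",
    }
}

fn main() {
    println!("{}", lossy_kind(b"cats"));

    // 0xff is not valid UTF-8, so it is replaced with U+FFFD.
    let broken = [b'c', b'a', 0xff, b's'];
    println!("{}", lossy_kind(&broken));
    println!("{}", String::from_utf8_lossy(&broken));
}
```

For valid input no allocation is needed, so the copy only happens when a replacement is actually required.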
It is probably best to avoid from_utf8_unchecked, except in cases where performance is critical, as it may return an invalid String.
There are many ways to perform a byte vector to String conversion in Rust, but the safe from_utf8 methods are probably the best choice. They do not lead to invalid strings.