String from_utf8. In a Rust program we may have a vector of u8 values (bytes), but want a String. This conversion can be done with String from_utf8 or from_utf8_lossy.
With just a byte slice (not an entire Vec) we can call str::from_utf8 to get a str (not a String). And from_utf8_unchecked is an unsafe function that skips verifying the byte data.
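Example. This program creates a byte array (containing bytes for the letters "cats") and then creates 2 Vectors from the byte array. We use from() and clone() to create the Vectors.
Part 1 We pass the first Vector to String::from_utf8. This returns a Result, which holds a String only if the bytes are valid UTF-8.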
Part 2 It is possible to use the from_utf8 function from the std::str module, which receives a byte slice (not a byte Vector). This gives us a str (not a String).
Part 3 We use str from_utf8 with a different byte slice. Here we take a range-based slice of a byte array.
Part 4 With from_utf8_lossy, we get a Cow (a copy-on-write string). Any invalid byte sequences are replaced with the replacement character U+FFFD. We can use the Cow directly, like we would a string.
Part 5 If we need an actual String struct after calling from_utf8_lossy, we can call to_string(), which always copies the data into a new String. To copy only when necessary, we can call into_owned() on the Cow instead.
Part 6 Finally we call from_utf8_unchecked in an unsafe block. This skips the UTF-8 validation, which is faster, but the resulting String may be invalid if the bytes are not valid UTF-8.
use std::str;

fn main() {
    let bytes = *b"cats";
    let bytes_vec = Vec::from(&bytes);
    let bytes_vec2 = bytes_vec.clone();

    // Part 1: use String from_utf8 with vector of bytes.
    if let Ok(value) = String::from_utf8(bytes_vec) {
        println!("{} {} {}", value, value.len(), value.to_uppercase());
    }

    // Part 2: use str from_utf8 on a byte slice.
    if let Ok(value) = str::from_utf8(&bytes) {
        println!("{} {} {}", value, value.len(), value.to_uppercase());
    }

    // Part 3: use str from_utf8 on another slice.
    if let Ok(value) = str::from_utf8(&bytes[0..3]) {
        println!("{} {} {}", value, value.len(), value.to_uppercase());
    }

    // Part 4: use String from_utf8_lossy on a byte slice.
    let temp = String::from_utf8_lossy(&bytes);
    println!("{} {} {}", temp, temp.len(), temp.to_uppercase());

    // Part 5: get actual String from from_utf8_lossy.
    let temp2 = String::from_utf8_lossy(&bytes).to_string();
    println!("{} {} {}", temp2, temp2.len(), temp2.to_uppercase());

    // Part 6: use String from_utf8_unchecked on vector of bytes.
    unsafe {
        let value = String::from_utf8_unchecked(bytes_vec2);
        println!("{} {} {}", value, value.len(), value.to_uppercase());
    }
}

cats 4 CATS
cats 4 CATS
cat 3 CAT
cats 4 CATS
cats 4 CATS
cats 4 CATS
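Slices and multi-byte characters. In the example every byte is one ASCII letter, so any range is a valid string. With non-ASCII text a range can end in the middle of a character, and str from_utf8 then returns an error. The following is a minimal sketch of this. The decode_range helper is a hypothetical name introduced here for illustration only.

```rust
use std::str;

// Hypothetical helper: try to decode bytes[start..end] as UTF-8.
// Returns None if the range is out of bounds or not valid UTF-8.
fn decode_range(bytes: &[u8], start: usize, end: usize) -> Option<&str> {
    bytes.get(start..end).and_then(|slice| str::from_utf8(slice).ok())
}

fn main() {
    // The letter "é" is encoded as 2 bytes, so "café" is 5 bytes long.
    let bytes = "café".as_bytes();
    println!("{}", bytes.len());

    // This range ends on a character boundary, so it decodes.
    println!("{:?}", decode_range(bytes, 0, 3));

    // This range ends inside the 2-byte "é", so decoding fails.
    println!("{:?}", decode_range(bytes, 0, 4));

    // The full range decodes again.
    println!("{:?}", decode_range(bytes, 0, 5));
}
```

This should print 5, Some("caf"), None and Some("café"). Using get() with a range instead of indexing means an out-of-bounds range gives None rather than a panic.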
For the clearest syntax, and safest code, from_utf8() is probably the best choice. With from_utf8_lossy, no copy is made when the bytes are already valid UTF-8, and the end string will always be valid because invalid sequences are replaced.
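Lossy example. To see when from_utf8_lossy copies, we can inspect the Cow variant it returns. This is a small sketch, not part of the original program. The lossy_is_borrowed helper is a hypothetical name, and the 0xFF byte is used because it never appears in valid UTF-8.

```rust
use std::borrow::Cow;

// Hypothetical helper: true if from_utf8_lossy could borrow the bytes
// without allocating a new String.
fn lossy_is_borrowed(bytes: &[u8]) -> bool {
    matches!(String::from_utf8_lossy(bytes), Cow::Borrowed(_))
}

fn main() {
    let valid = *b"cats";
    let invalid = [b'c', b'a', 0xFF, b's'];

    // Valid bytes are borrowed as-is, with no copy.
    println!("{}", lossy_is_borrowed(&valid));

    // Invalid bytes force a new String to be built.
    println!("{}", lossy_is_borrowed(&invalid));

    // The invalid byte is replaced with U+FFFD, which is 3 bytes long.
    let repaired = String::from_utf8_lossy(&invalid);
    println!("{} {}", repaired, repaired.len());

    // into_owned() reuses the String that was already allocated here.
    let owned: String = repaired.into_owned();
    println!("{}", owned.len());
}
```

This should print true, false, then the repaired text with length 6, then 6 again.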
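Finally, it is best to avoid from_utf8_unchecked, except in cases where performance is critical and the bytes are already known to be valid UTF-8. Otherwise it may return an invalid String, which can cause memory unsafety when the String is used later.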
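Unchecked example. One situation where the unchecked call can be justified is when our own code produced every byte, so validity is known ahead of time. The sketch below is one possible illustration under that assumption. The hex_encode function is a hypothetical helper written for this example, not a standard library function.

```rust
// Hypothetical helper: encode bytes as lowercase hexadecimal text.
fn hex_encode(data: &[u8]) -> String {
    const DIGITS: &[u8; 16] = b"0123456789abcdef";
    let mut out = Vec::with_capacity(data.len() * 2);
    for &byte in data {
        // Push one ASCII digit for each half of the byte.
        out.push(DIGITS[(byte >> 4) as usize]);
        out.push(DIGITS[(byte & 0x0F) as usize]);
    }
    // Debug builds still verify the assumption below.
    debug_assert!(out.is_ascii());
    // SAFETY: every byte in out was copied from DIGITS, which is ASCII,
    // and ASCII bytes are always valid UTF-8.
    unsafe { String::from_utf8_unchecked(out) }
}

fn main() {
    println!("{}", hex_encode(b"cats"));
}
```

This should print 63617473. The debug_assert keeps a safety net in debug builds while release builds skip the validation pass.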
There are many ways to perform a byte vector to String conversion in Rust, but the safe from_utf8 methods are probably the best choice. They do not lead to invalid strings.
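Error example. The program above ignores the Err case with if let. When the bytes may be invalid, the error from String from_utf8 tells us how many leading bytes were valid and hands the original Vec back. The following is a sketch of one way to use that, with a lossy fallback. The decode_or_repair helper and its return shape are assumptions made for this example.

```rust
// Hypothetical helper: decode strictly, and fall back to a lossy decode.
// The second value is None on success, or the count of valid leading
// bytes when the input was not valid UTF-8.
fn decode_or_repair(bytes: Vec<u8>) -> (String, Option<usize>) {
    match String::from_utf8(bytes) {
        Ok(text) => (text, None),
        Err(error) => {
            // Index of the first byte that failed validation.
            let valid = error.utf8_error().valid_up_to();
            // into_bytes() returns the original Vec, so nothing is lost.
            let original = error.into_bytes();
            let repaired = String::from_utf8_lossy(&original).into_owned();
            (repaired, Some(valid))
        }
    }
}

fn main() {
    println!("{:?}", decode_or_repair(b"cats".to_vec()));
    println!("{:?}", decode_or_repair(vec![b'c', b'a', 0xFF, b's']));
}
```

The first call should give ("cats", None). The second should give the text with a U+FFFD replacement character and Some(2), since only the first 2 bytes were valid.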
Dot Net Perls is a collection of tested code examples. Pages are continually updated to stay current, with code correctness a top priority.
Sam Allen is passionate about computer languages. In the past, his work has been recommended by Apple and Microsoft and he has studied computers at a selective university in the United States.