String from_utf8 (Convert Bytes Vec to String)
This page was last reviewed on Jan 17, 2024.
Dot Net Perls
String from_utf8. In a Rust program we may have a vector of u8 values (bytes), but want a String. This conversion can be done with String from_utf8 or from_utf8_lossy.
With just a slice (not an entire Vector) we can get a str (not a String). And from_utf8_unchecked uses unsafe code to avoid verifying the byte data.
Example. This program creates a byte array (containing the bytes for the letters "cats") and then creates 2 Vectors from that byte array. We use from() and clone() to create the Vectors.
Part 1 Here we use the String from_utf8 method in an if-let Ok statement. This gives us access to "value" which is the String.
Part 2 It is possible to use from_utf8 on str, which receives a byte slice (not a byte Vector). This gives us a str (not a String).
Part 3 We use str from_utf8 with a different byte slice: we take a range-based slice of a byte array.
Part 4 With from_utf8_lossy, we get a Cow (copy-on-write) string. Any invalid byte sequences are replaced with the U+FFFD replacement character. We can use the Cow directly, like we would a string.
Part 5 If we need an actual String struct after calling from_utf8_lossy, we can call to_string(), which copies the data into a new String. (The Cow into_owned() method copies only when the Cow is borrowed.)
Part 6 Finally we call from_utf8_unchecked in an unsafe block. This skips UTF-8 validation, which is faster, but the String will be invalid if the bytes are not valid UTF-8.
use std::str;

fn main() {
    let bytes = *b"cats";
    let bytes_vec = Vec::from(&bytes);
    let bytes_vec2 = bytes_vec.clone();

    // Part 1: use String from_utf8 with vector of bytes.
    if let Ok(value) = String::from_utf8(bytes_vec) {
        println!("{} {} {}", value, value.len(), value.to_uppercase());
    }

    // Part 2: use str from_utf8 on a byte slice.
    if let Ok(value) = str::from_utf8(&bytes) {
        println!("{} {} {}", value, value.len(), value.to_uppercase());
    }

    // Part 3: use str from_utf8 on another slice.
    if let Ok(value) = str::from_utf8(&bytes[0..3]) {
        println!("{} {} {}", value, value.len(), value.to_uppercase());
    }

    // Part 4: use String from_utf8_lossy on a byte slice.
    let temp = String::from_utf8_lossy(&bytes);
    println!("{} {} {}", temp, temp.len(), temp.to_uppercase());

    // Part 5: get actual String from from_utf8_lossy.
    let temp2 = String::from_utf8_lossy(&bytes).to_string();
    println!("{} {} {}", temp2, temp2.len(), temp2.to_uppercase());

    // Part 6: use String from_utf8_unchecked on vector of bytes.
    unsafe {
        let value = String::from_utf8_unchecked(bytes_vec2);
        println!("{} {} {}", value, value.len(), value.to_uppercase());
    }
}
cats 4 CATS
cats 4 CATS
cat 3 CAT
cats 4 CATS
cats 4 CATS
cats 4 CATS
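The example program only exercises the Ok branch of from_utf8, because "cats" is valid UTF-8. As a minimal sketch of the failure path, the code below passes a vector containing the byte 0xFF, which is never valid in UTF-8. The helper name describe() is hypothetical and exists only for illustration. The error value reports how many leading bytes were valid, and it can hand the original vector back.

```rust
// Hypothetical helper: convert bytes to a String, or describe why that failed.
fn describe(bytes: Vec<u8>) -> String {
    match String::from_utf8(bytes) {
        Ok(value) => format!("ok: {}", value),
        Err(error) => {
            // valid_up_to() is the count of leading bytes that are valid UTF-8.
            let valid = error.utf8_error().valid_up_to();
            // into_bytes() returns the original vector, so no data is lost.
            let original = error.into_bytes();
            format!("invalid: {} valid byte(s) of {}", valid, original.len())
        }
    }
}

fn main() {
    // Valid input: prints "ok: cats".
    println!("{}", describe(vec![b'c', b'a', b't', b's']));
    // 0xFF is not valid UTF-8: prints "invalid: 2 valid byte(s) of 4".
    println!("{}", describe(vec![b'c', b'a', 0xFF, b's']));
}
```

Because from_utf8 takes ownership of the vector, into_bytes() is the way to get the data back after a failed conversion.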
For the clearest syntax, and safest code, from_utf8() is probably the best choice. With from_utf8_lossy, we can avoid copies in some cases, and the end string will be valid.
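To illustrate when from_utf8_lossy avoids a copy, here is a small sketch that inspects the returned Cow. The helper lossy_kind() is a hypothetical name used only for this illustration. Valid input should come back as a borrowed str with no allocation, while invalid input forces a new owned String with replacement characters.

```rust
use std::borrow::Cow;

// Hypothetical helper: report which Cow variant from_utf8_lossy returned.
fn lossy_kind(bytes: &[u8]) -> &'static str {
    match String::from_utf8_lossy(bytes) {
        // Valid UTF-8: the Cow just borrows the input bytes as a str.
        Cow::Borrowed(_) => "borrowed",
        // Invalid UTF-8: a new String was built with U+FFFD substitutions.
        Cow::Owned(_) => "owned",
    }
}

fn main() {
    let valid = b"cats";
    let invalid = [b'c', b'a', 0xFF, b's'];

    println!("{}", lossy_kind(valid));    // prints "borrowed"
    println!("{}", lossy_kind(&invalid)); // prints "owned"

    // The invalid byte is replaced with the U+FFFD replacement character.
    println!("{}", String::from_utf8_lossy(&invalid));
}
```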
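Finally It is best to avoid from_utf8_unchecked, except in cases where performance is critical and the bytes are already known to be valid UTF-8. Given invalid bytes, it returns an invalid String, and later use of that String can cause undefined behavior.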
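One situation where the unchecked call can be justified is when the program generates the bytes itself and can guarantee they are valid. The sketch below is an illustration under that assumption, not a recommendation: digits_to_string() is a hypothetical helper that produces only ASCII digits, and a debug_assert! re-validates the bytes in debug builds.

```rust
// Hypothetical helper: build a String from bytes this function generated itself.
fn digits_to_string(digits: &[u8]) -> String {
    // Map each value to an ASCII digit; ASCII bytes are always valid UTF-8.
    let bytes: Vec<u8> = digits.iter().map(|&d| b'0' + (d % 10)).collect();

    // Checked only in debug builds; release builds skip this scan.
    debug_assert!(std::str::from_utf8(&bytes).is_ok());

    // SAFETY: every byte is in the range b'0'..=b'9', so the vector is valid UTF-8.
    unsafe { String::from_utf8_unchecked(bytes) }
}

fn main() {
    // 13 % 10 is 3, so this prints "423".
    println!("{}", digits_to_string(&[4, 2, 13]));
}
```

If the bytes come from a file, the network, or any other external source, the safe from_utf8 call remains the appropriate choice.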
There are many ways to perform a byte vector to String conversion in Rust, but the safe from_utf8 methods are probably the best choice. They do not lead to invalid strings.
Sam Allen is passionate about computer languages. In the past, his work has been recommended by Apple and Microsoft and he has studied computers at a selective university in the United States.
This page was last updated on Jan 17, 2024 (new).
© 2007-2024 Sam Allen.