String split Examples
This page was last reviewed on Feb 7, 2023.
Dot Net Perls
Split. When processing text in Rust, we often need to separate the values in a string. Parts are separated by delimiter chars or strings.
Iterator notes. When we invoke split() we get an iterator. This can be used in a for-loop, or it can be collected into a Vec of string slices (&str).
First example. To begin, we use split() in the simplest way possible. We first declare a string literal (test) that has a delimiter char—here it is a semicolon.
Step 1 We invoke split, passing the semicolon as an argument—it is a char argument. We do not convert or collect the iterator.
Step 2 We loop over the resulting iterator with the for-in loop. We print each value, getting the results of split as we proceed.
fn main() {
    let test = "cat;bird";
    // Step 1: get iterator from splitting on character.
    let values = test.split(';');
    // Step 2: print all results from iterator.
    for v in values {
        println!("SPLIT: {}", v);
    }
}

SPLIT: cat
SPLIT: bird
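The delimiter can also be a string rather than a char. Here is a minimal sketch of that case; the helper name split_on_str and the sample text are assumptions made for illustration, and collect() is used only so the helper can return its results.

```rust
// Hypothetical helper (not from the original page): split on a string delimiter.
fn split_on_str<'a>(text: &'a str, delimiter: &str) -> Vec<&'a str> {
    // split() accepts a &str pattern as well as a char.
    text.split(delimiter).collect()
}

fn main() {
    let test = "cat, bird, frog";
    // The delimiter here is two characters long: a comma and a space.
    for v in split_on_str(test, ", ") {
        println!("SPLIT: {}", v);
    }
}
```

With a string delimiter, the whole delimiter must match, so a lone comma without the following space would not split the text.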
Delimiter function. Suppose we wish to have more complex logic that tests for a delimiter. We can use a closure, or a function, to test chars.
Here We have 2 chars we want to split on, the space and newline chars. The whitespace_test function returns true if the argument matches either one.
fn whitespace_test(c: char) -> bool {
    c == ' ' || c == '\n'
}

fn main() {
    let test = "cat dog\nbird";
    // Call split, using function to test separators.
    let values = test.split(whitespace_test);
    // Print results.
    for v in values {
        println!("SPLIT: {}", v);
    }
}

SPLIT: cat
SPLIT: dog
SPLIT: bird
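The same test can be written as a closure instead of a named function. This is a small sketch under that assumption; the helper name split_on_separators is hypothetical.

```rust
// Hypothetical helper: same separators as whitespace_test, written as a closure.
fn split_on_separators(text: &str) -> Vec<&str> {
    // The closure receives each char and returns true for separators.
    text.split(|c: char| c == ' ' || c == '\n').collect()
}

fn main() {
    for v in split_on_separators("cat dog\nbird") {
        println!("SPLIT: {}", v);
    }
}
```

A closure keeps the separator logic next to the split() call, which may be easier to read when the test is used in only one place.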
Split_whitespace. There is another function in the Rust standard library that always splits on any Unicode whitespace, treating a run of consecutive whitespace chars as one separator. This is split_whitespace.
Here We have a "terms" string that has 5 parts separated by various whitespace characters. We split it apart.
fn main() {
    let terms = "bird frog tree\n?\t!";
    // Split on whitespace.
    for term in terms.split_whitespace() {
        println!("{}", term);
    }
}

bird
frog
tree
?
!
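It may help to compare split_whitespace with split(' ') on text that has repeated or trailing spaces. The sketch below uses two hypothetical helpers, by_space and by_whitespace, and an assumed sample string.

```rust
// Hypothetical helpers for comparing the two approaches.
fn by_space(text: &str) -> Vec<&str> {
    // Splitting on a single space char yields empty strings between
    // consecutive spaces and after a trailing space.
    text.split(' ').collect()
}

fn by_whitespace(text: &str) -> Vec<&str> {
    // split_whitespace skips runs of whitespace, so no empty strings appear.
    text.split_whitespace().collect()
}

fn main() {
    // Two spaces between the words, and one trailing space.
    let terms = "bird  frog ";
    println!("{:?}", by_space(terms));      // ["bird", "", "frog", ""]
    println!("{:?}", by_whitespace(terms)); // ["bird", "frog"]
}
```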
Split ascii whitespace. If we have a string that is known to have only ASCII whitespace as delimiters, we can use split_ascii_whitespace. This is a good solution when we are sure the whitespace is all ASCII.
fn main() {
    let terms = "bird frog tree\n?\t!";
    // Has the same results as split_whitespace.
    for term in terms.split_ascii_whitespace() {
        println!("{}", term);
    }
}

bird
frog
tree
?
!
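The two functions differ once a non-ASCII whitespace char appears. The following sketch assumes a sample string containing U+3000 (ideographic space); the helper names unicode_split and ascii_split are hypothetical.

```rust
// Hypothetical helpers that wrap the two standard library calls.
fn unicode_split(text: &str) -> Vec<&str> {
    text.split_whitespace().collect()
}

fn ascii_split(text: &str) -> Vec<&str> {
    text.split_ascii_whitespace().collect()
}

fn main() {
    // U+3000 is Unicode whitespace, but it is not ASCII whitespace.
    let terms = "bird\u{3000}frog tree";
    // Splits at both the ideographic space and the ASCII space: 3 parts.
    println!("{}", unicode_split(terms).len());
    // Splits only at the ASCII space: 2 parts.
    println!("{}", ascii_split(terms).len());
}
```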
Split and parse. It is possible to split apart a string and parse each number in the string. This pattern is often used when parsing text files that contain numbers.
Info We split the string on spaces, and then parse each resulting string in the iterator with the parse() function.
fn main() {
    let test = "123 456";
    let values = test.split(' ');
    for v in values {
        // Parse each part.
        let parsed: u32 = v.parse().unwrap();
        // Add 1 to show that we have a u32 value.
        println!("SPLIT PARSE: {} {}", parsed, parsed + 1);
    }
}

SPLIT PARSE: 123 124
SPLIT PARSE: 456 457
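The unwrap() call above panics if any part is not a valid number. Here is a sketch of two possible alternatives; the helper names parse_numbers and try_parse_numbers are assumptions for illustration, not part of the original example.

```rust
use std::num::ParseIntError;

// Hypothetical helper: skip any part that fails to parse.
fn parse_numbers(text: &str) -> Vec<u32> {
    text.split(' ')
        .filter_map(|part| part.parse::<u32>().ok())
        .collect()
}

// Hypothetical helper: stop at the first part that fails to parse.
fn try_parse_numbers(text: &str) -> Result<Vec<u32>, ParseIntError> {
    // Collecting into a Result returns the first Err encountered, if any.
    text.split(' ').map(|part| part.parse::<u32>()).collect()
}

fn main() {
    let test = "123 abc 456";
    println!("{:?}", parse_numbers(test));              // [123, 456]
    println!("{}", try_parse_numbers(test).is_err());   // true
}
```

Which variant fits depends on whether invalid input should be ignored or reported.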
Collect. Suppose we want to get a Vec from the split function. The easiest way to do this is to call collect() with the turbofish operator to specify the desired type.
fn main() {
    let source = String::from("a,b,c");
    // Use collect to get a vector from split.
    let letters = source.split(',').collect::<Vec<&str>>();
    println!("{:#?}", letters);
}

[
    "a",
    "b",
    "c",
]
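A Vec<&str> borrows from the source string, so it cannot outlive it. If owned values are needed, one option is to map each part to a String before collecting. This is a sketch with a hypothetical helper named owned_parts.

```rust
// Hypothetical helper: return owned Strings instead of borrowed slices.
fn owned_parts(source: &str) -> Vec<String> {
    source.split(',').map(String::from).collect()
}

fn main() {
    let letters = {
        let source = String::from("a,b,c");
        // The returned Strings do not borrow from source.
        owned_parts(&source)
    };
    // The source string has been dropped, but letters is still valid.
    println!("{:?}", letters);
}
```

This copies each part, so it costs more than collecting slices; it is worth doing only when the parts must outlive the original string.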
Read file, split. Suppose we have a file of key-value pairs, with a key and value separated by an equal sign on each line. With Rust we can parse this file into a HashMap of string keys.
Start We open the file with File::open() and then loop over the lines() in the file. Then we split() each line.
Detail We collect the result of split, and then place the left side as the key, and the right side as the value in the HashMap.
Finally We get a key from the HashMap, which was populated by the file we just read in. The file text is shown in the example.
use std::io::{BufRead, BufReader};
use std::fs::File;
use std::collections::HashMap;

fn main() {
    // Open file of key-value pairs.
    let file = File::open("/Users/sam/example.txt").unwrap();
    let reader = BufReader::new(file);
    let mut hash: HashMap<String, String> = HashMap::new();
    // Read and parse file.
    for line in reader.lines() {
        let line_inner = line.unwrap();
        let values: Vec<&str> = line_inner.split('=').collect();
        if values.len() == 2 {
            hash.insert(values[0].to_string(), values[1].to_string());
        }
    }
    // Get value from file.
    let cat = hash.get("cat");
    if let Some(value) = cat {
        println!("VALUE FOUND: {}", value);
    }
}

bird=blue
cat=orange
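The same parsing logic can be tried without a file on disk. The sketch below assumes the text is already in memory and uses a hypothetical helper named parse_pairs; it also reads the split iterator directly rather than collecting a Vec for each line.

```rust
use std::collections::HashMap;

// Hypothetical helper: parse key=value lines from in-memory text.
fn parse_pairs(text: &str) -> HashMap<String, String> {
    let mut hash = HashMap::new();
    for line in text.lines() {
        let mut parts = line.split('=');
        // Accept only lines with exactly two parts, like the len() == 2 check.
        if let (Some(key), Some(value), None) = (parts.next(), parts.next(), parts.next()) {
            hash.insert(key.to_string(), value.to_string());
        }
    }
    hash
}

fn main() {
    // Assumed sample text: two valid pairs and one line with no equal sign.
    let text = "bird=blue\ncat=orange\nmalformed";
    let hash = parse_pairs(text);
    if let Some(value) = hash.get("cat") {
        println!("VALUE FOUND: {}", value);
    }
}
```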
Split once. Suppose we have a string and it has one separator, and we want to split apart the two sides of the string. This can be done with a split_once call.
And We can avoid complicated collect() method calls or loops. Just assign the result of split_once to a tuple pair. Because split_once returns an Option, the example calls unwrap() first.
Tip Many uses of split() can be replaced with split_once(), so it is a good function to know.
fn main() {
    let value = "left:right";
    // Get left and right from value.
    let (left, right) = value.split_once(":").unwrap();
    println!("left = {} right = {}", left, right);
}

left = left right = right
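When the separator may be missing, split_once returns None, and unwrap() would panic. Here is a sketch that handles both cases with a match; the helper name describe and the sample inputs are assumptions.

```rust
// Hypothetical helper: report both sides, or note that no separator was found.
fn describe(value: &str) -> String {
    match value.split_once(':') {
        Some((left, right)) => format!("left = {} right = {}", left, right),
        None => format!("no separator in {}", value),
    }
}

fn main() {
    // Only the first separator is used; the rest stays in the right side.
    println!("{}", describe("a:b:c")); // left = a right = b:c
    println!("{}", describe("plain")); // no separator in plain
}
```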
Collect benchmark. Often the split function is called with a following collect call. This is sometimes needlessly inefficient, because collect allocates a Vec that a for-loop does not need. We can avoid the collect.
Version 1 This version of the code calls split, and then calls collect to get a vector.
Version 2 Here we call split, but then directly use the result of the split call in a for-loop.
Result In the run shown, the version without collect finished in 19 ms, versus 62 ms with collect. When possible, use the result of split directly (as in a for-loop).
use std::time::Instant;

fn main() {
    let source = String::from("a,b,c");
    let t0 = Instant::now();
    // Version 1: call collect after split.
    for _ in 0..1000000 {
        let letters = source.split(',').collect::<Vec<&str>>();
        for _ in &letters {
        }
    }
    println!("{}", t0.elapsed().as_millis());
    // Version 2: avoid collect after split.
    let t1 = Instant::now();
    for _ in 0..1000000 {
        let letters = source.split(',');
        for _ in letters {
        }
    }
    println!("{}", t1.elapsed().as_millis());
}

62 ms    split, collect
19 ms    split
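Besides a for-loop, iterator methods can consume the result of split directly. This sketch shows two such uses with hypothetical helpers count_parts and second_part; no timing claims are made for them here.

```rust
// Hypothetical helper: count the parts without building a Vec.
fn count_parts(source: &str) -> usize {
    source.split(',').count()
}

// Hypothetical helper: get the part at index 1, if it exists.
fn second_part(source: &str) -> Option<&str> {
    source.split(',').nth(1)
}

fn main() {
    let source = String::from("a,b,c");
    println!("{}", count_parts(&source));    // 3
    println!("{:?}", second_part(&source));  // Some("b")
}
```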
A summary. The split() function returns an iterator, which we can loop over directly or collect into a Vec. We can pass functions (or closures) to split() for more complex behavior.
Sam Allen is passionate about computer languages. In the past, his work has been recommended by Apple and Microsoft and he has studied computers at a selective university in the United States.
This page was last updated on Feb 7, 2023 (new example).
© 2007-2024 Sam Allen.