Remove duplicates

In Rust we find a dedup function on vectors. It can be used to eliminate duplicates from a vector, but only from a sorted vector.
When using dedup(), we must either have kept the vector in sorted order, or sort it before calling dedup. Otherwise dedup and dedup_by_key remove only the duplicates that happen to be adjacent, and any other repeated values stay in the vector.
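To make the adjacency rule concrete, here is a minimal sketch that calls dedup on an unsorted vector of numbers. The helper name `dedup_only` and the sample values are made up for this illustration.

```rust
// Hypothetical helper: calls dedup without sorting first.
fn dedup_only(mut numbers: Vec<i32>) -> Vec<i32> {
    // Only runs of consecutive equal elements are collapsed.
    numbers.dedup();
    numbers
}

fn main() {
    // The two adjacent 3s collapse into one, but the earlier 3
    // and the two 1s are not adjacent to their duplicates, so they stay.
    let result = dedup_only(vec![3, 1, 3, 3, 2, 1]);
    println!("{:?}", result); // [3, 1, 3, 2, 1]
}
```

The result still contains repeated values, which is why the sort step is needed when the goal is to remove every duplicate.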
Consider this string vector: it has 3 animal names, and they are not sorted. We must first sort the collection before calling dedup.
Step 1: We call sort() with no argument to sort by value (alphabetically). The 2 birds are now at the start of the vector.
Step 2: We call dedup on the sorted list. One of the 2 birds was removed, but the fish remains in the vector.
Tip: Remove the sort() call, and you will see that dedup does not remove the second bird without it.

    fn main() {
        let mut animals = vec!["bird", "fish", "bird"];
        println!("{:?}", animals);

        // Step 1: sort the vector.
        animals.sort();
        println!("{:?}", animals);

        // Step 2: call dedup to remove duplicates in a sorted vector.
        animals.dedup();
        println!("{:?}", animals);
    }

    ["bird", "fish", "bird"]
    ["bird", "bird", "fish"]
    ["bird", "fish"]

In this example, we use the dedup_by_key function, along with sort_by_key. These 2 methods compute a key from each element and then compare the keys instead of the elements themselves.
Tip: We must pass a closure to the sort_by_key and dedup_by_key functions. It is easier to pass the closure directly as an argument.
Here: We use to_uppercase() to act upon all elements as though they are uppercase.

    fn main() {
        let mut animals = vec!["cat", "bird", "CAT"];

        // First sort the vector with keys.
        // ... Then remove duplicates with keys.
        animals.sort_by_key(|a| a.to_uppercase());
        animals.dedup_by_key(|a| a.to_uppercase());
        println!("{:?}", animals);
    }

    ["bird", "cat"]

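The key does not have to be a string. As a further sketch of the same pattern, the example below uses integer division as the key, so that numbers in the same group of ten count as duplicates. The helper name `one_per_decade` and the sample values are assumptions chosen for illustration.

```rust
// Hypothetical helper: keeps one number from each group of ten.
fn one_per_decade(mut numbers: Vec<u32>) -> Vec<u32> {
    // The key is the number divided by 10, so 20 and 21 share the key 2.
    // sort_by_key is a stable sort: equal keys keep their original order.
    numbers.sort_by_key(|n| *n / 10);
    // For each run of equal keys, only the first element is kept.
    numbers.dedup_by_key(|n| *n / 10);
    numbers
}

fn main() {
    let result = one_per_decade(vec![21, 10, 30, 20, 15]);
    println!("{:?}", result); // [10, 21, 30]
}
```

Because the sort is stable and dedup_by_key keeps the first element of each run, the survivor in each group is the one that appeared first in the original vector.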
In Rust, using dedup() and dedup_by_key() is often a 2-step process. We must first sort the vector before invoking these functions, because only adjacent duplicate elements are removed.
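If the 2 steps are needed in several places, they can be wrapped in a small generic function. This is only a sketch: `sort_and_dedup` is a hypothetical helper, not part of the standard library, and it assumes the element type implements `Ord`.

```rust
// Hypothetical helper, not a standard library function.
// Sorts the vector in place, then removes the now-adjacent duplicates.
fn sort_and_dedup<T: Ord>(items: &mut Vec<T>) {
    items.sort();
    items.dedup();
}

fn main() {
    let mut animals = vec!["bird", "fish", "bird"];
    sort_and_dedup(&mut animals);
    println!("{:?}", animals); // ["bird", "fish"]

    let mut numbers = vec![3, 1, 3, 2, 1];
    sort_and_dedup(&mut numbers);
    println!("{:?}", numbers); // [1, 2, 3]
}
```

Note that this changes the order of the elements, so it only fits cases where the original order does not matter.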