Split
Strings often contain blocks of data. With split, we separate these blocks based on a delimiter. In Ruby, a string, or a regular expression, is used as the separator.
Split details
This method is widely used. When we omit the argument, it separates a string on spaces (more precisely, on runs of whitespace). This is the default behavior.
Consider the input string here: it contains 3 parts. Each one is separated with a comma character, and there are internal spaces in each part.
We call split(), specifying a comma as the delimiter character. This separates those 3 parts. The result is an array of string elements. We loop over these with the "each" iterator.

    # Split this string on comma characters.
    input = "lowercase a,uppercase A,lowercase z"
    values = input.split(",")

    # Display each value to the console.
    values.each do |value|
        puts value
    end

Output:

    lowercase a
    uppercase A
    lowercase z
No arguments are required to split on a space character. The space delimiter is implicit: you do not need to specify it. This can make some programs easier to read.
Usually split() is called with a delimiter, so this may not be as expected.

    input = "a b c"

    # We do not specify an argument: space is implicit.
    values = input.split()
    puts values

Output:

    a
    b
    c
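The implicit delimiter is not a literal single space. As a small sketch (the input string and variable names here are made up for illustration), the no-argument form ignores leading and trailing whitespace and treats tabs and repeated spaces as one separator, while a regular expression that matches one space does not:

```ruby
# Hypothetical input: leading spaces, a double space, a tab, a trailing newline.
input = "  a  b\tc\n"

# No argument: split on runs of whitespace, ignoring leading and trailing whitespace.
values = input.split
p values

# A single-space string argument behaves the same way.
single_space = input.split(" ")
p single_space

# A regexp matching exactly one space keeps the empty entry between two spaces.
regexp_space = "a  b".split(/ /)
p regexp_space
```

If a program relies on exact single-space separation, the regular expression form is the one to reach for.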
Regexp
Split does not require a simple string argument. It can also act upon a regular expression (regexp). In Ruby we specify these with forward slashes.
The resulting string array has no empty values. It contains just the four words stored within the text.

    value = "one, two three: four"

    # Split on one or more non-word characters.
    a = value.split(/\W+/)

    # Display result.
    puts a

Output:

    one
    two
    three
    four

Pattern: \W+ matches one or more non-word characters.
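A regular expression is also handy when the delimiter is surrounded by uneven spacing. The following is a minimal sketch with a made-up input string; the pattern consumes the optional whitespace on both sides of each comma, so the parts come back already trimmed:

```ruby
# Hypothetical input with uneven spacing around the commas.
value = "cat , dog,bird ,  fish"

# Split on a comma together with any whitespace around it.
parts = value.split(/\s*,\s*/)
p parts
```

Whitespace at the very start or end of the string is not touched by this pattern, so a call to strip may still be needed there.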
Split accepts an optional second argument, a limit. This is the maximum number of array elements that are returned. If more elements are found than are allowed by the limit, the excess ones are grouped in the final array element.
With a limit of 3, the third element holds the rest of the string.

    # Contains five vegetable names.
    value = "carrot,squash,corn,broccoli,spinach"

    # Split with limit of 3.
    vegetables = value.split(",", 3)
    puts vegetables

Output:

    carrot
    squash
    corn,broccoli,spinach
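The limit argument also controls trailing empty entries. By default split drops them; a negative limit keeps them and places no cap on the number of elements. A short sketch with a made-up input:

```ruby
# Hypothetical input ending in two delimiters.
value = "a,b,,"

# Default: trailing empty strings are removed.
default_parts = value.split(",")
p default_parts

# Negative limit: trailing empty strings are kept.
all_parts = value.split(",", -1)
p all_parts
```

This matters when the number of columns in a line carries meaning, as in fixed-width records.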
Often the split method will return empty entries. These are caused by having two delimiters with no interior content. We can invoke delete_if to remove these empty elements.

    # Split on a comma.
    value = "cat,,dog,bird"
    elements = value.split(",")
    print elements, "\n"

    # Remove empty elements from the array.
    elements.delete_if { |e| e.empty? }
    print elements

Output:

    ["cat", "", "dog", "bird"]
    ["cat", "dog", "bird"]
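When the original array should stay unchanged, reject is an alternative to delete_if: it returns a new array instead of modifying the receiver. A minimal sketch, where non_empty_fields is a hypothetical helper name:

```ruby
# Hypothetical helper: split on commas and drop empty entries.
def non_empty_fields(line)
  line.split(",").reject(&:empty?)
end

elements = "cat,,dog,bird".split(",")
cleaned = elements.reject(&:empty?)

p elements  # Still contains the empty string.
p cleaned   # A new array without it.
p non_empty_fields("cat,,dog,bird")
```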
With split we can get the characters from a string. Pass an empty string literal ("") to the split method. The length of the array equals the length of the string.

    value = "xyz 1"

    # Separate chars.
    array = value.split ""

    # Write length.
    puts array.length

    # Write elements.
    print array

Output:

    5
    ["x", "y", "z", " ", "1"]
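Ruby strings also provide a chars method that returns the same array of characters. The sketch below compares the two on the same input:

```ruby
value = "xyz 1"

# chars returns each character as a one-character string.
chars = value.chars
p chars

# Compare against splitting on the empty string.
same = (chars == value.split(""))
p same
```

Which form to use is mostly a matter of readability; chars states the intent directly.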
Often we need to handle CSV files. We first use the IO.foreach iterator to easily loop over the lines in a text file. Each line must be chomped to remove the trailing newline.
Then we call split() on the commas. The parts between the comma chars are returned in an array.

    # Open this file (change file name for your program).
    IO.foreach("/files/csv.txt") do |line|
        # Remove trailing newline.
        line.chomp!

        # Split on comma.
        values = line.split(",")

        # Write results.
        print values.join("+") << "... " << String(values.length) << "\n"
    end

Input file:

    cat,tiger,meow,100
    airplane,bird,200
    tree,grove,400
    sand,beach,fish,50

Output:

    cat+tiger+meow+100... 4
    airplane+bird+200... 3
    tree+grove+400... 3
    sand+beach+fish+50... 4
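The same chomp-then-split steps can be tried without a file on disk. This sketch assumes the data is already in a string (the data variable is a stand-in for the file contents) and uses each_line to walk the lines:

```ruby
# Hypothetical in-memory data standing in for the file contents.
data = "cat,tiger,meow,100\nairplane,bird,200\n"

# Chomp each line, then split it on commas.
rows = data.each_line.map { |line| line.chomp.split(",") }

# Print each row in the same format as the file example.
rows.each do |values|
  puts "#{values.join("+")}... #{values.length}"
end
```

Collecting the rows into an array first makes them easy to inspect or test before printing.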
Parse integers
Often we need to parse integer values that are in a CSV format. We first split the line, and then use Integer to convert each string.

    line = "100,200,300"

    # Split on the comma char.
    values = line.split(",")

    # Parse each number in the result array.
    values.each do |v|
        number = Integer(v)
        # Display number if it is greater than or equal to 200.
        if number >= 200
            puts number
        end
    end

Output:

    200
    300
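The split-then-convert step can also be written with map, which yields an array of integers to filter afterwards. A minimal sketch, where parse_integers is a hypothetical helper name; note that Integer raises ArgumentError when a field is not a valid number:

```ruby
# Hypothetical helper: split a comma-separated line and convert each part.
def parse_integers(line)
  line.split(",").map { |v| Integer(v) }
end

numbers = parse_integers("100,200,300")

# Keep only the values greater than or equal to 200.
large = numbers.select { |n| n >= 200 }
p large

# A non-numeric field raises ArgumentError.
begin
  parse_integers("100,abc")
rescue ArgumentError => e
  puts "Could not parse: #{e.message}"
end
```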
Join
This is the opposite of split. It merges together values in an array. With join and split we can parse a string, modify the values, and return the string to its original state.
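A round trip through split and join might look like the following sketch. The double_values helper is a hypothetical name used only for illustration; it parses the fields, changes them, and joins them back with the same delimiter:

```ruby
# Hypothetical helper: double each integer in a comma-separated line.
def double_values(line)
  line.split(",").map { |v| Integer(v) * 2 }.join(",")
end

doubled = double_values("1,2,3")
puts doubled

# With no modification, join restores the original string.
original = "carrot,squash,corn"
restored = original.split(",").join(",")
p restored == original
```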
In CSV files, input lines contain separating characters. For simple files we do not need a special parsing method to extract the inner strings. Split(), called with the delimiter character, works well.
We learned how to split based on a string delimiter. A regular expression offers more power. And finally we used join to combine strings in an Array.