Java Split Examples

Separate strings on a delimiter with the split method. Split lines of a file.

Split. Often strings are read in from lines of a file. And these lines have many parts, separated by delimiters. We use split() to break them apart.

Regex. Split in Java uses a Regex. A single character (like a comma) can be split upon. Or a more complex pattern (with character classes such as \W) can be used.
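Because the delimiter is a Regex, characters such as the period or the vertical bar have a special meaning and must be escaped to be matched literally. The sketch below shows one way to handle this with Pattern.quote from java.util.regex. The class name EscapeDelimiterExample and the helper splitLiteral are hypothetical names used only for illustration.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical class name, used only for this illustration.
class EscapeDelimiterExample {
    // Splits on a literal delimiter. Pattern.quote removes any special
    // Regex meaning from the delimiter's characters.
    static String[] splitLiteral(String value, String delimiter) {
        return value.split(Pattern.quote(delimiter));
    }

    public static void main(String[] args) {
        String version = "1.2.3";

        // An unescaped period matches every character, so no parts survive.
        System.out.println(version.split(".").length); // 0

        // Escaping the period splits on the literal character.
        System.out.println(Arrays.toString(version.split("\\."))); // [1, 2, 3]

        // Pattern.quote does the escaping for us.
        System.out.println(Arrays.toString(splitLiteral(version, "."))); // [1, 2, 3]
    }
}
```

Quoting is mostly useful when the delimiter comes from user input or configuration and its characters are not known in advance.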

A simple example. Let's begin with this example. We introduce a string that has 2 commas in it, separating 3 strings (cat, dog, bird). We split on a comma.

For: Split returns a String array. We then loop over that array's elements with a for-each loop. We display them.

Java program that uses split

    public class Program {
        public static void main(String[] args) {
            // This string has three words separated by commas.
            String value = "cat,dog,bird";
            // Split on a comma.
            String[] parts = value.split(",");
            // Display result parts.
            for (String part : parts) {
                System.out.println(part);
            }
        }
    }

Output

    cat
    dog
    bird

Split lines in file. Here we use BufferedReader and FileReader to read in a text file. Then, while looping over it, we split each line. In this way we parse a CSV file with split.

Println: Finally we use the System.out.println method to display each part from each line to the screen.

Contents: file.txt

    carrot,squash,turnip
    potato,spinach,kale

Java program that reads file, splits lines

    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.io.IOException;

    public class Program {
        public static void main(String[] args) throws IOException {
            // Open this file.
            BufferedReader reader = new BufferedReader(new FileReader(
                "C:\\programs\\file.txt"));
            // Read lines from file.
            while (true) {
                String line = reader.readLine();
                if (line == null) {
                    break;
                }
                // Split line on comma.
                String[] parts = line.split(",");
                for (String part : parts) {
                    System.out.println(part);
                }
                // Blank line after each line's parts.
                System.out.println();
            }
            reader.close();
        }
    }

Output

    carrot
    squash
    turnip

    potato
    spinach
    kale
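A variation on the file-reading loop is sketched here with try-with-resources, which closes the reader even when an exception is thrown. The class name CsvLinesExample and the method splitLines are hypothetical, and a StringReader stands in for the FileReader so the sketch runs without a file on disk. Like the program above, it assumes no field contains a quoted comma.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical class name, used only for this illustration.
class CsvLinesExample {
    // Reads every line from the source and splits it on commas.
    static List<String[]> splitLines(Reader source) {
        List<String[]> rows = new ArrayList<>();
        // try-with-resources closes the reader when the block exits.
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                rows.add(line.split(","));
            }
        } catch (IOException e) {
            // Rethrow as unchecked so callers need no throws clause.
            throw new UncheckedIOException(e);
        }
        return rows;
    }

    public static void main(String[] args) {
        // A StringReader stands in for a FileReader here.
        Reader source = new StringReader("carrot,squash,turnip\npotato,spinach,kale");
        for (String[] row : splitLines(source)) {
            System.out.println(String.join(" | ", row));
        }
    }
}
```

To read a real file, a FileReader for the file's path could be passed to splitLines in place of the StringReader.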

Either character. Often data is inconsistent. Sometimes we need to split on a range or set of characters. With split, this is possible. Here we split on a comma and a colon.

Tip: With square brackets, we specify the possible characters to split upon. So we split on all colons and commas, with one call.

Java program that splits on either character

    public class Program {
        public static void main(String[] args) {
            String line = "carrot:orange,apple:red";
            // Split on comma or colon.
            String[] parts = line.split("[,:]");
            for (String part : parts) {
                System.out.println(part);
            }
        }
    }

Output

    carrot
    orange
    apple
    red
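Splitting on both characters at once flattens the data into one array. When the pairing matters, one possible approach is to split twice: first on commas, then on the colon inside each pair. The sketch below assumes the "name:value" layout of the sample line. The class name PairSplitExample and the method parsePairs are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical class name, used only for this illustration.
class PairSplitExample {
    // Splits "key:value,key:value" text into a map that keeps insertion order.
    static Map<String, String> parsePairs(String line) {
        Map<String, String> pairs = new LinkedHashMap<>();
        for (String pair : line.split(",")) {
            // A limit of 2 keeps any extra colons inside the value.
            String[] keyValue = pair.split(":", 2);
            // Skip entries that have no colon at all.
            if (keyValue.length == 2) {
                pairs.put(keyValue[0], keyValue[1]);
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        System.out.println(parsePairs("carrot:orange,apple:red"));
        // {carrot=orange, apple=red}
    }
}
```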

Count, separate words. We can use more advanced character patterns in split. Here we separate a String based on non-word characters. We use the pattern \W+ (written "\\W+" in Java source) to mean this.

Pattern: The pattern means "one or more non-word characters." A plus means "one or more" and \W means a non-word character.

Note: The comma and its following space are treated as a single delimiter. So two characters are matched as one delimiter.

Java program that counts, splits words

    public class Program {
        public static void main(String[] args) {
            String line = "hello, how are you?";
            // Split on 1+ non-word characters.
            String[] words = line.split("\\W+");
            // Count words.
            System.out.println(words.length);
            // Display words.
            for (String word : words) {
                System.out.println(word);
            }
        }
    }

Output

    4
    hello
    how
    are
    you

Numbers. This example splits a string apart and then uses parseInt to convert those parts into ints. It splits on a two-char sequence. Then in a loop, it calls parseInt on each String.

Java program that uses split, parseInt

    public class Program {
        public static void main(String[] args) {
            String line = "1, 2, 3";
            // Split on two-char sequence.
            String[] numbers = line.split(", ");
            // Display numbers.
            for (String number : numbers) {
                int value = Integer.parseInt(number);
                System.out.println(value + " * 20 = " + value * 20);
            }
        }
    }

Output

    1 * 20 = 20
    2 * 20 = 40
    3 * 20 = 60
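The two-char delimiter only works when every comma is followed by exactly one space. A more tolerant variant is sketched below: it allows any amount of whitespace around each comma. The class name ParseNumbersExample and the method parseInts are hypothetical. Input that is not a number (including an empty string) still causes Integer.parseInt to throw a NumberFormatException.

```java
import java.util.Arrays;

// Hypothetical class name, used only for this illustration.
class ParseNumbersExample {
    // Parses comma-separated ints, allowing whitespace around the commas.
    static int[] parseInts(String line) {
        // trim() removes outer whitespace; \s* absorbs whitespace near commas.
        String[] parts = line.trim().split("\\s*,\\s*");
        int[] values = new int[parts.length];
        for (int i = 0; i < parts.length; i++) {
            values[i] = Integer.parseInt(parts[i]);
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parseInts("1,2 ,  3"))); // [1, 2, 3]
    }
}
```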

Limit. Split accepts an optional second parameter, a limit int. If we provide a positive limit, the result array has (at most) that many elements. Any extra parts remain part of the last element.

Info: The delimiter is always a Regex, with or without a limit argument. The vertical bar means "or" in a Regex, so here we escape it so it is treated like a normal char.

Here: We get the first 2 parts split apart correctly, and the third part has all the remaining (unsplit) parts.

Java program that uses split with limit

    public class Program {
        public static void main(String[] args) {
            String value = "a|b|c|d|e";
            // Use limit of just 3 parts.
            // ... Escape the bar for a Regex.
            String[] parts = value.split("\\|", 3);
            // Only 3 elements are in the result array.
            for (String part : parts) {
                System.out.println(part);
            }
        }
    }

Output

    a
    b
    c|d|e
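The limit also controls what happens to trailing empty strings. With no limit (or a limit of 0) they are dropped from the result, and with a negative limit they are kept. The sketch below compares the three cases. The class name LimitExample and the helper fields are hypothetical.

```java
import java.util.Arrays;

// Hypothetical class name, used only for this illustration.
class LimitExample {
    // Keeps empty trailing fields by passing a negative limit.
    static String[] fields(String line) {
        return line.split(",", -1);
    }

    public static void main(String[] args) {
        String value = "a,b,,";

        // No limit: trailing empty strings are dropped.
        System.out.println(Arrays.toString(value.split(","))); // [a, b]

        // Negative limit: no cap on parts, trailing empty strings are kept.
        System.out.println(Arrays.toString(fields(value))); // [a, b, , ]

        // Positive limit: at most 2 parts, the rest stays in the last one.
        System.out.println(Arrays.toString(value.split(",", 2))); // [a, b,,]
    }
}
```

A negative limit can matter for data where the number of columns is significant, such as rows with empty cells at the end.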

Pattern.compile, split. A split method is available on the Pattern class, found in java.util.regex. We can compile a Pattern and reuse it many times. This can enhance performance.

Note: A compiled Pattern can be reused, so the Regex does not need to be compiled again on each split() call. But this only helps if many splits are done.

Java program that uses Pattern.compile, split

    import java.util.regex.Pattern;

    public class Program {
        public static void main(String[] args) {
            // Separate based on number delimiters.
            Pattern p = Pattern.compile("\\d+");
            String value = "abc100defgh9ij";
            String[] elements = p.split(value);
            // Display our results.
            for (String element : elements) {
                System.out.println(element);
            }
        }
    }

Output

    abc
    defgh
    ij
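To reuse a Pattern across many calls, one common arrangement is to store it in a static final field. The sketch below does this and also uses Pattern.splitAsStream (available since Java 8) to filter out empty parts, such as the one produced when the input begins with digits. The class name ReusedPatternExample and the method lettersOnly are hypothetical.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical class name, used only for this illustration.
class ReusedPatternExample {
    // Compiled once when the class loads, then shared by every call.
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // Returns the non-empty parts found between runs of digits.
    static List<String> lettersOnly(String value) {
        return DIGITS.splitAsStream(value)
                .filter(part -> !part.isEmpty())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(lettersOnly("abc100defgh9ij")); // [abc, defgh, ij]
        System.out.println(lettersOnly("7up42down"));      // [up, down]
    }
}
```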

Benchmark, pattern split. We can improve the speed of splitting strings based on regular expressions by using Pattern.compile. We create a delimiter pattern. Then we call split() with it.

Version 1: This version of the code uses Pattern split(): it reuses the same Pattern instance many times.

Version 2: This code uses split() with a Regex argument, so it does not reuse the same Regex.

Result: When many Strings are split, calling Pattern.compile once and then using the Pattern's split method improves performance.

Java program that times Pattern split

    import java.util.regex.Pattern;

    public class Program {
        public static void main(String[] args) {
            // ... Create a delimiter pattern.
            Pattern pattern = Pattern.compile("\\W+");
            String line = "cat; dog--ABC";

            long t1 = System.currentTimeMillis();

            // Version 1: use split method on Pattern.
            for (int i = 0; i < 1000000; i++) {
                String[] values = pattern.split(line);
                if (values.length != 3) {
                    System.out.println(false);
                }
            }

            long t2 = System.currentTimeMillis();

            // Version 2: use String split method.
            for (int i = 0; i < 1000000; i++) {
                String[] values = line.split("\\W+");
                if (values.length != 3) {
                    System.out.println(false);
                }
            }

            long t3 = System.currentTimeMillis();

            // ... Benchmark results.
            System.out.println(t2 - t1);
            System.out.println(t3 - t2);
        }
    }

Output

    471 ms, Pattern split
    549 ms, String split

Join. This method combines Strings together. We specify our desired delimiter String. Join is sophisticated. It can handle a String array or individual Strings.
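As a brief illustration of join as the reverse of split, the sketch below uses String.join (available since Java 8) with both a String array and individual Strings. The class name JoinExample and the helper changeDelimiter are hypothetical.

```java
// Hypothetical class name, used only for this illustration.
class JoinExample {
    // Replaces one delimiter with another by splitting and joining again.
    static String changeDelimiter(String value, String oldRegex, String newDelimiter) {
        return String.join(newDelimiter, value.split(oldRegex));
    }

    public static void main(String[] args) {
        // Join the elements of a String array.
        String[] parts = "cat,dog,bird".split(",");
        System.out.println(String.join("-", parts)); // cat-dog-bird

        // Join individual Strings.
        System.out.println(String.join("+", "a", "b", "c")); // a+b+c

        // Split on a comma, then join with a colon.
        System.out.println(changeDelimiter("1,2,3", ",", ":")); // 1:2:3
    }
}
```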

Word count. We can count the words in a string by splitting the string on non-word (or space) characters. This is not the fastest method, but it tends to be a fairly accurate one.
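A minimal word-count sketch follows. When a string begins with a non-word character, split returns an empty first element, so this version strips leading non-word characters before splitting. The class name WordCountExample and the method countWords are hypothetical. Because \W treats an apostrophe as a separator, a word like "don't" counts as 2 words here.

```java
// Hypothetical class name, used only for this illustration.
class WordCountExample {
    // Counts runs of word characters in the text.
    static int countWords(String text) {
        // Remove leading non-word characters so split returns no empty first part.
        String trimmed = text.replaceAll("^\\W+", "");
        if (trimmed.isEmpty()) {
            return 0;
        }
        // Trailing empty parts are dropped by split, so no end trimming is needed.
        return trimmed.split("\\W+").length;
    }

    public static void main(String[] args) {
        System.out.println(countWords("hello, how are you?")); // 4
        System.out.println(countWords("  (hi) there"));        // 2
        System.out.println(countWords("..."));                 // 0
    }
}
```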

With split, we use a regular expression-based pattern. But for simple cases, we provide the delimiter itself as the pattern. This too works. Split is elegant and powerful.
Dot Net Perls
© 2007-2020 Sam Allen.