
C# CSV Methods (Parse and Segment)

This C# example page parses a CSV text string. It also splits a large CSV file into several smaller files.

CSV files.

A comma-separated values file stores data. It separates each unit with a comma character. We can use built-in methods like Split() to parse CSV files.

For complex situations, we may want to separate a CSV file into 2 or more segments. This can allow easier uploading. An example method is here.

First example.

To begin, we see the Split() method. This approach to handling a CSV file is well-covered in the Split article. But it is worth reviewing.
C# program that parses CSV string

using System;

class Program
{
    static void Main()
    {
        string text = "field one,field2,description,identity";
        // Split the CSV on a comma.
        string[] parts = text.Split(',');
        foreach (string value in parts)
        {
            Console.WriteLine(value);
        }
    }
}

Output

field one
field2
description
identity
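Caution: Split(',') treats every comma as a separator, including commas inside quoted fields. The sketch below contrasts it with TextFieldParser from the Microsoft.VisualBasic.FileIO namespace. The sample line and the QuotedCsvDemo class name are made up for illustration, and it assumes the project can reference the Microsoft.VisualBasic assembly.

```csharp
using System;
using System.IO;
using Microsoft.VisualBasic.FileIO;

static class QuotedCsvDemo
{
    // Parse one CSV line, keeping double-quoted fields intact.
    public static string[] ParseLine(string line)
    {
        using (var parser = new TextFieldParser(new StringReader(line)))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            parser.HasFieldsEnclosedInQuotes = true;
            return parser.ReadFields();
        }
    }
}

class Program
{
    static void Main()
    {
        // Hypothetical sample: the second field contains a comma.
        string text = "field one,\"Smith, John\",description";

        // Split cuts the quoted field in two, giving 4 pieces.
        Console.WriteLine(text.Split(',').Length);

        // TextFieldParser returns 3 fields and strips the quotes.
        foreach (string value in QuotedCsvDemo.ParseLine(text))
        {
            Console.WriteLine(value);
        }
    }
}
```

For simple data with no quoted fields, Split remains the shorter option.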

Separation example.

This method separates CSV files. It turns a file into smaller files containing parts of the original data. Sometimes you can only upload 1 MB sections.

Example: Here we see a static class. It divides a large input CSV file, such as example.csv, into smaller files of about one megabyte each.

Here: Pay attention to the method call in the Main method, which specifies files of 1024 times 1024 bytes, or one megabyte.


File: We use File.ReadAllLines to read in the entire source CSV file. In the for-loop, the method keeps a running total of the size of each line plus its line break.

And: When adding the next line would reach the maximum length in bytes, it starts a new output file. It generates file names "split_00.txt", "split_01.txt" and more.

C# program that segments CSV files

using System;

class Program
{
    static void Main()
    {
        // Split this CSV file into 1 MB chunks.
        CSVSplitTool.SplitCSV("example.csv", "split", 1024 * 1024);
    }
}

/// <summary>
/// Tool for splitting CSV files at a certain byte size on a line break.
/// </summary>
static class CSVSplitTool
{
    /// <summary>
    /// Split CSV files on line breaks before a certain size in bytes.
    /// </summary>
    public static void SplitCSV(string file, string prefix, int size)
    {
        // Read lines from source file
        string[] arr = System.IO.File.ReadAllLines(file);

        int total = 0;
        int num = 0;
        var writer = new System.IO.StreamWriter(GetFileName(prefix, num));

        // Loop through all source lines
        for (int i = 0; i < arr.Length; i++)
        {
            // Current line
            string line = arr[i];
            // Length of current line in bytes (StreamWriter writes UTF-8 by default)
            int length = System.Text.Encoding.UTF8.GetByteCount(line);

            // See if adding this line would exceed the size threshold
            if (total + length >= size)
            {
                // Create a new file
                num++;
                total = 0;
                writer.Dispose();
                writer = new System.IO.StreamWriter(GetFileName(prefix, num));
            }

            // Write the line to the current file
            writer.WriteLine(line);

            // Add length of line in bytes to running size
            total += length;

            // Add size of newlines
            total += Environment.NewLine.Length;
        }
        writer.Dispose();
    }

    /// <summary>
    /// Get an output file name based on a number.
    /// </summary>
    static string GetFileName(string prefix, int num)
    {
        return prefix + "_" + num.ToString("00") + ".txt";
    }
}
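Note: For very large inputs, a variant could stream the source with File.ReadLines instead of loading every line into an array. The version below is only a sketch under assumptions: the CsvStreamSplitter name, the int return value and the UTF-8 byte count are not from the method above, while the threshold test and the file naming mirror it.

```csharp
using System;
using System.IO;
using System.Text;

static class CsvStreamSplitter
{
    // Splits 'file' into numbered parts and returns how many were written.
    public static int Split(string file, string prefix, int size)
    {
        int total = 0;
        int num = 0;
        var writer = new StreamWriter(GetFileName(prefix, num));
        try
        {
            // File.ReadLines yields one line at a time.
            foreach (string line in File.ReadLines(file))
            {
                // Assumption: output is UTF-8, the StreamWriter default.
                int length = Encoding.UTF8.GetByteCount(line);

                // Same threshold test as the array-based method.
                if (total + length >= size)
                {
                    writer.Dispose();
                    num++;
                    total = 0;
                    writer = new StreamWriter(GetFileName(prefix, num));
                }

                writer.WriteLine(line);
                total += length + Environment.NewLine.Length;
            }
        }
        finally
        {
            // Close the last part even if reading fails midway.
            writer.Dispose();
        }
        return num + 1;
    }

    static string GetFileName(string prefix, int num)
    {
        return prefix + "_" + num.ToString("00") + ".txt";
    }
}

class Program
{
    static void Main()
    {
        // Only runs the split when the sample file is present.
        if (File.Exists("example.csv"))
        {
            int files = CsvStreamSplitter.Split("example.csv", "split", 1024 * 1024);
            Console.WriteLine(files);
        }
    }
}
```

The streaming form holds only the current line in memory, which may matter when the source file is far larger than the segment size.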

Verify.

Here we verify that the method works. The example CSV file is 6,409,636 bytes and contains 60,000 lines, each with 10 fields.

And: Each field is a random number. The sum of the seven output files is 6.11 MB, which is the same as the input file.

Results: The first six output files are 1024 KB each. This is displayed as 0.99 MB in the file manager. The final file is 116 KB.

Also: The lines in the output files were also checked for accuracy. The first file split occurs after line 9816.

Therefore: Line 9816 is the final line in the first output file, and line 9817 is the first line in the second output file.
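Note: Checks like these can be automated. The sketch below is one possible approach, not the procedure used for the results above. The SplitVerifier class and its method names are hypothetical, and it assumes the output files are named "split_NN.txt" in the current directory, with fewer than 100 parts so that sorting by name keeps them in order.

```csharp
using System;
using System.IO;

static class SplitVerifier
{
    // True when the parts, read in order, hold exactly the source's lines.
    public static bool LinesMatch(string source, string[] parts)
    {
        using (var e = File.ReadLines(source).GetEnumerator())
        {
            foreach (string part in parts)
            {
                foreach (string line in File.ReadLines(part))
                {
                    if (!e.MoveNext() || e.Current != line)
                    {
                        return false;
                    }
                }
            }
            // The source must have no lines left over.
            return !e.MoveNext();
        }
    }

    // Sum of the part sizes in bytes.
    public static long TotalBytes(string[] parts)
    {
        long sum = 0;
        foreach (string part in parts)
        {
            sum += new FileInfo(part).Length;
        }
        return sum;
    }
}

class Program
{
    static void Main()
    {
        string[] parts = Directory.GetFiles(".", "split_*.txt");
        Array.Sort(parts, StringComparer.Ordinal);

        if (File.Exists("example.csv") && parts.Length > 0)
        {
            Console.WriteLine(SplitVerifier.LinesMatch("example.csv", parts));
            Console.WriteLine(SplitVerifier.TotalBytes(parts));
            Console.WriteLine(new FileInfo("example.csv").Length);
        }
    }
}
```

The byte totals may differ slightly from the source if the source uses a different line-break style, lacks a final line break, or starts with a byte order mark. The line comparison is not affected by those differences.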

Summary.

This static method splits CSV files based on byte size. You can use it to split your CSV files at any size boundary. This is useful for importing CSV files into a database.
Dot Net Perls
© 2007-2019 Sam Allen. All rights reserved. Written by Sam Allen, info@dotnetperls.com.