CSV files. A comma-separated values (CSV) file stores tabular data as text. Each field on a line is separated from the next by a comma character. For simple files, we can use built-in methods like Split() to parse each line.
For complex situations, we may want to split a CSV file into 2 or more smaller files. This can make uploading easier when a size limit applies. An example method is here.
First example. To begin, we see the Split() method. This approach to handling a CSV file is well-covered in the Split article. But it is worth reviewing.
using System;

class Program
{
    static void Main()
    {
        string text = "field one,field2,description,identity";
        // Split the CSV line on a comma.
        string[] parts = text.Split(',');
        foreach (string value in parts)
        {
            Console.WriteLine(value);
        }
    }
}

field one
field2
description
identity
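The example above splits a single string. To apply the same idea to a file, each line can be read and split in turn. The following is a minimal sketch, not code from the original article: the CsvLineReader class, its ReadRows method and the sample file name are hypothetical, and it assumes that no field contains a quoted comma or an embedded line break.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class CsvLineReader
{
    // Hypothetical helper: yields the fields of each non-empty line.
    // Assumes a plain comma split is enough (no quoted commas or embedded newlines).
    public static IEnumerable<string[]> ReadRows(string path)
    {
        foreach (string line in File.ReadLines(path))
        {
            if (line.Length == 0)
            {
                continue; // Skip blank lines
            }
            yield return line.Split(',');
        }
    }
}

class Program
{
    static void Main()
    {
        // Write a tiny sample file so the sketch runs on its own.
        string path = Path.Combine(Path.GetTempPath(), "sample_rows.csv");
        File.WriteAllLines(path, new[] { "field one,field2", "description,identity" });

        foreach (string[] fields in CsvLineReader.ReadRows(path))
        {
            // Print the field count, then the fields joined with a visible separator.
            Console.WriteLine(fields.Length + ": " + string.Join(" | ", fields));
        }
    }
}
```

For files with quoted fields, a plain Split(',') call will break fields at the wrong places, so a dedicated CSV parser is the safer choice there.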
Separation example. This method splits a CSV file. It turns one file into smaller files, each containing part of the original data. This helps when, for example, you can only upload 1 MB sections.
Here we see a static class. It divides a large input CSV file, such as example.csv, into smaller files of at most one megabyte.
Pay attention to the method call in the Main method, which specifies files of 1024 times 1024 bytes, or one megabyte.
We use File.ReadAllLines to read in the entire source CSV file. In the for-loop, the method adds up the byte length of each line plus its newline.
When adding a line would reach the maximum length in bytes, the method closes the current file and starts a new one. It generates file names "split_00.txt", "split_01.txt" and so on.
using System;

class Program
{
    static void Main()
    {
        // Split this CSV file into 1 MB chunks.
        CSVSplitTool.SplitCSV("example.csv", "split", 1024 * 1024);
    }
}

/// <summary>
/// Tool for splitting CSV files at a certain byte size on a line break.
/// </summary>
static class CSVSplitTool
{
    /// <summary>
    /// Split CSV files on line breaks before a certain size in bytes.
    /// </summary>
    public static void SplitCSV(string file, string prefix, int size)
    {
        // Read lines from source file
        string[] arr = System.IO.File.ReadAllLines(file);
        int total = 0;
        int num = 0;
        var writer = new System.IO.StreamWriter(GetFileName(prefix, num));
        // Loop through all source lines
        for (int i = 0; i < arr.Length; i++)
        {
            // Current line
            string line = arr[i];
            // Length of current line in bytes (StreamWriter writes UTF-8 by default)
            int length = System.Text.Encoding.UTF8.GetByteCount(line);
            // See if adding this line would reach the size threshold
            if (total + length >= size)
            {
                // Create a new file
                num++;
                total = 0;
                writer.Dispose();
                writer = new System.IO.StreamWriter(GetFileName(prefix, num));
            }
            // Write the line to the current file
            writer.WriteLine(line);
            // Add length of line in bytes to running size
            total += length;
            // Add size of newlines
            total += Environment.NewLine.Length;
        }
        writer.Dispose();
    }

    /// <summary>
    /// Get an output file name based on a number.
    /// </summary>
    static string GetFileName(string prefix, int num)
    {
        return prefix + "_" + num.ToString("00") + ".txt";
    }
}
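The method above loads every line into memory with File.ReadAllLines. For very large inputs, a streaming variant may be preferable. The following is a minimal sketch, not the article's method: the StreamingCsvSplitter class and its Split method are hypothetical names, and it assumes UTF-8 text. It differs from the method above in three ways: it reads lazily with File.ReadLines, it includes the newline in the size check, and it never starts a new file while the current one is still empty.

```csharp
using System;
using System.IO;
using System.Text;

static class StreamingCsvSplitter
{
    // Hypothetical helper: splits on line breaks and returns the number of files written.
    public static int Split(string file, string prefix, int maxBytes)
    {
        // UTF-8 without a byte order mark, so counted bytes match written bytes.
        var encoding = new UTF8Encoding(false);
        int newlineBytes = encoding.GetByteCount(Environment.NewLine);
        int files = 0;
        int total = 0;
        StreamWriter writer = null;
        try
        {
            // File.ReadLines yields one line at a time instead of loading the whole file.
            foreach (string line in File.ReadLines(file))
            {
                int lineBytes = encoding.GetByteCount(line) + newlineBytes;
                // Open the first file, or roll over when this line would pass the limit.
                // The total > 0 test keeps an oversized single line from leaving an empty file.
                if (writer == null || (total > 0 && total + lineBytes > maxBytes))
                {
                    writer?.Dispose();
                    writer = new StreamWriter(GetFileName(prefix, files), false, encoding);
                    files++;
                    total = 0;
                }
                writer.WriteLine(line);
                total += lineBytes;
            }
        }
        finally
        {
            writer?.Dispose();
        }
        return files;
    }

    // Same naming scheme as the article: prefix_00.txt, prefix_01.txt, ...
    static string GetFileName(string prefix, int num)
    {
        return prefix + "_" + num.ToString("00") + ".txt";
    }
}

class Program
{
    static void Main()
    {
        // Build a small sample input in the temp directory so the sketch runs on its own.
        string dir = Path.Combine(Path.GetTempPath(), "split_demo");
        Directory.CreateDirectory(dir);
        string source = Path.Combine(dir, "example.csv");
        string[] lines = new string[10];
        for (int i = 0; i < lines.Length; i++)
        {
            lines[i] = "1,2,3";
        }
        File.WriteAllLines(source, lines);

        // Use a tiny 20-byte limit so the sample input produces several files.
        int count = StreamingCsvSplitter.Split(source, Path.Combine(dir, "split"), 20);
        Console.WriteLine("Files written: " + count);
    }
}
```

The number of files written by the demo depends on the platform newline length, so no fixed count is given here.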
Verify. Here we check that the method works correctly. The example input is a 6,409,636-byte CSV file containing 60,000 lines, each with 10 fields.
Each field is a random number. The seven output files total 6.11 MB, which is the same as the input file.
The first six output files are about 1024 KB each. This is displayed as 0.99 MB in the file manager. The final file is 116 KB. Six files of about 1 MB plus 116 KB account for the 6.11 MB total.
The lines in the output files were also checked for accuracy. The first file split occurs after line 9816.
Line 9816 is the final line in the first output file, and line 9817 is the first line in the second output file.
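The size and line checks described above can be scripted. The following is a minimal sketch, assuming the output files follow the prefix_00.txt naming used by GetFileName. The SplitVerifier class and its methods are hypothetical names, not part of the original article. Byte totals can only match when the source file has no byte order mark, uses the platform newline, and ends with a newline.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

static class SplitVerifier
{
    static string GetFileName(string prefix, int num)
    {
        return prefix + "_" + num.ToString("00") + ".txt";
    }

    // Sums the sizes of prefix_00.txt, prefix_01.txt, ... until a file is missing.
    public static long TotalOutputBytes(string prefix)
    {
        long total = 0;
        for (int num = 0; File.Exists(GetFileName(prefix, num)); num++)
        {
            total += new FileInfo(GetFileName(prefix, num)).Length;
        }
        return total;
    }

    // Checks that the output lines, read in file order, equal the source lines.
    public static bool LinesMatch(string source, string prefix)
    {
        var outputLines = new List<string>();
        for (int num = 0; File.Exists(GetFileName(prefix, num)); num++)
        {
            outputLines.AddRange(File.ReadLines(GetFileName(prefix, num)));
        }
        return File.ReadLines(source).SequenceEqual(outputLines);
    }
}

class Program
{
    static void Main()
    {
        // Build a tiny source file and two hand-made parts so the sketch runs on its own.
        string dir = Path.Combine(Path.GetTempPath(), "verify_demo");
        Directory.CreateDirectory(dir);
        string source = Path.Combine(dir, "example.csv");
        string prefix = Path.Combine(dir, "split");
        File.WriteAllLines(source, new[] { "1,2", "3,4", "5,6" });
        File.WriteAllLines(prefix + "_00.txt", new[] { "1,2", "3,4" });
        File.WriteAllLines(prefix + "_01.txt", new[] { "5,6" });

        // Compare total output size with the source size, then compare the lines.
        Console.WriteLine(SplitVerifier.TotalOutputBytes(prefix) == new FileInfo(source).Length);
        Console.WriteLine(SplitVerifier.LinesMatch(source, prefix));
    }
}
```

To check a real run, point source at the input file and prefix at the prefix passed to SplitCSV.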
Summary. This static method splits CSV files based on byte size, breaking only at line boundaries. You can use it to split your CSV files at any size boundary. This is useful for importing CSV files into a database.
Dot Net Perls is a collection of tested code examples. Pages are continually updated to stay current, with code correctness a top priority.
Sam Allen is passionate about computer languages. In the past, his work has been recommended by Apple and Microsoft and he has studied computers at a selective university in the United States.
This page was last updated on Jun 19, 2021 (simplify).