A large CSV file can be divided into smaller files. Data uploaded to a database is often stored in a CSV file, where each record sits on its own line. We need code that segments such a file quickly and reliably into pieces of at most 2 MB, without splitting a record.
Our method takes 1 text file with any number of lines and writes output files of up to 2 MB each that together contain all the data. The output files are named with incrementing numbers.
First we need a method that returns the name of each output file. It is static because it does not refer to any state in the class. It uses ToString("00") to format the number with 2 digits, adding a leading 0 if required, so file 3 becomes "03".
static string FileName(string baseFileName, int fileNum)
{
//
// This will be one of the file names in the output files.
//
return baseFileName + fileNum.ToString("00") + ".txt";
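}
This method uses StreamReader to read the input, generates file names, and keeps track of how much has been written to the current file. It requires the System.IO and System.Collections.Generic namespaces. Here's the main code.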
public static void WriteSegments(string inFile, string outPrefix)
{
List<string> lines = new List<string>();
StreamReader reader = new StreamReader(inFile);
//
// Read in the specified file.
//
string line;
while ((line = reader.ReadLine()) != null)
{
lines.Add(line);
}
reader.Dispose();
int runningTotal = 0;
string baseFileName = outPrefix + "_";
int fileNum = 0;
StreamWriter writer = new StreamWriter(FileName(baseFileName, fileNum));
//
// Iterate through each line in the file lines. Keep track of
// the current length of the data. After we hit a certain length,
// take the data and write it to a new file. Then, create another
// file for the next segment.
//
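const int maxLength = 2 * 1024 * 1024; // Segment limit: 2 MB, counted in characters.
for (int i = 0; i < lines.Count; i++)
{
string thisLine = lines[i];
// Count the newline that WriteLine appends.
int length = thisLine.Length + writer.NewLine.Length;
// Start a new file only if the current one already has data.
if (runningTotal > 0 && runningTotal + length >= maxLength)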
{
fileNum++;
runningTotal = 0;
writer.Dispose();
writer = new StreamWriter(FileName(baseFileName, fileNum));
}
writer.WriteLine(thisLine);
runningTotal += length;
}
writer.Dispose();
}
The following code takes the named file and creates a series of smaller files from it. Call it on a comma-separated values file.
//
// Could take "all-names.txt", and write
// ALL_OUT00.txt, ALL_OUT01.txt, ALL_OUT02.txt
//
Segment.WriteSegments("all-names.txt", "ALL_OUT");
This code is a life-saver when your database goes down and you need a quick way to upload new information to it. This kind of code is valuable in your tool belt. [Segment C# - dotnetperls.com]
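The listing above reads every line into a List before writing, and it measures length in characters. As a minimal sketch of an alternative, assuming UTF-8 files, the version below streams the input one line at a time and counts bytes with Encoding.GetByteCount. The class name StreamingSegment and the maxBytes parameter are illustrative names, not part of the original code.

```csharp
using System;
using System.IO;
using System.Text;

static class StreamingSegment
{
    // Assumed limit: 2 MB per output file, measured in UTF-8 bytes.
    const long DefaultMaxBytes = 2 * 1024 * 1024;

    // Same naming scheme as the FileName method in the article.
    static string FileName(string baseFileName, int fileNum)
    {
        return baseFileName + fileNum.ToString("00") + ".txt";
    }

    // Copies inFile into numbered segments and returns the number of files written.
    public static int WriteSegments(string inFile, string outPrefix, long maxBytes = DefaultMaxBytes)
    {
        var encoding = new UTF8Encoding(false); // UTF-8 without a byte-order mark
        int newLineBytes = encoding.GetByteCount(Environment.NewLine);
        string baseFileName = outPrefix + "_";
        int fileNum = 0;
        long runningTotal = 0;
        StreamWriter writer = null;
        try
        {
            using (var reader = new StreamReader(inFile, encoding))
            {
                string line;
                while ((line = reader.ReadLine()) != null)
                {
                    // Bytes this line occupies, including the newline WriteLine appends.
                    long length = encoding.GetByteCount(line) + newLineBytes;

                    // Close the current file only if it already holds data and
                    // this line would push it past the limit.
                    if (writer != null && runningTotal > 0 && runningTotal + length > maxBytes)
                    {
                        writer.Dispose();
                        writer = null;
                        runningTotal = 0;
                    }

                    // Open the next numbered file lazily, so no empty file is created.
                    if (writer == null)
                    {
                        writer = new StreamWriter(FileName(baseFileName, fileNum), false, encoding);
                        fileNum++;
                    }

                    writer.WriteLine(line);
                    runningTotal += length;
                }
            }
        }
        finally
        {
            if (writer != null)
            {
                writer.Dispose();
            }
        }
        return fileNum;
    }
}
```

A single line larger than maxBytes is still written whole, so that one file can exceed the limit. Like the original, this sketch splits on every newline, so a quoted CSV field that contains a line break would be cut in two.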