Dot Net Perls

Recursive File and Directory List

by Sam Allen
[Image: Folder containing more folders in Vista.]

Problem

Recurse through files and directories. We want a fast algorithm that finds all files and folders in a specified directory on the disk, and then everything nested within them. There is an algorithm on MSDN, but it has some shortcomings. We need code that takes a folder name and returns a list of all file names within that folder and every folder below it.

When passed this directory path:

    C:\

We want output like this:

    C:\file.txt
    C:\interesting.doc
    C:\zyzzyvas.xml
    C:\Directory\file_two.txt
    C:\Directory\file_four.txt
    C:\Directory\Nested\FileThree.doc
    C:\Random\number.txt

C# Solution

Here is an interesting version of the directory recursion algorithm that doesn't actually use recursion. Instead, it pushes directories onto a Stack and processes them one at a time until the stack is empty. With no recursive calls to trace, it is simpler to understand, maintain, and change.

How is it implemented?

With a Stack. The approach combines two methods that .NET provides: Directory.GetDirectories, which returns all directories (folders) at a specific level, and Directory.GetFiles, which returns all files at that level. The following method differs from Microsoft's version in some important ways.

using System;
using System.Collections.Generic;
using System.IO;

/// <summary>
/// Use this class to explore a directory and all of its files. It then
/// moves into each nested directory in turn, without recursive calls,
/// and collects a listing of all the file names you want.
/// Original code by Samuel Allen. Copyright 2008.
/// Dot Net Perls, http://dotnetperls.com/
/// </summary>
static class FileSystemUtil
{
    /// <summary>
    /// Find all files in a directory, and all files within every nested
    /// directory.
    /// </summary>
    /// <param name="baseDir">The starting directory you want to use.</param>
    /// <returns>A string array containing all the file names.</returns>
    public static string[] GetAllFileNames(string baseDir)
    {
        // Store results in the file results list.
        List<string> fileResults = new List<string>();

        // Store a stack of our directories.
        Stack<string> directoryStack = new Stack<string>();
        directoryStack.Push(baseDir);

        // While there are directories to process and we don't have too many results.
        // The limit is checked once per directory, so the final count can exceed 1000.
        while (directoryStack.Count > 0 && fileResults.Count < 1000)
        {
            string currentDir = directoryStack.Pop();

            // Add all files at this directory.
            foreach (string fileName in Directory.GetFiles(currentDir, "*.*"))
            {
                fileResults.Add(fileName);
            }

            // Add all directories at this directory.
            foreach (string directoryName in Directory.GetDirectories(currentDir))
            {
                directoryStack.Push(directoryName);
            }
        }
        return fileResults.ToArray();
    }
}
  1. Static method and class
    The method is completely static, as it doesn't need to save state. The result List is simply a local variable. This is better than the previous implementation I showed here, and also easier to deal with than Microsoft's implementation.
  2. Receives a "base" directory
    Pass in the starting or base directory to begin the directory search.
  3. Uses the Stack
    I replaced all recursion with a Stack. We need to find all directories at a level, and then open each directory and do the same thing again. That repetition is what recursion would normally handle, and it is easily replaced with a stack that uses Pop and Push.
  4. Stack methods used
    On the stack, we use Push to add a directory string, and Pop to remove the next one to process. Because Pop removes the directory from the stack, each pushed directory is processed only once.
  5. Out of memory?
    A workaround for the out-of-memory problem that can occur is to limit the number of results to 1000, using the Count property. The check runs once per directory, so the last directory processed can push the total past 1000. You can eliminate or change this constant.
  6. Finally, returns array
    The method returns a string[], which is a very convenient form for callers. You can change this to a List if you want.
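
The result limit from point 5 could be tightened along the following lines. This is only a sketch, not part of the original method: the class name FileSystemUtilSketch, the method name GetFileNamesCapped, and the maxResults parameter are all hypothetical. It assumes you want the cap enforced exactly, and that directories you are not allowed to read should be skipped rather than stop the whole search.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class FileSystemUtilSketch
{
    // Hypothetical variant of GetAllFileNames: maxResults replaces the
    // hard-coded 1000, and unreadable directories are skipped.
    public static List<string> GetFileNamesCapped(string baseDir, int maxResults)
    {
        List<string> fileResults = new List<string>();
        Stack<string> directoryStack = new Stack<string>();
        directoryStack.Push(baseDir);

        while (directoryStack.Count > 0 && fileResults.Count < maxResults)
        {
            string currentDir = directoryStack.Pop();
            string[] files;
            string[] subDirs;
            try
            {
                files = Directory.GetFiles(currentDir);
                subDirs = Directory.GetDirectories(currentDir);
            }
            catch (UnauthorizedAccessException)
            {
                // No permission to read this directory: skip it and continue.
                continue;
            }

            foreach (string fileName in files)
            {
                if (fileResults.Count >= maxResults)
                {
                    break; // Enforce the cap exactly, even mid-directory.
                }
                fileResults.Add(fileName);
            }

            foreach (string directoryName in subDirs)
            {
                directoryStack.Push(directoryName);
            }
        }
        return fileResults;
    }
}
```

Returning a List here is just the alternative mentioned in point 6. Call ToArray on the result if you prefer the original return type.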

Why does it use Stack?

To avoid problems associated with other methods. A Stack in programming works like a stack of items in real life: the last item placed on top is the first one taken off. We collect the directories in the Stack, and then process them one by one. This avoids recursion, so there are no nested method calls to trace through. It is easier to debug, and may be more efficient in some scenarios.
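
For comparison, a recursive version of the same listing might look like the sketch below. It is not Microsoft's code and not part of the original method. The names RecursiveListingSketch, GetAllFileNamesRecursive and Collect are hypothetical, and it leaves out the result limit to keep the contrast clear.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class RecursiveListingSketch
{
    // Hypothetical recursive counterpart, shown only for comparison.
    public static List<string> GetAllFileNamesRecursive(string baseDir)
    {
        List<string> results = new List<string>();
        Collect(baseDir, results);
        return results;
    }

    // Each level of nested directories adds one more frame to the call stack.
    private static void Collect(string currentDir, List<string> results)
    {
        results.AddRange(Directory.GetFiles(currentDir));
        foreach (string directoryName in Directory.GetDirectories(currentDir))
        {
            Collect(directoryName, results);
        }
    }
}
```

Both approaches should find the same files, though possibly in a different order, because the Stack hands back the most recently pushed directory first.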

Stack member    Its usage here
Push            Add a directory string to our "stack" of things we will need to deal with when we get around to it.
Count           We only keep going when we have items in our stack. If you try to Pop a stack with nothing in it, you get an InvalidOperationException.
Pop             Returns the top item from the stack, and removes it.
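
The three members in the table can be seen together in a small sketch. The class StackDemo and the method PopOrder are hypothetical names used only for illustration. The method touches no files and simply shows the order in which pushed strings come back.

```csharp
using System;
using System.Collections.Generic;

static class StackDemo
{
    // Pushes the given directory names, then pops them all
    // and returns them in the order they came off the stack.
    public static List<string> PopOrder(params string[] directoryNames)
    {
        Stack<string> directoryStack = new Stack<string>();
        foreach (string name in directoryNames)
        {
            directoryStack.Push(name);
        }

        List<string> order = new List<string>();
        // Checking Count first means Pop is never called on an empty stack.
        while (directoryStack.Count > 0)
        {
            order.Add(directoryStack.Pop());
        }
        return order;
    }
}
```

Calling StackDemo.PopOrder(@"C:\Directory", @"C:\Random") returns C:\Random first and C:\Directory second, since the last string pushed is the first one popped.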

How can I call it?

An example is shown next. Keep in mind that this method differs from Microsoft's method in that it also lists all the files in the first specified directory. I have always found this to be the desired behavior, but your requirements may vary. Please see the MSDN method if you want to compare.

class Program
{
    static void Main(string[] args)
    {
        // Get all files in the allensamuel directory and its nested directories.
        string dir = @"C:\Users\allensamuel";
        string[] fileNames = FileSystemUtil.GetAllFileNames(dir);

        // Loop through all files in the string array.
        foreach (string fileName in fileNames)
        {
            Console.WriteLine(fileName);
        }
    }
}

Conclusion

This method is simpler, has more useful behavior in the first directory, and is easier to quickly adapt to your project than the one on MSDN. I tested it, and it otherwise works the same as Microsoft's. It removes recursion and uses a Stack, which is easier to debug and understand. Use it as an alternative directory searching algorithm when you want those advantages.

© 2007-2008 Sam Allen. All rights reserved.
