We have a Microsoft Word document and want to read it in a C# program. With the Microsoft.Office.Interop.Word assembly, we can get the contents and formatting from the document.
In Visual Studio, add the Microsoft.Office.Interop.Word assembly to your project. Go to Project and then Add Reference.
First, our file contains 3 paragraphs of one word each. The program creates an Application instance and then calls Documents.Open on that variable.
    using System;
    using Microsoft.Office.Interop.Word;

    class Program
    {
        static void Main()
        {
            // Open a doc file.
            Application application = new Application();
            Document document = application.Documents.Open("C:\\word.doc");

            // Loop through all words in the document.
            // The Words collection is 1-based.
            int count = document.Words.Count;
            for (int i = 1; i <= count; i++)
            {
                // Write the word.
                string text = document.Words[i].Text;
                Console.WriteLine("Word {0} = {1}", i, text);
            }

            // Close Word.
            application.Quit();
        }
    }

    Word 1 = One
    Word 2 =
    Word 3 = Two
    Word 4 =
    Word 5 = three
    Word 6 =
The empty paragraphs in the input file are considered words. If you have multiple words in a paragraph, they will each be separate in the Words collection.
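If only the visible words are wanted, the empty entries can be filtered out after reading. The following is a minimal sketch, not part of the original program. WordFilter and NonEmpty are hypothetical names, and the sample strings stand in for the Text values of the Words collection, assuming that Word returns each paragraph mark as a carriage return ("\r").

```csharp
using System;
using System.Collections.Generic;

static class WordFilter
{
    // Hypothetical helper: keeps only the entries that contain visible text.
    public static List<string> NonEmpty(IEnumerable<string> words)
    {
        var result = new List<string>();
        foreach (string word in words)
        {
            // Trim removes whitespace, including '\r' and trailing spaces.
            string trimmed = word.Trim();
            if (trimmed.Length > 0)
            {
                result.Add(trimmed);
            }
        }
        return result;
    }
}

class Program
{
    static void Main()
    {
        // Sample values standing in for document.Words[i].Text (assumed shape).
        string[] raw = { "One", "\r", "Two", "\r", "three", "\r" };
        foreach (string word in WordFilter.NonEmpty(raw))
        {
            Console.WriteLine(word);
        }
    }
}
```

With the sample values above, this prints One, Two and three on separate lines. In a real program, the strings would be collected from document.Words inside the loop.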
In Interop.Word, a paragraph is made up of a collection of one or more words.
Why is the application.Quit statement important? If you don't include it, the WINWORD.EXE application will remain in the process list after your program exits.
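One way to make sure Quit is reached even when opening or reading the file fails is a try/finally block. This is a sketch under assumptions rather than the original program: the path is the one from the earlier example, and the Marshal.ReleaseComObject call is an optional extra step that releases the COM wrapper. It needs Word installed to run.

```csharp
using System;
using System.Runtime.InteropServices;
using Microsoft.Office.Interop.Word;

class Program
{
    static void Main()
    {
        Application application = new Application();
        try
        {
            // Path taken from the earlier example; adjust for your machine.
            Document document = application.Documents.Open("C:\\word.doc");
            Console.WriteLine("Words: {0}", document.Words.Count);
        }
        catch (Exception ex)
        {
            // Report the failure instead of letting the process crash.
            Console.WriteLine("Could not read document: {0}", ex.Message);
        }
        finally
        {
            // Runs on success and on failure, so WINWORD.EXE is asked to exit.
            application.Quit();

            // Optional: release the COM wrapper held by this process.
            Marshal.ReleaseComObject(application);
        }
    }
}
```

After a run, you can check Task Manager to confirm that no WINWORD.EXE process started by the program is left behind.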
We looked at the Microsoft.Office.Interop.Word assembly and learned how to read in data from a Word document. This can be useful when you have DOC or DOCX files.