Getting Data into Memory with Excel Add-In Interop

To extend the functionality of Excel (for example, adding a machine learning operation such as data clustering), you can write an Excel add-in. The basic add-in typically does the UI but to do anything meaningful you usually need to use Excel Interop to read worksheet contents into memory, and then after doing some processing, write values in memory to the worksheet. I described the add-in creation process in a previous blog post at

When I need to read the contents of a worksheet into memory, I typically use one of three approaches. One way is like so:

using Excel = Microsoft.Office.Interop.Excel;
using Tools = Microsoft.Office.Tools.Excel;
using Office = Microsoft.Office.Core;
. . .
Excel.Worksheet worksheet =
Excel.Range usedRange = worksheet.UsedRange;
object[,] allData = usedRange.Value2;
// trim empty rows or columns

I use the UsedRange to get a reference to all cells that have values, and then the Value2 property to store those values into a two-dimensional matrix of type object. Unfortunately, the UsedRange property returns all cells that either currently have contents or had contents at some time in the past, so you might get many empty rows or columns. This requires some post-processing of the object[,] matrix in memory.

A second approach can be used when the add-in has some UI that allows users to specify the range of data. For example:

string upperLeftCell = // get from UI
string lowerRightCell = // get from UI
// or get one string in "A1:B2" form
Excel.Range specifiedRange =
  worksheet.get_Range(upperLeftCell, lowerRightCell);
object[,] someData = specifiedRange.Value2;

A third approach gets a user-selected (via the mouse) range of data. For example:

Excel.Range selectedRange =
object[,] selectedData = selectedRange.Value2;
listBox1.Items.Add("First cell of user-selected data is " +
  selectedData[1, 1]);

In all cases, annoyingly, the object[,] matrix is 1-based, rather than 0-based. This can really cause bugs. So much so, that in many cases I transfer the contents to a normal 0-based matrix.

To get the address of a range, use the AddressLocal property or get_AddressLocal method:

Excel.Range usedRange = worksheet.UsedRange;
string usedRangeAddress =
  usedRange.AddressLocal; // is like "$C$3:$F$7"
string usedRangeAddressNoDollar =
  usedRange.get_AddressLocal(false,false); // "C3:F7"

Excel.Range upperLeftRange = usedRange.get_Resize(1, 1);
string upperLeftAddress = 
  upperLeftRange.AddressLocal;  // "$C$3"
string upperLeftAddressNoDollar =
  upperLeftRange.get_AddressLocal(false, false);  // "C3"

Working with Excel add-ins is interestijng but not always intuitive.


This entry was posted in Machine Learning, Software Test Automation. Bookmark the permalink.