How do Excel tables remember formulas

If you’ve used tables before you have seen your formulas automatically expand and contract with your data. This is a pretty handy feature of the tables. Some might say they just copy and paste the formulas, but this is not true. (See this blog post to see why structured table references should be used, which shows how Excel doesn’t copy/paste table formulas: http://excelandaccess.wordpress.com/2013/01/07/why-you-should-use-structured-references-in-tables/) The Excel team at Microsoft were much smarter than that. While this has been available since Office 2007 when the new file formats came out, I only found out about it a few months ago.

The new file formats are much more efficient in how they store data because they use an Open XML format. This is all-around better for the user. The files are basically zip files containing a hierarchy of folders containing data. If you need to get up to speed with the new file formats, check out these links:

http://www.ecma-international.org/news/TC45_current_work/TC45_available_docs.htm (very technical)

http://www.jkp-ads.com/Articles/Excel2007FileFormat00.asp (by Microsoft Excel MVP, Jan Karel Pieterse)

I’m going to show how to view table formulas (and associated information) from an XLSX file. Please note this will work with an XLSM, but the file structure is different for XLSB files because they are in binary format and store data in bin files, although it can be done. For viewing the XML structure I use Notepad ++, although you can use any XML editor (notepad-plus-plus.org/download/).

Starting Data

I’ve entered some very generic data into a blank Excel file with a column of data and a calculated column.

The formula is very simple. I typed it into the first cell of the ‘b’ column which auto-filled down the entire column. At that point the data was written to the files XML.

I’ve also changed the table name to ‘TestTable’ (as seen above). This is not what will be seen when viewing the XML file. All tales get their own ID number associated with them. You can change this if you want, although it is only accessible/visible in the XML structure and has no bearing on the name you give the table in Excel. Because of this I recommend not changing the ID of a table.

Change File Type

With the file closed (must be saved), you can change the extension of the file type from XLSX to ZIP.

You will be prompted to ensure you really want to do this, click Yes.

View XML Structure

Each table has its own XML structure/file. When you open the zip file (double-click it) it will open in a Windows file explorer (folder) window. Navigate to the ‘xl’ folder and you will find a ‘tables’ sub-folder. Open this and you will see a list of XML files, one for each table in the workbook. In this example I have only created one table.

If you open the XML file from here (within the zip file) it will be in read-only mode. The easiest method is to drag-and-drop this file (I use the desktop for ease of use and speed) and open it from there. Since this is a zip file it will create a copy and not move the file. When done, if you’ve made changes, you can drag and drop the file back again and choose to overwrite the file with your changes.

Once you’re done viewing the parts of the file while it’s a zip, close the zip folder and change the extension back from ZIP to XLSX. You will get the same prompt asking if you’re sure you want to do this, click Yes. At this point you can open the file normally in Excel and any changes will show.

Table Guts

When you open the XML file (using Notepad ++, I also use the XML plugin, available for free from the Plugins menu, for easier viewing as seen below) you will see the table structure. There is a lot of information you can see, including the table name, display name, range reference, etc. If you look in the tableColumns node you will see two columns, one has only an id and name (my first column with manual data, however the second column has a child node of calculatedColumnFormula. This is the heart and soul of how an Excel table remembers your formulas.

You may look at the formula and see it is slightly different from what I showed above. This is because the XML written to the file is in the base format, which is the structured table references from Excel 2007. They weren’t the friendliest and were changed in subsequent versions for easier writing in formulas. The use of the @ symbol for This Row reference is not native to Excel 2007. So to keep backwards compatibility, the XML written, no matter the version (post 2007) will be saved in the 2007 structured table reference format. This is how you can write a structured table reference in 2007 and view in 2013, and vice versa, without breaking it, even though the formulas are different in the formula bar. It’s quite genius.

You can delete the entire data body range of a table and as soon as you instantiate a row the formulas will come back, and this is how Excel knows what to put. It’s not magic, it’s Excel, and thanks to the Excel team members at Microsoft we have the ability to seamlessly take calculated columns from one version to the next (2007 and on).

Also, you can edit the formulas here, although it’s much easier from within Excel.

Additional Info

Jan Karel Pieterse has some great articles on his site. One of which is how to get and update any XML portion of an Excel file (2007+ file formats).

http://www.jkp-ads.com/Articles/Excel2007FileFormat02.asp

If you do not view file extensions you can change the file type viewing in your Windows settings. This link tells how to do that:

http://windows.microsoft.com/en-us/windows/show-hide-file-name-extensions#show-hide-file-name-extensions=windows-7