Xlookup Built-In Offset Functions Use w/ Messy PDF Data

 

Microsoft Excel Prefers Clean Data Tables:

Businesses often find themselves programmatically importing data into Excel from a PDF data file.  The Excel Xlookup’s Built-In Offset Functions use with Messy PDF Data.   When you do this, which formulas you are able to use to extract the data ultimately depends on the layout of the imported PDF data.

If your data is in a nice tabular format, much like an Excel Table, an XLookup or SumIf would be the most effective way to reference the data.   But that is if your PDF data is in a usable format, which increasingly is not the case, PDF data is getting messy.

When PDF data gets really messy, workarounds are implemented via a complicated mix of embedded functions, and/or advanced Excel vba coding. But often all you need is the ability to offset from the reference cell.  There are a lot of ways to do that, but we find the easiest and most powerful way to do so is simply to use the Microsoft Excel XLookup Function.

The XLookup function is one of the new Dynamic Array Functions, these Spill Functions simplify Excel programming and are available in Office 365.

 

 

PDF Data was once easy to import into Excel. Those days are gone.


Here, the VLookup Function would not return the desired result, but the XLookup will.

 

 


 

A Little History on Working with PDF Data in Excel

A little history, in the past, clients were able to easily use PDF-supplied data more easily than they are today.  PDF data these days is often hit or miss in terms of its usability.

In recent years the PDF downloads we have seen have become very messy, and many are just not useable without great effort and the associated cost.  So what to do with sloppy PDF data, without excessive effort?  Do you have to write a lot of complicated vba code, or do you need to embed two or more Excel functions to get the end result?   Or is there a single Excel formula that can quickly and easily deal with this?

 

What does that mean to Business:

It means extracting said data with a single, easy to use formula is getting harder and harder to do.  But luckily for Excel programmers, Microsoft has released a series of new Dynamic Array Spill Functions, such as the XLookup, ChooseCols, Filter, Take, and Unique.

These new Spill Array Functions, specifically the Microsoft Excel XLookup, makes working with messy PDF data all that much easier. One stand-alone function which is easy to use. Easy to use means that it takes less time to program, and that converts to cost.  Businesses are concerned with cost, so save money, use the new Dynamic Array Functions.

 

The Microsoft Excel XLookup Array Function:

The Excel XLookup Function if used right, allows you to look in one row, and to return the result set from a different row, different tab, or even a different workbook. The term for that is “Offset“.   Offsetting is so useful that there is an Excel that strictly, does this, it is the Offset Function.  The MS Excel XLookup has the built-in ability to Offset due to the fact that the function utilizes two separate Arays, hence its power.

The XLookup is one of the Excel Spill Functions, a new generation of functions, retiring many functions that came before.  The XLookup can return the result to a single cell, or it can return an Array.  Blows the Vlookup/HLookup out of the water in our opinion.  Don’t even mention Index/Match.

But in this post, the super simple “workaround” with this messy PDF is to use the XLookup Function to offset the return set.  For this simple need, this is the simplest solution.

The XLookup is an Intermediate level Excel Function that you should know.  Here is a list of Intermediate Excel Formulas.

 

In our opinion, the Microsoft XLookup Spill Function is the most Powerful, is the most Dynamic, and is the most Exploitable Excel function to date.   We LOVE IT!

 

 

Excel Expert Note:  If you are creative, and if you know the other advanced Excel functions, the uses of the XLookup are limited only by your imagination and ability to affectively leverage them with other Excel functions.

 

Excel Outside the Box:  The XLookup for example can look in multiple tables, one after another until it finds the matching record, by hijacking the “If not found” Parameter.  Put another Xlookup in there, repeat.

 

 

 

 


 

Excel’s Lookup Formulas are Often Used with PDF Data

 

Messy PDF Data Complicates Using Vlookup for Data Extraction

Historically, experienced Excel consultants would agree, that if you want to reference data from PDF-supplied data, with an Excel formula, the Lookups ( XLookup / Vlookup / HLookup / Lookup ) & SumIf Functions are often the best and most effective way to do so.  Why, they work, and they do so with power.

 

But Using Index/Match workaround Method is Old School

Would you go straight to the Index/Match Method?   Index/Match is a workaround, it is not a native Excel Function; it is the combination of two Excel functions.   Is this the most effective and straightforward method to find a record on one row, and to return a value from different row.  Can it return an Array?   No, but historically it is often the one most used.

 

Why is Index Match used so much?  VLookups look right, not left.

 

 

Give it Time:  Most used does not mean best, as the XLookup is new and people are still learning to use it.  The XLookup will replace the VLookup as the most used and Most Googled Microsoft Excel Function of all time.

 

Super Common Solution –  Index/Match:  =INDEX(OffsetRangeTest,MATCH(TestID,IDColumnTest,0))

 

Here are several examples of functions that can look in one row, and return a result from another row.


Index/Match is one of the most used combination of Excel Functions. It is commonly used as a workaround for poorly built workbooks.

 


 

Excel XLookup – Great Solution to Messy PDF Data:

 

One Easy Possible Solution to Messy PDF Data:

Enter the powerful Dynamic Array Spill Function, the Microsoft Excel XLookup.  The XLookup can look at one row, and return results from a different row, or even from a different table on a different tab or even from a different file. The XLookup can return one cell, or a Dynamic Range.   Now that is POWER.

 

Yes, the XLookup can look in one file and return results from another tab or another file. What other Lookup Functions can easily do that?

 

To be able to ‘Offset’ is the Solution in this simple example:

There are many ways to “Offset” in Excel, but you can most easily do that with the XLookup.  The Offset ability is basically built-into the XLookup, due to the fact that it is using two independent arrays in the function.

=XLOOKUP($A$1,$E$6:$E$45,$K$7:$K$45)

 

 

The Microsoft Excel XLookup is the most powerful and most useful formula in Microsoft Excel today.


The XLookup natively allows us to Offset, perfect for messy PDF data.

 

Messy PDF data is becoming an issue. The XLookups ability to Offset makes it a good candidate for our use.


In Excel there are always many ways to do the same thing. As an expert Excel programmer, it is our job to use the best function for the need.

 


 

Using VLookup Function in this Case Causes a Challenge:

If we want to look for a record on say row 5, and return the result from Row 6, the Excel VLookup will not do this natively.  So, the Excel programmer is then forced to look for alternative ways to look in one row for a matching value, and to offset to another row, to return the desired value.  The ability to Offset quickly comes to mind.

Typically, in this situation the Excel programmer would use the most common Excel workaround, combining two Excel functions into one, such as Index/Match.  Excel programmers love to embed functions, it is how Excel gets done.

There are a variety of Excel Functions, that when combined, will do this for you.  But why be forced to use Index/Match, or to embed one formula in another, which is not intuitive to most Excel users?  Most Excel users are not good enough to use Index/Match, but they can use Lookups.

 

We recommend use one function that can easily do this, one that can return a dynamic array. Dynamic Arrays are our go to formulas for a reason.  Spill Functions make programming Microsoft Excel so much easier.

 

The Sloppy PDF Data Solution:

The solution is simple, of the 500+ functions in Excel. But there are only a few functions that have the ability to “Offset”, built in.   Sure, you can use Offset with Match, or Index with Match, the older “Lookup”, etc., but why bother when the XLookup will do it even easier?  And which of these other methods can return a Dynamic Array and not just one value in one cell?

 

Common Excel Functions to Offset the Results:

As we said, there are countless ways to get the results that you want, when working with messy PDF data.  But what is the easiest way to do so? We have listed some of the most common ways to do this, below.

  1. =XLOOKUP(TestID,IDColumnTest,OffsetRangeTest)
  2. =INDEX(OffsetRangeTest,MATCH(TestID,IDColumnTest,0))
  3. =INDEX(K:K,MATCH(TestID,E:E,0)+1)
  4. =OFFSET(E6,MATCH(TestID,IDColumnTest,0),6)
  5. =OFFSET(K1,TestID*2+4,0)
  6. =LOOKUP(TestID,IDColumnTest,OffsetRangeTest)

 

Expert’s Note: Which method can return an Array and not just populate one cell?  Which is the easiest to use, and the most powerful?   Simple, it is the Excel XLookup Dynamic Array Spill Function.

 

Index/Match, Offset, XLookup, which works best depends on the situation.


If you build your workbook, based on Excel Tables, then the XLookup along with SumIfs, that is pretty much all you need to properly extract data for reports and analysis.

 

 


 

 

Conclusion:

More and more we find ourselves telling clients that their PDF data is not in a usable format.  Or at least it is in a messy format, and we will need to see which Excel function(s) will be able to return the result set we need.

In our super simple example, we showed a PDF that had the data separated onto multiple lines, instead of one clean row.  We needed to find a formula that would easily allow us to Offset a Lookup Function.  And for this example, the Excel XLookup fits the bill.  Quick, simple and powerful.

No need to use Index/Match, though experienced Excel programmers love that combination of Excel Function.  I however prefer the new Dynamic Array Spill Functions over all 500+ Excel formulas.

The Microsoft XLookup allows you to look at one row, and return a Dynamic Data Set from another row or range.  This allows you to quickly work with messy PDF data in your Excel files.  The XLOOKUP, use it.

 

 


 

Other Posts on how to use the new Excel Spill Array Functions.

Related Post:  Using ChooseCols Function in Excel Reports – Excel Reporting Just Got Easier.

Related Post:  Using VStack Function to Consolidate Excel Data

Related Post:  Using Unique Function in Validation Controls – Users Love Drop Down Lists