Before all — Microsoft made life much easier by introducing the =UNIQUE() function in Excel 365. It is a powerful tool with one shortcoming — it is not backwards compatible with older versions. This means that even as of the moment of writing (Feb 2020) there is a sizable share of non-upgraded user who can’t take advantage of the functionality — not only the formula is absent when they start typing, but it will also resolve in error if a file with it is sent to them.
This is why the issue is approached old-school by relying on formulas available in Office 2007 and later. Furthermore, sticking with the complicated formula has some flexibility benefit that will be demonstrated as well.
The following file is used for the purposes of the tutorial:
One of the features of the splitting tool is the option to list the routes taken as described by their points of origination and destination (separately). Total length in km will be present as well for later use. This is an input on the sheet Routes.
Since users need be able to use drop-down lists on sheet Calculator to choose starting and ending points of the journey, couple of formulas must be present:
- The first goes through the list of origination points and lists the unique values among them. This way the user will not see duplicate values in the drop-down menu.
- The second one accounts for what has been chosen as starting place and goes through the destinations’ list extracting the unique items related to the trip origin only.
For better readability of the formulas the range of cells where routes are logged (Sheet Routes) are converted into a table named RoutesList. This step is not necessary but makes the work a bit more future proof. If all the rows in the table have been filled-in and add new entry is made on the line just beneath it, the table will automatically expand. The way tables are referenced (as demonstrated later) means that we no formulas will need be updated to accommodate the extra lines of data.
The general formula would be examined inside out as it is being solved by Excel.
At first the COUNTIF($E$3:E3;RoutesList[From])=0) is studied.
Marked with (1) is the short list of unique items. This is a An expanding reference is present when formula addresses range by combining absolute reference at the beginning/ending of the range with a relative reference at… More so as the formula is expanded it will perform its function against an increasing range of cells.
Marked with (2) is the routes’ table From entries. Unlike the previous range this one is taken as a whole regardless of how far the formula has been extended.
On every instance when the function is run the the two arguments combined would return integer value for every row with trip origination points. This will vary between 0, if the record is not present in the unique list yet, and 1 — if the item has already been listed.Click to see how it looks
Then at the point marked with (3) the produced stream of results is compared with the value of 0. This will return TRUE/FALSE values for every 0/1.Click to see how it looks
Next in the big formula, by executing the (RoutesList[From]<>“”) part, a test is performed whether the records in the From field of the long list are not empty. This is made so even if not all table rows are filled-in, the blanks will not making it to the short list.
As the reference to the column suggests — on every instance when we run the full formula all records are referenced. This is a logical test and will return TRUE/FALSE value.
As the brackets guide, next the two streams of TRUE/FALSE values (described above) are being multiplied. For each row in the long list a new value will be produced.
Multiplication of logical values is not exactly straightforward. Naturally, if all of the arguments are TRUE, then outcome is TRUE as well. However, Excel would not resolve the TRUE*TRUE product as TRUE but as numeric value of 1 instead. If any of the factors is FALSE (eg. TRUE*FALSE or FALSE*FALSE) then the result will be 0. Excel assumes that a logical test that is TRUE represents an integer value of 1; and FALSE — 0. Multiplying any number of those would return 1 if all arguments are equal to 1; but will be equal to 0 if any of the arguments is 0.
|1st argument||2nd argument||Multiplication||Result|
|TRUE||TRUE||TRUE*TRUE or 1*1||=1|
|TRUE||FALSE||TRUE*FALSE or 1*0||=0|
|FALSE||FALSE||FALSE*FALSE or 0*0||=0|
Next is the division of 1 by the 1/0 stream obtained in the previous step.
Having 0 in the denominator is clearly wrong mathematically and for all instances this is tried Excel will return #DIV/0! error. Usually this will be great inconvenience but in this case works in our advantage.Click to see how the division translates as results
This might be obvious but dividing 1 by 1 or 0 will always result into 1 or error (#DIV/0!). No other results are possible.
At this point the =LOOKUP() function comes into play.
The entire expression of
has been turned into
LOOKUP(2;Stream of 1s and #DIV/0! errors;RoutesList[From])
The highlighted stream is essentially a property of the items in the From column of the table (RoutesList[From]) and tells if
- the item is already recorded in the short list or is a blank value — #DIV/0! value. This also means it will not be considered by the =LOOKUP().
- the item is not yet shortlisted — value of 1.
The expert move is putting 2 as the first argument of the function. Why not input 1 if no other values can appear? In the documentation on office.com the following note can be found
Important: The values in lookup_vector [that would be the 1/#DIV/0! stream] must be placed in ascending order: …, ‑2, ‑1, 0, 1, 2, …, A‑Z, FALSE, TRUE; otherwise, LOOKUP might not return the correct value. Uppercase and lowercase text are equivalent.
Looking-up for 1 has the potential to mess-up the workings of the formula if the appropriate set of records is found in the lookup_vector (the entries in the From column in the long list). Using 2 as an argument is circumventing the issue of the data not producing sorted stream of 1/#DIV/0! errors.
What is achieved is that on every other row with the extracting formula the latest non-listed & non-blank value from the long list is being populated.
If the =LOOKUP() function can’t find any unique value to list, it will resolve in #N/A error. That is why =IFERROR() encloses the =LOOKUP() and swaps any error with “-” (the value within quotes can be removed or replaces with whatever suits your purposes).
At this point extracting the unique journey starting points is achieved. In the next post it will be demonstrate how the short list can be dynamically converted into a drop-down list from which the users can pick their choice (on the sheet Calculator). The cell where this is done is a Any range of cells, being just a single one or much bigger body, can be assigned a unique name. This way, instead of referencing the underlying cells by the reg… More From_User_Choice.
When extracting the unique destinations, the selected origin value will be needed — despite the reference being perfectly good as regular one (e.g. =Calculator!$B$2), naming it makes it much easier to understand what kind of information is obtained. This helps not only when making screenshots for an Excel blog but also when the file is opened months later and one has no memory of what was put in B2.
Obtaining the unique destinations is the next task.
This utilizes the same formula that has been used so far with just couple of changes:
- As expected — the columns referred before are now shifted to the right
- There is one extra logical check inside the body of the =LOOKUP() function.
The addition of *(RoutesList[From]=From_User_Choice) is a call back to the TRUE/FALSE results and their multiplication. For every entry of the To column in the long list the corresponding value in the From column is tested against the value chosen by the user as starting point (on the sheet Calculator). This way another layer is introduced in the multiplication of the logical tests’ results.
If we assume that Meeting point 2 has been selected, the new (3rd) check will return FALSE for all rows not having it the From column. Regardless of how the other two checks have turned out for the values in the To column:
- If the particular value is already in the shortlist or not;
- and if the value is blank or not;
in case FALSE is returned in the third check, it will be treated as 0 when multiplying and in the workings of the formula this means no new record for the short list.
Stacking more rules by multiplying the logical tests’ outcomes grants the flexibility of being able to precisely pinpoint the items that need be shortlisted. In the context of the example used, there might be more than one route between the same starting and ending points — meaning different distance traveled. Introducing column that carries such data can be easily accommodated into the filtering rules.
- The solution does not sort alphabetically the results on any step. This makes it more appropriate to use when there will be relatively small number of unique values.
- The above expectation is related to the number of rows in which the extracting formula will be present. In theory every row in the long list might have unique value so the lack of duplicates will render the unique list being the same length. However, if such length is not utilized, it will be a drag on the spreadsheet performance. Arbitrary decision needs to be made so understanding the nature of the information is essential. Proper sizing helps efficiency.