User defined array functions in Excel

Many languages have functions to process lists (array). These functions usually return a list, so you can pass that to another list function. This chaining of functions is really powerful.

UNIX provides this sort of chaining capability. If I had a cities (with some repetitions) and I wanted to find out how many started with the letter ‘A’, I’d just type:

```cat cities | sort | uniq | grep "^A" | wc
```
```cat: types the cities.
sort: sorts the cities alphabetically.
uniq: finds unique cities (works only if sorted - that's why we had to sort the list).
grep: filters the cities. Only allows cities beginning with A.
wc: word count
```

To do this on Excel, the only way is to

1. get the unique values. Data – Filter – Advanced Filter, and select “Unique records only”, “Copy the list to another location”, and select a location
2. get the first letter. =LEFT(cell,1) returns the first letter of the cell.
3. count the number of “A”s. =COUNT(range, “A”) counts the number of “A”s.

But ideally, I’d like a 1-line formula like

`=LENGTH(UNIQUE(GREP("^A", range)))`

Excel doesn’t provide these functions by default, but you can add them as user defined functions. Doing this lets you condense several cells into one. Instead of having to copy all your data into a set of unique values, and then adding a column for the first cell, the entire operation can be condensed into one formula.

I consider the following functions the a basic set for list processing.

• LENGTH(list) counts the number of elements in a list
• INDEX(list, n) returns the nth element of the list
• GREP(string, list) returns elements of the list that have the string
• UNIQUE(list) filters unique values
• UNION(list, list) returns elements in at least one of the lists
• INTERSECTION(list, list) returns elements in both lists
• DIFFERENCE(list, list) returns the elements in the first list but not the second
• REVERSE(list) reverses the order of the list
• STRJOIN(separator, list) joins the elements of the list into a string, separated by a separator
• STRSPLIT(separator, string) splits the string into a list, using a separator
• MVLOOKUP(value, lookup, result) looks up value in “lookup”, and returns the corresponding MULTIPLE values from “result”

I created these UDFs. You can download the functions and play with them. Below are some tasks that you can do with them, that are difficult otherwise.

• Get the file name from a path.
`=INDEX(REVERSE(STRSPLIT("\", filename)), 1)`
• Count the number of unique elements in a range.
`=LENGTH(UNIQUE(range))`
• How many common elements are there in range 1 and range 2?
`=LENGTH(INTERSECTION(range1, range2))`
• How many words are there in a string?
`=LENGTH(STRSPLIT(" ", string))`
• Get the smallest unique numbers in a range
`=SMALL(UNIQUE(range), 5)`
• Count the number of mismatches between two lists.
`=COUNT(Range1)+COUNT(Range2) - COUNT(INTERSECTION(Range1,Range2))`
• Get a list of mismatches between two lists.
`=STRJOIN(",",UNION(DIFFERENCE(Range1,Range2), DIFFERENCE(Range2,Range1)))`
• Count duplicate entries in a range.
`=LENGTH(Range)-LENGTH(UNIQUE(Range))`
• VLOOKUP multiple values
`=MVLOOKUP(A1,Lookup_Range,Return_Range)`
• Count the unique matches in a VLOOKUP
`=COUNT(UNIQUE(MVLOOKUP(A1,Lookup_Range,Return_Range)`

This is a small sample. The power of list processing is phenomenal, especially when combined with array formulas. Download these macros and play with them!

1. James says:

Cheers, a lot of great stuff here, makes my Excel work much easier. My hat off to you!

2. Pete Watkins says:

Thanks ever so much for your work in developing these functions, as these have allowed me to avoid a huge amount of drudgery in processing data!

3. Krishna says:

Anand, thanks for an extremely useful stuff.In connection to our discussion on enhancing my excel skills, i think it would help me largely. Krishna

4. Sundar says:

Very neat job Anand. Congrats for a super job. Waht is your e-mail address? Thanx in advance…………S.

5. Anand Srini says:

Hi I reproduce this extract from your article “Instead of having to copy all your data into a set of unique values, and then adding a column for the first cell, the entire operation can be condensed into one formula” can you explain why a column has to be added for the first cell ?

6. ashwin says:

Dear Anand, As usual; another useful post from you. Thanks, Ashwin

7. Kabir says:

you truly are a genius

8. Shiva says:

Your site is turning to a Dexters Laboratory

9. Ramly says:

Thank you very much this will help a lot

10. Mansi says:

Hi, was just randomly searching for how to generate unique values on excel.. and came across your website. Must say, truly impressive work and this has helped me a lot! Thank you and keep up the good work! ðŸ™‚