Excel tips

Excel – Avoid manual labour 3

A corollary of Rule 3: Never type the same formula twice. Design the formula so that if you cut and paste it elsewhere, it works correctly. The $ symbol and the F4 key for cell references help in 90% of the cases. For complex requirements and large data, 5 functions come in handy: INDIRECT, OFFSET, ADDRESS, ROW and COLUMN.

I once did a survey, and had data spread across 300 sheets (same format on all sheets). I needed cell D3 across all sheets in a column, to summarise the results. The image explains what I did.

Excel snapshot

INDIRECT returns the value of a cell. INDIRECT(“Sheet2!D3″) is the value of cell D3 in Sheet2. And INDIRECT(CONCATENATE(A2,”!D3″)) will give you the value of cell D3 in whatever sheet A2 specifies! I created a list of sheet names in column A, and column B had “D3” in each of those sheets. In effect, INDIRECT can transpose sheets into columns.

Getting a list of sheet names on to a column is tough, however. If your sheets are sequentially numbered, this image shows a trick that may help.

Excel snapshot

If you need multiple cells from a sheet, say D3:Z3, use the ADDRESS, ROW and COLUMN functions. ROW(D3) returns 3, and COLUMN(D3) returns 4 — the respective row and column. If you copy ROW(D3) to multiple rows, you will see ROW(D3), ROW(D4), ROW(D5), … which are 3, 4, 5 respectively. Similarly for COLUMN. It’s a useful way of linking values to position.

Excel snapshot

ADDRESS does the opposite. ADDRESS(ROW(D3),COLUMN(D3)) = ADDRESS(3,4) = “$D$3”. ADDRESS(3,4,1,1,”Sheet2″) returns “Sheet2!$D$3”. (See help for the ,1,1 in the middle, and just put it in always.) To cells D3:Z3 from all the sheets, copy the formula INDIRECT(ADDRESS(3,COLUMN(D3),1,1,$A2)) to the entire range. The INDIRECT, ADDRESS, ROW/COLUMN combination can slice contiguous data across sheets in any way you want.

Another useful function is OFFSET. OFFSET(D3,2,1) returns the value in cell E5. It shifts the reference D3 down by 2 rows and right by 1 column. OFFSET can be used instead of the INDIRECT and ADDRESS when multiple sheets are not involved. OFFSET can also return a range. OFFSET(D3,0,0,2,2) returns the range D3:E4, which is the 2×2 range starting from 0,0. So SUM(OFFSET(D3,0,0,2,2)) is the same as SUM(D3:E4). With OFFSET, you can specify a range with variable position and variable size (which you can’t with $ references).

Once, we were modelling a leasing company’s accounts. (Warning: this is a complex example.) We knew the volume of loans they would disburse over the next 3 years. The monthly interest rate is, say, 1%. What would be their interest income every month? Well, it’s not just 1% of what they’ve lent out. Customers pay back in equal monthly installments (EMIs). The EMI includes the principal and the interest. Initially, the EMI has a large interest component and very little principal, because there’s a lot left to repay. Towards the end, the balance dies down and so does the interest; it’s mainly the balance principal that’s being repaid. The interest income is not the same every month even for a single lease.

The calculation is conceptually simple. The IPMT function tells you the monthly interest each month. Let’s say all leases are for 36 months. So, to calculate the March interest income, take the January disbursals and multiply it by the third month interest component: IPMT(1%,3,36,-1). Take the Feb disbursals and multiply it by IPMT(1%,2,36,-1). Take the March disbursals and multiply it by IPMT(1%,1,36,-1). And add them up. For April, you’d add 4 terms. And so on. Mathematicians call this a convolution. It’s like a SUMPRODUCT of a series with another series in reverse.

Excel snapshot

Cell E4 on the image alongside does exactly that for month 3 (March). There are 5 columns:
A: Month
B: Amount disbursed that month
C: Months in reverse
D: Interest component for month in reverse
E: Interest income for month
E4 is the sumproduct of B2 to B4 (the Jan, Feb and March disbursals) with something else — an OFFSET. The offset says, from D1, move down C4 (34) rows and select A4 (3) cells further down. This has the interest components for the first, second and third months in reverse. So, the disbursal for Jan is multiplied with the 3rd month’s interest, Feb with the 2nd month’s interest, and Mar with the 1st month’s interest. That’s exactly what we wanted.

It may look complex. But remember: you have to type this complex formula only once, not 36 times. (And in my case, I had 18 product types.) Also, you’re less likely to make typing errors when cutting and pasting. So this saves you debugging time as well.

Excel – Avoid manual labour 2

Rule #3: Avoid manual labour (continued)

Reconciling data is where I spend most of my time on Excel. Say you have a list of branches by city from 2 banks. You want to know where both banks have branches. Excel doesn’t know that Kolkata is Calcutta. There are 500 cities, and you have 30 minutes.

Excel snapshot

Use VLOOKUP for a start. If Bank A’s cities are in column A (say 2-500) and Bank B’s cities in column B (say 2-400), in C2 type VLOOKUP(A2, B$2:B$400, 1, 0) (read Excel help — all I’ll say is, don’t miss out the 0 at the end: otherwise you get approximate match, and that’s not good). Copy the formula to down to C500. Similarly, in D2 type VLOOKUP(B2, A$2:A$500, 1, 0). Copy the formula down to D400.

Excel snapshot

You’ll see the #N/A where there’s no match. #N/A in Column C means Bank A has a branch here, but Bank B does not. Column D has the converse. But we’re not done yet. There could be spelling mistakes. Using two VLOOKUPs simplifies that problem considerably. We just need to match the cities having #N/A on both lists to check for alternate spellings of cities — which is a lot less work! So prepare a separate list: unmatched cities from Bank A, and unmatched cities from Bank B. (See the section on removing unwanted rows to simplify this.)

Excel snapshot

Sort both the lists while remembering the original order. You’ll want to remember the original order often — so just add a column, number it sequentially (1,2,3… use Alt-E-I-S), and sort the city names along with the numbers. When you want to get back the original order, just sort by the numbers again. To avoid distraction, you can move or hide these numbers. Now, you have a sorted list of unmatched cities. Notice that you can visually match many of these cities. There’s nothing easier to search (visually) than a sorted list.

Finally, when you’ve mapped all your columns, the ones that are remaining are the ones where there is no overlap.

Excel – Avoid manual labour

Rule #3: Avoid manual labour. Use Excel to automate the task.

I use Excel’s formulas to speed up repetitous tasks. These techniques are powerful, meaning, you can do a lot with a little, but can have unforeseen consequences.

Excel can find and replace formulas. If you had hardcoded formulas and wanted to change =B1*3.14 to =B1*3.1416 across all rows, just find “*3.14” and replace it with “*3.1416”. Find and replace works in formulas. This is very powerful. You can use it to change the source (e.g. change the source from column B to C by finding “=B” and replacing with “=C”) or even the formula (find and replace “SUM(” with “SUBTOTAL(9,”).

You can also search and replace for errors (like #N/A, #REF, etc). In the Find dialog options, select “Values” under the Look in” option. To replace these cells, copy and paste the cells by value (Ctrl-C, go where you want to paste, Alt-E-S-V-Enter). Now you can search and replace #N/A just like any other value.

Find external links. If you have links to other Excel files, and one of them is missing, you’ll get an error saying “This workbook contains links …” It’s annoying, and difficult to trace the source. But since links to external files have the formula =Path\[file]Sheet!Cell, just search for “[” across sheets (you can search across sheets using the Options button).

Format based on value. Conditional formatting (Alt-O-D) accepts formulas. You can set a cell’s background to red, yellow or green if it’s value is low, medium or high. Pick a cell, say D3. In conditional formatting, select “Formula is” instead of “Cell value is”, and type “=D3<10". Set the format to a red background. Copy the formatting to all the cells (use the Format Painter button, or Edit-Paste Special-Formats). All cells less than 10 will have a red background. This is powerful because you can use any formula based on any cell. For example, you could pick the conditional formatting formula “=$C3<10" for the cell D3. The cell becomes red if the cell to its left is less than 10. Best of all, you can say "=$C3<$A$1". As you change the value in cell A1, the colours change automatically. Since you can copy and paste the formatting alone (Edit - Paste Special - Format), you can set the conditional format in one cell, and copy & paste it across any selection.

Remove unwanted rows. Sometimes, when you import data, you have empty rows, or errors or whatever. To delete empty rows, select the data, go to Data – Filter – Autofilter. Click on one of the arrow buttons, and select the “(Blank)”. This will display all the blank rows. Select them all and delete them. If you want to delete rows with errors, click the arrow button, select whichever values are errors, and delete those rows. If you want to delete rows based on some other criteria, create a new column that shows TRUE or FALSE based on the criteria, do the Autofilter, and delete the rows.

An alternative to Autofilter is to sort the data based on the column you want. All blank rows (or errors, etc) will be grouped together, and you can delete them at one shot.

Excel – Never type in data

Rule #2: Never type in data in Excel.

You rarely spend time creating voluminous data. Usually, you’re just processing it (copying, transforming, whatever).

Sometimes data is on a web page — typically tables. To copy such data, open the page in Internet Explorer and paste it in Excel. You won’t like the formatting. So copy the cells you just pasted, go to a different sheet, and Edit-Paste Special just the values (Alt-E-S-V-Enter).

Sometimes data is on a text file. You can open text files directly in Excel. Each line becomes a row. You can split lines into columns if there is a “delimiter” between any two cells. Just load a text file, select all the rows, and play with the Data – Text to Columns menu (Alt-D-E).

Sometimes, data is on a PDF file. Usually, such data is in a table. If you have Adobe Reader, tough luck. Just select and copy the table, paste it into Notepad, manually format it (painful), copy again from Notepad and paste in Excel. If you have Adobe Acrobat, it’s slightly better. You can use the “Select Column” tool to select and copy entire columns of the table in one shot.

Sometimes, data is on paper. Scanner often come with an optical character recognition (OCR) software. If not, Microsoft Office 2003 comes with a Microsoft Office Document Imaging tool has OCR. Just scan the image, open it in the Microsoft Document Imaging tool, go to the Tools – Recognize Text Using OCR… menu, and pray.

After all this importing, the data is never “clean”. Errors due to unintended delimiters, extraneous blank lines, etc are fairly frequent. I’ll talk about how to manage this when discussing Rule #3: Automate the task

Excel – Never use the mouse

I spend a lot of time building models on Excel. I have 4 rules that help me get things done fast.

  1. Never use the mouse. The keyboard is much faster.
  2. Never type in data. You can always import it.
  3. Avoid manual labour. Use Excel to automate the task.
  4. Make your data visually obvious.

Let’s look at Rule #1: Never use the mouse.

Using the keyboard can be 10 times faster than the mouse. It takes time to move one hand from the keyboard to the mouse, locate the item you want to click at, move the mouse there, adjust it finely so it’s pointing at the exact spot, and then click it. For example, to insert text without formatting, I’d just go Alt-E, S, enter. It takes half a second. It took me 5 seconds with a mouse. (I timed 10 continuous attempts in both cases.)

A factor of 10 speed advantage like that is good for two reasons: it saves you time, and it doesn’t distract you from what you’re doing (provided the keyboard shortcuts have become a habit.)

For newbies: To use menus using keyboard shortcuts, first go to Start – Settings – Control Panel – Display – Appearance tab – Effects button – Hide underlined letters for keyboard navigation until I press the Alt key. Make sure it is turned off. To use a menu, let’s say “Insert – Row”. look for the underlined letter on the menu bar (the “I” on “Insert”), press Alt and the underlined letter (Alt-I in this case), and look for the underlined letter on the next menu item (“R” on the “Rows” in this case) and press that letter. So, Alt-I-R is the shortcut to insert a row. Now, just practice Alt-I-R, Alt-I-R, Alt-I-R repeatedly.

Shortcuts next to the menu are quicker, where they exist. For example, the Ctrl-C next to the Edit-Copy menu.

Apart from the arrow keys, Ctrl-S, Ctrl-X, Ctrl-C, Ctrl-V, the keys I use frequently are:

  • F2: edit the current cell
  • F4: repeat the last action (very useful)
  • Ctrl-Z, Ctrl-Y: Undo, Redo
  • Ctrl-1: Format cells
  • Ctrl-Shift-down arrow: Select all filled cells below selection (also works with other arrow keys)
  • Ctrl-PgUp/PgDn: Shift between tabs
  • Ctrl-Home/End: Go to top-left, or bottom-right of the sheet (or cell, if you’re editing a cell)
  • Ctrl-`: Show formulae
  • Shift-Space, Ctrl-Space: Select row, select column
  • Alt-Enter: To create a new line while you’re typing in a cell
  • Alt-E-I-S-Enter: Select a set of cells and fill a continuous series of numbers in it
  • Alt-E-S-Enter: Paste unformatted text
  • Alt-D-G-G: Group a set of rows (use Alt-D-G-U for ungroup)
  • Alt-D-G-S: Show a collapsed group (use Alt-D-G-H to collapse a group)
  • Alt-O-D: Conditional formatting
  • Alt-F8: Macros