Thursday, January 12, 2006

A Careful Balancing Act - The Daily WTF:

There's not much more I can add to today's story, so let me turn it over to Alex Villacis Lasso ...

I once worked as a junior programmer in a once important bank in Ecuador, South America (which bank, I won't say, but its name starts with an "F"). This bank had an IT department in which MS VisualBasic 5 was the dominant tool, with some mainframes thrown in and running programs written in COBOL, RPG, and a few other exotic languages. While working there, I was assigned the following task: take an existing program (whose purpose I will explain below) and modify it to accept a new file format. When I finished studying the program, my jaw had dropped to the floor...

The program at hand was a fix for an issue with a COBOL system in one of the mainframes. This COBOL program generated a balance sheet like this:

0020304     10278.93  # Total Assets
0020304C 9223.41 # Cash Holdings
0020304R 1055.52 # Accounts Receivable
0020304R1 876.03 # Loan 00070259
0020304R2 89.75 # Loan 00009007
0020304R3 137.22 # Loan 00020113

However, this COBOL program had this little problem. The account entries (e.g.0020304R1, 0020304R2) did not add up to the container (0020304R) and drifted by a few cents. For a bank, this is obviously a serious bug. However, instead of fixing the COBOL program, somebody wrote a VB5 program (which now I had to modify) to read the balance sheet generated by the COBOL program and transform it into the following format:

0020304       =B2+B3  # Total Assets
0020304C 9223.41 # Cash Holdings
0020304R =B4+B5+B6 # Accounts Receivable
0020304R1 876.03 # Loan 00070259
0020304R2 89.75 # Loan 00009007
0020304R3 137.22 # Loan 00020113

The idea was that the file would become a CSV file that would be loaded into MS-Excel, and then Excel would recalculate the proper values from the formulas. How they thought that was an acceptable solution, I don't know - that was written before my time at the bank. I was handed the VB5 source code and told to modify the output format to conform to an ongoing standardization project on file formats.

This is how the original VB5 program attempted to build the formulas for the balance sheet:

1) Open the input file (the COBOL-generated one)
2) Write the entire file into a temporary copy
3) Create the (initially empty) output file
4) Read the first account from the input file
5) Open the newly created temporary copy
6) Scan the *entire* temporary copy created at step 2, looking for the contained accounts (0020304 has 0020304C and 0020304R which in turn has 0020304R1, 0020304R2, and 0020304R3)
6) Build the formula from the line numbers at which the contained accounts were found
7) Write the formula to the output file
8) Close the output file
9) Lather, rinse, repeat from step 4 for every single account on the input file.

This is obviously an O(N^2) operation with N file scans. The typical balance sheet had over 700 account entries, and so the program took half a minute to process such a file. Throw in a complete lack of input file validation and the fact that the code (uncommented, of course) was written entirely inside a Button_OnClick procedure... I seized the chance and rewrote the whole thing to slurp the file into memory, qsort it (just for good measure - the COBOL file was usually sorted already), and scan it with a recursive function that kept a backreference to add contained accounts, then write it to the output file. My bosses were worried at first because the improved program did the task in under half a second on a 486, and thought I forgot to do some operation.

The bank in question (along with several others) was involved in a huge financial collapse around 1999-2000. Many people were unable to use their savings for months and months. Having seen some of the programs used internally, I suspect the less-than-ideal quality of them might have played a role in this collapse.

0 Comments:

Post a Comment

<< Home