Thursday, March 18, 2010

Badfiles

Whenever rows get rejected from an insert/update/whatever, they get written to the badfiles. This is kind of like "time out" for naughty records. Unlike naughty children, however, you can choose to completely ignore these records for all eternity. That might not be what you want to do, however.

Assuming you care what's in that file, you'll need to read it. Thing is, it's kind of a convoluted mess upon close examination. Informatica creates a dump of your record plus a bunch of other stuff into a comma-delimited flat file. You can read this file like any other source record, except it's not the same as your failed session's target file definition.

In fact, it looks a little something like this:
0,D,26,D,,N,138,D,,N,,N,,N,,N,7,D,,N,,N,1,D....

The first column is a row_indicator (smallint). It contains values from 0 to 9. The most common being 0 or 1 which means the row was either inserted or updated and failed.

Then, there is a column indicator. The first one of this is always a "D", which means it's valid data.

Then you get every column that was written to your previous target that failed, followed by a column indicator after every column.

So, if you tried to write to a table with 3 columns A, B and C, and failed on a unique constraint violation, your badfiles table will look like this:

0,D,A,D,B,D,C,D

Where all the "D" values are column indicators indicating that the columns were valid data, and the 0 indicates that it was an insert. In the event that a column was null, you'll two commas (,,) followed by an "N" for Null.

For more information on badfiles, consult Informatica's excellent help. This topic is listed under "reject files" - you'll have a hard time searching for it under "badfiles".