Oracle DBA – A lifelong learning experience

Managing the WINDOW_ID in Goldengate V11.2.1.0.33

Posted by John Hallas on August 16, 2017

When we import data into the landing schema on a Data Warehouse via GoldenGate, we add three fields to each record detailing when and how the record was loaded. These mappings can be found in the *.inc files under $GG_HOME/dirinc on the target GoldenGate installation. An example of this is:
map xxx.DBA_GGCUTOVER_TEST, TARGET YYY.DBA_GGCUTOVER_TEST, INSERTALLRECORDS, IGNOREDELETES
COLMAP (
USEDEFAULTS,
WINDOW_ID = @STRCAT(@GETENV("RECORD", "FILESEQNO"), @STRNUM(@GETENV("RECORD", "FILERBA"), RIGHTZERO, 10)),
OPER_TYPE = @GETENV("GGHEADER", "OPTYPE"),
CDC_LOAD_TIMESTAMP = @DATENOW());
map xxx.DBA_GGCUTOVER_TEST #EXCEPTION_HANDLER();
OPER_TYPE is the operation type: insert, update, delete, etc.
CDC_LOAD_TIMESTAMP is self-explanatory.
WINDOW_ID is more interesting: it's made up of the sequence number of the pump trail file that holds the record, concatenated with the RBA padded to 10 digits.
So, if the record is in file $GG_HOME/dirdat/R1MOP101/rm077196, at RBA 78954, then the WINDOW_ID value in the staging table would be:
771960000078954
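The construction above can be sketched in shell (the sequence number and RBA are taken from the example; printf does the zero-padding that @STRNUM handles in the GoldenGate parameter file):

```shell
# Sketch of how WINDOW_ID is derived, using the example values above
SEQNO=77196      # trail file sequence number (from rm077196)
RBA=78954        # relative byte address of the record
# Concatenate the sequence number with the RBA zero-padded to 10 digits
WINDOW_ID=$(printf '%s%010d' "$SEQNO" "$RBA")
echo "$WINDOW_ID"    # prints 771960000078954
```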
BI batch issue caused by the above
We found out yesterday that the BI team (rightly or wrongly, discuss!!) use the WINDOW_ID field in each record on the landing schema’s data to check whether that record has been previously loaded by the batch:
  • For each table, there is a control record that stores the max WINDOW_ID value for that table.
  • The next time the batch is run, it looks at the control record, and then only loads data with a WINDOW_ID greater than the control value.
  • After the batch has completed, the control value is updated.
  • And so on each night.
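The incremental check can be sketched as follows (the values are hypothetical, and the real batch does this inside the warehouse load rather than in shell):

```shell
# Hypothetical sketch of the BI batch's incremental check
CONTROL_WINDOW_ID=771960000078954   # max WINDOW_ID stored in the control record
RECORD_WINDOW_ID=771960000080123    # WINDOW_ID on an incoming record

# Only records with a WINDOW_ID above the control value get loaded
if [ "$RECORD_WINDOW_ID" -gt "$CONTROL_WINDOW_ID" ]; then
    echo "load record"      # not yet processed by a previous batch
else
    echo "skip record"      # already loaded
fi
```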
BUT – what happens if this WINDOW_ID is reset? The above logic fails!
After we replatformed the source system, new data coming in from a fresh GoldenGate install had a low WINDOW_ID:
Old source WINDOW_ID in region of:
771960000078954
New source WINDOW_ID in region of:
   960000078954
To resolve this, we did the following:
  • Stopped the pump process.
  • Waited till replicat caught up, then stopped replicat.
  • Increased the GG pump sequence number via a script like the one below (it shows two increments of the sequence; we ran tens of thousands of these to get the value to exactly 90,000):
ggsci <<EOF
alter extract P1MOP101 etrollover
alter extract P1MOP101 etrollover
EOF
 
** Note I didn’t know you could run ggsci commands via a shell script!
  • Started the pump process
  • Checked that a file was created on xxx called $GG_HOME/dirdat/R1MOP101/rm090000
  • Altered the replicat to point to this new file with “alter replicat R1MOP101, extseqno 90000, extrba 0”, then started the replicat.
  • After this, any records that came through had a new, higher WINDOW_ID.
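The rollover step above can be sketched as a generated ggsci command file rather than typing thousands of lines by hand (the current sequence number here is assumed from the earlier example; each etrollover advances the trail sequence by one):

```shell
# Sketch: roll the pump's trail sequence forward by repeating etrollover.
CURRENT_SEQNO=77196          # assumed current trail sequence number
TARGET_SEQNO=90000           # sequence number we want to reach
INCREMENTS=$((TARGET_SEQNO - CURRENT_SEQNO))

# Build a command file with one etrollover per required increment
CMDFILE=/tmp/rollover.obey
: > "$CMDFILE"
i=0
while [ "$i" -lt "$INCREMENTS" ]; do
    echo "alter extract P1MOP101 etrollover" >> "$CMDFILE"
    i=$((i + 1))
done
wc -l < "$CMDFILE"           # number of commands written (12804)
# Then feed the whole file to ggsci:
# ggsci < "$CMDFILE"
```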
We still had an issue with data that had gone in from Sunday up to the time we stopped the pump above.
We knew the old data was no higher than:
799990000000000
We knew the new data was no higher than:
  1990000000000
We also knew that we didn't want these records to be in the new 900000000000000 range.
So we added 800000000000000 to the new WINDOW_IDs, so that the data loaded between Sunday morning and Monday evening would be in the region of
800000000000000
to
801990000000000
 
This WINDOW_ID would then be higher than the control table values, and would therefore be processed by the BI batch.
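The offset arithmetic can be checked in shell (the boundary values come from the ranges quoted above; the actual fix would be a one-off update against the staging tables):

```shell
# Sketch: shift the in-between records above the old range but below the new one.
OFFSET=800000000000000
MAX_NEW=1990000000000        # highest WINDOW_ID seen from the fresh install
OLD_CEILING=799990000000000  # highest WINDOW_ID from the old source
NEW_FLOOR=900000000000000    # start of the post-rollover range

SHIFTED=$((MAX_NEW + OFFSET))
echo "$SHIFTED"              # prints 801990000000000
# Confirm the shifted range sits strictly between the old and new ranges
[ "$SHIFTED" -gt "$OLD_CEILING" ] && [ "$SHIFTED" -lt "$NEW_FLOOR" ] && echo "range OK"
```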