Oracle DBA – A lifelong learning experience

Posts Tagged ‘Goldengate’

Managing the WINDOW_ID in Goldengate V11.2.1.0.33

Posted by John Hallas on August 16, 2017

When we import data into the landing schema on a Data Warehouse via Goldengate, we add three fields to each record detailing when and how the record was loaded. This can be found in the *.inc files under $GG_HOME/dirinc on the target GG installation. An example of this is:
map xxx.DBA_GGCUTOVER_TEST, TARGET YYY.DBA_GGCUTOVER_TEST, INSERTALLRECORDS, IGNOREDELETES
COLMAP (
USEDEFAULTS,
WINDOW_ID = @STRCAT(@GETENV("RECORD", "FILESEQNO"), @STRNUM(@GETENV("RECORD", "FILERBA"), RIGHTZERO, 10)),
OPER_TYPE = @GETENV("GGHEADER", "OPTYPE"),
CDC_LOAD_TIMESTAMP = @DATENOW());
map xxx.DBA_GGCUTOVER_TEST #EXCEPTION_HANDLER();
OPER_TYPE is Insert, Update, Delete etc.
CDC_LOAD_TIMESTAMP is self-explanatory.
WINDOW_ID is more interesting: it's made up of the sequence number of the pump trail file that holds the record, followed by the RBA padded to 10 digits.
So, if the record is in file $GG_HOME/dirdat/R1MOP101/rm077196, at RBA 78954, then the WINDOW_ID value in the staging table would be:
771960000078954
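As a quick sanity check, you can reproduce the construction in the shell. The values below are the ones from the example above, and printf's %010d does the same zero padding as STRNUM with RIGHTZERO:

# Reproduce the WINDOW_ID for trail seqno 77196 and RBA 78954
seqno=77196
rba=78954
printf '%s%010d\n' "$seqno" "$rba"   # prints 771960000078954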
BI batch issue due to the above
We found out yesterday that the BI team (rightly or wrongly, discuss!!) use the WINDOW_ID field in each record on the landing schema to check whether that record has already been loaded by the batch:
  • For each table, there is a control record that stores the max WINDOW_ID value for that table.
  • The next time the batch is run, it looks at the control record, and then only loads data with a WINDOW_ID greater than the control value.
  • After the batch has completed, the control value is updated.
  • And so on each night (the sketch below shows the idea).
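The post describes the batch rather than shows it, but the logic amounts to something like the following sketch. The control table BI_CONTROL and its columns are my guesses, not the BI team's real objects; the landing table name reuses the example from above and $BI_CONNECT is assumed to hold the batch schema's credentials:

# Hypothetical sketch of the nightly BI batch logic for one table
sqlplus -s "$BI_CONNECT" <<'EOF'
-- load only rows newer than the stored high-water mark
INSERT INTO bi_stage.dba_ggcutover_test
SELECT *
FROM   yyy.dba_ggcutover_test t
WHERE  t.window_id > (SELECT max_window_id
                      FROM   bi_control
                      WHERE  table_name = 'DBA_GGCUTOVER_TEST');

-- then move the high-water mark on for the next run
UPDATE bi_control
SET    max_window_id = (SELECT MAX(window_id)
                        FROM   yyy.dba_ggcutover_test)
WHERE  table_name = 'DBA_GGCUTOVER_TEST';

COMMIT;
EOF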
BUT – what happens if this WINDOW_ID is reset? The above logic fails!
After we had replatformed the source system, new data, coming in from a fresh GG install, now had a low WINDOW_ID:
Old source WINDOW_IDs were in the region of:
771960000078954
New source WINDOW_IDs were in the region of:
   960000078954
To resolve this, we did the following:
  • Stopped the pump process.
  • Waited till replicat caught up, then stopped replicat.
  • Increased the GG pump sequence number via a script like the one below (it shows two increments of the sequence; we ran tens of thousands of these to get the value to exactly 90,000!):
ggsci <<EOF
alter extract P1MOP101 etrollover
alter extract P1MOP101 etrollover
EOF
 
** Note: I didn't know you could run ggsci commands via a shell script!
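Doing tens of thousands of rollovers one line at a time is painful, so a loop is the obvious move. This is a sketch rather than the script we actually ran, and the start and target sequence numbers are illustrative; "info extract P1MOP101" shows the real current seqno:

# Write the etrollover commands to an obey file, then run it in one ggsci session
current=96        # illustrative current trail seqno
target=90000      # seqno we want the next trail file to have
obey=/tmp/rollover.oby
: > "$obey"
for ((i = current; i < target; i++)); do
    echo "alter extract P1MOP101 etrollover" >> "$obey"
done
ggsci <<EOF
obey $obey
EOF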
  • Started the pump process
  • Checked that a file was created on xxx called $GG_HOME/dirdat/R1MOP101/rm090000
  • Altered the replicat to point to this new file with “alter replicat R1MOP101, extseqno 90000, extrba 0”, then started the replicat.
  • After this, any records that came through had a new, higher WINDOW_ID.
We still had an issue with data that had gone in from Sunday up to the time we stopped the pump above.
We knew the old data was no higher than:
799990000000000
We knew the new data was no higher than:
  1990000000000
We also knew that we didn't want these records to be in the new 900000000000000 range.
So we added 800000000000000 to the new WINDOW_IDs, so the data loaded between Sunday morning and Monday evening would be in the region of
800000000000000
to
801990000000000
 
This WINDOW_ID would then be higher than the control table values, and the records would therefore get processed by the BI batch.
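The adjustment itself is a single update per affected table. This sketch assumes the affected rows can be identified purely by their low WINDOW_ID range, and reuses the example table name from above:

# Hypothetical fix-up: shift the new-source rows loaded between Sunday and
# Monday into the 8000... range so they clear the BI control values
sqlplus -s "$DBA_CONNECT" <<'EOF'
UPDATE yyy.dba_ggcutover_test
SET    window_id = window_id + 800000000000000
WHERE  window_id <= 1990000000000;   -- only the new-source rows sit this low
COMMIT;
EOF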

Posted in Goldengate

UKOUG 2012 – Day 2

Posted by John Hallas on December 4, 2012

A deep dive into Dataguard is a tough way to start the day but Emre Baransel handled it well. I jotted down about 6 or 7 takeaways for later investigation. Much of the talk was about tuning redo log apply and I must admit I don't think we have many problems in that area across the estate, but it is still worth reviewing. One topic that struck me as worth looking at in more detail was the ability to recover a standby database from a primary backup. Chatting to a few people afterwards, they had all heard of the capability but had never tested it.

Maria Colgan is fantastic. Nuff said. Very, very knowledgeable about the optimizer and a great speaker. She just seems so enthusiastic about what can be a very dry subject. Too many notes for me to mention, but one worth highlighting is a tip for when you wonder whether your hint was used or even seen. Take a 10053 trace and search for 'Dumping Hints', which will be found right at the end. The used and err values will give you the answer to your question.
PS: nice but firm shutting up of someone in the audience who liked the sound of his own voice too much.
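If you want to try the 10053 tip, a minimal recipe looks something like this. The table, index and hint are placeholders, and the statement must hard parse for the trace to be written:

# Generate a 10053 trace for one statement, then look for the hint report
# ("Dumping Hints") near the end of the trace file
sqlplus -s "$CONNECT" <<'EOF'
ALTER SESSION SET tracefile_identifier = 'hintcheck';
ALTER SESSION SET EVENTS '10053 trace name context forever, level 1';
SELECT /*+ index(t demo_ix) */ COUNT(*) FROM demo_table t;
ALTER SESSION SET EVENTS '10053 trace name context off';
EOF
grep -A 5 'Dumping Hints' "$ORACLE_BASE"/diag/rdbms/*/*/trace/*hintcheck*.trc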

Owen Ireland ran through an introduction to Goldengate which, despite us already using GG, was quite informative and helpful, with lots of pointers to useful resources. If you are looking at GG then I suggest you get the slides from the UKOUG library.

Before lunch Larry Carpenter was talking about a new 12C feature which allows Data Guard to transfer to a remote (a long way away, and therefore suffering from potential latency) standby. Far Sync allows a lightweight database to be sited locally, but not so close that it would be affected by a primary site failure. A lightweight server with minimal memory, CPU and storage can be used (and it can host several Far Sync instances). The primary ships redo synchronously (maximum availability) to the Far Sync instance, which in turn ships it asynchronously (maximum performance) to the remote standby database. It seems a good idea if the DR site is hundreds or thousands of miles away.
There will also be a DG administrator role in 12C that will only be able to manage DG but not touch or see data.

My second Carlos Sierra talk of the conference was about the SQL Health Check script, which is like a poor man's SQLTXPLAIN. However, this script does not create any database objects so it can be run on production systems without a change control. Well worth investigating: all you do is plug a sql_id in and then review the html page that is output.

Each year I try and attend a session about a subject that is new to me, and this year I decided on Exalytics. The presenter was Robin Moffatt, who gave a very accomplished overview of the toolset (which must be heaven for Oracle sales staff, as it involves licenses for OBIEE, TimesTen, Exalytics hardware and, optionally, Goldengate). Robin's presentation was very professional and gave a really good overview; however, the real value was where Robin talked about what worked and what didn't, and he obviously had plenty of experience of both aspects.

A good day spoilt by feeling ill during the late afternoon and evening.

Posted in UKOUG

UKOUG 2011 – day 3

Posted by John Hallas on December 9, 2011

The final day of the 2011 UKOUG conference and it was straight in at the deep end with Joel Goodman talking about automatic parallelism in 11GR2. The talk was full of information, as Joel's talks normally are. He also had time to cover Parallel Bulk Update, which groups sets of rows into chunks. Each chunk can succeed or fail independently of the other chunks, which removes the 'all or nothing' approach normally seen with PDML. He has a good blog entry on this which is well worth perusing if you are interested. http://dbatrain.wordpress.com/2011/07/01/add-bulk-to-your-parallel-updates/

My site is just going down the road with Goldengate, so the talk by Pythian's Marc Fielding on a real-life Goldengate migration was very useful. This was a large financial institution where the system was crucial to business continuity, and GG was to be used to provide a rapid fallback facility if things went wrong. The main thing I took away from the talk was how small-minded they must be not to provide adequate testing facilities for such a large project. Not being able to use full data sets and similar-sized hardware (OK, it was a 14TB database) adds a lot of risk and no small amount of frustration for the technicians involved in the migration. Some of the diagnostics that Marc talked about will be very useful, and I was interested in the alternatives to supplementary logging which may be required if there is no primary key and it is difficult to identify a row specifically.

I did start to listen to another talk but after around 10 people had left I plucked up courage and made a hasty exit myself. It was just not for me.

The best presentation I saw at the conference was Connor McDonald on a fresh approach to optimizer statistics. Connor is a real showman and his easy on-stage manner belies the degree of effort he must spend preparing his numerous slides. The set of slides associated with the ITIL process deserved a round of applause by itself, and indeed it received one. This was the second session I went to that mentioned the value of index key compression and the way it can be calculated by using 'analyze index validate structure'. A very good presentation that provided food for thought.
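For reference, the compression check mentioned is a two-step affair; a minimal sketch with a placeholder index name:

# Validate the index, then read the optimal compression prefix length and the
# estimated space saving from INDEX_STATS. Note: VALIDATE STRUCTURE takes a
# lock, and INDEX_STATS only holds the last validation in this session
sqlplus -s "$CONNECT" <<'EOF'
ANALYZE INDEX my_schema.my_index VALIDATE STRUCTURE;
SELECT name, opt_cmpr_count, opt_cmpr_pctsave FROM index_stats;
EOF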

My final session was Mike Swing talking about database tuning for packaged apps. He had way too much content and rushed through it much too fast. As several people said to me afterwards, all he really recommended was getting faster disks and more memory. I liked his presentation style and easy manner but it was a bit light on useful content.

So here endeth day 3. I think this was the conference I have enjoyed the most and got the most from. The presentations were of a top standard and even though I was only interested in the RDBMS stream I had plenty of choice for most time slots. I know that cancellations and changes are hard to avoid but there did seem to be a lot, and that made planning harder than it should have been. I think my only constructive critique would be that there were a number of presentations repeated from last year (and some from other SIGs almost 2 years ago). I fully understand that a good presentation is still a good presentation a year later and not everyone has the chance to have seen it, but personally I am not in favour of too much repeated material.

 

Posted in 11g new features, Oracle, UKOUG