Oracle DBA – A lifelong learning experience

Creating a database link using a tnsnames entry

Posted by John Hallas on December 29, 2017

I have had this in my draft posts for a couple of weeks and I refrained from publishing because it was too simple.

However the main purpose of this post is to help others save time by providing useful information. The secondary purpose of to remind myself how I overcame whatever problem is being covered.

A user did not have permissions to edit the tnsnames.ora file on a server so I created the database link for him to use the connect description, thereby circumventing the need for an update to the tnsnames.ora file.

I prefer database links to be named to show the database they connect to, however in this case they wanted a standard name which is used in code and can be used across all the environments.

 create database link ebsdev connect to remote_username  identified by password using
‘EBSDEV =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = servername)(PORT = 1531))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = EBSDEV)
)
)’;
select ‘x’ from dual@ebsdev;
Advertisements

Posted in Oracle | Leave a Comment »

ORA-01450: maximum key length (3800) exceeded

Posted by John Hallas on December 13, 2017

This is an oddity as essentially I have an index on a table but I cannot rebuild it because it suggests the block size should be bigger than 8K – however both the database and tablespace block size are 16K already.

The maximum allowed index key length depends on your block size. So the minimum allowed size reported in ORA-01450 varies, depending on which block size your index is using:

ORA-01450 maximum key length (758) exceeded -> (2K Block)

ORA-01450 maximum key length (1578) exceeded -> (4K block)

ORA-01450 maximum key length (3218) exceeded -> (8K Block)

ORA-01450 maximum key length (6498) exceeded -> (16K Block)
See MOSC Note 136158.1 and MOSC Note 236329.1 for more details on the ORA-01450 error and key length.

Even more odd was that I could drop the index and recreate it with no issues and yet rebuild failed every time, even after the recreate. Studying the table and data there did not seem to be a value that was much bigger than the others and certainly nothing that would cause one or more rows to be much larger than the others. Read the rest of this entry »

Posted in Oracle | Tagged: | 3 Comments »

How to add a metric extension and use a ‘control’ table to test results

Posted by John Hallas on November 27, 2017

I have previously blogged about creating a User Defined Metric (UDM) before  (7 years ago!)  but the method I am using today demonstrates how you can test out the alert using specific data.

The aim was to see if integration data is flowing through a table.  If there is a backlog we want to know about it.

Simon Ryan, a colleague did all the preparatory work and created all the 18 !! screen shots which I had to copy and paste into this entry

select * from sm7.EVENTOUTM1; -- Transient table so data not usually present.

The data is processed quickly but occasionally you can see a handful of rows for a few seconds.

select DESCRIPTOR ,
EVTYPE ,
TO_CHAR(EVTIME, 'DD-MON-YYYY HH24:MI:SS') ,
EVSYSSEQ ,
EVUSRSEQ from sm7.EVENTOUTM1;
DESCRIPTOR EVTYPE EVTIME EVSYSSEQ EVUSRSEQ
 (BLOB) email 08-NOV-2017 11:40:01 0000000005254328

A typical query for the ME would be very simple:

select count(*) from sm7.EVENTOUTM1;

However, we are going to plugin a ‘control’ table in order for us to test the alert by artificially adjusting the results. Read the rest of this entry »

Posted in Oracle | Tagged: , , , | Leave a Comment »

DBA vacancy Bradford, West Yorkshire

Posted by John Hallas on November 24, 2017

I have a vacancy in my team and the details are in the link below

https://apply.morrisons.jobs/vacancies/2137/technology-specialist–database.html

Lots of interesting work going including 85Tb Data Warehouse on Exadata and developing Cloud capability using AWS and Google.

 

Posted in Oracle | 3 Comments »

Baselines – session creating privs v session running privs

Posted by John Hallas on November 21, 2017

A colleague Richard Wilkinson was telling me about an issue he had come across with baselines and I asked him to write it up as it was an interesting experience.

The following MERGE SQL runs once a day. It always uses the same plan and roughly takes between 30 and 80 minutes:

SQL_ID/ PHV 6zs5dk6t6pkfs / 3064471754

 MERGE INTO DWB_X_PLNGRM_ATTR_DAY t
 USING (
 SELECT ilv.bsns_unit_cd,
 ilv.sku_item_nbr,
 ilv.cntnt_cd ,
 ilv.sel_units_cntnt_status,
 ilv.plngrm_item_grp_cd,
 dwr.bay_cnt_key,
 dwr.bay_cnt
 FROM (
 SELECT DISTINCT bsns_unit_cd,
 sku_item_nbr,
 cntnt_cd ,
 sel_units_cntnt_status,
 plngrm_item_grp_cd,
 --bay_cnt_key,
 max(bay_cnt_key) bay_cnt_key
 --bay_cnt
 FROM dwb_x_sel_item_units_day
 WHERE (1=1)
 AND day_key >= :v_daywkey
 GROUP BY bsns_unit_cd, sku_item_nbr, cntnt_cd, sel_units_cntnt_status, plngrm_item_grp_cd
 ) ilv,
 dwr_x_plngrm_loc_bay_cnt dwr
 WHERE ilv.bay_cnt_key = dwr.bay_cnt_key
 ) s
 ON (
 t.CNTNT_CD = s.CNTNT_CD
 AND t.SKU_ITEM_NBR = s.SKU_ITEM_NBR
 AND t.BSNS_UNIT_CD = s.BSNS_UNIT_CD
 AND t.PLNGRM_CNTNT_STATUS = s.SEL_UNITS_CNTNT_STATUS
 AND t.PLNGRM_ITEM_GRP_CD = s.PLNGRM_ITEM_GRP_CD
 --and t.DAY_KEY >= :v_daywkey
 )
 WHEN MATCHED
 THEN UPDATE SET
 t.bay_cnt_key = s.bay_cnt_key,
 t.bay_cnt = s.bay_cnt;
Plan hash value: 3064471754 
 
---------------------------------------------------------------------------------- 
| Id | Operation | Name | 
---------------------------------------------------------------------------------- 
| 0 | MERGE STATEMENT | | 
| 1 | PX COORDINATOR | | 
| 2 | PX SEND QC (RANDOM) | :TQ10007 | 
| 3 | INDEX MAINTENANCE | DWB_X_PLNGRM_ATTR_DAY | 
| 4 | PX RECEIVE | | 
| 5 | PX SEND RANGE | :TQ10006 | 
| 6 | MERGE | DWB_X_PLNGRM_ATTR_DAY | 
| 7 | PX RECEIVE | | 
| 8 | PX SEND HYBRID (ROWID PKEY) | :TQ10005 | 
| 9 | VIEW | | 
| 10 | HASH JOIN BUFFERED | | 
| 11 | PX RECEIVE | | 
| 12 | PX SEND HASH | :TQ10003 | 
| 13 | HASH JOIN BUFFERED | | 
| 14 | PX RECEIVE | | 
| 15 | PX SEND HASH | :TQ10001 | 
| 16 | PX BLOCK ITERATOR | | 
| 17 | TABLE ACCESS STORAGE FULL | DWR_X_PLNGRM_LOC_BAY_CNT | 
| 18 | PX RECEIVE | | 
| 19 | PX SEND HASH | :TQ10002 | 
| 20 | VIEW | | 
| 21 | SORT GROUP BY | | 
| 22 | PX RECEIVE | | 
| 23 | PX SEND HASH | :TQ10000 | 
| 24 | PX BLOCK ITERATOR | | 
| 25 | TABLE ACCESS STORAGE FULL| DWB_X_SEL_ITEM_UNITS_DAY | 
| 26 | PX RECEIVE | | 
| 27 | PX SEND HASH | :TQ10004 | 
| 28 | PX BLOCK ITERATOR | | 
| 29 | TABLE ACCESS STORAGE FULL | DWB_X_PLNGRM_ATTR_DAY | 
----------------------------------------------------------------------------------

One day this sql changes execution plan and runs for over 5 hours before being killed.

The new plan is 4185795484:

----------------------------------------------------------------------------- 
| Id | Operation | Name | 
----------------------------------------------------------------------------- 
| 0 | MERGE STATEMENT | | 
| 1 | MERGE | DWB_X_PLNGRM_ATTR_DAY | 
| 2 | PX COORDINATOR | | 
| 3 | PX SEND QC (RANDOM) | :TQ10005 | 
| 4 | VIEW | | 
| 5 | HASH JOIN BUFFERED | | 
| 6 | PX RECEIVE | | 
| 7 | PX SEND HASH | :TQ10003 | 
| 8 | PX BLOCK ITERATOR | | 
| 9 | TABLE ACCESS STORAGE FULL | DWR_X_PLNGRM_LOC_BAY_CNT | 
| 10 | PX RECEIVE | | 
| 11 | PX SEND HASH | :TQ10004 | 
| 12 | HASH JOIN BUFFERED | | 
| 13 | PX RECEIVE | | 
| 14 | PX SEND HASH | :TQ10001 | 
| 15 | VIEW | | 
| 16 | SORT GROUP BY | | 
| 17 | PX RECEIVE | | 
| 18 | PX SEND HASH | :TQ10000 | 
| 19 | PX BLOCK ITERATOR | | 
| 20 | TABLE ACCESS STORAGE FULL| DWB_X_SEL_ITEM_UNITS_DAY | 
| 21 | PX RECEIVE | | 
| 22 | PX SEND HASH | :TQ10002 | 
| 23 | PX BLOCK ITERATOR | | 
| 24 | TABLE ACCESS STORAGE FULL | DWB_X_PLNGRM_ATTR_DAY | 
-----------------------------------------------------------------------------

The main difference I can see is that the tables are being joined in a different order.

3064471754 (good): DWB_X_SEL_ITEM_UNITS_DAY joined to DWR_X_PLNGRM_LOC_BAY_CNT then on to the target of the MERGE, DWB_X_PLNGRM_ATTR_DAY.
4185795484 (bad): DWB_X_SEL_ITEM_UNITS_DAY joined to DWB_X_PLNGRM_ATTR_DAY then to DWR_X_PLNGRM_LOC_BAY_CNT.

The fix is simple, fix the good plan that has always been used (3064471754) using SQL baselines.

It looks like this statement already has a few couple of baselined plans , neither are fixed.

sql handle: SQL_0e4f45d7832dd5ab
SQL_PLAN_0wmu5uy1kvpdb8c1d711e 10-NOV-17 10.38.07.000000000 MANUAL-LOAD
SQL_PLAN_0wmu5uy1kvpdb1eaf5716 29-SEP-17 23.16.31.000000000 AUTO-CAPTURE

Looking at the detailed view of the baseline shows that one of the plans is the bad plan and the other has no history of use in ASH:

select * 
 from table(dbms_xplan.display_sql_plan_baseline(sql_handle => 'SQL_0e4f45d7832dd5ab',
 format => 'basic')); --detailed view


PLAN_TABLE_OUTPUT 
----------------------------------------------------------------------------- 
SQL handle: SQL_0e4f45d7832dd5ab 
SQL text: MERGE INTO DWB_X_PLNGRM_ATTR_DAY T USING ( SELECT ILV.BSNS_UNIT_CD, 
ILV.SKU_ITEM_NBR, ILV.CNTNT_CD , ILV.SEL_UNITS_CNTNT_STATUS, 
ILV.PLNGRM_ITEM_GRP_CD, DWR.BAY_CNT_KEY, DWR.BAY_CNT FROM ( SELECT 
DISTINCT BSNS_UNIT_CD, SKU_ITEM_NBR, CNTNT_CD , SEL_UNITS_CNTNT_STATUS, 
PLNGRM_ITEM_GRP_CD, MAX(BAY_CNT_KEY) BAY_CNT_KEY FROM 
DWB_X_SEL_ITEM_UNITS_DAY WHERE (1=1) AND DAY_KEY >= :B1 GROUP BY 
BSNS_UNIT_CD, SKU_ITEM_NBR, CNTNT_CD, SEL_UNITS_CNTNT_STATUS, 
PLNGRM_ITEM_GRP_CD ) ILV, DWR_X_PLNGRM_LOC_BAY_CNT DWR WHERE 
ILV.BAY_CNT_KEY = DWR.BAY_CNT_KEY ) S ON ( T.CNTNT_CD = S.CNTNT_CD AND 
T.SKU_ITEM_NBR = S.SKU_ITEM_NBR AND T.BSNS_UNIT_CD = S.BSNS_UNIT_CD AND 
T.PLNGRM_CNTNT_STATUS = S.SEL_UNITS_CNTNT_STATUS AND 
T.PLNGRM_ITEM_GRP_CD = S.PLNGRM_ITEM_GRP_CD ) WHEN MATCHED THEN UPDATE 
SET T.BAY_CNT_KEY = S.BAY_CNT_KEY, T.BAY_CNT = S.BAY_CNT 
-------------------------------------------------------------------------------- 
 
-------------------------------------------------------------------------------- 
Plan name: SQL_PLAN_0wmu5uy1kvpdb1eaf5716 Plan id: 514807574 
Enabled: YES Fixed: NO Accepted: NO Origin: AUTO-CAPTURE 
-------------------------------------------------------------------------------- 
 
Plan hash value: 4185795484 /******BAD PLAN*******/ 
 
----------------------------------------------------------------------------- 
| Id | Operation | Name | 
----------------------------------------------------------------------------- 
| 0 | MERGE STATEMENT | | 
| 1 | MERGE | DWB_X_PLNGRM_ATTR_DAY | 
| 2 | PX COORDINATOR | | 
| 3 | PX SEND QC (RANDOM) | :TQ10005 | 
| 4 | VIEW | | 
| 5 | HASH JOIN BUFFERED | | 
| 6 | PX RECEIVE | | 
| 7 | PX SEND HASH | :TQ10003 | 
| 8 | PX BLOCK ITERATOR | | 
| 9 | TABLE ACCESS STORAGE FULL | DWR_X_PLNGRM_LOC_BAY_CNT | 
| 10 | PX RECEIVE | | 
| 11 | PX SEND HASH | :TQ10004 | 
| 12 | HASH JOIN BUFFERED | | 
| 13 | PX RECEIVE | | 
| 14 | PX SEND HASH | :TQ10001 | 
| 15 | VIEW | | 
| 16 | SORT GROUP BY | | 
| 17 | PX RECEIVE | | 
| 18 | PX SEND HASH | :TQ10000 | 
| 19 | PX BLOCK ITERATOR | | 
| 20 | TABLE ACCESS STORAGE FULL| DWB_X_SEL_ITEM_UNITS_DAY | 
| 21 | PX RECEIVE | | 
| 22 | PX SEND HASH | :TQ10002 | 
| 23 | PX BLOCK ITERATOR | | 
| 24 | TABLE ACCESS STORAGE FULL | DWB_X_PLNGRM_ATTR_DAY | 
----------------------------------------------------------------------------- 
 
-------------------------------------------------------------------------------- 
Plan name: SQL_PLAN_0wmu5uy1kvpdb8c1d711e Plan id: 2350739742 
Enabled: YES Fixed: NO Accepted: YES Origin: MANUAL-LOAD 
-------------------------------------------------------------------------------- 
 
Plan hash value: 1932271691 /**********no record of use in ASH*********/ 
 
----------------------------------------------------------------------------- 
| Id | Operation | Name | 
----------------------------------------------------------------------------- 
| 0 | MERGE STATEMENT | | 
| 1 | MERGE | DWB_X_PLNGRM_ATTR_DAY | 
| 2 | PX COORDINATOR | | 
| 3 | PX SEND QC (RANDOM) | :TQ10005 | 
| 4 | VIEW | | 
| 5 | HASH JOIN BUFFERED | | 
| 6 | PX RECEIVE | | 
| 7 | PX SEND HASH | :TQ10003 | 
| 8 | HASH JOIN BUFFERED | | 
| 9 | PX RECEIVE | | 
| 10 | PX SEND HASH | :TQ10001 | 
| 11 | PX BLOCK ITERATOR | | 
| 12 | TABLE ACCESS STORAGE FULL | DWR_X_PLNGRM_LOC_BAY_CNT | 
| 13 | PX RECEIVE | | 
| 14 | PX SEND HASH | :TQ10002 | 
| 15 | VIEW | | 
| 16 | HASH GROUP BY | | 
| 17 | PX RECEIVE | | 
| 18 | PX SEND HASH | :TQ10000 | 
| 19 | PX BLOCK ITERATOR | | 
| 20 | TABLE ACCESS STORAGE FULL| DWB_X_SEL_ITEM_UNITS_DAY | 
| 21 | PX RECEIVE | | 
| 22 | PX SEND HASH | :TQ10004 | 
| 23 | PX BLOCK ITERATOR | | 
| 24 | TABLE ACCESS STORAGE FULL | DWB_X_PLNGRM_ATTR_DAY | 
-----------------------------------------------------------------------------

Plan 1932271691 joins the table in the same order as 3064471754 and looks logically the same, so I think this is preferable to the bad plan (which is also in the baseline).
I’m not sure why the optimizer isn’t choosing plan 1932271691.

I created an SQL Tuning Set from a snapshot range containing yesterday’s good run and confirmed that the only plan in there was 3064471754.

I uploaded the STS to the SQL baseline, however it does not add a new plan, it still shows:

sql handle: SQL_0e4f45d7832dd5ab
SQL_PLAN_0wmu5uy1kvpdb8c1d711e
SQL_PLAN_0wmu5uy1kvpdb1eaf5716

The only difference is that the timestamp of SQL_PLAN_0wmu5uy1kvpdb8c1d711e is now current so it looks like it’s been updated.

When querying the baseline in detail I expect to see SQL_PLAN_0wmu5uy1kvpdb8c1d711e now contains 3064471754, however it doesn’t.
It still shows 1932271691, nothing has changed except the timestamp, it looks like we can’t fix the good plan that has been used every day for months.

I fixed the potentially good plan and flushed the bad plan from all instances, however upon retrying the MERGE the bad plan was still chosen.

As an experiment I ran an explain plan of the MERGE in SQL Developer replacing the bind variable with the literal that was being passed.
The explain plan returns the good plan (same as 3064471754).

As a workaround I ask the team to run the MERGE from SQL Developer with the literal (after confirming their explain plan also returns the good plan).

Upon monitoring the SQL Developer session it is confirmed that it is using the good plan with the exact same plan ID (3064471754) and completes in 1 hour 15 minutes.

As it is Friday afternoon we agree on this workaround for the weekend (support team is 24 hours / shifts).

The next day the batch job MERGE has picked a different bad plan.
The support team kill it and run the workaround.

The workaround then stops working well as it has picked the same bad plan as the batch job.
The SQL Developer MERGE runs for 5 hours before being killed.

The new bad plan is 1501257745.

1501257745 looks logically the same as the good plan (3064471754) and the potentially good plan in the baseline (1932271691) as it has the same join order.

However we now have a breakthrough – the new bad plan has now given us a new angle to look at.

1501257745 starts by spawning many parallel processes, however it soon drops off to one serial process.
This one serial process is performing the top-level MERGE operation which flatlines for hours before being killed.

I decide to retry fixing the original good plan.
I start with a clean slate and drop the SQL baseline and both plans contained within.
I create a new STS from a different snapshot period than before which only contains 3064471754.
I load the STS into the baseline and a new baseline and plan are created.

When querying the contents of the baseline it still just contains the potentially good plan from before (1932271691).

Oracle is now deliberately and consistently changing my plan in transit from 3064471754 to 1932271691.

With the new insight given by the new bad plan (1501257745) I now look at the top-level DML parts of the plans.

Logically the potentially good plan (1932271691) and the new bad plan (1501257745) are the same statement as 3064471754 but without the parallel MERGE.

Plan hash value: 1932271691 and 1501257745 
----------------------------------------------------------------------------- 
 | Id | Operation | Name | 
----------------------------------------------------------------------------- 
 | 0 | MERGE STATEMENT | | 
 | 1 | MERGE | DWB_X_PLNGRM_ATTR_DAY |


Plan hash value: 3064471754 
---------------------------------------------------------------------------------- 
| Id | Operation | Name | 
 ---------------------------------------------------------------------------------- 
| 0 | MERGE STATEMENT | | 
| 1 | PX COORDINATOR | | 
| 2 | PX SEND QC (RANDOM) | :TQ10007 | 
| 3 | INDEX MAINTENANCE | DWB_X_PLNGRM_ATTR_DAY | 
| 4 | PX RECEIVE | | 
| 5 | PX SEND RANGE | :TQ10006 | 
| 6 | MERGE | DWB_X_PLNGRM_ATTR_DAY |

Now we are concentrating on the parallelism of DML, I remember working at a site previously and having to set ENABLE PDML at the session level.

I ran an explain plan of the MERGE before and after issuing the following statement:

ALTER SESSION ENABLE PARALLEL DML;

Afterwards the plan changed to include parallel steps for the MERGE.

I told the support team to do this in their SQL Developer session and the MERGE began running with the usual good plan and even had the same plan id (3064471754).

Now that PDML is enabled in my session, I reloaded the same STS as before into the same baseline.

Once again the only high-level change was the changing of the timestamp on the plan, however when querying the detail of the plan it no longer contained 1932271691 but a new plan, 2197295520.

Plan hash value: 2197295520 
 
---------------------------------------------------------------------------------- 
| Id | Operation | Name | 
---------------------------------------------------------------------------------- 
| 0 | MERGE STATEMENT | | 
| 1 | PX COORDINATOR | | 
| 2 | PX SEND QC (RANDOM) | :TQ10007 | 
| 3 | INDEX MAINTENANCE | DWB_X_PLNGRM_ATTR_DAY | 
| 4 | PX RECEIVE | | 
| 5 | PX SEND RANGE | :TQ10006 | 
| 6 | MERGE | DWB_X_PLNGRM_ATTR_DAY | 
| 7 | PX RECEIVE | | 
| 8 | PX SEND HYBRID (ROWID PKEY) | :TQ10005 | 
| 9 | VIEW | | 
| 10 | HASH JOIN BUFFERED | | 
| 11 | PX RECEIVE | | 
| 12 | PX SEND HASH | :TQ10003 | 
| 13 | HASH JOIN BUFFERED | | 
| 14 | PX RECEIVE | | 
| 15 | PX SEND HASH | :TQ10001 | 
| 16 | PX BLOCK ITERATOR | | 
| 17 | TABLE ACCESS STORAGE FULL | DWR_X_PLNGRM_LOC_BAY_CNT | 
| 18 | PX RECEIVE | | 
| 19 | PX SEND HASH | :TQ10002 | 
| 20 | VIEW | | 
| 21 | HASH GROUP BY | | /****only difference from original good plan (which has SORT GROUP BY instead)****/ 
| 22 | PX RECEIVE | | 
| 23 | PX SEND HASH | :TQ10000 | 
| 24 | PX BLOCK ITERATOR | | 
| 25 | TABLE ACCESS STORAGE FULL| DWB_X_SEL_ITEM_UNITS_DAY | 
| 26 | PX RECEIVE | | 
| 27 | PX SEND HASH | :TQ10004 | 
| 28 | PX BLOCK ITERATOR | | 
| 29 | TABLE ACCESS STORAGE FULL | DWB_X_PLNGRM_ATTR_DAY | 
----------------------------------------------------------------------------------

The tables are joined in the correct order and the MERGE is parallel.
The only difference is line 21 which has HASH GROUP BY instead of SORT GROUP BY.

I fixed this plan hoping that the batch version of the MERGE will pick it up tomorrow.
It doesn’t, however it now runs with the original good plan every day (3064471754).

Whilst I’m still looking into this a key point to take away is that when using SQL baselines the environment of the session creating the baseline and the of the session accessing the baseline counts toward the eligibility of the plan.

Posted in Oracle | Tagged: , , | 3 Comments »

Managing the WINDOW_ID in Goldengate V11.2.1.0.33

Posted by John Hallas on August 16, 2017

When we import data into the landing schema on a Dat Warehouse via Goldengate, we add 3 fields to each record detailing when and how the record got loaded. This can be found in the *.inc files under $GG_HOME/dirinc on the target GG installation. An example of this is:
map xxx.DBA_GGCUTOVER_TEST, TARGET YYY.DBA_GGCUTOVER_TEST,  INSERTALLRECORDS, IGNOREDELETES
COLMAP (
USEDEFAULTS,
WINDOW_ID = @STRCAT(@GETENV(“RECORD”, “FILESEQNO”), @STRNUM(@GETENV(“RECORD”, “FILERBA”), RIGHTZERO, 10)),
OPER_TYPE = @GETENV (“GGHEADER”, “OPTYPE”),
CDC_LOAD_TIMESTAMP= @DATENOW());
map xxx.DBA_GGCUTOVER_TEST #EXCEPTION_HANDLER();
OPER_TYPE is Insert, update, delete etc
CDC_LOAD_TIMESTAMP is self explanatory
WINDOW_ID is more interesting, its made up of the number of the pump trail file that hold the record, and the RBA, padded to 10 digits.
So, if the record is in file $GG_HOME/dirdat/R1MOP101/rm077196, at RBA 78954, then the WINDOW_ID value in the staging table would be:
771960000078954
BI BATCH Issue due to above
We found out yesterday that the BI team (rightly or wrongly, discuss!!) use the WINDOW_ID field in each record on the landing schema’s data to check whether that record has been previously loaded by the batch:
  • For each table, there is a control record that stores the max WINDOW_ID value for that table.
  • The next time the batch is run, it looks at the control record, and then only loads data with a WINDOW_ID greater than the control value.
  • After the batch has completed, the control value is updated.
  • And so on each night.
BUT – what happens if this WINDOW_ID is reset ? The above logic fails!
After we had replatformed the source system, new data coming in from a fresh GG install, now had a low WINDOW_ID:
Old source WINDOW_ID in region of:
771960000078954
New source WINDOW_ID in region of:
   960000078954
To resolve this, we did the following:
  • Stopped the pump process.
  • Waited till replicat caught up, then stopped replicat.
  • Increased the GG Pump Sequence Number via a script like this (the below shows 2 increments of the sequence, we did 10,000’s of these, to get the value to exactly 90,000!:
ggsci <<EOF
alter extract P1MOP101 etrollover
alter extract P1MOP101 etrollover
EOF
 
** Note I didn’t know you could run ggsci commands via a shell script!
  • Started the pump process
  • Checked that a file was created on xxx called $GG_HME/dirdat/R1MOP101/rm090000
  • Altered the replicat to point to this new file with “alter replicat R1MOP101, extseqno 90000, extrba 0”, then started the replicat.
  • After this any records that came through, and had a new higher WINDOW_ID.
We still had an issue with data that had gone in between Sunday up to the time we stopped the pump above.
We knew the old data was no higher than:
799990000000000
We knew the new data was no higher than
  1990000000000
We also new the we didn’t want these records to be in the new 900000000000000 range.
So we added 800000000000000 to the new WINDOW_IDs so now the data loaded between Sunday morning and Monday evening would be in region of
800000000000000
to
801990000000000
 
…This WINDOW_ID would then be higher than the control table values, and therefore would get processed by the BI batch.

Posted in Goldengate | Tagged: , , , , , | Leave a Comment »

HP Systems Management vacancy

Posted by John Hallas on August 8, 2017

I know I do not have the right readership on this blog for a Systems Management vacancy but if any readers have colleagues who have experience using any of HP Openview, OpsBridge, HP Service Management and HP UCMDB then I have a vacancy at our Head Office in Bradford for a permanent position

Details below

https://apply.morrisons.jobs/vacancies/491/technology-specialist–system-management.html

Posted in Oracle | Leave a Comment »

Using DataGuard broker to show apply lag and throughput

Posted by John Hallas on June 20, 2017

To determine how much lag there is I normally run a script similar to this

select sequence#, applied, to_date(to_char(first_time,’DD-MON-YY:HH24:MI:SS’),
‘DD-MON-YY:HH24:MI:SS’) “First Time” ,
to_char(next_time,’DD-MON-YY:HH24:MI:SS’) “Next Time”
from v$archived_log
UNION
select NULL,database_role,NULL, db_unique_name from v$database
order by “First Time”;

However there is another way which I sometime use which actually gives a lot more information. This uses the dataguard broker command line. Use the show configuration parameter to determine database name if you are not sure

DGMGRL> show configuration

Configuration - DR

Protection Mode: MaxPerformance
 Databases:
 xxxxxx2a - Primary database
 xxxxxx2b - Physical standby database

Fast-Start Failover: DISABLED

Configuration Status:
SUCCESS

Show specific database details.

DGMGRL> show database "xxxxxx2b"

Database - xxxxxx2b

Role: PHYSICAL STANDBY
 Intended State: APPLY-ON
 Transport Lag: 4 minutes 24 seconds (computed 6 seconds ago)
 Apply Lag: 5 minutes 16 seconds (computed 0 seconds ago)
 Apply Rate: 191.08 MByte/s
 Real Time Query: ON
 Instance(s):
 xxxxxx2b1 (apply instance)
 xxxxxx2b2
 xxxxxx2b3
 xxxxxx2b4
 xxxxxx2b5
 xxxxxx2b6
 xxxxxx2b7
 xxxxxx2b8

Database Status:
SUCCESS

Lots of good information there including which node is hosting the MRP process and also the apply rate. In our case that is normally between 150 and 400 MB per second.

Posted in Oracle | Tagged: , , , , , , | Leave a Comment »

Downgrading a RAC database from 11.2.0.4 to 11.2.0.3

Posted by John Hallas on May 4, 2017

It is not often that I see a database downgrade activity performed and so I thought it would be worthwhile just noting how it was done.
 2 node RAC 11.2.0.4 database to 11.2.0.3, downgraded the database only and not the grid home.
Downgrade taking place on HP-UX, any downgrades taking place on Windows OS have several additional steps and won’t be covered in this post. 
This database does not use Database Vault and pre-requisite compatibility checks were carried out

Assume all commands are being run on node 1, any commands that need to be run on node 2 will be explicitly stated. Any commands will be formatted in italics.

Set ORACLE_HOME to current 11.2.0.4 environment.
export ORACLE_HOME= /app/oracle/product/11.2.0.4/dbhome_SOA1
 
Tail alert log of both nodes in separate windows
tail -f /app/oracle/diag/rdbms/soapre2a/SOAPRE2A1/trace/alert_SOAPRE2A1.log
 
1. Stop database using srvctl on primary node
 
srvctl stop database -d SOAPRE2A
 
Monitor the alert logs to confirm when database has successfully shutdown.
2. Create pfile from spfile
 
sqlplus / as sysdba
SQL>create pfile=’/home/oracle/SOAPRE2_downgrade/SOAPRE2_clusterdisable.ora’ from spfile=’+DATA/SOAPRE2A/spfilesoapre2a.ora’;
 
3. Alter pfile CLUSTER_DATABASE parameter to FALSE
*.cluster_database=FALSE
*.compatible=’11.2.0.0.0′
 
4. Recreate spfile with new parameter
 
sqlplus / as sysdba
SQL>create spfile=’+DATA/SOAPRE2A/spfilesoapre2a.ora’ from pfile=’/home/oracle/SOAPRE2_downgrade/SOAPRE2_clusterdisable.ora’;
 
5. Startup database in downgrade mode using new spfile
 
cd $ORACLE_HOME/rdbms/admin
sqlplus / as sysdba
SQL>startup downgrade
 
6. Execute Oracle downgrade script
From the original OH –  11.2.0.4
SQL>spool /home/oracle/SOAPRE2_downgrade/downgrade.log
SQL>@catdwg.sql
SQL>spool off
SQL>shutdown immediate;
SQL>exit
 
This script can be run multiple times, in the event any errors are encountered correct them and rerun until completion.
7. Change environment variables and restore config files
 
Execute these steps on both nodes.
 
Alter ORACLE_HOME and PATH environment variable to point to downgraded directories, in our case for example:
export ORACLE_HOME=’/app/oracle/product/11.2.0.3/dbhome_SOA1_1
 
Ensure any entries in your oratab file are also altered to reference the downgraded directory.
Copy password files and config files from current ORACLE_HOME to downgraded directory.
8. Reload version specific components
 
change to downgraded release home  – 11.2.0.3
cd /app/oracle/product/11.2.0.3/dbhome_SOA1_1/rdbms/admin
SQL> sqlplus / as sysdba
SQL> startup upgrade
SQL>spool /home/oracle/SOAPRE2_donwgrade/reload.log
SQL>@catrelod.sql
SQL>spool off
This step can take quite some time to complete, in our case ~2.5 hours
9. Recompile invalid objects
 
SQL> shutdown immediate
SQL> startup
SQL> @utlprp.sql
SQL> exit
 
10. Downgrade cluster services
 
The final step was to downgrade cluster services to our old ORACLE_HOME and version, using the following srvctl command:
srvctl downgrade database -d db-unique-name -o old_ORACLE_HOME t to_old_versnum
 
in our case this was the following:
 
srvctl downgrade database -d SOAPRE2A -o /app/oracle/product/11.2.0.3/dbhome_SOA1_1 -t 11.2.0.3

Posted in 11g new features, Oracle | Tagged: , , | 3 Comments »

What is the future for an Oracle DBA?

Posted by John Hallas on April 10, 2017

I have worked with Oracle databases for around 25 years now and during that time I have been very fortunate in that there has always been work for DBAs and it has been one of the higher paying disciplines within IT.

I am not prophesying the end of the Oracle database engine but I do see the writing on the wall for some of the large corporate solutions sitting on physical equipment in a datacentre. I also have to criticise Oracle for their business practises which I know are seeing customers move away to other solutions.

Without any doubt there is pressure on those who wish to perform a purely Oracle DBA role. The growing use of Cloud does reduce the opportunities and whilst databases always need to be built the techniques used in the Cloud undoubtedly speed up that process and effectively de-skill it. The rise of SAAS style applications where the on-site DBA no longer performs upgrades, patching and similar work also reduces the requirement.

In conjunction with that there is a threat from the more established players in the market. I manage database teams that support a variety of databases and a few years ago I undoubtedly had the view that Oracle was good for large databases (I might have considered 1Tb to be the dividing line between large and medium) and SQL Server was suitable for smaller ones. I am aware that is a very basic dividing line and does not take into account functionality and advanced database requirements. I do not have that view in the slightest now and consider that Oracle is too expensive and does not offer value for money. SQL Server is much higher in my focus and we now include MySQL (which I know is owned by Oracle) and also PostGres and DynamoDB.

I referred to business practises as being a reason not to use Oracle. I am specifically referring to the change in licensing for databases in the Cloud but not in the Oracle Cloud. See this article by Tim Hall for more detail.  The comments also support the theme of this blog – that there are many more alternatives to Oracle these days.

If I was starting out now I think I would be trying to go down the Data Architect road and also grabbing myself a good overview of the benefits and risks of the various types of database solutions that are now available. That skill set would also assist in becoming an infrastructure architect.

Saying all of the above – in my view there is nothing more satisfying than taking a query and improving its performance, no matter what the underlying database technology is.

Posted in Oracle | 17 Comments »