Oracle DBA – A lifelong learning experience

HP Systems Management vacancy

Posted by John Hallas on August 8, 2017

I know I do not have the right readership on this blog for a Systems Management vacancy but if any readers have colleagues who have experience using any of HP Openview, OpsBridge, HP Service Management and HP UCMDB then I have a vacancy at our Head Office in Bradford for a permanent position

Details below–system-management.html


Posted in Oracle | Leave a Comment »

Using DataGuard broker to show apply lag and throughput

Posted by John Hallas on June 20, 2017

To determine how much lag there is I normally run a script similar to this

select sequence#, applied, to_date(to_char(first_time,’DD-MON-YY:HH24:MI:SS’),
‘DD-MON-YY:HH24:MI:SS’) “First Time” ,
to_char(next_time,’DD-MON-YY:HH24:MI:SS’) “Next Time”
from v$archived_log
select NULL,database_role,NULL, db_unique_name from v$database
order by “First Time”;

However there is another way which I sometime use which actually gives a lot more information. This uses the dataguard broker command line. Use the show configuration parameter to determine database name if you are not sure

DGMGRL> show configuration

Configuration - DR

Protection Mode: MaxPerformance
 xxxxxx2a - Primary database
 xxxxxx2b - Physical standby database

Fast-Start Failover: DISABLED

Configuration Status:

Show specific database details.

DGMGRL> show database "xxxxxx2b"

Database - xxxxxx2b

 Intended State: APPLY-ON
 Transport Lag: 4 minutes 24 seconds (computed 6 seconds ago)
 Apply Lag: 5 minutes 16 seconds (computed 0 seconds ago)
 Apply Rate: 191.08 MByte/s
 Real Time Query: ON
 xxxxxx2b1 (apply instance)

Database Status:

Lots of good information there including which node is hosting the MRP process and also the apply rate. In our case that is normally between 150 and 400 MB per second.

Posted in Oracle | Tagged: , , , , , , | Leave a Comment »

Downgrading a RAC database from to

Posted by John Hallas on May 4, 2017

It is not often that I see a database downgrade activity performed and so I thought it would be worthwhile just noting how it was done.
 2 node RAC database to, downgraded the database only and not the grid home.
Downgrade taking place on HP-UX, any downgrades taking place on Windows OS have several additional steps and won’t be covered in this post. 
This database does not use Database Vault and pre-requisite compatibility checks were carried out

Assume all commands are being run on node 1, any commands that need to be run on node 2 will be explicitly stated. Any commands will be formatted in italics.

Set ORACLE_HOME to current environment.
export ORACLE_HOME= /app/oracle/product/
Tail alert log of both nodes in separate windows
tail -f /app/oracle/diag/rdbms/soapre2a/SOAPRE2A1/trace/alert_SOAPRE2A1.log
1. Stop database using srvctl on primary node
srvctl stop database -d SOAPRE2A
Monitor the alert logs to confirm when database has successfully shutdown.
2. Create pfile from spfile
sqlplus / as sysdba
SQL>create pfile=’/home/oracle/SOAPRE2_downgrade/SOAPRE2_clusterdisable.ora’ from spfile=’+DATA/SOAPRE2A/spfilesoapre2a.ora’;
3. Alter pfile CLUSTER_DATABASE parameter to FALSE
4. Recreate spfile with new parameter
sqlplus / as sysdba
SQL>create spfile=’+DATA/SOAPRE2A/spfilesoapre2a.ora’ from pfile=’/home/oracle/SOAPRE2_downgrade/SOAPRE2_clusterdisable.ora’;
5. Startup database in downgrade mode using new spfile
cd $ORACLE_HOME/rdbms/admin
sqlplus / as sysdba
SQL>startup downgrade
6. Execute Oracle downgrade script
From the original OH –
SQL>spool /home/oracle/SOAPRE2_downgrade/downgrade.log
SQL>spool off
SQL>shutdown immediate;
This script can be run multiple times, in the event any errors are encountered correct them and rerun until completion.
7. Change environment variables and restore config files
Execute these steps on both nodes.
Alter ORACLE_HOME and PATH environment variable to point to downgraded directories, in our case for example:
export ORACLE_HOME=’/app/oracle/product/
Ensure any entries in your oratab file are also altered to reference the downgraded directory.
Copy password files and config files from current ORACLE_HOME to downgraded directory.
8. Reload version specific components
change to downgraded release home  –
cd /app/oracle/product/
SQL> sqlplus / as sysdba
SQL> startup upgrade
SQL>spool /home/oracle/SOAPRE2_donwgrade/reload.log
SQL>spool off
This step can take quite some time to complete, in our case ~2.5 hours
9. Recompile invalid objects
SQL> shutdown immediate
SQL> startup
SQL> @utlprp.sql
SQL> exit
10. Downgrade cluster services
The final step was to downgrade cluster services to our old ORACLE_HOME and version, using the following srvctl command:
srvctl downgrade database -d db-unique-name -o old_ORACLE_HOME t to_old_versnum
in our case this was the following:
srvctl downgrade database -d SOAPRE2A -o /app/oracle/product/ -t

Posted in 11g new features, Oracle | Tagged: , , | 3 Comments »

What is the future for an Oracle DBA?

Posted by John Hallas on April 10, 2017

I have worked with Oracle databases for around 25 years now and during that time I have been very fortunate in that there has always been work for DBAs and it has been one of the higher paying disciplines within IT.

I am not prophesying the end of the Oracle database engine but I do see the writing on the wall for some of the large corporate solutions sitting on physical equipment in a datacentre. I also have to criticise Oracle for their business practises which I know are seeing customers move away to other solutions.

Without any doubt there is pressure on those who wish to perform a purely Oracle DBA role. The growing use of Cloud does reduce the opportunities and whilst databases always need to be built the techniques used in the Cloud undoubtedly speed up that process and effectively de-skill it. The rise of SAAS style applications where the on-site DBA no longer performs upgrades, patching and similar work also reduces the requirement.

In conjunction with that there is a threat from the more established players in the market. I manage database teams that support a variety of databases and a few years ago I undoubtedly had the view that Oracle was good for large databases (I might have considered 1Tb to be the dividing line between large and medium) and SQL Server was suitable for smaller ones. I am aware that is a very basic dividing line and does not take into account functionality and advanced database requirements. I do not have that view in the slightest now and consider that Oracle is too expensive and does not offer value for money. SQL Server is much higher in my focus and we now include MySQL (which I know is owned by Oracle) and also PostGres and DynamoDB.

I referred to business practises as being a reason not to use Oracle. I am specifically referring to the change in licensing for databases in the Cloud but not in the Oracle Cloud. See this article by Tim Hall for more detail.  The comments also support the theme of this blog – that there are many more alternatives to Oracle these days.

If I was starting out now I think I would be trying to go down the Data Architect road and also grabbing myself a good overview of the benefits and risks of the various types of database solutions that are now available. That skill set would also assist in becoming an infrastructure architect.

Saying all of the above – in my view there is nothing more satisfying than taking a query and improving its performance, no matter what the underlying database technology is.

Posted in Oracle | 17 Comments »

GoldenGate – Restarting a replicat with the command filterduptransactions

Posted by John Hallas on April 4, 2017

If a Goldengate replicat process fails then occasionally on the restart it skips the correct RBA and ‘loses it’s position’. The relative byte address (RBA) is the location within the trail file to indicate the current transaction.

The old school method was to calculate which RBA was the correct one and then restart the replicat. However there is a new command on the block now (pun intended) and I will demonstrate how the two methods can be used to restart the trail file correctly

Today, we saw the following in the GG log file:
PS sorry if the format is a bit off. I normally spend me as much time formatting this blog as it I do writing it. However in this case much of the work was done by Alex Priestley – a fellow DBA

Read the rest of this entry »

Posted in Goldengate, Oracle | Tagged: , , , , | 2 Comments »

Problem with V$RECOVERY_AREA_USAGE view and FRA space not being reclaimed

Posted by John Hallas on March 16, 2017

We received the following alert from our alerting system
Flash Recovery Area for FOLPRD1A has 9.97 percent of usable space left.
This is a standby database:
HOST       INSTANCE   STATUS     VERSION      STARTED                   UPTIME
———- ———- ———- ———— ————————- ————————————————–
xxxxxxxx   INSTANCE   MOUNTED   18-JAN-2017 18:58:18      33 days(s) 13 hour(s) 4 minute(s) 1 seconds
———– ———– —————— ———- ——————– ——————–
Usually, when under space pressure the standby will delete archivelogs and flashback logs that it no longer needs so this alert isn’t normal for a standby.  However, in this scenario, none of the space is reclaimable.  Therefore, without intervention the FRA would eventually hit 100% and stop logs being transported to the standby.
—————————————- ———– ———- —————– —————
+FRA                                             400     379.48                 0           11268
——————– ———- ————– —————— ————————- —————
CONTROL FILE                .12              0                .03                         0               1
REDO LOG                      0              0                  0                         0               0
ARCHIVED LOG             326.12              0              81.53                         0           11068
BACKUP PIECE                .12              0                .03                         0               1
IMAGE COPY                    0              0                  0                         0               0
FLASHBACK LOG             53.08              0              13.27                         0             197
FOREIGN ARCHIVED LOG          0              0                  0                         0               0
I checked the RMAN configuration suspecting it hadn’t been changed post switchover:
Rather than being “NONE” it did look right but I did notice it was ‘ALL’.  I thought we usually had it as follows:
I assumed this wouldn’t make a difference as we only have one standby and all the logs had been applied.  However, I changed it anyway.  Straight away this had the desired effect and practically all the space became reclaimable.
—————————————- ———– ———- —————– —————
+FRA                                             400     379.48            325.92           11268
——————– ———- ————– —————— ————————- —————
CONTROL FILE                .12              0                .03                         0               1
REDO LOG                      0              0                  0                         0               0
ARCHIVED LOG             326.12         325.88              81.53                     81.47           11068
BACKUP PIECE                .12              0                .03                         0               1
IMAGE COPY                    0              0                  0                         0               0
FLASHBACK LOG             53.08              0              13.27                         0             197
FOREIGN ARCHIVED LOG          0              0                  0                         0               0
The alert subsequently cleared.
Looking at the report “Archivelog Deletion Policy – Core Production Databases” we have many databases configured with the ALL parameter.  I checked another at random and it was fine.  I suspected maybe it was the action of changing the parameter rather than the parameter being wrong and thought maybe dbms_backup_restore.refreshagedfiles would have done the job.
After speaking to a colleague he said this alert came out weeks ago and the trick of lowering the db_recovery_file_dest_size to force the database to be under space pressure had cleared the old logs and the alert.  Therefore, the fact that this worked suggests that the space was always reclaimable, just not shown in view which the alert uses.  I found a nice blog that shows the same issue and alludes to a bug.
“V$RECOVERY_AREA_USAGE is an aggregate view. If we check its definition, we see that the reclaimable size comes from x$kccagf.rectype.”  It directs you to a bug (for this version) that describes that the standby “does not refresh reclaimable space automatically”.
The workaround is to run exec dbms_backup_restore.refreshagedfiles; 
The blog also claims…..”but I’ve found another workaround: just run the CONFIGURE ARCHIVELOG POLICY from RMAN as it refreshes the reclaimable flag – even when there is no change.”
This is effectively what I did.  Therefore, I’ve put the original parameter back and switched logs on the primary numerous times and the reclaimable space is being updated.  For now we shall keep an eye on this as it’s not an issue anywhere else.

Posted in Oracle | Tagged: , , , , | 1 Comment »

Xmas day -150 hits. What is wrong with the world

Posted by John Hallas on December 30, 2016

Yes, very tongue in cheek. I know everyone does not celebrate Xmas.

I was still surprised though. This is what was viewed

Best wishes for 2017 to all my readers.


Posted in Oracle | Leave a Comment »

RMAN checksyntax function

Posted by John Hallas on December 29, 2016

I was looking at the RMAN DEBUG options and came across the CHECKSYNTAX function which I had not used before.

Firstly a quick recap on the DEBUG option.

This can be called using the following syntax

rman target / catalog rman12g1/xxxx@rmancat debug trace=rmantrace.log cmdfile=backup.rcv


rman target / catalog rman12g1/xxxx@rmancat debug trace=rmantrace.log then run RMAN> @backup.rcv (or just type in your run block of commands)

There are a number of options to DEBUG and one of the error messages lists them out quite neatly

RMAN-01009: syntax error: found "integer": expecting one of: "all, duplicate, recover, restore, resync, sql"

To be honest if I was tracing I would just stick with the DEBUG=ALL format. The DEBUG=SQL gives all the internal commands that RMAN calls and could be interesting if you were doing a deep dive into RMAN functionality

Anyway, back to the CHECKSYNTAX option.

I run it against the edited version of the command file used above

rman target / catalog rman12g1/xxxx@rmancat checksyntax cmdfile=backup.rcv 

Recovery Manager: Release - Production on Wed Dec 28 10:22:20 2016
Copyright (c) 1982, 2014, Oracle and/or its affiliates. All rights reserved.
connected to target database: T12TEST (DBID=1543168240)
connected to recovery catalog database
RMAN> run {
2> sql "alter session set nls_date_format=''YYYY-MM-DD:HH24:MI:SS''";
3> allocate channel c1 device type disk format '/app/oracle/backup/backup_db_%d_S_%s_P_%p_T_%t';
4> allocate channel c2 device type disk format '/app/oracle/backup/backup_db_%d_S_%s_P_%p_T_%t';
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-00558: error encountered while parsing input commands
RMAN-01009: syntax error: found "identifier": expecting one of: "controlfile"
RMAN-01008: the bad identifier was: CONTROLFILEs
RMAN-01007: at line 6 column 33 file: backup.rcv

Note the file backup.rcv has a blank line as the first line which confuses the line numbering

Pretty neat. I edit the file and put a different error in, a much more common missing semi-colon

RMAN> run {
2> sql "alter session set nls_date_format=''YYYY-MM-DD:HH24:MI:SS''";
3> allocate channel c1 device type disk format '/app/oracle/backup/backup_db_%d_S_%s_P_%p_T_%t';
4> allocate channel c2 device type disk format '/app/oracle/backup/backup_db_%d_S_%s_P_%p_T_%t'
5> backup
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-00558: error encountered while parsing input commands
RMAN-01009: syntax error: found "backup": expecting one of: "auxiliary, connect, format, maxopenfiles, maxpiecesize, parms, rate, send, to, comma, ;"
RMAN-01007: at line 6 column 1 file: backup.rcv

Overall not a mind-shatteringly exciting find but something that might be of use one day

Posted in Oracle, RMAN | Tagged: , | 6 Comments »

Performance problems with OEM AWR warehouse

Posted by John Hallas on December 20, 2016

The Enterprise Manager AWR Warehouse is designed to hold performance data from multiple databases  for long-term analysis. It promoses that it will save storage and improve performance on your production systems. In that it is indeed correct. However the warehouse itself does not seem to be performant when taking in multiple sources and retaining them long-term – 400 days in our case. Why 400 days is an obvious question that might be asked. Primarily because we are a Retail organisation and Easter is variable each year.


The AWR repository database is performing poorly during the insert stage of the upload process.
Just to quickly summarise the process:
  • A dmp file is extracted on the source database and transferred across to the AWR server
  • The dmp file is then imported into a temporary schema called AWR$XXXXXX (this loads quickly)
  • This data is then inserted into the main AWR tables inside the SYS schema. Is is this stage that is slow.

In order to completely isolate the issue, we altered a parameter, so only one AWR file gets loaded at once, cutting any contention / locking issues out of the equation:

Read the rest of this entry »

Posted in 12c new features, Oracle | Tagged: , , , | 4 Comments »

Reorganising data within tablespaces

Posted by John Hallas on December 19, 2016

I am currently interested in managing storage on our Exadata platform which is primarily used for running a data warehouse.

The ongoing challenge is that most of the large tablespaces use a bigfile datafile and we insert data using append. That causes data gaps within the datafile and it is constantly growing. A recent example showed a 7Tb datafile containing only 2Tb of data and yet the ASM diskgroups were reporting we were running out of space.

What I intend to share here is how we are organising object to reclaim some of that space and I will include the scripts I am using. I am fully aware of the DBMS_REDEFINITION package and also the capabilities from within Oracle Enterprise Manager but I decided to stick to a method where it was all under my control and I knew exactly where I was. Daily batch on this system can run for up to 24 hours and is normally between 16 and 20 hours long so I had to be aware of when certain tablespaces were not being used.

I have been working through a tablespace at a time so that is the process I will demonstrate – using a tablespace DW_BIA_RTL_TBS_BKP as an example

From a standard script of which there are many versions around on the net to show tablespace usage I select the tablespace I am interested in Read the rest of this entry »

Posted in Oracle | Tagged: , , , , | Leave a Comment »