Oracle DBA – A lifelong learning experience

Archive for the ‘RMAN’ Category

Flashback logging at tablespace level

Posted by John Hallas on June 16, 2010

Our site policy is to enable flashback logging on our production databases unless there is a good reason not to, two examples are the data warehouse where volume is prohibitive and the other is a highly active Peoplesoft database where performance is a primary consideration and we suffer from “flashback buf free by RVWR" wait event as it cannot write the flashback logs quickly enough. 

On the same system we had experimented with disabling flashback against two tablespaces which contained objects of a transient nature and that could be rebuilt if necessary. We finally agreed that the overhead of flashback was too heavy balanced aganst the likelyhood of us ever using it and so we disabled it at the database level.

However, when we are applying application code changes during an agreed outage we bounce the database and restart it with flashback enabled. Therefore when the changes have been applied and basic testing has taken place we can quickly flashback the database to a clean point with a minimum of fuss in the event that the changes have not produced the desired results. If the bundle is OK we disable flashback logging dynamically without a further database bounce. We find this an excellent belt and braces method which has little overhead or manageability cost.

During a recent bundle release we did want to flashback but had some problems Read the rest of this entry »

Posted in 11g new features, Oracle, RMAN | Tagged: , , , | Leave a Comment »

Cataloging RMAN backups

Posted by John Hallas on April 15, 2010

We had a situation where we had backed up several databases and then would restore them as part of performance testing work. The catalog entries in the controlfile were lost either because they had become obsolete or we had restored to another point prior to the backup being taken.

 The backups were on still on disk and therefore they could be re-introduced to the controlfile. 

Read the rest of this entry »

Posted in Oracle, RMAN | Tagged: , , , , | Leave a Comment »

Identifying corrupt blocks

Posted by John Hallas on April 14, 2010

Recently we had a datafile with a number of corrupt blocks and once it was fixed I resolved to try out all the methods available of identifying block corruption. Searching around I came across an excellent blog by   Asif Momen – Practising block recovery which provided me pretty much what I wanted. However I did think it worth adding a blog entry as I wanted to document all the various methods available to identify corruption.  I also wanted to cover how to test causing a corruption when using ASM datafiles.

Create a datafile, add data and find out which block that data is in

 create smallfile tablespace bad_data datafile '+DATA' size 10M;

Create table test (username varchar2(9), password varchar2(6)) tablespace bad_data;

DECLARE

u  VARCHAR2(9);

p  VARCHAR2(6);

BEGIN

FOR jump  IN 1 ..10000  LOOP

u := 'TEST'||jump;

p := 'P'||jump;

insert into test values (u,p);

END LOOP;

commit;

END;

/

PL/SQL procedure successfully completed.

select

rowid ,

to_number(utl_encode.base64_decode(utl_raw.cast_to_raw(lpad(substr(rowid, 1, 6), 8, 'A'))), 'XXXXXXXXXXXX') as objid,

to_number(utl_encode.base64_decode(utl_raw.cast_to_raw(lpad(substr(rowid, 7, 3), 4, 'A'))), 'XXXXXX') as filenum,

to_number(utl_encode.base64_decode(utl_raw.cast_to_raw(lpad(substr(rowid, 10, 6), 8, 'A'))), 'XXXXXXXXXXXX') as blocknum,

to_number(utl_encode.base64_decode(utl_raw.cast_to_raw(lpad(substr(rowid, 16, 3), 4, 'A'))), 'XXXXXX') as rowslot

from test where password='P7777'

ROWID                   OBJID    FILENUM   BLOCKNUM    ROWSLOT

—————— ———- ———- ———- ———-

AAAXaiAAHAAAACdAAr      95906          7        156         43 Read the rest of this entry »

Posted in Oracle, RMAN | Tagged: , , , , | 3 Comments »

Reducing RMAN backup time for unevenly sized tablespaces

Posted by John Hallas on April 6, 2010

The following note was written by Kevin Cardew after we had discussed how to reduce the backup time of some of our large databases, especially when they constrained by most of the data being stored in a single bigfile tablespace .

Our build standards require that 11g (11.1.0.7) databases are built using BIGFILE tablespaces, however this has led to one or two problems with RMAN backups. In several of the databases the deployed application, third party supplied, has all the objects deployed to a single tablespace. The result of this is that an increasing amount of the database size resides in a single data file. During the backup RMAN balances the load by attempting to divide the backup volume (data files) over the available channels to maximise backup speed. Given the unbalanced nature of the data files, this results in a single channel being allocated to the largest data file and the rest load balanced across the remaining channels. Therefore the overall backup time is determined by the time taken to backup a single data file. Read the rest of this entry »

Posted in Oracle, RMAN | Tagged: , , , , , , | 2 Comments »

Recovering from a cold backup using RMAN

Posted by John Hallas on March 22, 2010

These are  basic notes that I am using when taking copies of databases that are being used for performance testing work and on which we will be performing regular regressions. I prefer writing  to disk if we have sufficient space as no time is lost in waiting for tape drive availability and  also know that I don’t have to worry about being past the retention data and the backup being deleted. There are 4 databases involved in this test and I prefer to close them all down, startup mount, backup to disk and then make them all available again.

It is a good habit to create the recovery script at the same time as the backup is created and leave a copy in the same directory as the RMAN backup pieces as it aids recovery. Both the channel command and the use of the TAG command both help in this.

Technically this is not a cold backup as the database is open, although not in use, which is why a resetlogs is required at the end. A proper cold backup is to shut the database down and take image copies of the database files and replace them as part of a restore. Read the rest of this entry »

Posted in Oracle, RMAN | Tagged: , , , | 3 Comments »

Recovering old spfile from autobackup

Posted by John Hallas on January 18, 2010

This post is about a problem I had today when trying to restore a spfile from a previous autobackup.  I did not want the latest one but one from earlier in the week. The database was a RAC one but I don’t think that is relevant to the problems I had.

Recovering a spfile from autobackup is relatively easy – startup mount and recover spfile from autobackup.  I set my date format and started the database up in mount mode and then tried to recover the spfile. Read the rest of this entry »

Posted in ASM, Oracle, RMAN | Tagged: , , , , , , , | 2 Comments »

Cloning a database (ASM to ASM)

Posted by John Hallas on April 2, 2009

Cloning production ASM (RAC)  to ASM (non RAC)

In this document I will be cloning database SOURCEDB (11.1.0.6 RAC) to TARGETDB (11.1.0.6 single instance). There will be no downtime on the source database and minimal impact.

Source server source_server  150Gb data ASM storage used

Target server target_server

Pre-requirements -ensure that you have backed up the source database and are comfortable that you are covered in case anything goes wrong

200Gb of disk space

NFS mount between the 2 servers. In this case I created a directory /app/oradata/rman_backup_dir and arranged for a NFS mount to the same filesystem on target_server

source_server:/app/oradata/rman_backup_dir

403701760 42657229 338479316   11% /app/oradata/rman_backup_dir

The steps we will follow will be :-

  • Prepare database on target server
  • Backup source database
  • Run duplicate database scripts

Note 1

RMAN uses a terminology of target as being the source database and auxiliary as the duplicated database. In this document I am using the terms source and target for the two databases. However the RMAN script will use target and auxiliary instead out of necessity.

Note 2

Initially I wanted to use active duplicate database however whilst it did work once it failed many more times and there is an Oracle bug Bug 6603587 – CREATION OF SECOND STANDBY USING RMAN DUPLICATE FAILS WITH RMAN-20242

I then tried using non-active database and with a database backup. This time I hit Bug.5653255: PHSB: RMAN CAN\’T RESTORE DATAFILES CORRECTLY ON ASM USING IMAGE COPIES. The workaround to this was to use a backupset..

Prepare target database

I am assuming that we have an existing database built on the target server and we are just performing a refresh from production. However for completeness this section shows how to prepare the spfile and startup nomount the target database.

If this process is a refresh rather than new build then go to here

Create a tnsnames.ora entry on the source server

Note that we are using port 1524 whereas the source database uses port 1522

TARGETDB_CLONE  =

(DESCRIPTION =

(ADDRESS_LIST=

(ADDRESS = (PROTOCOL = TCP)(HOST = target_server)(PORT = 1524))

)

(CONNECT_DATA =

(SERVER = DEDICATED)

(SERVICE_NAME = TARGETDB)

)

)

Create a listener on the target server

Note that we are using port 1524 whereas the source database uses port 1522

LISTENERCLONE =

(DESCRIPTION_LIST =

(DESCRIPTION =

(ADDRESS = (PROTOCOL = TCP)(HOST = target_server)(PORT = 1524))

)

)

SID_LIST_LISTENERCLONE =

(SID_LIST =

(SID_DESC =

(GLOBAL_DBNAME = TARGETDB)

(ORACLE_HOME = /app/oracle/product/11.1.0/db_1/)

(SID_NAME = TARGETDB)

)

)

Start the listener

lsnrctl start listenerclone

On source_server

create pfile=\’/home/oracle/initTARGETDB.ora\’ from spfile;

Create pfile – target_server. Edit to remove references to RAC (threads etc). As I am moving from DATA/FRA diskgroup to DATA/FRA diskgroup I have no need to use any covert file_name parameters. Edit the spfile to replace the source SID name with the new SID

Edit /etc/oratab to add database entry

Copy the pfile from source_server

cd $ORACLE_HOME/dbs

scp oracle@source_server:/home/oracle/initTARGETDB.ora .

Create audit dest on targetserver

mkdir -p /app/oracle/admin/TARGETDB/adump

Copy passwordfile from source_server

scp oracle@source_server:/app/oracle/product/11.1.0/db_1/orapwSOURCEDB /app/oracle/product/11.1.0/db_1/orapwTARGETDB

Startup standby database from targetserver

sqlplus  / as sysdba

create spfile from pfile;

startup nomount

exit

Start the clone listener on target_server.

This is required as you cannot connect to a database that is unavailable with the asm listener

lsnrctl start listenerclone

From the primary database server source_server test the connection to the target database using the command:

sqlplus sys/password@TARGETDB_CLONE as sysdba

Prepare refreshed database

sqlplus  / as sysdba

shutdown immediate;

startup nomount;

exit

Backup source database

rm /app/oradata/rman_backup_dir/*

export NLS_DATE_FORMAT=\’YYYY-MM-DD:HH24:MI:SS\’

rman target /

run

{

allocate channel d1 device type disk format \’/app/oradata/rman_backup_dir/%U\’;

allocate channel d2 device type disk format \’/app/oradata/rman_backup_dir/%U\’;

allocate channel d3 device type disk format \’/app/oradata/rman_backup_dir/%U\’;

allocate channel d4 device type disk format \’/app/oradata/rman_backup_dir/%U\’;

allocate channel d5 device type disk format \’/app/oradata/rman_backup_dir/%U\’;

backup as backupset database plus archivelog;

}

This takes around 10 minutes. Make a note of the finish time (which is why the NLS_DATE_FORMAT variable was set)

Duplicate database

2 scripts have been created in /home/oracle on pebor02 which are pasted here

rman_duplicate_from_backup.sh

#

# Set the Oracle Environment

#

export ORACLE_HOME= /app/oracle/product/11.1.0/db_1/

export ORACLE_SID=SOURCEDB

export PATH=$ORACLE_HOME/bin:$PATH

export NLS_LANG=american

export NLS_DATE_FORMAT=\’YYYY-MM-DD:HH24:MI:SS\’

echo $ORACLE_HOME

echo $ORACLE_SID

#

# Run the RMAN Command

#

nohup  $ORACLE_HOME/bin/rman target sys/password nocatalog auxiliary sys/password@TARGETDB_clone @rman_duplicate_from_backup.cmd log=/home/oracle/rman_duplicate_from_backup.log  &

rman_duplicate_from_backup.cmd

run

{

set until scn 9660022443946;

allocate auxiliary channel d1 device type disk;

allocate auxiliary channel d2 device type disk;

allocate auxiliary channel d3 device type disk;

allocate auxiliary channel d4 device type disk;

allocate auxiliary channel d5 device type disk;

duplicate target database to TARGETDB;

}

Run the shell script  ./ rman_duplicate_from_backup.sh then it will fail with an invalid SCN, however  the output will provide the latest SCN that can be recovered until. Edit the rman_duplicate_from_backup.cmd file and replace the SCN and rerun.

You can get the SCN from the timestamp by this way you are letting Oracle do the work for you..

Restart the shell script and monitor the log file specified in the shell script.

Tidy up

Stop the listener_clone and start a listener for the newly refreshed database.

Remove the files from /app/oradata/rman_backup_dir

Notify the Apps team that the database has been refreshed and the time that the backup finished.

Posted in ASM, Oracle, RMAN | Tagged: , , , , , , | 6 Comments »

ORA-0600 1880 – archivelog compression

Posted by John Hallas on March 5, 2009

We had a problem this morning on a 10.2.0.3 database which runs our OMS repository. Enterprise manager access was hanging and a connect to the database took ages. The filesystem holding the bdump log files was full so I tidied that up but the database still had problems so I stopped OMS with opmnctl stopall, shutdown the database and then restarted it. The database failed to open with

SQL>startup

ORACLE instance started.

Total System Global Area 788529152 bytes

Fixed Size 2076176 bytes

Variable Size 243270128 bytes

Database Buffers 536870912 bytes

Redo Buffers 6311936 bytes

Database mounted.

ORA-00600: internal error code, arguments: [1880], [0x110462EF8],

[0x110462C90], [1], [0], [], [], []

The server was bounced and fsck performed on the filesystems but all to no avail.

Metalink only had one hit using the ORA000 lookup tool which was a reference (note 460217.1) to maxpiecesize in the RMAN configuration causing problems when set. I immediately thought to myself how could a RMAN config setting cause a database to fail to start when it has been started many times before. However we did not have that setting enabled.

We then started looking at the trace files (most of which we had deleted earlier to release disk space) and spotted the following line in the alert log (repeated many times)

Errors in file /app/oracle/admin/SID/bdump/mtsprd1a_arc1_1532034.trc:

ORA-00600: Message 600 not found; No message file for product=RDBMS, facility=ORA; arguments: [1880] [0x1103EF620] [0x1103EF3B8] [1] [0]

Which pointed me to this in the trace files

Redo thread mounted by this instance: 1

Oracle process number: 122

Unix process pid: 786544, image: oracle@xxx (ARC1)

Detected dead process 786542; subsuming V$MANAGED_STANDBY slot

*** 2009-03-05 07:22:14.972

*** SERVICE NAME:(SYS$BACKGROUND) 2009-03-05 07:22:14.972

*** SESSION ID:(21.6335) 2009-03-05 07:22:14.972

Archivelog compression complete.

Input: 47124480 bytes Output: 45775293 bytes

Compression Performance: 94.00 percent or 7.77 bits per byte

*** 2009-03-05 07:22:14.974

ksedmp: internal or fatal error

ORA-00600: Message 600 not found; No message file for product=RDBMS, facility=ORA; arguments: [1880] [0x1103EF620] [0x1103EF3B8] [1] [0]

A Google search did not provide anything helpful but my colleague James Hardaker had seen something like this before and thought we should take off archivelog compression

startup mount;

alter database noarchivelog

alter database archivelog;

alter database open;

alter database archive log compress disable;

That did work for us and the database restarted cleanly but we need to investigate the cause of the problem. We believe it is around the area of lack of disk space when trying to compress. Hopefully the problem will be resolved as we are currently provisioning a new set of servers (HP Itanium) which will run a 2 node RAC cluster with a standby database on a third node. At the same time we will upgrade the EM database to 11.1.0.6 and OMS 10.2.0.4 which will improve supportability as well as performance.

Posted in Oracle, RMAN | Tagged: , , , | 3 Comments »

Script to tidy archivelogs from ASM and RMAN

Posted by John Hallas on July 17, 2008

We have a pre-production RAC cluster that is kept in archivelog mode to allow true performance monitoring (and to be used for Streams and DataGuard testing). However we do not need the archivelogs for recovery purposes and as we perform high-volume testing the +FRA diskgroup (on solid-state disk) gets full very quickly.

I wrote a script that can be run to quickly free up space. It connects to the ASM instance and removes the logfiles.

It sets the SID and ORACLE_HOME to that of the RAC instance then runs RMAN to perform a crosscheck and delete of the archivelogs.

Not particularly complex but efficient.

export ORACLE_SID=+ASM1
export ORACLE_HOME=/u00/app/asm/product/11.1.0/db_1

asmcmd -p << EOF

ls FRA/RACCLUSTER/ARCHIVELOG/*2008*/*
rm -rf FRA/RACCLUSTER/ARCHIVELOG/*
ls FRA/RACCLUSTER/ARCHIVELOG/*
EOF

export ORACLE_SID=RACSID
export ORACLE_HOME=/u00/app/oracle/product/11.1.0/db_2

rman target / catalog username/password@catdb << EOF1

CHANGE ARCHIVELOG ALL VALIDATE;
DELETE NOPROMPT EXPIRED ARCHIVELOG ALL;
EOF1

Posted in ASM, Oracle, RMAN, scripts | Tagged: , , , | Leave a Comment »

RMAN CHECKSYNTAX

Posted by John Hallas on July 17, 2008

I tend to write on subjects that I have had problems with or on commands and syntax that are new to me or hard to find good working examples of.
I was trying to find out how to answer a prompt after an RMAN command within a script and I came across the CHECKSYNTAX command to RMAN. This came in with 10g
Very simple to use but also very helpful when putting an RMAN script together.

The documentation example is as good as any to try out

Create an flat file badcommand with the following contents :-

#command file with bad syntax commands
restore database
recover database

Unix_prompt> RMAN CHECKSYNTAX @badcommand

Recovery Manager: Release 11.1.0.6.0 – Production on Thu Jul 17 09:07:08 2008

Copyright (c) 1982, 2007, Oracle. All rights reserved.

RMAN> #command file with bad syntax commands
2> restore database
3> recover
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-00558: error encountered while parsing input commands
RMAN-01009: syntax error: found “recover”: expecting one of: “archivelog, channel, check, controlfile, clone, database, datafile, device, from, force, high, (, preview, primary, ;, skip, spfile, skip readonly, standby, tablespace, to restore point, until, until restore point, validate”
RMAN-01007: at line 3 column 1 file: badcmdfile
Recovery Manager complete.

Edit the badcommand file and add semi-colons at the end of each line

Unix_prompt> RMAN CHECKSYNTAX @badcommand

Recovery Manager: Release 11.1.0.6.0 – Production on Thu Jul 17 09:07:33 2008

Copyright (c) 1982, 2007, Oracle. All rights reserved.

RMAN> #command file with bad syntax commands
2> restore database;
3> recover database;
4>
The cmdfile has no syntax errors

Recovery Manager complete.

Overall nice and simple and well worth using when writing RMAN scripts

Posted in Oracle, RMAN | Tagged: , | Leave a Comment »

 
Follow

Get every new post delivered to your Inbox.

Join 134 other followers