Archive for the ‘ASM’ Category

h1

Much faster performance with ASM – a real world example

October 9, 2009

We have a  Peoplesoft database that  manages HR for ~80,000 staff.  Oracle 10.2.03.  Peoplesoft Tools 8.49 PS HR 9 HPUX 11.31 Itanium

There has been long running problems caused by I/O bottlenecks writing down to Hitachi XP SAN disks. We were using filesystem storage and disk_asynch_io was set to false

The root cause of our problem was that the Time and Labour batches contend when they need to acquire exclusive access to write dirty blocks to data files during truncate operations. 

David Kurtz has written about a specific Peoplesoft feature around truncate http://www.go-faster.co.uk/docs.htm#local_write_wait

 We migrated to using ASM disk and achieved significant benefits.

 The testing stages were

1)       Prepare a 2 hour Time and labour batch run which would generate significant throughput

2)       Capture a baseline at 10.2.0.3

3)       Upgrade to 10.2.0.4 and retest

4)       Upgrade to 10.2.0.4 and migrate to ASM (11.1.0.7) and retest

5)       Upgrade to 11.1.0.7 and migrate to ASM (11.1.0.7) and retest

 

Test 3 did not show any difference at all. We were still blocked on disk, which was expected. We probably would have remained at 10.2.0.3 but gone to ASM but there are problems with seeing ASM disk when using 10.2.0.3 (Metalink Note 434500, ORA-15059) so we had the option of applying a fix to get ASM working with 10.2.0.3 or upgrading to 10.2.0.4. We decided that the benefits of getting onto a terminal release and suggestions from Oracle that small changes in the truncate process had been made between 10.2.0.3 and 10.2.0.4 out-weighed the overhead of any application testing that we needed to do.

Test 4 gave us big performance benefits. Test 5 was not as clear cut but we knew some code had to be optimised and we did not have the luxury of time to address the issues. Remaining on 10g minimised the application testing, certainly compared to the testing require to upgrade to 11g.

The benefits can be seen quite clearly in the graph below

asynch_io

This shows a 2 hour period where a constant throughput of T&L jobs were being processed. In parallel a constant online load was being simulated

The black line shows throughput and elapsed time. The run consists of a number of jobs which take 2 hours to complete and batch time is around the 450 - 600 second mark

The same processing is then run with 10.2.0.4 database and ASM. disk_aysnch_io is set to FALSE. The jobs complete 40 minutes faster and elapsed time is around the 300-400 second mark but showing spikes and inconsistent performance

Finally the orange line shows 10.2.0.4 database and ASM. disk_aysnch_io =TRUE. The processing time has now been reduced to just over an hour and the elapsed time is significantly better at an average of 250 seconds and with a very consistent profile.

As with all things in life we have now moved onto the next bottleneck which is the wait flashback buf free caused because we have enabled flashback logging and as the redo log buffer fills up we wait for flashback to be written out. However this is a miniscule issue compared to the original problems.  We have taken flashback off some tablespaces primarily used for PS temporary tables (aka working storage tables) although that does give us an issue when when rebuilding the primary database after a DataGuard failover using flashback logs

I hope this real world example is of use

h1

Managing control files in ASM

September 28, 2009

Just a couple of notes on how to create multiple ASM control files and moving control files toa different diskgroup.

Assuming we already have a ASM control file in the DATA diskgroup and we wish to add one to the FRA diskgroup

 SQL>show parameter control_files
+DATA/SID/controlfile/current.730.698514673
alter system set control_files='+DATA','+FRA' scope=spfile;
shutdown immediate;
startup nomount;
exit
rman target
/restore controlfile from '+DATAHR/MHRPRD1A/CONTROLFILE/current.730.698514673';
alter database mount;
alter database open;
SQL>show parameter control_files
+DATA/SID/controlfile/current.730.698514673, +FRAHR/SID/controlfile/current.766.698514673

 

                                                                           

Here is an example of how to migrate from one ASM diskgroup to another

Add the new disk discovery path in (if needed), create a diskgroup and add disks

alter system set asm_diskstring='/dev/oracle','/dev/oracle_hr/r*';

create diskgroup DATANEW  external redundancy disk '/dev/oracle_hr/rdisk1','/dev/oracle_hr/rdisk2';

SQL>show parameter control_files

control_files                        string      +DATA10/SID/controlfile/c urrent.796.695384211
alter system set CONTROL_FILES = '+DATANEW' scope=spfile

shutdown
rman target /
startup nomount
restore controlfile  from  '/+DATA/SID/controlfile/current.796.695384211';
alter database mount;
alter database open;
h1

Moving from ASM storage back to filesystem

September 17, 2009

Whilst it is not very common to move from ASM to filesystem I needed to prepare a regression plan for the migration of a 450 datafile database in the event of problems during the migration or shortly afterwards.

The example below is from a small database called TEST and I used filestore in /app/oracle whereas normally I would have a /oradata/SID/ filesystem

The details below should be enough to get someone through the migration and the sql scripts can be modified to be more generic if necessary but it is not too hard to use them as they are now. 

List the datafiles and we will add a second datafile to the users tablespace to provide an example of an issue that can occur.


SYS@TEST SQL>select file_name from dba_data_files;  +DATA/test/datafile/undotbs1.301.697649965
+DATA/test/datafile/sysaux.300.697649963
+DATA/test/datafile/system.299.697649963
+DATA/test/datafile/users.302.697649965
+DATA/test/datafile/users.309.697650601

 mkdir /app/oracle/TEST

Some tablespaces may have multiple datafiles which would end up with the same name using the code above. However as a quick circumvention I manually identified those tablespaces using this sql and then edited the new data file names

select tablespace_name,count(file_name) from dba_data_files group by tablespace_name ORDER BY 2 

In my example it only applied to the USERS tablespace and I edited the second ASM datafile to be mapped to users02.dbf not users01.dbf as the script will output. This action would have to be repeated for all tablespaces with multiple datafiles

sqlplus / as sysdba

set lines 120 pages 0

select 'copy datafile '||file_id||' to '||'''/app/oracle/TEST/'||substr(file_name,21,instr(file_name,'.')-21)||'_01.dbf'''||';' from dba_data_files order by 1;

copy datafile 1 to '/app/oracle/TEST/system_01.dbf';

copy datafile 2 to '/app/oracle/TEST/sysaux_01.dbf';

copy datafile 3 to '/app/oracle/TEST/undotbs1_01.dbf';

copy datafile 4 to '/app/oracle/TEST/users_01.dbf';

copy datafile 5 to '/app/oracle/TEST/users_02.dbf';

shutdown immediate;

startup mount;

exit

rman target /

copy datafile 1 to '/app/oracle/TEST/system_01.dbf';

copy datafile 2 to '/app/oracle/TEST/sysaux_01.dbf';

copy datafile 3 to '/app/oracle/TEST/undotbs1_01.dbf';

copy datafile 4 to '/app/oracle/TEST/users_01.dbf';

copy datafile 5 to '/app/oracle/TEST/users_02.dbf';

exit

sysdba

select 'alter database rename file '''||file_name||''' to '||'''/app/oracle/TEST/'||substr(file_name,21,instr(file_name,'.')-21)||'_01.dbf'''||';' from dba_data_files order by 1;

alter database rename file '+DATA/test/datafile/sysaux.300.697649963' to '/app/oracle/TEST/sysaux_01.dbf';
alter database rename file '+DATA/test/datafile/system.299.697649963' to '/app/oracle/TEST/system_01.dbf';
alter database rename file '+DATA/test/datafile/undotbs1.301.697649965' to '/app/oracle/TEST/undotbs1_01.dbf';
alter database rename file '+DATA/test/datafile/users.302.697649965' to '/app/oracle/TEST/users_01.dbf';
alter database rename file '+DATA/test/datafile/users.309.697650601' to '/app/oracle/TEST/users_02.dbf';

open the database and check for files still using ASM

open database

select name from v$controlfile
union
select name from v$tempfile
union
select member from v$logfile
union
select filename from v$block_change_tracking;

 

NAME
--------------------------------------------------------------------------------
+DATA/test/controlfile/current.303.697650047
+DATA/test/onlinelog/group_1.304.697650049
+DATA/test/onlinelog/group_2.305.697650049
+DATA/test/onlinelog/group_3.306.697650049
+DATA/test/tempfile/temp.307.697650095
+FRA/test/controlfile/current.260.697650047
+FRA/test/onlinelog/group_1.261.697650049
+FRA/test/onlinelog/group_2.262.697650049
+FRA/test/onlinelog/group_3.263.697650051

alter database rename file '+DATA/test/datafile/sysaux.300.697649963' to '/app/oracle/TEST/sysaux_01.dbf';

alter database rename file '+DATA/test/datafile/system.299.697649963' to '/app/oracle/TEST/system_01.dbf';

alter database rename file '+DATA/test/datafile/undotbs1.301.697649965' to '/app/oracle/TEST/undotbs1_01.dbf';

alter database rename file '+DATA/test/datafile/users.302.697649965' to '/app/oracle/TEST/users_01.dbf';

alter database rename file '+DATA/test/datafile/users.309.697650601' to '/app/oracle/TEST/users_02.dbf';

ALTER DATABASE ADD LOGFILE GROUP 4 ('/app/oracle/TEST/redo_04a.dbf','/app/oracle/TEST/redo_04b.dbf') size 50M;

ALTER DATABASE ADD LOGFILE GROUP 5 ('/app/oracle/TEST/redo_05a.dbf','/app/oracle/TEST/redo_05b.dbf') size 50M;

ALTER DATABASE ADD LOGFILE GROUP 6 ('/app/oracle/TEST/redo_06a.dbf','/app/oracle/TEST/redo_06b.dbf') size 50M; 

You will need to perform logfile switches or checkpoint before the log groups using ASM are available for removal. 

SQL>

show parameter control

 control_files +DATA/test/controlfile/current .303.697650047, +FRA/test/cont rolfile/current.260.697650047

create a pfile from spfile and edit the pfile to show new placement for the controlfiles

*.control_files='/app/oracle/TEST/control01.ctl','/app/oracle/TEST/control02.ctl'

startup nomount

$rman target / 

Recovery Manager: Release 11.1.0.7.0 - Production on Wed Sep 16 16:36:15 2009 connected to target database:

TEST (not mounted) using target database control file instead of recovery catalog

 restore controlfile from '+DATA/test/controlfile/current.303.697650047';

Starting restore at 16-SEP-09 allocated channel: ORA_DISK_1 channel ORA_DISK_1: SID=70 device type=DISK channel ORA_DISK_1: copied control file copy output file name=/app/oracle/TEST/control01.ctl output file name=/app/oracle/TEST/control02.ctl

Finished restore at 16-SEP-09

 exit

Recovery Manager complete.

 

create spfile='/app/oracle/product/11.1.0/db_1/dbs/spfileTEST.ora' from pfile='/home/oracle/john.ora';

File created.
shutdown immediate; 
startup

show parameter control

control_files  /app/oracle/TEST/control01.ctl , /app/oracle/TEST/control02.ctl 

All that is left is to recreate the temp tablespace to use filesystem storage.

startup mount;

RMAN> run { set newname for tempfile 1 to '/app/oracle/TEST/temp01.dbf'; switch tempfile all;}

executing command:

SET NEWNAME renamed tempfile 1 to /app/oracle/TEST/temp01.dbf in control file  rman target /

 TEST (not mounted) using target database control file instead of recovery catalog RMAN> restore controlfile from '+DATA/test/controlfile/current.303.697650047';

Starting restore at 16-SEP-09 allocated channel: ORA_DISK_1 channel ORA_DISK_1: SID=69 device type=DISK channel ORA_DISK_1:

copied control file copy output file

name=/app/oracle/TEST/control01.ctl output file name=/app/oracle/TEST/control02.ctl

Finished restore at 16-SEP-09

All should be complete now. As I said, this is just a hint on what actions to take and the sql scripts should be useful if you have lots of datafiles.

All that is left is to create a new  block_change_tracking file and the ASM storage can be released.

h1

Oracle 11G Release 2 available (Linux only though)

September 2, 2009

Linux only at the moment

 http://www.oracle.com/technology/products/database/oracle11g/index.html

  11GR2 documentation

 http://www.oracle.com/pls/db112/homepage

 11GR2 new features

 http://download.oracle.com/docs/cd/E11882_01/server.112/e10881/toc.htm

  new features index page shows the following (of which I will be looking at ASM first)

 1.1 Application Development

1.10.4.3 XMLIndex Enhancements

h1

ASM space marked as internal – use check all repair

July 30, 2009

The problem was that the FRA diskgroup seemed to be using a lot of space and and yet there was hardly  files on  disk as far as I could tell. HP Itanium – 11.1.0.7

ASM Disk Groups
===============

     Group Group Name                State           Type      Total GB Free GB
---------- ------------------------- --------------- ------- ---------- -------
         1 DATA                      MOUNTED         EXTERN        2581     109
         2 FRA                       MOUNTED         EXTERN        1602     127

 
Group Disk Header    Mode     State           Redundancy   Total MB    Free MB Disk Name                      Failure Gr
----- ---- --------- -------- --------------- ---------- ---------- ---------- ------------------------------ ----------
    2    1 MEMBER    ONLINE   NORMAL          UNKNOWN         91138          0 FRA_0001                       FRA_0001
    2    2 MEMBER    ONLINE   NORMAL          UNKNOWN         91138          0 FRA_0002                       FRA_0002
    2    4 MEMBER    ONLINE   NORMAL          UNKNOWN         91138          0 FRA_0004                       FRA_0004
    2    5 MEMBER    ONLINE   NORMAL          UNKNOWN         91138          0 FRA_0005                       FRA_0005
    2    6 MEMBER    ONLINE   NORMAL          UNKNOWN         91138          0 FRA_0006                       FRA_0006
    2    7 MEMBER    ONLINE   NORMAL          UNKNOWN         91138          0 FRA_0007                       FRA_0007
    2    8 MEMBER    ONLINE   NORMAL          UNKNOWN         91138          0 FRA_0008                       FRA_0008
    2    9 MEMBER    ONLINE   NORMAL          UNKNOWN         91138          0 FRA_0009                       FRA_0009
    2   10 MEMBER    ONLINE   NORMAL          UNKNOWN         91138          0 FRA_0010                       FRA_0010
    2   11 MEMBER    ONLINE   NORMAL          UNKNOWN         91138          0 FRA_0011                       FRA_0011
    2   12 MEMBER    ONLINE   NORMAL          UNKNOWN         91138          0 FRA_0012                       FRA_0012
    2   13 MEMBER    ONLINE   NORMAL          UNKNOWN         91138          0 FRA_0013                       FRA_0013
    2   14 MEMBER    ONLINE   NORMAL          UNKNOWN         91138          0 FRA_0014                       FRA_0014
    2   15 MEMBER    ONLINE   NORMAL          UNKNOWN         91138          0 FRA_0015                       FRA_0015
    2   16 MEMBER    ONLINE   NORMAL          UNKNOWN         91138          0 FRA_0016                       FRA_0016
    2   17 MEMBER    ONLINE   NORMAL          UNKNOWN         91138          0 FRA_0017                       FRA_0017
    2   18 MEMBER    ONLINE   NORMAL          UNKNOWN         91138      65242 FRA_0018                       FRA_0018
    2   19 MEMBER    ONLINE   NORMAL          UNKNOWN         91138      65112 FRA_0019                       FRA_0019

 I had removed one disk from FRA and forced a rebalance but that still did not release space. I had checked the FRA disks using asmcmd but that only showed 2 online logs which were 2Gb each (we were not in archive log mode).

I had checked the diskgroup using the EM Gui which issued the command  

ALTER DISKGROUP FRA CHECK ALL:

but still no success. Looking at the documentation the check command taking NOREPAIR as the default should have reported any issues into the ASM alert log.  Which indeed it did (and I would have seen them if I had bothered to look)

Wed Jul 29 16:37:55 2009
SQL> ALTER DISKGROUP FRA CHECK DISK FRA_0001,FRA_0002,FRA_0004,FRA_0005,FRA_0006,FRA_0007,FRA_0008,FRA_0009,FRA_0010,FRA_0011,FRA_0012,FRA_0013,FRA_0014,FRA_0015,FRA_0016,FRA_0017,FRA_0018,FRA_0019
WARNING: Deprecated privilege SYSDBA for command 'ALTER DISKGROUP CHECK'
kfdp_checkDsk(): 33
NOTE: disk FRA_0001, used AU total mismatch: DD=4294965892, AT=2283
kfdp_checkDsk(): 34
NOTE: disk FRA_0002, used AU total mismatch: DD=4294966012, AT=2283
kfdp_checkDsk(): 35
NOTE: disk FRA_0004, used AU total mismatch: DD=4294966210, AT=2283
kfdp_checkDsk(): 36
NOTE: disk FRA_0005, used AU total mismatch: DD=4294966625, AT=2283
kfdp_checkDsk(): 37
NOTE: disk FRA_0006, used AU total mismatch: DD=4294965615, AT=2280
kfdp_checkDsk(): 38
NOTE: disk FRA_0007, used AU total mismatch: DD=4294964292, AT=2283
kfdp_checkDsk(): 39
NOTE: disk FRA_0008, used AU total mismatch: DD=4294967237, AT=2284
kfdp_checkDsk(): 40
NOTE: disk FRA_0009, used AU total mismatch: DD=4294965595, AT=2280
kfdp_checkDsk(): 41
NOTE: disk FRA_0010, used AU total mismatch: DD=4294966758, AT=2279
kfdp_checkDsk(): 42
NOTE: disk FRA_0011, used AU total mismatch: DD=4294966324, AT=2279
kfdp_checkDsk(): 43
NOTE: disk FRA_0012, used AU total mismatch: DD=4294966743, AT=2279
kfdp_checkDsk(): 44
NOTE: disk FRA_0013, used AU total mismatch: DD=4294964514, AT=2278
kfdp_checkDsk(): 45
NOTE: disk FRA_0014, used AU total mismatch: DD=4294965734, AT=2278
kfdp_checkDsk(): 46
NOTE: disk FRA_0015, used AU total mismatch: DD=4294965204, AT=2279
kfdp_checkDsk(): 47
NOTE: disk FRA_0016, used AU total mismatch: DD=4294965599, AT=2279
kfdp_checkDsk(): 48
NOTE: disk FRA_0017, used AU total mismatch: DD=4294964950, AT=2280
kfdp_checkDsk(): 49
NOTE: disk FRA_0018, used AU total mismatch: DD=25896, AT=2278
kfdp_checkDsk(): 50
NOTE: disk FRA_0019, used AU total mismatch: DD=26026, AT=2280
WARNING: deprecated use of ALTER DISKGROUP CHECK arguments
SUCCESS: ALTER DISKGROUP FRA CHECK DISK FRA_0001,FRA_0002,FRA_0004,FRA_0005,FRA_0006,FRA_0007,FRA_0008,FRA_0009,FRA_0010,FRA_0011,FRA_0012,FRA_0013,FRA_0014,FRA_0015,FRA_0016,FRA_0017,FRA_0018,FRA_0019

From the command line I then ran the comand

ALTER DISKGROUP FRA CHECK ALL REPAIR;

and I had immediate success. It took about 1 minute and corrected the issues and then released the space that had been grabbed internally.

     Group Group Name                State           Type      Total GB Free GB
---------- ------------------------- --------------- ------- ---------- -------
         1 DATA                      MOUNTED         EXTERN        2581     109
         2 FRA                       MOUNTED         EXTERN  1602   1,562

Group Disk Header    Mode     State           Redundancy   Total MB    Free MB Disk Name                      Failure Gr
----- ---- --------- -------- --------------- ---------- ---------- ---------- ------------------------------ ----------
    2    1 MEMBER    ONLINE   NORMAL          UNKNOWN         91138      88855 FRA_0001                       FRA_0001
    2    2 MEMBER    ONLINE   NORMAL          UNKNOWN         91138      88855 FRA_0002                       FRA_0002
    2    4 MEMBER    ONLINE   NORMAL          UNKNOWN         91138      88855 FRA_0004                       FRA_0004
    2    5 MEMBER    ONLINE   NORMAL          UNKNOWN         91138      88855 FRA_0005                       FRA_0005
    2    6 MEMBER    ONLINE   NORMAL          UNKNOWN         91138      88858 FRA_0006                       FRA_0006
    2    7 MEMBER    ONLINE   NORMAL          UNKNOWN         91138      88855 FRA_0007                       FRA_0007
    2    8 MEMBER    ONLINE   NORMAL          UNKNOWN         91138      88854 FRA_0008                       FRA_0008
    2    9 MEMBER    ONLINE   NORMAL          UNKNOWN         91138      88858 FRA_0009                       FRA_0009
    2   10 MEMBER    ONLINE   NORMAL          UNKNOWN         91138      88859 FRA_0010                       FRA_0010
    2   11 MEMBER    ONLINE   NORMAL          UNKNOWN         91138      88859 FRA_0011                       FRA_0011
    2   12 MEMBER    ONLINE   NORMAL          UNKNOWN         91138      88859 FRA_0012                       FRA_0012
    2   13 MEMBER    ONLINE   NORMAL          UNKNOWN         91138      88860 FRA_0013                       FRA_0013
    2   14 MEMBER    ONLINE   NORMAL          UNKNOWN         91138      88860 FRA_0014                       FRA_0014
    2   15 MEMBER    ONLINE   NORMAL          UNKNOWN         91138      88859 FRA_0015                       FRA_0015
    2   16 MEMBER    ONLINE   NORMAL          UNKNOWN         91138      88859 FRA_0016                       FRA_0016
    2   17 MEMBER    ONLINE   NORMAL          UNKNOWN         91138      88858 FRA_0017                       FRA_0017
    2   18 MEMBER    ONLINE   NORMAL          UNKNOWN         91138      88860 FRA_0018                       FRA_0018
    2   19 MEMBER    ONLINE   NORMAL          UNKNOWN         91138      88858 FRA_0019                       FRA_0019

 

SQL> alter diskgroup fra check all repair
NOTE: starting check of diskgroup FRA
kfdp_checkDsk(): 51
WARNING: disk FRA_0001, changing DD used AUs from 4294965892 to 2283
kfdp_checkDsk(): 52
WARNING: disk FRA_0002, changing DD used AUs from 4294966012 to 2283
kfdp_checkDsk(): 53
WARNING: disk FRA_0004, changing DD used AUs from 4294966210 to 2283
kfdp_checkDsk(): 54
WARNING: disk FRA_0005, changing DD used AUs from 4294966625 to 2283
kfdp_checkDsk(): 55
WARNING: disk FRA_0006, changing DD used AUs from 4294965615 to 2280
kfdp_checkDsk(): 56
WARNING: disk FRA_0007, changing DD used AUs from 4294964292 to 2283
kfdp_checkDsk(): 57
WARNING: disk FRA_0008, changing DD used AUs from 4294967237 to 2284
kfdp_checkDsk(): 58
WARNING: disk FRA_0009, changing DD used AUs from 4294965595 to 2280
kfdp_checkDsk(): 59
WARNING: disk FRA_0010, changing DD used AUs from 4294966758 to 2279
kfdp_checkDsk(): 60
WARNING: disk FRA_0011, changing DD used AUs from 4294966324 to 2279
kfdp_checkDsk(): 61
WARNING: disk FRA_0012, changing DD used AUs from 4294966743 to 2279
kfdp_checkDsk(): 62
WARNING: disk FRA_0013, changing DD used AUs from 4294964514 to 2278
kfdp_checkDsk(): 63
WARNING: disk FRA_0014, changing DD used AUs from 4294965734 to 2278
kfdp_checkDsk(): 64
WARNING: disk FRA_0015, changing DD used AUs from 4294965204 to 2279
kfdp_checkDsk(): 65
WARNING: disk FRA_0016, changing DD used AUs from 4294965599 to 2279
kfdp_checkDsk(): 66
WARNING: disk FRA_0017, changing DD used AUs from 4294964950 to 2280
kfdp_checkDsk(): 67
WARNING: disk FRA_0018, changing DD used AUs from 25896 to 2278
kfdp_checkDsk(): 68
WARNING: disk FRA_0019, changing DD used AUs from 26026 to 2280
SUCCESS: check of diskgroup FRA found no errors
WARNING: deprecated use of ALTER DISKGROUP CHECK arguments
SUCCESS: alter diskgroup fra check all repair
Thu Jul 30 10:31:33 2009

 

I would have expected the gui to have reported errors but it didn’t. Equally I can see no good reason  for not fixing errors it identified when a check is run. That was the default in 10G  so  there must be a good reason why it was changed. Perhaps it had a performance impact which I would not have noticed on a pretty empty diskgroup. The other reason is that it might make the assumption that the DBA would want a good backup before repairing.

As we had nothing in there and we backup ASM metadata every day then I was happy to go ahead. http://jhdba.wordpress.com/2009/06/11/script-to-backup-asm-metadata/

 So the lessons learned are

  1. Use the check all feature
  2. Check the alert log afterwards
  3. Then use the check all repair statement from command line
h1

Autostart using the ‘W’ flag in oratab

May 12, 2009

In 10.2.0.2 an option arrived to manage the automatic startup and shutdown of the databases. This is documented in the manual and also in Metalink note 788502.1.
It  is probably easier to reproduce the key paragraph here

Database entries in the oratab file are displayed in the following format:

SID:ORACLE_HOME:{Y|N|W}

In this example, the values Y and N specify whether you want the scripts to start up or shut down the database, respectively. For each database for which you want to automate shutdown and startup, first determine the instance identifier (SID) for that database, which is identified by the SID in the first field.

Then, change the last field for each to Y.

You can set dbstart to autostart a single-instance database that uses an Automatic Storage Management installation that is auto-started by Oracle Clusterware. This is the default behavior for an Automatic Storage Management cluster. If you want to do this, then you must change the oratab entry of the database and the Automatic Storage Management installation to use a third field with the value W and N, respectively. These values specify that dbstart auto-starts the database only after the Automatic Storage Management instance is started.

However on playing around with these settings on a server with ASM and 11.1.0.7 installed complete with a single instance database I could not see any way that the functionality was working.

Looking around on the net there does not seem to be much information out there and nothing on Metalink

In our environment we have a standard that we set up the oratab in the following manner

DATABASE SID that will be the default ORACLE_SID and ORACLE_HOME when you login
DATABASE_SID1 – if available
DATABASE_SID2 – if available
+ASM – the asm instance entry
agent10g – the ORACLE_HOME for the grid agent.

I suspect that if the ‘W’ flag in oratab did work then we would need to change that standard to get the ASM entry in as the first entry.

If anyone has any more feedback please feel free to comment

h1

Location of diagnostic directories in 11g

April 3, 2009

Since the advent of 11g and the use of diagnostic_dest there is no requirement to create the bdump/cdump/udump  and other directories required for oracle trace and admin files. Well, that is true if you are happy to accept the default location under $ORACLE_BASE. However the path to the trace area where alert logs and dump files are kept is a long one and I prefer  to use an alias to get there.

In the .profile I set up two aliases to point to the trace areas for ASM and whichever database my current SID is set to

alias diagdest=’cd $ORACLE_BASE/`adrci exec=”show homes” | grep $ORACLE_SID`/trace’

alias diagdestasm=’cd $ORACLE_BASE/`adrci exec=”show homes” | grep ASM`/trace’

I am sure there are other ways of setting a similar alias and I am happy to hear of any that others use

h1

Cloning a database (ASM to ASM)

April 2, 2009

Cloning production ASM (RAC)  to ASM (non RAC)

In this document I will be cloning database SOURCEDB (11.1.0.6 RAC) to TARGETDB (11.1.0.6 single instance). There will be no downtime on the source database and minimal impact.

Source server source_server  150Gb data ASM storage used

Target server target_server

Pre-requirements -ensure that you have backed up the source database and are comfortable that you are covered in case anything goes wrong

200Gb of disk space

NFS mount between the 2 servers. In this case I created a directory /app/oradata/rman_backup_dir and arranged for a NFS mount to the same filesystem on target_server

source_server:/app/oradata/rman_backup_dir

403701760 42657229 338479316   11% /app/oradata/rman_backup_dir

The steps we will follow will be :-

  • Prepare database on target server
  • Backup source database
  • Run duplicate database scripts

Note 1

RMAN uses a terminology of target as being the source database and auxiliary as the duplicated database. In this document I am using the terms source and target for the two databases. However the RMAN script will use target and auxiliary instead out of necessity.

Note 2

Initially I wanted to use active duplicate database however whilst it did work once it failed many more times and there is an Oracle bug Bug 6603587 – CREATION OF SECOND STANDBY USING RMAN DUPLICATE FAILS WITH RMAN-20242

I then tried using non-active database and with a database backup. This time I hit Bug.5653255: PHSB: RMAN CAN\’T RESTORE DATAFILES CORRECTLY ON ASM USING IMAGE COPIES. The workaround to this was to use a backupset..

Prepare target database

I am assuming that we have an existing database built on the target server and we are just performing a refresh from production. However for completeness this section shows how to prepare the spfile and startup nomount the target database.

If this process is a refresh rather than new build then go to here

Create a tnsnames.ora entry on the source server

Note that we are using port 1524 whereas the source database uses port 1522

TARGETDB_CLONE  =

(DESCRIPTION =

(ADDRESS_LIST=

(ADDRESS = (PROTOCOL = TCP)(HOST = target_server)(PORT = 1524))

)

(CONNECT_DATA =

(SERVER = DEDICATED)

(SERVICE_NAME = TARGETDB)

)

)

Create a listener on the target server

Note that we are using port 1524 whereas the source database uses port 1522

LISTENERCLONE =

(DESCRIPTION_LIST =

(DESCRIPTION =

(ADDRESS = (PROTOCOL = TCP)(HOST = target_server)(PORT = 1524))

)

)

SID_LIST_LISTENERCLONE =

(SID_LIST =

(SID_DESC =

(GLOBAL_DBNAME = TARGETDB)

(ORACLE_HOME = /app/oracle/product/11.1.0/db_1/)

(SID_NAME = TARGETDB)

)

)

Start the listener

lsnrctl start listenerclone

On source_server

create pfile=\’/home/oracle/initTARGETDB.ora\’ from spfile;

Create pfile – target_server. Edit to remove references to RAC (threads etc). As I am moving from DATA/FRA diskgroup to DATA/FRA diskgroup I have no need to use any covert file_name parameters. Edit the spfile to replace the source SID name with the new SID

Edit /etc/oratab to add database entry

Copy the pfile from source_server

cd $ORACLE_HOME/dbs

scp oracle@source_server:/home/oracle/initTARGETDB.ora .

Create audit dest on targetserver

mkdir -p /app/oracle/admin/TARGETDB/adump

Copy passwordfile from source_server

scp oracle@source_server:/app/oracle/product/11.1.0/db_1/orapwSOURCEDB /app/oracle/product/11.1.0/db_1/orapwTARGETDB

Startup standby database from targetserver

sqlplus  / as sysdba

create spfile from pfile;

startup nomount

exit

Start the clone listener on target_server.

This is required as you cannot connect to a database that is unavailable with the asm listener

lsnrctl start listenerclone

From the primary database server source_server test the connection to the target database using the command:

sqlplus sys/password@TARGETDB_CLONE as sysdba

Prepare refreshed database

sqlplus  / as sysdba

shutdown immediate;

startup nomount;

exit

Backup source database

rm /app/oradata/rman_backup_dir/*

export NLS_DATE_FORMAT=\’YYYY-MM-DD:HH24:MI:SS\’

rman target /

run

{

allocate channel d1 device type disk format \’/app/oradata/rman_backup_dir/%U\’;

allocate channel d2 device type disk format \’/app/oradata/rman_backup_dir/%U\’;

allocate channel d3 device type disk format \’/app/oradata/rman_backup_dir/%U\’;

allocate channel d4 device type disk format \’/app/oradata/rman_backup_dir/%U\’;

allocate channel d5 device type disk format \’/app/oradata/rman_backup_dir/%U\’;

backup as backupset database plus archivelog;

}

This takes around 10 minutes. Make a note of the finish time (which is why the NLS_DATE_FORMAT variable was set)

Duplicate database

2 scripts have been created in /home/oracle on pebor02 which are pasted here

rman_duplicate_from_backup.sh

#

# Set the Oracle Environment

#

export ORACLE_HOME= /app/oracle/product/11.1.0/db_1/

export ORACLE_SID=SOURCEDB

export PATH=$ORACLE_HOME/bin:$PATH

export NLS_LANG=american

export NLS_DATE_FORMAT=\’YYYY-MM-DD:HH24:MI:SS\’

echo $ORACLE_HOME

echo $ORACLE_SID

#

# Run the RMAN Command

#

nohup  $ORACLE_HOME/bin/rman target sys/password nocatalog auxiliary sys/password@TARGETDB_clone @rman_duplicate_from_backup.cmd log=/home/oracle/rman_duplicate_from_backup.log  &

rman_duplicate_from_backup.cmd

run

{

set until scn 9660022443946;

allocate auxiliary channel d1 device type disk;

allocate auxiliary channel d2 device type disk;

allocate auxiliary channel d3 device type disk;

allocate auxiliary channel d4 device type disk;

allocate auxiliary channel d5 device type disk;

duplicate target database to TARGETDB;

}

Run the shell script  ./ rman_duplicate_from_backup.sh then it will fail with an invalid SCN, however  the output will provide the latest SCN that can be recovered until. Edit the rman_duplicate_from_backup.cmd file and replace the SCN and rerun.

You can get the SCN from the timestamp by this way you are letting Oracle do the work for you..

Restart the shell script and monitor the log file specified in the shell script.

Tidy up

Stop the listener_clone and start a listener for the newly refreshed database.

Remove the files from /app/oradata/rman_backup_dir

Notify the Apps team that the database has been refreshed and the time that the backup finished.