Archive for the ‘11g new features’ Category

h1

Configuring Dataguard Broker in 11G

October 12, 2009

DataGuard Broken?

A guest blog from colleague Simon Ryan

Last year we did some sandpit work on implementing DataGuard.  This was 10g on a Windows platform.  When it came to the broker we didn’t have much luck.  At the UKOUG Michael Moller referred to the Broker as DataGuard Broken (DataGuard 11g New Features).  From the audience reaction this seemed to be a popular perception.

Since then we’ve successfully implemented DataGuard on most of our Production databases (both 10g and 11g) on HP-UX.  However, we’ve still not embraced the Broker!  The ‘broken’ tag still taints its reputation.  A recent visit by Oracle attempted to assure us that the Broker with 11g is much improved.

The following is a basic guide to implementing the broker and our findings:

Our Setup

o       Primary – Two node RAC Cluster (XXXPRD1A)

o       Physical Standby – Single node (XXXPRD1B)

o       HP-UX 11.31 Itanium

o       Database using ASM (both 11.1.0.7).

o       DataGuard running in Max Availability with Real Time Apply.

Configuration

Add the following entries to tnsnames.ora on all instances:

 

XXXPRD1A_DGMGRL =

   (DESCRIPTION=

     (ADDRESS=(PROTOCOL=tcp)(HOST=xxxem03-vip)(PORT=1526))

     (CONNECT_DATA=

       (SERVICE_NAME=XXXPRD1A_DGMGRL)

     )

    )

 

XXXPRD2A_DGMGRL =

   (DESCRIPTION=

     (ADDRESS=(PROTOCOL=tcp)(HOST=xxxem04-vip)(PORT=1526))

     (CONNECT_DATA=

       (SERVICE_NAME=XXXPRD2A_DGMGRL)

     )

    )

 

XXXPRD1B_DGMGRL =

   (DESCRIPTION=

     (ADDRESS=(PROTOCOL=tcp)(HOST=xxxem01-vip)(PORT=1526))

     (CONNECT_DATA=

       (SERVICE_NAME=XXXPRD1B_DGMGRL)

     )

    )

 

Add the following entry to SID_LIST of your listener in listener.ora on all instances:

 

 (SID_DESC = 

      (ORACLE_HOME = /app/oracle/product/11.1.0/db_1)

      (SID_NAME = XXXPRDxx)

      (GLOBAL_DBNAME = XXXPRDxx_DGMGRL)

    )

 Stop/start listener.

 Test you can now connect:

 

 sqlplus sys@XXXPRD2A_dgmgrl as sysdba

  

Make the following parameter changes:

 

 XXXPRD1A:
<pre>
 

alter system set dg_broker_start=true scope=spfile;

alter system set dg_broker_config_file1='+DATA/XXXPRD1A/broker1.dat' scope=spfile;

alter system set dg_broker_config_file2='+FRA/XXXPRD1A/broker2.dat' scope=spfile;

 XXXPRD1B:

alter system set dg_broker_start=true scope=spfile;

alter system set dg_broker_config_file1='+DATA/XXXPRD1B/broker1.dat' scope=spfile;

alter system set dg_broker_config_file2='+FRA/XXXPRD1B/broker2.dat' scope=spfile; 

 Stop/Start the databases:

 

 Stop XXXPRD1A:
<pre>
 

srvctl stop database -d XXXPRD1A -o immediate

 Stop XXXPRD1B:

shutdown immediate

 Start XXXPRD1B:

startup mount

 Start XXXPRD1A:

srvctl start database -d XXXPRD1A

 

 Check for the broker process:

 

 XXXPRD1A:
<pre>
 

ps –ef | grep ora_dmon_XXXPRD1A

XXXPRD1B:

ps –ef | grep ora_dmon_XXXPRD1B

 

Create DG Broker Configuration:

 

[code lang='xml'] Primary:

dgmgrl

connect sys

<em>enter sys password</em>

 

create configuration XXXPRD1A_DGCONFIG as primary database is “XXXPRD1A” connect identifier is XXXPRD1A;

Configuration "XXXPRD1A_dgconfig" created with primary database "XXXPRD1A"

 

add database “XXXPRD1B” as connect identifier is XXXPRD1B maintained as physical;

Database "XXXPRD1B" added

 

enable configuration;

Enabled.

 

show configuration;

 

Configuration

  Name:                XXXPRD1A_dgconfig

  Enabled:             YES

  Protection Mode:     MaxAvailability

  Databases:

    XXXPRD1A - Primary database

    XXXPRD1B - Physical standby database

 

Fast-Start Failover: DISABLED

 

Current status for "XXXPRD1A_dgconfig":

SUCCESS

 

 Check Broker Files Exist:

Primary and Standby ASM:

Check they are there.

 

Show Database:

 Primary: 

show database verbose “XXXPRD1A”;

 

Database

  Name:            XXXPRD1A

  Role:            PRIMARY

  Enabled:         YES

  Intended State:  TRANSPORT-ON

  Instance(s):

    XXXPRD1A

    XXXPRD2A

 

  Properties:

    DGConnectIdentifier             = 'XXXPRD1A'

    ObserverConnectIdentifier       = ''

    LogXptMode                      = 'SYNC'

    DelayMins                       = '0'

    Binding                         = 'OPTIONAL'

    MaxFailure                      = '0'

    MaxConnections                  = '1'

    ReopenSecs                      = '300'

    NetTimeout                      = '30'

    RedoCompression                 = 'DISABLE'

    LogShipping                     = 'ON'

    PreferredApplyInstance          = ''

    ApplyInstanceTimeout            = '0'

    ApplyParallel                   = 'AUTO'

    StandbyFileManagement           = 'AUTO'

    ArchiveLagTarget                = '0'

    LogArchiveMaxProcesses          = '10'

    LogArchiveMinSucceedDest        = '1'

    DbFileNameConvert               = 'XXXPRD1B, XXXPRD1A'

    LogFileNameConvert              = 'XXXPRD1B, XXXPRD1A'

    FastStartFailoverTarget         = ''

    StatusReport                    = '(monitor)'

    InconsistentProperties          = '(monitor)'

    InconsistentLogXptProps         = '(monitor)'

    SendQEntries                    = '(monitor)'

    LogXptStatus                    = '(monitor)'

    RecvQEntries                    = '(monitor)'

    HostName(*)

    SidName(*)

    StaticConnectIdentifier(*)

    StandbyArchiveLocation(*)

    AlternateLocation(*)

    LogArchiveTrace(*)

    LogArchiveFormat(*)

    LatestLog(*)

    TopWaitEvents(*)

    (*) - Please check specific instance for the property value

 

Current status for "XXXPRD1A":

SUCCESS

 

Standby:

show database verbose “XXXPRD1B”;

 

Database

  Name:            XXXPRD1B

  Role:            PHYSICAL STANDBY

  Enabled:         YES

  Intended State:  APPLY-ON

  Instance(s):

    XXXPRD1B

 

  Properties:

    DGConnectIdentifier             = 'XXXPRD1B'

    ObserverConnectIdentifier       = ''

    LogXptMode                      = 'SYNC'

    DelayMins                       = '0'

    Binding                         = 'OPTIONAL'

    MaxFailure                      = '0'

    MaxConnections                  = '1'

    ReopenSecs                      = '60'

    NetTimeout                      = '30'

    RedoCompression                 = 'DISABLE'

    LogShipping                     = 'ON'

    PreferredApplyInstance          = ''

    ApplyInstanceTimeout            = '0'

    ApplyParallel                   = 'AUTO'

    StandbyFileManagement           = 'AUTO'

    ArchiveLagTarget                = '0'

    LogArchiveMaxProcesses          = '10'

    LogArchiveMinSucceedDest        = '1'

    DbFileNameConvert               = 'XXXPRD1A, XXXPRD1B'

    LogFileNameConvert              = 'XXXPRD1A, XXXPRD1B'

    FastStartFailoverTarget         = ''

    StatusReport                    = '(monitor)'

    InconsistentProperties          = '(monitor)'

    InconsistentLogXptProps         = '(monitor)'

    SendQEntries                    = '(monitor)'

    LogXptStatus                    = '(monitor)'

    RecvQEntries                    = '(monitor)'

    HostName                        = 'xxxem01'

    SidName                         = 'XXXPRD1B'

    StaticConnectIdentifier         = '(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=xxxem01)(PORT=1526))(CONNECT_DATA=(SERVICE_NAME=XXXPRD1B_DGMGRL)(INSTANCE_NAME=XXXPRD1B)(SERVER=DEDICATED)))'

    StandbyArchiveLocation          = 'USE_DB_RECOVERY_FILE_DEST'

    AlternateLocation               = ''

    LogArchiveTrace                 = '0'

    LogArchiveFormat                = '%t_%s_%r.dbf'

    LatestLog                       = '(monitor)'

    TopWaitEvents                   = '(monitor)'

 

Current status for "XXXPRD1B":

SUCCESS

 Ensure logs are still being transported and applied

All still works!

 

What does it give us?

  • Improved monitoring in OEM.
  • Simpler administration via DGMGRL or OEM i.e. changing Protection Mode.
  • Easy switchover/failover
    • DGMGRL: switchover to “XXXPRD1A”
    • Click of a button in OEM.
  • Fast-Start Failover (with an observer).

 

Conclusions

The broker configuration was relatively painless and I have successfully performed a switchover via both DGMGRL and OEM.  I’ve also used the various OEM functions without issue too.  I’m uneasy about performing such tasks via OEM but then most DBA’s prefer to be at the helm via the command line rather than clicking a button (that’s too easy!). 

If you plan to use DataGuard Broker then embrace it and make sure you don’t alter parameters behind the covers.  It does that for you and it doesn’t like you meddling!  I also think it’s a good idea to setup DataGuard initially without the broker and then configure it once you’re happy that DataGuard is working.

So what’s made the difference since our experience last year?  UNIX v Windows (maybe), 10g v 11g (probably).  A big factor would also have to be our experience level.  In the last year we’ve increased our DataGuard count from zero to ten so we are therefore much more comfortable in the DataGuard world.

When a sand-pit database becomes available I’d like to try and setup the observer and test Fast-Start Failover.  I’m inclined to remove the ‘broken’ tag but I’ll reserve judgment until I’ve tested the remaining functions.

Simon Ryan

h1

Moving from ASM storage back to filesystem

September 17, 2009

Whilst it is not very common to move from ASM to filesystem I needed to prepare a regression plan for the migration of a 450 datafile database in the event of problems during the migration or shortly afterwards.

The example below is from a small database called TEST and I used filestore in /app/oracle whereas normally I would have a /oradata/SID/ filesystem

The details below should be enough to get someone through the migration and the sql scripts can be modified to be more generic if necessary but it is not too hard to use them as they are now. 

List the datafiles and we will add a second datafile to the users tablespace to provide an example of an issue that can occur.


SYS@TEST SQL>select file_name from dba_data_files;  +DATA/test/datafile/undotbs1.301.697649965
+DATA/test/datafile/sysaux.300.697649963
+DATA/test/datafile/system.299.697649963
+DATA/test/datafile/users.302.697649965
+DATA/test/datafile/users.309.697650601

 mkdir /app/oracle/TEST

Some tablespaces may have multiple datafiles which would end up with the same name using the code above. However as a quick circumvention I manually identified those tablespaces using this sql and then edited the new data file names

select tablespace_name,count(file_name) from dba_data_files group by tablespace_name ORDER BY 2 

In my example it only applied to the USERS tablespace and I edited the second ASM datafile to be mapped to users02.dbf not users01.dbf as the script will output. This action would have to be repeated for all tablespaces with multiple datafiles

sqlplus / as sysdba

set lines 120 pages 0

select 'copy datafile '||file_id||' to '||'''/app/oracle/TEST/'||substr(file_name,21,instr(file_name,'.')-21)||'_01.dbf'''||';' from dba_data_files order by 1;

copy datafile 1 to '/app/oracle/TEST/system_01.dbf';

copy datafile 2 to '/app/oracle/TEST/sysaux_01.dbf';

copy datafile 3 to '/app/oracle/TEST/undotbs1_01.dbf';

copy datafile 4 to '/app/oracle/TEST/users_01.dbf';

copy datafile 5 to '/app/oracle/TEST/users_02.dbf';

shutdown immediate;

startup mount;

exit

rman target /

copy datafile 1 to '/app/oracle/TEST/system_01.dbf';

copy datafile 2 to '/app/oracle/TEST/sysaux_01.dbf';

copy datafile 3 to '/app/oracle/TEST/undotbs1_01.dbf';

copy datafile 4 to '/app/oracle/TEST/users_01.dbf';

copy datafile 5 to '/app/oracle/TEST/users_02.dbf';

exit

sysdba

select 'alter database rename file '''||file_name||''' to '||'''/app/oracle/TEST/'||substr(file_name,21,instr(file_name,'.')-21)||'_01.dbf'''||';' from dba_data_files order by 1;

alter database rename file '+DATA/test/datafile/sysaux.300.697649963' to '/app/oracle/TEST/sysaux_01.dbf';
alter database rename file '+DATA/test/datafile/system.299.697649963' to '/app/oracle/TEST/system_01.dbf';
alter database rename file '+DATA/test/datafile/undotbs1.301.697649965' to '/app/oracle/TEST/undotbs1_01.dbf';
alter database rename file '+DATA/test/datafile/users.302.697649965' to '/app/oracle/TEST/users_01.dbf';
alter database rename file '+DATA/test/datafile/users.309.697650601' to '/app/oracle/TEST/users_02.dbf';

open the database and check for files still using ASM

open database

select name from v$controlfile
union
select name from v$tempfile
union
select member from v$logfile
union
select filename from v$block_change_tracking;

 

NAME
--------------------------------------------------------------------------------
+DATA/test/controlfile/current.303.697650047
+DATA/test/onlinelog/group_1.304.697650049
+DATA/test/onlinelog/group_2.305.697650049
+DATA/test/onlinelog/group_3.306.697650049
+DATA/test/tempfile/temp.307.697650095
+FRA/test/controlfile/current.260.697650047
+FRA/test/onlinelog/group_1.261.697650049
+FRA/test/onlinelog/group_2.262.697650049
+FRA/test/onlinelog/group_3.263.697650051

alter database rename file '+DATA/test/datafile/sysaux.300.697649963' to '/app/oracle/TEST/sysaux_01.dbf';

alter database rename file '+DATA/test/datafile/system.299.697649963' to '/app/oracle/TEST/system_01.dbf';

alter database rename file '+DATA/test/datafile/undotbs1.301.697649965' to '/app/oracle/TEST/undotbs1_01.dbf';

alter database rename file '+DATA/test/datafile/users.302.697649965' to '/app/oracle/TEST/users_01.dbf';

alter database rename file '+DATA/test/datafile/users.309.697650601' to '/app/oracle/TEST/users_02.dbf';

ALTER DATABASE ADD LOGFILE GROUP 4 ('/app/oracle/TEST/redo_04a.dbf','/app/oracle/TEST/redo_04b.dbf') size 50M;

ALTER DATABASE ADD LOGFILE GROUP 5 ('/app/oracle/TEST/redo_05a.dbf','/app/oracle/TEST/redo_05b.dbf') size 50M;

ALTER DATABASE ADD LOGFILE GROUP 6 ('/app/oracle/TEST/redo_06a.dbf','/app/oracle/TEST/redo_06b.dbf') size 50M; 

You will need to perform logfile switches or checkpoint before the log groups using ASM are available for removal. 

SQL>

show parameter control

 control_files +DATA/test/controlfile/current .303.697650047, +FRA/test/cont rolfile/current.260.697650047

create a pfile from spfile and edit the pfile to show new placement for the controlfiles

*.control_files='/app/oracle/TEST/control01.ctl','/app/oracle/TEST/control02.ctl'

startup nomount

$rman target / 

Recovery Manager: Release 11.1.0.7.0 - Production on Wed Sep 16 16:36:15 2009 connected to target database:

TEST (not mounted) using target database control file instead of recovery catalog

 restore controlfile from '+DATA/test/controlfile/current.303.697650047';

Starting restore at 16-SEP-09 allocated channel: ORA_DISK_1 channel ORA_DISK_1: SID=70 device type=DISK channel ORA_DISK_1: copied control file copy output file name=/app/oracle/TEST/control01.ctl output file name=/app/oracle/TEST/control02.ctl

Finished restore at 16-SEP-09

 exit

Recovery Manager complete.

 

create spfile='/app/oracle/product/11.1.0/db_1/dbs/spfileTEST.ora' from pfile='/home/oracle/john.ora';

File created.
shutdown immediate; 
startup

show parameter control

control_files  /app/oracle/TEST/control01.ctl , /app/oracle/TEST/control02.ctl 

All that is left is to recreate the temp tablespace to use filesystem storage.

startup mount;

RMAN> run { set newname for tempfile 1 to '/app/oracle/TEST/temp01.dbf'; switch tempfile all;}

executing command:

SET NEWNAME renamed tempfile 1 to /app/oracle/TEST/temp01.dbf in control file  rman target /

 TEST (not mounted) using target database control file instead of recovery catalog RMAN> restore controlfile from '+DATA/test/controlfile/current.303.697650047';

Starting restore at 16-SEP-09 allocated channel: ORA_DISK_1 channel ORA_DISK_1: SID=69 device type=DISK channel ORA_DISK_1:

copied control file copy output file

name=/app/oracle/TEST/control01.ctl output file name=/app/oracle/TEST/control02.ctl

Finished restore at 16-SEP-09

All should be complete now. As I said, this is just a hint on what actions to take and the sql scripts should be useful if you have lots of datafiles.

All that is left is to create a new  block_change_tracking file and the ASM storage can be released.

h1

Oracle 11G Release 2 available (Linux only though)

September 2, 2009

Linux only at the moment

 http://www.oracle.com/technology/products/database/oracle11g/index.html

  11GR2 documentation

 http://www.oracle.com/pls/db112/homepage

 11GR2 new features

 http://download.oracle.com/docs/cd/E11882_01/server.112/e10881/toc.htm

  new features index page shows the following (of which I will be looking at ASM first)

 1.1 Application Development

1.10.4.3 XMLIndex Enhancements

h1

ASM space marked as internal – use check all repair

July 30, 2009

The problem was that the FRA diskgroup seemed to be using a lot of space and and yet there was hardly  files on  disk as far as I could tell. HP Itanium – 11.1.0.7

ASM Disk Groups
===============

     Group Group Name                State           Type      Total GB Free GB
---------- ------------------------- --------------- ------- ---------- -------
         1 DATA                      MOUNTED         EXTERN        2581     109
         2 FRA                       MOUNTED         EXTERN        1602     127

 
Group Disk Header    Mode     State           Redundancy   Total MB    Free MB Disk Name                      Failure Gr
----- ---- --------- -------- --------------- ---------- ---------- ---------- ------------------------------ ----------
    2    1 MEMBER    ONLINE   NORMAL          UNKNOWN         91138          0 FRA_0001                       FRA_0001
    2    2 MEMBER    ONLINE   NORMAL          UNKNOWN         91138          0 FRA_0002                       FRA_0002
    2    4 MEMBER    ONLINE   NORMAL          UNKNOWN         91138          0 FRA_0004                       FRA_0004
    2    5 MEMBER    ONLINE   NORMAL          UNKNOWN         91138          0 FRA_0005                       FRA_0005
    2    6 MEMBER    ONLINE   NORMAL          UNKNOWN         91138          0 FRA_0006                       FRA_0006
    2    7 MEMBER    ONLINE   NORMAL          UNKNOWN         91138          0 FRA_0007                       FRA_0007
    2    8 MEMBER    ONLINE   NORMAL          UNKNOWN         91138          0 FRA_0008                       FRA_0008
    2    9 MEMBER    ONLINE   NORMAL          UNKNOWN         91138          0 FRA_0009                       FRA_0009
    2   10 MEMBER    ONLINE   NORMAL          UNKNOWN         91138          0 FRA_0010                       FRA_0010
    2   11 MEMBER    ONLINE   NORMAL          UNKNOWN         91138          0 FRA_0011                       FRA_0011
    2   12 MEMBER    ONLINE   NORMAL          UNKNOWN         91138          0 FRA_0012                       FRA_0012
    2   13 MEMBER    ONLINE   NORMAL          UNKNOWN         91138          0 FRA_0013                       FRA_0013
    2   14 MEMBER    ONLINE   NORMAL          UNKNOWN         91138          0 FRA_0014                       FRA_0014
    2   15 MEMBER    ONLINE   NORMAL          UNKNOWN         91138          0 FRA_0015                       FRA_0015
    2   16 MEMBER    ONLINE   NORMAL          UNKNOWN         91138          0 FRA_0016                       FRA_0016
    2   17 MEMBER    ONLINE   NORMAL          UNKNOWN         91138          0 FRA_0017                       FRA_0017
    2   18 MEMBER    ONLINE   NORMAL          UNKNOWN         91138      65242 FRA_0018                       FRA_0018
    2   19 MEMBER    ONLINE   NORMAL          UNKNOWN         91138      65112 FRA_0019                       FRA_0019

 I had removed one disk from FRA and forced a rebalance but that still did not release space. I had checked the FRA disks using asmcmd but that only showed 2 online logs which were 2Gb each (we were not in archive log mode).

I had checked the diskgroup using the EM Gui which issued the command  

ALTER DISKGROUP FRA CHECK ALL:

but still no success. Looking at the documentation the check command taking NOREPAIR as the default should have reported any issues into the ASM alert log.  Which indeed it did (and I would have seen them if I had bothered to look)

Wed Jul 29 16:37:55 2009
SQL&amp;amp;amp;amp;gt; ALTER DISKGROUP FRA CHECK DISK FRA_0001,FRA_0002,FRA_0004,FRA_0005,FRA_0006,FRA_0007,FRA_0008,FRA_0009,FRA_0010,FRA_0011,FRA_0012,FRA_0013,FRA_0014,FRA_0015,FRA_0016,FRA_0017,FRA_0018,FRA_0019
WARNING: Deprecated privilege SYSDBA for command 'ALTER DISKGROUP CHECK'
kfdp_checkDsk(): 33
NOTE: disk FRA_0001, used AU total mismatch: DD=4294965892, AT=2283
kfdp_checkDsk(): 34
NOTE: disk FRA_0002, used AU total mismatch: DD=4294966012, AT=2283
kfdp_checkDsk(): 35
NOTE: disk FRA_0004, used AU total mismatch: DD=4294966210, AT=2283
kfdp_checkDsk(): 36
NOTE: disk FRA_0005, used AU total mismatch: DD=4294966625, AT=2283
kfdp_checkDsk(): 37
NOTE: disk FRA_0006, used AU total mismatch: DD=4294965615, AT=2280
kfdp_checkDsk(): 38
NOTE: disk FRA_0007, used AU total mismatch: DD=4294964292, AT=2283
kfdp_checkDsk(): 39
NOTE: disk FRA_0008, used AU total mismatch: DD=4294967237, AT=2284
kfdp_checkDsk(): 40
NOTE: disk FRA_0009, used AU total mismatch: DD=4294965595, AT=2280
kfdp_checkDsk(): 41
NOTE: disk FRA_0010, used AU total mismatch: DD=4294966758, AT=2279
kfdp_checkDsk(): 42
NOTE: disk FRA_0011, used AU total mismatch: DD=4294966324, AT=2279
kfdp_checkDsk(): 43
NOTE: disk FRA_0012, used AU total mismatch: DD=4294966743, AT=2279
kfdp_checkDsk(): 44
NOTE: disk FRA_0013, used AU total mismatch: DD=4294964514, AT=2278
kfdp_checkDsk(): 45
NOTE: disk FRA_0014, used AU total mismatch: DD=4294965734, AT=2278
kfdp_checkDsk(): 46
NOTE: disk FRA_0015, used AU total mismatch: DD=4294965204, AT=2279
kfdp_checkDsk(): 47
NOTE: disk FRA_0016, used AU total mismatch: DD=4294965599, AT=2279
kfdp_checkDsk(): 48
NOTE: disk FRA_0017, used AU total mismatch: DD=4294964950, AT=2280
kfdp_checkDsk(): 49
NOTE: disk FRA_0018, used AU total mismatch: DD=25896, AT=2278
kfdp_checkDsk(): 50
NOTE: disk FRA_0019, used AU total mismatch: DD=26026, AT=2280
WARNING: deprecated use of ALTER DISKGROUP CHECK arguments
SUCCESS: ALTER DISKGROUP FRA CHECK DISK FRA_0001,FRA_0002,FRA_0004,FRA_0005,FRA_0006,FRA_0007,FRA_0008,FRA_0009,FRA_0010,FRA_0011,FRA_0012,FRA_0013,FRA_0014,FRA_0015,FRA_0016,FRA_0017,FRA_0018,FRA_0019

From the command line I then ran the comand

ALTER DISKGROUP FRA CHECK ALL REPAIR;

and I had immediate success. It took about 1 minute and corrected the issues and then released the space that had been grabbed internally.

     Group Group Name                State           Type      Total GB Free GB
---------- ------------------------- --------------- ------- ---------- -------
         1 DATA                      MOUNTED         EXTERN        2581     109
         2 FRA                       MOUNTED         EXTERN  1602   1,562

Group Disk Header    Mode     State           Redundancy   Total MB    Free MB Disk Name                      Failure Gr
----- ---- --------- -------- --------------- ---------- ---------- ---------- ------------------------------ ----------
    2    1 MEMBER    ONLINE   NORMAL          UNKNOWN         91138      88855 FRA_0001                       FRA_0001
    2    2 MEMBER    ONLINE   NORMAL          UNKNOWN         91138      88855 FRA_0002                       FRA_0002
    2    4 MEMBER    ONLINE   NORMAL          UNKNOWN         91138      88855 FRA_0004                       FRA_0004
    2    5 MEMBER    ONLINE   NORMAL          UNKNOWN         91138      88855 FRA_0005                       FRA_0005
    2    6 MEMBER    ONLINE   NORMAL          UNKNOWN         91138      88858 FRA_0006                       FRA_0006
    2    7 MEMBER    ONLINE   NORMAL          UNKNOWN         91138      88855 FRA_0007                       FRA_0007
    2    8 MEMBER    ONLINE   NORMAL          UNKNOWN         91138      88854 FRA_0008                       FRA_0008
    2    9 MEMBER    ONLINE   NORMAL          UNKNOWN         91138      88858 FRA_0009                       FRA_0009
    2   10 MEMBER    ONLINE   NORMAL          UNKNOWN         91138      88859 FRA_0010                       FRA_0010
    2   11 MEMBER    ONLINE   NORMAL          UNKNOWN         91138      88859 FRA_0011                       FRA_0011
    2   12 MEMBER    ONLINE   NORMAL          UNKNOWN         91138      88859 FRA_0012                       FRA_0012
    2   13 MEMBER    ONLINE   NORMAL          UNKNOWN         91138      88860 FRA_0013                       FRA_0013
    2   14 MEMBER    ONLINE   NORMAL          UNKNOWN         91138      88860 FRA_0014                       FRA_0014
    2   15 MEMBER    ONLINE   NORMAL          UNKNOWN         91138      88859 FRA_0015                       FRA_0015
    2   16 MEMBER    ONLINE   NORMAL          UNKNOWN         91138      88859 FRA_0016                       FRA_0016
    2   17 MEMBER    ONLINE   NORMAL          UNKNOWN         91138      88858 FRA_0017                       FRA_0017
    2   18 MEMBER    ONLINE   NORMAL          UNKNOWN         91138      88860 FRA_0018                       FRA_0018
    2   19 MEMBER    ONLINE   NORMAL          UNKNOWN         91138      88858 FRA_0019                       FRA_0019

 

SQL&amp;amp;amp;amp;gt; alter diskgroup fra check all repair
NOTE: starting check of diskgroup FRA
kfdp_checkDsk(): 51
WARNING: disk FRA_0001, changing DD used AUs from 4294965892 to 2283
kfdp_checkDsk(): 52
WARNING: disk FRA_0002, changing DD used AUs from 4294966012 to 2283
kfdp_checkDsk(): 53
WARNING: disk FRA_0004, changing DD used AUs from 4294966210 to 2283
kfdp_checkDsk(): 54
WARNING: disk FRA_0005, changing DD used AUs from 4294966625 to 2283
kfdp_checkDsk(): 55
WARNING: disk FRA_0006, changing DD used AUs from 4294965615 to 2280
kfdp_checkDsk(): 56
WARNING: disk FRA_0007, changing DD used AUs from 4294964292 to 2283
kfdp_checkDsk(): 57
WARNING: disk FRA_0008, changing DD used AUs from 4294967237 to 2284
kfdp_checkDsk(): 58
WARNING: disk FRA_0009, changing DD used AUs from 4294965595 to 2280
kfdp_checkDsk(): 59
WARNING: disk FRA_0010, changing DD used AUs from 4294966758 to 2279
kfdp_checkDsk(): 60
WARNING: disk FRA_0011, changing DD used AUs from 4294966324 to 2279
kfdp_checkDsk(): 61
WARNING: disk FRA_0012, changing DD used AUs from 4294966743 to 2279
kfdp_checkDsk(): 62
WARNING: disk FRA_0013, changing DD used AUs from 4294964514 to 2278
kfdp_checkDsk(): 63
WARNING: disk FRA_0014, changing DD used AUs from 4294965734 to 2278
kfdp_checkDsk(): 64
WARNING: disk FRA_0015, changing DD used AUs from 4294965204 to 2279
kfdp_checkDsk(): 65
WARNING: disk FRA_0016, changing DD used AUs from 4294965599 to 2279
kfdp_checkDsk(): 66
WARNING: disk FRA_0017, changing DD used AUs from 4294964950 to 2280
kfdp_checkDsk(): 67
WARNING: disk FRA_0018, changing DD used AUs from 25896 to 2278
kfdp_checkDsk(): 68
WARNING: disk FRA_0019, changing DD used AUs from 26026 to 2280
SUCCESS: check of diskgroup FRA found no errors
WARNING: deprecated use of ALTER DISKGROUP CHECK arguments
SUCCESS: alter diskgroup fra check all repair
Thu Jul 30 10:31:33 2009

 

I would have expected the gui to have reported errors but it didn’t. Equally I can see no good reason  for not fixing errors it identified when a check is run. That was the default in 10G  so  there must be a good reason why it was changed. Perhaps it had a performance impact which I would not have noticed on a pretty empty diskgroup. The other reason is that it might make the assumption that the DBA would want a good backup before repairing.

As we had nothing in there and we backup ASM metadata every day then I was happy to go ahead. http://jhdba.wordpress.com/2009/06/11/script-to-backup-asm-metadata/

 So the lessons learned are

  1. Use the check all feature
  2. Check the alert log afterwards
  3. Then use the check all repair statement from command line
h1

bug 8525592 – SGA memory leak during db shutdown

June 25, 2009

10 days ago I posted regarding informational messages in trace files and how they seem to have increased with the advent of 11G. See http://jhdba.wordpress.com/2009/06/16/informational-trace-files-in-11g/

I referred to 3 messages, one of which I said I was going to raise a SR against. The trace file contained the line  “Updating SGAs in kzam_check_limit”
and the full trace showed

 Updating SGAs in kzam_check_limit
No rows for property:16, trail_type:12
Resetting MaxSize to default values for OS audit files
Resetting MaxSize to default values for XML audit files
No rows for property:17, trail_type:12
Resetting MaxAge to default values for OS audit files
Resetting MaxAge to default values for XML audit files
No rows for property:24, trail_type:15

The feedback from Oracle was that this was unpublished bug  8525592 – SGA MEMORY LEAK ERROR DURING THE DB SHUTDOWN and that there was no cause for concern.

Now this was appearing on an 11.1.0.7 database on HPUX Itanium and it is the only instance out of around 25 where it is appearing. Co-incidentally we have a  memory leak on that server and we have an ongoing SR open with Oracle. I have alerted my colleague who owns that SR and he will talk to Oracle and see if there is any match between the two issues.

In the meantime I have been able to get a bit more information on what bug 8525592 is about. Apparently it is not a serious issue and isn’t really a memory leak.  It’s to do freeing up shared pool when there are multiple subpools.  If there are child/parent handles split between subpools, it’s possible for the parent one not to be cleared properly.  In normal operation there’s a houskeeping routine that SMON does periodically that sorts this out but if the database is shutdown before that runs, it looks like a memory leak.  It’s only in 11.1.0.x and is not considered serious. Bug 8525592 leads me to bug 8586531 and base bug 7453413

h1

informational trace files in 11g

June 16, 2009

Since Oracle 11G there seems to be many more informational trace files generated than ever before. Perhaps I am noticing more but I will provide three examples in support of what I am saying.

kcrroda: calling ksfdrcres to create AL or RL

Just a warning but it is due to a bug in 11.1.0.6 (still not fixed in 11.1.0.7) that causes diagnostic warnings to be issued after an archivelog switch. Bug 6910132 for anyone who is interested

From the same database I see trace files with the following lines in

Updating SGAs in kzam_check_limit
No rows for property:16, trail_type:12
Resetting MaxSize to default values for OS audit files
Resetting MaxSize to default values for XML audit files
No rows for property:17, trail_type:12
Resetting MaxAge to default values for OS audit files
Resetting MaxAge to default values for XML audit files
No rows for property:24, trail_type:15

I cannot find any information on this at all but I do suspect that it is another diagnostic setting that has been left enabled within the code
I will raise an SR on this matter but I don’t think it is anything for me to be worried by

From another database (again 11.1.0.7) I see numerous trace files containing the line

qerfxGCol:KQFDTTIM – Error converting to LdiDateArray

Metalink note 577674.1 shows that it is yet another trace output than can be ignored
Applies to:
Oracle Server – Enterprise Edition – Version: 11.1.0.6
This problem can occur on any platform.

Symptoms
After upgraded to 11.1.0.6.0 or created a new database on this version, you noticed following repetitively message at MMON process trace:

“qerfxGCol:KQFDTTIM – Error converting to LdiDateArray”

There is no further errors at alert log.

Changes
Upgraded to 11.1.0.6.0.

Cause
This issue has been introduced in 11gR1 11.1.0.6.0 with the new view v$persistent_queues, where a ’select * from v$persistent_queues;’ reproduces the error when there is a ‘null’ value for one of timestamp
columns in that view. This behavior does *not* cause any data loss; performance impact nor any effect for use of ADDM/AWR tools.

This issue has been reported on following bugs:

Bug 6780602 – TRACEFILES WITH ‘QERFXGCOL:KQFDTTIM – ERROR CONVERTING TO LDIDATEARRAY’
Bug 6921602 – QERFXGCOL:KQFDTTIM – ERROR CONVERTING TO LDIDATEARRAY

Solution
It has been fixed in 11.2.

h1

HPUX_SCHED_NOAGE parameter, improve OLTP throughput

May 27, 2009

Sometimes it is easy to miss the simple stuff and I must admit that although I have managed Oracle on HPUX over many years I have not in the last few years (mostly Solaris and Linux).
Having started work at a solely HP site in the last year I never bothered to read the Installation guide as a tried and tested set of practise were in place.

Recently we were having problems with a Peoplesoft database and we went the whole hog in re-striping disk and re-evaluating everything. A HP specialist mentioned the init.ora parameter HPUX_SCHED_NOAGE which I had not come across before. On investigation the parameter was in place on our 10.2.0.3 databases but set to 0. Looking at some 11g databases I was surprised to see that it had been set also but with the recommended value of 178. Obviously it is now enabled by default on 11g on HPUX

So what does it do. Well in a non Unix specialist way I can describe it as ensuring that all the database processes run at the same priority and therefore do not get aged out as new processes come along and look for resource. Older processes are more likely to stay on the CPU which results in less context switching. On a busy system it is likely that the lgwr process will perform better as it gets more time on the CPU and is not switched out as often to make way for a process that is waiting in the normal time-share manner. It is not expected that gains will be seen in OLTP systems rather than DSS systems as there will be more process competing for resource than in a DSS system where fewer longer running processes will be the norm.

Whilst writing this up I came across I came across a blog that describes it all in a much more technical manner than I can, complete with diagrams Christian Bilien

Metalink note 217990.1 (Mar 2009) states This could be suited to online transaction processing (OLTP) environments because OLTP environments can cause competition for critical resources. Overall performance improvements of 5 to 10% for OLTP applications.

If the parameter setting is out of range, Oracle sets the parameter to a permissible value and continues with the SCHED_NOAGE policy with the new value.

It also generates a message in the alert_sid.log file about the new setting. Oracle Corporation recommends that you set the parameter to the required priority level for Oracle processes.

The Oracle manual also shows you how to set the unix pre-reqs for the HPUX_SCHED_NOAGE init.ora parameter to work

To permit Oracle Database to use the SCHED_NOAGE scheduling policy, the OSDBA group (typically, the dba group) must have the RTSCHED and RTPRIO privileges to change the scheduling policy and set the priority level for Oracle processes. To give the dba group these privileges:

1. Log in as the root user.

2. Using any text editor, open the /etc/privgroup file, or create it if necessary.

3. Add or edit the following line, which begins with the name of the OSDBA group, specifying the privileges RTPRIO and RTSCHED that you want to grant to this group every time the system restarts:

4. Save the file, and quit the text editor.

5. Enter the following command to grant the privileges to the OSDBA group:
# /usr/sbin/setprivgrp -f /etc/privgroup

Enter the following command to verify that the privileges are set correctly:

# /usr/sbin/getprivgrp dba

So did it make any difference to us. I can only give an example of how it helped a specific problem for us. On a   Peoplesoft HR system we made 2 changes, bad practise I know but time constraints forced it upon us. We added the scheduling parameter and also reduced the filecache_min and filecache_max kernel parameters from 3%,5 % to 1%,2%. These parameters control the amount of physical memory used for caching file I/O data and we were tight on memory anyway. On a specific time and labour batch processing benchmark we saw an immediate improvement of 12% which was very acceptable.

PS to show the process in Glance use ’s’ then select any PID associated with the database and note in the left hand column that scheduler=NOAGE and Priority=178

HP_SCHED_NOAGE

h1

Purging ADR log and trace files automatically

May 13, 2009

The following script was put together by a colleague, David Sutton, and shows a way to create a script to tidy up the logs and trace files generated in 11G. Note that the purge time is in seconds and this keep s 30 days online. This is probably too many on some of our systems and this is likely to be reduced to 14 days although we will keep incident history for longer

This is called each day from cron (or it could be run from dbms_scheduler, either directly or via an OEM job) 

 

#!/usr/bin/ksh
#
#
#
# Description
# Shell script to interact with the ADR through ADRCI and purge ADR contents
#
# Tunables
###
LOCKFILE=/tmp/mor_adrci_purge_.lck

###

######
# Start Of Functions
#
# tidyup – common fuction called if error has occured
tidyup () {
        rm -f ${LOCKFILE}
        echo “ERROR: Purge aborted at `date` with exit code ${ERR}”
        exit ${ERR}
}

######
# End Of Functions

### Main Program

# Check user is oracle
USERID=`/usr/bin/id -u -nr`
if [ $? -ne 0 ]
then
        echo “ERROR: unable to determine uid”
        exit 99
fi
if [ "${USERID}" != "oracle" ]
then
        echo “ERROR: This script must be run as oracle”
        exit 98
fi

echo “INFO: Purge started at `date`”

# Check if lockfile exists
if [ -f ${LOCKFILE} ]
then
        echo “ERROR: Lock file already exists”
        echo “       Purge already active or incorrectly terminated”
        echo “       If you are sure tidy isn’t active, please remove “
        echo “       ${LOCKFILE}”
        exit 97
fi

# Create lock file
touch ${LOCKFILE} 2>/dev/null
if [ $? -ne 0 ]
then
        echo “ERROR: Unable to create lock file”
        exit 96
fi

# Purge ADR contents
echo “INFO: adrci purge started at `date`”
/app/oracle/product/11.1.0/db_1/bin/adrci exec=”show homes”|grep -v : | while read file_line
do
  echo “INFO: adrci purging diagnostic destination ” $file_line
  echo “INFO: purging ALERT older than 90 days ….”
  /app/oracle/product/11.1.0/db_1/bin/adrci exec=”set homepath $file_line;purge -age 129600 -type ALERT”
  echo “INFO: purging INCIDENT older than 30 days ….”
  /app/oracle/product/11.1.0/db_1/bin/adrci exec=”set homepath $file_line;purge -age 43200 -type INCIDENT”
  echo “INFO: purging TRACE older than 30 days ….”
  /app/oracle/product/11.1.0/db_1/bin/adrci exec=”set homepath $file_line;purge -age 43200 -type TRACE”
  echo “INFO: purging CDUMP older than 30 days ….”
  /app/oracle/product/11.1.0/db_1/bin/adrci exec=”set homepath $file_line;purge -age 43200 -type CDUMP”
  echo “INFO: purging HM older than 30 days ….”
  /app/oracle/product/11.1.0/db_1/bin/adrci exec=”set homepath $file_line;purge -age 43200 -type HM”
  echo “”
  echo “”
done

echo
echo “INFO: adrci purge finished at `date`”

# All completed
rm -f ${LOCKFILE}
echo “SUCC: Purge completed successfully at `date`”

exit 0