Oracle DBA – A lifelong learning experience

Archive for the ‘ASM’ Category

ASM – Adding and dropping disks in one command

Posted by John Hallas on May 29, 2012

One ASM feature that is not widely documented is the ability to add disks, drop disks and set the rebalance power all in a single command.

One might wonder when you would be both adding and dropping disks from the same disk group at the same time. Recently we have been migrating to a new storage array, and this command comes in very useful in that situation.

alter diskgroup data
  add disk  '/dev/oracle/disk100', '/dev/oracle/disk101', '/dev/oracle/disk102'
  drop disk DATA_0197, DATA_0198, DATA_0199    -- ASM disk names (illustrative) for /dev/oracle/disk197-199
  rebalance power 5;
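For reference, ADD DISK takes the device path whereas DROP DISK expects the ASM disk name rather than the path. If you are unsure of the names, they can be looked up first with a simple query along these lines (illustrative only):

select name, path
from   v$asm_disk
where  group_number = (select group_number
                       from   v$asm_diskgroup
                       where  name = 'DATA');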

 

Posted in ASM | Tagged: , , , , | 5 Comments »

Test Case for 11gR2 Role Separation issue on HP-UX – help wanted

Posted by John Hallas on May 6, 2012

This occurs on any HP-UX (11.31) system where Grid Infrastructure has been installed with two software owners (in our case grid and oracle), which is Oracle's best practice for GI implementations, whether RAC or standalone.

The standard Unix account we are using is testuser (although in this case it is in the DBA group).

This user is a member of the following groups:

 uid=664(testuser) gid=500(dba)

 Test Server       = server

Test database     = dbatest

Log in as the testuser user

1. Use setsid to set the Oracle environment to dbatest

The ASM LUNs are owned by the grid user, as shown below:

  

[testuser@server][dbatest]/dev/grid $ls -ltr
total 0
crw-rw----   1 grid       asmdba      13 0x00000a Apr 27 12:38 disk002
crw-rw----   1 grid       asmdba      13 0x000009 Apr 27 22:00 disk001
crw-rw----   1 grid       asmdba      13 0x00000b Apr 28 12:44 disk003

Log on to the Unix server as testuser

Log into the database with "sqlplus / as sysdba"

Set the database environment variables as appropriate (ORACLE_HOME, ORACLE_SID, PATH)

Bounce the database to clear out the cache (a shared pool flush would also work).
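For reference, either form of cache clearing can be done from SQL*Plus; a trivial illustration:

-- flush the shared pool without a restart
alter system flush shared_pool;

-- or bounce the instance
shutdown immediate
startup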

Posted in ASM, Oracle | Tagged: , , , | 2 Comments »

Huge ASM rbal trace file with text UNIDENT OF DISK

Posted by John Hallas on May 3, 2012

You may notice a large and growing trace file in the diagnostic destination which contains numerous lines starting with the phrase "NOTE: Unident of disk" followed by a disk path.
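If you want to confirm where the file lives, the trace directory can be found from the ASM instance; a simple illustrative query (the exact trace file name will vary by instance and process id):

select value from v$diag_info where name = 'Diag Trace';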

*** 2012-04-17 18:31:10.759
NOTE:Unident of disk:/dev/oracle_hr/rdisk130
NOTE:Unident of disk:/dev/oracle_hr/rdisk131
NOTE:Unident of disk:/dev/oracle_hr/rdisk132
NOTE:Unident of disk:/dev/oracle_hr/rdisk133
NOTE:Unident of disk:/dev/oracle_hr/rdisk134
NOTE:Unident of disk:/dev/oracle_hr/rdisk135
NOTE:Unident of disk:/dev/oracle_hr/rdisk136
NOTE:Unident of disk:/dev/oracle_hr/rdisk137
NOTE:Unident of disk:/dev/oracle_hr/rdisk138
NOTE:Unident of disk:/dev/oracle_hr/rdisk139
NOTE:Unident of disk:/dev/oracle_hr/rdisk129

*** 2012-04-17 19:05:20.540
NOTE:Unident of disk:/dev/oracle_hr/rdisk130
NOTE:Unident of disk:/dev/oracle_hr/rdisk131
NOTE:Unident of disk:/dev/oracle_hr/rdisk132
NOTE:Unident of disk:/dev/oracle_hr/rdisk133
NOTE:Unident of disk:/dev/oracle_hr/rdisk134
NOTE:Unident of disk:/dev/oracle_hr/rdisk135
NOTE:Unident of disk:/dev/oracle_hr/rdisk136
NOTE:Unident of disk:/dev/oracle_hr/rdisk137
NOTE:Unident of disk:/dev/oracle_hr/rdisk138
NOTE:Unident of disk:/dev/oracle_hr/rdisk139
NOTE:Unident of disk:/dev/oracle_hr/rdisk129

Posted in ASM, Oracle | Tagged: , , , , | Leave a Comment »

Curing unevenly balanced ASM diskgroups to reduce poor file distribution

Posted by John Hallas on May 2, 2012

Back in November 2011 I posted a question on the Oracle-L mailing list about my perception that ASM disk rebalances seemed to be required on DATA diskgroups (never FRA, presumably because the FRA holds lots of similarly sized objects such as flashback logs and archive logs), and that even after rebalancing there seemed to be a permanent imbalance between disks of the same size. I was questioning the effectiveness of the rebalance operation.

There were some good responses, including a script that Dave Herring had created, based on key MOS articles: 818171.1 (Identifying Files with Imbalances), 351117.1 (Troubleshooting ASM Space Issues) and 367445.1 (Advanced Balance and Space Report on ASM). Manual rebalancing would not normally be required, because ASM automatically rebalances disk groups when their configuration changes. You might want to perform a manual rebalance if you want to control the speed of what would otherwise be an automatic rebalance operation.

I have been chasing this down and have now come across Bug 7699985: UNBALANCED DISTRIBUTION OF FILES ACROSS DISKS.

It appears to be a problem in 11.1.0.7 and I know a number of sites have reported it. Despite repeated manual rebalance operations there can be a variance of up to 10% between different disks of the same diskgroup, even if they are the same size. It is fixed in 11.2.0.1. The workaround is to set the _asm_imbalance_tolerance parameter to 0 rather than the default of 3; this controls the hundredths of a percentage of inter-disk imbalance to tolerate. Then rebalance the disks manually and reset the parameter back to 3, as you do not need to leave it balancing all the time (a sketch of the sequence is shown below).
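As a rough sketch, the full workaround sequence looks like this (the rebalance power is just an example, and V$ASM_OPERATION should show no outstanding rebalance before the parameter is reset):

alter system set "_asm_imbalance_tolerance"=0;
alter diskgroup data rebalance power 4;

-- wait for the rebalance to complete
select group_number, operation, state, power, est_minutes
from   v$asm_operation;

alter system set "_asm_imbalance_tolerance"=3;

The output below shows the before and after picture on one of our diskgroups.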

SQL> @asm_imbalance   (from "Report the Percentage of Imbalance in all Mounted Diskgroups", Doc ID 367445.1)

@asm_imbalance.sql

                                  Percent     Percent    Minimum
                                             Disk Size    Percent    Disk  Diskgroup
Diskgroup Name                  Imbalance     Variance       Free   Count  Redundancy
------------------------------  ---------  -----------  ---------  ------  ----------
DATA                                  9.0           .5       15.2      84  EXTERN

SYS@+ASM SQL>l
    Select dg.name,dg.allocation_unit_size/1024/1024 "AU(Mb)",min(d.free_mb) Min,
    max(d.free_mb) Max, avg(d.free_mb) Avg
    from v$asm_disk d, v$asm_diskgroup dg
    where d.group_number = dg.group_number
    group by dg.name, dg.allocation_unit_size/1024/1024
SYS@+ASM SQL>/

NAME                               AU(Mb)        MIN        MAX    AVG
------------------------------ ---------- ---------- ---------- ------
DATA                                    1       7728      11599   8026

alter diskgroup data rebalance power 4;

NAME                               AU(Mb)        MIN        MAX    AVG
------------------------------ ---------- ---------- ---------- ------
DATA                                    1       7734      11483   8026   -- no difference

alter system set "_asm_imbalance_tolerance"=0;
alter diskgroup data rebalance power 4;
NAME                               AU(Mb)        MIN        MAX        AVG
------------------------------ ---------- ---------- ---------- ----------
DATA                                    1       8020       8062   8026   -- now nicely rebalanced

Posted in ASM, Oracle | Tagged: , , , , , , | 4 Comments »

The Mother of all ASM scripts

Posted by John Hallas on March 6, 2012

Back in 2009 I posted a script which I found very useful for reviewing ASM disks. I gave that post the low-key title of The ASM script of all ASM scripts. Now that the script has been improved I have to go a bit further with the hyperbole, so we have The Mother of all ASM scripts. If it ever gets improved again then the next post will just be called 'Who's the Daddy'.

I have been using the current script across all our systems for the last 3 years and I find it very useful. A colleague, Allan Webster, has added a couple of improvements and it is now better than before.

The improvements show current disk I/O statistics and a breakdown of the types of files in each disk group, along with the total size of each file type. The I/O statistics are useful when you have a lot of databases, many of which are test and development systems that you do not look at that often. It gives a quick overview that allows you to get a feel for whether anything is wrong and to see what the system is actually doing. There are also a few comments at the beginning defining the various ASM views available.
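Purely to give a flavour of the sort of queries involved (this is not the script itself), the I/O statistics and file type breakdown come from V$ASM_DISK and V$ASM_FILE along these lines:

-- disk I/O statistics summarised per diskgroup
select dg.name, sum(d.reads) reads, sum(d.writes) writes,
       sum(d.bytes_read) bytes_read, sum(d.bytes_written) bytes_written
from   v$asm_disk d, v$asm_diskgroup dg
where  d.group_number = dg.group_number
group  by dg.name;

-- file type breakdown per diskgroup
select dg.name, f.type, count(*) files,
       round(sum(f.bytes)/1024/1024/1024,1) gb
from   v$asm_file f, v$asm_diskgroup dg
where  f.group_number = dg.group_number
group  by dg.name, f.type
order  by dg.name, gb desc;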

Posted in ASM, Oracle, scripts | Tagged: , , | 12 Comments »

OER 27064: cannot perform async I/O to file – HPUX

Posted by John Hallas on February 10, 2010

I was trying to prove that we had a disk I/O issue on a database server, so I ran a set of Orion tests across that server and a number of others for comparison purposes. The servers were HP-UX 11.31 Itanium, using the 11.1.0.7 Orion binaries.

The test I used was to run against raw devices which would normally be assigned to an ASM diskgroup but had either not yet been used or were marked as CANDIDATE or FORMER.

The Orion command I was using was

./orion_hpux_ia64 -run advanced -write 40 -matrix basic -duration 120 -testname hpuxdiskio  -num_disks 2

where the file hpuxdiskio contained two lines of the format /dev/oracle/disk550 and /dev/oracle/disk551.
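Orion normally expects the LUN list in a file named <testname>.lun, so in this case hpuxdiskio.lun, containing one device per line, for example:

/dev/oracle/disk550
/dev/oracle/disk551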

On one server where there were a number of free disks I saw the following error

ORION: ORacle IO Numbers — Version 11.1.0.7.0
hpuxdiskio_20100208_1455
Test will take approximately 31 minutes
Larger caches may take longer

Ioctl ASYNC_CONFIG error, errno = 1
SKGFR Returned Error — Async. read failed on FILE: /dev/oracle/disk550
OER 27064: cannot perform async I/O to file
rwbase_issue_req: lun_aiorq failed on read
rwbase_run_test: rwbase_issue_req failed
rwbase_run_process: rwbase_run_test failed
rwbase_rwluns: rwbase_run_process failed
orion_thread_main: rw_luns failed
Test error occurred
Orion exiting

Searching the net I could not find any clues. I added the same disks to a diskgroup with no problem, so I knew that Oracle could use them and there were no permission or other issues.

Posted in ASM, Oracle | Tagged: , , , , | 3 Comments »

PSU dependency checking with ASM now enforced in 11g

Posted by John Hallas on January 28, 2010

ASM has to be at a version equal to or higher than the highest version of the databases that are using it, and the compatibility settings have to be correct.
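For reference, the relevant diskgroup compatibility attributes are easy to check from the ASM instance; a simple illustrative query:

select name, compatibility, database_compatibility
from   v$asm_diskgroup;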

PSU 1 (Oct 2009) did not enforce that requirement. PSU 2 (Jan 2010) does check.

We determined this because we do not always apply the latest PSU against the ASM binaries but we do against the RDBMS code. Today the following sequence of events took place along with the associated error message.


Posted in 11g new features, ASM, Oracle | Tagged: , , , , , | 3 Comments »

Recovering old spfile from autobackup

Posted by John Hallas on January 18, 2010

This post is about a problem I had today when trying to restore an spfile from a previous autobackup. I did not want the latest one but one from earlier in the week. The database was a RAC one but I do not think that is relevant to the problems I had.

Restoring an spfile from autobackup is relatively easy: startup mount and restore spfile from autobackup. I set my date format, started the database up in mount mode and then tried to restore the spfile.
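For context, the basic restore sequence looks something like the sketch below (the DBID and the timestamp are made up for illustration, and SET UNTIL TIME is one way of asking for an autobackup older than the latest):

rman target /
set dbid 1234567890;     -- made-up DBID for illustration
startup nomount;
run {
  set until time "to_date('14-JAN-2010 09:00','DD-MON-YYYY HH24:MI')";
  restore spfile from autobackup;
}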

Posted in ASM, Oracle, RMAN | Tagged: , , , , , , , | 3 Comments »

Much faster performance with ASM – a real world example

Posted by John Hallas on October 9, 2009

We have a PeopleSoft database that manages HR for ~80,000 staff: Oracle 10.2.0.3, PeopleSoft Tools 8.49, PS HR 9, HP-UX 11.31 Itanium.

There had been long-running problems caused by I/O bottlenecks writing down to the Hitachi XP SAN disks. We were using filesystem storage and disk_asynch_io was set to false.
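For reference, the parameter can be checked and changed as below; it is a static parameter, so a change only takes effect after a restart:

SQL> show parameter disk_asynch_io
alter system set disk_asynch_io=true scope=spfile;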

The root cause of our problem was that the Time and Labour batches contend when they need to acquire exclusive access to write dirty blocks to data files during truncate operations. 

David Kurtz has written about a specific PeopleSoft feature around truncate: http://www.go-faster.co.uk/docs.htm#local_write_wait

We migrated to ASM storage and achieved significant benefits.

The testing stages were:

1) Prepare a 2-hour Time and Labour batch run which would generate significant throughput
2) Capture a baseline at 10.2.0.3
3) Upgrade to 10.2.0.4 and retest
4) Upgrade to 10.2.0.4, migrate to ASM (11.1.0.7) and retest
5) Upgrade to 11.1.0.7, migrate to ASM (11.1.0.7) and retest

 

Test 3 did not show any difference at all. We were still blocked on disk, which was expected. We would probably have remained at 10.2.0.3 and just moved to ASM, but there are problems with seeing ASM disks when using 10.2.0.3 (Metalink Note 434500, ORA-15059), so we had the option of either applying a fix to get ASM working with 10.2.0.3 or upgrading to 10.2.0.4. We decided that the benefits of getting onto a terminal release, together with suggestions from Oracle that small changes to the truncate process had been made between 10.2.0.3 and 10.2.0.4, outweighed the overhead of any application testing that we needed to do.

Test 4 gave us big performance benefits. Test 5 was not as clear cut, but we knew some code had to be optimised and we did not have the luxury of time to address the issues. Remaining on 10g minimised the application testing, certainly compared to the testing required to upgrade to 11g.

The benefits can be seen quite clearly in the graph below

[Graph: asynch_io showing batch throughput and elapsed time for the three test runs]

This shows a 2-hour period where a constant throughput of T&L jobs was being processed. In parallel, a constant online load was being simulated.

The black line shows throughput and elapsed time for the baseline run. It consists of a number of jobs which take 2 hours to complete, with batch times around the 450 to 600 second mark.

The same processing is then run with the 10.2.0.4 database and ASM, with disk_asynch_io set to FALSE. The jobs complete 40 minutes faster and elapsed time is around the 300 to 400 second mark, but with spikes and inconsistent performance.

Finally, the orange line shows the 10.2.0.4 database and ASM with disk_asynch_io=TRUE. The processing time has now been reduced to just over an hour and the elapsed time is significantly better at an average of 250 seconds, with a very consistent profile.

As with all things in life we have now moved on to the next bottleneck, which is the flashback buf free wait. This is caused because we have enabled flashback logging, and as the flashback buffer fills up we wait for it to be written out. However, this is a minuscule issue compared to the original problems. We have taken flashback off some tablespaces primarily used for PS temporary tables (aka working storage tables), although that does give us an issue when rebuilding the primary database after a Data Guard failover using flashback logs.
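For reference, flashback logging can be switched off at tablespace level with a command of the form below (the tablespace name is made up); blocks in such a tablespace are then excluded from flashback logging, which is what causes the complication with the flashback-based standby rebuild mentioned above:

alter tablespace ps_work_ts flashback off;    -- tablespace name is illustrative only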

I hope this real-world example is of use.

Posted in ASM, Oracle | Tagged: , , , , , , , , | 1 Comment »

Managing control files in ASM

Posted by John Hallas on September 28, 2009

Just a couple of notes on how to create multiple control files in ASM and how to move control files to a different diskgroup.

Assuming we already have an ASM control file in the DATA diskgroup and we wish to add one to the FRA diskgroup:

 SQL>show parameter control_files
+DATA/SID/controlfile/current.730.698514673
alter system set control_files='+DATA','+FRA' scope=spfile;
shutdown immediate;
startup nomount;
exit
rman target /
restore controlfile from '+DATAHR/MHRPRD1A/CONTROLFILE/current.730.698514673';
alter database mount;
alter database open;
SQL>show parameter control_files
+DATA/SID/controlfile/current.730.698514673, +FRAHR/SID/controlfile/current.766.698514673

Here is an example of how to migrate from one ASM diskgroup to another

Add the new disk discovery path in (if needed), create a diskgroup and add disks

alter system set asm_diskstring='/dev/oracle','/dev/oracle_hr/r*';
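At this point it is worth confirming that ASM can see the new devices before creating the diskgroup; a simple illustrative check:

select path, header_status, os_mb
from   v$asm_disk
where  header_status in ('CANDIDATE','PROVISIONED','FORMER');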

create diskgroup DATANEW  external redundancy disk '/dev/oracle_hr/rdisk1','/dev/oracle_hr/rdisk2';

SQL>show parameter control_files

control_files                        string      +DATA10/SID/controlfile/current.796.695384211
alter system set CONTROL_FILES = '+DATANEW' scope=spfile;

shutdown
rman target /
startup nomount
restore controlfile from '+DATA10/SID/controlfile/current.796.695384211';
alter database mount;
alter database open;

Posted in ASM, Oracle | Tagged: , , | 5 Comments »