Oracle DBA – A lifelong learning experience

Archive for the ‘ASM’ Category

Adding new ASM disks – what is best practice?

Posted by John Hallas on January 15, 2013

According to My Oracle Support note “How To Add a New Disk(s) to An Existing Diskgroup on RAC (Best Practices). [ID 557348.1]”, you should create a test diskgroup using the new storage before adding it to an existing diskgroup. That seems eminently sensible, although it is not something I normally do. It proves you can access the disk, and if there is a conflict (i.e. the disk is already mapped and in use elsewhere) you are not risking your production DATA diskgroup. I have pasted the note info at the bottom of this post, but basically you create a new diskgroup containing the new disk. If all is OK, you drop the test diskgroup and add the new disk to your existing diskgroup.

However, the downside is that you can hit non-published bug:12398300, which is a duplicate of bug:12356910 (also non-published), described in “Diskgroup Mount Hangs with RBAL Waiting on ‘GPnP Get Item’ and ‘enq: DD - contention’ [ID 1375505.1]”. Note: this issue has so far been reported on RAC 11.2.0.2.3 and 11.2.0.3 environments, which is where we saw it (RAC: 11.2.0.3 clusterware, 11.2.0.1 rdbms).

The ALTER DISKGROUP ... MOUNT simply hangs and has to be interrupted (CTRL-C). There are no errors in the ASM alert log.

Killing ora.gpnpd on the node where ASM is blocked in the gpnp wait avoids having to stop the ASM instance. For details, please see Note:1392934.1. Otherwise, restart the ASM instance that is causing the lock condition.

The fix will be included in future 11.2.0.3.x Patch Set Updates (PSUs), but no patch exists yet (at the time this article was written, Nov.9.2011). There will also likely be patch requests for the fix to be included on top of the existing 11.2.0.2 PSU, but none exist yet.

So I think I will stick with what I have always done, best practice or not.

--From Node 1

. oraenv

-- specify ASM instance from node 1

+ASM1

-- sudo -u oracle sqlplus may not work when run the first time so run:

sudo -u oracle ls

sudo -u oracle sqlplus / as sysasm

CREATE DISKGROUP TEST EXTERNAL REDUNDANCY DISK '' [DISK ''];

SELECT STATE, NAME FROM V$ASM_DISKGROUP;

-- from node 2

. oraenv

+ASM2

sudo -u oracle sqlplus / as sysasm

 

ALTER DISKGROUP TEST MOUNT;

SELECT STATE, NAME FROM V$ASM_DISKGROUP;

 

-- if all ok then

-- from node 2

alter diskgroup test dismount;

 

-- from node 1

DROP DISKGROUP TEST;

Now we can add the disk to the desired diskgroup safely.

 

-- From node 1

. oraenv

+ASM1

sudo -u oracle sqlplus / as sysasm

-- check disks visible in v$asm_disk

-- header_status should be CANDIDATE or FORMER

set lines 120 pages 100

column path format a20

SELECT name, path, mode_status, state, header_status, os_mb, free_mb

FROM v$asm_disk ORDER BY name, path;

-- check diskgroups

select GROUP_NUMBER,NAME,STATE,TOTAL_MB,FREE_MB from v$asm_diskgroup;

-- add disks to appropriate diskgroups

alter diskgroup x add disk '/dev/hdiskX';

-- monitor rebalance

set lines 170

select * from v$asm_operation;

select GROUP_NUMBER,NAME,STATE,TOTAL_MB,FREE_MB from v$asm_diskgroup;
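The header_status check in the steps above can be turned into a simple guard before running ADD DISK. The following is only a sketch and is not part of the MOS note: it assumes the path and header_status columns have been spooled to plain text, one disk per line, and the function name and sample paths are made up for illustration.

```shell
# Sketch only: print the paths whose header_status allows them to be added.
# Input (stdin): "<path> <header_status>" per line, e.g. spooled from
#   SELECT path, header_status FROM v$asm_disk;
# addable_disks is a hypothetical helper name, not an Oracle utility.
addable_disks() {
  awk '$2 == "CANDIDATE" || $2 == "FORMER" { print $1 }'
}

# Example with made-up rows: hdisk12 is already a MEMBER of a diskgroup,
# so it is filtered out.
printf '%s\n' \
  '/dev/hdisk10 CANDIDATE' \
  '/dev/hdisk11 FORMER' \
  '/dev/hdisk12 MEMBER' | addable_disks
```

Anything the filter drops deserves a second look before it goes anywhere near a production diskgroup.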

Posted in ASM, Oracle

Managing OCR and voting disks

Posted by John Hallas on November 28, 2012

This is basically a set of notes I wrote for myself about adding new voting disks and OCR disks to a sandpit RAC cluster, as part of testing for a migration between an HP XP disk array and an HP 3PAR disk array. The o/s was HPUX with an 11.1.0.7 database.

 View status of OCR disks and Voting disks

sudo /app/oracle/product/crs/bin/ocrcheck

Status of Oracle Cluster Registry is as follows :

         Version                  :          2

         Total space (kbytes)     :     306972

         Used space (kbytes)      :       5880

         Available space (kbytes) :     301092

         ID                       :  746041401

         Device/File Name         : /dev/oracle/disk500

                                    Device/File integrity check succeeded

         Device/File Name         : /dev/oracle/disk501

                                    Device/File integrity check succeeded

          Cluster registry integrity check succeeded

          Logical corruption check succeeded

crsctl query css votedisk

0.     0    /dev/oracle/disk502

1.     0    /dev/oracle/disk503

2.     0    /dev/oracle/disk504

Located 3 voting disk(s).

Add a new OCR disk

Back up first (the 10GR1 command format still works)

sudo /app/oracle/product/crs/bin/ocrconfig -export /home/oracle/ocr_backup -s online

Owned by root

-rw-------   1 root       sys         136140 Nov 27 08:15 /home/oracle/ocr_backup

As this is an 11GR1 cluster we will use the 11GR1 format

sudo /app/oracle/product/crs/bin/ocrconfig -manualbackup

dhpor43     2012/11/27 08:23:00     /app/oracle/product/crs/cdata/SANDPITR1/backup_20121127_082300.ocr

Listing the backups shows the recent backups

sudo /app/oracle/product/crs/bin/ocrconfig -showbackup

dhpor43     2012/11/27 06:36:27     /app/oracle/product/crs/cdata/SANDPITR1/backup00.ocr

dhpor43     2012/11/27 02:36:27     /app/oracle/product/crs/cdata/SANDPITR1/backup01.ocr

dhpor43     2012/11/26 22:36:27     /app/oracle/product/crs/cdata/SANDPITR1/backup02.ocr

dhpor43     2012/11/25 02:36:27     /app/oracle/product/crs/cdata/SANDPITR1/day.ocr

dhpor43     2012/11/24 06:36:27     /app/oracle/product/crs/cdata/SANDPITR1/week.ocr

dhpor43     2012/11/27 08:23:00     /app/oracle/product/crs/cdata/SANDPITR1/backup_20121127_082300.ocr

I have 3 disks available (all at 1Gb,  which is easily enough for either a voting or OCR disk)

Free ASM disks and their paths

==============================

Header    Mode     Path                      Disk Size

--------- -------- ------------------------- ---------

CANDIDATE ONLINE   /dev/oracle/disk507             1Gb

CANDIDATE ONLINE   /dev/oracle/disk508             1Gb

CANDIDATE ONLINE   /dev/oracle/disk509             1Gb

sudo /app/oracle/product/crs/bin/ocrconfig -replace ocrmirror /dev/oracle/disk507

ocrcheck

Status of Oracle Cluster Registry is as follows :

         Version                  :          2

         Total space (kbytes)     :     306972

         Used space (kbytes)      :       5908

         Available space (kbytes) :     301064

         ID                       :  746041401

         Device/File Name         : /dev/oracle/disk500

                                    Device/File integrity check succeeded

         Device/File Name         : /dev/oracle/disk507   <-- disk replaced (was disk501)

                                    Device/File integrity check succeeded

         Cluster registry integrity check succeeded

However, the replaced disk (disk501) does not reappear as available, although disk507 has been removed from the list of candidate disks

Free ASM disks and their paths

==============================

Header    Mode     Path                      Disk Size

--------- -------- ------------------------- ---------

CANDIDATE ONLINE   /dev/oracle/disk508             1Gb

CANDIDATE ONLINE   /dev/oracle/disk509             1Gb

 

sudo /app/oracle/product/crs/bin/ocrconfig -replace ocr /dev/oracle/disk508

/app/oracle/product/11.1.0/asm/bin $ocrcheck

Status of Oracle Cluster Registry is as follows :

         Version                  :          2

         Total space (kbytes)     :     306972

         Used space (kbytes)      :       5908

         Available space (kbytes) :     301064

         ID                       :  746041401

         Device/File Name         : /dev/oracle/disk508

                                    Device/File integrity check succeeded

         Device/File Name         : /dev/oracle/disk507

                                    Device/File integrity check succeeded
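To confirm which devices the OCR is using after each -replace, the device lines can be pulled out of the ocrcheck output. This is a minimal sketch that assumes the output layout shown above; ocr_devices is a hypothetical helper name, and the sample text is copied from the listing rather than run live.

```shell
# Sketch only: list the OCR locations reported by ocrcheck.
# Reads ocrcheck output on stdin and prints what follows "Device/File Name :".
# On a real node it might be used as:
#   sudo /app/oracle/product/crs/bin/ocrcheck | ocr_devices
ocr_devices() {
  awk -F': *' '/Device\/File Name/ { print $2 }'
}

# Example using the output shown above.
ocr_devices <<'EOF'
         Device/File Name         : /dev/oracle/disk508
                                    Device/File integrity check succeeded
         Device/File Name         : /dev/oracle/disk507
                                    Device/File integrity check succeeded
EOF
```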

Add a new voting disk

sudo /app/oracle/product/crs/bin/crsctl add css votedisk /dev/oracle/disk500 -force
/app/oracle/product/crs/bin/crsctl add css votedisk /dev/oracle/disk509 -force

 Now formatting voting disk: /dev/oracle/disk500.

Successful addition of voting disk /dev/oracle/disk500.

/app/oracle/product/11.1.0/asm/bin $sudo /app/oracle/product/crs/bin/crsctl query css  votedisk

 0.     0    /dev/oracle/disk502

 1.     0    /dev/oracle/disk503

 2.     0    /dev/oracle/disk504

 3.     0    /dev/oracle/disk509

 4.     0    /dev/oracle/disk500

Located 5 voting disk(s).

crsctl delete css votedisk  /dev/oracle/disk509 

Successful deletion of voting disk /dev/oracle/disk509.

crsctl delete css votedisk  /dev/oracle/disk500 

Successful deletion of voting disk /dev/oracle/disk500.

sudo /app/oracle/product/crs/bin/crsctl query css  votedisk

 0.     0    /dev/oracle/disk502

 1.     0    /dev/oracle/disk503

 2.     0    /dev/oracle/disk504

Located 3 voting disk(s).
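When adding and deleting voting disks in a loop like this, it helps to check the list mechanically rather than by eye. The sketch below assumes the 11.1 output format shown above (index, a zero, then the path); votedisk_paths is a hypothetical helper name and the sample input is taken from the listing.

```shell
# Sketch only: print the path column of "crsctl query css votedisk" output.
# Lines such as "Located 3 voting disk(s)." are ignored because their first
# field is not "<number>.".
votedisk_paths() {
  awk '$1 ~ /^[0-9]+\.$/ { print $3 }'
}

# Example: confirm disk509 is no longer listed after the delete.
after_delete=' 0.     0    /dev/oracle/disk502
 1.     0    /dev/oracle/disk503
 2.     0    /dev/oracle/disk504
Located 3 voting disk(s).'

if printf '%s\n' "$after_delete" | votedisk_paths | grep -qx '/dev/oracle/disk509'; then
  echo "disk509 is still a voting disk"
else
  echo "disk509 has been removed"
fi
```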

My follow-up actions are to see whether Linux behaves in the same manner and what the difference is on an 11GR2 cluster. Finally, I want to understand why the released disks retained their headers and whether there is any way of avoiding having to dd the header. I expect that asmlib on Linux will prove different.

 

Posted in ASM, Oracle

Balancing ASM disks in one command – the benefits

Posted by John Hallas on June 5, 2012

In response to my previous post about adding and dropping ASM disks with one command, Emre Baransel asked whether it was more efficient to do it as one command or two. It was easier to write a second post, complete with screenshots, than to give a simple reply.

My test system was a 3.5Tb database spread across ten 512Gb LUNs.

The first choice was to add and drop in a single command.

Posted in ASM, Oracle

ASM – Adding and dropping disks in one command

Posted by John Hallas on May 29, 2012

One of the possibilities within ASM that is not widely documented is the ability to add disks, drop disks and set the rebalance power all in one command.

You might wonder when you would be adding and dropping disks in the same disk group at the same time. Recently we have been migrating to a new storage array, and this command is very useful in that situation.

alter diskgroup data add disk '/dev/oracle/disk100','/dev/oracle/disk101','/dev/oracle/disk102'

drop disk '/dev/oracle/disk197','/dev/oracle/disk198','/dev/oracle/disk199'

rebalance power 5;
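During a migration with many LUNs it can be less error-prone to generate the statement than to type it. The following is a rough sketch with a made-up helper name, build_add_drop. One assumption to flag: as far as I know, DROP DISK expects the ASM disk name (the NAME column of v$asm_disk) rather than the device path, so the sketch takes disk names for the drop list, and the names used here are hypothetical.

```shell
# Sketch only: build one ALTER DISKGROUP statement that adds, drops and
# sets the rebalance power in a single command.
# Usage: build_add_drop DISKGROUP POWER "ADD_PATHS" "DROP_NAMES"
# ADD_PATHS and DROP_NAMES are space-separated lists.
build_add_drop() {
  dg=$1
  power=$2
  # Quote each path and join with commas (word splitting is intentional).
  add=$(printf "'%s'," $3)
  add=${add%,}
  # Disk names are joined with commas, unquoted.
  drop=$(printf '%s,' $4)
  drop=${drop%,}
  printf 'alter diskgroup %s add disk %s drop disk %s rebalance power %s;\n' \
    "$dg" "$add" "$drop" "$power"
}

# Example with two new paths and two hypothetical disk names.
build_add_drop data 5 \
  "/dev/oracle/disk100 /dev/oracle/disk101" \
  "DATA_0097 DATA_0098"
```

The output can be reviewed and then pasted into a sysasm session; nothing here connects to ASM.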

 

Posted in ASM

Test Case for 11gR2 Role Separation issue on HP-UX – help wanted

Posted by John Hallas on May 6, 2012

This test case applies to any HPUX (11.31) system where Grid Infrastructure has been installed with two software owners (in our case grid and oracle), which is Oracle’s best practice for GI implementations, whether RAC or standalone.

The standard unix account we are using is testuser (although in this case it is in the dba group).

 This user is a member of the following groups :

 uid=664(testuser) gid=500(dba)

 Test Server       = server

Test database     = dbatest

Log in as the testuser user

  1. setsid to set Oracle environment to dbatest

 The ASM luns are owned by the grid user as below :

  

[testuser@server][dbatest]/dev/grid $ls -ltr
total 0
crw-rw----   1 grid       asmdba      13 0x00000a Apr 27 12:38 disk002
crw-rw----   1 grid       asmdba      13 0x000009 Apr 27 22:00 disk001
crw-rw----   1 grid       asmdba      13 0x00000b Apr 28 12:44 disk003

Log on to the unix server as testuser

Set database environment variables as appropriate (OH, SID, PATH)

Log in to the database with “sqlplus / as sysdba”

Database bounced to clear the cache out (a shared pool flush would also work)

Posted in ASM, Oracle

Huge asm_rbal_trace file with text UNIDENT OF DISK

Posted by John Hallas on May 3, 2012

You may notice a large and growing trace file in the diag dest which contains numerous lines starting with the phrase “NOTE:Unident of disk:” followed by a disk path.

*** 2012-04-17 18:31:10.759
NOTE:Unident of disk:/dev/oracle_hr/rdisk130
NOTE:Unident of disk:/dev/oracle_hr/rdisk131
NOTE:Unident of disk:/dev/oracle_hr/rdisk132
NOTE:Unident of disk:/dev/oracle_hr/rdisk133
NOTE:Unident of disk:/dev/oracle_hr/rdisk134
NOTE:Unident of disk:/dev/oracle_hr/rdisk135
NOTE:Unident of disk:/dev/oracle_hr/rdisk136
NOTE:Unident of disk:/dev/oracle_hr/rdisk137
NOTE:Unident of disk:/dev/oracle_hr/rdisk138
NOTE:Unident of disk:/dev/oracle_hr/rdisk139
NOTE:Unident of disk:/dev/oracle_hr/rdisk129

*** 2012-04-17 19:05:20.540
NOTE:Unident of disk:/dev/oracle_hr/rdisk130
NOTE:Unident of disk:/dev/oracle_hr/rdisk131
NOTE:Unident of disk:/dev/oracle_hr/rdisk132
NOTE:Unident of disk:/dev/oracle_hr/rdisk133
NOTE:Unident of disk:/dev/oracle_hr/rdisk134
NOTE:Unident of disk:/dev/oracle_hr/rdisk135
NOTE:Unident of disk:/dev/oracle_hr/rdisk136
NOTE:Unident of disk:/dev/oracle_hr/rdisk137
NOTE:Unident of disk:/dev/oracle_hr/rdisk138
NOTE:Unident of disk:/dev/oracle_hr/rdisk139
NOTE:Unident of disk:/dev/oracle_hr/rdisk129
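To get a feel for how much of the trace file is made up of these messages, the lines can be counted per disk. This is a small sketch, assuming the message format shown above; unident_counts is a hypothetical helper name and the trace file itself has to be supplied on stdin.

```shell
# Sketch only: count "NOTE:Unident of disk:" lines per disk path.
# Reads a trace file on stdin and prints "<count> <path>", sorted by path.
unident_counts() {
  awk -F'disk:' '/^NOTE:Unident of disk:/ { n[$2]++ }
       END { for (p in n) print n[p], p }' | sort -k2
}

# Example with a few lines in the same format as the trace above.
unident_counts <<'EOF'
*** 2012-04-17 18:31:10.759
NOTE:Unident of disk:/dev/oracle_hr/rdisk130
NOTE:Unident of disk:/dev/oracle_hr/rdisk131
*** 2012-04-17 19:05:20.540
NOTE:Unident of disk:/dev/oracle_hr/rdisk130
EOF
```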

Posted in ASM, Oracle

Curing unevenly balanced ASM diskgroups to reduce poor file distribution

Posted by John Hallas on May 2, 2012

Back in Nov 2011 I posted a question on the Oracle-L mailing list about my perception that rebalances seemed to be required on DATA diskgroups in ASM (never FRA, presumably because the FRA holds lots of similar-sized objects such as flashback logs and archivelogs), and that even after rebalancing there seemed to be a permanent imbalance between disks of the same size. I was questioning the effectiveness of the rebalancing operation.

There were some good responses, including a script that Dave Herring had created based on key MOS articles: 818171.1 (Identifying Files with Imbalances), 351117.1 (Troubleshooting ASM Space Issues) and 367445.1 (Advanced Balance and Space Report on ASM). Manual rebalancing would not normally be required, because ASM automatically rebalances disk groups when their configuration changes. You might want to run a manual rebalance if you want to control the speed of what would otherwise be an automatic rebalance operation.

I have been chasing this down and have now come across Bug 7699985: UNBALANCED DISTRIBUTION OF FILES ACROSS DISKS.

It appears to be a problem in 11.1.0.7 and I know a number of sites have reported it. Despite repeated manual rebalance operations there can be a variance of up to 10% between disks of the same diskgroup, even if they are the same size. It is fixed in 11.2.0.1. The workaround is to set the _asm_imbalance_tolerance parameter to 0 rather than the default of 3. This parameter controls the hundredths of a percentage of inter-disk imbalance to tolerate. Then rebalance the disks manually and reset the parameter to 3 afterwards, as you don’t need to leave it balancing all the time.

SQL> @asm_imbalance   (from Report the Percentage of Imbalance in all Mounted Diskgroups (Doc ID 367445.1))

@asm_imbalance.sql

 Columns Described in Script               Percent Minimum
                                 Percent Disk Size Percent  Disk Diskgroup
Diskgroup Name                 Imbalance  Varience    Free Count Redundancy
------------------------------ --------- --------- ------- ----- ----------
DATA                                 9.0        .5    15.2    84 EXTERN

SYS@+ASM SQL>l
    Select dg.name,dg.allocation_unit_size/1024/1024 "AU(Mb)",min(d.free_mb) Min,
    max(d.free_mb) Max, avg(d.free_mb) Avg
    from v$asm_disk d, v$asm_diskgroup dg
    where d.group_number = dg.group_number
    group by dg.name, dg.allocation_unit_size/1024/1024
SYS@+ASM SQL>/

NAME                               AU(Mb)        MIN        MAX    AVG
------------------------------ ---------- ---------- ---------- ------
DATA                                    1       7728      11599   8026

alter diskgroup data rebalance power 4;

NAME                               AU(Mb)        MIN        MAX    AVG
------------------------------ ---------- ---------- ---------- ------
DATA                                    1       7734      11483   8026 – no difference

alter system set "_asm_imbalance_tolerance"=0;
alter diskgroup data rebalance power 4;
NAME                               AU(Mb)        MIN        MAX        AVG
------------------------------ ---------- ---------- ---------- ----------
DATA                                    1       8020       8062   8026  - Now nicely rebalanced
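The three result sets are easier to compare as a single figure. The sketch below computes the spread of free space, (MAX - MIN) / AVG, as a percentage. This is a simple measure chosen for illustration; it is not the imbalance formula used by the MOS script, and free_spread_pct is a made-up name.

```shell
# Sketch only: spread of free_mb across the disks of a diskgroup,
# expressed as a percentage of the average free_mb.
# Usage: free_spread_pct MIN MAX AVG
free_spread_pct() {
  awk -v min="$1" -v max="$2" -v avg="$3" \
    'BEGIN { printf "%.1f\n", (max - min) * 100 / avg }'
}

free_spread_pct 7728 11599 8026   # initial state
free_spread_pct 7734 11483 8026   # after a plain rebalance
free_spread_pct 8020  8062 8026   # after rebalancing with tolerance 0
```

With the figures above this gives 48.2, 46.7 and 0.5, which matches the "no difference" and "nicely rebalanced" comments in the listings.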

Posted in ASM, Oracle

The Mother of all ASM scripts

Posted by John Hallas on March 6, 2012

Back in 2009 I posted a script which I found very useful to review ASM disks. I gave that post the low-key title of The ASM script of all ASM scripts. Now that the script has been improved I have to go a bit further with the hyperbole, and we have The Mother of all ASM scripts. If it ever gets improved again then the next post will just be called ‘Who’s the Daddy’.

I have been using the current script across all our systems for the last 3 years and I find it very useful. A colleague, Allan Webster, has added a couple of improvements and it is now better than before.

The improvements show current disk I/O statistics and a breakdown of the types of files in each disk group, with the total size of each file type. The I/O statistics are useful when you have a lot of databases, many of which are test and development, so you do not look at them that often. They give a quick overview that lets you get a feel for whether anything is wrong and see what the system is actually doing. There are also a few comments at the beginning defining the various ASM views available.

Posted in ASM, Oracle, scripts

OER 27064: cannot perform async I/O to file – HPUX

Posted by John Hallas on February 10, 2010

I was trying to prove that we had a disk I/O issue on a database server, so I ran a set of Orion tests across that server and a number of others for comparison purposes. The platform was HPUX 11.31 Itanium, using the 11.1.0.7 Orion binaries.

The test used two raw devices which would normally be assigned to an ASM diskgroup but had either not yet been used or were marked as CANDIDATE or FORMER.

The Orion command I was using was

./orion_hpux_ia64 -run advanced -write 40 -matrix basic -duration 120 -testname hpuxdiskio  -num_disks 2

where the file hpuxdiskio had two lines of the format /dev/oracle/disk550 and /dev/oracle/disk551
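Before a long Orion run it is worth checking that the device list agrees with -num_disks. The sketch below is only a sanity check under my own assumptions: check_lun_file is a hypothetical helper, and as far as I recall Orion reads the list from a file named after the test with a .lun suffix. It does not explain the error that follows; it only rules out simple mistakes in the list.

```shell
# Sketch only: succeed if FILE has exactly NUM_DISKS non-blank lines and
# every one of them is an absolute path.
# Usage: check_lun_file FILE NUM_DISKS
check_lun_file() {
  awk -v want="$2" '
    NF == 0 { next }                       # skip blank lines
    $1 !~ /^\// {
      printf "line %d: not an absolute path: %s\n", NR, $1
      bad = 1
    }
    { n++ }
    END {
      if (n + 0 != want + 0) {
        printf "expected %d devices, found %d\n", want, n
        bad = 1
      }
      exit bad + 0
    }' "$1"
}

# Example: the second entry is missing its leading slash, so the check fails.
lun=$(mktemp)
printf '%s\n' /dev/oracle/disk550 dev/oracle/disk551 > "$lun"
check_lun_file "$lun" 2 || echo "fix the device list before running Orion"
rm -f "$lun"
```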

On one server where there were a number of free disks I saw the following error

ORION: ORacle IO Numbers -- Version 11.1.0.7.0
hpuxdiskio_20100208_1455
Test will take approximately 31 minutes
Larger caches may take longer

Ioctl ASYNC_CONFIG error, errno = 1
SKGFR Returned Error -- Async. read failed on FILE: /dev/oracle/disk550
OER 27064: cannot perform async I/O to file
rwbase_issue_req: lun_aiorq failed on read
rwbase_run_test: rwbase_issue_req failed
rwbase_run_process: rwbase_run_test failed
rwbase_rwluns: rwbase_run_process failed
orion_thread_main: rw_luns failed
Test error occurred
Orion exiting

Searching the net I could not find any clues. I added the same disks to a diskgroup with no problem, so I knew that Oracle could use them and there were no permission or other issues.

Posted in ASM, Oracle

PSU dependency checking with ASM now enforced in 11G

Posted by John Hallas on January 28, 2010

The ASM version has to be equal to or higher than the highest version of any database using it, and the compatibility settings have to be correct.

PSU 1 (Oct 2009) did not enforce that requirement. PSU 2 (Jan 2010) does check.
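The rule is easy to check before patching by comparing the two version strings field by field. The following is a minimal sketch: version_ge is a hypothetical helper, and the version numbers in the example are made up to illustrate an RDBMS home that is one PSU ahead of ASM. It says nothing about the compatibility settings, which need checking separately.

```shell
# Sketch only: succeed if dotted version A is equal to or higher than B,
# comparing each field numerically (missing fields count as 0).
# Usage: version_ge A B
version_ge() {
  awk -v a="$1" -v b="$2" 'BEGIN {
    na = split(a, x, ".")
    nb = split(b, y, ".")
    n = (na > nb) ? na : nb
    for (i = 1; i <= n; i++) {
      if (x[i] + 0 > y[i] + 0) exit 0
      if (x[i] + 0 < y[i] + 0) exit 1
    }
    exit 0
  }'
}

# Example with illustrative versions: ASM is behind the database home.
asm=11.1.0.7.1
db=11.1.0.7.2
version_ge "$asm" "$db" || echo "ASM $asm is lower than database $db"
```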

We determined this because we do not always apply the latest PSU against the ASM binaries but we do against the RDBMS code. Today the following sequence of events took place along with the associated error message.


Posted in 11g new features, ASM, Oracle

 