OER 27064: cannot perform async I/O to file – HPUX

10 02 2010

I was trying to prove that we had a disk I/O issue on a database server so I ran a set of Orion tests across that server and a number of others for comparison purposes. HPUX 11.31 Itanium using the 11.1.0.7 Orion binaries

The test I used was to use to raw devices which would normally be assigned to an ASM diskgroup but had either not yet been used or were marked as CANDIDATES or FORMER.

The Orion command I was using was

./orion_hpux_ia64 -run advanced -write 40 -matrix basic -duration 120 -testname hpuxdiskio  -num_disks 2

where the file  hpuxdiskio had two lines in of the format /dev/oracle/disk550 and dev/oracle/dev551

On one server where there were a number of free disks I saw the following error

ORION: ORacle IO Numbers — Version 11.1.0.7.0
hpuxdiskio_20100208_1455
Test will take approximately 31 minutes
Larger caches may take longer

Ioctl ASYNC_CONFIG error, errno = 1
SKGFR Returned Error — Async. read failed on FILE: /dev/oracle/disk550
OER 27064: cannot perform async I/O to file
rwbase_issue_req: lun_aiorq failed on read
rwbase_run_test: rwbase_issue_req failed
rwbase_run_process: rwbase_run_test failed
rwbase_rwluns: rwbase_run_process failed
orion_thread_main: rw_luns failed
Test error occurred
Orion exiting

Searching the net I could not find any clues. I added the same disks to a diskgroup with no problem so I knew that Oracle could use them and there were no permission or or other issues.

Metalink (MoS) doc 401323.1 provided the answer. This is about how to implement asynch I/O on HPUX and the part I immediately noticed and was aware of was the line about giving the Oracle user MLOCK privilege. This is  a standard build action as the unix team deliver database servers to us but this server had pre-dated the requirement. Once we had granted the privilege the Orion testing worked successfully

From Administering Oracle on HPUX  

To permit Oracle Database to process asynchronous I/O operations, the OSDBA group  must have the MLOCK privilege.Log in as the root user.

#/usr/sbin/getprivgrp dba 

#/usr/sbin/setprivgrp -f /etc/privgroup

Using any text editor, open the /etc/privgroup file, or create it if necessary.

Add or edit the following line, which begins with the name of the OSDBA group, specifying the privilege MLOCK

Note:

You must use only one line to specify the privileges for a particular group in this file. If the file already contains a line for the dba group, then add the MLOCK privilege on the same line.

dba RTPRIO RTSCHED MLOCK

Save the file, and quit the text editor.

Enter the following command to grant the privileges to the OSDBA group:

#/usr/sbin/getprivgrp dba 

Enter the following command to verify that the privileges are set correctly.

#/usr/sbin/setprivgrp -f /etc/privgroup 

I hope that helps anybody else who comes across the problem as I know there is at least one OTN forum question out there regarding  this.

I will be posting more about the Orion testing once I have a definite conclusion.





My first experience of presenting at an UKOUG SIG

8 02 2010

Just a quick note to say what happened, my impressions and thoughts on the process.

I originally got the idea for a presentation and ran the idea past an ex SIG chairman who I know. He suggested I mail the idea to the UNIX SIG chair and I quickly got feedback from UKOUG asking me to complete a precis of the talk and a title. Thinking of a title was undoubtedly the hardest part of the process.

The next step was a bit of a surprise as I was expecting some feedback or questions but I got an e-mail to state that I was doing a 45 minute slot in about 6 weeks time. I accepted but that does lead to a couple of thoughts. I never actually checked my calender to see what I had on that day and nobody had asked me how long my talk would take (and I did not know at the time). I understand now that talks of varying length from 20 minutes to at least 45 minutes can be arranged, so don’t let the idea of filling 45 minutes put you off.

I was well chivvied by Aimee at UKOUG, who kept the communication flow going very well,  to provide my presentation electronically a few days before the event so that it could be made available online for anybody who wanted to print it off in advance. On the day I saw a couple of people with copies of the talk with them.  I have mixed feelings about this, I can understand the reasons with very technical presentations  covering such things as block dumps but on the day that I attended there was nothing that could have been gained from seeing any of the presentations in advance or printing out copies.

The day itself was very well organised and I got a different coloured lapel badge indicating I was a presenter. I had never been to a SIG before and one of the comments I made in my feedback was that I would not have attended this SIG if I had gone by what was on the agenda. However on the day I enjoyed every presentation and got something from them all.  The first speaker was scheduled for 40-45 minutes but spoke at length and had to curtail his talk quite sharply after around 55 minutes.  This did not help my nerves as my dummy run had taken 30 minutes and I expected to be quicker on the day and miss out some content and I was a bit worried about short-selling everyone. My consoling  thought was that I was last on and nobody would mind getting away early.

The day went well, good talks, interspersed with coffee and lunch which was a good time for a chat. I did want to meet someone who worked at a company local to me but it was quite difficult to read name badges and therefore I never got chance to meet him. I knew he had registered because a sheet of all attendees and their company was given out on registering. However there was at least one person who was there who was not listed on the sheet but I did get talking to him anyway which worked out well. The RIBA building was very impressive and the lecture theatre itself was very well kitted out although the 50-60 people in the audience did look a little lost in a room designed for probably 200+.

My time to talk arrived and my slideshow was showing on screen and the lectern was complete with a pointer and slideshow mover on (I am sure there is a technical term available but it has escaped me for the moment).  Once I started talking I was quite comfortable and rambled on for about 50 minutes which surprised me. I was happy to take questions both during and after and my talk seemed to generate a few which I considered to be good. I got nice feedback from several people afterwards as well.

I received a feedback form the next day and I got my feedback summary a fortnight later. The response was sufficient for me to have suggested another talk I might make.

I would definitely recommend the experience . It is a bit daunting,  it can be a challenge presenting technical stuff to your peers (or in some cases people who know much more than you about the subject) but it is also very rewarding. I think it has developed me personally,  it has gone down well at work and there are now another 50-60 UKOUG members who know my face and my name and can avoid me in the future if they want to.





PSU dependancy checking with ASM now enforced in 11G

28 01 2010

ASM has to be equal to or higher than the highest version of the databases that are using it and the compatability settings have to be correct.

PSU 1 (Oct 2009) did not enforce that requirement. PSU 2 (Jan 2010) does check.

We determined this because we do not always apply the latest PSU against the ASM binaries but we do against the RDBMS code. Today the following sequence of events took place along with the associated error message.

A new server build with ASM 11.1.0.7 (no PSU) and DB 11.1.0.7 (Jan PSU). Create database fails:

DB alert log:

ORA-19510: failed to set size of 594 blocks for file "+DATA/oiddev1a/controlfile/current.256.709337153" (block size=16384)
ORA-17505: ksfdrsz:1 Failed to resize file to size 594 blocks
ORA-15061: ASM operation not supported [41]
ORA-1501 signalled during: CREATE DATABASE "OIDDEV1A"

ASM alert log:

NOTE: ASM client OIDDEV1A:OIDDEV1A died unexpectedly.

No useful hits on the net. We tried upping the compatability from 11.1.0.0.0 to 11.1.0.7.0 but it failed because the database was at a higher version. Then the realisation that we do not normally patch the ASM database so that was at 11.1.0.7.0, however the dbms had the Jan 2010 PSU 2 patchset applied which made the database version 11.1.0.7.2. The errror message was correct.

More testing confirmed what we had realised, the Oct 09 PSU 1 patchset had never checked the ASM version but PSU 2 was now performing that check and then stopping the database creation.

So if anybody has jumped to this page because of the post I have just written around PSU patching they will be able to see that I could tell patch 9156613 contained the Oct 09 PSU 1 release because I had not caused the ORA-015061 error seen above whereas if PSU 2 had been applied to the rdbms that would have failed because the ASM instance had not been patched yet.





Identifying applied PSU patches

27 01 2010

Metalink/My Oracle support note 988624.1 states that PSU 2 for 11GR1 was released on January 12th 2010. When applied, this takes the database version to 11.1.0.7.2. I say this but it is not apparent what version is installed.

The same note states that one of the advantages of the PSU releases is that “the version number is incremented so it is easier to tell what is installed”. Well I beg to differ and I will give you some examples to support my argument.

Firstly there is now obvious way to tell from the database what (database) version is applied. That includes looking at sqlplus, dba_registry, v$version, or the output from traces files. There might be options with taking traces or block dumps but that is hardly “it is easier to tell what is installed” is it.

In the example below PSU2 – Jan 2010 has been applied and the obvious way of telling is that the opatch lsinventory has a distinctive list of patches in a list with eight columns of patches and we can also see the date the patch was created (7 Jan 2010). I still recommend keeping a site record of PSU patches and patch numbers for a quick mapping.
Installed Top-level Products (2):

Oracle Database 11g 11.1.0.6.0
Oracle Database 11g Patch Set 1 11.1.0.7.0
There are 2 products installed in this Oracle Home.

Interim patches (1) :
Patch 9209238 : applied on Fri Jan 15 14:49:19 GMT 2010
Unique Patch ID: 12089910
Created on 7 Jan 2010, 04:12:05 hrs PST8PDT
Bugs fixed:
7627743, 7652888, 9135679, 7299153, 8242410, 6059178, 6955744, 7497788
7447559, 8702276, 7690421, 7719143, 8251486, 8367827, 8328853, 7630874
7515145, 8416414, 8250643, 8284633, 8348481, 8230457, 8563948, 7044551
7318049, 8940197, 7013124, 7432514, 7393258, 7553884, 8563944, 8860821
7606362, 8898852, 8483871, 7307972, 7462589, 7426959, 7330434, 7708340
7257038, 7352414, 6452375, 7341598, 8213302, 7516867, 7296258, 7340448
8290478, 8391256, 7462112, 7013817, 7331867, 7527650, 6977167, 8855565
7524944, 7494333, 8402551, 8914979, 7496908, 7719148, 7424804, 7452373
7597354, 6851669, 8543737, 7318276, 8284438, 7420394, 7350127, 7122161
9118622, 8361398, 8348464, 8224083, 8658581, 7206858, 8563945, 8402637
8257122, 8199266, 7345904, 6407486, 8352309, 9114072, 6812439, 7219752
7432601, 8542307, 8413059, 8281906, 7460818, 6843972, 7378322, 7189645
8563947, 7653579, 8876094, 8702535, 7253531, 8339352, 7225720, 7203349
8974548, 8402562, 7438445, 8855577, 6980597, 6618461, 8855570, 8565708
8369094, 8306934, 7434194, 8217795, 8833297, 7486595, 6599920, 7628866
7183523, 7412296, 7135702, 6679303, 7650993, 7830065, 7462709, 6768362
8563941, 7515779, 6870937, 8563946, 8717461, 8244217, 6840740, 8450529
6981690, 7432556, 8650661, 7523787, 9066130, 7613481, 8341623, 7643188
7348847, 8834636, 7113299, 6903819, 6900214, 7639121, 8287680, 8499043
7356443, 7334226, 7446163, 7196532, 8409848, 8342506, 8236851, 7593835
7309458, 8674263, 7694979, 8499600, 7119382, 7411865, 7409110, 6501490
6598432, 7706138, 8402555, 6941717, 7586451, 7715244, 8487273, 7499353
8408887, 7511040, 7311601, 7497640, 7373196, 7500792, 6882739, 7366290
8324760, 9209238, 8199107, 7834195, 7475055, 8563942, 7451927, 8531282
7705669, 7676737, 8301559, 8211920, 8462173, 8855553, 7454752, 7516536
8352304, 7426336, 7416901, 8534338, 7572069, 7436280, 8539923, 7311909
8790767, 7506785, 7276960, 8855559, 7013835, 6972189, 8402548, 8277580
8306933, 7477246, 7263842, 7480809, 8855575, 8433270, 7556778, 8836375
8364676, 7330611, 8549480, 8563943, 8214576, 7036453, 7628387, 8419383
7719668, 6970731, 9118620, 7377810, 8221425, 7680907, 8243648, 7630416
6851110, 7639602, 7441663, 9188010, 7357609, 8318050, 8539335, 7716219
8362693, 8588540, 8613137, 7720494, 6991626, 6784747, 8851675, 7436152
7602341, 7175513, 8339404, 7829321, 7258928, 7393804, 8856696, 6980601, 8226397

Let me compare this with another example

Installed Top-level Products (2):

Oracle Database 11g 11.1.0.6.0
Oracle Database 11g Patch Set 1 11.1.0.7.0
There are 2 products installed in this Oracle Home.

Interim patches (1) :

Patch 9156613 : applied on Wed Jan 27 10:26:44 GMT 2010
Unique Patch ID: 12072859
Created on 18 Jan 2010, 23:43:20 hrs PST8PDT
Bugs fixed:
8563943, 7036284

That tells us that patch 9156613 was created by Oracle on 18 Jan 2010 and applied on 27 Jan 2010. I would assume that no PSU has been applied because there is no long list of patches in that eight column format.

However I would be wrong. This patch is a PSU that is merged with 3 patches that were conflicting with the PSU and we asked Oracle to create a merged patchset. That patchset will be specific and probably not downloadable by the normal methods or generally available.

The problem is that there is no easy way to tell if it includes the PSU at all and if it does which PSU it includes. We will have to maintain that documentation ourselves and I am guessing there is a good chance it will confuse Oracle at some point when we raise a support call in the future.

So this is a heads up on keeping good records of which PSU releases have been applied, where and which merged patchsets have been provided by Oracle for you.

I am also creating a second blog entry on a PSU 2 - 11.1.07.2 difference. That will tell you how I knew that patch 9156613 includes the Oct 09 PSU and not the Jan 10 PSU despite it being created on 18 Jan 2010





Using a service name with load balancing and standby databases

26 01 2010

This one is for Coskan

At the UKOUG SIG last week when I was presenting, Coskan asked a question ‘do you use RAC services’. I prevaricated a bit because I thought that there must be more to the question than a simple yes or no.

Anyway, the answer is yes and I will post a trigger that we use when we switch from primary to Dataguard standby

Firstly the tnsnames entry where the first 2 servers are primary and the next 2 are the standby servers

ABCPRD =
(DESCRIPTION =
(ADDRESS_LIST =
(FAILOVER = YES)
(ADDRESS = (PROTOCOL = TCP)(HOST = abc01-vip.unix.plc.net)(PORT = 1522))
(ADDRESS = (PROTOCOL = TCP)(HOST = abc02-vip.unix.plc.net)(PORT = 1522))
(ADDRESS = (PROTOCOL = TCP)(HOST = abc03-vip.unix.plc.net)(PORT = 1522))
(ADDRESS = (PROTOCOL = TCP)(HOST = abc04-vip.unix.plc.net)(PORT = 1522))
(LOAD_BALANCE = YES)
)
(CONNECT_DATA =
(SERVICE_NAME = ABCPRD )
)
)

We have a database trigger (MANAGE_DGSERVICES) that starts the service “ABCPRD” only if the database role is “Primary”. Therefore, the service is only ever available on the correct node(s). Note that we originally wanted to use an underscore but one of the Apps DBAs advised that he has seen an underscore cause problems. This appears to be because application servers seem to translate an underscore (ASCII code 95) to %95%. This is probably a bug but we decided not to use the underscore in the service name.

To create the service

exec dbms_service.create_service('abc123,'abc123');
exec dbms_service.start_service(abc123');

CREATE OR REPLACE TRIGGER manage_DGservice after startup on database
DECLARE
role VARCHAR(30);
BEGIN
SELECT DATABASE_ROLE INTO role FROM V$DATABASE;
IF role = 'PRIMARY' THEN
DBMS_SERVICE.START_SERVICE('abc123');
END IF;
END;
/

Maybe Coskan will come back and ask the question he was intending to get an answer to. I hope he does and even more so I hope I can answer it.





The use of functions in a .profile file

22 01 2010

My first public presentation is over now and whilst I was very nervous beforehand I felt quite comfortable once I started. To anybody who was there, thanks for putting up with me.

I promised to upload the contents of a .profile we use for the oracle account as that includes a number of useful functions and aliases. This is rolled out to every database server to ensure that we have a similar feel to every server.
We also have the oratab files set up so that the primary database is first (if there is more than one), ASM is next and the Grid agent home is next. That way the default SID setup when logging in is the main database we are likely to be using.

#!/bin/sh
# Standard Oracle .profile.
#
#
##################################################################
# Version Control
##################################################################
# Who Date Description
# Initial Version
# 21/12/2009 Added lsum and lsh functions
# 31/12/2009 Added export USER for agent startup
#
##################################################################

###############################################
# Global Variables NOT exported
###############################################
ORATAB=/etc/oratab

###############################################
# Functions
###############################################

###############################################
# Check for ASM and if present set TNS_ADMIN
###############################################
asmcheck()
{
#
# Check for ASM
# Look for + at start of line to indicate ASM instance
#
QUERY_ASM=`awk 'BEGIN {FS =":"} $1 ~ /^[+]/ && $3 ~ /[N-Y]$/ {print $2}' $ORATAB`
if [ -d $QUERY_ASM/network/admin ]
then
export TNS_ADMIN=$QUERY_ASM/network/admin
else
unset TNS_ADMIN
fi
}

########################################################
# Show list of available Oracle SIDs if on a Terminal
########################################################
showsid ()
{
if tty -s
then
if [ -f $ORATAB ]
then
i=1
echo ""
awk 'BEGIN {FS =":"} $1 ~ /^[+,A-Z,a-z]/ && $3 ~ /[N-Y]$/ {print $1 " : " $2}' ${ORATAB} | while read file_line
do
printf "%2d. %s\n" $i "$file_line"
let i=$i+1
done
echo ""
fi
fi
}

########################################################
# Oracle Environment Set Function if it is a terminal
########################################################
setsid ()
{
if tty -s
then
if [ -f $ORATAB ]
then
line_count=`cat $ORATAB | grep -v ^# | sed 's/:.*//' | wc -l`
# check that the oratab file has some contents
if [ $line_count -ge 1 ]
then
sid_selected=0
while [ sid_selected -eq 0 ]
do
sid_available=0
for i in `cat $ORATAB | grep -v ^# | sed 's/:.*//'`
do
sid_available=`expr $sid_available + 1`
sid[$sid_available]=$i
done
# get the required SID
case ${SETSID_AUTO:-""} in
YES) # Auto set use 1st entry
sid_selected=1 ;;
*)
echo ""
echo "Please select a SID from the list below by entering"
echo "the associated number."
echo ""
i=1
while [ $i -le $sid_available ]
do
printf "%2d. %10s\n" $i ${sid[$i]}
i=`expr $i + 1`
done
echo ""
echo "Select the Oracle SID [1]: \c"
read entry
if [ -n "$entry" ]
then
entry=`echo "$entry" | sed 's/[a-z,A-Z]//g'`
if [ -n "$entry" ]
then
entry=`expr $entry`
if [ $entry -ge 1 ] && [ $entry -le $sid_available ]
then
sid_selected=$entry
fi
fi
else
sid_selected=1
fi
esac
done
#
# At this point we have a valid sid reference
#
export ORACLE_SID=${sid[$sid_selected]}
export ORAENV_ASK=NO
. oraenv
export ORAENV_ASK=NO
echo
echo "Setting ORACLE_SID = $ORACLE_SID" ;
echo

#
# Amend variables based on environment
#
export LD_LIBRARY_PATH=$ORACLE_HOME/lib:$LD_LIBRARY_PATH
export PATH=$ORACLE_HOME/OPatch:$PATH
asmcheck

else
echo "No entries in $ORATAB. no environment set"
fi
fi
fi
}

########################################################
# Sum the sizes of the files in a directory
########################################################
lsum ()
{
ls -l |awk 'NR == 1 {d=1024 ; z="Kb"}
{sz+=$5}
sz > 1048575 {d=1048576 ; z="Mb"}
sz > 1073741823 {d=1073741824 ; z="Gb"}
END {printf ("%.1f %s\n", sz/d,z)}'
}

########################################################
# This function implements the human readable output form in ls which gets you the size of the file in Mb's and GB's rather than bytes
# based on the Solaris -ls -lh command
########################################################
lsh ()
{
ls -l |awk '$5 1048575 && $5 1073741823 {printf ("%s %+10s %+6s %10.1f%s %s %+2s %+5s %s\n", $1,$3,$4,$5/1073741824,"Gb",$6,$7,$8,$9)}'
}
########################################################
# Some Unix Environment Defaults
########################################################
umask 022
export TMOUT=0
export HISTFILE=/.sh_history
export EDITOR=vi
export PATH=$PATH:.:/usr/sbin
export LD_LIBRARY_PATH=/usr/lib/hpux64
export UNIX95=""
export USER=$LOGNAME
export PS1="[\${LOGNAME}@`hostname`][\${ORACLE_SID}]\${PWD} $"

########################################################
# Set up some standard Oracle Environment components
########################################################
export SQLPATH=/shared/oracle
export ORACLE_BASE=/app/oracle
export NLS_DATE_FORMAT="YYYY-MM-DD:HH24:MI:SS"

########################################################
# Standard Alias settings (no dependency on environment)
########################################################

alias asm='asmcmd -p'
alias lss='ls -ltr'
alias sysdba='sqlplus "/ as sysdba"'
alias sysasm='sqlplus "/ as sysasm"'
alias rmant='rman target / nocatalog'
alias pmon='ps -fu oracle | grep pmon | grep -v grep'

#############################################################
# The Main profile code starts here
#############################################################

#
# If you are on terminal then process
#
if tty -s
then

#
# Set up the Terminal
#
if [ "$TERM" = "" ]
then
eval ` tset -s -Q -m ':?hp' `
else
eval ` tset -s -Q `
fi
stty erase "^?" kill "^U" intr "^C" eof "^D"
stty hupcl ixon ixoff
tabs
set -u

#
# Mail Configuration
#
export MAIL=/var/mail/oracle
export MAILMSG="You have mail!"
if [ -s "$MAIL" ]
then
echo "$MAILMSG"
fi

#
# Set the default Oracle Environment
#
SETSID_AUTO="YES" # setsid AUTO function enabled set 1st entry in oratab
setsid
SETSID_AUTO="" # setsid AUTO function disabled

#
# Environment Dependent Alias Settings
#
alias oh='cd $ORACLE_HOME'
alias vahome='if [ -d /oradata/$ORACLE_SID/VirtualAgent ] ; then cd /oradata/$ORACLE_SID/VirtualAgent;fi'
alias diagdest='cd $ORACLE_BASE/`adrci exec="show homes" | grep $ORACLE_SID`/trace'
alias diagdestasm='cd $ORACLE_BASE/`adrci exec="show homes" | grep ASM`/trace'
fi





Recovering old spfile from autobackup

18 01 2010

This post is about a problem I had today when trying to restore a spfile from a previous autobackup.  I did not want the latest one but one from earlier in the week. The database was a RAC one but I don’t think that is relevant to the problems I had.

Recovering a spfile from autobackup is relatively easy – startup mount and recover spfile from autobackup.  I set my date format and started the database up in mount mode and then tried to recover the spfile.

[oracle@server][EBSPRD1A]/home/oracle $srvctl start database -d ebsprd1a -o mount

[oracle@server][EBSPRD1A]/home/oracle $rman target /

Recovery Manager: Release 11.1.0.6.0 - Production on Sun Jan 17 13:23:55 2010

Copyright (c) 1982, 2007, Oracle.  All rights reserved.

connected to target database: EBSPRD1A (DBID=1343018800, not open)

using target database control file instead of recovery catalog

run {
set until time "to_date('15-JAN-2010','DD-MON-YYYY')";
restore spfile from autobackup;
}

executing command: SET until clause
Starting restore at 17-JAN-10
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=473 instance=EBSPRD1A device
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of restore command at 01/17/2010 13:26:20
<span style="color: #ff0000;">RMAN-06564: must use the TO clause when the instance is started with SPFILE</span>
exit

Recovery Manager complete.

So after a failure stating ‘file already exists’ I figured I had better remove the spfile first by setting the SID to be ASM and using asmcmd to go to the folder containing the spfile.

ASMCMD [+DATA/EBSPRD1A/PARAMETERFILE] > rm spfileEBSPRD1A.ora

ASMCMD [+DATA/EBSPRD1A/PARAMETERFILE] > exit

[oracle@server][+ASM1]/home/oracle $setsid

Please select a SID from the list below by entering  the associated number.
1.      EBSPRD1A
2.      +ASM1
3.      agent10g
Select the Oracle SID : 1
Setting ORACLE_SID = EBSPRD1A
The Oracle base for ORACLE_HOME=/app/oracle/product/11.1.0/db_1 is /app/oracle
[oracle@server][EBSPRD1A]/home/oracle $rmant

Recovery Manager: Release 11.1.0.6.0 - Production on Sun Jan 17 13:28:54 2010

Copyright (c) 1982, 2007, Oracle.  All rights reserved.

connected to target database: EBSPRD1A (DBID=1343018800, not open)

using target database control file instead of recovery catalog

run {
set until time "to_date('15-JAN-2010','DD-MON-YYYY')";
restore spfile to '+DATA/EBSPRD1A/PARAMETERFILE/spfileEBSPRD1A.ora' from autobackup;
}
executing command: SET until clause
Starting restore at 17-JAN-10
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=500 instance=EBSPRD1A device
recovery area destination: +FRA
database name (or database unique name) used for search: EBSPRD1A

channel ORA_DISK_1: AUTOBACKUP +fra/EBSPRD1A/AUTOBACKUP/2010_01_14/s_708269575.4081.708269577 found in the recovery area

channel ORA_DISK_1: looking for AUTOBACKUP on day: 20100115

channel ORA_DISK_1: looking for AUTOBACKUP on day: 20100114

channel ORA_DISK_1: restoring spfile from AUTOBACKUP +fra/EBSPRD1A/AUTOBACKUP/2010_01_14/s_708269575.4081.708269577

RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of restore command at 01/17/2010 13:31:37
<span style="color: #ff0000;">ORA-32011: cannot restore SPFILE to location already being used by the instance</span>

So I thought about restoring to a filesystem file and copying it in later, which proved impossible

run {
set until time "to_date('15-JAN-2010','DD-MON-YYYY')";
restore spfile to '/home/oracle/spfilejohn.ora' from autobackup;
}
executing command: SET until clause
Starting restore at 17-JAN-10
using channel ORA_DISK_1
recovery area destination: +FRA
database name (or database unique name) used for search: EBSPRD1A
channel ORA_DISK_1: AUTOBACKUP +fra/EBSPRD1A/AUTOBACKUP/2010_01_14/s_708269575.4081.708269577 found in the recovery area

channel ORA_DISK_1: looking for AUTOBACKUP on day: 20100115
channel ORA_DISK_1: looking for AUTOBACKUP on day: 20100114
channel ORA_DISK_1: restoring spfile from AUTOBACKUP +fra/EBSPRD1A/AUTOBACKUP/2010_01_14/s_708269575.4081.708269577
channel ORA_DISK_1: SPFILE restore from AUTOBACKUP complete
Finished restore at 17-JAN-10
Recovery Manager complete.

[oracle@server][EBSPRD1A]/home/oracle $srvctl stop database -d ebsprd1a -o immediate

[oracle@server][EBSPRD1A]/home/oracle $setsid

[oracle@server][+ASM1]/home/oracle $asmcmd -p

ASMCMD [+] > cd +DATA/EBSPRD1A/PARAMETERFILE/

ASMCMD [+DATA/EBSPRD1A/PARAMETERFILE] > cp /home/oracle/spfilejohn.ora ./spfileEBSPRD1A.ora

ASMCMD-08012: can not determine file type for file->'/home/oracle/spfilejohn.ora'

ORA-15056: additional error message

ORA-19762: invalid file type DGGetFileAttr20

ORA-06512: at "SYS.X$DBMS_DISKGROUP", line 228

ORA-06512: at line 3 (DBD ERROR: OCIStmtExecute)

copying /home/oracle/spfilejohn.ora -> +DATA/EBSPRD1A/PARAMETERFILE/./spfileEBSPRD1A.ora

ASMCMD-08016: copy source->'/home/oracle/spfilejohn.ora' and target->'+DATA/EBSPRD1A/PARAMETERFILE/./spfileEBSPRD1A.ora' failed

ORA-00600: internal error code, arguments: [1866], [0x9FFFFFFFFD5B4FE0], [4252995008], [0x60000000003C6140], [], [], [], [], [], [], [], []

ORA-31093: null or invalid value specified for parameter : dbms_diskgroup.copy:SRC_FSIZ

ORA-31093: null or invalid value specified for parameter : dbms_diskgroup.copy:SRC_BLKSZ

ORA-31093: null or invalid value specified for parameter : dbms_diskgroup.copy:SRC_FTYPE

ORA-06512: at "SYS.X$DBMS_DISKGROUP", line 258

ORA-06512: at line 3 (DBD ERROR: OCIStmtExecute)

ASMCMD [+DATA/EBSPRD1A/PARAMETERFILE] > cp /home/oracle/spfilejohn.ora '+DATA/EBSPRD1A/PARAMETERFILE/spfileEBSPRD1A.ora'

ASMCMD-08012: can not determine file type for file->'/home/oracle/spfilejohn.ora'

ORA-15056: additional error message

ORA-19762: invalid file type DGGetFileAttr20

ORA-06512: at "SYS.X$DBMS_DISKGROUP", line 228

ORA-06512: at line 3 (DBD ERROR: OCIStmtExecute)

copying /home/oracle/spfilejohn.ora -> +DATA/EBSPRD1A/PARAMETERFILE/spfileEBSPRD1A.ora

ASMCMD-08016: copy source->'/home/oracle/spfilejohn.ora' and target->'+DATA/EBSPRD1A/PARAMETERFILE/spfileEBSPRD1A.ora' failed

ORA-03113: end-of-file on communication channel

Process ID: 1186

Session ID: 115 Serial number: 1625 (DBD ERROR: OCIStmtExecute)

ASMCMD [+DATA/EBSPRD1A/PARAMETERFILE] > cp /home/oracle/spfilejohn.ora +DATA/EBSPRD1A/PARAMETERFILE/spfileEBSPRD1A.ora

ASMCMD-08012: can not determine file type for file->'/home/oracle/spfilejohn.ora'

ORA-15056: additional error message

ORA-19762: invalid file type DGGetFileAttr20

ORA-06512: at "SYS.X$DBMS_DISKGROUP", line 228

ORA-06512: at line 3 (DBD ERROR: OCIStmtExecute)

copying /home/oracle/spfilejohn.ora -> +DATA/EBSPRD1A/PARAMETERFILE/spfileEBSPRD1A.ora

ASMCMD-08016: copy source->'/home/oracle/spfilejohn.ora' and target->'+DATA/EBSPRD1A/PARAMETERFILE/spfileEBSPRD1A.ora' failed

ORA-03113: end-of-file on communication channel

So finally I decided to start up using a pfile created earlier, restore the spfile from autobackup and hopefully have no problems and that was the way it worked out.

SYS@EBSPRD1A SQL>STARTUP MOUNT PFILE='/app/oracle/product/11.1.0/db_1/dbs/initEBSPRD1A.ora.1601_2';

ORACLE instance started.
Total System Global Area 3.2068E+10 bytes  (yes that is a 32Gb SGA)
Fixed Size                  2120432 bytes
Variable Size            1.8902E+10 bytes
Database Buffers         1.3153E+10 bytes
Redo Buffers               10752000 bytes
Database mounted.
SYS@EBSPRD1A SQL>exit

Disconnected from Oracle Database 11g Enterprise Edition Release 11.1.0.6.0 - 64bit Production

With the Partitioning, Real Application Clusters, OLAP, Data Mining and Real Application Testing options

[oracle@server][EBSPRD1A]/app/oracle/product/11.1.0/db_1/dbs $rmant
Recovery Manager: Release 11.1.0.6.0 - Production on Sun Jan 17 13:46:15 2010
Copyright (c) 1982, 2007, Oracle.  All rights reserved.
connected to target database: EBSPRD1A (DBID=1343018800, not open)
using target database control file instead of recovery catalog

run {
set until time "to_date('15-JAN-2010','DD-MON-YYYY')";
restore spfile to '+DATA/EBSPRD1A/PARAMETERFILE/spfileEBSPRD1A.ora' from autobackup;
}

executing command: SET until clause
Starting restore at 17-JAN-10
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=995 instance=EBSPRD1A device
recovery area destination: +FRA
database name (or database unique name) used for search: EBSPRD1A
channel ORA_DISK_1: AUTOBACKUP +fra/EBSPRD1A/AUTOBACKUP/2010_01_14/s_708269575.4081.708269577 found in the recovery area
channel ORA_DISK_1: looking for AUTOBACKUP on day: 20100115
channel ORA_DISK_1: looking for AUTOBACKUP on day: 20100114
channel ORA_DISK_1: restoring spfile from AUTOBACKUP +fra/EBSPRD1A/AUTOBACKUP/2010_01_14/s_708269575.4081.708269577
channel ORA_DISK_1: SPFILE restore from AUTOBACKUP complete
Finished restore at 17-JAN-10

[oracle@server][EBSPRD1A]/app/oracle/product/11.1.0/db_1/dbs $srvctl stop database -d ebsprd1a -o immediate

[oracle@server][SID]/app/oracle/product/11.1.0/db_1/dbs $srvctl start database -d SID

I could have recovered from tape (we use Data Protector as the MML) but at least I got it all working. Should have been much easier though and I think I will be playing around tomorrow in a less important system.





Flashback disabled automatically – a minor rant

6 01 2010

I posted the mail below on the Oracle-L mailing yesterday and was struck by a response given by many

I believe that there is a fundamental flaw in how flashback is managed in a database.

 If I make the decision, based on business requirements and technical reasons, that I want flashback logging to be enabled for a database then I would expect that to remain the situation.

However Oracle can disable flashback and not really inform the user at all. Yes there is a message in the alert log

 

Errors in file /app/oracle/admin/FLASH10G/bdump/flash10g_rvwr_22255.trc:
ORA-38701: Flashback database log 77 seq 201 thread 1: "+FRA/flash10g/flashback/log_77.2579.702048581"
ORA-17503: ksfdopn:2 Failed to open file +PETRAFRA/flash10g/flashback/log_77.2579.702048581
ORA-15012: ASM file '+FRA/flash10g/flashback/log_77.2579.702048581' does not exist
Thu Nov 5 15:14:07 2009
*************************************
RVWR encountered an error when writing flashback database logs.
See error stack in alert log. To avoid crashing the instance,
this instance has turned off flashback database.
*************************************

However that is very easily to miss and there is no OEM policy to raise an alert.

 I have tried to cause flashback to be automatically turned off but I cannot replicate it very easily as the archivelog fills prior to flashback being turned off.

I have logged an SR with Oracle as I want to develop a test so that I can put processes in place to capture the event and raise automatic alerts.

Oracle are now suggesting that this very difficult to do because of the code paths used and are suggesting I raise an enhancement request to get an OEM policy to be created to report when fkashback has become disabled automatically.

I favour a init.ora parameter which asks “Should the database hang if flashback_area full” which the DBA can set to true if he wants flashback logging to be treated like archive logging. I know that the database can be recovered without flashback but cannot with archiving but I still consider the situation should be handled much better than it currently it.

The consensus seemed to be live with it and use a shell script to search the alert logs for relevant errors and deal with them appropriately. Harsh but fair you may think. However I believe that when we are paying significant money to Oracle for their technology including extensive use of OEM then I do not think that we should have to be using in-house scripts to pick up errors that Grid does not deal with in what I believe to be an appropriate manner.

Ten years ago I was all in favour of reading  the alert log and taking appropriate action but I think it a backward step to assume  that we still do that and not ask the tool of choice to do what is required.

I will be putting an enhancement request in to Oracle to ask that we have a parameter to control whether or not flashback is automatically disabled.  Failing that get a policy alert raised when flashback has been disabled, but continue with the database usage.

One other irritation with flashback logs. One has to restart the database to enable them and yet then can be disabled dynamically. I suppose that is an advantage as well but it always strikes me as odd.





The need to ensure that hashed password values are safe

4 01 2010

Dennis Yurichev has produced a FGPA password checker which is available on the internet. I took one of our standard style passwords and pasted it into the link above and it took only 45 minutes to crack the password, which is rather a concern if anyone can get see the hashed passwords that are held in the data dictionary

DBSNMP 1FF13052A07F0164 Standard format randomly generated pwd SOLVED BDLTL5PD time spent: 45m59s; average speed: 56M

That password, on a test system,  has now been changed, however it was 8 characters long and to see it cracked on a publicly available site does make one more focused on the need for security, not that I was not already.  The only good news is that password is shown in uppercase when it reality it was a mixture of case.

Lesson 1   – If using 11G ensure that the initialisation paremeter sec_case_sensitive_logon is set to true  

So which privileges can be used to view hashed passwords.

The hashed value of the password was stored in sys.user$ AND in dba_users in V10, however security concerns caused V11 to be changed so that the hashed password is only stored in the sys.user$ table, despite there still being a column for it in dba_users.

 In 10G we know that the following privileges give access to that password value ( I am ignoring the DBA role for now)

Privilege Dba_users Sys.user$
Select_catalog_role Yes No
Select any dictionary Yes Yes
Select any table No No
     

 Things are better in 11G

Privilege Dba_users Sys.user$
Select_catalog_role Not stored No
Select any dictionary Not stored Yes
Select any table Not stored No
     

Lesson 2  – Do not give out SELECT ANY DICTIONARYprivilege without understanding the risk

Lesson 3  - If using 10G then beware of SELECT_CATALOG_ROLE

Interestingly if you want to give Grid access out and you want a user to be able to see performance pages then one way is to grant the OEM_MONITOR role, which contains SELECT_CATALOG_ROLE. testing here shows that if you copy that role then you cannot login to Grid, even without changing privileges. Our thought was to create a OEM_MONITORING role that was based on OEM_MONITOR but did not contain all the  2000+ grants that OEM_MONITOR contains and in particular to remove SELECT_CATALOG_ROLE. the OEM_MONITOR role has the following privileges :-

SQL>userprivs1
Enter value for user: OEM_MONITOR

                                                                  Adm
Granted                                                           Opt Dfl
----------------------------------------------------------------- --- ---
SELECT_CATALOG_ROLE                                               NO  YES
ADVISOR                                                           NO
ANALYZE ANY                                                       NO
ANALYZE ANY DICTIONARY                                            NO
CREATE JOB                                                        NO
CREATE SESSION                                                    NO
MANAGE ANY QUEUE                                                  NO
SELECT ANY DICTIONARY                                             NO
DEQUEUE on SYS.ALERT_QUE (by SYS)                                 NO
EXECUTE on DBSNMP.BSLN (by DBSNMP)                                NO
EXECUTE on DBSNMP.BSLN_INTERNAL (by DBSNMP)                       NO
EXECUTE on DBSNMP.BSLN_METRIC_SET (by DBSNMP)                     NO
EXECUTE on DBSNMP.BSLN_METRIC_T (by DBSNMP)                       NO
EXECUTE on DBSNMP.BSLN_OBSERVATION_SET (by DBSNMP)                NO
EXECUTE on DBSNMP.BSLN_OBSERVATION_T (by DBSNMP)                  NO
EXECUTE on DBSNMP.BSLN_STATISTICS_SET (by DBSNMP)                 NO
EXECUTE on DBSNMP.BSLN_STATISTICS_T (by DBSNMP)                   NO
EXECUTE on DBSNMP.BSLN_VARIANCE_SET (by DBSNMP)                   NO
EXECUTE on DBSNMP.BSLN_VARIANCE_T (by DBSNMP)                     NO
EXECUTE on DBSNMP.MGMT_RESPONSE (by DBSNMP)                       NO
EXECUTE on DBSNMP.MGMT_UPDATE_DB_FEATURE_LOG (by DBSNMP)          NO
EXECUTE on SYS.DBMS_AQ (by SYS)                                   NO
EXECUTE on SYS.DBMS_AQADM (by SYS)                                NO
EXECUTE on SYS.DBMS_DRS (by SYS)                                  NO
EXECUTE on SYS.DBMS_MONITOR (by SYS)                              NO
EXECUTE on SYS.DBMS_SERVER_ALERT (by SYS)                         NO
EXECUTE on SYS.DBMS_SYSTEM (by SYS)                               NO
EXECUTE on SYS.DBMS_WORKLOAD_REPOSITORY (by SYS)                  NO
SELECT on DBSNMP.BSLN_BASELINES (by DBSNMP)                       NO
SELECT on DBSNMP.BSLN_METRIC_DEFAULTS (by DBSNMP)                 NO
SELECT on DBSNMP.BSLN_STATISTICS (by DBSNMP)                      NO
SELECT on DBSNMP.BSLN_THRESHOLD_PARAMS (by DBSNMP)                NO
SELECT on DBSNMP.BSLN_TIMEGROUPS (by DBSNMP)                      NO
SELECT on DBSNMP.MGMT_BASELINE (by DBSNMP)                        NO
SELECT on DBSNMP.MGMT_BASELINE_SQL (by DBSNMP)                    NO
SELECT on DBSNMP.MGMT_BSLN_BASELINES (by DBSNMP)                  NO
SELECT on DBSNMP.MGMT_BSLN_DATASOURCES (by DBSNMP)                NO
SELECT on DBSNMP.MGMT_BSLN_INTERVALS (by DBSNMP)                  NO
SELECT on DBSNMP.MGMT_BSLN_METRICS (by DBSNMP)                    NO
SELECT on DBSNMP.MGMT_BSLN_STATISTICS (by DBSNMP)                 NO
SELECT on DBSNMP.MGMT_BSLN_THRESHOLD_PARMS (by DBSNMP)            NO
SELECT on DBSNMP.MGMT_HISTORY (by DBSNMP)                         NO
SELECT on DBSNMP.MGMT_HISTORY_SQL (by DBSNMP)                     NO
SELECT on DBSNMP.MGMT_LATEST (by DBSNMP)                          NO
SELECT on DBSNMP.MGMT_LATEST_SQL (by DBSNMP)                      NO

However MOS note 455191.1 does state that these priviliges might be too much for some sites and the following privileges might be sufficient

Instead of granting OEM_MONITOR role, the following roles/privileges can be assigned directly to the user:

[/code]

Roles:
    ->CONNECT
    ->SELECT_CATALOG_ROLE <code>
</code>System Privileges <code>
</code>    ->ANALYZE ANY <code>
</code>    ->CREATE TABLE
Profile <code>
  ->DEFAULT

The select_catalog_role still poses a problem to those using 10G (or earlier) but does assist those managing 11G security. I do not think that the majority of our users would need ANALYZE ANY or CREATE TABLE as part of a grid usage role so we will not be granting those and we will ensuring that the password column of dba_users and user$ is audited.

Lesson 4 - Consider a reduction in the use of OEM_MONITOR and replace with specific privileges

Finally Pete Finnegan (and others I am sure)  has written about how dangerous it is to allow users to read any trace files that they generate and he gives an example of taking a buffer dump to capture a hashed password

Lesson 5 - be careful who has access to read trace files





11GR2 for HPUX now available

22 12 2009

As the title states 11GR2 is now available for HPUX on Itanium. http://www.oracle.com/technology/software/products/database/index.html

Co-incidentally I had my first foray into VMware over the last few days and yesterday got an 11GR2 database built (OEL 5.3)

Initial impressions were that the installer is much cleaner but I have not done anything else with it as yet.

My list of 11GR2 new features that I wanted to investigate includes

11Gr2  Duplicate without connecting to target database
11GR2 OCR disks can now be stored on ASM
11GR2 Oracle restart – autostart ASM/Db/listener 
11GR2 Zero Downtime Patching – CRS/RAC
11GR2 Remote database jobs. 
11GR2 Enterprise Manager Support Workbench 
11GR2 Audit trail management/housekeeping new features