Oracle DBA – A lifelong learning experience

UKOUG 2011 Part Deux

Posted by John Hallas on December 7, 2011

Day 2 of the UKOUG conference at the ICC in Birmingham and back into the fray.

First up was Thomas Presslie talking about Data Guard fast-start failover (FSFO). How he managed to demonstrate transactions and network connectivity using whisky and toilet paper cannot be done full justice in a blog – it had to be seen to be believed.

It did make me want to do more with FSFO, especially noting how easy the setup was using OEM. However, it still concerns me that the database is only part of the end solution, and failing it over to a second datacentre after a network flicker may leave the application stack in a mess. Coincidentally, I have a requirement to set up a second standby cascaded from a physical standby, keeping the third database perhaps one hour behind whilst the first standby is in real-time apply mode with no lag. That might give us a chance to see the state of the data as it was before a logical corruption (user error) occurred. Flashback query is much more likely to be of value, but we are going to look at both avenues. It is highly unlikely we would ever be in a position to flash back the database.
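To make the reasoning behind the one-hour lag concrete, here is a minimal Python sketch. It is a toy model, not anything Oracle provides: the function name and the assumption that the delayed standby has applied redo exactly up to "now minus the delay" are mine, purely for illustration.

```python
from datetime import datetime, timedelta


def delayed_standby_is_clean(corrupted_at, now, apply_delay):
    """Return True if the delayed standby has not yet applied the bad change.

    Assumption (hypothetical model): the delayed standby has applied redo
    exactly up to `now - apply_delay`, and nothing later.
    """
    applied_up_to = now - apply_delay
    return corrupted_at > applied_up_to


one_hour = timedelta(hours=1)
user_error = datetime(2011, 12, 6, 10, 0)

# Noticed 40 minutes after the error: the third database still predates it.
noticed_early = delayed_standby_is_clean(user_error, datetime(2011, 12, 6, 10, 40), one_hour)

# Noticed 90 minutes after the error: the bad change has already been applied.
noticed_late = delayed_standby_is_clean(user_error, datetime(2011, 12, 6, 11, 30), one_hour)
```

In other words, the delayed copy only helps if the user error is spotted within the lag window, which is one reason flashback query looks the more useful option.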

Julian Dyke then talked for an hour about RAC troubleshooting (mostly) and the time flew past. I made quite a few notes of things to think about. The pros and cons of putting the SCAN addresses in /etc/hosts (HPUX) to be used in the event of a DNS failure was one thought. Looking at the exectask function and the scripts used to call various functions was another action I took for myself. Another was a big list of asmcmd commands, some of which I did not recognise. I think they must have come in with 11gR2, which I have not really used myself although we are using it on site.

Tanel Poder’s biggest ever problem was next up. I had seen this presentation last year and knew the answer but how he got there was still interesting. The use of the HPUX command kitrace (similar to dtrace on Solaris – see reply below for more details) reminded me that I was going to look at that in some detail but have never got around to it. As my site is likely to be moving away from HPUX sooner rather than later perhaps there is not much point now.

After lunch John Beresniewicz was talking about ASH outliers. It was quite mathematically based, which is always a challenge for me, but he will be posting a script (possibly via Doug Burns' blog) which he has developed as another means of dissecting and analysing ASH data.

Michael Salt’s talk on indexes was full of real world examples and there were lots of nice little hints and tips, none of which were earth-shattering but all of which were good practice, and I found it a useful reminder of what I should be doing when looking at code. On the same theme, two slots later Tony Hasler was presenting a beginner’s guide to SQL tuning. I have never seen Tony present before but I really liked both his style and the content. A lot of information was thrown in, with good explanations of various autotrace outputs. I will definitely be downloading his presentation to run through it and see what I can put to further use. Whilst I do not think I am expert in the field of SQL tuning, indeed far from it, I do like to think I know what to look for. Sometimes listening to others you realise in the same lecture both how much you already know and how little you actually follow best practices. There is no real substitute for looking at code and trying to improve performance. For a lot of us who have a very wide-ranging DBA role, that opportunity to practise does not appear often enough, which is why it is good to review and refresh your approach now and then.

At every conference I like to try and hear something new or touch on an area that is outside my day job. John King’s talk on Edition Based Redefinition was just that. I am not really in a position to take advantage of the ability to let users run differing sets of code and then migrate them across to a new release in a seamless manner, all without any outages or interruption to service. However, I could see how useful it could be, especially in the world of the Apps DBA, say for EBS. Apparently no less a person than Tom Kyte referred to EBR as the ‘killer feature’ within 11gR2. John had an easy, comfortable manner and the time flew past, so much so that he had to be dragged kicking and screaming from the stage by the next presenter.

All in all another good day, rounded off with a couple of beers with work colleagues and a few presenters, all with plenty of Oracle chat included.

6 Responses to “UKOUG 2011 Part Deux”

  1. Vitaly Kaminsky said

    It’s all very well, but unfortunately Thomas did not answer the question of what would happen with the “lights out” on the standby site when the witness server is there – in this case the primary is supposed to kill itself because it can’t see the witness, but the standby is off so we end up with no service altogether…

    • John Hallas said

      Good point Vitaly, I will raise the question with Thomas if I see him at the conference today.

    • Thomas Presslie said

      Hi Vitaly,

      Apologies for delay in response – I only recently received an email about the question being posted to John’s blog.

      In the event where the primary has lost connectivity to both the observer and the standby (if the observer is located at standby site) and the value of FastStartFailoverThreshold seconds has passed, the primary will assume that an automatic failover has already occurred and the primary will shut down.

      The behaviour of the standby depends on whether connectivity is lost between the standby and observer. If the observer can still connect to the standby it will initiate a failover immediately once the FastStartFailoverThreshold time has passed.

      If observer connectivity to the standby is also lost, the standby will remain in its current state, meaning service would temporarily be unavailable. Once the observer can connect to the standby it will immediately initiate a failover and service will be available again.

      It is an important feature of FSFO that the observer does not allow a split-brain condition to occur.

      With dual NICs connected to different switches, backup links, etc., I would imagine that the outage would have to be fairly drastic for the observer connectivity to be lost to both targets in addition to the loss of primary connectivity to the standby.
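
The behaviour described in the reply above can be summarised as a small decision table. What follows is a minimal Python sketch of that description only, not Oracle code and not the Data Guard broker API: the `Links` class, the `fsfo_outcome` function and the outcome strings are hypothetical names chosen for illustration, and any situation the reply does not describe is reported as "not covered" rather than guessed at.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Links:
    """Which network paths are currently working (hypothetical model)."""
    primary_to_observer: bool
    primary_to_standby: bool
    observer_to_standby: bool


def fsfo_outcome(links, seconds_elapsed, threshold):
    """Return (primary_action, standby_action) as described in the reply.

    `threshold` stands in for FastStartFailoverThreshold, in seconds.
    """
    primary_isolated = (not links.primary_to_observer
                        and not links.primary_to_standby)
    if not primary_isolated:
        # The reply only discusses a primary cut off from both targets.
        return ("not covered", "not covered")
    if seconds_elapsed < threshold:
        # Threshold not yet passed: nothing has been triggered so far.
        return ("waiting", "waiting")
    if links.observer_to_standby:
        # Primary assumes a failover has happened; observer performs it.
        return ("shuts down", "failover initiated")
    # Nobody can reach the standby: no split-brain, but no service either,
    # until the observer reconnects and initiates the failover.
    return ("shuts down", "stays standby, service unavailable")


lights_out = fsfo_outcome(Links(False, False, False), 45, 30)
normal_failover = fsfo_outcome(Links(False, False, True), 45, 30)
```

The `lights_out` case is the one Vitaly asked about: the primary shuts itself down and the standby stays put, so service is lost until the observer can reach the standby again.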


  2. jgarry said

    I don’t seem to be getting any google-love on ktitrace. Would you have a reference? Is that an 11.31 thing? Or is it ktracer?

    • John Hallas said

      I had a conversation with a friendly HP guy I know (well two of them actually) and they came up with the following :-

      We normally use kitrace as the basis for our analysis. It’s an internal tool, so there is little to no documentation available about it on google.
      To collect the data, simply run “kitrace” as root. Then there is a ki_all –r command that will produce a set of reports; it does require a couple of extra options. The kipid report is the most useful. Interpreting the kipid report is something of an art, so I’m not going to cover it here.

      Depending upon what you’re trying to trace, you may also find value in tusc. You can find it on the internet along with documentation. Where kitrace is great for capturing snapshots of the entire system, tusc is better for capturing a single process.

      I hope that helps, and apologies that I named it as ktitrace when it should have been kitrace.
