Oracle DBA – A lifelong learning experience

Archive for the ‘Grid control and agents’ Category

Grid control sends false alerts with “Agent to OMS communication broken” message

Posted by John Hallas on July 12, 2011

We have been seeing an increasing number of alerts stating that OEM cannot ping an agent. These generate incidents and potential callouts. The situation was getting steadily worse, so we started to investigate; until then we had put it down to a busy network and the large number of distributed agents we run.

The error message is Message=Agent is unable to communicate with the OMS. (REASON = Agent is Unreachable (REASON : Agent to OMS Communication is broken ). Severity=Unreachable Start

We are on GC 10.2.0.5. We came across Note 9276193.8 which highlights bug 9276193 -  gc sends false alerts with “agent to oms communication broken” message

There are two workarounds suggested :-

Turn off alerts notification – which is a bit of a joke really
Increase max_inactive_time in emd_ping table to a large value – the table name is actually mgmt_emd_ping.

The default value is 120 seconds; we raised it to 240 and that resolved our problems.
Below is a test case showing a selection of agents and their target guids and how we proved the fix. Read the rest of this entry »
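As a rough illustration of the second workaround (this is not the test case from the full entry), the change could look like the sketch below. It assumes that mgmt_emd_ping is keyed by a target_guid column that joins to mgmt_targets, and :AGENT_GUID is a hypothetical bind variable for the agent being changed; check the table definition in your own repository before running anything like this.

```sql
-- Assumption: mgmt_emd_ping has target_guid and max_inactive_time columns.
-- List the agents and their current timeout (in seconds).
select t.target_name, p.target_guid, p.max_inactive_time
from   sysman.mgmt_emd_ping p
join   sysman.mgmt_targets t on t.target_guid = p.target_guid
order  by t.target_name;

-- Raise the timeout from the default 120 seconds to 240 for one agent.
-- :AGENT_GUID is a hypothetical bind variable, not a name from the note.
update sysman.mgmt_emd_ping
set    max_inactive_time = 240
where  target_guid = :AGENT_GUID;

commit;
```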

Posted in Grid control and agents, Oracle | 6 Comments »

Enterprise manager – How To Install An Additional Management Service

Posted by John Hallas on June 7, 2011

How To Install An Additional Management Service

This document will describe how to install an additional management service for Oracle Enterprise Manager Grid Control 10.2.0.5 on HP-UX. It was written by my colleague Carl Holmes.

You must first install the base release of Enterprise Manager (10.2.0.3) and patch it to 10.2.0.5

The recommended installation method is to install the software, patch it to 10.2.0.5 and then configure Enterprise Manager.

To do this, use these installer commands

10.2.0.3

./runInstaller -noconfig -ignoreSysPreReqs b_skipDBvalidation=TRUE

Explanation of this command

-noconfig - Do not run the configuration assistants after the software is installed

-ignoreSysPreReqs - Skip the system pre req checks as this version of Enterprise Manager is not aware of HP-UX version 11.31

b_skipDBvalidation=TRUE - Use this option to stop the installer checking the repository database version, as it is not aware of version 11.1.0.7

When the system pre req screen appears, manually verify the OS version to continue.

When the installation is complete you will be prompted to run  $OMS_HOME/allroot.sh.

This runs the root.sh scripts for the OMS and the management agent that have just been installed. Read the rest of this entry »

Posted in Grid control and agents, Oracle | Leave a Comment »

Using OEM reports to show PSU levels across the estate

Posted by John Hallas on March 30, 2011

The reporting capabilities of OEM are very good, although it is sometimes hard to find which views hold the data you want. This post shares how to build a report which details how many databases are at each PSU patch level. It also shows how to schedule a repeating report and save the history of each run so that trends can be measured. Most (if not all) of the work was performed by Sarabjit Lotay (encouraged by superb leadership !!)

First let's look at the end result. I have had to take out specific detail from our estate but the report looks like this

The red blotches don’t help but you can see the general format and it does prove very useful. The next shot shows how the report is made up

Read the rest of this entry »

Posted in Grid control and agents, Oracle | 2 Comments »

Overview of OEM management packs

Posted by John Hallas on December 16, 2010

Porus Homi Havewala, who produced two of the top ten most downloaded articles on OTN in 2009, has produced an overview of OEM Management Packs.

Mostly centred on the Diagnostic and Tuning Packs, which you need to have purchased before you are allowed to use the ASH views, the article briefly covers Configuration Management, Change Management, Data Masking and other management packs.

If you need a good overview of the management packs then this article will do the job.

It does not mention a new one that I know is coming out in mid-2011: the data subsetting pack (I am not sure of the actual title). It will be a pack/toolset for producing cut-down, referentially complete versions of larger, normally production, databases. It is expected to be tightly integrated with the Data Masking Pack so that smaller versions of production databases can be used for development and testing purposes with all confidential data masked.

Posted in Grid control and agents, Oracle | Leave a Comment »

How to create a User Defined metric(UDM) in Grid

Posted by John Hallas on December 8, 2010

Using Grid it is possible to create user defined metrics that capture information about the state of a database, a host or an application. Once you have a script or SQL statement that returns a value, a UDM can be created. Within EM a report can then be created to report the metrics, or an alert can be raised.

First I will create a UDM that reports when an account is locked and automatically raises an alert. It may be a simple example but it shows one or two issues that can occur.

In EM, select the database and then User Defined Metrics at the bottom of the main page. Create a new UDM; I suggest starting the name with a business area such as Retail or EBS, as you can end up with quite a few metrics once you get the hang of them.

The key choice to make is whether the output will be a number or a string and whether you want one or two values output. In my case I want to know the status of an account, so the output will be a string, and I want two columns: the username and the status. Enter your query and an account to use for the connection. Ensure that the query works and that the account has the privilege to run it. In this case, when I tested the UDM it failed because I had not created the accounts beforehand. Select a comparison operator (equal, less than, MATCH etc.) to apply to the result of the SQL query and then format the output you want in the alert.
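To illustrate the kind of two-column string query described above, here is a minimal sketch; it is not necessarily the query used for this UDM. It assumes the connecting account can read the DBA_USERS dictionary view, and it returns the username as the key and the account status as the value.

```sql
-- Sketch only: one row per locked account.
-- First column is the key (username), second is the string value (status).
select username, account_status
from   dba_users
where  account_status like '%LOCKED%';
```

Matching on '%LOCKED%' is a design choice here so that statuses such as EXPIRED & LOCKED and LOCKED(TIMED) are picked up as well as plain LOCKED.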

Note the %Key% and the %value% keywords have to be in exactly that case.

Set the collection interval to 5 minutes in the schedule section for the purposes of testing it out and then run the test command (or save if fully confident). Remember to reschedule later. Note that 5 minutes is the shortest interval you can use. Read the rest of this entry »

Posted in Grid control and agents, Oracle | 5 Comments »

Using Grid to display database CPU usage

Posted by John Hallas on September 3, 2010

There was a recent post on the Oracle-L list asking about using Grid Control to report on a particular database's CPU usage during a certain period of time. A number of answers came in showing SQL queries that would answer the question, but I read the question as ‘how can we display the CPU usage in Grid’ or indeed how can we produce a customised metric report on any database in Grid.

However, for those who are interested in the recommended scripted methods, the answers that were of most use in my view came from Karl Arao, who pointed to a script he has written, and Rich Jesse, who produced the following code

SELECT
mmd.*
FROM
sysman.mgmt$metric_daily mmd
JOIN
sysman.mgmt$target mt
ON mmd.target_name = mt.target_name
AND mmd.target_type = mt.target_type
AND mmd.target_guid = mt.target_guid
WHERE
mmd.metric_column like '%cpu%'
AND mt.target_name = :DB_NAME
AND mt.target_type = 'oracle_database';
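
Since the original question was about a certain period of time, the query above could be narrowed to a date range. The following is a sketch only: it assumes the view exposes rollup_timestamp, average, minimum and maximum columns, and :START_DATE and :END_DATE are hypothetical bind variables.

```sql
-- Sketch: daily CPU-related rollups for one database within a date range.
-- Assumes mgmt$metric_daily has rollup_timestamp, average, minimum, maximum.
SELECT mmd.target_name,
       mmd.metric_column,
       mmd.rollup_timestamp,
       mmd.average,
       mmd.minimum,
       mmd.maximum
FROM   sysman.mgmt$metric_daily mmd
WHERE  mmd.target_name = :DB_NAME
AND    mmd.target_type = 'oracle_database'
AND    mmd.metric_column LIKE '%cpu%'
AND    mmd.rollup_timestamp >= :START_DATE
AND    mmd.rollup_timestamp <  :END_DATE
ORDER  BY mmd.metric_column, mmd.rollup_timestamp;
```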

My method was to create a report that could be used to report on any instance and this is how I did it. Read the rest of this entry »

Posted in Grid control and agents, Oracle, scripts | 1 Comment »

The power of emdiag

Posted by John Hallas on August 20, 2010

I am currently looking at emdiag and finding it more and more useful as I come to understand its capabilities. To copy a comment from a Metalink note: EMDIAG is a diagnostics and troubleshooting kit which can help with a health assessment of a site. It is a set of scripts developed by Werner De Gruyter, and instructions for download and usage are in Note 421053.1 EMDiagkit download and master index. I will not go into the installation instructions here but just show a few of the commands that I am finding useful and an example of an issue that it has highlighted. Note that I have set my Oracle Home to be the OMS home and repvfy is found in OH/bin. However, the output is located under wherever you have installed the emdiag software, which in my case would be OH/emdiag/log

repvfy dump health -pwd password 

This gives a very good overview of repository DB specific information, database performance statistics, installed OMS patchsets and EM monitoring targets. Read the rest of this entry »

Posted in Grid control and agents, scripts | Leave a Comment »

Upcoming, my first SIG talk

Posted by John Hallas on December 16, 2009

I am making my first presentation to a SIG meeting in January, when I will talk about how my company has moved from a site where Oracle was almost non-existent less than two years ago to one that is now delivering what seems to be every product that Oracle has invented (or purchased). It won’t be a ‘how great we are’ approach but rather how making some simple but fundamental decisions has made it much easier to build and support a large and growing infrastructure.

I will be talking about having a set of standards that are very simple but are a fundamental building block in our aim of having the same look and feel across the estate, and how a small number of setup scripts can make life so much easier.

I will probably diverge from the straight and narrow and discuss our experiences of running a centralised OEM solution, and given that we currently have 7 open SRs referencing Grid problems I have an inkling of which way it may be slanted. However, I hope that by the time of the talk I can be much more positive because there is a lot to be said for OEM.

The date is 21st January 2010 in London and an abstract can be seen in the Unix SIG agenda.

I hope to see some of you there.

Posted in Grid control and agents, Oracle | 6 Comments »

Forced remove of targets from OEM repository

Posted by John Hallas on April 8, 2009

I blogged about how to remove targets from OEM in “Removing a Grid target” (http://jhdba.wordpress.com/2009/01/07/removing-a-grid-target-from-the-oms/) and I used my own blog entry yesterday to try and force the removal of several entries. The removals appeared to work, but when I tried adding new targets (we were migrating databases from one server to another) I got the error message

 java.sql.SQLException: ORA-20600: The specified target is in the process of being deleted.(target name = SID)(target type = oracle_database)(target guid = 21D8EFD67CCF409D7CDB41DCFD1F9D94)
ORA-06512: at “SYSMAN.TARGETS_INSERT_TRIGGER”, line 36
ORA-04088: error during execution of trigger ‘SYSMAN.TARGETS_INSERT_TRIGGER’
ORA-06512: at “SYSMAN.EM_TARGET”, line 1918
ORA-06512: at “SYSMAN.MGMT_TARGET”, line 2705
ORA-06512: at line 1 

A colleague, Allan Ho, looked at the problem and resolved it by looking in sysman.mgmt_targets_delete

 select * from mgmt_targets_delete;

 To delete the entry use :

 

 begin
    mgmt_admin.delete_target('SID','oracle_database');
 end;

 You can also force deletion by using :

 

 begin
    mgmt_admin.delete_target_internal('SID','oracle_database');
 end;

To generate the forced-delete call for every entry in the table, the dynamic SQL would be

 select 'execute sysman.mgmt_admin.delete_target_internal ('''||target_name||''','''||target_type||'''); ' from sysman.mgmt_targets_delete;
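
As an alternative to generating and then running the statements, the same calls could be made in one pass from an anonymous PL/SQL block. This is a sketch using only the table and procedure named above; it has not been run against a repository and, like the generated statements, it applies to every row in mgmt_targets_delete, so check the contents of the table first.

```sql
-- Sketch: force deletion of every target listed in mgmt_targets_delete.
-- Review the output of "select * from sysman.mgmt_targets_delete" beforehand.
begin
   for r in (select target_name, target_type
             from   sysman.mgmt_targets_delete)
   loop
      sysman.mgmt_admin.delete_target_internal(r.target_name, r.target_type);
   end loop;
end;
/
```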

Posted in Grid control and agents, Oracle | 3 Comments »

Removing alerts from Enterprise manager Grid Control

Posted by John Hallas on February 18, 2009

We have an Enterprise manager Grid control morning check report that indicates issues across all our environments.

The list of checks includes:

Usable flash recovery area less than 20%

Filesystems over 90% used

Databases not backed up within 1 day and not blacked out and not a physical standby

Dataguard status (targets not blacked out)

Alert log errors

Various specific job checks

The alert log query we use is a view based on this query

select distinct t.target_name,
       m.column_label,
       smh.key_value,
       smh.string_value
from   mgmt_targets t,
       mgmt_metrics m,
       mgmt_string_metric_history smh
where  t.target_guid = smh.target_guid
and    m.metric_guid = smh.metric_guid
and    m.statefull = 0
and    m.metric_name like 'alertLog%'

However, that still requires us to log on to each database and click on the alert_log link under diagnostic options on the home page. We then need to purge the alerts.

Looking for a better way to do this I traced an EM session on the OMS repository database whilst purging these alerts. This provided me with the following statement which can be used to delete all outstanding alerts for every system.

delete from mgmt_string_metric_history where metric_guid in (
select metric_guid from mgmt_metrics where statefull=0 and metric_name like 'alertLog%');

This is set up as a script on the OMS database server. There is no commit within the script. This is intentional, so that the count of deleted records can be checked against the number of alerts outstanding on the morning check report before the change is made permanent. It is also intentional that we have not created this as an EM job. The reason is that a job would perform an auto-commit, which does not allow a rollback if the record count is different from the number of alerts.
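
The manual check described above might look like the following session. It is a sketch built from the delete statement in this post, not a copy of our script. The count of raw history rows may not match the report line for line, because the report query uses distinct, so treat the comparison as a sanity check.

```sql
-- 1. See how many rows the delete will touch.
select count(*)
from   mgmt_string_metric_history
where  metric_guid in (
       select metric_guid
       from   mgmt_metrics
       where  statefull = 0
       and    metric_name like 'alertLog%');

-- 2. Run the delete; SQL*Plus reports the number of rows deleted.
delete from mgmt_string_metric_history
where  metric_guid in (
       select metric_guid
       from   mgmt_metrics
       where  statefull = 0
       and    metric_name like 'alertLog%');

-- 3. Compare the figure with the morning check report, then run ONE of these by hand:
-- commit;
-- rollback;
```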

I hope the idea of a morning check report proves useful and the sql statements listed can be utilised.

Posted in Grid control and agents, Oracle | 2 Comments »

 