Oracle DBA – A lifelong learning experience

Purge_stats – a worked example of improving performance and reducing storage

Posted by John Hallas on October 13, 2015

I have written a number of blog entries around managing the SYSAUX tablespace and more specifically the stats tables that are in there.

https://jhdba.wordpress.com/2014/07/09/tidying-up-sysaux-removing-old-snapshots-which-you-didnt-know-existed

https://jhdba.wordpress.com/2009/05/19/purging-statistics-from-the-sysaux-tablespace

https://jhdba.wordpress.com/2011/08/23/939/

This is another post on the same theme. What I offer here are some new scripts to identify what is working and what is not, along with a worked example with real numbers and timings.

Firstly, let's define the problem. This is a large 11.2.0.4 database running on an 8-node Exadata cluster, with lots of daily loads and an ongoing problem keeping statistics up to date. The daily automatic maintenance window runs for 5 hours Mon-Fri and 20 hours on each weekend day. Around 50% of that midweek window is spent purging stats, which doesn't leave a lot of time for much else.
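The window durations themselves can be confirmed from the scheduler dictionary views – a standard query, nothing site-specific:

set lines 150
col window_name form a20
col repeat_interval form a60
select window_name, repeat_interval, duration
from dba_scheduler_windows
order by window_name;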

How do I know that the stats job is dominating the window?

col CLIENT_NAME form a25
COL MEAN_JOB_DURATION form a35
col MAX_DURATION_LAST_30_DAYS form a35
col MEAN_JOB_CPU form a35
set lines 180
col window_name form a20
col job_info form a70

 

SQL> select client_name from dba_autotask_client;
 CLIENT_NAME
----------------------------------------------------------------
sql tuning advisor
auto optimizer stats collection
auto space advisor
 
SQL> select client_name, MEAN_JOB_DURATION, MEAN_JOB_CPU, MAX_DURATION_LAST_30_DAYS from dba_autotask_client;
CLIENT_NAME         MEAN_JOB_DURATION                   MEAN_JOB_CPU                        MAX_DURATION_LAST_30_DAYS
-------------------- ----------------------------------- ----------------------------------- -----------------------------------
auto optimizer stats +000000000 03:37:36.296875000       +000000000 01:35:20.270514323       +000 19:59:56
auto space advisor   +000000000 00:12:49.440677966       +000000000 00:04:45.584406780
sql tuning advisor   +000000000 00:11:53.266666667       +000000000 00:10:22.344666667
 
SQL> select client_name, window_name, job_info from DBA_AUTOTASK_JOB_HISTORY order by window_start_time;
CLIENT_NAME          WINDOW_NAME          JOB_INFO
-------------------- -------------------- ---------------------------------------------------------------------------
auto optimizer stats WEDNESDAY_WINDOW     REASON="Stop job called because associated window was closed"
 collection
auto optimizer stats THURSDAY_WINDOW      REASON="Stop job called because associated window was closed"
 collection
auto optimizer stats FRIDAY_WINDOW        REASON="Stop job called because associated window was closed"
 collection
auto optimizer stats SATURDAY_WINDOW      REASON="Stop job called because associated window was closed"
 collection
auto optimizer stats SUNDAY_WINDOW        REASON="Stop job called because associated window was closed"
 collection

So we know that the window is timing out and that the stats collection job is taking an average of over 3.5 hours. Another way is to see how long the individual purge_stats operations take:

col operation format a60
col start_time format a20
col end_time format a20
col duration format a10
alter session set nls_date_format = 'dd-mon-yyyy hh24:mi';
     select operation||decode(target,null,null,' for '||target) operation
          ,cast(start_time as date) start_time
          ,cast(end_time as date) end_time
          ,to_char(floor(abs(cast(end_time as date)-cast(start_time as date))*86400/3600),'FM09')||':'||
          to_char(floor(mod(abs(cast(end_time as date)-cast(start_time as date))*86400,3600)/60),'FM09')||':'||
         to_char(mod(abs(cast(end_time as date)-cast(start_time as date))*86400,60),'FM09') as DURATION
    from dba_optstat_operations
    where start_time between to_date('20-sep-2015 16','dd-mon-yyyy hh24') and to_date('10-oct-2015 21','dd-mon-yyyy hh24')
    and operation = 'purge_stats';
 
OPERATION                                                   START_TIME           END_TIME             DURATION
------------------------------------------------------------ -------------------- -------------------- ----------
purge_stats                                                 30-sep-2015 17:31   30-sep-2015 20:02   02:31:01
purge_stats                                                  02-oct-2015 17:22   02-oct-2015 20:00   02:37:42
purge_stats                                                 28-sep-2015 18:04   28-sep-2015 20:38   02:34:18
purge_stats                                                 29-sep-2015 17:35   29-sep-2015 20:00   02:25:04
purge_stats                                                 05-oct-2015 17:53   05-oct-2015 20:48   02:55:20
purge_stats                                                 01-oct-2015 17:17   01-oct-2015 19:28   02:11:17
 
6 rows selected.

Another (minor) factor was how much space the optimizer stats were taking in the SYSAUX tablespace – over 500GB, which does seem a lot for 14 days of history.

COLUMN "Item" FORMAT A25
COLUMN "Space Used (GB)" FORMAT 999.99
COLUMN "Schema" FORMAT A25
COLUMN "Move Procedure" FORMAT A40

 

     SELECT occupant_name "Item",
     space_usage_kbytes/1048576 "Space Used (GB)",
     schema_name "Schema",
     move_procedure "Move Procedure"
     FROM v$sysaux_occupants
     WHERE occupant_name in ('SM/OPTSTAT')
     ORDER BY 1;

 


Item                     Space Used (GB) Schema                   Move Procedure
------------------------- --------------- ------------------------- ----------------------------------------
SM/OPTSTAT                         527.17 SYS

I ran through some of the techniques mentioned in the posts above – checking that the retention period was 14 days, that we had no data older than that, and that the tables were being partitioned properly and split into new partitions. There was no obvious answer as to why the purge was taking 2.5 hours every day, so I set up a trace, ran the purge manually and reviewed the tkprof output.
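For reference, checking the retention and the oldest history available is just a couple of standard DBMS_STATS calls:

select dbms_stats.get_stats_history_retention from dual;
select dbms_stats.get_stats_history_availability from dual;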

ALTER SESSION SET TRACEFILE_IDENTIFIER = "JOHN";
ALTER SESSION SET EVENTS '10046 trace name context forever, level 8';
set timing on
exec dbms_stats.purge_stats(sysdate-14);
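Once the purge completed I switched tracing off and ran the raw trace through tkprof; the trace file name below is illustrative and the sort option is just my preference:

ALTER SESSION SET EVENTS '10046 trace name context off';
-- then from the OS prompt on the node that produced the trace
tkprof MYDB1_ora_12345_JOHN.trc purge_stats_tkprof.txt sys=yes sort=exeela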

 

There seemed to be quite a few entries for SQL IDs similar to the one below, each taking around 10 minutes to insert data into a partition.

SQL ID: 66yqxrmjwfsnr Plan Hash: 3442357004
insert /*+ RELATIONAL("WRI$_OPTSTAT_HISTHEAD_HISTORY")
PARALLEL("WRI$_OPTSTAT_HISTHEAD_HISTORY",1) APPEND NESTED_TABLE_SET_SETID
NO_REF_CASCADE */ into "SYS"."WRI$_OPTSTAT_HISTHEAD_HISTORY" partition
("SYS_P894974") ("OBJ#","INTCOL#","SAVTIME","FLAGS","NULL_CNT","MINIMUM",
"MAXIMUM","DISTCNT","DENSITY","LOWVAL","HIVAL","AVGCLN","SAMPLE_DISTCNT",
"SAMPLE_SIZE","TIMESTAMP#","EXPRESSION","COLNAME","SPARE1","SPARE2",
"SPARE3","SPARE4","SPARE5","SPARE6") (select /*+
RELATIONAL("WRI$_OPTSTAT_HISTHEAD_HISTORY")
PARALLEL("WRI$_OPTSTAT_HISTHEAD_HISTORY",1) */ "OBJ#" ,"INTCOL#" ,"SAVTIME"
,"FLAGS" ,"NULL_CNT" ,"MINIMUM" ,"MAXIMUM" ,"DISTCNT" ,"DENSITY" ,"LOWVAL" ,
"HIVAL" ,"AVGCLN" ,"SAMPLE_DISTCNT" ,"SAMPLE_SIZE" ,"TIMESTAMP#" ,
"EXPRESSION" ,"COLNAME" ,"SPARE1" ,"SPARE2" ,"SPARE3" ,"SPARE4" ,"SPARE5" ,
"SPARE6" from "SYS"."WRI$_OPTSTAT_HISTHEAD_HISTORY" partition
("SYS_P894974") ) delete global indexes
call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1     71.29     571.77     498402      26399    1322741           0
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        2     71.29     571.78     498402      26399    1322741           0

Misses in library cache during parse: 1
Optimizer mode: CHOOSE
Parsing user id: SYS   (recursive depth: 2)
Number of plan statistics captured: 1

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         0          0          0  LOAD AS SELECT  (cr=26399 pr=498402 pw=60633 time=571778609 us)
   5398837    5398837    5398837   PARTITION RANGE SINGLE PARTITION: 577 577 (cr=22524 pr=22504 pw=0 time=9824150 us cost=3686 size=399126189 card=5183457)
   5398837    5398837    5398837    TABLE ACCESS FULL WRI$_OPTSTAT_HISTHEAD_HISTORY PARTITION: 577 577 (cr=22524 pr=22504 pw=0 time=9169817 us cost=3686 size=399126189 card=5183457)


Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  db file sequential read                    415262        0.16        458.40
  db file scattered read                        370        0.16         11.96
  direct path write temp                       2707        0.16         36.48
  db file parallel read                           2        0.02          0.04
  direct path read temp                        3742        0.04          8.15
********************************************************************************

I then spent some time reviewing that table and its partitions – they seemed well balanced.

SQL> select partition_name from dba_tab_partitions where table_name = 'WRI$_OPTSTAT_HISTHEAD_HISTORY';

 

PARTITION_NAME
------------------------------
SYS_P909952
SYS_P908538
SYS_P906715
SYS_P905214
P_PERMANENT

 

select PARTITION_NAME, HIGH_VALUE, TABLESPACE_NAME, NUM_ROWS, LAST_ANALYZED
from DBA_TAB_PARTITIONS where table_owner = 'SYS' and table_name = 'WRI$_OPTSTAT_HISTHEAD_HISTORY' order by 1;

 

PARTITION_NAME                 HIGH_VALUE                                                                        TABLESPACE_NAME   NUM_ROWS LAST_ANAL
------------------------------ --------------------------------------------------------------------------------- --------------- ---------- ---------
P_PERMANENT                    TO_DATE(' 2014-02-23 09:00:52', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA   SYSAUX                   0 15-SEP-15
SYS_P905214                    TO_DATE(' 2015-09-29 09:00:52', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA   SYSAUX             5026601 29-SEP-15
SYS_P906715                    TO_DATE(' 2015-09-30 09:00:52', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA   SYSAUX             5178283 30-SEP-15
SYS_P908538                    TO_DATE(' 2015-10-01 09:00:52', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA   SYSAUX             4977859 01-OCT-15
SYS_P909952                    TO_DATE(' 2015-10-09 09:00:52', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA   SYSAUX

 

select trunc(SAVTIME), count(*) from WRI$_OPTSTAT_HISTGRM_HISTORY group by trunc(SAVTIME) order by 1;
TRUNC(SAV   COUNT(*)
--------- ----------
18-SEP-15   22741579
19-SEP-15   24299504
20-SEP-15   22816509
21-SEP-15   24200455
22-SEP-15   24156330
23-SEP-15   24407643
24-SEP-15   23469620
25-SEP-15   23221382
26-SEP-15   25372495
27-SEP-15   23144212
28-SEP-15   23522809
29-SEP-15   24362715
30-SEP-15   25418527
01-OCT-15   24383030

I decided to rebuild the index I_WRI$_OPTSTAT_H_OBJ#_ICOL#_ST, really only because at 192GB it seemed very big for the number of rows it contained.


col Mb form 9,999,999
col SEGMENT_NAME form a40
col SEGMENT_TYPE form a6
set lines 120
select sum(bytes/1024/1024) Mb, segment_name,segment_type from dba_segments
where tablespace_name = 'SYSAUX'
and segment_name like 'WRI$_OPTSTAT%'
and segment_type='TABLE'
group by segment_name,segment_type order by 1 asc
/
 
        MB SEGMENT_NAME                             SEGMEN
---------- ---------------------------------------- ------
         0 WRI$_OPTSTAT_SYNOPSIS_PARTGRP            TABLE
         0 WRI$_OPTSTAT_AUX_HISTORY                 TABLE
        72 WRI$_OPTSTAT_OPR                         TABLE
       385 WRI$_OPTSTAT_SYNOPSIS_HEAD$              TABLE
       845 WRI$_OPTSTAT_IND_HISTORY                 TABLE
     1,221 WRI$_OPTSTAT_TAB_HISTORY                 TABLE

 

select sum(bytes/1024/1024) Mb, segment_name,segment_type from dba_segments
where tablespace_name = 'SYSAUX'
and segment_name like '%OPT%'
and segment_type='INDEX'
group by segment_name,segment_type order by 1
/

        MB SEGMENT_NAME                             SEGMEN
---------- ---------------------------------------- ------
         0 I_WRI$_OPTSTAT_AUX_ST                    INDEX
         0 I_WRI$_OPTSTAT_SYNOPPARTGRP              INDEX
         0 WRH$_PLAN_OPTION_NAME_PK                 INDEX
         0 WRH$_OPTIMIZER_ENV_PK                    INDEX
        43 I_WRI$_OPTSTAT_OPR_STIME                 INDEX
       293 I_WRI$_OPTSTAT_SYNOPHEAD                 INDEX
       649 I_WRI$_OPTSTAT_IND_ST                    INDEX
       801 I_WRI$_OPTSTAT_IND_OBJ#_ST               INDEX
       978 I_WRI$_OPTSTAT_TAB_ST                    INDEX
     1,474 I_WRI$_OPTSTAT_TAB_OBJ#_ST               INDEX
     4,968 I_WRI$_OPTSTAT_HH_ST                     INDEX
     6,807 I_WRI$_OPTSTAT_HH_OBJ_ICOL_ST            INDEX
   129,509 I_WRI$_OPTSTAT_H_ST                      INDEX
   192,304 I_WRI$_OPTSTAT_H_OBJ#_ICOL#_ST           INDEX

That index dropped to 13GB – needless to say I continued with the rest of the indexes.
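The rebuild itself is a plain ALTER INDEX ... REBUILD, something along these lines (a sketch only – the parallel degree is illustrative and was reset later):

alter index I_WRI$_OPTSTAT_H_OBJ#_ICOL#_ST rebuild parallel 8;
alter index I_WRI$_OPTSTAT_H_ST rebuild parallel 8;
-- ...and so on for the remaining I_WRI$_OPTSTAT% indexes
-- (if an index were partitioned you would rebuild it partition by partition instead)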

         MB SEGMENT_NAME                             SEGMENT
---------- ---------------------------------------- ------
         0 WRH$_OPTIMIZER_ENV_PK                    INDEX
         0 I_WRI$_OPTSTAT_AUX_ST                    INDEX
         0 I_WRI$_OPTSTAT_SYNOPPARTGRP              INDEX
         0 WRH$_PLAN_OPTION_NAME_PK                 INDEX
         2 I_WRI$_OPTSTAT_OPR_STIME                 INDEX
        39 I_WRI$_OPTSTAT_IND_ST                    INDEX
        47 I_WRI$_OPTSTAT_IND_OBJ#_ST               INDEX
        56 I_WRI$_OPTSTAT_TAB_ST                    INDEX
        72 I_WRI$_OPTSTAT_TAB_OBJ#_ST               INDEX
       200 I_WRI$_OPTSTAT_SYNOPHEAD                 INDEX
       482 I_WRI$_OPTSTAT_HH_ST                     INDEX
       642 I_WRI$_OPTSTAT_HH_OBJ_ICOL_ST            INDEX
     9,159 I_WRI$_OPTSTAT_H_ST                      INDEX
    12,560 I_WRI$_OPTSTAT_H_OBJ#_ICOL#_ST           INDEX

An overall saving of just over 300GB:

Item                     Space Used (GB) Schema                   Move Procedure
------------------------- --------------- ------------------------- ----------------------------------------
SM/OPTSTAT                         214.12 SYS

 

I wasn't too hopeful that this would benefit the purge routine because, as the trace file above shows, the expensive insert is doing a full scan of the partition rather than using an index.

exec dbms_stats.purge_stats(sysdate-14);
Elapsed: 01:47:17.15

Better, but not by that much, so I rebuilt the tables using a move command with parallel 8 and then rebuilt the indexes again. I intentionally left the tables and indexes with the parallel setting they acquire from the move/rebuild, just for testing purposes.

select 'alter table '||segment_name||' move tablespace SYSAUX parallel 8;'
from dba_segments
where tablespace_name = 'SYSAUX'
and segment_name like '%OPT%'
and segment_type = 'TABLE';

 

alter table WRI$_OPTSTAT_TAB_HISTORY move tablespace SYSAUX parallel 8;
alter table WRI$_OPTSTAT_IND_HISTORY move tablespace SYSAUX parallel 8;
alter table WRI$_OPTSTAT_AUX_HISTORY move tablespace SYSAUX parallel 8;
alter table WRI$_OPTSTAT_OPR move tablespace SYSAUX parallel 8;
alter table WRH$_OPTIMIZER_ENV move tablespace SYSAUX parallel 8;
alter table WRH$_PLAN_OPTION_NAME move tablespace SYSAUX parallel 8;
alter table WRI$_OPTSTAT_SYNOPSIS_PARTGRP move tablespace SYSAUX parallel 8;
alter table WRI$_OPTSTAT_SYNOPSIS_HEAD$ move tablespace SYSAUX parallel 8;

Then I rebuilt the indexes as above – the table moves leave them in an UNUSABLE state, so this step is mandatory.
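The rebuild statements can be generated the same way as the moves – again just a sketch, using the same dba_segments pattern:

select 'alter index '||segment_name||' rebuild parallel 8;'
from dba_segments
where tablespace_name = 'SYSAUX'
and segment_name like '%OPT%'
and segment_type = 'INDEX';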

Now we see the space benefits. Space is down to 182GB from 527GB – around 345GB saved.


 

Item                      Space Used (GB) Schema                   Move Procedure
------------------------- --------------- ------------------------- ----------------------------------------
SM/OPTSTAT                         182.61 SYS

The first purge (back to sysdate-14 again, so with no data left to purge) was instantaneous:

SQL> exec dbms_stats.purge_stats(sysdate-14);
Elapsed: 00:00:00.91

 

The second, which purged a day's worth of data (back to sysdate-13), took only 9 minutes:

SQL> exec dbms_stats.purge_stats(sysdate-13);
Elapsed: 00:09:40.65

I then put the tables and indexes back to a degree of 1 (the reset commands are sketched below) and ran a purge of another day's worth of data, which showed a consistent timing:

SQL> exec dbms_stats.purge_stats(sysdate-12);
Elapsed: 00:09:32.05
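For completeness, putting the degree back to 1 was just an ALTER per object – a sketch, covering the same set of tables and indexes as above:

alter table WRI$_OPTSTAT_TAB_HISTORY noparallel;
alter index I_WRI$_OPTSTAT_H_OBJ#_ICOL#_ST noparallel;
-- ...and likewise for the other moved tables and rebuilt indexes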

It is clear from the number of posts on managing optimizer stats and their history that ongoing maintenance and management is required, and I hope my set of posts helps readers to do just that.

The following website has some very good information on it: http://maccon.wikidot.com/sysaux-stats. I must also credit my colleague Andy Challen for generating some good ideas and providing a couple of scripts – talking through a plan of action does seem to be a good way of working.

 

4 Responses to “Purge_stats – a worked example of improving performance and reducing storage”

  1. Dom Brooks said

    The stats history is of limited value most of the time.
    I can count on 0 fingers the number of times it has been genuinely unmissable to me.

    When it is causing problems (space or activity related), I favour shooting first, asking questions later.
    exec DBMS_STATS.PURGE_STATS(DBMS_STATS.PURGE_ALL);
    provided with the official caveat of
    “This option is planned to be used as a workaround on urgent cases and under the advice of Support…”
    🙂
