Oracle DBA – A lifelong learning experience

Tidying up SYSAUX – removing old snapshots which you didn’t know existed

Posted by John Hallas on July 9, 2014

This post runs in parallel with a post I made 5 years ago and which is still the most read one on  this blog

It will show how to reduce space taken up in the  SYSAUX tablespace by because you have many more AWR snapshots being retained than you think.

Firstly lets take an example database and we can see that we are using 92Gb  of space


set linesize 120
set pagesize 100
col ash form a30
col retention form a30
col snap form a30

COLUMN "Space Used (GB)" FORMAT 999.99
COLUMN "Move Procedure" FORMAT A40

SELECT  occupant_name "Item",
space_usage_kbytes/1048576 "Space Used (GB)",
schema_name "Schema",
move_procedure "Move Procedure"
FROM v$sysaux_occupants
WHERE occupant_name = 'SM/AWR'


Item                     Space Used (GB) Schema                   Move Procedure
------------------------- --------------- ------------------------- ----------------------------------------

SM/AWR                             91.88                                      SYS

Now we are looking to see how long we should be retaining AWR snapshots for and what we actually have

select sysdate - a.sample_time ash,
sysdate - s.begin_interval_time snap,
from sys.wrm$_wr_control c,
select db.dbid,
min(w.sample_time) sample_time
from sys.v_$database db,
sys.Wrh$_active_session_history w
where w.dbid = db.dbid group by db.dbid
) a,
select db.dbid,
min(r.begin_interval_time) begin_interval_time
from sys.v_$database db,
sys.wrm$_snapshot r
where r.dbid = db.dbid
group by db.dbid
) s
where a.dbid = s.dbid
and c.dbid = a.dbid;


ASH                           SNAP                           RETENTION

—————————— —————————— ——————————

+000000875 22:22:41.045       +000000030 12:22:28.323       +00030 00:00:00.0

We want to keep 30 days worth of snapshots – we have set the retention period to that  – but we have 875 days worth of active session history.

I rather smugly said let’s show an example and then I produce  a schema with nearly 3 years worth of snapshots being maintained but there is no trickery involved. Looking around at other systems I can find a few similar examples and it is always down to the same issue. Let’s look further.

select table_name, count(*)
from dba_tab_partitions
where table_name like 'WRH$%'
and table_owner = 'SYS'
group by table_name
order by 1;


TABLE_NAME                       COUNT(*)
------------------------------ ----------
WRH$_DB_CACHE_ADVICE                    2
WRH$_DLM_MISC                           2
WRH$_EVENT_HISTOGRAM                    2
WRH$_FILESTATXS                         2
WRH$_INST_CACHE_TRANSFER                2
WRH$_INTERCONNECT_PINGS                 2
WRH$_LATCH                              2
WRH$_LATCH_CHILDREN                     2
WRH$_LATCH_PARENT                       2
WRH$_OSSTAT                             2
WRH$_PARAMETER                          2
WRH$_ROWCACHE_SUMMARY                   2
WRH$_SEG_STAT                           2
WRH$_SERVICE_STAT                       2
WRH$_SERVICE_WAIT_CLASS                 2
WRH$_SGASTAT                            2
WRH$_SQLSTAT                            2
WRH$_SYSSTAT                            2
WRH$_SYSTEM_EVENT                       2
WRH$_SYS_TIME_MODEL                     2
WRH$_TABLESPACE_STAT                    2
WRH$_WAITSTAT                           2

24 rows selected.

There is the problem, all the WRH$ data is held in 2 partitions and the overnight house-keeping job does not have time to finish it’s tidy-up before it’s 15 minutes of fame is over again for another 24 hours.

Two components of Server Manageability (SM) components that reside in the SYSAUX tablespaces cause the problem. The components involved are the Automatic Workload Repository (AWR) and Optimizer Statistics History (OPTSTAT). Both of these components have retention periods associated with their data, and the MMON process should run nightly, as part of the scheduled maintenance tasks, to purge data that exceeds these retention periods. From version 11G onwards, the mmon purging process has been constrained to a time-limited window for each of the purges, if this window is exceeded the purging stops and an ORA-12751 error is written to an m000 trace file.

For the AWR data, held in tables with names commencing with WRH$, the probable cause is due to fact that a number of the tables are partitioned. New partitions are created for these tables as part of the mmon process. Unfortunately, it seems that the partition splitting process is the final task in the purge process. As the later partitions are not split they end up containing more data. This results in partition pruning within the purge process becoming less effective.

For the OPTSTAT data, held in tables with names commencing with WRI$, the cause is more likely to be related to the volume of data held in the tables. WRI$ tables hold historical statistical data for all segments in the database for as long as specified by the stats history retention period. Thus, if there database contains a large number of tables with a long retention period – say 30 days, then the purge process will have an issue trying to purge all of the old statistics within the specified window.

I also think that two scenarios lead to this problem

1) SYSAUX running out of space, which I know has happened on at least one occasion and the volume of data left in a partition is too much to handle in a constrained time-window and so continues to grow in the same partition.

2) If the database is shut down over the period of the maintenance task and again the volume in the partition becomes too large to handle in a short-time.


The resolution to this problem is split into two stages.

Firstly a temporary manual fix is required to rectify the partition splitting failure. A job needs to scheduled on the database, to be run as the sys user which executes the following code:This job will force the splitting of partitions for the WRH$ tables, and should be scheduled to run on a daily basis until the number of partitions per table exceeds the AWR retention period.

 execute immediate 'alter session set "_swrf_test_action" = 72';

What I do is drop the retention period to 8 days ( or maybe less) and then run this procedure once a day. It can take up to an hour but afterwards you can see the additional partitions have been created.

TABLE_NAME                       COUNT(*)
------------------------------ ----------
WRH$_DB_CACHE_ADVICE                    3
WRH$_DLM_MISC                           3
WRH$_EVENT_HISTOGRAM                    3
WRH$_FILESTATXS                         3
WRH$_INST_CACHE_TRANSFER                3
WRH$_INTERCONNECT_PINGS                 3
WRH$_LATCH                              3
WRH$_LATCH_CHILDREN                     2
WRH$_LATCH_PARENT                       2
WRH$_OSSTAT                             3
WRH$_PARAMETER                          3
WRH$_ROWCACHE_SUMMARY                   3
WRH$_SEG_STAT                           3
WRH$_SERVICE_STAT                       3
WRH$_SERVICE_WAIT_CLASS                 3
WRH$_SGASTAT                            3
WRH$_SQLSTAT                            3
WRH$_SYSSTAT                            3
WRH$_SYSTEM_EVENT                       3
WRH$_SYS_TIME_MODEL                     3
WRH$_TABLESPACE_STAT                    3

Once the correct number of partitions have been created the old data may be automatically purged by the mmon process. If this does not occur, it will be necessary to manually purge the data from the AWR tables.

Firstly turn off the production of snapshots for the duration of this activity:


get the dbid for this database

select dbid from v$database;

<strong>-- get current snap interval (in minutes)</strong>

select extract( day from snap_interval) *24 *60

       + extract( hour from snap_interval) *60

       + extract( minute from snap_interval) snap_interval

       from wrm$_wr_control where dbid = ;

-- disable snaps by setting interval to 0

exec dbms_workload_repository.modify_snapshot_settings(interval=> 0, dbid => <dbid>)

Then determine the maximum snapshot to be purged from the database by querying the maximum snap-id outside of the retention period:

select max(snap_id) max_snap

from wrm$_snapshot

where begin_interval_time

< (sysdate - (select retention

from wrm$_wr_control

where dbid = ));

This will return the maximum snapshot to be purged

The AWR tables will not purge if baselines exist, so check and remove as required. The following should return only 1 row called SYSTEM_MOVING_WINDOW:

select baseline_name,creation_time

from dba_hist_baseline;

If other baselines are found they will need to be removed:

exec dbms_workload_repository.drop_baseline(baseline_name => <baseline>);

If we look at the current retention period in this code it is set to 15 days, so we set it to 8 days

select d.dbid,w.snap_interval,w.retention from DBA_HIST_WR_CONTROL w, v$database d where w.dbid = d.dbid;

begin</strong> DBMS_WORKLOAD_REPOSITORY.MODIFY_SNAPSHOT_SETTINGS(129600,0); <strong>end</strong>;

Then we purge the snapshots ( – in ranges if there a are a lot as in this example). This can take several hours

EXEC dbms_workload_repository.drop_snapshot_range(low_snap_id=>100000, high_snap_id=>114159);

EXEC dbms_workload_repository.drop_snapshot_range(low_snap_id=>116261, high_snap_id=>116265);

Once all that is complete it is just a  matter of tidying up the indexes

ASH    SNAP    RETENTION —————————————————————————

+000000008 23:47:01.421    +000000008 23:46:55.865    +00008 00:00:00.0

Item                     Space Used (GB) Schema                   Move Procedure
------------------------- --------------- ------------------------- ----------------------------------------
SM/AWR                             93.70 SYS
select 'alter index '||segment_name||' rebuild online parallel (degree 4);' from dba_segments where tablespace_name= 'SYSAUX' and segment_name like 'WRH$_%' and segment_type='INDEX' order by segment_name;

double check for unusable indexes


select 'alter index '||s.segment_name||' rebuild online parallel (degree 4);'

from dba_segments s,  dba_indexes i where s.tablespace_name= 'SYSAUX'

and s.segment_name like 'WRH$_%'

and s.segment_type='INDEX'

and i.status = 'UNUSABLE'

and i.index_name = s.segment_name

order by s.segment_name


select 'alter table '||segment_name||' move tablespace sysaux ;' from dba_segments where tablespace_name= 'SYSAUX' and segment_name like 'WRH$_%' and segment_type = 'TABLE'   order by segment_name
Item                     Space Used (GB) Schema                  Move Procedure
------------------------- --------------- ------------------------- ----------------------------------------
SM/AWR                             10.05 SYS


I hope that has been useful and I would be pleased to hear if others see the same issues on some databases whereby the actual data retained is much lonher that the snapshot retention period would suggest

20 Responses to “Tidying up SYSAUX – removing old snapshots which you didn’t know existed”

  1. […]; […]

  2. Laura Sallwasser said

    Exactly the problem I am facing with every RAC RDBMS I have. These are small, development databases that started out at 5 GB. They have no clients, an 8 day retention period, yet SYSAUX is now 6 GB and keeps running out of space. ASH has 125 days of “hidden” snapshots as of today. I will work through the steps and post an update as soon as possible. Thank you very much for this posting.

    Laura Sallwasser

    • John Hallas said

      I think that the running out of space might have been the first event that caused the issue and that caused the new partition to stop being created. I am looking at how to get better tracing on the mmon process and will update the blog when I find a way

  3. Laura Sallwasser said

    The oldest snapshot was 1. I disabled snapshots, then executed dbms_workload_repository.drop_snapshot_range four times, different ranges, oldest to newest. For example, the low range, high range was (1, 200). The first execution ran for over an hour. After rebuilding the AWR indexes, there’s about 1 GB free space in SYSAUX.

    The largest segments in SYSAUX are WRH$ objects, like WRH$_EVENT_HISTOGRAM. This table returns snap_ids that were in the drop_snapshot_range. It and other WRH$ tables have many records with old snap_ids.

    Is there any way to purge these records?
    Is there any way to minimize the amount of redo used when drop_snapshot_range runs? It chewed through 32 GB in a little over an hour.

    I’m trying to work up some steps to manage the problem across our databases.

    Thank you again for this excellent post.

    • John Hallas said

      Laura, I think you said these were development databases – my solution would be to drop AWR and recreate.
      There are a couple of blog entries available but basically you run 2 commands

      I am surprised everything old is left in the histogram table if you have purged past the snap_id.
      What you could do is backup your database , take or archiving, drop AWR or re-run the purge again this time not generating archive log and the backup again later

      Let me know how you get on


  4. Laura Sallwasser said

    Hi, John

    I have a few (fresh installs, not upgrades) databases that have the same SYSAUX problem. All are RAC. I’ve just run the same steps on another one of these. The active session history had snapshot data back to the first snapshot, #1. Running “drop_snapshot_range” did clear WRH$_ACTIVE_SESSION_HISTORY, but not the other WRH$* tables. I rebuilt the WRH$* index partitions and indexes, and have about 1 GB of free space in SYSAUX. So far, this problem is affecting only; I have dozens of earlier 11gR2 releases, and AWR content seems reasonable. All are RAC. Earlier versions have partitions (more than the retention days), and they don’t have old snapshot detail. I have an open SR on the problem, but I’m not optimistic. Oracle has been ignoring SRs for a while now.

    I will plan to drop and recreate the AWR objects as you suggest. WRH$* is supposed to be partitioned on DBID, SNAP_ID. When I run ‘alter session set “_swrf_test_action” = 72’, how is the snap_id range for the new partition chosen? After I restore the AWR objects, I want to stay on top of creating the proper number of partitions. Maybe this will help keep SYSAUX clear then.

    Thank you again,


    • John Hallas said

      I believe the new partition goes from current snap_id + 1 onwards. I will have to check. We only have one database which is a sandpit environment. I will have a look at it tomorrow and see if it exhibits the same problems – that is on OEL 5.5 or 5.8 – we have both versions in use. Please keep me informed of the progress of the SR. It sounds as if there is a bug in there somewhere.

      • John Hallas said

        It looks like my database exhibits the same issues as yours.
        The partition is named with the max value of the snap_id so before I ran the
        1* select partition_name,high_value from dba_tab_partitions where table_name = ‘WRH$_EVENT_HISTOGRAM’
        SQL> /



        1303786922, MAXVALUE

        and after running the _swrf_test_action” = 72 command I had an additional partition with adbid and a snap_id in the name. However the snap_id is a bit confusing as it shows 414 when the actual max(snap_id) is 411.
        1303786922, 414

        1303786922, MAXVALUE

        Moving the snap_id forward using exec dbms_workload_repository.create_snapshot(); the high value does not change but the next partition we create has a name/value of WRH$_EVENT__1303786922_419 which is again 3 more than the highest snap_id


  5. Laura Sallwasser said

    Hi, John
    It’s been a few days, and I’m still struggling with Oracle on this issue. They recommended patch 14084247, which I applied to one RAC. It separates the creation of partitions from the auto-purge job. However, I believe it’s disabled the actual auto-purge job itself. That’s the struggle; getting Oracle to confirm this has happened.

    The day I applied the patch was the last time an MMON trace file was written with an Auto-Purge action name. However, wrm$_wr_control.most_recent_purge_time keeps incrementing, as if the purge ran. Then I look at the minimum snapshot ID in wrh$_active_session_history and it’s days older than the 8 day retention period.

    Right now it looks like the purge job is taking care of wrm$_snapshot and wrm$_snapshot_details only.

    Many other segments, among them wrh$_event_histogram, wrh$_latch, wrh$_sqlstat, wrh$_parameter, and wrh$_systat, have records that are not being cleaned up. I know I can drop AWR or clear some of the segments manually, but in a few weeks, I’ll be right back where I started.

    Which segments is auto-purge supposed to clean? Is there a list somewhere I can reference?

    I will post this to the MOS Community in a few minutes. Up until you tested your, I thought it was something unique to our environment.

    Thank you again for the follow-up,


  6. Laura Sallwasser said

    Hi, John

    You were spot-on when you diagnosed the problem as missing partitions. Once new partitions are added daily, auto-purge works as advertised. Patch 14084247 fixed the partitions problem. Without it, the lone partition in each WRH$ table just keeps accumulating content, never aging out. One weird thing on my databases: the patch disabled the MMON Auto-Purge trace file generation. I know auto-purge execution can be checked from the database, but I prefer having the trace file, too.

    This worked on my databases:
    1. Applied patch 14084247
    2. Restored the creation of trace files for MMON auto-purge
    module_name=>’MMON_SLAVE’,action_name=>’Auto-Purge Slave Action’,
    3. Waited 3 days; confirmed 3 new partitions were added to all the WRH$ tables, one per day
    4. Changed the window size and snapshot retention to 3 days

    The fourth run of Auto-Purge dropped the old, huge partitions. SM/AWR (v$sysaux_occupants) went from 4,200 MB to 425 MB. Free space in SYSAUX went from 140 MB to 3,885 MB.

    Other Notes:
    1. I was willing to lose 5 days’ history in development. For production, I would wait 9 days post-patch to see reclaimed free space. This would force me to increase the size of SYSAUX while waiting to get one partition a day.
    2. Free space is reclaimed in SYSAUX, but I cannot shrink the datafiles; the high-water mark is out there. I have found no solution to shrinking the SYSAUX datafiles short of recreating the database.
    3. I set autoextend to FALSE for the SYSAUX tablespace. AWR is supposed to reuse the storage, but I have seen two ORA-01688 errors on WRH$ partitions in the last week. However, the partition count and snapshots per partition are consistent.

    Last observation: I tried different tests while troubleshooting. Rebuilding objects was time consuming; I have too many databases. Running dbms_workload_repository.drop_snapshot_range generated a ton of redo, causing problems in the FRA. It affected only the main snapshot segments, leaving behind event_histogram, latch, seg_stat, etc. Adding partitions daily, either with set “_swrf_test_action” or with the patch, keeps SYSAUX tidy, automatically, for the long term.



    • John Hallas said

      I am really glad you persevered and resolved your problems. The notes you have provided make a welcome addition to the original blog content

  7. Dominik Chovan said

    Hello John,

    I went throught your tutorial and it seems that it fits my problem.
    I’m just tring to test it firts on our test database Oracle 11.2g
    But I’m strugling with problem.
    When I’m trying to execute commnad :
    execute immediate ‘alter session set “_swrf_test_action” = 72’;

    Database throws error:
    ORA-02097: Parameter kann nicht verändert werden, da angegebener Wert ungültig ist (parameter cannot be modified because specified value is invalid)

    I didn’t find correct value and over the internet there is same value as you have used in this tutorial.
    Can you help mi with this problem?

    Thanks in advance.

    • John Hallas said

      Just done the command on 11.0.07. and with no problem.
      I think you might need to check that the quotes are correct – I had to change a couple to get the command working. The old Word/Windows issue

      SQL> begin
      execute immediate ‘alter session set “_swrf_test_action” = 72’;
      end; 2 3
      4 /

      PL/SQL procedure successfully completed.

      • Dominik Chovan said

        Hello John,

        the problem was than I was trying to run this command under system not under sys user. It is working now.

        For what I’m wondering now is the command exec dbms_workload_repository(1, , ) ..dbms_workload_repository is not procedure, how can i run this command?

        thanks in advance.

      • John Hallas said

        I think that line was incorrect and not needed – It did not have a procedure name attached to the package. I am surprised nobody else has noticed it before as it is a well read blog entry. I have removed it now.

      • Dominik Chovan said

        Hello John,

        Firstly I want to thank you for quick responses.
        But I had another question/s
        Hope I’m not boddering you.

        You said:
        “Firstly a temporary manual fix is required to rectify the partition splitting failure. A job needs to scheduled on the database, to be run as the sys user which executes the following code:This job will force the splitting of partitions for the WRH$ tables, and should be scheduled to run on a daily basis until the number of partitions per table exceeds the AWR retention period.

        execute immediate ‘alter session set “_swrf_test_action” = 72’;

        How do I know when the number of partitions exceeds the AWR retention period? I’m not sure if I understand this step correctly.
        I run this script twice (first on Monday and second today, this week)
        I see folowing result:
        WRH$_DLM_MISC 2
        WRH$_FILESTATXS 13
        WRH$_LATCH 4
        WRH$_OSSTAT 4
        WRH$_PARAMETER 4
        WRH$_SEG_STAT 4
        WRH$_SGASTAT 4
        WRH$_SQLSTAT 4
        WRH$_SYSSTAT 4
        WRH$_WAITSTAT 4

        I thought If I split partitions for fist time and then I will manually purge old snapshots, MMOM process will start to work automaticaly. Seems it does not work in this way.
        Should I hold to your workaround step by step?

        Anyway is it possible somehow to check if MMOM process is working finally?
        Or just check trace file for ORA error (not sure now what was the exact error for MMOM job).

        Many thanks for your effort.

      • John Hallas said

        To see if MMON process is working look in the trace file it produces and see if there are any ORA-12751 errors. If so then there is more work than the MMON process can do it it’s limited time.
        MoS Note 761298.1 gives some ideas about this and shows how to generate an errrorstack trace against the MMON process with alter system set events ‘12751 trace name errorstack level 3’;

        What is the output of the second query in my blog showing what your retention period is and how many day’s worth of snapshots you actually have?

        What I did was login to the database each day and run the command
        execute immediate ‘alter session set “_swrf_test_action” = 72’;

        Ideally if you have a retention period of 30 days you want to see 30 in that count

        I hope that helps

  8. […] Tidying up SYSAUX – removing old snapshots which you didn't know existed […]

  9. Karen said

    Hi John,
    I have a couple years of AWR data in an database.
    —————————— —————————— ——————–
    +000000708 04:14:30.206 +000000008 15:09:17.857 +00008 00:00:00.0

    I did the alter session set “_swrf_test_action” = 72;, and now I have new partitions.

    Since I already made the mistake of running the process to drop a snapshot range, and generated a whole bunch of redo, I’m hoping it will work better to just resolve this via this method of creating new partitions, and letting Oracle do the cleanup.

    Without applying any patches….
    Do I need to create new partitions every day for 8 or so days?
    And then oracle will “drop” the oldest partition(s) that contain only data older than 8 days?
    Will this process not create undo then? Or will it actually delete data?

    Would I have to keep creating new partitions, from time to time?

    Thanks for any advice you can provide.

  10. Karen said

    Hi John,
    I have a couple years of AWR data in an database. (and growing)
    —————————— —————————— ——————–
    +000000708 04:14:30.206 +000000008 15:09:17.857 +00008 00:00:00.0

    I did the alter session set “_swrf_test_action” = 72;, and now I have new partitions.

    Since I already made the mistake of running the process to drop a snapshot range, and generated a whole bunch of redo, I’m hoping it will work better to just resolve this via this method of creating new partitions, and letting Oracle do the cleanup.

    Without applying any patches….
    Do I need to create new partitions every day for 8 or so days?
    And then oracle will “drop” the oldest partition(s) that contain only data older than 8 days?
    Will this process not create undo then? Or will it actually delete data?

    Would I have to keep creating new partitions, from time to time?

    I know I can do things like dropping and recreating the repository. And I probably would for this database. But we have others that are doing the same thing, so I want some options on how it can be handled.

    We are getting this error, in m000 trace once a day. We’ll be upgrading to 12c soon, and we’re not planning to do any additional patching before then.

    *** SERVICE NAME:(SYS$BACKGROUND) 2016-03-03 00:23:58.883
    *** MODULE NAME:(MMON_SLAVE) 2016-03-03 00:23:58.883
    *** ACTION NAME:(Auto-Purge Slave Action) 2016-03-03 00:23:58.883

    *** KEWROCISTMTEXEC – encountered error: (ORA-06525: Length Mismatch for CHAR or RAW data
    ORA-06512: at “SYS.DBMS_STATS”, line 29022
    ORA-06512: at line 1
    *** SQLSTR: total-len=93, dump-len=93,
    STR={begin dbms_stats.copy_table_stats(‘SYS’, :bind1, :bind2, :bind3, flags=>1, force=>TRUE); end;}

    Thanks for any advice you can provide.

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: