SQL Plan Management – how to really utilize it, not just implement it
Posted by John Hallas on August 28, 2013
The title of this post is intentionally slightly unclear, and hopefully it has intrigued people into viewing the post and, even better, adding their comments.
SQL Plan Management has been around since 11g was released back in 2007. It does not require the Tuning Pack, so the DBMS_SPM package can be used without additional licensing, but if the SQL Tuning Advisor is used to generate new profiles via a tuning task then that does require a Tuning Pack licence.
There are plenty of articles available showing how to use SPM to manage a SQL statement and ensure it has the best execution plan. What I am intrigued by is how to use it on an ongoing basis, across a number of databases and across all standard (non ad-hoc) code – in other words, how to implement a site standard for the use of SPM. Such a standard needs to be applicable to the databases that have two dedicated DBAs monitoring activity continuously (type 1), the databases that we all have that pretty much run by themselves and need little maintenance (type 3), and the multiple systems that lie in between those two types (type 2).
A colleague, Steve Senior, has produced a flow chart of how we might deliver SPM across all systems, but the stumbling block is how we manage changes on an ongoing basis: plans that need to change because changing statistics (derived from data changes) require new execution plans; new plans that will probably need to be evolved after code changes; and the inclusion of totally new SQL statements, perhaps based on new tables which have been added to the schema.
The overview of the flow chart, which I have shown below for convenience, is that we take an existing live system and capture baselines. That will probably be done by gathering plans from AWR rather than by setting the init.ora parameter optimizer_capture_sql_plan_baselines=TRUE, because we are concerned about performance bugs around using that on a production system (MOS note 1304775.1, for instance). Perhaps that concern is unnecessary for types 2 and 3 of the three database types I mentioned earlier, but as I say, I am more interested in deriving an overall policy with minimal exceptions than in taking an ad-hoc approach for each system. Anybody who has read many of my posts on here, or heard me speak at UKOUG events, will know that I am quite keen on having a standard approach across systems and documenting exceptions, rather than taking the free-for-all approach which I have seen on some sites.
Once we have gathered the plans, we create a SQL Tuning Set (SQLTS) from the AWR repository, perhaps focusing only on the application schema(s) we are interested in, and then load the plans into an SPM baseline. We disable the bad plans, as otherwise all plans will be accepted and enabled by default, set the init.ora parameter optimizer_use_sql_plan_baselines=TRUE, and then flush the shared pool to enforce the baselines. I will come back to the point about getting rid of bad plans, as that is another area which makes it difficult to standardise.
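A sketch of those steps is below. All object names, schema names, snapshot ids and the SQL handle are placeholders I have made up for illustration, not values from a real system:

```sql
-- 1. Create a SQL tuning set and fill it with plans from AWR,
--    restricted to the application schema
DECLARE
  l_cursor DBMS_SQLTUNE.SQLSET_CURSOR;
BEGIN
  DBMS_SQLTUNE.CREATE_SQLSET(
    sqlset_name => 'SPM_CAPTURE_STS',
    description => 'Initial bulk capture from AWR');

  OPEN l_cursor FOR
    SELECT VALUE(p)
    FROM   TABLE(DBMS_SQLTUNE.SELECT_WORKLOAD_REPOSITORY(
                   begin_snap   => 100,   -- first snapshot of the capture window
                   end_snap     => 200,   -- last snapshot of the capture window
                   basic_filter => 'parsing_schema_name = ''APPUSER''')) p;

  DBMS_SQLTUNE.LOAD_SQLSET(
    sqlset_name     => 'SPM_CAPTURE_STS',
    populate_cursor => l_cursor);
END;
/

-- 2. Load the tuning set into SPM baselines; plans loaded this way
--    are ACCEPTED and ENABLED by default
DECLARE
  l_plans PLS_INTEGER;
BEGIN
  l_plans := DBMS_SPM.LOAD_PLANS_FROM_SQLSET(sqlset_name => 'SPM_CAPTURE_STS');
END;
/

-- 3. Disable a known-bad plan (the handle and plan name come from
--    DBA_SQL_PLAN_BASELINES; the values here are placeholders)
DECLARE
  l_ret PLS_INTEGER;
BEGIN
  l_ret := DBMS_SPM.ALTER_SQL_PLAN_BASELINE(
             sql_handle      => 'SQL_1234567890abcdef',
             plan_name       => 'SQL_PLAN_placeholder',
             attribute_name  => 'ENABLED',
             attribute_value => 'NO');
END;
/

-- 4. Make the optimizer honour the baselines, then flush the shared pool
ALTER SYSTEM SET optimizer_use_sql_plan_baselines = TRUE;
ALTER SYSTEM FLUSH SHARED_POOL;
```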
Diagram 2 covers the scenario whereby your system only keeps AWR data for a short period, say 7 days, so the first bulk load does not capture all the SQL that has been run over a full monthly workload. This flow chart shows the process that would be used for each of the three subsequent weeks. The process is pretty much the same, but the load from the capture into the SPM baseline would only add new SQL that had not been captured previously.
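A weekly top-up might look like the sketch below (the tuning set name and snapshot range are again placeholders). Because baseline plans are identified by SQL signature and plan hash, re-loading a plan that already exists does not duplicate it, so each weekly load only adds genuinely new statements and plans:

```sql
-- Weekly incremental capture: only that week's AWR snapshots
DECLARE
  l_cursor DBMS_SQLTUNE.SQLSET_CURSOR;
  l_loaded PLS_INTEGER;
BEGIN
  DBMS_SQLTUNE.CREATE_SQLSET(
    sqlset_name => 'SPM_CAPTURE_WK2',
    description => 'Week 2 incremental capture');

  OPEN l_cursor FOR
    SELECT VALUE(p)
    FROM   TABLE(DBMS_SQLTUNE.SELECT_WORKLOAD_REPOSITORY(
                   begin_snap   => 201,   -- placeholder: first snapshot of the week
                   end_snap     => 300,   -- placeholder: last snapshot of the week
                   basic_filter => 'parsing_schema_name = ''APPUSER''')) p;

  DBMS_SQLTUNE.LOAD_SQLSET(
    sqlset_name     => 'SPM_CAPTURE_WK2',
    populate_cursor => l_cursor);

  -- Plans already present in a baseline are skipped on load
  l_loaded := DBMS_SPM.LOAD_PLANS_FROM_SQLSET(sqlset_name => 'SPM_CAPTURE_WK2');
  DBMS_OUTPUT.PUT_LINE(l_loaded || ' new plans loaded');
END;
/
```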
Finally, there is an ongoing process once you are happy with the baseline. This is not about capture but about the evolution of plans, and deciding when to review a plan and then accept it. I understand that in 12c there is an automatic tuning job that does this as part of the maintenance plan. As an aside, that is worrying in itself, as in 11g the automatic maintenance window does not have enough time to do its work, thereby leaving the SYSAUX tablespace in a bit of a mess – but that is a different post.
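In 11g that review-and-accept step can be driven manually with DBMS_SPM.EVOLVE_SQL_PLAN_BASELINE. A sketch, with the SQL handle as a placeholder, that produces a verification report without auto-accepting anything:

```sql
SET SERVEROUTPUT ON
SET LONG 1000000

DECLARE
  l_report CLOB;
BEGIN
  l_report := DBMS_SPM.EVOLVE_SQL_PLAN_BASELINE(
                sql_handle => 'SQL_1234567890abcdef',  -- placeholder handle
                verify     => 'YES',  -- test-execute candidate plans first
                commit     => 'NO');  -- report only; accept manually after review
  DBMS_OUTPUT.PUT_LINE(l_report);
END;
/
```

Running it with commit => 'NO' gives you the report to feed into whatever change control process you settle on; a second run with commit => 'YES' would do the actual acceptance.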
So to the crux of the matter. The above is a rough and ready guide to implementing SPM (if you want a much better view on that, read Maria Colgan's 4-part blog on the subject), but how do the practicalities actually work? Oracle make provision for a performance testing database whereby you can run all your normal load under test conditions, capture the plans into a staging table and import them into the production environment. Some of the type 1 databases mentioned earlier might have that capability, but the reality is that most databases we manage do not have a totally representative testing and load environment, therefore we have to do whatever we do on production.
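That transport mechanism is the DBMS_SPM staging-table API. A sketch, where the table name SPM_STGTAB and owner APPUSER are placeholders:

```sql
-- On the performance testing database: create a staging table
-- and pack the baselines into it
BEGIN
  DBMS_SPM.CREATE_STGTAB_BASELINE(
    table_name  => 'SPM_STGTAB',
    table_owner => 'APPUSER');
END;
/

DECLARE
  l_plans PLS_INTEGER;
BEGIN
  l_plans := DBMS_SPM.PACK_STGTAB_BASELINE(
               table_name  => 'SPM_STGTAB',
               table_owner => 'APPUSER');
END;
/

-- Move the staging table to production (Data Pump or a database link),
-- then unpack the baselines there
DECLARE
  l_plans PLS_INTEGER;
BEGIN
  l_plans := DBMS_SPM.UNPACK_STGTAB_BASELINE(
               table_name  => 'SPM_STGTAB',
               table_owner => 'APPUSER');
END;
/
```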
The other point that strikes me is that when new code is dropped on you, you need to have a process to baseline it and decide on the appropriate plan.
Finally, there is an ongoing review of plans and the setting of a new baseline. The original point of this post was about just that: how much of this ongoing management can be defined up front, what is your process for reviewing and adding new plans to the baseline, and what change control processes would you use?
In a way I am almost tempted to argue that you apply your SQL baseline at the implementation of your new system, and the only way you alter it after that is when you notice poor performance, review the plans, see that there is a better plan and tell SPM to use it. Is that any better than the current method of waiting for poor performance and then adding a SQL profile? Obviously SPM offers much more than that, but the reality is that you are still fixing a plan based on a few executions of the SQL, ignoring the fact that the plan might not be the best plan for other SQL statements, and that the plan will age as data volumes and characteristics change and therefore will not be the optimum plan all the time.
I am really interested in hearing from any DBAs or sites out there who use a standard process to manage SQL Plan Management across multiple databases: the same procedures in terms of a fixed time period between reviews, quantifying benefits, following a properly documented change process and ensuring that all systems are running optimised SQL most of the time. If they do, then I, and probably many others, would like to see a few ideas on how such a policy is implemented across all three of the database types I referred to at the start of this discussion.