Auto Update Statistics & Auto Create Statistics
Posted on September 19, 2011 by Derek Dieter
The ability to create and update statistics automatically is a crucial capability for the query optimizer. The optimizer needs accurate statistics to decide how to query the underlying tables, and when the statistics do not match the data, it is very probable that it will choose an inefficient query plan.
I’ve seen and heard a lot of debate from DBAs who think they should turn these options off. Of all those DBAs, I think only one was correct, and he ended up convincing me of his scenario, so I’m not as closed-minded as I was before. His server was so bottlenecked during peak time that he updated the stats off hours instead. I’m very wary of those who blindly say “it’s better to turn it off” without any facts to back them up. Most of the DBAs who think they should turn it off are stuck in the SQL Server 7.0 days. Back then, turning this feature off was acceptable because the automatic statistics updates interfered with the currently running queries.
Example
Case in point: we had a DBA who turned this feature off in an environment that mirrored production. I overheard the developers complaining that their queries took twice as long on this box. Looking at PerfMon, I saw no physical bottlenecks, so I turned to statistics and found both auto update and auto create turned off. After turning them back on, the box was just as fast as production.
Long story short, these options control the one link between the optimizer and the underlying data. If you turn them off, then the very statistics the optimizer uses to query the data will not be created or updated when needed. You would basically be taking away the optimizer’s reading glasses and turning off the light in its room. It will not be able to read anything accurately.
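For reference, both options live at the database level, and you can check and flip them yourself. A minimal sketch, assuming a database named YourDatabase (a placeholder):

```sql
-- Check the current auto statistics settings for every database
SELECT name,
       is_auto_create_stats_on,
       is_auto_update_stats_on
FROM sys.databases;

-- Turn both options back on for one database
-- (YourDatabase is a placeholder name)
ALTER DATABASE [YourDatabase] SET AUTO_CREATE_STATISTICS ON;
ALTER DATABASE [YourDatabase] SET AUTO_UPDATE_STATISTICS ON;
```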
Exceptions
On the other hand, if your server is pegged at 85% CPU, and you’ve done all the optimizations you can to speed it up (indexes, stored procedures, memory increases), and you have a plan for updating the statistics in off hours, AND you are frequently creating the missing indexes the workload needs, AND your system is mostly read-only (not insert or update intensive)… then I suppose you can start considering it. But it should be the last leg of consideration in my opinion. And I would not turn off both of them at first. I would probably turn off only Auto Create Statistics and leave the auto update on, unless you have a high number of ad-hoc queries coming in and not much data getting updated.
So, as a summary (and if you didn’t read my long run-on sentence above):
- Your CPU is constantly pegged over 90% and you have thoroughly exhausted all optimizations
- Your system weighs heavily on the read-only side (not many inserts / updates)
- When monitoring traces, you see that STATMAN (the internal statistics-gathering query) is consuming enough resources to push you over the bottleneck
- All queries running against the database are pre-defined (i.e. stored procedures), with few ad-hoc queries
- You have a clear maintenance window to update stats, either nightly (preferred) or at least once a week (a sketch of such a job follows this list)
- You thoroughly understand the need for statistics, and turning them off is a reluctant last resort rather than a first move
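For those who do have that maintenance window, here is a minimal sketch of an off-hours statistics job. YourDatabase and dbo.YourBigTable are placeholder names; in practice this would run from a SQL Agent job inside the window:

```sql
USE YourDatabase;  -- placeholder database name
GO

-- Refresh statistics across the whole database
-- (sp_updatestats uses the default sample rate)
EXEC sp_updatestats;
GO

-- For a table whose sampled stats have caused bad plans,
-- force a full scan of the data
UPDATE STATISTICS dbo.YourBigTable WITH FULLSCAN;
```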
I hope this helps, good luck with your statistical decisions. 🙂
Comments
I turn auto update statistics off on my server because we have super large tables. On very large tables SQL Server uses a very poor sample rate, and the stats will most likely be skewed, leading to poorly performing queries.
In my opinion you may ONLY turn off auto update stats when:
a) You know exactly what you are doing
b) You have a system in place to keep statistics maintained
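A minimal sketch of the kind of manual maintenance the commenter describes, assuming a hypothetical table dbo.BigTable: check how stale each statistic is, then rebuild with a full scan and opt that statistic out of auto update so a low sample rate cannot overwrite it later:

```sql
-- See when each statistic on the table was last updated
SELECT s.name AS stat_name,
       STATS_DATE(s.object_id, s.stats_id) AS last_updated
FROM sys.stats AS s
WHERE s.object_id = OBJECT_ID('dbo.BigTable');

-- Rebuild with a full scan, and mark the statistic NORECOMPUTE
-- so auto update statistics will leave it alone afterwards
UPDATE STATISTICS dbo.BigTable WITH FULLSCAN, NORECOMPUTE;
```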
What if there are some particularly big tables which take hours and hours during the weekly maintenance job (CHECKDB, then reindex above a certain fragmentation level, then update stats where the last update was more than 24 hours ago), and the job is still running and causing user issues at the start of the business day? Auto create and update are on, btw.
Every article I can find is only concerned with apps running slowly due to outdated statistics.
This is in a production environment and I can’t really suggest things (switching auto off, changing from FULLSCAN to a percentage, etc.) when the result isn’t very predictable.
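For what it’s worth, one way that trade-off could be explored is to touch only the statistics that are actually stale, and to use a sampled scan on the biggest tables. This is a hedged sketch, not a recommendation; all object names and the 25 percent sample rate are placeholders:

```sql
-- List user-table statistics not updated in the last 24 hours
SELECT OBJECT_NAME(s.object_id) AS table_name,
       s.name AS stat_name,
       STATS_DATE(s.object_id, s.stats_id) AS last_updated
FROM sys.stats AS s
JOIN sys.objects AS o ON o.object_id = s.object_id
WHERE o.type = 'U'
  AND STATS_DATE(s.object_id, s.stats_id) < DATEADD(HOUR, -24, GETDATE());

-- Sampled update for one very large table, trading some accuracy
-- for a much shorter maintenance window
UPDATE STATISTICS dbo.HugeTable WITH SAMPLE 25 PERCENT;
```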