Statistics and Recompilations, Part II

I recently blogged about an interesting behavior in SQL Server when the database option, Auto Update Statistics, is disabled…and that post requires a sequel.  The summary is that when you have the Auto Update Statistics option disabled for a database, query plans for tables are not recompiled when their statistics are updated.  There is an exception for the inserted and deleted tables that are created by DML INSTEAD OF triggers, but overall this database setting can have a significant impact on a system, which I alluded to in my previous post.  In a customer system I found a query that was doing a full table scan and required 32,000 reads, even though the predicate was unique and statistics had been updated that day with 100% sample.  The SQL Server instance had not been restarted since December (about 45 days before my discovery) and the customer had gone live in December.  My theory was that the full scan query plan was created by the optimizer in December, when there was no data in the system.  The query was executed regularly and never dropped out of the plan cache, so it was still being used in January even though by that point there were millions of rows in the table.  To test this, I found the plan handle for the query, and then dropped just that plan from cache.  Immediately I searched for a new query plan for that query, and found the index seek I expected.  The 32,000 read query no longer appeared in the trace I was running.
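
If you need to do the same thing, here is a minimal sketch of how to find the plan handle for a query and evict just that one plan; the text filter is a placeholder you would replace with something from the actual query, and DBCC FREEPROCCACHE with a plan handle requires SQL Server 2008 or higher:

    -- Find the worst plan whose text matches a (placeholder) filter
    DECLARE @plan_handle VARBINARY(64);

    SELECT TOP (1) @plan_handle = qs.plan_handle
    FROM sys.dm_exec_query_stats AS qs
    CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
    WHERE st.text LIKE '%YourTableName%'
    ORDER BY qs.total_logical_reads DESC;

    -- Evict only that plan from cache; everything else stays compiled
    IF @plan_handle IS NOT NULL
        DBCC FREEPROCCACHE (@plan_handle);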

Now I’m faced with the challenge of figuring out what to recommend to our customers going forward.  For many years we have recommended that customers have Auto Update Statistics disabled, and I listed the reasons previously:

  • We do not want statistics updating during production hours if at all possible; we want the customer to manage statistics through a regularly scheduled job.
  • We want statistics updated with 100% sample because very often the distribution in critical tables is skewed.  With Auto Update Statistics enabled, the statistics do not update with 100% sample unless the table is less than 8 MB in size.
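
If you want to verify either of these for a table, a quick check like this shows when each statistic was last updated, and the statistics header shows how many rows were sampled (the table and index names here are just placeholders):

    -- When was each statistic on the table last updated?
    SELECT s.name, STATS_DATE(s.object_id, s.stats_id) AS last_updated
    FROM sys.stats AS s
    WHERE s.object_id = OBJECT_ID('dbo.YourTable');

    -- Compare Rows vs. Rows Sampled in the header to see the sample rate used
    DBCC SHOW_STATISTICS ('dbo.YourTable', 'IX_YourTable_YourKey') WITH STAT_HEADER;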

We cannot continue to make the same recommendation unless we provide more information about the ramifications (notably, that query plans are not recompiled by an update of statistics); it would be irresponsible of us not to mention this.  Alternatively, we could change our viewpoint entirely and recommend that customers enable Auto Update Statistics.

After a lot of thought, the recommendation is going to go away completely.  That is, we’re going to tell customers that they can either enable or disable the option; it’s their choice, and it depends on how they want to manage statistics.  I feel that we need to provide an explanation to customers about why our recommendation has changed, and I also believe we need to provide a few suggestions on how to manage statistics.  The challenge here is that so many of our customers do not have a full time DBA.  Very often, the application administrator acts as the DBA.  The application administrator will probably not have the time, nor the inclination, to manage statistics at the level of detail a full time Production DBA would.  Is there a set of maintenance and configuration options that we can suggest that will work for the Accidental DBA?

Here are the options I have developed, and please read through the entire rest of the post before adding a comment if you disagree:

Option 1

  1. Disable Auto Update Statistics for the database
  2. Create a job to update index and column level statistics with 100% sample on a regular basis (or rebuild indexes + update column statistics with 100% sample on a regular basis)
  3. Clear procedure cache or restart the instance after the update of statistics is complete
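
In rough T-SQL, Option 1 looks something like this; the database and table names are placeholders, and a real job would loop over every table rather than list a single one:

    -- 1. Disable Auto Update Statistics for the database
    ALTER DATABASE YourDatabase SET AUTO_UPDATE_STATISTICS OFF;

    -- 2. Update statistics with a full sample (repeat, or loop, for each table)
    UPDATE STATISTICS dbo.YourTable WITH FULLSCAN;

    -- 3. Clear the plan cache so plans are recompiled against the new statistics
    DBCC FREEPROCCACHE;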

Option 2*

  1. Disable Auto Update Statistics for the database
  2. Before running any update statistics jobs, enable Auto Update Statistics for the database
  3. Create a job to update index and column level statistics with 100% sample on a regular basis (or rebuild indexes + update column statistics with 100% sample on a regular basis)
  4. When the job is complete, disable Auto Update Statistics for the database
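
Option 2 is the same maintenance, just wrapped in the setting change; again, a rough sketch with placeholder names:

    -- Turn the option on just before maintenance...
    ALTER DATABASE YourDatabase SET AUTO_UPDATE_STATISTICS ON;

    -- ...run the same full-sample update (repeat, or loop, for each table)...
    UPDATE STATISTICS dbo.YourTable WITH FULLSCAN;

    -- ...and turn it back off once the job completes
    ALTER DATABASE YourDatabase SET AUTO_UPDATE_STATISTICS OFF;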

*Joe Fleming initially proposed this option in the comments of my original post

Option 3

  1. Enable Auto Update Statistics for the database
  2. Create a job to update index and column level statistics with 100% sample on a regular basis (or rebuild indexes + update column statistics with 100% sample on a regular basis)

Before you skip straight to the comments to disagree, hear me out.  Remember that the majority of customers are not 24/7 shops, and do not have a full time DBA.  These customers have databases less than 250 GB, and can take the hit of rebuilding every index and updating statistics every week, every other week or once a month.  Is it overkill?  Probably.  But it works for the majority, which is what we have to target.

Do I love the option of clearing the plan cache?  No, not at all.  That suggestion probably gives many people a heart attack.  I understand.  We can try to mitigate the effect a bit by clearing the cache for just the one database (DBCC FLUSHPROCINDB(<db_id>)).  But yes, clearing the cache will cause every query to be freshly compiled, which will use a lot more CPU, push query durations up initially, and affect overall performance.  I get that.  But for a system that is not managed by a full time DBA, I will take this hit in order to ensure that the query plans are based on the current set of data.
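
For anyone who wants to go that route, the per-database flush looks roughly like this (FLUSHPROCINDB is an undocumented DBCC command, and the database name here is a placeholder):

    -- Clear cached plans for a single database only
    DECLARE @db_id INT;
    SET @db_id = DB_ID('YourDatabase');
    DBCC FLUSHPROCINDB (@db_id);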

Now, the ideal option is:

Option 4

  1. Enable Auto Update Statistics for the database
  2. Create a job to rebuild indexes and update statistics on a regular basis, dependent upon the level of fragmentation in an index and the need to update statistics because of changes to the data
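
A bare-bones version of that fragmentation check, using the common 5% and 30% thresholds (a real job, like the scripts mentioned below, would act on this list rather than just report it), might look like:

    -- Suggest reorganize vs. rebuild per index based on fragmentation
    SELECT OBJECT_NAME(ips.object_id) AS table_name,
           i.name AS index_name,
           ips.avg_fragmentation_in_percent,
           CASE
               WHEN ips.avg_fragmentation_in_percent >= 30 THEN 'REBUILD'
               WHEN ips.avg_fragmentation_in_percent >= 5  THEN 'REORGANIZE'
               ELSE 'NONE'
           END AS suggested_action
    FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ips
    INNER JOIN sys.indexes AS i
        ON i.object_id = ips.object_id
       AND i.index_id = ips.index_id
    WHERE ips.index_id > 0
    ORDER BY ips.avg_fragmentation_in_percent DESC;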

This option is appropriate for any customer with a full time DBA who is comfortable managing fragmentation and statistics through custom scripts and jobs.  For example, many DBAs use Ola Hallengren’s Index and Statistics Maintenance script, or Michelle Ufford’s Index Defrag script.  To take things a step further, you can monitor when an automatic update of statistics occurs (use the Auto Stats Event Class in Trace), and you can capture snapshots of table row counts (sys.partitions) and index updates (sys.dm_db_index_usage_stats) to understand what tables have data that changes frequently.  High volume or volatile tables (those with a lot of inserts, updates and deletes) may require a more aggressive approach to managing statistics.
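
As a starting point for those snapshots, queries along these lines capture row counts and index activity that you can store in a table and compare over time (note that sys.dm_db_index_usage_stats resets when the instance restarts):

    -- Row counts per table (index_id 0 = heap, 1 = clustered index)
    SELECT OBJECT_NAME(p.object_id) AS table_name,
           SUM(p.rows)              AS row_count,
           GETDATE()                AS captured_at
    FROM sys.partitions AS p
    WHERE p.index_id IN (0, 1)
    GROUP BY p.object_id;

    -- How each index has been read and modified since the last restart
    SELECT OBJECT_NAME(ius.object_id) AS table_name,
           ius.index_id,
           ius.user_updates,
           ius.user_seeks + ius.user_scans + ius.user_lookups AS user_reads,
           GETDATE() AS captured_at
    FROM sys.dm_db_index_usage_stats AS ius
    WHERE ius.database_id = DB_ID();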

Ultimately, there is no silver bullet…no one perfect answer, unless that answer is “It depends.”  But again, you have to know what “it” depends on.  In this case, whether you should have the Auto Update Statistics option enabled or disabled for a database depends on your ability to manage statistics for a database, and understanding that having it disabled can lead to out of date query plans.

19 Responses to Statistics and Recompilations, Part II
  1. Mike Fal
    February 22, 2012 | 10:15 am

    Erin, while I don’t know the specifics of your app, option 4 always makes the most sense to me. If there’s a concern about an impact on the production system, you can test having auto update on with asynchronous update so that queries aren’t disrupted by the stats gathering. I’d certainly go with that first.

    If your clients are not 24/7 and the dbs are less than 250 GB, I see no reason why you can’t run a defrag/update stats every night. Ola’s scripts are great and I’m guessing you’re familiar with their use, so I’d recommend them for selective management (no reason to rebuild stuff that hasn’t been altered).

    Anyway, that’s my loose change. I won’t outright disagree with your approach without understanding the details of your application, but I’ve had great success with the option 4 approach over the years.

    • Erin Stellato
      February 22, 2012 | 10:32 am

      Mike-

      I agree that option 4 is the best one, and while I agree that customers could run Ola’s or Michelle’s script weekly or even more frequently, remember that most of them are not DBAs. A script like Ola’s or Michelle’s is overwhelming to them. Even if they don’t have to modify anything in the script, taking the script and creating a job to run it regularly is not something they would ever do.

      Could we set up that job for them? We could…but then who’s responsible for making sure that job executes? It’s a very fine line.

      I appreciate your comments!

      Erin

      • LeoPasta
        February 24, 2012 | 10:26 am

        Erin, I also work for a software vendor, and in our case we decided to “bite the bullet” and took ownership of statistics and index maintenance. We wrote our own utilities so that they would integrate with our logging mechanism, and we are notified (as well as the client) if anything goes wrong.
        Of course the client can opt to assume the responsibility if they wish, but up to now no one has.
        Although we spent some resources to develop it, I think it was a good decision. We definitely feel more confident analysing performance issues, and we come to the investigation already knowing when the indexes were last rebuilt or statistics updated. This way we also avoid the client shooting themselves in the foot (like rebuilding an index and then updating statistics afterwards).

        • Erin Stellato
          February 24, 2012 | 10:40 am

          Leo-

          Hi! It’s so great to hear from another software vendor, thank you for commenting!

          We have only discussed the idea of implementing plans for customers; I am impressed that you all went ahead and did it. I would be interested to know how many customers you have (if that can be shared), and do you only support SQL Server as an RDBMS? Also, I’m guessing that you still leave backups up to the customer?

          I really like your point about having more confidence when analyzing a performance issue. Theoretically, there is a ton of information you could gather from customers on a regular basis, and you could be better prepared when they call in with an issue. So much data, so little time.

          Thanks again for your comment!

          Erin

          • LeoPasta
            February 24, 2012 | 11:07 am

            Yes, for some reason I missed your blog until today, but I loved it; several of your recent topics resonate with my experiences here. But now you’re on my RSS feed, so this mistake shall not happen again 🙂
            Our clients are usually medium-large financial institutions; we have around 50 clients and we support only SQL Server.
            We have indeed created a small utility to run the checkdbs and backups; the client is responsible for copying the backup files to a safe place and testing them according to their policies. Our biggest clients usually prefer to take over the whole process, which I find very sensible (as they will have in-house DBAs to take care of it).
            Apart from any errors on the system, we receive weekly “reports” on index fragmentation and disk/db space usage. I would love to get more info, but even those have already helped us a lot in the past.

  2. Colleen Morrow
    February 22, 2012 | 1:07 pm

    I think, if you’re not a DBA, any of those options could be overwhelming. Will you be providing scripts to implement options 1-3? With regards to option 4, an index maintenance script doesn’t have to be as complicated as Ola’s or Michelle’s.

    • Joe Fleming
      February 23, 2012 | 4:57 pm

      Colleen,

      I think part of the problem here is that NOT implementing one of these options might also cause problems with your system. Which way do you choose?

      The client has made a conscious choice not to have a DBA onsite to offer expertise in this sort of thing, so whoever the application administrator is, that poor sap — I mean accidental DBA — is going to be forced into learning a few things about SQL. This is definitely going to be one of them. While I would argue that the ROI for hiring a DBA is pretty good, the ROI for a DBA consultant to come in for a day or two on this is spectacular.

      As Erin says, as a software vendor anything you provide will always come back to bite you at support time, so there are some things that have a lot of variability that you just need to hand off to the client, even though they may feel a bit put off. If you’re a good ISV, you still offer some guidance but you have to be careful not to cross the line into “this is part of the software support agreement.”

      (just a side note, I am not a DBA consultant and so have no vested interest in recommending people hire consultants…but I do know some good ones)

      • Erin Stellato
        February 24, 2012 | 7:09 am

        Joe-

        Agreed, you HAVE to implement something. I have seen customers that don’t have a maintenance plan for rebuilds or statistics, or have one but it hasn’t run in months (in some cases years). This is one of the easiest things for them to overlook, along with CHECKDB.

        To look at it from the perspective of the customer, is it worth it for them to hire a DBA when this is their only application that uses a database? That’s a rare occurrence, but it does happen. In that instance, you’re right, hiring a consultant may be better for them.

        Erin

    • Erin Stellato
      February 24, 2012 | 6:59 am

      Colleen-

      We don’t provide maintenance scripts for customers. Again, it’s a gray area, but if I give them a script to use, and then they have issues, they’re going to call me. I can provide suggestions, but with the current way we support our customers we don’t have the bandwidth to support maintenance as well.

      Erin

  3. […] Statistics and Recompilations, Part II – With Part 2 of her excellent look at this subject it’s Erin Stellato (Blog|Twitter). […]

  4. Dustin
    February 26, 2012 | 9:19 pm

    This is interesting because I support a lot of ISV databases. I’d agree with Fal and say 4 is probably good depending on size and importance. If it’s a company’s core competency, it’s their responsibility to give the app adequate hardware AND DBA time.

    It’s amazing how long a SQL Server database can go without any TLC. I’d like other DBAs to think of this when they pick strict maintenance regimens.

    Anyway, I would leave the index and stats maintenance in the customers’ hands with auto-update on. Don’t be afraid to delegate to us accidental DBAs, we’ll figure it out eventually :]

  5. Daniel Adeniji
    October 2, 2012 | 10:44 pm

    Erin:

    On a side note, what is the benefit of using DBCC FLUSHPROCINDB over sp_recompile (http://msdn.microsoft.com/en-us/library/ms181647.aspx)?

    I know “DBCC FLUSHPROCINDB” is a single statement, but is there anything to be gained, in terms of granularity, from explicitly listing which objects should be targeted for recompilation?

    Thanks

    • Erin Stellato
      January 7, 2013 | 1:00 pm

      I don’t know that there is a benefit. If you use FLUSHPROCINDB, you’re clearing everything for a database. When you use sp_recompile, you’re targeting one specific object (table, SP, etc.), so I would argue that you should only try to clear from cache what absolutely needs to be cleared. However, in the scenarios I’ve discussed above, if you’ve updated stats for multiple (all?) objects in the database, it’s just easier to run FLUSHPROCINDB rather than sp_recompile for each object.
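
      For reference, the two calls look roughly like this (the object and database names are just examples):

          -- Targeted: mark one object so its plans recompile on next use
          EXEC sp_recompile N'dbo.YourTable';

          -- Broad: flush every cached plan for one database (undocumented)
          DECLARE @db_id INT;
          SET @db_id = DB_ID('YourDatabase');
          DBCC FLUSHPROCINDB (@db_id);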

  6. Pedro Lopes
    December 13, 2012 | 9:05 am

    Hi Erin,
    I really liked your post.
    I usually ask customers to implement my AdaptiveIndexDefrag procedure (http://blogs.msdn.com/b/blogdoezequiel/archive/2011/07/03/adaptive-index-defrag.aspx) to address index and stats upkeep, as part of a wider maintenance plan (http://blogs.msdn.com/b/blogdoezequiel/archive/2012/09/18/about-maintenance-plans-grooming-sql-server.aspx).
    Cheers

  7. Ronak Patel
    January 7, 2013 | 10:10 am

    Hi Erin,
    In my production environment we have the Option 4 setup. I have installed FogLight PA for performance analysis, and it always reports a high recompilation rate even after statistics are updated with 100% sample. Any suggestions on what I could follow up on or check? We also have an update stats job that runs every day with the default sample, but only on heavy usage tables.

    • Erin Stellato
      January 7, 2013 | 12:53 pm

      Hi-
      Does FogLight tell you *what’s* recompiling? That would be more interesting to see. And what does it define as a high recompilation rate? I guess I’m not clear as to whether recompilation is high no matter what, or only after statistics are updated.
      Erin

      • Ronak Patel
        February 5, 2013 | 8:05 am

        In FogLight PA it simply says “Monitored instance performed 2302.85 compilations during the monitored period, of which 1497.88 were recompilations.”

        It also flags indexes displaying outdated statistics, but we update statistics daily for some tables with default sampling and weekly for the complete database with a full scan.

        So what do you suggest here, should I disable the daily statistics update job?

  8. […] And, also here: Statistics and Recompilation, Part II. […]
