November’s TSQL Tuesday is hosted by Paul Randal ( blog | Twitter ) and the topic is Why are DBA Skills Necessary?
The topic immediately made me think of some of the customer issues I have worked on over the years, and Paul’s list of things to consider confirmed that I was thinking about the right things. As someone who works for a software vendor, I have worked with DBAs with widely varying skill sets. It was very hard to not write a lengthy post addressing everything Paul brought up and more, but in the end I decided to focus on just three items:
- Problems I’ve seen
- How business continuity can be affected by a lack of DBA skills
- When a SQL Server installation needs a real DBA
At our user group conference in September I did a three hour session titled, “When Bad Things Happen to Good Databases.” I reviewed Backups, Integrity Checks, Optimization tasks and Monitoring, and I discussed some of the worst horror stories I have seen, hoping I could scare the attendees into taking action. Here’s the best of the worst…
Backups
The database server crashed, and the mdf and ldf files were copied off the disks, but they were corrupt (in the system tables, the worst corruption I have seen). The job to create backups had not been running, but the job to delete old backup files had been running. A backup that was taken before the customer’s last upgrade was found. That backup was two years old. Now, in our application, only metadata is stored in the database, so the customer had original files they could re-process…but it was a lot of data. We spent one day trying to recover the database, which was too long. The moment that I discovered corruption in the system tables, I told the customer they had to go to backup. Because the backup was so old, they still wanted me to try and extract as much data as I could from the tables. I spent a few hours on that, and then explained they had no choice but to go to backup because there was absolutely no consistency in what I was saving out. It was another day to get the backup restored and to a point where they could start to re-process data. I have no idea how long it took to re-process everything, and I’m not sure at what point they were “operational” again. I also have no idea if anyone lost their job as a result of the data loss and downtime.
Integrity Checks
A customer discovered corruption in one of the application’s logging tables – it’s typically one of the largest tables in the database. Luckily the corruption was confined to that one table. We ended up selecting out what information we could to a new table, and ultimately had no idea how much data they were missing. No one lost their job, and that customer is still in business and thriving…and they now have a DBA.
Optimizations
In March of this year I dialed into a customer to work on an issue and out of habit, checked statistics, which had not been updated since 2003. Yes, seven years. The application administrator said that users were not complaining of problems, and I did not have time to do any testing to see what performance was really like. The database was less than 10 GB, so I would assume that most of what they needed was probably sitting in memory. Needless to say, I provided them with our maintenance recommendations.
Monitoring
A customer called because they had an error message appear while using the application. The error stated that the transaction log file was full. To resolve the problem, the customer shut down SQL Server and deleted the log file. At least a day’s worth of work (and some unknown amount of data) was lost, and it took almost a full business day to get the system back online (there was a delay between the time the file was deleted and when our group was engaged to assist).
As illustrated in these real-life examples, business continuity can be mildly or profoundly affected by a lack of DBA skills. I have seen customers experience an hour or two of downtime, and I have seen customers experience days of downtime. During this time, people are not working, the business process itself is altered, if not completely stopped, and money is lost every minute. Money is lost to the employees that are being paid to do nothing, lost because the processes that should normally bring in money cannot be completed, and lost to fines incurred because information is unavailable or data is lost.
So when does an organization need a DBA? I struggled with this…because I don’t think it depends just on database size, the number of users, the transactions per second or features in use; it’s about the data. Mainly, how important is it and how long can you stand to be without it?
If your business can continue, uninterrupted, without the data in your database for more than a week, and you aren’t too concerned with how much data you lose, then you don’t need a DBA. Anything less, you need one. Sound extreme? It is. But what’s the most important thing in your business? Your data. Yes, people are important, but if you don’t have data, what are your people going to do?
*If you’re wondering why I picked a week…well, I figure that’s probably the amount of time it would take to find a “real” DBA to come in and help fix the problem. This duration is based on no scientific data whatsoever.
If your business is dependent on the database to function, then you need a DBA. If it goes down and no one notices for a few days, then maybe not…
Excellent post. I think the stats story supports the other side of the coin though. They were operating for 7 years in a situation that would kill many of my databases within 7 weeks and a few within 7 days and one table we have within 7 hours (don’t ask). But I really like your summation. It comes down to how important that data is to the business.
I agree. I know that you can argue the other side of it, but from what I have seen with our customers, better to be prepared than find out the hard way.
Thanks Grant! And yes, the customer running for 7 years does suggest that you could get by without a DBA, but if they had a failure of any sort, they would have struggled, and I think that’s what made me lean toward my extreme view 🙂
But if everyone who needs a dba had one, how owuld you stay in business? 🙂
Every company still has a CEO/CFO/CIO/slew of VPs, and they still hire consultants 🙂
Hello Erin,
I see that the Oracle databases are generally given to the safe hands of DBA’s.
On the other hand, companies generally think that somehow a developer can take care of the databases while he is taking care of other tasks as well.
How unlucky case for the SQL Server.
I believe we need more professional DBA’s for SQL Server management
Hi-
I understand your point. I think that many companies believe that a SQL Server database can run with little to no administration, and in a few cases, that might be the case. I agree that more professional SQL Server DBAs are needed, and I believe that the knowledge needed to become such a DBA is quite readily available. People just need the desire and need to know where to look.
Erin
Hi!
We have not even ventured to think this way because having fully developed functional teams allows for better production and availability of resources. I have heard of cases from colegues of mine who have had software folks administer their databases but only to fumble and crash when there where problems in the data integrity of reports, missing sequences, ghost records, etc…then these folks are speechless to try to answer questions like: “Didnt you see this coming” Answer is no…how could she? There is a great misunderstanding that the Database is all about programming in SQL but its also really necessary to understand hardware, networking, security, etc. Something that a software developer knows little about, at leat thru my experience. Here at the DoD we are security junkies, so one cannot even think of becoming a DBA here without having a CISSP.