Welcome to Neudesic Blogs Sign in | Join | Help

SQL Query Performance - Do you feel dirty? (Dirty Reads)

Ok, so I'm feeling guilty that I was in such a rush to get my blog, and then I failed to posting anything for almost a week.  Well, I thought that rather than let another week go by I ought to post something of use.  But what to post?  It just so happens that I have been working on a document to share some of my experience with optimizing TSQL queries (MS SQL 2000 that is).  I know that in the spirit of SQL 2005, it may seem blasphemy to share knowledge about something that was so yesterday as SQL 2000.  But, that's what I have.  And there you are.

Today's installment will be a tidbit about optimizing queries specifically using locking hints.  Depending on the shape and use of the database, I have used this technique to reduce 30-second queries to 2 seconds.  Other times, it has made no noticable difference at all.  It all depends on the kind of volume and types of transactions hitting the tables being queried.

I'll follow this post up later with more of the same subject.  Here you go:

---------------------------------------------------------------------------------------------------

Turn off record locking in SELECT

 

Microsoft SQL has several levels of lock granularity.  The server does its best to “pessimistically” control concurrency with locks.  This means that the server assumes the worst when it comes to possible concurrency collisions, and protects you from yourself.

 

Locks can be acquired on rows, pages, keys, ranges of keys, indexes, tables, or databases.  SQL Server dynamically determines the appropriate level at which to place locks for each Transact-SQL statement.  The level at which locks are acquired can vary for different objects referenced by the same query; for example one table may be very small and have a table lock applied, while another, larger table may have row locks applied. The level at which locks are applied does not have to be specified by users and needs no configuration by administrators. Each instance of SQL Server ensures that locks granted at one level of granularity respect locks granted at another level. [SQL 2000 online help]

 

While the server assumes the worst when trying to protect us from ourselves, it also delivers less performance while doing so, and in some circumstances may have dire performance consequences. 

 

Remember Optimization Axiom #2: “Locking is not your friend.”  (when optimizing queries)

 

Example 1:

 

Imagine that your program runs the following summary query to display revenue for 2005:

SELECT DivCode, DeptCode, SUM(Revenue) AS RevenueSubTotal
FROM AccountsJournal
WHERE ClosedDate BETWEEN ‘2005-01-01’ AND ‘2006-01-01’
GROUP BY DivCode, DeptCode

Let’s say that this AccountsJournal table contains 1,500,000 rows of data, and already has an index for DivCode+DeptCode as well as a separate index for ClosedDate.

 

Any problems with this?  Indexes are in order.  The query is optimized.  But there is a problem.  A big one.

 

This had better not be running on the main production database during business hours.  It might run in less than 3 minutes, but during that time, the SQL server’s locking traffic cop is deciding the fate of every production user of the AccountsJournal table, and whether or not they’ll be allowed to post their updates in a timely manner.  Almost as bad, is the fact that this query will also be forced to wait while other update queries finish running.  Do you feel lucky?

 

What is a dirty read?

Simply put, a dirty read is where a query reads data from the database without lock protection.  Without lock protection, you cannot be guaranteed that the data isn’t changing during the time that the query is running.

 

SQL Server assumes that you need to be protected from concurrency issues even when you’re running a SELECT statement.  What you need to decide is whether or not you agree with it’s assessment of the danger.  Ask yourself these questions: 

 

  • While my SELECT query is running, does it matter whether or not someone else makes a change to a record or two?  Will it affect my outcome in a meaningful way?
  • Are any of the tables in my query in danger of being affected by a mass-update while the query is in the process of running?  If so, would it be bad if my query only saw part of the mass-update?

 

If the answer is, “No, my query doesn’t mind,” then you need to turn off the locking for your query.  This is called a “dirty read”.  You’re willing to accept less than a pristine and untouched snapshot of data.

 

When should a dirty read be used?

A better question is, “when shouldn’t a dirty read be used?”  It is the author’s opinion and experience that the best practice for database developers is to use dirty reads as the rule, rather than the exception.  Let the hate mail begin...

 

In fact, at two telecommunications companies in the author’s past, it was a serious offense for developers to write any query without enabling dirty reads (without strong justification).   At In Touch Communications, in 2001, the development staff had no SQL standards.  The MS SQL 2000 database that powered their Siebel CRM system was deadlocking often, and the Customer Service and Sales departments would crash in Siebel at least twice per week.  In 90% of the cases, the issue turned out to be long-running queries that had been hand written by developers for various purposes.  Some were queries for reports.  Others were custom scheduled SQL jobs.  All of them used the default locking of SQL server.

 

Once the In Touch Communications I.T. staff migrated all appropriate stored procedures (99%) to use dirty-reads, the deadlocks went away.  Siebel stopped crashing.  Later, I.T. afforded a separate SQL server to run their reports, which further increased system performance (also a best practice).

 

It is rare that a SELECT statement is negatively impacted by using a dirty read.  The only cases the author has run into have been in stored procedures that run one or more SELECT statements to gather information just prior to a mass update or delete.  In these rare cases, a qualified DBA should consider the data dependencies, and decide whether or not dirty reads are safe for that stored procedure.  If locking is allowed on a long-running query (15+ seconds), then the DBA must consider how this query will affect the rest of the system.

How to use dirty reads

 

Best practice rules: 

  1. All SELECT queries should use the WITH (NOLOCK) directive for every table involved in the query.
  2. If it is not practical to use the WITH (NOLOCK) directive on every query, then upon opening a database connection, enable dirty reads as the default behavior for the entire database connection using:
        SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED

 

Example 2:

 

To change our previous revenue query to use dirty reads:

SELECT DivCode, DeptCode, SUM(Revenue) AS RevenueSubTotal
FROM AccountsJournal WITH (NOLOCK)
WHERE ClosedDate BETWEEN ‘2005-01-01’ AND ‘2006-01-01’
GROUP BY DivCode, DeptCode

 

Example 3:

 

Here’s an example of WITH (NOLOCK) on a more complex query:

 

SELECT DISTINCT
      P.Handle, SM.SeatDescription
      , PS.PreFlopSeen, PS.FlopSeen
      , PP.Invested, PP.Won, PP.MonetaryProfit

FROM tblPlayer P WITH (NOLOCK)

      INNER JOIN tblPlayerStats PS WITH (NOLOCK)
            ON P.PlayerID = PS.PlayerID

      INNER JOIN tblPlayerHand PH WITH (NOLOCK)
            ON P.PlayerID = PH.PlayerID

      INNER JOIN dbo.tfn_Hand_PlayerCount() HPC
            ON PH.HandID = HPC.HandID

      LEFT JOIN (

            SELECT SM.PlayerCount, SM.SeatIndex
                  , SM.SeatID, ST.SeatDescription
            FROM tblSeatMap SM WITH (NOLOCK)
                  INNER JOIN tblSeat ST WITH (NOLOCK)
                        ON SM.SeatID = ST.SeatID
            WHERE ST.StreetID = 2

      ) SM
            ON SM.PlayerCount = HPC.PlayerCount
                  AND SM.SeatIndex = PH.PreFlopSeatIndex

      LEFT JOIN dbo.tfn_PlayerProfit_Summary() PP
            ON P.PlayerID = PP.PlayerID

WHERE NOT SM.SeatID IS NULL

Notice that both User Defined Functions in Example 3 do not have a WITH(NOLOCK) directive.  More on that later.

 

 

Until next time...

 

Phil Scott
www.neudesic.com
phil.scott@neudesic.com

 

Published Monday, December 05, 2005 4:38 PM by Phil.Scott

Comments

# re: SQL Query Performance - Do you feel dirty? (Dirty Reads)

Wednesday, December 07, 2005 2:00 PM by dfoderick
While I agree with some of what you say, I think there are dangers here. Many people use the NOLOCK hint and they do realize that will cause the query to not issue the read locks. That's fine, but most don't realize that NOLOCK is that same as READUNCOMMITTED (dirty read). Many developers will falsely assume that it will always be safe to use NOLOCK as long as they are only reading data. What about important financial data? What if you are printing a balance sheet and you include a debit but not a credit because it is dirty data. The balance sheet will be out of balance.

Other shops will assume that using NOLOCK will fix all performance problems when the real problem might be lock escalation or recompiles.

# re: SQL Query Performance - Do you feel dirty? (Dirty Reads)

Friday, December 16, 2005 2:44 PM by Phil.Scott
dfoderic,

I agree with your comment. I know that in this entry, I come accross as thought 100% of the time programmers should use dirty reads. But, the example you point out with financial reports including partial transactions is an excellent counter-example.

This is why I mentioned the following rule of thumb (in the heading "What is a dirty read"):

--------------------
What you need to decide is whether or not you agree with it’s assessment of the danger. Ask yourself these questions:

* While my SELECT query is running, does it matter whether or not someone else makes a change to a record or two? Will it affect my outcome in a meaningful way?

* Are any of the tables in my query in danger of being affected by a mass-update while the query is in the process of running? If so, would it be bad if my query only saw part of the mass-update?

If the answer is, “No, my query doesn’t mind,” then you need to turn off the locking for your query. This is called a “dirty read”. You’re willing to accept less than a pristine and untouched snapshot of data.
-------------

You bring up a very lucent point of precaution.

Thanks,

-Phil Scott

# WITH (NOLOCK) IN SQL Server to improve performance « Tech

Wednesday, February 27, 2008 12:57 AM by WITH (NOLOCK) IN SQL Server to improve performance « Tech
Anonymous comments are disabled