Six reasons you should be nervous about parallelism
March 21st, 200935
Six reasons you should be nervous about parallelism
March 21st, 200935
 
 

Don't get me wrong; parallel processing is great.  And with SQL Server in particular, there are many benefits to taking advantage of the power of multiple CPUs.  However, we have all had cases where what looked like parallelism was actually hurting us (you remember turning off hyperthreading on SQL Server boxes, right?), and cases where long CXPACKET waits on individual threads completely defeated the purpose of speeding up the query with parallelism in the first place … often making the query take longer than it would with a MAXDOP 1 query hint, or in some cases run away completely.

These are loosely defined scenarios that are difficult to reproduce and may depend on other environmental factors at the time of the query.  For example, with the runaway case, I have simply stopped the query and re-issued it, and this time it came back quickly.

However, there are other cases that are definitely potential problems, but not as easy to determine the cause.  Some of these issues have been deferred until the next version of SQL Server (tentatively 2011), or have been closed as fixed, but not specific enough about where/when it was fixed in both SQL Server 2005 and SQL Server 2008.


1. #328811, "SCOPE_IDENTITY() sometimes returns incorrect value"

STATUS UNKNOWN – CONNECT ITEM DISAPPEARED

User "dave_dave" reported a problem that I had seen before, where according to the Microsoft engineer, "Yes, it's a bug – whenever a parallel query plan is generated @@IDENTITY and SCOPE_IDENTITY() are not being updated consistently and can't be relied upon."  This problem can result in incorrect data and even data loss. The workarounds include using OPTION (MAXDOP 1) on individual queries or attempting to force serial plans system-wide via sp_configure.  According to the Connect item, at the present time, there is no plan to fix this issue for SQL Server 2008.


2. #382323, "dm_exec_requests does not show blocking SPID"

FIXED SOME TIME AFTER SQL 2008 (?)

In this case, fellow MVP Andrew Kelly demonstrates that parallelism can prevent blockers from showing up in the sys.dm_exec_requests DMV.  This problem exists in both SQL Server 2005 and in SQL Server 2008, but is considered too risky to fix in a service pack or cumulative update, so we will not see a fix for this until SQL Server 2011 at the earliest.  As Adam points out (and I missed this in the recent discussion), the workaround in the meantime is to stop trusting sys.dm_exec_requests for finding blocking/blocked processes, and instead use the sys.dm_os_waiting_tasks DMV.  (Thanks Adam!)


3. #384262, "Snapshot Isolation Transaction uncommitted rows are visible with queries using parallelism"

FIXED IN 2008

This problem, reported by "Mike1D", shows that when using READ_COMMITTED_SNAPSHOT, a session using parallelism may see uncommitted rows from another session.  Effectively this makes it possible for dirty reads to occur in snapshot isolation, which should only be possible in READ_UNCOMMITTED or when using the WITH (NOLOCK) table hint.  The problem does not occur when the reading session uses OPTION (MAXDOP 1).  The fix is checked into the next cumulative update for both SQL Server 2005 SP2 (CU#13) and SP3 (CU#3), however there is no mention of whether this issue affects SQL Server 2008 and, if so, when a fix will be released.


4. #265183, "RING_BUFFER_EXCEPTION when using application locks"

CLOSED – NOT REPRODUCIBLE

User "lasa1" reported this issue, where a blocking issue due to parallelism can lead to unlogged RING_BUFFER_EXCEPTION errors (severity 25) and undetected deadlocks for queries run under the context of sp_getapplock (more info and plenty of gory details here).  The issue is completely resolved by using OPTION (MAXDOP 1) on the affected queries.  The issue is currently in the "Resolved (Not Reproducible)" state.


5. #329335, "A PIVOT query using parallel plan on a 3.5GB table consumes more than 200GB of tempdb space"

FIXED IN 2008

Meiyalagan Balasubramanian reported this issue, where running a PIVOT query that uses parallelism against a very large table can lead to very long runtimes and extremely high tempdb growth.  The bug is closed as fixed in SQL Server 2008, however the user reported it against SQL Server 2005 as well, and for this version there does not seem to be an answer, other than the now-all-too-familiar workaround of adding the OPTION (MAXDOP 1) query hint.


6. #308094, "CLR Assembly causes non-CLR ACCESS VIOLATION Exception"

CLOSED – WON'T FIX (ISOLATED; EASY WORKAROUND)

User "hova" reported this problem, where a CLR string contatenation example from Microsoft fails with an EXCEPTION_ACCESS_VIOLATION error if the table is large enough to induce parallelism.  As a side bit of humor, hova asked for a free upgrade to SQL Server 2008, since the problem is fixed there; the request was denied, of course.  Currently the workaround of using OPTION (MAXDOP 1) in SQL Server 2005 has been deemed "good enough" for those shops that haven't completely migrated to SQL Server 2008 (a.k.a. "just about everybody").


Now hopefully, these issues don't affect you, but you should check each scenario just in case.  If any does, you should address the issue and validate / vote / comment so that Microsoft will have more motivation to treat these bugs seriously.  If you have others in this category, please point them out!And as a fellow MVP rightly pointed out, just because a parallelism bug isn't affecting you right now, who is to say what will happen tomorrow … you may suddenly get a parallel plan due to a hotfix, service pack, migrating to different hardware, altering table structure, or even simple data or statistics changes that hit some yet-unknown threshold.

By: Aaron Bertrand

I am a passionate technologist with industry experience dating back to Classic ASP and SQL Server 6.5. I am a long-time Microsoft MVP, write at Simple Talk, SQLPerformance, and MSSQLTips, and have had the honor of speaking at more conferences than I can remember. In non-tech life, I am a husband, a father of two, a huge hockey and football fan, and my pronouns are he/him.

35 Responses

  1. Daniel Adeniji says:

    Thanks for posting this list.  We have seen a bit of hard to "place" Parallelism problems, as well.  On individual SQL Server Instances, setting 'Max degrees of parallelism' via http://msdn.microsoft.com/en-us/library/ms181007(v=sql.105).aspx seems to be a good compromise.  Note that the value set is based on the Number of Actual \ Physical Processors on a machine.
    Having said so, Siebel and some other OLTP Applications, recommend setting 'Max Degrees of Parallelism' to 1.  Presumably, this might be due to the possibility and side-effect of badly generated Programmer \ user queries.
    The Siebel recommendation is available via http://docs.oracle.com/cd/E05553_01/books/SiebInstWIN/SiebInstCOM_RDBMS28.html
    As previously noted, watching for WaitTypes equal to CXPACKET is a good place to start and watch for.  
    Again, nice & respective community discusion topic.

  2. AaronBertrand says:

    For the SCOPE_IDENTITY()/@@IDENTITY issue mentioned above, Pinal Dave has posted some workarounds that could be helpful:
    http://blog.sqlauthority.com/2009/03/24/sql-server-2008-scope_identity-bug-with-multi-processor-parallel-plan-and-solution/

  3. AaronBertrand says:

    Chris, you'll have to test the effect of the setting on your workload and in your environment.  Any of us could *guess* whether it will help or not, but that won't be very useful to you, I'd suspect.
    The output messages are annoying, sure.  But I'd rather get lots of feedback (especially if the estimated completion time changes during the process due to other impeding activity) than sit there wondering, "are we there yet? are we there yet? are we there yet? are we there yet? how about now?"

  4. Chris Wood says:

    I wrote the above as I have a database recovering for over 2 hours now that says it will take another 6 hours because the process ran on many cores and now is rolling back as a single thread. The joys of poor processing.
    Chris
    And why does SQL2005 give me a message every 30 seconds telling me how much time is left in the recovery process???

  5. Chris Wood says:

    I have a severly overworked development server that has multiple CPU's that is set to MAXDOP 0. Would switching this to 1 maybe improve thruput for this mostly OLTP server? I do know that a transaction that runs on 8 processors and rolls back will only user 1 processor.
    Thanks
    Chris

  6. GrumpyOldDBA says:

    Greg
    I mostly work with financial systems, I wouldn't say that any of these fall into either the oltp or dss camp. Can't be more specific, sorry.

  7. GrumpyOldDBA says:

    I don't want to get into tit for tat comments and I was planning to blog about parallelism anyway; but there is a ms kb which recommends turning parallelism on for oltp databases. http://support.microsoft.com/kb/329204
    The spec tests are all very well but hardly relate to real world applications, and as I have experienced tests can be arranged to prove almost anything!
    In answer to the 2000 to 2005 Q. I did a lot of testing for several clients on upgrades and I can confirm that 2005 handles parallelism much differently; maybe even far better. I also tend to be working on databases with tables that contain tens and hundreds of millions of rows, maybe that makes a difference?                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              

  8. Greg Linwood says:

    I wonder whether this is because the user base has learned to switch DOP off by default though. We certainly advise turning it off routinely & still occassionally get caught out when a mis-configuration slips through during an upgrade, so I'm sure it's still a problem.
    I wonder whether there's anything published on the technical improvements in SQL 2005's parallelism? Introduction of partitioning in SQL 2005 was potentially something that could have improved parallelism but it's well known that parallelism combined with partitioning actually had a lot of problems that were later fixed in SQL 2008. As for non partitioned parallelism, I'm not sure what technical improvements, if any, were actually made.
    Keep in mind that many other improvements were made in SQL 2005, so it might be that these simply masked the problems associated with parallelism. Removing the Std Edn memory restriction in SQL 2005 was a huge advance on its own for example..

  9. MohammedU says:

    Excellent post Aaron…
    Many times I encountered parallalism issues in SQL 7 and 2000…

  10. Linchi Shea says:

    With SQL2000, I routinely see reporting queries in a test suite I used to use go for 3 hours only to complete in seconds when MAXDOP is set to 1. I don't see the same behavior with the test suite on SQL2005. I also don't hear as many complaints about parallel plans gone wild in the SQL2005 prod environment.

  11. Greg Linwood says:

    When you say "better in dealing with parallelism" do you mean that the optimiser chooses parallelism more judiciously or that the query processor's parallelism iterators are actually more efficient? Any specifics would be interesting..
    We still see plenty of problems with parallelism on SQL 2005, though our exposure to SQL 2008 is relatively limited.

  12. Linchi Shea says:

    Again, I don't think there is any argument here. I wasn't precise, I guess. If the workload is predominately OLTP, set MAXDOP = 1.
    > By "real-world non-OLTP-pure workloads" do you mean OLTPs that have a few search & report queries?
    No, I meant real mixed workloads. I would characterize OLTPs that have a few search & report queries as OLTPs. Try as hard as you may to separate OLTP from reporting, they pop up mixed everywhere. It could be the case where OLTP queries are mixed with reporting queries in the same database. It may also be the case you have a database that is basically OLTP living with a database that is basically reporting on the same server. There are many combinations.
    But back to my original point, I do think the SQL optimizer has got a lot better in dealing with parallelism, far from perfect but better nonetheless.

  13. Greg Linwood says:

    By "real-world non-OLTP-pure workloads" do you mean OLTPs that have a few search & report queries? In my experience managing a wide variety of business apps, this is pretty much what you're dealing with in most cases. It is FAR easier to manage these type of apps by turning MAXDOP off globally & enabling where necessary on the few queries that actually benefit from parallelism. Configuring MAXDOP to something low like (2) or (4) still leaves the majority of regular OLTP queries susecptible to parallelism problems, so I disagree with this as being a good approach.

  14. Linchi Shea says:

    Greg;
    I'm not arguing with you on TPC-E's MAXDOP setting. I'd turn it off if I were to configure it. But then it is a typical OLTP workload, whereas Iwas talking about making a compromise in real-world non-OLTP-pure workloads.

  15. Greg Linwood says:

    Linchi – why do you think DOP is turned off (MAXDOP = 1) in high end TPC-E benchmarking then? If it were better on OLTP to have DOP restricted to a "low number", why wouldn't Microsoft set DOP to something like 2 or 3 in their big TPC-E benchmarks?

  16. Linchi Shea says:

    Along with the line many have already put forward, I'd argue that (1)more and more we are seeing mixed workloads on servers, (2) SQL Server query parallelism used to be terrible, but has got a lot better, and (3) keeping parallelism on but restricting it to a low number appears to be a good compromise in many cases.
    Ultimately, this is a problem with the query optimizer. From the their respective default setting, one could argue that Microsoft is probably more confident with their query optimizer w.r.t. parallelism than is Oracle with theirs 🙂
    Ideally, SQL Server should allow more granular control of query parallelism than it currently allows (e.g. to be able to control parallelism for maintnenance operations only or for some databases only).

  17. Greg Linwood says:

    Grumpy – what is a "back end for a data warehouse"?
    Generally speaking, the right thing to do with MAXDOP is turn it off globally for OLTP & use DOP hints for the few reports or big (eg search) queries that actually benefit from parallelism in an OLTP. Even then, you probably don't want individual queries going parallel over ALL CPUs.
    This approach makes far more sense than leaving DOP on globally / unrestricted (0) & having using OPTION (MAXDOP 1) in a large number of queries to control the system (I've never seen anyone actually do this – generally most DBAs just turn parallelism off).
    IMO, it would be far better for SQL Server to be configured with MAXDOP off by default b/c the cases where things go wrong on OLTPs is fairly common. A lot less harm would come from requiring those who really need parallelism to turn it on.

  18. GrumpyOldDBA says:

    Well I guess I could say that it proves my point about disabling parallelism. Many applications do not necessarily fall into either the oltp or dss camp; but the databases I work with would not be described as being the back end for a data warehouse. I suppose you might say they were a type of order processing. My tests were not on what you call a DSS system and were extracted from systems with parallelism turned off where there were performance problems – I was making the point that many of this issues could be resolved by turning parallel processing back on. It's worth pointing out that there is a vast difference ( in the applications I have supported ) in how parallelism is handled from sql 2000 to sql 2005/8.
    Whilst I agree about using individual threads with increasing number of cores how many actual concurrent active processes do you really have on a system – probably far less than you might think, even on a 1000 user system.
    At least it generated a discussion !

  19. AaronBertrand says:

    But after reading Greg's comments I think a good point comes out of it… in a lot of scenarios where the large DSS queries are few and far between, it is probably safer (if you are nervous about these and other parallelism issues) to turn it off globally and add (MAXDOP n) to the DSS queries, than to leave it on and try to find all of the potential problem queries that should have the (MAXDOP 1) hint added manually.

  20. Jason Massie says:

    I agree that OTLP doesn't benefit from parallelism but if we are seeing cxpacket waits, we probably have missing indexes or need query tuning. Alot of OLTP system do have DSS type of queries that run either overnight or because the business requires real time data. Index maint also benefits from parallelism. The arguement could be made to leave parallelism on globally and turn it off on problem queries.

  21. K. Brian Kelley says:

    Configuration choices, aside, my concern is a simple one that I've seen voiced elsewhere. These are functionality breakdowns. Some of these are things folks have depended on because they're supposed to work. The @@IDENTITY and SCOPE_IDENTITY() bug is a perfect example.
    I think Aaron's point of "2011 is a long time away, and if some extra votes create an opportunity to raise the priority, then by all means, vote away," is right on the mark. And this reminds me of the whole SQL Server 2005 SP3 argument. We kept arguing for and it seemed to take a lot for MS to finally agree to put it together and release. If there's functionality issues in a product that's still under Mainstream support (and both SQL Server 2005 and 2008 are), we should be able to expect that these kinds of issues get fixed and not deferred to the next major version.

  22. Greg Linwood says:

    Great post Aaron btw, this list contains some important & very useful pieces of information..
    Cheers,
    Greg Linwood

  23. Greg Linwood says:

    GrumpyOldDBA
    I disagree with your assertion that Parallelism shouldn't be turned "off" by default.
    The following question should always be considered when assessing whether parallelism should be enabled or disabled – does the DBA want individual queries consuming multiple CPUs or, alternatively, does the DBA want multiple queries running concurrently on multiple CPUs?
    In OLTP scenarios, I think most would answer that multiple CPUs are installed to support multiple concurrent queries, not individual query parallelism. These DBAs should clearly turn parallelism off & use hints for the occassions they actually want parallelism (for the few big reports that run accross an OLTP system for example)
    There's little doubt that DSS work loads can benefit from parallelism, but this doesn't mean OLTP workloads do & by a very large margin, most SQL Server systems are OLTP systems.
    You said that you've seen 600% degradation in performance from turning off parallelism, but you didn't state whether this was OLTP or DSS based!?!? Leaving this crucial information out of your statement is a bad omission.
    Note that Microsoft's current top TPC-E OLTP benchmark has Max Degree of Parallelism turned OFF, whilst the top SQL Server TPC-H 10Tb+ Benchmark has it turned ON. Obviously Microsoft's TPC benchmarking team disagrees with your opinion.
    TPC-E (note MAXDOP config on page 30 of 60):
    http://www.tpc.org/results/FDR/tpce/nec.express5800.A1160.e1.7.0.090318.fdr.pdf
    TPC-H (note MAXDOP config on page 44 of 305):
    http://www.tpc.org/tpch/results/tpch_result_detail.asp?id=109021801
    It should also be noted that Oracle takes the opposite approach to SQL Server by leaving parallelism turned OFF by default. This is obviously because more Oracle databases are also deployed for OLTP than DSS. In my opinion, SQL Server *should* be configured the same way by default & this would benefit most users (the relative few who build DSS systems could simply turn it on).
    As for personal experiences, I have professionally measured the performance of thousands of SQL Server systems over 16 years & have NEVER seen an OLTP system that has benefitted more in its overall workload from having parallelism turned on than off. Furthermore, I have seen MANY instances where parallelism has brought even large SQL Server systems down, as I'm sure many others have as well.
    imo, you're up the wrong tree with this one..
    Cheers,
    Greg Linwood

  24. Alexander Kuznetsov says:

    Glenn,
    In high concurrency situations IDENT_CURRENT can return a value inserted from another connection.

  25. AaronBertrand says:

    Glenn, IDENT_CURRENT cannot guarantee that the value you get is the one that YOUR session created.  I do agree about CXPACKET waits.  I have seen them destroy query times that are supposed to be better with parallelism.  <shrug>

  26. Glenn Berry says:

    Using IDENT_CURRENT might be a good work-around for #1.
    If I see high CXPACKET waits with an OLTP workload, I am still going to recommend setting MAXDOP to 1 at the server level, and seeing what the results are. Its not like these settings are permanent…

  27. James Luetkehoelter says:

    Good post Aaron, parallelism has all sorts of means just within SQL. There's processor parallism, I/O affinity, parallel query plans, plus just what happnes under the covers with more than one core. People are often under the impression if they set MAX DOP =1 on some troublesome query (happened a lot in 7.0 and 2000, which still have an install base as you mention – it's happening right now), an 8 core server is wasting 7 processors. Not true. Whatever SQL passes down to the core might still use more than one processor.
    And, to make things more complicated, there's thread programming and creating your own parallelism. I recently saw an web app that had multiple threads running different queries to draw a page. 3 threads would finish in 2 seconds, one in 30. The end result – 30 seconds for the page to render.
    More processors, parallelism, advanced features – I find all of them detrimental if you don't take the time to understand them. Your list, imho, should be common knowledge to anyone who acts as a DBA or dedicated database developer.
    More post like this!

  28. Michelle Ufford says:

    Thanks, Aaron!  #1 is of particular concern for me, so I appreciate the head's up.  ðŸ™‚
    Oh, and FWIW, I do have some procs that use MAXDOP <> 1 restrictions, i.e. MAXDOP 2.  

  29. AaronBertrand says:

    GrumpyOldDBA,
    Clearly my intention was not to create a bunch of Chicken Littles running for the magical sp_configure button.  If it were, I would have titled the post, "Six reasons you should run and turn off parallelism on all your servers right now!"  The purpose of the post was two-fold:
    1) to point out potential issues where people RIGHT NOW might be getting incorrect results, or corrupting data, or worse, without even knowing about it.  If I saved one person from any of this hassle because they found the spots where this could happen, and changed the code or added the hint in that one spot, I'll gladly take 100 Chicken Littles as a consequence.
    2) to raise visibility to the Connect items themselves, which are not being ignored by Microsoft, but are not being treated with very high urgency, either.  2011 is a long time away, and if some extra votes create an opportunity to raise the priority, then by all means, vote away.
    My mention of the sp_configure option was not a suggestion to use this workaround, but rather to poke fun at how ridiculous the workaround is, as it was suggested by Microsoft folk in the Connect item.
    P.S. just because you have specific cases where you found high performance deltas, that does not mean that every single MAXDOP 1 case will yield similar pain, or even in the same ballpark.

  30. GrumpyOldDBA says:

    In certain circumstances any hint is valid within a production environment. I don't like putting in hints but I have and will do in production systems when needs arise. Whilst I accept not everything is always rosy in the garden I take a pretty dim view of articles/posts which encourage users to turn off paralleism on the server – which is what tends to happen – no doubt I will now find this post being championed as a reason to disable parallelism on SQL Server. btw. I have done some extensive tests with parallelism on 16 core boxes showing ( in my tests ) up to 600% performance degredation by turning off parallelism.

  31. AaronBertrand says:

    Good point on 1 Stephen, I realize that query hints override the global setting, I guess I am just naive in assuming that people don't go around hard-coding OPTION (MAXDOP n) in their queries, where n > the sp_configure setting.  In fact other than in educational samples I have yet to come across any production code sample using MAXDOP <> 1.

  32. Stephen Morris says:

    Hi
    re 1 – actually the sp_configure setting does not "force" MaxDoP to 1, query hints take precedence over the sp_configure setting – the sp_configure setting only adjusts the default (not specified behaviour).
    re 4 – At the time Lars was a colleague so I can definitely say that the error could be reproduced consistently.

  33. AaronBertrand says:

    Thanks Adam, updated inline.

  34. cinahcaM madA says:

    Aaron,
    With regard to #2, the workaround is to use the sys.dm_os_waiting_tasks DMV instead of sys.dm_exec_requests.  This does seem to work in all of my tests so far, and I should add that the next version of my Who is Active? script–to be released very shortly–uses the waiting tasks DMV in order to show correct blocker information.