Troubleshooting the CXPACKET wait type in SQL Server

The CXPACKET wait type is one of the most misinterpreted wait statistics in SQL Server. The name comes from Class Exchange Packet, and in essence it describes data rows exchanged between two parallel threads that are part of a single process. One thread is the “producer thread” and the other is the “consumer thread”. This wait type is directly related to parallelism, and it occurs whenever SQL Server executes a query using a parallel plan.

Generally speaking, CXPACKET waits are normal for SQL Server. They indicate that SQL Server is using a parallel plan to execute a query, which is generally faster than executing the same query serially. When a parallel plan is used, the query is executed on multiple threads and can continue only when all parallel threads have completed. This means that the query will be only as fast as its slowest thread.

The diagram below helps in understanding and interpreting the CXPACKET wait type.

[Figure: producer and consumer threads exchanging packets through a buffer]

As the diagram shows, whenever parallel execution can benefit a query, SQL Server creates multiple threads for that statement, with each thread producing its own subset of the data. Each thread can run on a separate physical or logical CPU. The producer and consumer threads communicate through the producer-consumer queue, which is actually a buffer. The query operator that implements this queue is called the Exchange operator.

One or more producer threads produce packets and send them to the buffer. The consumer threads then read that data from the buffer. During this process, four different scenarios can cause excessive CXPACKET waits:

  • The consumer threads cannot read packets because the buffer (queue) is empty, meaning that the producer threads are supplying data slowly or not at all. Some producer threads are either working slowly because they are waiting for a resource such as CPU, a memory grant or I/O, or they are simply blocked

  • The producer threads cannot store packets in the buffer because it is full. The consumer threads cannot process the data fast enough, so once the buffer fills up the producer threads must wait before they can store more data

  • Excessive parallelism for small queries, where creating and executing a parallel plan can be costlier and slower than using a serial plan

  • An uneven balance of packets across the parallel threads, where some threads complete their work faster than others and then wait for the remaining threads to finish

Let’s look inside the CXPACKET wait type to understand this process in more detail, starting with the ideal scenario for executing a query with a parallel plan.

[Figure: ideal scenario with the load evenly balanced across the parallel threads]

The image above shows a load that is properly balanced across the parallel threads. This is the ideal situation, as the threads execute in parallel without waiting on each other. But even in the ideal scenario, a parallel plan always has a “control thread”, and that thread registers CXPACKET waits. For the control thread, the CXPACKET wait represents the time needed to execute the parallel plan. It follows that CXPACKET waits are always present in parallel execution, even under ideal conditions, and that they indicate parallelism in query execution rather than that something went wrong. As a rule of thumb, as long as CXPACKET accounts for less than 50% of total waits, it should be treated as an indicator rather than as a problem.
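
To check where CXPACKET stands against the 50% rule of thumb, the cumulative wait statistics can be queried from `sys.dm_os_wait_stats`. The following is a minimal sketch, not a definitive diagnostic: it measures the share by wait time (an assumption, since the rule of thumb does not say whether time or count is meant), and the list of excluded idle wait types is an illustrative, incomplete selection.

```sql
-- Share of total wait time per wait type since the last restart
-- (or since the wait statistics were last cleared).
WITH waits AS (
    SELECT wait_type, waiting_tasks_count, wait_time_ms
    FROM sys.dm_os_wait_stats
    WHERE wait_time_ms > 0
      -- Assumption: a short, incomplete list of background/idle waits to ignore.
      AND wait_type NOT IN (
          N'SLEEP_TASK', N'LAZYWRITER_SLEEP', N'SQLTRACE_BUFFER_FLUSH',
          N'CHECKPOINT_QUEUE', N'REQUEST_FOR_DEADLOCK_SEARCH',
          N'XE_TIMER_EVENT', N'XE_DISPATCHER_WAIT', N'LOGMGR_QUEUE',
          N'BROKER_TASK_STOP', N'WAITFOR', N'DIRTY_PAGE_POLL',
          N'SP_SERVER_DIAGNOSTICS_SLEEP'
      )
)
SELECT TOP (10)
    wait_type,
    waiting_tasks_count,
    wait_time_ms,
    -- The window SUM is computed over all remaining wait types, not just the top 10.
    CAST(100.0 * wait_time_ms / SUM(wait_time_ms) OVER () AS decimal(5, 2))
        AS pct_of_total_wait_time
FROM waits
ORDER BY wait_time_ms DESC;
```

Because the counters are cumulative, a single heavy maintenance window can skew the percentages; comparing two snapshots taken some time apart gives a more representative picture.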

When high CXPACKET values are encountered, one possible issue, even when the work is evenly distributed, is that the cost of creating and running the parallel plan is higher than the cost of running the query serially. This is often overlooked, and the common knee-jerk reaction is to set the Max Degree of Parallelism (MAXDOP) to 1, so that each and every query is processed by a single CPU core. Setting MAXDOP to 1 should be the last resort when troubleshooting excessive CXPACKET wait times.

It is important to know that the SQL Server query optimizer uses the Cost Threshold for Parallelism (CTFP) to decide when a query should be parallelized. In other words, when the estimated cost of the serial query plan exceeds the cost threshold for parallelism, the optimizer will consider a parallel query plan. CTFP is set to 5 by default, which means that even a fairly inexpensive query can trigger the creation of a parallel plan.


The Cost Threshold for Parallelism value is often described as being in seconds: in that reading, a parallel plan is considered for every query that SQL Server estimates would run for longer than 5 seconds. Strictly speaking, the value is an abstract unit of estimated cost calibrated on a specific reference hardware configuration, not a unit of time on your server. The default was set back in the nineties, when single-core computers, slow hard drives and slow memory were the norm, and it is definitely not optimal for modern computers. What took 5 seconds to execute in that era executes in a fraction of a second on modern machines. In general, expressing query cost in seconds is not a good approach, as the actual cost depends on CPU, memory, I/O and so on, and the cost estimate does not reflect the actual speed of the CPUs or of the HDD/SSD storage in use.

To prevent unwanted parallelism, the CTFP value can be increased. A common rule of thumb is a minimum value of 25, and more recent analysis suggests 50 as a better starting minimum for modern computers. However, finding the proper CTFP value and fine-tuning it for maximum performance has to be done by analyzing the query plans and the available resources; that is the way to determine which CTFP setting works best for a specific system. A good guide to determining the CTFP value properly is the article Tuning ‘cost threshold for parallelism’ from the Plan Cache.
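
As a rough sketch of how the cost threshold can be inspected and raised, the setting is exposed through `sp_configure` as an advanced option. The value 50 below is only the starting point mentioned above, not a recommendation for any particular system.

```sql
-- 'cost threshold for parallelism' is an advanced option,
-- so advanced options must be visible first.
EXEC sys.sp_configure N'show advanced options', 1;
RECONFIGURE;

-- Show the configured and running values.
EXEC sys.sp_configure N'cost threshold for parallelism';

-- Raise the threshold (illustrative value; tune it against your own plan cache).
EXEC sys.sp_configure N'cost threshold for parallelism', 50;
RECONFIGURE;
```

The change takes effect without restarting the instance, so it is worth recording the previous value to make it easy to revert.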

Only when the options above are exhausted and the CXPACKET wait time is still high should changing the Maximum Degree of Parallelism be considered. The MAXDOP value is the maximum number of CPU cores that SQL Server will use to execute a single parallel query. The default MAXDOP setting is 0, which means that all available CPU cores can be used. On modern machines featuring 8, 12, 32, 64 or even more cores, it is not advisable to allow a single query to take over all the cores.

When a high CXPACKET value is accompanied by LATCH_XX waits, and by PAGEIOLATCH_XX or SOS_SCHEDULER_YIELD waits, it is an indicator that slow or inefficient parallelism itself is the actual root cause of the performance issue. If, in such a scenario, the LATCH_XX waits belong to the ACCESS_METHODS_DATASET_PARENT or ACCESS_METHODS_SCAN_RANGE_GENERATOR latch class, it is highly likely that the level of parallelism is the bottleneck and the actual root cause of the query performance issue. This is a typical case where MAXDOP should be reduced.
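
One way to see whether these two latch classes are actually accumulating waits is to look at `sys.dm_os_latch_stats`. This is a minimal sketch; what counts as a "high" value depends on the workload and is not defined here.

```sql
-- Cumulative latch waits for the two latch classes associated with parallel scans.
SELECT
    latch_class,
    waiting_requests_count,
    wait_time_ms,
    max_wait_time_ms
FROM sys.dm_os_latch_stats
WHERE latch_class IN (N'ACCESS_METHODS_DATASET_PARENT',
                      N'ACCESS_METHODS_SCAN_RANGE_GENERATOR')
ORDER BY wait_time_ms DESC;
```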

For more details on how to set MAXDOP properly for Intel, AMD and/or virtual machines, see the article Recommendations and guidelines for the “max degree of parallelism” configuration option in SQL Server.
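
For illustration, MAXDOP can be changed for the whole instance with `sp_configure`, or limited for a single statement with a query hint. The value 8, the value 4 and the table `dbo.Orders` are hypothetical placeholders; the right numbers depend on the hardware guidelines referenced above.

```sql
-- Instance-wide setting ('max degree of parallelism' is an advanced option).
EXEC sys.sp_configure N'show advanced options', 1;
RECONFIGURE;

EXEC sys.sp_configure N'max degree of parallelism';     -- show current value
EXEC sys.sp_configure N'max degree of parallelism', 8;  -- illustrative value
RECONFIGURE;

-- Per-query limit: cap a single statement without touching the instance setting.
-- dbo.Orders is a hypothetical table used only for this example.
SELECT COUNT(*)
FROM dbo.Orders
OPTION (MAXDOP 4);
```

The query hint is the less invasive choice when only a handful of statements misbehave, since every other query keeps the instance-wide setting.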


Everything described above has one aim: to allow large queries to run in parallel, as they can benefit significantly from it, while ensuring that small queries run serially, as that is the more efficient approach for them.

Another scenario where high CXPACKET values can occur is an uneven distribution of data across the threads. This is a typical scenario where CXPACKET is not the problem itself; rather, the CXPACKET value is an indicator that a problem exists. In such cases, troubleshooting should focus on other potential problems. The following graphic illustrates how this scenario can produce high CXPACKET values.

[Figure: uneven distribution of work across five parallel threads]

In this scenario, Thread 1 and Thread 2 have completed their processing and are now waiting for the other threads to finish. As shown here, threads 3 and 5 are still running. This type of thread wait is a CXPACKET wait. When the data each thread has to process is unevenly distributed, the CXPACKET wait type can sometimes reach significantly higher values. Most of the burden may fall on one or two threads instead of all five, as in our example, so the query takes longer to complete. In such cases, the CXPACKET wait is again an indicator that something is wrong, not with parallelism itself but with whatever is causing the uneven distribution of data per thread. The investigation should focus on causes such as improper indexing or out-of-date statistics, among others.
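
As a starting point for the statistics check, the sketch below lists when each statistics object on a table was last updated and how many modifications have accumulated since. The table name `dbo.Orders` is a hypothetical placeholder, and `WITH FULLSCAN` is just one possible sampling choice.

```sql
-- Age and churn of the statistics on one table (dbo.Orders is hypothetical).
SELECT
    OBJECT_SCHEMA_NAME(s.object_id) AS schema_name,
    OBJECT_NAME(s.object_id)        AS table_name,
    s.name                          AS stats_name,
    sp.last_updated,
    sp.rows,
    sp.rows_sampled,
    sp.modification_counter
FROM sys.stats AS s
CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
WHERE s.object_id = OBJECT_ID(N'dbo.Orders')
ORDER BY sp.modification_counter DESC;

-- Refresh the statistics if they turn out to be stale.
UPDATE STATISTICS dbo.Orders WITH FULLSCAN;
```

A large `modification_counter` relative to `rows` suggests the optimizer may be working from outdated row estimates, which is one way parallel threads end up with uneven shares of the work.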

It is also possible that a thread has to wait for external resources. The most common cases are:

  • the thread has to share an I/O resource with another database or application, which slows processing and requires more time to complete the work

  • a large, long-running parallelized query whose threads have to access different databases stored on different physical or logical storage with different speeds

  • the resources needed by some parallel threads are blocked by ad-hoc queries executing at the same time

This is another example where the CXPACKET wait type is just an indicator that something is wrong. In such situations it is recommended to look at the associated LCK_M_XX or PAGEIOLATCH_XX wait types, as well as the IO_COMPLETION and ASYNC_IO_COMPLETION waits that often accompany those two. Diagnosing and troubleshooting those wait types, rather than CXPACKET, is what will resolve the root cause of the parallelism issues that were red-flagged by the high CXPACKET value.
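
To see what the threads of a parallel query are really waiting on at a given moment, the currently waiting tasks can be inspected. The sketch below is one possible approach: it lists every waiting task of each session that has at least one task in a CXPACKET wait, so the non-CXPACKET waits of the sibling threads become visible.

```sql
-- All waiting tasks for sessions that currently have a CXPACKET wait.
-- Rows with a wait_type other than CXPACKET usually point at the real bottleneck.
SELECT
    wt.session_id,
    wt.exec_context_id,
    wt.wait_type,
    wt.wait_duration_ms,
    wt.blocking_session_id,
    wt.resource_description
FROM sys.dm_os_waiting_tasks AS wt
WHERE wt.session_id IN (
    SELECT cx.session_id
    FROM sys.dm_os_waiting_tasks AS cx
    WHERE cx.wait_type = N'CXPACKET'
)
ORDER BY wt.session_id, wt.exec_context_id;
```

This is a point-in-time view, so it may need to be run several times while the slow query is executing.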

To sum up, these are the recommended steps for diagnosing the cause of high CXPACKET wait statistics (before making any knee-jerk change to SQL Server):

  • Do not reflexively set MAXDOP to 1; it is almost never the solution and should be the last resort

  • Investigate the query and the CXPACKET history to determine whether the issue occurred just once or twice, as it could be an exception in a system that normally works correctly

  • Check the indexes and statistics on the tables used by the query and make sure they are up to date

  • Check the Cost Threshold for Parallelism (CTFP) and make sure the value is appropriate for your system

  • Check whether CXPACKET is accompanied by LATCH_XX waits (possibly with PAGEIOLATCH_XX or SOS_SCHEDULER_YIELD as well). If it is, then the MAXDOP value should be lowered to fit your hardware

  • Check whether CXPACKET is accompanied by LCK_M_XX waits (usually together with IO_COMPLETION and ASYNC_IO_COMPLETION). If it is, then parallelism is not the bottleneck. Troubleshoot those wait types to find the root cause of the problem and its solution
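
The last two checks can be approximated with a single query that puts CXPACKET next to the wait types it is supposed to be compared against. This is a sketch over cumulative counters, so it shows which waits accumulate alongside CXPACKET on the instance, not which waits belong to a specific query.

```sql
-- CXPACKET side by side with the wait types named in the checklist.
-- [_] matches a literal underscore in a LIKE pattern.
SELECT
    wait_type,
    waiting_tasks_count,
    wait_time_ms,
    signal_wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_type = N'CXPACKET'
   OR wait_type LIKE N'LATCH[_]%'
   OR wait_type LIKE N'PAGEIOLATCH[_]%'
   OR wait_type LIKE N'LCK[_]M[_]%'
   OR wait_type IN (N'SOS_SCHEDULER_YIELD', N'IO_COMPLETION', N'ASYNC_IO_COMPLETION')
ORDER BY wait_time_ms DESC;
```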

Translated from: https://www.sqlshack.com/troubleshooting-the-cxpacket-wait-type-in-sql-server/