Friday, November 25, 2011

SQL Waits Stats

Waits are represented by SQL Server wait statistics. SQL Server tracks wait information any time that a user connection or session_id is waiting. This wait information is summarized and categorized across all connections so that a performance profile can be obtained for a given work load. Therefore, SQL Server wait types identify and categorize user (or thread) waits from an application workload or user perspective. 

Queues measure system resources and utilization.  The queues part of performance is represented by Performance Monitor objects and counters and other sources of information. Performance Monitor counters measure various aspects of performance such as transfer rates for disks or the processor time consumed.  SQL Server object counters are exposed to Performance Monitor using the dynamic management view (DMV) sys.dm_os_performance_counters. Thus, Performance Monitor counters show performance from a resource point of view.  

Associations or correlations of wait types to performance counters, and interesting performance counter ratios and comparisons round out the picture.  The association of waits and queues allows one to eliminate irrelevant counters insofar as the performance bottleneck is concerned and focus effectively on the problem area.  Comparisons of one counter to another provide perspective in order to draw the right conclusion.  For example, say you encounter 1000 lock waits during a production workload.  Does this cause a serious performance problem? To determine what role these play in overall performance you need to also consider the total number of lock requests (how many locks are requested overall vs. how many result in lock waits), the wait time or duration of the lock waits, and the time span of the test.  If the total number of lock requests was 10 million, perhaps 1000 waits is insignificant.  Further, if the associated lock wait time is 50 seconds overall, and the workload is over 8 hours, this too is insignificant.  On the other hand, if you average a 50 second wait for EACH of the 1000 lock waits, this IS significant.  In sum, associations or correlations allow us to determine relevancy to overall performance.

Application performance can be easily explained by looking at SQL Server waits and System or Resource queues. In SQL Server the dynamic management view (DMV) sys.dm_os_wait_stats provides a valuable source of wait information from an application perspective. The system or resource perspective uses Performance Monitor counters and other data sources to provide a breakdown of system resource usage according to resource queues. Taken together, the value of the application and resource perspectives used by the waits and queues methodology enables us to narrow in on the bottleneck and eliminate the irrelevant performance data.


-- Isolate top waits for server instance since last restart or statistics clear

-- Clear Wait Stats
-- DBCC SQLPERF('sys.dm_os_wait_stats', CLEAR);
select @@servername as servername,*,CASE wait_time_ms WHEN 0 THEN 0 ELSE wait_time_ms/waiting_tasks_count END as wait_time_avg_ms
      sys.dm_os_wait_stats (NOLOCK)
ORDER BY waiting_tasks_count desc;
select  text,query_plan,requested_memory_kb,granted_memory_kb,used_memory_kb
from sys.dm_exec_query_memory_grants MG
cross apply sys.dm_exec_sql_text(sql_handle) t
cross apply sys.dm_exec_query_plan(MG.plan_handle)
(SELECT wait_type, wait_time_ms / 1000. AS wait_time_s,100. * wait_time_ms / SUM(wait_time_ms) OVER() AS pct,
FROM sys.dm_os_wait_stats
SELECT W1.wait_type,
CAST(W1.wait_time_s AS DECIMAL(12, 2)) AS wait_time_s,
CAST(W1.pct AS DECIMAL(12, 2)) AS pct,
CAST(SUM(W2.pct) AS DECIMAL(12, 2)) AS running_pct
FROM Waits AS W1
ON W2.rn <= W1.rn
GROUP BY W1.rn, W1.wait_type, W1.wait_time_s, W1.pct
HAVING SUM(W2.pct) - W1.pct < 95; -- percentage threshold

-- Common Significant Wait types with BOL explanations

-- *** Network Related Waits ***
-- ASYNC_NETWORK_IO        Occurs on network writes when the task is blocked behind the network

-- *** Locking Waits ***
-- LCK_M_IX                Occurs when a task is waiting to acquire an Intent Exclusive (IX) lock
-- LCK_M_IU                Occurs when a task is waiting to acquire an Intent Update (IU) lock
-- LCK_M_S                Occurs when a task is waiting to acquire a Shared lock

-- *** I/O Related Waits ***
-- ASYNC_IO_COMPLETION  Occurs when a task is waiting for I/Os to finish
-- IO_COMPLETION        Occurs while waiting for I/O operations to complete.
--                      This wait type generally represents non-data page I/Os. Data page I/O completion waits appear
--                      as PAGEIOLATCH_* waits
-- PAGEIOLATCH_SH        Occurs when a task is waiting on a latch for a buffer that is in an I/O request.
--                      The latch request is in Shared mode. Long waits may indicate problems with the disk subsystem.
-- PAGEIOLATCH_EX        Occurs when a task is waiting on a latch for a buffer that is in an I/O request.
--                      The latch request is in Exclusive mode. Long waits may indicate problems with the disk subsystem.
-- WRITELOG             Occurs while waiting for a log flush to complete.
--                      Common operations that cause log flushes are checkpoints and transaction commits.
-- PAGELATCH_EX            Occurs when a task is waiting on a latch for a buffer that is not in an I/O request.
--                      The latch request is in Exclusive mode.
-- BACKUPIO                Occurs when a backup task is waiting for data, or is waiting for a buffer in which to store data

-- *** CPU Related Waits ***
-- SOS_SCHEDULER_YIELD  Occurs when a task voluntarily yields the scheduler for other tasks to execute.
--                      During this wait the task is waiting for its quantum to be renewed.

-- THREADPOOL            Occurs when a task is waiting for a worker to run on.
--                      This can indicate that the maximum worker setting is too low, or that batch executions are taking
--                      unusually long, thus reducing the number of workers available to satisfy other batches.
-- CX_PACKET            Occurs when trying to synchronize the query processor exchange iterator
--                        You may consider lowering the degree of parallelism if contention on this wait type becomes a problem

No comments:

Post a Comment