Scheduler, Tasks, Workers, Threads, Requests, Sessions, Connections

From https://blogs.msdn.microsoft.com/sqlmeditation/2012/12/13/tasks-workers-threads-scheduler-sessions-connections-requests-what-does-it-all-mean/

Scheduler (SOS Scheduler) – the object that manages thread scheduling in SQL Server and allows threads to be exposed to the CPU (described in sys.dm_os_schedulers). This is the all-powerful but benign and graceful master whom everyone obeys. He does not control things directly but lets the workers work with each other and relies on their cooperation (cooperative scheduling mode). Each scheduler/master (one per logical CPU) accepts new tasks and hands them off to workers. The SOS scheduler allows only one worker at a time to be exposed to the CPU.

Task – a task represents the work that needs to be performed (sys.dm_os_tasks). A task contains one of the following request types: a query request (RPC event or language event), a prelogin request (prelogin event), a login request (connect event), a logout request (disconnect event), a query cancellation request (an Attention event), a bulk load request (bulk load event), or a distributed transaction request (transaction manager event). A task is what the master is all about – it is what defines his existence. Note that tasks are tracked at the SOS scheduler layer (thus dm_OS_tasks).

Worker (worker thread) – this is the logical SQL Server representation of a thread (think of it as a wrapper on top of the OS thread). It is a structure within the scheduler that maintains SQL Server-specific information about what a worker thread does (sys.dm_os_workers). Workers are the humble servants who carry out the tasks assigned to them by the master (scheduler).

Thread – this is the OS thread (sys.dm_os_threads) that is created via calls like CreateThread()/_beginthreadex(). A worker is mapped 1-to-1 to a thread.

Request – the logical representation of a query request made from the client application to SQL Server (sys.dm_exec_requests). The query request is assigned to a task, which the scheduler hands off to a worker to process. This covers query requests as well as system thread operations (like checkpoint, log writer, etc.); you will not find logins, logouts, attentions, and the like here. Also, note that this is a representation at the SQL execution engine level (thus dm_EXEC_requests), not at the SOS scheduler layer.

Sessions – when the client application connects to SQL Server, the two sides establish a “session” on which to exchange information. Strictly speaking, a session is not the same as the underlying physical connection; it is a SQL Server logical representation of a connection. But for practical purposes, you can think of it as a connection (session =~ connection). See sys.dm_exec_sessions. This is the old SPID that existed in SQL Server 2000 and earlier. You may sometimes notice a single session repeating multiple times in a DMV output. This happens because of parallel queries. A parallel query uses the same session to communicate with the client, but on the SQL Server side multiple workers (threads) are assigned to service the request. So if you see multiple rows with the same session ID, know that the query request is being serviced by multiple threads.

Connections – this is the actual physical connection established at the lower protocol level, with all of its characteristics (sys.dm_exec_connections). There is a 1:1 mapping between a Session and a Connection.
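These pieces can be tied together through their DMV key columns. The query below is a sketch of my own (not from the source article) that walks from session to request to task to worker to OS thread for currently executing user requests:

-- One row per active task; a parallel request appears as several rows
-- for the same session_id, each serviced by its own worker/thread.
SELECT s.session_id,
       r.request_id,
       t.task_address,
       t.scheduler_id,
       w.worker_address,
       th.os_thread_id
FROM   sys.dm_exec_sessions s
       JOIN sys.dm_exec_requests r ON r.session_id = s.session_id
       JOIN sys.dm_os_tasks t      ON t.session_id = r.session_id
                                  AND t.request_id = r.request_id
       JOIN sys.dm_os_workers w    ON w.worker_address = t.worker_address
       JOIN sys.dm_os_threads th   ON th.thread_address = w.thread_address
WHERE  s.is_user_process = 1;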

How Statistics are used for Query Plan generation

Statistics (data distribution statistics on columns/indexes) are used by the Query Optimizer to determine the appropriate choice of operators within a query plan.

Inaccurate or outdated statistics may lead to a poor-quality query execution plan and thereby to performance issues.

Statistics comprise two kinds of information (both can be inspected with DBCC SHOW_STATISTICS, as shown below the list):

  • Density vector
  • Histogram
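
As an illustration, the example below assumes the AdventureWorks sample database and its IX_SalesOrderDetail_ProductID index; substitute your own table and statistics/index name:

-- Density vector: one density value per column prefix of the statistics key.
DBCC SHOW_STATISTICS ('Sales.SalesOrderDetail', IX_SalesOrderDetail_ProductID) WITH DENSITY_VECTOR;

-- Histogram: up to 200 steps describing the value distribution of the leading column.
DBCC SHOW_STATISTICS ('Sales.SalesOrderDetail', IX_SalesOrderDetail_ProductID) WITH HISTOGRAM;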

https://msdn.microsoft.com/en-IN/library/ms174384.aspx

http://www.sqlpassion.at/archive/2014/01/28/inside-the-statistics-histogram-density-vector/

Query plans generated using histogram information are generally more efficient than plans generated from the density vector alone.

The following aspects drive how statistics are used by the SQL Server optimizer when generating a query plan.

Literals

Ex: SELECT SalesOrderDetailID, OrderQty FROM Sales.SalesOrderDetail WHERE ProductID = 897;

Constants or actual values used directly in the query as a filter provide better cardinality estimates.

Usually this results in an efficient query plan, because the optimizer knows the specific filter values before run time and can use the histogram part of the statistics to estimate the number of qualifying rows.
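One way to see that estimate (a sketch; any method of viewing the plan works) is to request the estimated plan for the literal query and check the estimated row count on the seek/scan operator, which the optimizer derives from the histogram:

-- Return the estimated XML plan without executing the statement.
SET SHOWPLAN_XML ON;
GO
SELECT SalesOrderDetailID, OrderQty
FROM   Sales.SalesOrderDetail
WHERE  ProductID = 897;
GO
SET SHOWPLAN_XML OFF;
GO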

Parameters

Ex: EXEC Get_OrderID_OrderQty @ProductID=897;

Using stored procedures and passing filter values as parameters to the stored procedure has some advantages (a sketch of such a procedure follows the list below).

  • The query plan generated for the first invocation of the stored procedure is re-used for subsequent executions.
  • This reduces CPU consumption, as there is no need to compile the code and generate an execution plan for each execution of the stored procedure.
  • The histogram part of the statistics is used to derive cardinality estimates, so this provides the best possible cardinality estimates.
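
The procedure Get_OrderID_OrderQty referenced in the example above is not defined in the source; a minimal sketch of it, assuming the AdventureWorks schema, could look like this:

-- Hypothetical procedure: the parameter value sniffed on the first execution
-- drives the plan, and that plan is cached and re-used by later calls.
CREATE PROCEDURE Get_OrderID_OrderQty
    @ProductID int
AS
BEGIN
    SELECT SalesOrderDetailID, OrderQty
    FROM   Sales.SalesOrderDetail
    WHERE  ProductID = @ProductID;
END;
GO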

It has a few disadvantages as well.

The query plan generated for the first invocation of the stored procedure may not be efficient for all subsequent executions, particularly if there is data distribution skew among the column values.

Also, the histogram part of the statistics will not be re-checked for subsequent executions of the stored procedure, as the query plan is re-used.

A work-around for this is to use either a statement-level or a stored-procedure-level RECOMPILE hint, which results in the generation of a fresh plan that takes into account the parameter values passed for that specific execution.
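Both forms of the work-around, applied to the hypothetical procedure above, look roughly like this:

-- Statement-level hint: only this statement gets a fresh plan on every execution.
ALTER PROCEDURE Get_OrderID_OrderQty
    @ProductID int
AS
BEGIN
    SELECT SalesOrderDetailID, OrderQty
    FROM   Sales.SalesOrderDetail
    WHERE  ProductID = @ProductID
    OPTION (RECOMPILE);
END;
GO

-- Procedure-level option: the entire procedure is recompiled on every execution.
ALTER PROCEDURE Get_OrderID_OrderQty
    @ProductID int
WITH RECOMPILE
AS
BEGIN
    SELECT SalesOrderDetailID, OrderQty
    FROM   Sales.SalesOrderDetail
    WHERE  ProductID = @ProductID;
END;
GO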

From the point of view of query optimization, parameter sniffing is a good thing, but it can negatively impact performance in a few scenarios.

More details on Parameter Sniffing concepts at http://www.brentozar.com/archive/2013/06/the-elephant-and-the-mouse-or-parameter-sniffing-in-sql-server/

http://www.benjaminnevarez.com/2010/06/the-parameter-sniffing-problem/

Local variables

Ex: DECLARE @ProductID INT=897
SELECT SalesOrderDetailID, OrderQty FROM Sales.SalesOrderDetail WHERE ProductID = @ProductID;

 

A traditional way to avoid the parameter sniffing problem, especially in earlier versions of SQL Server, was to use local variables. When you use local variables, SQL Server is not able to use the histogram anymore; instead it uses the information in the density vector of the statistics object. The OPTIMIZE FOR UNKNOWN query hint works in much the same way.

The values of local variables are not known at optimization time, so the Query Optimizer is not able to use the actual value for optimization, as it can with literals or parameters.

One benefit of using local variables or the OPTIMIZE FOR UNKNOWN hint is that you always get the same execution plan.
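For comparison, applying OPTIMIZE FOR UNKNOWN to the parameterized form (again using the hypothetical procedure from earlier) produces the same density-vector based estimate without switching to a local variable:

-- The hint tells the optimizer to ignore the sniffed value of @ProductID
-- and estimate cardinality from the density vector, as with a local variable.
ALTER PROCEDURE Get_OrderID_OrderQty
    @ProductID int
AS
BEGIN
    SELECT SalesOrderDetailID, OrderQty
    FROM   Sales.SalesOrderDetail
    WHERE  ProductID = @ProductID
    OPTION (OPTIMIZE FOR UNKNOWN);
END;
GO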

More details at http://www.brentozar.com/archive/2014/06/tuning-stored-procedures-local-variables-problems/

http://www.benjaminnevarez.com/2010/06/how-optimize-for-unknown-works/

Identify the actual query involved in a deadlock when it is not visible in the deadlock graph

Sometimes the actual query involved in a deadlock is not visible in the deadlock graph. In such cases, we can use the SQL below to identify the query based on the stmtstart, stmtend, and sqlhandle information that is available in the deadlock graph.

————————–

DECLARE @sqlhandle varbinary(64)
DECLARE @stmtstart INT
DECLARE @stmtend INT

SET @sqlhandle = 0x03000500eab1e835d941410165a400000100000000000000
SET @stmtstart = 20244
SET @stmtend = 20368

SELECT
       SUBSTRING(qt.[text], (qs.statement_start_offset/2) + 1,
         ((CASE WHEN qs.statement_end_offset = -1
           THEN DATALENGTH(qt.[text])
           ELSE qs.statement_end_offset END - qs.statement_start_offset)/2) + 1
         ) AS SQLText,
  qt.[text] AS batch,
  db.[name] AS database_name,
  qt.objectid,
  qs.execution_count,
  qs.total_worker_time,
  qs.total_elapsed_time,
  qs.total_logical_reads,
  qs.total_physical_reads,
  qs.total_logical_writes,
  qs.[sql_handle],
  qs.statement_start_offset,
  qs.statement_end_offset  
FROM    sys.dm_exec_query_stats qs
        CROSS APPLY sys.dm_exec_sql_text(qs.[sql_handle]) qt
        LEFT JOIN sys.databases db ON qt.[dbid] = db.database_id
WHERE   qs.[sql_handle] = @sqlhandle
AND     qs.statement_start_offset = @stmtstart
AND     qs.statement_end_offset = @stmtend

From

https://social.msdn.microsoft.com/Forums/sqlserver/en-US/58833801-02cc-4f33-a4bc-4a8501e7d227/statement-start-and-end-in-deadlock-graph
https://gallery.technet.microsoft.com/scriptcenter/4df31ad5-227c-495d-9c6c-d91d047e89ee

Using DMVs to identify long-running queries within a stored procedure

SELECT TOP 10
       MIN(query_stats.statement_text)                                                      AS statement_text,
       ( SUM(query_stats.total_elapsed_time) / SUM(query_stats.execution_count) ) / 1000000 AS avg_elapsed_time_seconds,
       SUM(query_stats.total_elapsed_time) / 1000000                                        AS total_elapsed_time_seconds,
       MAX(query_stats.total_elapsed_time) / 1000000                                        AS max_elapsed_time_seconds,
       SUM(query_stats.execution_count)                                                     AS run_count,
       query_stats.query_hash                                                               AS query_hash
FROM   (SELECT QS.*,
               SUBSTRING(ST.text, ( QS.statement_start_offset / 2 ) + 1,
                         ( ( CASE QS.statement_end_offset
                               WHEN -1 THEN DATALENGTH(ST.text)
                               ELSE QS.statement_end_offset
                             END - QS.statement_start_offset ) / 2 ) + 1) AS statement_text
        FROM   sys.dm_exec_query_stats AS QS
               CROSS APPLY sys.dm_exec_sql_text(QS.sql_handle) AS ST
        -- 'stored_procedure' is a placeholder: substitute the name of the procedure to analyze
        WHERE  ST.objectid = OBJECT_ID('stored_procedure')) AS query_stats
GROUP  BY query_stats.query_hash
ORDER  BY 2 DESC;

GO

 

Source: https://msdn.microsoft.com/en-us/library/ms189741%28v=sql.105%29.aspx