13

I am running SQL Server 2012 and am trying to put some queries together for monitoring using the DMVs. However, when looking at the total_elapsed_time field in the sys.dm_exec_requests DMV, the numbers look way off. Here's an example:

SELECT
  session_id, RunTime = CURRENT_TIMESTAMP,
  start_time, total_elapsed_time
FROM sys.dm_exec_requests
WHERE session_id = 284;

session_id  RunTime                 start_time              total_elapsed_time
284         2016-04-07 16:14:03.690 2016-04-07 16:08:14.587 1419976

By my calculations*, the elapsed time should be around around 349,103 -- not 1,419,976. That's off by over a factor of 4.

* From the difference, in milliseconds, between the current time and the start_time i.e.
SELECT DATEDIFF(MILLISECOND, '2016-04-07T16:08:14.587', '2016-04-07T16:14:03.690');

Here's the server info:

SELECT @@VERSION;

Microsoft SQL Server 2012 - 11.0.5592.0 (X64) 
    Apr 17 2015 15:18:46 
    Copyright (c) Microsoft Corporation
    Enterprise Edition: Core-based Licensing (64-bit) on Windows NT 6.1 <X64> (Build 7601: Service Pack 1)

Any ideas what could be causing this discrepancy?

Paul White
  • 94,921
  • 30
  • 437
  • 687
JoeNahmias
  • 464
  • 6
  • 18

3 Answers3

11

There is a bug that aggregates the time in a parallel operation. This is fixed in 2014.

The total_elapsed_time will be correct for a particular parallel query in a batch until it moves on to the next statement in the batch, then the total_elapsed_time will explode by the DOP.

Example

Run this in one query window:

USE AdventureWorks2012
GO
SELECT *
FROM Sales.SalesOrderDetail sod
JOIN Production.Product p ON sod.ProductID = p.ProductID
ORDER BY Style

waitfor delay '00:00:15'

Run this in a second:

select 
    DATEDIFF(ms, start_time, getdate()) ActualElapsedTime,
    total_elapsed_time from sys.dm_exec_requests
where session_id = <your session_id for the above batch>

The two values will be close to identical until SQL Server moves to the WAITFORDELAY statement, then you should see the total_elapsed_time explode (assuming the first query has a parallel plan as it does on my server).

I remember working on this for a customer a few years ago. Found the bug in the internal database (I am a Microsoft Premier Developer Consultant), but no public reference.

Chad Mattox
  • 2,156
  • 11
  • 13
7

It looks like it could also be an bug/issue with the DMV. There is a Connect bug report here of this same kind of inaccuracy. The suggested workaround is to use

GETDATE() - sys.dm_exec_requests.start_time

instead of total_elapsed_time. The issue is resolved in SQL Server 2014. To quote the comment on the Connect item by "Nathan (MSFT)":

The issue with sys.dm_exec_requests.total_elapsed_time isn't specific to RESTORE operations; it has also been observed with UPDATE STATISTICS as well. This issue will not be resolved in SQL Server 2008 R2. [...] The issue is resolved in SQL Server 2014.

Paul White
  • 94,921
  • 30
  • 437
  • 687
James Rhoat
  • 1,577
  • 4
  • 12
  • 32
2

I've checked a couple servers and on the background requests observed a drift of about 14s over 2 months.

However that aside, the only significant difference I can see on other requests is where the spid has gone into a SLEEPING state. I suspect that this value does not increment while in that state; but I haven't been able to force a query into SLEEPING in order to test it. (WAITFOR goes to suspended rather than sleeping).

There may be other causes but I haven't found any yet. You could rule this one out by monitoring your query to make sure it doesn't go to SLEEPING state.

Cody Konior
  • 3,538
  • 14
  • 23