SQL Server: Using SET STATISTICS IO and SET STATISTICS TIME – how to read the results

Essentially, every DBA wants their queries to run faster. Whether it is trimming a query that takes 20 long minutes down to 1 minute, or trimming a 2-second query down to 1 second, every DBA’s ultimate goal is to reduce how long queries take to run. This article will help you understand the results of SET STATISTICS IO and SET STATISTICS TIME so you can tune a query to perform better.


Tuning a query seems simple enough: we want our queries to run faster. In the real world, though, many people find this difficult to accomplish. There are many reasons why. One of them is that query tuning takes place in an environment that often changes from second to second, making it hard to know exactly what is going on.

When you run a query, different server resources are used:

a) CPU time - CPU time is a fairly consistent measurement of the amount of CPU resources it takes for your query to run, and it is relatively independent of how busy your CPU is. If we run the same query over and over again, the CPU time needed to run it will remain very close from one execution to the next. It is not the total time it takes to run the query from start to end, but the amount of CPU time needed to run it.

b) IO - Whenever you run a query, SQL Server must retrieve data from the data cache in memory (logical reads), and if the data it needs is not in the data cache, it must first read it from disk (physical reads). Using high CPU and IO resources results in slow query performance, so our goal should be to rewrite our queries in such a way that they use the least CPU and IO resources.

By default, both settings are turned off. They can be turned on using Transact-SQL commands, which will be shown later in this article, or by setting the appropriate connection properties in the Options dialog from the Tools menu in Management Studio. Below is a screenshot of the graphical interface:



So let’s work through an example. As always, let’s make use of the AdventureWorks database.

Run the below query:

USE AdventureWorks
GO
SELECT * FROM Person.Address
GO

This query returns 19614 rows on my machine. Let’s run the same query with the SET STATISTICS IO and SET STATISTICS TIME commands turned on. Before setting these two commands on, run DBCC DROPCLEANBUFFERS and DBCC FREEPROCCACHE to clear out SQL Server’s data and procedure caches (do this only on a test server — clearing the caches on a production server will temporarily slow down every query). Below is the order:

First, run the code below:
DBCC DROPCLEANBUFFERS
DBCC FREEPROCCACHE

Then run the following:
SET STATISTICS IO ON
SET STATISTICS TIME ON

And then run the main query:
SELECT * FROM Person.Address
GO

This time you will see something different in the Messages tab after you execute the queries above, because SET STATISTICS TIME and SET STATISTICS IO are now turned on. Below is the screenshot from my machine:



The results you get may vary, as all servers are not the same and they differ in configurations.

Here is what the output means:

SET STATISTICS TIME Results:

‘SQL Server parse and compile time’ tells us how much CPU time and total elapsed time it took SQL Server to parse and compile our T-SQL statement. In my case, it took 0 ms of CPU time and a total of 4 ms of elapsed time. These numbers will probably vary from server to server, because each server is different.

If you re-execute the query without clearing the cache, most likely both the CPU time and the elapsed time here will be 0, as the cached query plan will be reused and no new compilation is required. So this section is not of much importance. The important part of the results is below:

SQL Server Execution Times:
   CPU time = 16 ms,  elapsed time = 348 ms.

What this tells us is how much CPU time was used to execute the query, and how long the query took to run. As said earlier, the CPU time is a fairly consistent measurement of the amount of CPU resources it takes for your query to run, relatively independent of how busy your CPU is. The elapsed time, on the other hand, is a wall-clock measurement of how long the query took to execute, including any time spent waiting on resources such as locks or disk reads. This number will vary a lot because it depends on the current load on your server.

Because the CPU time is relatively consistent, it is what you should use to determine whether the changes you make to your queries are helping or making things worse.

SET STATISTICS IO Results:


The statement below in the Messages tab shows the IO statistics of the query we ran:

Table 'Address'. Scan count 1, logical reads 280, physical reads 1, read-ahead reads 278,
lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.



Table - Name of the table.
Scan count - Number of index or table scans performed.
logical reads - Number of pages read from the data cache.
physical reads - Number of pages read from disk.
read-ahead reads - Number of pages placed into the cache for the query.
lob logical reads - Number of text, ntext, image, or large value type (varchar(max), nvarchar(max), varbinary(max)) pages read from the data cache.
lob physical reads - Number of text, ntext, image or large value type pages read from disk.
lob read-ahead reads - Number of text, ntext, image or large value type pages placed into the cache for the query.

Above is the short explanation for each item in the statement; below is a more detailed explanation of the most important ones:

Table: Name of the table (Address in our query)

Scan Count:
Refers to the number of times the tables referenced in the query were accessed. In our example, the table was accessed only once. If a query has one or more joins, this information becomes more useful: as we tune such a query, we can watch whether this count increases or decreases.
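To see this in action, a query with a join produces a separate line of IO statistics, each with its own scan count, for every table it touches. The example below is my own illustration using AdventureWorks tables, not from the original screenshots:

```sql
-- With STATISTICS IO on, this join produces one IO statistics line
-- for SalesOrderHeader and one for Address, each with its own scan count.
SELECT soh.SalesOrderID, a.City
FROM Sales.SalesOrderHeader AS soh
JOIN Person.Address AS a
    ON a.AddressID = soh.BillToAddressID
GO
```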

Logical Reads:
Number of pages read from the data cache. As we know, before SQL Server can do any operation on/access data, the data must be available in the data cache. A logical read is the number of pages (SQL Server reads data from the cache in 8K pages) SQL Server has to read from the data cache in order to produce the results specified by our query. The exact same number of logical reads will always occur when the same query is run against the exact same data on the Server.
As this is constant for a given query and data set, it helps us determine whether we are succeeding or failing in our efforts to tune the query. The fewer the logical reads, the more efficient the query.
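Since pages are 8 KB each, 280 logical reads mean the query touched roughly 280 × 8 KB ≈ 2.2 MB of data. As a sanity check (my own addition, not part of the original article), you can compare the logical reads of a full scan against the number of pages the table actually occupies:

```sql
-- A full table scan's logical reads should be close to the
-- number of pages used by the table's data.
SELECT SUM(used_page_count) AS total_pages
FROM sys.dm_db_partition_stats
WHERE object_id = OBJECT_ID(N'Person.Address')
GO
```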

Physical Reads:
Number of pages read from disk. The number of physical reads SQL Server uses to execute a query cannot be reduced by query tuning. Reducing physical reads is also an important task for the DBA, but it is a focus on whole server tuning, not query tuning. When we tune queries, we have no control over the size of the data cache or how busy the server is, or whether or not the data our query needs is in the data cache or on disk. The only thing we can directly control through query tuning is the number of logical reads that are required to produce the results we need.

Read-Ahead Reads:
Number of pages placed into the cache for the query. This value tells us the number of physical page reads that SQL Server performed as part of its read-ahead mechanism: to optimize performance, SQL Server reads physical data pages ahead of when it thinks your queries might need the data.


Tuning a query:

In fact, without a WHERE clause there is not much we can do to tune a query, as it has to scan the whole table to retrieve all the rows. So let’s go ahead and tune a basic query with a WHERE clause:
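The exact query appears only in the screenshot; it is a filter on the City column (the same column the article indexes next). The specific city value below is my assumption — any value matching 4 rows would reproduce the numbers discussed:

```sql
-- Filter on City; the literal value here is illustrative.
SELECT * FROM Person.Address
WHERE City = 'Bellingham'
GO
```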



After adding a WHERE clause, even though the elapsed time is down (the data is now cached), the logical reads are at 224, compared to 280 without a WHERE clause. The query retrieves just 4 rows, which means those 4 rows cost us almost the same as all 19614 rows. Not a good bargain, is it?

Let’s go ahead and create an index on the City column, which the query filters on. Execute the following code to create the index:

USE [AdventureWorks]
GO
CREATE NONCLUSTERED INDEX [IDX_City] ON [Person].[Address]
(
       [City] ASC
)
GO

Let’s re-execute the query with the WHERE condition, as shown in the screenshot below:



There you see the difference: logical reads are down to 10 from 224. That’s a significant improvement, isn’t it?
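If you were following along on a test server, you can clean up afterwards by dropping the demo index and turning the session settings back off:

```sql
-- Drop the index we created for this demo
DROP INDEX [IDX_City] ON [Person].[Address]
GO
-- Turn the statistics settings back off for this session
SET STATISTICS IO OFF
SET STATISTICS TIME OFF
GO
```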



By Ashutosh Barick