
PASS Summit and Other Speaking Engagements

It is almost that time of the year again when we are all headed to attend the largest SQL Server conference in the world, the PASS Summit, hosted again in Seattle. I will also be attending some other SQL Server events where I will be giving probably more than a dozen presentations. But like a rock band getting ready for a world tour :-) I’ll start warming up with a local gig at our SQL Server user group. I’ll be presenting Query Processing in In-Memory OLTP (Hekaton) this Thursday, September 18th at the Los Angeles SQL Server Professionals Group. Their meetings are hosted at the UCLA campus and you can visit their page for more details.

Two days later I’ll be attending our SQLSaturday in San Diego, where I am scheduled to present three sessions: the Hekaton session previously mentioned, Understanding Parameter Sniffing, and Dive into the Query Optimizer – Undocumented Insight. As usual this event includes great speakers from all over the country and this time we will have Itzik Ben-Gan, Grant Fritchey, Randy Knight, TJay Belt, Mickey Stuewe, Brandon Leach, and Andrew Karcher among others. You can see the schedule and register at the SQLSaturday website.

So far I only have one event scheduled for October which is a local user group meeting at the San Dimas Software & Database Professionals group where again I will be presenting my Hekaton session. The session is scheduled for October 8th and you can find the meetup page here.

Then the first week of November I’ll be attending both the MVP Summit and the PASS Summit in Seattle, WA. But my flight will make a short stop first in Portland, OR to attend the Oregon SQLSaturday, which is scheduled for November 1st. My only session for the event, Query Processing in In-Memory OLTP (Hekaton), is scheduled at 4:30pm. After the Oregon SQLSaturday I will be attending my first MVP Summit, which runs from November 2nd to November 6th, and I am really excited about it. The PASS Summit starts on November 4th and I’ll be presenting Query Processing in In-Memory OLTP on Friday, November 7th at 8:00am, rooms 611-612. This will be my 12th PASS Summit and my 7th year as a presenter.

Before closing 2014, I’ve been invited to Istanbul, Turkey to participate in their SQLSaturday, which is scheduled for the first weekend of December. I am planning to deliver a pre-con on Friday the 5th covering “SQL Server Query Tuning & Optimization” topics and will be presenting my Hekaton session at the SQLSaturday the following day. I’ll post or tweet more details as they become available.

I look forward to seeing lots of SQL Server professionals and meeting new friends at these SQL Server events.

Finally, after writing a new book this year it looks like I took a good break from writing. I just started working on some articles again and will be posting them soon both on this blog and on SQLPerformance.com.

Speaking at SQLBits and the PASS Summit


I am really excited that this weekend I will be flying to the UK to participate in my first European SQL Server conference, SQLBits XII, where I am scheduled to present two sessions. SQLBits XII, the largest SQL Server conference in Europe, will be held at The International Centre in Telford, UK from July 17th to July 19th. My first session, Dive into the Query Optimizer – Undocumented Insight, is scheduled for Friday the 18th at 12:15 PM in room Ironbridge 3, and the second one, Understanding Parameter Sniffing, for Saturday the 19th at 8:15 AM in room Wenlock 1. These are sessions I have presented previously at the PASS Summit.

In addition, I got selected to present a session at the PASS Summit 2014, which will be held again in Seattle, WA this November. The PASS Summit is the largest SQL Server and BI conference in the world and this will be my 12th year attending and my 7th year as a presenter. My session, Query Processing in In-Memory OLTP, will cover query processing and optimization with the new Hekaton database engine, a topic which I also cover in a chapter of my new book.

Just a few days after the sessions for the PASS Summit 2014 were announced, I also got the nice surprise of being named a SQL Server MVP. It is a great honor for me to be part of this awesome team.

Finally, my new book SQL Server 2014 Query Tuning & Optimization has been completed and will be soon published by McGraw-Hill Education.

I look forward to meeting lots of SQL Server professionals at these conferences. See you all in Telford and Seattle.

New Book: SQL Server 2014 Query Tuning & Optimization


It looks like my blog has been very quiet for a while once again. Yes, I’ve been busy working on some other SQL Server projects since last year. One of those projects was working on Kalen Delaney’s new book SQL Server 2012 Internals where I co-authored two chapters, “Special Databases” and “Query Execution”, which were originally written for previous versions of the book by Kalen and Craig Freedman, respectively. The book was finally released last November.

As soon as I finished my part on Kalen’s book, I immediately started working on my new book, SQL Server 2014 Query Tuning & Optimization, which has been available for pre-order at Amazon for a while now. This book covers everything you need to know about query tuning and optimization for SQL Server to help you design high-performing database applications. As a SQL Server 2014 book it obviously covers new features like Hekaton, the new cardinality estimator, and incremental statistics, among others, but most of the content can be used in previous versions of SQL Server as well. I am currently working on the last items of the book with the great team at McGraw-Hill Education and the help of the amazing Dave Ballantyne (@davebally) as technical reviewer, and the book should be available soon.

Also, at the last PASS Summit in Charlotte I was invited by SQL Sentry to blog at their SQLPerformance.com site, which has great content by some of the most amazing bloggers in the SQL Server community. Although I haven’t had as much time to blog there as I would have wanted, I have started with a few SQL Server 2014 articles that you can read here.

Finally, I have submitted sessions to SQLBits (for the first time ever) and to the PASS Summit (which I have attended for 11 years so far), and just as I am writing this I got the great news that two of my sessions were selected for SQLBits, so I am really looking forward to finally attending this conference in the UK this July. I am also planning to attend some other SQL Server events in the US, including our always amazing local SQLSaturdays in Orange County this April 26th, Sacramento on July 12th, and San Diego, which is usually scheduled for September. Hope to see you at any of these SQL Server events.

Speaking at the PASS Summit 2013


In just a few days I will be headed to Charlotte, NC to attend my eleventh PASS Summit. Yes, I’ve been attending this, the largest SQL Server and BI conference in the world, every year since 2003. I am also honored to be speaking at the conference again, making this my sixth year as a presenter. My session, Defeating the Limitations of the Query Optimizer, which I just presented for the first time at the San Diego SQLSaturday #249, is scheduled at the end of the conference, Friday at 4:15 PM in Ballroom B.

Of course I am also planning to attend many sessions. I haven’t finished my schedule yet but of what I have seen so far I definitely want to attend Mark Russinovich’s Windows Azure Deep Dive, Bob Ward’s Inside SQL Server 2012 Memory: The Sequel, Conor Cunningham’s OLTP Sharding Techniques for Massive Scale, Kevin Liu’s SQL Server 2014 In-Memory OLTP: DBA Deep Dive and Kevin Farlee’s SQL Server 2014 In-Memory OLTP: DB Developer Deep Dive.

In addition to the PASS Summit I will also be attending SQL in the City on Monday and SQLSaturday Charlotte BI Edition on Saturday. I will also be speaking at this last event where I will present my session Dive into the Query Optimizer – Undocumented Insight. You can still register for these two free events.

I look forward to meeting lots of SQL Server professionals, including those whom I only know via twitter. See you all in Charlotte!

A Tour of the Hekaton AMR Tool

SQL Server 2014 CTP1 includes a tool to help you decide which tables and stored procedures you can move to Hekaton or In-Memory OLTP. In this post I will give you a quick tour of this tool, the AMR (Analysis, Migration and Reporting) tool. The AMR tool is integrated with the Data Collector, and to enable it you have to enable the new Transaction Performance Collection Sets in the Configure Data Collection Wizard, as shown next.

[Figure: Configure Data Collection Wizard with the new Transaction Performance Collection Sets option]

This will create two new collection sets, Stored Procedure Usage Analysis and Table Usage Analysis, in addition to the three system data collections sets previously available with the Data Collector.

Once you configure the Transaction Performance Collection Sets on the Data Collector you are ready to test the AMR tool. First, you need to create some database activity. In my case I am testing with the following stored procedures on a copy of AdventureWorks2012.

CREATE PROCEDURE test1
AS
SELECT * FROM Sales.SalesOrderHeader soh
    JOIN Sales.SalesOrderDetail sod ON soh.SalesOrderID = sod.SalesOrderID
WHERE ProductID = 870
GO
CREATE PROCEDURE test2
AS
SELECT ProductID, SalesOrderID, COUNT(*)
FROM Sales.SalesOrderDetail
GROUP BY ProductID, SalesOrderID

Execute the procedures multiple times, for example, 20 or 30.

EXEC test1
GO
EXEC test2
GO
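If you are running these from Management Studio or sqlcmd, keep in mind that the GO batch separator accepts a count, so repeating each batch, say, 30 times can be as simple as:

-- GO n is an SSMS/sqlcmd feature, not T-SQL; it repeats the preceding batch n times
EXEC test1
GO 30
EXEC test2
GO 30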

After you create some database activity you may have to wait for the next Data Collector upload job to execute. For example, the Stored Procedure Usage Analysis upload job runs every 30 minutes and the Table Usage Analysis runs every 15 minutes. You could also run these jobs (collection_set_5_upload and collection_set_6_upload respectively) manually.
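If you don’t want to wait for the schedule, something like the following should start the upload jobs on demand (the job names are the ones mentioned above; verify them against msdb.dbo.sysjobs on your instance):

-- start the Data Collector upload jobs manually
EXEC msdb.dbo.sp_start_job @job_name = N'collection_set_5_upload' -- Stored Procedure Usage Analysis
EXEC msdb.dbo.sp_start_job @job_name = N'collection_set_6_upload' -- Table Usage Analysis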

To access the AMR reports right-click your MDW database, select Reports, Management Data Warehouse and Transaction Performance Analysis Overview. This is the main AMR tool report.

[Figure: Transaction Performance Analysis Overview report]

Clicking Usage Analysis in the Tables Analysis section will show the Recommended Tables Based on Usage report, listing the top candidate tables for memory optimization based on the access patterns of the workload. The report graph shows the performance benefits of memory optimization along with the migration effort required to move tables to In-Memory OLTP, based on how many unsupported features the table uses.

[Figure: Recommended Tables Based on Usage report]

Selecting Contention Analysis in the Tables Analysis section of the main report will show the Recommended Tables Based on Contention report, as shown next.

[Figure: Recommended Tables Based on Contention report]

In both cases, the reports recommend prioritizing the tables in the top right corner of the graph. Clicking on a table name on any of the previous reports will show the table performance statistics details, including both table usage and contention statistics. An example is shown next.

[Figure: table performance statistics details report]

On the other hand, selecting Usage Analysis in the Stored Procedure Analysis section of the main report will take us to the Recommended Stored Procedures Based on Usage report, shown next. This report shows the top stored procedures based on total worker time in milliseconds.

[Figure: Recommended Stored Procedures Based on Usage report]

Similarly, clicking on a specific stored procedure will show the stored procedure execution statistics detail and the tables it references. An example for the test2 stored procedure is next.

[Figure: execution statistics details for the test2 stored procedure]

You can also click on the listed tables to get the table’s performance statistics report shown previously.

In summary the AMR tool will be very helpful in providing recommendations as to which tables and stored procedures you might want to consider migrating into Hekaton. You could even upgrade your database to SQL Server 2014 and run this tool for analysis of your real performance data and iteratively move tables and stored procedures to In-Memory OLTP as required.

Hekaton: Transforming Query Plans into C Code

In-Memory OLTP, code-named Hekaton, is without a doubt the most exciting new feature of SQL Server 2014. We all first learned about this technology when it was announced back at the PASS Summit 2012 conference, where it was also mentioned that it would be included in the next major release of SQL Server. Recently at TechEd North America Microsoft finally announced that this new version will be named SQL Server 2014 and that its first CTP would be available to download by the end of June. So after playing with this CTP for a few days I decided to write a few posts mostly focused on the query processing area of the technology.

But perhaps one of the most important concepts we are learning with Hekaton is that even with the new hardware available today, obtaining ten times or better performance also requires dramatically changing the way data management systems are designed. For example, database management systems were originally designed under the assumption that memory is limited and data has to reside on disk pages. This assumption is no longer true, and currently it is possible to fit entire databases in memory, or at least the most critical tables in an OLTP environment. So taking advantage of this available memory is not just a matter of reading more of the existing disk pages into memory but of re-designing data management systems using a different approach to get the most possible benefit from this new hardware.

The best way to start learning about Hekaton is reading Kalen Delaney’s white paper SQL Server In-Memory OLTP Internals Overview for CTP1 and, of course, downloading and playing with CTP1 itself. So, assuming you are familiar with the basic concepts, in this post I will focus on two of the main objects that Hekaton provides:

1) Memory optimized tables. Hekaton tables are stored in memory, all the time. Obviously they are also kept on disk, but only for durability purposes. These tables provide two new kinds of indexes, hash and range indexes, although only the former is available in the first CTP. And you may be surprised that memory optimized tables are compiled into native DLLs when they are created.

2) Natively compiled stored procedures. This new kind of stored procedure is first optimized by the query optimizer, like any regular stored procedure, but is then translated into C code and compiled into machine code. This process again creates DLLs.

Limitations

Perhaps the main limitation of Hekaton is that memory optimized tables cannot be altered; they need to be dropped and created again with the required changes. This is the case, for example, if you want to add a new column or a new index. Obviously, dropping the table means first copying its data to another location, creating a new empty table with the required changes, and copying the data back, which would require some downtime for live applications.

You cannot alter natively compiled stored procedures either, or even recompile them (except in a few limited cases). In addition, as the data in Hekaton tables changes, statistics are not automatically updated; you will need to update them manually using the UPDATE STATISTICS statement. Also, after getting updated statistics your existing natively compiled stored procedures cannot take advantage of them; you will have to manually drop and re-create each stored procedure.
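As a rough sketch of that maintenance, taking the table and natively compiled procedure created later in this post as an example, the sequence would look something like this (the CTP documentation indicates that statistics on memory optimized tables must be updated with FULLSCAN and NORECOMPUTE; verify the exact options against your build):

-- manually refresh statistics on the memory optimized table
-- (FULLSCAN, NORECOMPUTE per the CTP documentation; check your build)
UPDATE STATISTICS dbo.TransactionHistoryArchive WITH FULLSCAN, NORECOMPUTE
GO
-- natively compiled procedures do not pick up the new statistics automatically,
-- so drop and re-create them from their original definition
DROP PROCEDURE dbo.test
GO
-- re-run the CREATE PROCEDURE test WITH NATIVE_COMPILATION ... definition shown below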

Example

Let me show you an example of how the technology works, using a copy of the AdventureWorks2012 database. First let us create a new database.

CREATE DATABASE Test
ON PRIMARY (NAME = [Test_data],
FILENAME = 'C:\DATA\Test_data.mdf', SIZE=500MB),
FILEGROUP [Test_fg] CONTAINS MEMORY_OPTIMIZED_DATA 
(NAME = [Test_fg], FILENAME = 'C:\DATA\Test_fg')
LOG ON (NAME = [Test_log], Filename='C:\DATA\Test_log.ldf', SIZE=500MB)
COLLATE Latin1_General_100_BIN2;

Now we can create a memory optimized table

CREATE TABLE [TransactionHistoryArchive] (
    [TransactionID] [int] NOT NULL PRIMARY KEY 
        NONCLUSTERED HASH WITH (BUCKET_COUNT = 100000),
    [ProductID] [int] NOT NULL,
    [ReferenceOrderID] [int] NOT NULL,
    [ReferenceOrderLineID] [int] NOT NULL,
    [TransactionDate] [datetime] NOT NULL,
    [TransactionType] [nchar](1) NOT NULL,
    [Quantity] [int] NOT NULL,
    [ActualCost] [money] NOT NULL,
    [ModifiedDate] [datetime] NOT NULL
) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA)

You cannot copy data directly from a second database to a memory optimized table. So we need to copy the data to a regular table first.

SELECT * INTO temp
FROM AdventureWorks2012.Production.TransactionHistoryArchive

Now we can copy the data to the memory optimized table

INSERT INTO TransactionHistoryArchive
SELECT * FROM temp

Let us look at our first execution plans using regular T-SQL against our memory optimized table (which, by the way, is possible thanks to the query interop capabilities).

SELECT * FROM TransactionHistoryArchive
WHERE TransactionID = 8209

Since we defined a hash index on the TransactionId column we get the following plan

[Figure: execution plan showing an Index Seek on the hash index]

Now a query using a different column

SELECT * FROM TransactionHistoryArchive
WHERE ProductID = 780

[Figure: execution plan for the ProductID query]

The execution plans show the familiar Index Scan and Index Seek operations, but obviously the algorithms behind them are not the same; this time the table and indexes are in memory and use totally different structures. You can identify the differences in these operators by looking for NonClusteredHash after the name of the operator and Storage shown as MemoryOptimized in the properties of the operator, as shown next.

[Figure: operator properties showing Storage as MemoryOptimized]
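If you want to confirm the hash index and its bucket count from the metadata rather than from the plan, the sys.hash_indexes catalog view (new in SQL Server 2014) should show it:

-- list hash indexes and their bucket counts in the current database
SELECT OBJECT_NAME(object_id) AS table_name, name AS index_name, bucket_count
FROM sys.hash_indexes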

Let us now create a natively compiled stored procedure. And yes, all the new clauses (like NATIVE_COMPILATION or SCHEMABINDING) are required; leaving any of them out will produce an error.

CREATE PROCEDURE test
WITH NATIVE_COMPILATION, SCHEMABINDING, EXECUTE AS OWNER
AS
BEGIN ATOMIC WITH (TRANSACTION ISOLATION LEVEL = SNAPSHOT, LANGUAGE = 'us_english')
SELECT TransactionID, ProductID, ReferenceOrderID FROM dbo.TransactionHistoryArchive
WHERE ProductID = 780
END

Then you can run the procedure by using

EXEC test

Something that you may notice is that executing the native stored procedure does not show any execution plan. At this moment I don’t know if the plan will be visible in a later CTP or the RTM, but assuming it won’t, this may be a problem as we would not have an execution plan to troubleshoot. You may say that we can still see the plan using the T-SQL code, but keep in mind that data and statistics may have changed since the procedure was created, so the plan you currently get is not necessarily the same one that was compiled to native code. Anyway, we will just have to wait for the next CTPs and see how this is going to work.

UPDATE: I got an update from Jos de Bruijn from Microsoft saying “You are correct that the plan for the native procedure may not be the same as in interpreted T-SQL. Not only can there be differences in statistics, but there are also certain operators we do not support in native procedures (e.g. hash join, hash aggregate). Starting with CTP2, we will be supporting SHOWPLAN_XML (also called the “estimated execution plan” in Management Studio), so you will be able to see the plan that was used to compile the procedure.”
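Once you are on a build that supports it, the usual estimated plan approach should work; a minimal sketch (remember that SET SHOWPLAN_XML must be the only statement in its batch):

SET SHOWPLAN_XML ON
GO
-- returns the XML plan that was used to compile the native procedure, without executing it
EXEC test
GO
SET SHOWPLAN_XML OFF
GO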

Finally, it is interesting to note that the generated DLLs are not kept in the database but in the file system, and you can find them, along with some other intermediate files, by looking at the location returned by the following query

SELECT name, description FROM sys.dm_os_loaded_modules
WHERE description = 'XTP Native DLL'

Generated C code is available at that location. A small fragment of the C code created for the previous examples is shown next

#include "hkenggen.h"
#include "hkrtgen.h"
#include "hkgenlib.h"

#define ENABLE_INTSAFE_SIGNED_FUNCTIONS
#include "intsafe.h"

int _fltused = 0;

int memcmp(const void*, const void*, size_t);
void *memcpy(void*, const void*, size_t);
void *memset(void*, int, size_t);

#define offsetof(s,f)   ((size_t)&(((s*)0)->f))

struct hkt_277576027
{
    __int64 hkc_5;
    __int64 hkc_8;
    __int64 hkc_9;
    long hkc_1;
    long hkc_2;
    long hkc_3;
    long hkc_4;
    long hkc_7;
    unsigned short hkvdo[2];
};

Query Optimization Research Papers

Two questions I am sometimes asked are which sources I researched to write my Query Optimizer book and which research papers I can recommend to learn more about query optimization. Since I got asked about this again at the Tampa SQLSaturday last week, I wrote this short article on my flight back to Los Angeles to discuss the topic.

But first a warning: reading these academic papers may require a strong computer science background and they are usually more complicated than the SQL Server documentation, books or blogs we read every day. In addition, there are dozens or even hundreds of these articles, covering more than 40 years of query optimization research. Although I cannot list all the ones I have read I can definitely give you a way to get started so you can continue with the topics that may interest you.

Research papers reference other papers in the text and you can find the referenced paper details at the end of each article, so if you are interested in one particular area you can go and read that paper directly. By following other listed sources, which will also have additional references, you could find an almost unlimited source of information.

Although research papers usually focus on a specific area or problem you can get started by reading a few articles which are a more general overview before trying to read more specific topics. Some of these papers to get started are:

An Overview of Query Optimization in Relational Systems by Surajit Chaudhuri

Query Optimization by Yannis E. Ioannidis

An Overview of Data Warehousing and OLAP Technology by Surajit Chaudhuri, Umeshwar Dayal

By following references on those and other similar papers you can find dozens of articles which would be impossible to list here, but just to give you three examples:

Optimizing Join Orders by Michael Steinbrunn, Guido Moerkotte, Alfons Kemper

An Overview of Cost-based Optimization of Queries with Aggregates by Surajit Chaudhuri

Counting, Enumerating, and Sampling of Execution Plans in a Cost-Based Query Optimizer by Florian Waas, Cesar Galindo-Legaria

Some of these papers may be SQL Server related:

Query Processing for SQL Updates by Cesar Galindo-Legaria, Stefano Stefani, Florian Waas

Self-Tuning Database Systems: A Decade of Progress by Surajit Chaudhuri

An Efficient Cost-Driven Index Selection Tool for Microsoft SQL Server by Surajit Chaudhuri, Vivek Narasayya

SQL Server implemented its own cost-based query optimizer based on the Cascades Framework when its database engine was re-architected for the release of SQL Server 7.0. Cascades is in turn based on other previous research work: Volcano and Exodus. You can read about these research projects here:

The Cascades Framework for Query Optimization by Goetz Graefe

The Volcano optimizer generator: Extensibility and efficient search by Goetz Graefe

The EXODUS Optimizer Generator by Goetz Graefe, David J. DeWitt

Finally, in this post I covered query optimization papers but obviously you can also find information on other areas of database research as well.

Statistics on Ascending Keys

One query processor problem I’ve been trying to research for some time now is that of statistics on ascending keys. The traditional recommendation from Microsoft to fix this problem is to manually update statistics after loading data, as explained here. The document describes the problem in the following way:

“Statistics on ascending or descending key columns, such as IDENTITY or real-time timestamp columns, might require more frequent statistics updates than the query optimizer performs. Insert operations append new values to ascending or descending columns. The number of rows added might be too small to trigger a statistics update. If statistics are not up-to-date and queries select from the most recently added rows, the current statistics will not have cardinality estimates for these new values. This can result in inaccurate cardinality estimates and slow query performance. For example, a query that selects from the most recent sales order dates will have inaccurate cardinality estimates if the statistics are not updated to include cardinality estimates for the most recent sales order dates.”

Trace flags 2389 and 2390, which were first published by Ian Jose in his article Ascending Keys and Auto Quick Corrected Statistics, can help to address this problem. Ian explains that when data typically ascends, most new insertions are out of the previously found range; that is, the new values fall outside the range of values covered by the existing statistics histogram. This can lead to poorly performing plans, as filters selecting recent data seem to exclude the entire relation when in fact a significant number of rows may be included.

To show you what the problem is and how this mysterious trace flag 2389 works, let us start by creating a table in AdventureWorks2012.

CREATE TABLE dbo.SalesOrderHeader (
    SalesOrderID int NOT NULL,
    RevisionNumber tinyint NOT NULL,
    OrderDate datetime NOT NULL,
    DueDate datetime NOT NULL,
    ShipDate datetime NULL,
    Status tinyint NOT NULL,
    OnlineOrderFlag dbo.Flag NOT NULL,
    SalesOrderNumber nvarchar(25) NOT NULL,
    PurchaseOrderNumber dbo.OrderNumber NULL,
    AccountNumber dbo.AccountNumber NULL,
    CustomerID int NOT NULL,
    SalesPersonID int NULL,
    TerritoryID int NULL,
    BillToAddressID int NOT NULL,
    ShipToAddressID int NOT NULL,
    ShipMethodID int NOT NULL,
    CreditCardID int NULL,
    CreditCardApprovalCode varchar(15) NULL,
    CurrencyRateID int NULL,
    SubTotal money NOT NULL,
    TaxAmt money NOT NULL,
    Freight money NOT NULL,
    TotalDue money NOT NULL,
    Comment nvarchar(128) NULL,
    rowguid uniqueidentifier NOT NULL,
    ModifiedDate datetime NOT NULL
)

Populate the table with some initial data and create an index on it (notice that both tables have the same name but in the dbo and Sales schemas)

INSERT INTO dbo.SalesOrderHeader SELECT * FROM Sales.SalesOrderHeader 
WHERE OrderDate < '2008-07-20 00:00:00.000'
CREATE INDEX IX_OrderDate ON SalesOrderHeader(OrderDate)

The problem

After creating the index SQL Server will also create a statistics object for it, so a query like this will have a good cardinality estimate, as shown next (there is data for July 19 and it is captured in the last step of the statistics histogram, which you can verify by using the DBCC SHOW_STATISTICS statement).

SELECT * FROM dbo.SalesOrderHeader WHERE OrderDate = '2008-07-19 00:00:00.000'

[Figure: execution plan with an accurate cardinality estimate for July 19]
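If you want to inspect the histogram yourself, the WITH HISTOGRAM option limits the DBCC SHOW_STATISTICS output to just that section:

-- show only the histogram steps for the IX_OrderDate statistics object
DBCC SHOW_STATISTICS ('dbo.SalesOrderHeader', 'IX_OrderDate') WITH HISTOGRAM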

Now, let us suppose we add new data for July 20th.

INSERT INTO dbo.SalesOrderHeader SELECT * FROM Sales.SalesOrderHeader 
WHERE OrderDate = '2008-07-20 00:00:00.000'

Changing the query to look for records for July 20

SELECT * FROM dbo.SalesOrderHeader WHERE OrderDate = '2008-07-20 00:00:00.000'

Since the number of rows added is too small, it does not trigger an automatic update of statistics. And since the value July 20 is not represented in the histogram, SQL Server will use an estimate of 1, as shown in the following plan

[Figure: execution plan with an estimate of 1 row]

Although both plans shown in this example are very similar, a bad cardinality estimate may produce bad plans in more realistic scenarios and queries.
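By the way, if your build includes the sys.dm_db_stats_properties DMF (available starting with SQL Server 2008 R2 SP2 and SQL Server 2012 SP1), you can confirm that the insert did not reach the automatic update threshold by looking at the modification counter:

-- last update date, row count and modifications since the last statistics update
SELECT s.name, sp.last_updated, sp.rows, sp.modification_counter
FROM sys.stats AS s
CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
WHERE s.object_id = OBJECT_ID('dbo.SalesOrderHeader')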

Using trace flag 2389

Now let us see how trace flag 2389 helps with this problem. Run the next statements (notice that trace flag 2388 has not been mentioned before and will be explained shortly)

DBCC TRACEON (2388)
DBCC TRACEON (2389)

Trace flag 2389, which was introduced with SQL Server 2005 Service Pack 1, begins to track the nature of columns via subsequent statistics update operations. When the leading column values are seen to increase three times in a row, the column is branded ascending.

Trace flag 2388 is not required to enable the behavior described in this article, but we can use it to show how trace flag 2389 works and to determine whether a column has been branded ascending. This trace flag changes the output of the DBCC SHOW_STATISTICS statement to show a historical look at the most recent statistics update operations.

Trace flag 2390 enables a behavior similar to 2389 even when the ascending nature of the column is not known, but I will not cover it here.

Run DBCC SHOW_STATISTICS

DBCC SHOW_STATISTICS ('dbo.SalesOrderHeader', 'IX_OrderDate')

The statement shows the following output

Updated              Table Cardinality      Snapshot Ctr         Steps  Density                Rows Above             Rows Below             Squared Variance Error Inserts Since Last Update Deletes Since Last Update Leading column Type
-------------------- ---------------------- -------------------- ------ ---------------------- ---------------------- ---------------------- ---------------------- ------------------------- ------------------------- -------------------
Feb 26 2013  2:31AM  31095                  31095                199    0.000899280596058816   NULL                   NULL                   NULL                   NULL                      NULL                      Unknown

Not much data for now. But I’ll show you this output after three consecutive batches inserting data and updating statistics. Run the following statement to update statistics, including the data you just added for July 20th.

UPDATE STATISTICS dbo.SalesOrderHeader WITH FULLSCAN

DBCC SHOW_STATISTICS now shows

Updated              Table Cardinality      Snapshot Ctr         Steps  Density                Rows Above             Rows Below             Squared Variance Error Inserts Since Last Update Deletes Since Last Update Leading column Type
-------------------- ---------------------- -------------------- ------ ---------------------- ---------------------- ---------------------- ---------------------- ------------------------- ------------------------- -------------------
Feb 26 2013  2:40AM  31125                  31125                199    0.000898472615517676   30                     0                      0.0135968539563045     30                        0                         Unknown
Feb 26 2013  2:31AM  31095                  31095                199    0.000899280596058816   NULL                   NULL                   NULL                   NULL                      NULL                      NULL

where ‘Rows Above’ and ‘Inserts Since Last Update’ account for the 30 rows added previously (you may need to scroll to the right). Now run the second batch

INSERT INTO dbo.SalesOrderHeader SELECT * FROM Sales.SalesOrderHeader 
WHERE OrderDate = '2008-07-21 00:00:00.000'

Again running this query will verify the one row estimate in the plan

SELECT * FROM dbo.SalesOrderHeader WHERE OrderDate = '2008-07-21 00:00:00.000'

Update statistics again

UPDATE STATISTICS dbo.SalesOrderHeader WITH FULLSCAN

DBCC SHOW_STATISTICS now shows this. Notice a new record with ‘Inserts Since Last Update’ and ‘Rows Above’ showing a value of 27. ‘Leading column Type’ still shows ‘Unknown’.

Updated              Table Cardinality      Snapshot Ctr         Steps  Density                Rows Above             Rows Below             Squared Variance Error Inserts Since Last Update Deletes Since Last Update Leading column Type
-------------------- ---------------------- -------------------- ------ ---------------------- ---------------------- ---------------------- ---------------------- ------------------------- ------------------------- -------------------
Feb 26 2013  2:44AM  31152                  31152                199    0.000897666090168059   27                     0                      0.0122265623860741     27                        0                         Unknown
Feb 26 2013  2:40AM  31125                  31125                199    0.000898472615517676   30                     0                      0.0135968539563045     30                        0                         NULL
Feb 26 2013  2:31AM  31095                  31095                199    0.000899280596058816   NULL                   NULL                   NULL                   NULL                      NULL                      NULL

A third batch

INSERT INTO dbo.SalesOrderHeader SELECT * FROM Sales.SalesOrderHeader 
WHERE OrderDate = '2008-07-22 00:00:00.000'

Update statistics one last time

UPDATE STATISTICS dbo.SalesOrderHeader WITH FULLSCAN

DBCC SHOW_STATISTICS now shows

Updated              Table Cardinality      Snapshot Ctr         Steps  Density                Rows Above             Rows Below             Squared Variance Error Inserts Since Last Update Deletes Since Last Update Leading column Type
-------------------- ---------------------- -------------------- ------ ---------------------- ---------------------- ---------------------- ---------------------- ------------------------- ------------------------- -------------------
Feb 26 2013  2:47AM  31184                  31184                199    0.000896860961802304   32                     0                      0.0144758706820584     32                        0                         Ascending
Feb 26 2013  2:44AM  31152                  31152                199    0.000897666090168059   27                     0                      0.0122265623860741     27                        0                         NULL
Feb 26 2013  2:40AM  31125                  31125                199    0.000898472615517676   30                     0                      0.0135968539563045     30                        0                         NULL
Feb 26 2013  2:31AM  31095                  31095                199    0.000899280596058816   NULL                   NULL                   NULL                   NULL                      NULL                      NULL

In addition to the new record accounting for the 32 rows added, now you can notice that the branding was changed to ‘Ascending’. Once the column is branded ‘Ascending’ SQL Server will be able to give you a better cardinality estimate, without the need to manually update statistics.

Now try this batch

INSERT INTO dbo.SalesOrderHeader SELECT * FROM Sales.SalesOrderHeader 
WHERE OrderDate = '2008-07-23 00:00:00.000'

And run the following query

SELECT * FROM dbo.SalesOrderHeader WHERE OrderDate = '2008-07-23 00:00:00.000'

This time we get a better cardinality estimate, and notice that no UPDATE STATISTICS was required.

[Figure: execution plan with an estimate of 27.9677 rows]

Instead of the estimate of one row, we now get 27.9677. But where is this value coming from? The query optimizer is now using the density information of the statistics object. The definition of density is 1 / number of distinct values, and the estimated number of rows is obtained by multiplying the density by the number of records in the table, which in this case is 0.000896861 * 31184, or 27.967713424, as shown in the plan. Also notice that the density information is only used for values not covered by the histogram (you can see the density information using the same DBCC SHOW_STATISTICS statement, but in another session where trace flag 2388 is not enabled).
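For example, in a session where trace flag 2388 is not enabled you can list the density vector directly and reproduce the arithmetic from the previous paragraph:

-- show only the density vector of the statistics object
DBCC SHOW_STATISTICS ('dbo.SalesOrderHeader', 'IX_OrderDate') WITH DENSITY_VECTOR
-- density * number of rows in the table
SELECT 0.000896861 * 31184 AS estimated_rows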

In addition, if we look for data that does not exist (for example, July 25th, which we have not loaded into our table), we still get the one row estimate, which is always adequate since the query will return 0 records.

SELECT * FROM dbo.SalesOrderHeader WHERE OrderDate = '2008-07-25 00:00:00.000'

Notice that branding a column ascending requires statistics to increase three times in a row. If later we insert older data, breaking the ascending sequence, the column ‘Leading column Type’ will show ‘Stationary’ and the query processor will be back to the original cardinality estimate behavior. Three new additional updates in a row with increasing values can brand it as Ascending again.

Finally, at this moment I am not able to verify whether these trace flags are “officially” documented and supported by Microsoft. Interestingly, they are documented in the article FIX: You may notice a large increase in compile time when you enable trace flags 2389 and 2390 in SQL Server 2005 Service Pack 1 (look at the ‘More Information’ section). Another fix is documented here. In any case, talk to Microsoft and obviously test your application carefully if you think that these trace flags can improve its performance.

DBCC SHOW_STATISTICS Works With SELECT Permission

I’ve been wanting to blog about several topics in the last few months and now that I’ve finished a couple of projects, including working on the SQL Server 2012 Internals book, I think I will have some time to do just that. The first of these topics takes me back to the first day of the PASS Summit 2012 keynote, where Ted Kummert, Microsoft Corporate Vice President, Data Platform group, made several announcements, including the release of SQL Server 2012 Service Pack 1. After looking at this service pack’s documentation I noticed that one of its enhancements is that the DBCC SHOW_STATISTICS statement now works with SELECT permission. This is a fix that was requested a while ago, related to a problem with statistics and linked servers, something I blogged about previously in my post Optimizer Statistics on Linked Servers. In summary, the problem was that the query processor was not able to get the required optimizer statistics from a remote SQL Server instance due to the permissions of the login used by the linked server. Prior to SQL Server 2012 Service Pack 1, in order to obtain all available statistics, the user must own the table or be a member of the sysadmin fixed server role, the db_owner fixed database role, or the db_ddladmin fixed database role on the linked server.
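In other words, before Service Pack 1 a common workaround was to grant the login used by the linked server one of those elevated roles on the remote database, while with Service Pack 1 plain SELECT access is enough; a quick sketch, using a hypothetical login name:

-- before SP1: elevated role membership just to expose the statistics (hypothetical login name)
EXEC sp_addrolemember N'db_ddladmin', N'linked_server_login'
-- with SP1: SELECT permission, for example via db_datareader, is sufficient
EXEC sp_addrolemember N'db_datareader', N'linked_server_login'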

So let us see how it works, using an example similar to what I did on my previous post.

SELECT l.SalesOrderID, l.CustomerID 
FROM AdventureWorks2012.Sales.SalesOrderHeader l 
JOIN [remote].AdventureWorks2012.Sales.SalesOrderHeader r 
ON l.SalesOrderID = r.SalesOrderID 
WHERE r.CustomerID = 11000

My test configuration uses a linked server relying on a login with read only permissions on the data (for example, db_datareader). Running the previous query against a remote SQL Server instance without Service Pack 1 installed returns a bad cardinality estimate and the following plan using a hash join. Since the local query processor does not have access to the statistics on the remote server it has to rely on a guess, in this case estimating 2,362.49 records. You can notice a big difference between the actual and estimated number of rows.

[Figure: hash join plan with an estimate of 2,362.49 rows]

After applying Service Pack 1 on the remote instance I can run the same query and this time we get a better cardinality estimate and the query optimizer is able to make a better decision. In this case we get a new plan using a nested loops join, which is more appropriate for a small number of records.

[Figure: nested loops join plan with an accurate estimate]

Although the new behavior is enabled by default after Service Pack 1 is installed, you also have the choice of disabling it by using trace flag 9485, which you can use in case of regressions in scenarios where the performance of some queries may be getting worse. Enabling trace flag 9485 reverts the new permission check to the original behavior.

Run the following statement to enable trace flag 9485.

DBCC TRACEON (9485, -1)

Running the query again will produce the original plan with the hash join and the 2,362.49 cardinality estimate. You may also need to use a statement to clear the current plan or force a new optimization during your testing, for example, DBCC FREEPROCCACHE. Apart from the change in permissions mentioned in this post, the behavior of using statistics with linked servers is the same as the one described in my previous post, so you may refer to it to understand how it works. Finally, Microsoft has not yet mentioned whether this fix will be implemented in SQL Server 2008 R2 or other previously supported releases.
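Once you finish testing, you can go back to the new behavior by disabling the trace flag and clearing the plan cache (keep in mind that DBCC FREEPROCCACHE clears the entire cache, so be careful on a busy server):

DBCC TRACEOFF (9485, -1)
DBCC FREEPROCCACHE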

More Undocumented Query Optimizer Trace Flags

 


This is my second post discussing some undocumented query optimizer trace flags (you can read the first one, “Inside the Query Optimizer Memo Structure”, here). Although it is not the purpose of this post to discuss the entire optimization process or go into detail on any of the optimization phases, you can read more about it in my book Inside the SQL Server Query Optimizer, which you can download for free from the simple-talk website. And same as before, please bear in mind that all these trace flags are undocumented and unsupported, and should not be used in a production environment. You can use them as a way to explore and understand how the query optimizer works.

As mentioned in my previous post, you will first have to enable the trace flag 3604 to redirect the trace output to the client executing the command, in this case to the Messages tab in SQL Server Management Studio.

DBCC TRACEON(3604)

I’ll start with three trace flags which display logical and physical trees used during the optimization process. First, trace flag 8605 will display the query’s initial tree representation created by SQL Server. Test it by running

SELECT e.EmployeeID FROM HumanResources.Employee AS e
INNER JOIN Sales.SalesPerson AS s ON e.EmployeeID = s.SalesPersonID
OPTION (RECOMPILE, QUERYTRACEON 8605)

It will show the following output

*** Converted Tree: ***
    LogOp_Project QCOL: [e].EmployeeID
        LogOp_Join
            LogOp_Get TBL: HumanResources.Employee(alias TBL: e) HumanResources.Employee TableID=901578250 TableReferenceID=0 IsRow: COL: IsBaseRow1001 
            LogOp_Get TBL: Sales.SalesPerson(alias TBL: s) Sales.SalesPerson TableID=1042102753 TableReferenceID=0 IsRow: COL: IsBaseRow1003 
            ScaOp_Comp x_cmpEq
                ScaOp_Identifier QCOL: [e].EmployeeID
                ScaOp_Identifier QCOL: [s].SalesPersonID
        AncOp_PrjList 

Trace flag 8606 will display additional logical trees used during the optimization process. Run

SELECT e.EmployeeID FROM HumanResources.Employee AS e
INNER JOIN Sales.SalesPerson AS s ON e.EmployeeID = s.SalesPersonID
OPTION (RECOMPILE, QUERYTRACEON 8606)

The output shows several different logical trees: input tree, simplified tree, join-collapsed tree, tree before project normalization, and tree after project normalization. These trees will include logical operators only. Part of the output is shown next.

*** Input Tree: ***
    LogOp_Project QCOL: [e].EmployeeID
        LogOp_Select
            LogOp_Join
                LogOp_Get TBL: HumanResources.Employee(alias TBL: e) HumanResources.Employee TableID=901578250 TableReferenceID=0 IsRow: COL: IsBaseRow1001 
                LogOp_Get TBL: Sales.SalesPerson(alias TBL: s) Sales.SalesPerson TableID=1042102753 TableReferenceID=0 IsRow: COL: IsBaseRow1003 
                ScaOp_Const TI(bit,ML=1) XVAR(bit,Not Owned,Value=1)
            ScaOp_Comp x_cmpEq
                ScaOp_Identifier QCOL: [e].EmployeeID
                ScaOp_Identifier QCOL: [s].SalesPersonID
        AncOp_PrjList 
*******************

*** Simplified Tree: ***
    LogOp_Join
        LogOp_Get TBL: HumanResources.Employee(alias TBL: e) HumanResources.Employee TableID=901578250 TableReferenceID=0 IsRow: COL: IsBaseRow1001 
        LogOp_Get TBL: Sales.SalesPerson(alias TBL: s) Sales.SalesPerson TableID=1042102753 TableReferenceID=0 IsRow: COL: IsBaseRow1003 
        ScaOp_Comp x_cmpEq
            ScaOp_Identifier QCOL: [s].SalesPersonID
            ScaOp_Identifier QCOL: [e].EmployeeID
*******************
*** Join-collapsed Tree: ***
    LogOp_Join
        LogOp_Get TBL: HumanResources.Employee(alias TBL: e) HumanResources.Employee TableID=901578250 TableReferenceID=0 IsRow: COL: IsBaseRow1001 
        LogOp_Get TBL: Sales.SalesPerson(alias TBL: s) Sales.SalesPerson TableID=1042102753 TableReferenceID=0 IsRow: COL: IsBaseRow1003 
        ScaOp_Comp x_cmpEq
            ScaOp_Identifier QCOL: [s].SalesPersonID
            ScaOp_Identifier QCOL: [e].EmployeeID
*******************
*** Tree Before Project Normalization ***
    LogOp_Join
        LogOp_Get TBL: HumanResources.Employee(alias TBL: e) HumanResources.Employee TableID=901578250 TableReferenceID=0 IsRow: COL: IsBaseRow1001 
        LogOp_Get TBL: Sales.SalesPerson(alias TBL: s) Sales.SalesPerson TableID=1042102753 TableReferenceID=0 IsRow: COL: IsBaseRow1003 
        ScaOp_Comp x_cmpEq
            ScaOp_Identifier QCOL: [s].SalesPersonID
            ScaOp_Identifier QCOL: [e].EmployeeID
*****************************************
*** Tree After Project Normalization ***
    LogOp_Join
        LogOp_Get TBL: HumanResources.Employee(alias TBL: e) HumanResources.Employee TableID=901578250 TableReferenceID=0 IsRow: COL: IsBaseRow1001 
        LogOp_Get TBL: Sales.SalesPerson(alias TBL: s) Sales.SalesPerson TableID=1042102753 TableReferenceID=0 IsRow: COL: IsBaseRow1003 
        ScaOp_Comp x_cmpEq
            ScaOp_Identifier QCOL: [s].SalesPersonID
            ScaOp_Identifier QCOL: [e].EmployeeID
****************************************
*** Stop search, level 1 ***

One interesting example is seeing how a tree is simplified when the query optimizer can detect a contradiction during the simplification phase. The purpose of the simplification stage is to reduce the query tree into a simpler form in order to make the optimization process easier. Contradiction detection is one of several possible simplifications. Following an example from my book and this blog post, run the following query

SELECT * FROM HumanResources.Employee
WHERE VacationHours > 300
OPTION (RECOMPILE, QUERYTRACEON 8606)

Part of the output is next

*** Input Tree: ***
        LogOp_Project QCOL: [AdventureWorks].[HumanResources].[Employee].EmployeeID QCOL: [AdventureWorks].[HumanResources].[Employee].NationalIDNumber QCOL: [AdventureWorks].[HumanResources].[Employee].ContactID QCOL: [AdventureWorks].[HumanResources].[Employee].LoginID QCOL: [AdventureWorks].[HumanResources].[Employee].ManagerID QCOL: [AdventureWorks].[HumanResources].[Employee].Title QCOL: [AdventureWorks].[HumanResources].[Employee].BirthDate QCOL: [AdventureWorks].[HumanResources].[Employee].MaritalStatus QCOL: [AdventureWorks].[HumanResources].[Employee].Gender QCOL: [AdventureWorks].[HumanResources].[Employee].HireDate QCOL: [AdventureWorks].[HumanResources].[Employee].SalariedFlag QCOL: [AdventureWorks].[HumanResources].[Employee].VacationHours QCOL: [AdventureWorks].[HumanResources].[Employee].SickLeaveHours QCOL: [AdventureWorks].[HumanResources].[Employee].CurrentFlag QCOL: [AdventureWorks].[HumanResources].[Employee].rowguid QCOL: [AdventureWorks].[HumanResources].[Employee].ModifiedDate
            LogOp_Select
                LogOp_Get TBL: HumanResources.Employee HumanResources.Employee TableID=901578250 TableReferenceID=0 IsRow: COL: IsBaseRow1001 
                ScaOp_Comp x_cmpGt
                    ScaOp_Identifier QCOL: [AdventureWorks].[HumanResources].[Employee].VacationHours
                    ScaOp_Const TI(smallint,ML=2) XVAR(smallint,Not Owned,Value=300)
            AncOp_PrjList 
*******************
*** Simplified Tree: ***
        LogOp_ConstTableGet (0) COL: Chk1000  COL: IsBaseRow1001  QCOL: [AdventureWorks].[HumanResources].[Employee].EmployeeID QCOL: [AdventureWorks].[HumanResources].[Employee].NationalIDNumber QCOL: [AdventureWorks].[HumanResources].[Employee].ContactID QCOL: [AdventureWorks].[HumanResources].[Employee].LoginID QCOL: [AdventureWorks].[HumanResources].[Employee].ManagerID QCOL: [AdventureWorks].[HumanResources].[Employee].Title QCOL: [AdventureWorks].[HumanResources].[Employee].BirthDate QCOL: [AdventureWorks].[HumanResources].[Employee].MaritalStatus QCOL: [AdventureWorks].[HumanResources].[Employee].Gender QCOL: [AdventureWorks].[HumanResources].[Employee].HireDate QCOL: [AdventureWorks].[HumanResources].[Employee].SalariedFlag QCOL: [AdventureWorks].[HumanResources].[Employee].VacationHours QCOL: [AdventureWorks].[HumanResources].[Employee].SickLeaveHours QCOL: [AdventureWorks].[HumanResources].[Employee].CurrentFlag QCOL: [AdventureWorks].[HumanResources].[Employee].rowguid QCOL: [AdventureWorks].[HumanResources].[Employee].ModifiedDate
*******************
*** Join-collapsed Tree: ***
        LogOp_ConstTableGet (0) COL: Chk1000  COL: IsBaseRow1001  QCOL: [AdventureWorks].[HumanResources].[Employee].EmployeeID QCOL: [AdventureWorks].[HumanResources].[Employee].NationalIDNumber QCOL: [AdventureWorks].[HumanResources].[Employee].ContactID QCOL: [AdventureWorks].[HumanResources].[Employee].LoginID QCOL: [AdventureWorks].[HumanResources].[Employee].ManagerID QCOL: [AdventureWorks].[HumanResources].[Employee].Title QCOL: [AdventureWorks].[HumanResources].[Employee].BirthDate QCOL: [AdventureWorks].[HumanResources].[Employee].MaritalStatus QCOL: [AdventureWorks].[HumanResources].[Employee].Gender QCOL: [AdventureWorks].[HumanResources].[Employee].HireDate QCOL: [AdventureWorks].[HumanResources].[Employee].SalariedFlag QCOL: [AdventureWorks].[HumanResources].[Employee].VacationHours QCOL: [AdventureWorks].[HumanResources].[Employee].SickLeaveHours QCOL: [AdventureWorks].[HumanResources].[Employee].CurrentFlag QCOL: [AdventureWorks].[HumanResources].[Employee].rowguid QCOL: [AdventureWorks].[HumanResources].[Employee].ModifiedDate
*******************
*** Tree Before Project Normalization ***
        LogOp_ConstTableGet (0) COL: Chk1000  COL: IsBaseRow1001  QCOL: [AdventureWorks].[HumanResources].[Employee].EmployeeID QCOL: [AdventureWorks].[HumanResources].[Employee].NationalIDNumber QCOL: [AdventureWorks].[HumanResources].[Employee].ContactID QCOL: [AdventureWorks].[HumanResources].[Employee].LoginID QCOL: [AdventureWorks].[HumanResources].[Employee].ManagerID QCOL: [AdventureWorks].[HumanResources].[Employee].Title QCOL: [AdventureWorks].[HumanResources].[Employee].BirthDate QCOL: [AdventureWorks].[HumanResources].[Employee].MaritalStatus QCOL: [AdventureWorks].[HumanResources].[Employee].Gender QCOL: [AdventureWorks].[HumanResources].[Employee].HireDate QCOL: [AdventureWorks].[HumanResources].[Employee].SalariedFlag QCOL: [AdventureWorks].[HumanResources].[Employee].VacationHours QCOL: [AdventureWorks].[HumanResources].[Employee].SickLeaveHours QCOL: [AdventureWorks].[HumanResources].[Employee].CurrentFlag QCOL: [AdventureWorks].[HumanResources].[Employee].rowguid QCOL: [AdventureWorks].[HumanResources].[Employee].ModifiedDate
*****************************************
*** Tree After Project Normalization ***
        LogOp_ConstTableGet (0) COL: Chk1000  COL: IsBaseRow1001  QCOL: [AdventureWorks].[HumanResources].[Employee].EmployeeID QCOL: [AdventureWorks].[HumanResources].[Employee].NationalIDNumber QCOL: [AdventureWorks].[HumanResources].[Employee].ContactID QCOL: [AdventureWorks].[HumanResources].[Employee].LoginID QCOL: [AdventureWorks].[HumanResources].[Employee].ManagerID QCOL: [AdventureWorks].[HumanResources].[Employee].Title QCOL: [AdventureWorks].[HumanResources].[Employee].BirthDate QCOL: [AdventureWorks].[HumanResources].[Employee].MaritalStatus QCOL: [AdventureWorks].[HumanResources].[Employee].Gender QCOL: [AdventureWorks].[HumanResources].[Employee].HireDate QCOL: [AdventureWorks].[HumanResources].[Employee].SalariedFlag QCOL: [AdventureWorks].[HumanResources].[Employee].VacationHours QCOL: [AdventureWorks].[HumanResources].[Employee].SickLeaveHours QCOL: [AdventureWorks].[HumanResources].[Employee].CurrentFlag QCOL: [AdventureWorks].[HumanResources].[Employee].rowguid QCOL: [AdventureWorks].[HumanResources].[Employee].ModifiedDate

In this case the query optimizer makes use of an existing check constraint to conclude that no records qualify for the predicate VacationHours > 300, replacing the entire tree with a LogOp_ConstTableGet logical operator. If you try displaying an output tree, as discussed next, you will get a PhyOp_ConstTableScan physical operator, and you will get a constant scan operator on the final execution plan. You can see a different behavior if you try the same query with a predicate like VacationHours > 20.
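If you want to see the constraint the optimizer is relying on here, you can query the catalog; the VacationHours check constraint on HumanResources.Employee should show up in the output:

-- list the check constraints defined on HumanResources.Employee
SELECT name, definition
FROM sys.check_constraints
WHERE parent_object_id = OBJECT_ID('HumanResources.Employee')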

Trace flag 8607 shows the optimization output tree. Try the following statement

SELECT e.EmployeeID FROM HumanResources.Employee AS e
INNER JOIN Sales.SalesPerson AS s ON e.EmployeeID = s.SalesPersonID
OPTION (RECOMPILE, QUERYTRACEON 8607)

Notice that this time we have physical operators and the output tree is closer to the final execution plan.

****************************************
*** Output Tree: ***
        PhyOp_Apply lookup TBL: HumanResources.Employee (0) (x_jtInner)
            PhyOp_Range TBL: Sales.SalesPerson(alias TBL: s)(2) ASC  Bmk ( QCOL: [s].SalesPersonID) IsRow: COL: IsBaseRow1003 
            PhyOp_Range TBL: HumanResources.Employee(alias TBL: e)(1) ASC  Bmk ( QCOL: [e].EmployeeID) IsRow: COL: IsBaseRow1001 
                ScaOp_Comp x_cmpEq
                    ScaOp_Identifier QCOL: [s].SalesPersonID
                    ScaOp_Identifier QCOL: [e].EmployeeID
********************
** Query marked as Cachable
********************

Trace flag 8675 shows the query optimization phases for a specific optimization along with some other information like cost estimation, tasks, etc. You may want to test it with complex queries to see different optimization phases like in the following example

SELECT I.CustomerID, C.FirstName, C.LastName, A.AddressLine1, A.City,
SP.Name AS State, CR.Name AS CountryRegion
FROM Person.Contact AS C
JOIN Sales.Individual AS I ON C.ContactID = I.ContactID
JOIN Sales.CustomerAddress AS CA ON CA.CustomerID = I.CustomerID
JOIN Person.Address AS A ON A.AddressID = CA.AddressID
JOIN Person.StateProvince SP ON
SP.StateProvinceID = A.StateProvinceID
JOIN Person.CountryRegion CR ON
CR.CountryRegionCode = SP.CountryRegionCode
ORDER BY I.CustomerID
OPTION (RECOMPILE, QUERYTRACEON 8675)

It shows the following output

End of simplification, time: 0.003 net: 0.003 total: 0.003 net: 0.003
end exploration, tasks: 275 no total cost time: 0.005 net: 0.005 total: 0.009 net: 0.009
end exploration, tasks: 642 no total cost time: 0.003 net: 0.003 total: 0.012 net: 0.012
end search(0),  cost: 9.04 tasks: 681 time: 0 net: 0 total: 0.013 net: 0.013
end exploration, tasks: 1465 Cost = 9.04 time: 0.004 net: 0.004 total: 0.017 net: 0.017
end exploration, tasks: 2518 Cost = 9.04 time: 0.007 net: 0.007 total: 0.024 net: 0.024
end search(1),  cost: 5.86879 tasks: 2609 time: 0 net: 0 total: 0.025 net: 0.025
end exploration, tasks: 2610 Cost = 5.86879 time: 0 net: 0 total: 0.025 net: 0.025
end exploration, tasks: 5170 Cost = 5.86879 time: 0.021 net: 0.021 total: 0.047 net: 0.047
end search(1),  cost: 5.86248 tasks: 5469 time: 0.001 net: 0.001 total: 0.048 net: 0.048
end exploration, tasks: 5756 Cost = 5.86248 time: 0.001 net: 0.001 total: 0.05 net: 0.05
end exploration, tasks: 6434 Cost = 5.86248 time: 0.005 net: 0.005 total: 0.055 net: 0.055
end search(2),  cost: 5.84575 tasks: 7092 time: 0.004 net: 0.004 total: 0.059 net: 0.059
End of post optimization rewrite, time: 0 net: 0 total: 0.059 net: 0.059
End of query plan compilation, time: 0 net: 0 total: 0.06 net: 0.06

Note the optimization phases search(0), search(1) and search(2). Also, as mentioned in my book, you need at least three tables to qualify for search 0, so the following query will go directly to search 1 (you can also test this with trace flag 2372 as described later):

SELECT * FROM HumanResources.Employee
WHERE ManagerID = 12
OPTION (RECOMPILE, QUERYTRACEON 8675)

Trace flags 2372 and 2373 show memory utilization during the optimization process but we can also use them to gain information about the activities the query optimizer is performing for a particular query. Trace flag 2372 shows memory utilization during the different optimization stages. Same as before, more complicated queries will show more optimization stages.

SELECT I.CustomerID, C.FirstName, C.LastName, A.AddressLine1, A.City,
SP.Name AS State, CR.Name AS CountryRegion
FROM Person.Contact AS C
JOIN Sales.Individual AS I ON C.ContactID = I.ContactID
JOIN Sales.CustomerAddress AS CA ON CA.CustomerID = I.CustomerID
JOIN Person.Address AS A ON A.AddressID = CA.AddressID
JOIN Person.StateProvince SP ON
SP.StateProvinceID = A.StateProvinceID
JOIN Person.CountryRegion CR ON
CR.CountryRegionCode = SP.CountryRegionCode
ORDER BY I.CustomerID
OPTION (RECOMPILE, QUERYTRACEON 2372)

The output is next

Memory before NNFConvert: 13
Memory after NNFConvert: 14
Memory before project removal: 15
Memory after project removal: 15
Memory before simplification: 15
Memory after simplification: 35
Memory before heuristic join reordering: 35
Memory after heuristic join reordering: 46
Memory before project normalization: 46
Memory after project normalization: 46
Memory before stage TP: 46
Memory after stage TP: 81
Memory before stage QuickPlan: 81
Memory after stage QuickPlan: 144
Memory before stage Full: 144
Memory after stage Full: 156
Memory before copy out: 156
Memory after copy out: 157

Notice that “stage TP” or transaction processing phase is the same as search(0) shown before with trace flag 8675. In the same way, “stage QuickPlan” is the search(1) and “stage Full” is search(2).
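You can also see these phases reflected, at the instance level, in the cumulative counters exposed by sys.dm_exec_query_optimizer_info, for example:

-- cumulative optimizer counters since the instance started
SELECT counter, occurrence, value
FROM sys.dm_exec_query_optimizer_info
WHERE counter IN ('trivial plan', 'search 0', 'search 1', 'search 2')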

Trace flag 2373 shows memory utilization while applying optimization rules and deriving properties.

SELECT e.EmployeeID FROM HumanResources.Employee AS e
INNER JOIN Sales.SalesPerson AS s ON e.EmployeeID = s.SalesPersonID
OPTION (RECOMPILE, QUERYTRACEON 2373)

Part of the output is next

Memory before rule IJtoIJSEL: 14
Memory after rule IJtoIJSEL: 14
Memory before rule MatchGet: 14
Memory after rule MatchGet: 14
Memory before rule MatchGet: 14
Memory after rule MatchGet: 14
Memory before rule JoinToIndexOnTheFly: 14
Memory after rule JoinToIndexOnTheFly: 14
Memory before rule JoinCommute: 14
Memory after rule JoinCommute: 14
Memory before rule JoinToIndexOnTheFly: 14
Memory after rule JoinToIndexOnTheFly: 14
Memory before rule JNtoIdxLookup: 14

Finally, trace flag 8757 can be used to skip the trivial plan optimization and basically force a full optimization. As a reminder, the trivial plan optimization is used for very simple queries that don’t require any cost estimation decision. For example, the following query will produce a trivial plan, which you can verify by looking at the optimization level or StatementOptmLevel property in your execution plan

SELECT * FROM dbo.DatabaseLog

By applying trace flag 8757, the trivial plan optimization will be skipped as you can verify by running the following query and once again verifying the optimization level property.

SELECT * FROM dbo.DatabaseLog
OPTION (QUERYTRACEON 8757)