
Query Optimizer

Code from my book Inside the SQL Server Query Optimizer

Recently I’ve been asked for the code from my book Inside the SQL Server Query Optimizer, so I am including it in this post. The book contains a large number of example SQL queries, all of which are based on the AdventureWorks database; Chapter 6 additionally uses the AdventureWorksDW database. All code has been tested on both SQL Server 2008 and SQL Server 2008 R2. Note that these sample databases are not included by default in your SQL Server installation, but can be downloaded from the CodePlex website.

Inside the SQL Server Query Optimizer code – InsideQueryOptimizerCode.txt

Query Optimization with Denali Columnstore Indexes

In a previous post I talked about the new columnstore indexes and their related processing algorithms, which are available in SQL Server code-named Denali. In this post I will cover the query processing part of this technology in more detail and show you some examples that you can test on the recently released CTP3 (Community Technology Preview 3) of the product.

As in previous versions of SQL Server, in Denali the query optimizer can choose between the available access methods, which now also include columnstore indexes, and as always, this will be a cost-based decision. A new choice the query optimizer will have to make is the selection of an execution mode. The new query processing algorithms mentioned in my previous post will run in what is called batch execution mode, which is different from the traditional processing mode, now called row mode.

In row execution mode, operators process data one row at a time. The new batch execution mode processes data in batches, which is more efficient for large amounts of data, such as the workloads seen in data warehouse queries. Each operator in an execution plan can use row execution mode and, when columnstore indexes are available, some operators can also use batch mode. There is both an estimated and an actual execution mode, and this information is displayed in the query execution plan, as I will show later. It is also worth mentioning that, although columnstore indexes can speed up the performance of data warehouse queries, they are not a good choice for very selective queries returning only a few records. In that case the query optimizer may have to rely on row stores, like clustered or regular nonclustered indexes, to find those records quickly; there are no seeks on columnstore indexes.

As in previous versions of SQL Server, you still have the choice of using a hint to force any index in cases where the query optimizer is not giving you a good execution plan. This can happen, for example, when the query optimizer chooses a columnstore index when it shouldn’t, or when you want to force a columnstore index that is not being selected. You can also use the new IGNORE_NONCLUSTERED_COLUMNSTORE_INDEX hint to ask the query optimizer to avoid using any columnstore index.
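For example, a minimal sketch of forcing the columnstore index with a table hint, reusing the FactResellerSalesPtnd table and csindx_FactResellerSalesPtnd index created later in this post:

-- Force the columnstore index with a table hint
-- (table and index names are taken from the example later in this post)
SELECT SalesTerritoryKey, SUM(ExtendedAmount) AS SalesByTerritory
FROM FactResellerSalesPtnd WITH (INDEX (csindx_FactResellerSalesPtnd))
GROUP BY SalesTerritoryKey;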

Let me show you an example, which you can test on SQL Server Denali CTP3, currently available for download here. To follow this example you will also need the AdventureWorksDWDenali database, available at CodePlex. I will use the same example as in BOL so we can skip the basics and go directly to analyzing the batch processing mode. (By the way, the BOL example didn’t work directly with the AdventureWorksDWDenali database, so I had to add a few more columns at the end of the CREATE TABLE statement.)

First, use the following BOL code to create a partition function, a partition scheme, and a new partitioned table with a columnstore index:

USE AdventureWorksDWDenali;
GO

CREATE PARTITION FUNCTION [ByOrderDateMonthPF](int) AS RANGE RIGHT 
FOR VALUES (
    20050701, 20050801, 20050901, 20051001, 20051101, 20051201, 
    20060101, 20060201, 20060301, 20060401, 20060501, 20060601, 
    20060701, 20060801, 20060901, 20061001, 20061101, 20061201, 
    20070101, 20070201, 20070301, 20070401, 20070501, 20070601, 
    20070701, 20070801, 20070901, 20071001, 20071101, 20071201, 
    20080101, 20080201, 20080301, 20080401, 20080501, 20080601, 
    20080701, 20080801, 20080901, 20081001, 20081101, 20081201
) 
GO

CREATE PARTITION SCHEME [ByOrderDateMonthRange] 
AS PARTITION [ByOrderDateMonthPF] 
ALL TO ([PRIMARY]) 
GO

-- Create a partitioned version of the FactResellerSales table
CREATE TABLE [dbo].[FactResellerSalesPtnd]( 
    [ProductKey] [int] NOT NULL, 
    [OrderDateKey] [int] NOT NULL, 
    [DueDateKey] [int] NOT NULL, 
    [ShipDateKey] [int] NOT NULL, 
    [ResellerKey] [int] NOT NULL, 
    [EmployeeKey] [int] NOT NULL, 
    [PromotionKey] [int] NOT NULL, 
    [CurrencyKey] [int] NOT NULL, 
    [SalesTerritoryKey] [int] NOT NULL, 
    [SalesOrderNumber] [nvarchar](20) NOT NULL, 
    [SalesOrderLineNumber] [tinyint] NOT NULL, 
    [RevisionNumber] [tinyint] NULL, 
    [OrderQuantity] [smallint] NULL, 
    [UnitPrice] [money] NULL, 
    [ExtendedAmount] [money] NULL, 
    [UnitPriceDiscountPct] [float] NULL, 
    [DiscountAmount] [float] NULL, 
    [ProductStandardCost] [money] NULL, 
    [TotalProductCost] [money] NULL, 
    [SalesAmount] [money] NULL, 
    [TaxAmt] [money] NULL, 
    [Freight] [money] NULL, 
    [CarrierTrackingNumber] [nvarchar](25) NULL, 
    [CustomerPONumber] [nvarchar](25) NULL,
    [OrderDate] datetime NULL,
    [DueDate] datetime NULL,
    [ShipDate] datetime NULL
) ON ByOrderDateMonthRange(OrderDateKey);
GO

-- Copy the data from the FactResellerSales into the new table
INSERT INTO dbo.FactResellerSalesPtnd WITH(TABLOCK)
SELECT * FROM dbo.FactResellerSales;
GO

-- Create the columnstore index
CREATE NONCLUSTERED COLUMNSTORE INDEX [csindx_FactResellerSalesPtnd]
ON [FactResellerSalesPtnd]
( 
    [ProductKey], 
    [OrderDateKey], 
    [DueDateKey], 
    [ShipDateKey], 
    [ResellerKey], 
    [EmployeeKey], 
    [PromotionKey], 
    [CurrencyKey], 
    [SalesTerritoryKey], 
    [SalesOrderNumber], 
    [SalesOrderLineNumber], 
    [RevisionNumber], 
    [OrderQuantity], 
    [UnitPrice], 
    [ExtendedAmount], 
    [UnitPriceDiscountPct], 
    [DiscountAmount], 
    [ProductStandardCost], 
    [TotalProductCost], 
    [SalesAmount], 
    [TaxAmt], 
    [Freight], 
    [CarrierTrackingNumber], 
    [CustomerPONumber] 
);

Now run the following query

SELECT SalesTerritoryKey, SUM(ExtendedAmount) AS SalesByTerritory
FROM FactResellerSalesPtnd
GROUP BY SalesTerritoryKey;

This will create the following plan, where you can see the new Columnstore Index Scan operator:

[Figure: execution plan showing the new Columnstore Index Scan operator]

The properties of the Columnstore Index Scan operator are shown next

[Figure: properties of the Columnstore Index Scan operator]

You may notice that both the actual and the estimated execution mode is Row (lines 3 and 4 in the list of properties). Row execution mode was selected because the table is not large enough to require batch execution mode. We can use the undocumented ROWCOUNT and PAGECOUNT options of the UPDATE STATISTICS statement to simulate a larger table, as shown next (for more information about how this works, see my post about the DTA here):

UPDATE STATISTICS FactResellerSalesPtnd WITH ROWCOUNT = 10000000, PAGECOUNT = 1000000

Removing the existing plan (using, for example, DBCC FREEPROCCACHE, as sketched next) and running the same query again will now show the following plan (only part of which is shown), which this time is using parallelism.
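A minimal sketch of both steps (note that DBCC FREEPROCCACHE clears the entire plan cache, so only use it on a test system):

-- Clear the plan cache (test systems only) and run the same query again
DBCC FREEPROCCACHE;
GO
SELECT SalesTerritoryKey, SUM(ExtendedAmount) AS SalesByTerritory
FROM FactResellerSalesPtnd
GROUP BY SalesTerritoryKey;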

[Figure: partial execution plan showing parallelism]

In addition, by looking at the properties of the Columnstore Index Scan, you can see that this time it is using the batch execution mode:

[Figure: Columnstore Index Scan properties showing Batch execution mode]

You can also use the new IGNORE_NONCLUSTERED_COLUMNSTORE_INDEX hint to disallow the use of a columnstore index. Run the following code

SELECT SalesTerritoryKey, SUM(ExtendedAmount) AS SalesByTerritory
FROM FactResellerSalesPtnd
GROUP BY SalesTerritoryKey
OPTION (IGNORE_NONCLUSTERED_COLUMNSTORE_INDEX);

This will show you the following plan which, as you can see, is now directly accessing the FactResellerSalesPtnd table without using the columnstore index.

[Figure: execution plan accessing the FactResellerSalesPtnd table without the columnstore index]

Finally, since the number of records and pages of the FactResellerSalesPtnd table was altered for this test, you may want to drop the table and create a new copy if you need to do some additional testing:

DROP TABLE FactResellerSalesPtnd

Speaking at the PASS Summit 2011

I am honored to be speaking at the PASS Summit again this year. I’ve been attending this SQL Server conference every year since 2003, and this will be my fourth year speaking. As last year, I will again be presenting two sessions.

In my first session, Inside the SQL Server Query Optimizer, I will go into the internals of the Query Optimizer and show you the steps it performs in the background, covering everything from the moment a query is submitted to SQL Server until an execution plan is generated. In my second session, Parameter Sniffing: the Query Optimizer vs. the Plan Cache, I will show you how the Query Optimizer uses parameter sniffing to produce a plan tailored to the current parameters of a query, why in some cases this can be a performance problem, and how to troubleshoot and solve those cases.

The PASS Summit is less than four months away and you can register here. I look forward to meeting lots of SQL Server professionals, including those whom I only know via Twitter. See you in Seattle in October.

Statistics on Computed Columns

Another interesting topic that I usually talk about in my presentations is statistics on computed columns, so I will use this post to show you how they work and how they can help improve the performance of your queries.

A problem faced by some queries using scalar expressions is that they usually cannot benefit from statistics and, without them, the Query Optimizer will use the 30% selectivity guess on inequality comparisons. A solution to this problem can be the use of computed columns, as SQL Server can automatically create and update statistics on these columns which can help the Query Optimizer to create better execution plans. An additional benefit of this solution is that you don’t need to specify the name of the computed column in your queries for SQL Server to use its statistics. The Query Optimizer automatically matches the computed column definition to an existing scalar expression in a query, so your applications do not need to be changed. Although computed columns have been available in previous versions of SQL Server, the automatic matching feature was only introduced with SQL Server 2005.

To see an example, run this query, which creates the plan shown next:

SELECT * FROM Sales.SalesOrderDetail
WHERE OrderQty * UnitPrice > 25000

[Figure: execution plan showing an estimate of 36,395.1 rows]

The estimated number of rows is 36,395.1, which is 30% of the total number of rows, 121,317, although the query returns only 5 records. SQL Server is obviously using a selectivity guess, as it cannot estimate the selectivity of the expression OrderQty * UnitPrice > 25000.

Now create a computed column:

ALTER TABLE Sales.SalesOrderDetail
ADD cc AS OrderQty * UnitPrice

Run the previous SELECT statement again and note that, this time, the estimated number of rows has changed to 84.3101, which is very close to the actual number of rows returned by the query, as shown in the following plan:

[Figure: execution plan showing an estimate of 84.3101 rows]

You can optionally test replacing the 25,000 in the query with some other value, like 1,000, 10,000, or 20,000, and verify that the estimate will again be close to the actual number of rows returned.

Note that creating the computed column does not create statistics; these statistics are created the first time that the query is optimized, and you can run the next query to display the information about the statistics objects for the Sales.SalesOrderDetail table:

SELECT * FROM sys.stats
WHERE object_id = object_id('Sales.SalesOrderDetail')

The newly created statistics object will most likely be at the end of the list. Copy its name and use the following command to display the details of the statistics object (I’ve used the name of my local object, but you should replace that as appropriate). You can also use "cc" as the name of the object to get the same results; in both cases, the cc column should be shown in the Columns field of the density section.

DBCC SHOW_STATISTICS ('Sales.SalesOrderDetail', _WA_Sys_0000000C_2645B050)
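Since the auto-generated statistics name will differ on your system, here is a small sketch that finds the name of the statistics object on the cc column, using the sys.stats, sys.stats_columns and sys.columns catalog views:

-- Find the statistics object created on the computed column cc
SELECT s.name
FROM sys.stats AS s
JOIN sys.stats_columns AS sc
    ON s.object_id = sc.object_id AND s.stats_id = sc.stats_id
JOIN sys.columns AS c
    ON sc.object_id = c.object_id AND sc.column_id = c.column_id
WHERE s.object_id = OBJECT_ID('Sales.SalesOrderDetail')
AND c.name = 'cc';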

Unfortunately, for the automatic matching feature to work, the expression must be exactly the same as the computed column definition. So, if I change the query to use UnitPrice * OrderQty instead of OrderQty * UnitPrice, the execution plan will again show an estimate of 30% of the rows, as this query will demonstrate:

SELECT * FROM Sales.SalesOrderDetail
WHERE UnitPrice * OrderQty > 25000

As mentioned, the computed column provides statistics that the Query Optimizer can use to find a better execution plan. In addition, you can create an index on the computed column to provide a better navigational alternative. Create the following index:

CREATE INDEX IX_cc on Sales.SalesOrderDetail(cc)

By running the original SELECT statement again, the Query Optimizer will now choose the newly created index and produce a more efficient plan using an Index Seek/Key Lookup instead of a Clustered Index Scan, as shown next.

[Figure: plan using an Index Seek and Key Lookup]

Finally, drop the index and computed column you’ve just created:

DROP INDEX Sales.SalesOrderDetail.IX_cc
GO
ALTER TABLE Sales.SalesOrderDetail
DROP COLUMN cc

The Query Optimizer and Contradiction Detection

As covered in my book Inside the SQL Server Query Optimizer, contradiction detection is a query rewrite (more precisely, a tree rewrite) performed at the simplification phase of the optimization process, in which query contradictions are detected and removed. Since these parts of the query are never executed, SQL Server saves resources like I/O, locks, memory and CPU, making the query execute faster. For example, the Query Optimizer may know that no records can satisfy a predicate even before touching any page of data. A contradiction may be related to a check constraint, or may be related to the way the query is written. I will show you examples of both cases next.

First, I need to find a table with a check constraint in AdventureWorks and, handily, the Employee table has the following check constraint definition:

([VacationHours]>=(-40) AND [VacationHours]<=(240))
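As a side note, you can retrieve such definitions yourself from the sys.check_constraints catalog view; a minimal sketch:

-- List the check constraints defined on the Employee table
SELECT name, definition
FROM sys.check_constraints
WHERE parent_object_id = OBJECT_ID('HumanResources.Employee');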

This check constraint makes sure that the number of vacation hours is a number between –40 and 240, so when I request

SELECT * FROM HumanResources.Employee
WHERE VacationHours > 80

… SQL Server uses a Clustered Index Scan operator, as shown next

[Figure: plan using a Clustered Index Scan]

However, if I request all of the employees with more than 300 vacation hours then, because of this check constraint, the Query Optimizer will immediately know that no records qualify for the predicate. Run the following code

SELECT * FROM HumanResources.Employee
WHERE VacationHours > 300

As expected, the query returns no records, but this time it shows the following execution plan

[Figure: plan using a Constant Scan operator]

Note that, this time, instead of a Clustered Index Scan, SQL Server is using a Constant Scan operator. Since there is no need to access the table at all, SQL Server saves resources like I/O, locks, memory and CPU, making the query execute faster. Now, let’s see what happens if I disable the check constraint

ALTER TABLE HumanResources.Employee NOCHECK CONSTRAINT CK_Employee_VacationHours

This time, running the last query once again produces a plan with a Clustered Index Scan operator, as the Query Optimizer can no longer use the check constraint to guide its decisions. Don’t forget to enable the constraint again by running the following statement:

ALTER TABLE HumanResources.Employee WITH CHECK CHECK CONSTRAINT
CK_Employee_VacationHours

The second type of contradiction case is when the query itself explicitly contains a contradiction. Take a look at the next query

SELECT * FROM HumanResources.Employee
WHERE ManagerID > 10 AND ManagerID < 5

In this case there is no check constraint involved; both predicates are valid and each would individually return records, but together they contradict each other. As a result, the query returns no records and the plan again shows a Constant Scan operator, similar to the plan shown previously. This may just look like a badly written query, but remember that some predicates may already be included in, for example, view definitions, and the developer of the query may be unaware of them. In our last query, for instance, a view may include the predicate ManagerID > 10, and a developer may query the view using the predicate ManagerID < 5. Since the two predicates contradict each other, a Constant Scan operator will again be used, as sketched next.
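A minimal sketch of that scenario, using a hypothetical view name:

-- Hypothetical view that hides the predicate ManagerID > 10
CREATE VIEW dbo.SeniorStaff
AS
SELECT * FROM HumanResources.Employee
WHERE ManagerID > 10;
GO
-- A developer unaware of the view definition adds a contradicting predicate;
-- the combined predicates can never be true, so a Constant Scan is used again
SELECT * FROM dbo.SeniorStaff
WHERE ManagerID < 5;
GO
DROP VIEW dbo.SeniorStaff;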

Database Engine Tuning Advisor and the Query Optimizer – Part 2

One of the most interesting and perhaps least known features of the Database Engine Tuning Advisor (DTA) is that you can use it with a test server to tune the workload of a production server. As I mentioned in the first part of this post, the DTA relies on the Query Optimizer to make its tuning recommendations, and you can use it to make these optimizer calls against a test server instance without impacting the performance of the production server.

Information Required by the Query Optimizer

To better understand how this works, let us first review what kind of information the Query Optimizer needs to tune a workload. The most important information it needs to perform an optimization is:

1) The database metadata (i.e. table and column definitions, indexes, constraints, etc.)

2) Optimizer statistics (index and column statistics)

3) Table size (number of rows and pages)

4) Available memory and number of processors

The DTA can gather the database metadata and statistics from the production server and use them to create a similar database, with no data, on a different server; this is called a shell database. The DTA can also obtain the available memory and number of processors on the production server, by using the extended stored procedure xp_msver, and use this information for the optimization process. It is important to note that no data is needed for the optimization process. This process is summarized in the following figure, taken from Books Online:

[Figure: DTA tuning a production workload using a test server and a shell database]
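As a side note, you can run xp_msver directly to see the kind of information the DTA gathers; a minimal sketch:

-- xp_msver reports hardware information such as memory and processor count
EXEC master.dbo.xp_msver 'PhysicalMemory';
EXEC master.dbo.xp_msver 'ProcessorCount';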

This process provides the following benefits:

1) There is no need to run an expensive optimization on the production server, which could impact its resource usage; the production server is only used to gather the initial metadata and the required statistics.

2) There is no need to copy the entire database to a test server either, which is especially important for large databases, saving both disk space and the time needed to copy the database.

3) It does not matter if the test server is not as powerful as the production server, because the DTA tuning session will consider the available memory and number of processors of the production server.

Running a Tuning Session

Now I am going to show an example of how to run a tuning session. First of all, the use of a test server is not supported by the DTA graphical user interface, so you must use the dta utility, the command-prompt version of the DTA. Configuring a test server also requires an XML input file containing the dta input information. I am using the following input file for this example:

<?xml version="1.0" encoding="utf-16" ?>
<DTAXML xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
        xmlns="http://schemas.microsoft.com/sqlserver/2004/07/dta">
  <DTAInput>
    <Server>
      <Name>production_instance</Name>
      <Database>
        <Name>AdventureWorks</Name>
      </Database>
    </Server>
    <Workload>
      <File>workload.sql</File>
    </Workload>
    <TuningOptions>
      <TestServer>test_instance</TestServer>
      <FeatureSet>IDX</FeatureSet>
      <Partitioning>NONE</Partitioning>
      <KeepExisting>NONE</KeepExisting>
    </TuningOptions>
  </DTAInput>
</DTAXML>

The Server and Database elements of the XML file include the production SQL Server instance and database. The Workload element includes the definition of a script containing the workload to tune. TuningOptions includes the TestServer subelement which is used to include the name of the test SQL Server instance.

Create the workload.sql file containing a simple query like this

SELECT * FROM AdventureWorks.Sales.SalesOrderDetail
WHERE ProductID = 898

Run the following command

dta -ix input.xml -S production_instance -s session1

A successful execution will show output similar to this

Microsoft (R) SQL Server Database Engine Tuning Advisor command line utility
Version 9.00.5000.00
Copyright (c) Microsoft Corporation. All rights reserved.

Tuning session successfully created. Session ID is 26.

Total time used: 00:00:03
Workload consumed: 100%, Estimated improvement: 96%

Tuning process finished.

This example creates an entire copy of AdventureWorks (with no data) and performs the requested optimization. The shell database is automatically deleted after the tuning session is completed. Optionally, you can keep the shell database, for example if you want to use it again in another tuning exercise, by including the RetainShellDB element in the TuningOptions element, as in the following XML fragment.

<TuningOptions>
  <TestServer>test_instance</TestServer>
  <FeatureSet>IDX</FeatureSet>
  <Partitioning>NONE</Partitioning>
  <KeepExisting>NONE</KeepExisting>
  <RetainShellDB>1</RetainShellDB>
</TuningOptions>

If the shell database already exists when you request a tuning session, the database creation process will be skipped. However, you will have to manually delete this database when it is no longer needed, as sketched next.
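Assuming the shell database kept the name of the tuned database (AdventureWorks in this example), the cleanup on the test server would be:

-- On the test server: drop the retained shell database when no longer needed
DROP DATABASE AdventureWorks;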

Once the tuning session is completed, you can use the DTA graphical user interface as usual to see the recommendations. To do this, open the DTA, open the session you used by double-clicking its session name (session1 in our example), and choose the Recommendations tab if it is not already selected.

Scripting Statistics

Although the DTA automatically gathers the metadata and statistics to build the shell database, I am going to show you how to script the required objects and statistics to tune a simple query. This can be helpful in cases where you don’t want to script the entire database. Scripting database objects is a fairly simple process well known by SQL Server professionals. Something that may be new to many, though, is how to script the statistics. The created scripts make use of the undocumented STATS_STREAM, ROWCOUNT and PAGECOUNT options of the CREATE and UPDATE STATISTICS statements.

As an example, to script the objects needed to tune the simple query shown previously, try the following in Management Studio: select Databases, right-click the AdventureWorks database, select Tasks, Generate Scripts…, click Next, select "Select specific database objects", expand Tables, select Sales.SalesOrderDetail, and click Next. Then click Advanced, look for the "Script Statistics" option, and select "Script statistics and histograms". Finally, choose True for "Script Indexes". Your Advanced Scripting Options window should look similar to this:

[Figure: Advanced Scripting Options window]

Click OK and finish the wizard to generate the scripts. You will get a script with a few UPDATE STATISTICS statements similar to this (with the STATS_STREAM value shortened to fit on this page).

UPDATE STATISTICS [Sales].[SalesOrderDetail]([IX_SalesOrderDetail_ProductID]) 
WITH STATS_STREAM = 0x010000000300000000000000000000004036000 , 
ROWCOUNT = 121317, PAGECOUNT = 227

These UPDATE STATISTICS statements are used to update the statistics of existing indexes (obviously, the related CREATE INDEX statements were scripted as well). If the table also has column statistics, the script will include CREATE STATISTICS statements instead.

Testing Scripted Statistics

Finally, I will show you an example of how to use the scripted statistics to obtain plans and cost estimates on an empty table. Running the following query on the regular AdventureWorks database creates the plan shown next, with an estimated number of rows of 9 and a cost of 0.0296835.

SELECT * FROM Sales.SalesOrderDetail
WHERE ProductID = 898

[Figure: execution plan with an estimate of 9 rows and cost 0.0296835]

Let us produce the same plan on an empty database. Following the procedure described before, script the Sales.SalesOrderDetail table. You will end up with multiple statements, including the following (again, shortened to fit in this post).

CREATE TABLE [Sales].[SalesOrderDetail](
    [SalesOrderID] [int] NOT NULL, 
) ON [PRIMARY]
GO
CREATE NONCLUSTERED INDEX [IX_SalesOrderDetail_ProductID] ON 
[Sales].[SalesOrderDetail] 
(
    [ProductID] ASC
)
GO
UPDATE STATISTICS [Sales].[SalesOrderDetail]([IX_SalesOrderDetail_ProductID]) 
WITH STATS_STREAM = 0x010000000300000000000, ROWCOUNT = 121317, PAGECOUNT = 227
GO
UPDATE STATISTICS [Sales].[SalesOrderDetail]
([PK_SalesOrderDetail_SalesOrderID_SalesOrderDetailID]) 
WITH STATS_STREAM = 0x010000000200000000000000000000003C2F68F6, ROWCOUNT = 121317, 
PAGECOUNT = 1237

Create a new database and run at least the previous four statements using the scripts you got in the previous step (or use the script attached to this post, which contains the statements needed to reproduce the example). After implementing the script on an empty database and running the sample query, you will again get the plan with cost 0.0296835 and an estimated number of rows of 9.
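The setup might look like this, using a hypothetical database name (note that the Sales schema must exist before the scripted statements can run):

-- Create an empty database to host the scripted objects (name is arbitrary)
CREATE DATABASE StatisticsTest;
GO
USE StatisticsTest;
GO
-- The scripted table lives in the Sales schema, so create the schema first
CREATE SCHEMA Sales;
GO
-- Now run the scripted CREATE TABLE, CREATE INDEX and UPDATE STATISTICS statements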

Database Engine Tuning Advisor and the Query Optimizer

Since I am planning to publish a couple of DTA-related posts later this week I thought it would be a good idea to start with this article which explains how the DTA relies on the Query Optimizer to make its tuning recommendations. This article was previously published in my book Inside the SQL Server Query Optimizer.
 
Currently, all major commercial database vendors include a physical database design tool to help with the creation of indexes. However, when these tools were first developed, there were just two main architectural approaches considered for how these tools should recommend indexes. The first approach was to build a stand-alone tool with its own cost model and design rules. The second approach was to build a tool that could use the Query Optimizer cost model.

A problem with building a stand-alone tool is the requirement to duplicate the cost model. On top of that, having a tool with its own cost model, even if it is better than the optimizer’s cost model, may not be a good idea, because the optimizer still chooses its plan based on its own model.

The second approach, using the Query Optimizer to help with physical database design, was proposed in the database research community as far back as 1988. Since it is the optimizer which chooses the indexes for an execution plan, it makes sense to use the optimizer itself to help find the missing indexes that would benefit existing queries. In this scenario, the physical design tool uses the optimizer to evaluate the cost of queries given a set of candidate indexes. An additional benefit of this approach is that, as the optimizer cost model evolves, any tool using it can automatically benefit from the improvements.

SQL Server was the first commercial database product to include a physical design tool, in the shape of the Index Tuning Wizard which shipped with SQL Server 7.0, and which was later replaced by the Database Engine Tuning Advisor (DTA) in SQL Server 2005. Both tools use the Query Optimizer cost model approach and were created as part of the AutoAdmin project at Microsoft, the goal of which was to reduce the total cost of ownership (TCO) of databases by making them self-tuning and self-managing. In addition to indexes, the DTA can help with the creation of indexed views and table partitioning.

However, creating real indexes in a DTA tuning session is not feasible; its overhead could impact operational queries and degrade the performance of your database. So how does the DTA estimate the cost of using an index that does not yet exist? Actually, even during a regular query optimization, the Query Optimizer does not use indexes to estimate the cost of a query. The decision of whether to use an index or not relies only on some metadata and the statistical information regarding the columns of the index. Index data itself is not needed during query optimization but will, of course, be required during query execution if the index is chosen for the execution plan.

So, to avoid creating indexes during a DTA session, SQL Server uses a special kind of index called a hypothetical index, which was also used by the Index Tuning Wizard. As the name implies, hypothetical indexes are not real indexes; they only contain statistics, and they can be created with the undocumented WITH STATISTICS_ONLY option of the CREATE INDEX statement. You may not be able to see these indexes during a DTA session because they are dropped automatically when they are no longer needed. However, you can see the CREATE INDEX WITH STATISTICS_ONLY and DROP INDEX statements if you run a SQL Server Profiler session to watch what the DTA is doing.

Let’s take a quick tour of some of these concepts. To get started, create a new table in the AdventureWorks database:

SELECT *
INTO dbo.SalesOrderDetail
FROM Sales.SalesOrderDetail

Copy the following query and save it to a file:

SELECT * FROM dbo.SalesOrderDetail
WHERE ProductID = 897

Open a new DTA session. You can optionally run a SQL Server Profiler session if you want to inspect what the DTA is doing. In the Workload File option, select the file containing the SQL statement that you just created, and specify AdventureWorks as both the database to tune and the database for workload analysis. Click the Start Analysis button and, when the DTA analysis finishes, run this query to inspect the contents of the msdb..DTA_reports_query table:

SELECT * FROM msdb..DTA_reports_query

Running that query shows the following output (edited for space):

StatementString                             CurrentCost RecommendedCost
SELECT * FROM dbo.SalesOrderDetail WHERE Pr 1.2434      0.00328799

Notice that the output includes the statement that was tuned, as well as the current and recommended costs. The current cost, 1.2434, is easy to validate by directly requesting an estimated execution plan for the query, as shown next.

[Figure: estimated execution plan showing cost 1.2434]

Since the DTA analysis has completed, the hypothetical indexes it created have already been dropped. To obtain the indexes recommended by the DTA, click the Recommendations tab and look at the Index Recommendations section, where you can find the code to create any recommended index by clicking the Definition column. In our example, it shows the following code:

CREATE CLUSTERED INDEX [_dta_index_SalesOrderDetail_c_5_1915153868__K5]
ON [dbo].[SalesOrderDetail]
(
    [ProductID] ASC
)WITH (SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF,  DROP_EXISTING = OFF,
ONLINE = OFF) ON [PRIMARY]

In the next statement, and for demonstration purposes only, I will go ahead and create the index recommended by the DTA but, instead of a regular index, I will create it as a hypothetical index by adding the WITH STATISTICS_ONLY clause:

CREATE CLUSTERED INDEX cix_ProductID ON dbo.SalesOrderDetail(ProductID)
WITH STATISTICS_ONLY

You can validate that a hypothetical index was created by running the next query:

SELECT * FROM sys.indexes
WHERE object_id = object_id('dbo.SalesOrderDetail')
AND name = 'cix_ProductID'

The output is shown next; note that the is_hypothetical field shows that this is, in fact, just a hypothetical index:

object_id  name          index_id type type_desc is_hypothetical
1915153868 cix_ProductID 3        1    CLUSTERED 1

Remove the hypothetical index by running this statement:

DROP INDEX dbo.SalesOrderDetail.cix_ProductID

Now implement the DTA recommendation, this time as a regular clustered index:

CREATE CLUSTERED INDEX cix_ProductID ON dbo.SalesOrderDetail(ProductID)

After implementing the recommendation and running the query again, the clustered index is in fact now being chosen by the Query Optimizer. This time, the plan shows a Clustered Index Seek operator and an estimated cost of 0.0033652, which is very close to the recommended cost listed previously when querying the msdb..DTA_reports_query table.

Finally, drop the table you just created by running the following statement:

DROP TABLE dbo.SalesOrderDetail

Optimizer Statistics on Linked Servers

Recently I was asked to troubleshoot a performance problem with a query using linked servers. The problem was related to a well-known issue where the query processor is not able to get the required optimizer statistics from the remote server, due to the permissions of the user used by the linked server. This behavior is documented in the Books Online entry Guidelines for Using Distributed Queries, as shown next:

 “To create the best query plans when you are using a table on a linked server, the query processor must have data distribution statistics from the linked server. Users that have limited permissions on any columns of the table might not have sufficient permissions to obtain all the useful statistics, and might receive a less efficient query plan and experience poor performance. If the linked server is an instance of SQL Server, to obtain all available statistics, the user must own the table or be a member of the sysadmin fixed server role, the db_owner fixed database role, or the db_ddladmin fixed database role on the linked server.”

Basically, the problem is that if the user used by the linked server does not have the permissions described in the previous Books Online entry, SQL Server will not be able to execute the DBCC SHOW_STATISTICS statement on the remote server to obtain the required statistics. Let me show you the problem with an example.

Plan with read-only permissions

I’ve created a linked server between two SQL Server instances, each hosting a copy of the AdventureWorks database. First, I granted read-only permissions to the user used by the linked server, and ran the following artificial query just to demonstrate this behavior:

SELECT l.* FROM AdventureWorks.Sales.SalesOrderHeader l
JOIN remote.AdventureWorks.Sales.SalesOrderHeader r
ON l.SalesOrderID = r.SalesOrderID
WHERE r.CustomerID = 666

Running the previous query gets me the following plan:

[Figure: plan using a Merge Join and a Clustered Index Scan]

In this case the Query Optimizer could benefit from knowing the cardinality estimate of the query executed on the remote server, that is, how many orders were placed by customer 666, but this information is not available for this plan. With a guessed estimate of 2,362.49 rows, the Query Optimizer decides to use a Merge Join plus a Clustered Index Scan when in fact the query returns only 8 records. You can run Profiler against the remote server to learn what kind of information the local query processor is requesting from it. I noticed that it executes the following five system stored procedures, which obtain information about the tables, columns, indexes, check constraints and statistics involved in the remote query.

exec [AdventureWorks].[sys].sp_tables_info_90_rowset N'SalesOrderHeader',N'Sales',NULL
exec [AdventureWorks].[sys].sp_columns_100_rowset N'SalesOrderHeader',N'Sales',NULL
exec [AdventureWorks].[sys].sp_indexes_100_rowset N'SalesOrderHeader',NULL,N'Sales'
exec [AdventureWorks].[sys].sp_check_constbytable_rowset N'SalesOrderHeader',N'Sales',NULL,NULL
exec [AdventureWorks].[sys].sp_table_statistics2_rowset N'SalesOrderHeader',N'Sales',N'AdventureWorks',NULL,NULL,NULL

The last stored procedure, sp_table_statistics2_rowset, tries to obtain the header and density information of the statistics object using the following statement, but it fails because of the lack of permissions mentioned earlier.

dbcc show_statistics(@qtbl, @statname) with stat_header join density_vector
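You can reproduce the failure directly on the remote server by impersonating the linked server login; a minimal sketch, using a hypothetical login name:

-- Impersonate the low-privileged login used by the linked server (hypothetical name)
EXECUTE AS LOGIN = 'linked_server_user';
-- With only read permissions, this DBCC statement fails with a permissions error
DBCC SHOW_STATISTICS ('Sales.SalesOrderHeader', 'IX_SalesOrderHeader_CustomerID');
REVERT;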

Plan with db_owner permissions
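First, grant the permission on the remote server; a minimal sketch, again using a hypothetical user name:

-- On the remote server: add the user mapped to the linked server login to db_owner
USE AdventureWorks;
GO
EXEC sp_addrolemember 'db_owner', 'linked_server_user';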

Now grant db_owner permissions to the user used by the linked server, as sketched above, and run the query again while forcing a new optimization (for example, using DBCC FREEPROCCACHE to clear the plan cache on the local server). Profiler will show that the previous five system stored procedures were executed again, but this time the sp_table_statistics2_rowset procedure is able to successfully get the requested header and density information. In addition, the following statement is also issued to obtain the histogram information from the statistics object

DBCC SHOW_STATISTICS(N'"AdventureWorks"."Sales"."SalesOrderHeader"',
"IX_SalesOrderHeader_CustomerID") WITH HISTOGRAM_STEPS

This time the following execution plan is returned

[Figure: plan using a Nested Loops Join and a Clustered Index Seek]

By using the histogram from the remote server, the local query processor gets an estimated number of rows of 6.65385 and instead decides to use a Nested Loops Join and a Clustered Index Seek, which is a better and more appropriate plan, as the actual number of records returned is only 8.

If you manually run the previous DBCC SHOW_STATISTICS statement on the remote server, you can see the histogram, an extract of which is shown next. The step corresponding to the value CustomerID = 666 shows an estimate of 6.653846 records in the AVG_RANGE_ROWS column, the same value shown in the previous execution plan.

[Figure: histogram extract showing AVG_RANGE_ROWS of 6.653846 for CustomerID 666]

Remote plan

It is also worth mentioning that the query executed on the remote server is practically the same in both cases, except that an ORDER BY is needed in the first example because the Merge Join in the local plan requires sorted data; otherwise the remote plan is the same, as shown next (both the query and the plan can be captured using Profiler).

SELECT "Tbl1003"."SalesOrderID" "Col1011"

FROM "AdventureWorks"."Sales"."SalesOrderHeader" "Tbl1003"

WHERE "Tbl1003"."CustomerID"=(666)

ORDER BY "Col1011" ASC

Again, notice that the estimated number of rows is 6.65385.

[Figure: remote query plan with an estimated number of rows of 6.65385]

Conclusion

So we’ve seen how using a user with limited permissions to run a query through a linked server can, in some specific cases, be a performance problem because of the lack of access to optimizer statistics. It is unfortunate that having access to the data does not also give you access to its statistics, as having to provide higher-level permissions could be a security concern. In fact, there is a related Connect entry by Erland Sommarskog discussing the problem and suggesting that permission to access the data should also grant access to its statistics. Let us hope a better solution for the use of linked servers is provided in the near future.

My book “Inside the SQL Server Query Optimizer” Now Available on Amazon

I found out today that my book, “Inside the SQL Server Query Optimizer”, is finally available on Amazon. You can find the description of the book on the Amazon page, or a more detailed chapter-by-chapter description in one of my previous posts here. I would like to use this post to thank the people who helped me or in some way influenced the content of the book.

Writing this book was a lot of fun, but also a lot of work; actually a lot more work than I originally expected. Fortunately I got help from several people. First of all, I would like to thank Chris Massey. Chris helped me as the technical editor of the book, and guided me through most of the writing process, doing absolutely outstanding work. Very special thanks also go out to Grant Fritchey who helped us as the technical reviewer, and went very much beyond just reviewing, as his comments provided invaluable feedback to improve the quality of this book. Thanks also go to Tony Davis for offering me this opportunity in the first place, and helping to get the project started.

Outside the Red Gate team, my deepest gratitude goes to Cesar Galindo-Legaria, Manager of the SQL Server Query Optimization team at Microsoft, for answering my endless list of questions about the Query Optimizer, and educating me through some of the information he has published, including numerous research papers and an excellent chapter of a SQL Server book. I had the opportunity to meet Tony, Cesar, and Grant at the PASS Summit back in 2009, when I first started this journey, and I hope to finally meet Chris at the same conference in October 2011.

Although Chris, Cesar, Grant and Tony have directly helped me to shape the content of this book, there’s also a host of other people who have indirectly influenced the book through their work, which helped me to learn about and better understand the SQL Server query processor. With that in mind, I would like to thank the authors who have discussed the query processor in some of the available SQL Server books, Microsoft white papers and blogs, including Kalen Delaney, Ken Henderson, Lubor Kollar, Craig Freedman, Conor Cunningham and Eric Hanson.

Research papers have provided me with an unlimited source of information, and helped me to understand the Query Optimizer at a significantly deeper level than ever before. So, thanks to all that amazing work by the many people in the research community including, in addition to Cesar, Goetz Graefe, Surajit Chaudhuri, Yannis Ioannidis, Vivek Narasayya, Pat Selinger, Florian Waas, and many, many more.

Finally, on the personal side, I would like to thank my parents, Guadalupe and Humberto, and my family: my wife Rocio, and my three sons, Diego, Benjamin and David. Thanks all for your unconditional support and patience.

Slide Deck and Demo Files for Recent Presentations

Attached are WinZip files containing the PowerPoint presentation, as well as the scripts used in the demos for my sessions “Top 10 Query Optimizer Topics for Better Performance” and “Inside the SQL Server 2008 Data Collector” which I presented recently at the SoCal Code Camp, the Los Angeles SQL Server Professionals Group and the PASS Summit 2010.

Inside the SQL Server 2008 Data Collector – TheDataCollector.zip
Top 10 Query Optimizer Topics for Better Performance – QueryOptimizerTopics.zip