A problem with building a stand-alone tool is the requirement for duplicating the cost module. On top of that, having a tool with its own cost model, even if it’s better than the optimizer’s cost model, may not be a good idea because the optimizer still chooses its plan based on its own model.
The second approach, using the Query Optimizer to help on physical database design, was proposed in the database research community as far as back as 1988. Since it’s the optimizer which chooses the indexes for an execution plan, it makes sense to use the optimizer itself to help find which missing indexes would benefit existing queries. In this scenario, the physical design tool would use the optimizer to evaluate the cost of queries given a set of candidate indexes. An additional benefit of this approach is that, as the optimizer cost model evolves, any tool using its cost model can automatically benefit from it.
SQL Server was the first commercial database product to include a physical design tool, in the shape of the Index Tuning Wizard which shipped with SQL Server 7.0, and which was later replaced by the Database Engine Tuning Advisor (DTA) in SQL Server 2005. Both tools use the Query Optimizer cost model approach and were created as part of the AutoAdmin project at Microsoft, the goal of which was to reduce the total cost of ownership (TCO) of databases by making them self-tuning and self-managing. In addition to indexes, the DTA can help with the creation of indexed views and table partitioning.
However, creating real indexes in a DTA tuning session is not feasible; its overhead could impact operational queries and degrade the performance of your database. So how does the DTA estimate the cost of using an index that does not yet exist? Actually, even during a regular query optimization, the Query Optimizer does not use indexes to estimate the cost of a query. The decision of whether to use an index or not relies only on some metadata and the statistical information regarding the columns of the index. Index data itself is not needed during query optimization but will, of course, be required during query execution if the index is chosen for the execution plan.
So, to avoid creating indexes during a DTA session, SQL Server uses a special kind of indexes called hypothetical indexes, which were also used by the Index Tuning Wizard. As the name implies, hypothetical indexes are not real indexes; they only contain statistics and can be created with the undocumented WITH STATISTICS_ONLY option of the CREATE INDEX statement. You may not be able to see these indexes during a DTA session because they are dropped automatically when they are no longer needed. However, you could see the CREATE INDEX WITH STATISTICS_ONLY and DROP INDEX statements if you run a SQL Server Profiler session to see what the DTA is doing.
Let’s take a quick tour to some of these concepts. To get started, create a new table on the AdventureWorks database:
SELECT * INTO dbo.SalesOrderDetail FROM Sales.SalesOrderDetail
Copy the following query and save it to a file:
SELECT * FROM dbo.SalesOrderDetail WHERE ProductID = 897
Open a new DTA session, and you can optionally run a SQL Server Profiler session if you want to inspect what the DTA is doing. On the Workload File option, select the file containing the SQL statement that you just created and specify AdventureWorks as both the database to tune and the database for workload analysis. Click the Start Analysis button and, when the DTA analysis finishes, run this query to inspect the contents of the msdb..DTA_reports_query table:
SELECT * FROM msdb..DTA_reports_query
Running that query shows the following output, (edited for space):
StatementString CurrentCost RecommendedCost
SELECT * FROM dbo.SalesOrderDetail WHERE Pr 1.2434 0.00328799
Notice that the query returns information like the query that was tuned, as well as the current and recommended cost. The current cost, 1.2434, is easy to obtain by directly requesting an estimated execution plan for the query as shown next.
Since the DTA analysis was completed, the created hypothetical indexes were already dropped. To now obtain the indexes recommended by the DTA, click on the Recommendations tab and look at the Index Recommendations section, where you can find the code to create any recommended index by then clicking on the Definition column. In our example, it will show the following code:
CREATE CLUSTERED INDEX [_dta_index_SalesOrderDetail_c_5_1915153868__K5] ON [dbo].[SalesOrderDetail] ( [ProductID] ASC )WITH (SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF) ON [PRIMARY]
In the next statement, and for demonstration purposes only, I will go ahead and create the index recommended by the DTA but, instead of a regular index, I will create it as a hypothetical index by adding the WITH STATISTICS_ONLY clause:
CREATE CLUSTERED INDEX cix_ProductID ON dbo.SalesOrderDetail(ProductID) WITH STATISTICS_ONLY
You can validate that a hypothetical index was created by running the next query:
SELECT * FROM sys.indexes WHERE object_id = object_id('dbo.SalesOrderDetail') AND name = 'cix_ProductID'
The output is shown next below; note that the is_hypothetical field shows that this is, in fact, just a hypothetical index:
object_id name index_id type type_desc is_hypothetical
1915153868 cix_ProductID 3 1 CLUSTERED 1
Remove the hypothetical index by running this statement:
DROP INDEX dbo.SalesOrderDetail.cix_ProductID
Now implement the DTA recommendation, this time as a regular clustered index:
CREATE CLUSTERED INDEX cix_ProductID ON dbo.SalesOrderDetail(ProductID)
After implementing the recommendation and running the query again, the clustered index is in fact now being chosen by the Query Optimizer. This time, the plan shows a Clustered Index Seek operator and an estimated cost of 0.0033652, which is very close to the recommended cost listed previously when querying the msdb..DTA_reports_query table.
Finally, drop the table you just created by running the following statement:
DROP TABLE dbo.SalesOrderDetail
How would the query plan look if the query was estimated/executed with the hypothetical index being still alive? Would it be included in the queryplan and fail to execute?
Very interesting. Thanks for the info Benjamin.
Also, enjoyed your presentations the past two years at SQL Saturday-HB.
Hi Tobi,
Hypothetical indexes can only be used inside a DTA session so no execution plans using those indexes are available outside of it. Even if you manually create the hypothetical index in my example, it can not be used in an execution plan.
Regards,
Ben
Hi TKnoob,
Thanks again. See you on the next SQLSaturday or SoCal Code Camp!
Regards,
Ben
[…] a test server to tune the workload of a production server. As I mentioned on the first part of this post, the DTA relies on the Query Optimizer to make its tuning recommendations and you can use it to […]
Nice post Benjamin.
I would have a question in regards to above topic if you don’t mind.
If we use SQL Profiler to trace typical workload into file or table for DETA input, does DETA actually execute SQLs against the database (in order to analyze the workload and generate recommendations)? MS documentation does not say much about this directly … I assume it does its analysis by estimating costs, looking into execution plans, creating hypothetical indexes, etc … but it does not execute or ‘replay’ original queries … running DMLs would especially cause SQLs to fail (constraints) or even worse – corruption (duplicates, etc …) … I know for option to run it against test server but my main concern was about running DETA against the database where workload was originally created… what’s your opinion?
Thanks,
Milan N.
Hi Milan,
Thanks for your comment. Yes, the DTA does not execute the queries, it only evaluates their cost.
Regards,
Ben