Last year when I presented my session regarding the Query Optimizer at the PASS Summit, I was asked how the estimated CPU and I/O costs in an execution plan are calculated, that is, where a specific value like 1.13256 is coming from. All I was able to say at the moment was that Microsoft does not publish how these costs are calculated.

Since this time I am working on a related project, I thought that perhaps I could look into this question again and show one example. But since there are dozens of operators, I decided to take a look at a simple one: the Clustered Index Scan operator. So I captured dozens of XML plans, used XQuery to extract their cost information and after some analysis I was able to obtain a basic formula for this specific operator.

But first a quick introduction to cost estimation: the cost of each operator depends on its algorithm, each operator is associated with a CPU cost, and some of them will also have an I/O cost. The total cost of the operator is the sum of these two costs. An operator like a Clustered Index Scan has both CPU and I/O costs. Some other operators, like a Stream Aggregate, will have only CPU cost. It is interesting to note that this cost used to mean the estimated time in seconds that a query or operator would take to execute on a particular reference machine. In recent versions of SQL Server this cost should no longer be interpreted as seconds, milliseconds, or any other unit.

To show the example, let us look at the largest table in AdventureWorks, Sales.SalesOrderDetail. Run the following query and look at the estimated CPU and I/O costs for the Clustered Index Scan operator as shown on the next figure.

SELECT * FROM Sales.SalesOrderDetail

WHERE LineTotal = 35

For a Clustered Index Scan operator, I observed that the CPU cost is 0.0001581 for the first record, plus 0.0000011 for any additional record after that. In this specific case we have an estimated number of records of 121,317, as shown on the picture above, so we can use 0.0001581 + 0.0000011 * (121317 – 1) or 0.133606 which is the value shown as Estimated CPU Cost. In a similar way, I noticed that the minimum I/O cost is 0.003125 for the first database page and then it grows in increments of 0.00074074 for every additional page. Since the Clustered Index Scan operator scans the entire table, I can use the following query to find the number of database pages, which returns 1,234.

SELECT in_row_data_page_count, row_count

FROM sys.dm_db_partition_stats

WHERE object_id = object_id(‘Sales.SalesOrderDetail’)

AND index_id = 1

In this case we have 0.003125 + 0.00074074 * (1234 – 1) or 0.916458 which is the value shown as estimated I/O Cost.

Finally, we add both costs, 0.133606 + 0.916458 to get 1.05006 which is the total estimated cost of the operator. In the same way, adding the cost of all the operators will give the total cost of the plan. In this case, the cost of the Clustered Index Scan, 1.05006, plus the cost of the first Compute Scalar operator, 0.01214, the second Compute Scalar operator, 0.01213, and the cost of the Filter operator, 0.0582322, will give the total cost of the plan, 1.13256, as shown next.

Hey Ben,

You might be interested to read the very-in-depth coverage Joe Chang gave to this topic at http://www.qdpma.com/SQLServerCostBasedOptimizer.html

Regards,

Paul

Thanks Paul. I remember seeing Joe Chang’s post but definitely I should read it again.

Regards,

Ben

Hi Mr. Nevarez

Iam a master student in college of science -computer departement ,and now iam working on my thesis which is titled (query optimization by using Genetic algorithms). Iam really in need for ur help, can i get ur email or mobile number to be able to contact u and ask u some questions please.

Hi ahmedhassany,

Yes, you can e-mail me at admin at benjaminnevarez dot com.

Regards,

Ben

Thanks Mr. Benjamin,I got ur email ,and I have sent you an email

Thank you Sir. This article was of much helpful in understanding the calculations of various Estimated costs. I will try analysing the same for various queries during its implementation to reduce the future issues on data record retrieval.

how to get estimated sub tree cost with out GUI?

[…] The execution plan is a tree. Each iterator in the tree is given an estimated CPU cost and an estimated IO cost and these are added together to get the overall cost (the relative weightings can be adjusted with a couple of undocumented DBCC commands). The estimated subtree cost includes the cost for the iterator itself and all its descendants. To see an example of the costing formulas used you can look at this article. […]