The SQL Server Query Optimizer needs to take two important decisions regarding joins: the selection of a join order and the choice of a join algorithm. The selection of a join algorithm is the choice between a nested loops join, merge join or hash join operator, but I will leave that topic for some other post. In this post I will talk about join orders.
A join combines records from two tables based on some common information. But since a join works with only two tables at a time, a query requesting data from n tables must be executed as a sequence of n – 1 joins. Although the results of a query are the same regardless of the join order, the order in which the tables are joined greatly influences the cost and performance of a query. The Query Optimizer must find the optimal sequence of joins between the tables used in the query. Finding the optimal join order is one of the most difficult problems in query optimization and one that has been subject of extensive research since the seventies.
This basically means that the Query Optimizer defines the join order; it is not defined by your query. To see an example run the following query against the AdventureWorks database and choose to display the execution plan by clicking Include Actual Execution Plan.
SELECT FirstName, LastName
FROM
JOIN
ON C.ContactID = I.ContactID
JOIN
ON I.CustomerID = Cu.CustomerID
WHERE Cu.CustomerType = ‘I’
It will show the following execution plan
Note that the Query Optimizer is not using the same join order specified in the query; it found a more efficient one. The query asks to join the Contact table with the Individual table, and then join the result with the Customer table, but if you inspect the three clustered index scan operators, the plan is first joining the Customer and Individual tables and then joining the result with the Contact table.
The problem of finding an efficient join order for a query is difficult because of the number of possible permutations that the Query Optimizer needs to analyze. Because of the commutative and associative properties of joins there could be many different possible join orders in a query and this number increases exponentially with the number of tables. In fact, with just a handful of tables the number of possible join orders could be in the thousands or even millions. Obviously, it is impossible for any query optimizer to look at all those combinations: it would take too long.
Queries are represented as trees in the query processor and some names like leftdeep, rightdeep and bushy trees are commonly used to identify the shapes of the order of joins in these trees. Leftdeep and bushy trees for a join of four tables are shown in the next figure.
For example, the leftdeep tree shown could be JOIN( JOIN( JOIN(A, B), C), D) and the bushy tree could be JOIN(JOIN(A, B), JOIN(C, D)). Leftdeep trees are also called linear trees or linear processing trees. The set of bushy trees includes the sets of both leftdeep and rightdeep trees. The following table lists the number of possible join orders considering leftdeep and bushy trees.
Tables

LeftDeep Trees

Bushy Trees

1

1

1

2

2

2

3

6

12

4

24

120

5

120

1,680

6

720

30,240

7

5,040

665,280

8

40,320

17,297,280

9

362,880

518,918,400

10

3,628,800

17,643,225,600

11

39,916,800

670,442,572,800

12

479,001,600

28,158,588,057,600

The number of leftdeep trees is calculated as n! or n factorial, where n is the number of tables in the relation. The factorial is the product of all the positive integers less than or equal to n. For example, 5! or 5 factorial is 5 x 4 x 3 x 2 x 1 or 120. Calculating the number of bushy tress is more complicated and can be calculated as (2n – 2)!/(n – 1)!.
So how does the Query Optimizer analyze all these possible join orders? The answer is: it does not. Analyzing all the possible join orders for a complex query could take a long time so the Query Optimizer must find a balance between the optimization time and the quality of the resulting plan. The Query Optimizer may not perform an exhaustive search and instead uses some heuristics to guide the search process. For example, a common heuristic is to avoid considering bushy tress during optimization, but I will leave some of these details for some other post.