In part 1 we looked at the overall flow of the optimizer.
In this post we will examine the different types of logical trees, and the types of operators you see inside them.
Logical Trees
Tree Types
Operator Types
Remember This
Logical trees are composed of operators, each of which may have child operators, producing a cascading plan. Conceptually, the structure is similar to a query plan you will see in Management Studio. There are however some key characteristics to note.
Unlike physical operators in a query plan, logical operators never execute. Their sole purpose is to describe an operation that logically exists at a given step. For example, a join will be shown in it's logical form (inner, outer, full, semi, anti-semi) with no consideration as to whether it should be a hash join, merge join, nested loop or apply. Similarly, getting table data is described with no determination of how the table should be accessed, e.g. with a seek or scan, or whether a heap or specific index will be used as the source of the data.
Terminology is different than SQL - Relational rather SQL terminology is used. Get used to the idea that selection means determining which rows to return - aka filtering. Projection means defining which columns to return - akin the behaviour of SQL SELECT.
Logical operators describe the intrinsic functionality expressed in T-SQL code. This is not necessarily a direct translation. When we look at logical trees, we will see cases where the same logical operators are used for seemingly different SQL functions or constructs. For example, there is no logical left string operator - it can be adequately described using the more multi-purpose substring. Conversely, logical (and physical) operators can describe a null equality test - which is not explicitly available in T-SQL.
Logical (and physical) trees are solely diagnostics - they omit lots of 'under the hood' information that is used by the optimizer. Also, in places, they have a few syntax anomalies that would make them unsuitable for machine reading. But despite these limitations - or perhaps because of some of them - they are human readable and useful in understanding steps in the optimization process.
At different steps of optimization we can examine the logical tree, and also some physical trees at the end of the process. Below are some examples * of the changes that can occur between each tree. We will look at practical examples in later posts.
Converted Tree
Input Tree
Simplified Tree
Some outer joins may be converted to inner joins, e.g. if a predicate is applied to the outer table.
Join Collapsed Tree
After Project Normalization Tree
Output Tree
Post Optimization Rewrites
Operator Types
In logical trees there are three overall types of operator.
Logical Operators - LogOp_{name}
Logical operators describe the processing of sets of data. This includes originating data from tables, and operations which in most cases take other operators as inputs; such as joins, filters and aggregations. Although there are 75 of these operators in SQL Server 2019, there is a surprisingly small subset that satisfy most queries.
In the output tree we will see logical operations expressed instead as specific physical operators - PhyOp_{name} - e.g. a Hash Join rather than a generic logical join. In this step by step guide, we'll look at examples of this evolution, and how you can design tables, indexes and queries to get the operators that will deliver the best performance for a query.
Ancillary Operators - AncOp_{name}
Ancillary operators describe additional functionality for a logical operator. Of the 16 ancillary operators in SQL Server 2019, one of the most commonly seen uses is encapsulating an expression to derive a new column that will be evaluated at run time.
Scalar Operators - ScaOp_{name}
Scalar operators are the building blocks of expressions, producing a scalar rather than relational output. They range from basic mathematical operations such as add, multiply etc, through to comparisons and set comparisons such as exists.
Note that unlike logical operators (LogOp_{name}), the scalar operators found in logical trees can also exist in physical plans - i.e. they can be executed.
Not only are there over fifty of these operators, several of those have many variants, leading to over a hundred explicit uses. Despite this apparent proliferation, most are readable without the need for in-depth knowledge.
- Logical trees show the early stages of optimization.
- A number of simplifications are applied, to eliminate unnecessary work.
- Logical operators do not specify physical processing methods such as physical join types (e.g. hash) or data access methods - this is determined later.
These behaviours are however far from comprehensive. The optimizer contains around four hundred rules that can rewrite portions of a tree in certain circumstances.
The following sql file gives queries reproducing
Please do comment below or get in touch if you have other examples that might be worth a mention.
No comments:
Post a Comment