Estimate Actual Operator - 1225 PROJECT ps_partkey, sum AS v
- 1225 SORT sum
- 1225 FILTER sum > expr_23
32000 28515 CROSSJOINLOOP 32000 28515 │└AGGREGATESUM(sum_24) AS sum GROUPBYHASH ps_partkey
32000 28515 │DISTRIBUTEHASHON ps_partkey
32000 28515 │AGGREGATESUM(expr) AS sum_24 GROUPBY PARTIAL ps_partkey
32000 30160 │PROJECT ps_supplycost *CAST(ps_availqty AS double) AS expr, ps_partkey
32000 30160 │INNERJOINHASHON ps_suppkey = s_suppkey
800000 800000 ││└TABLESCAN partsupp
400 377 │DISTRIBUTEHASHON s_suppkey
400 377 │INNERJOINHASHON s_nationkey = n_nationkey
10000 10000 ││└TABLESCAN supplier
1 1 │DISTRIBUTE GATHER
25 1 │FILTER n_name = 'JAPAN'
25 25 │TABLESCAN nation
1 1 DISTRIBUTE GATHER
1 1 PROJECT sum_22 * 1.0E-4 AS expr_23
1 1 DISTRIBUTE ROUND ROBIN
1 1 AGGREGATESUM(sum_25) AS sum_22
32000 1 DISTRIBUTE GATHER
32000 1 AGGREGATESUM(expr_21) AS sum_25
32000 30160 PROJECT ps_supplycost_6 *CAST(ps_availqty_5 AS double) AS expr_21
32000 30160 INNERJOINHASHON ps_suppkey_4 = s_suppkey_9
800000 800000 │└PROJECT ps_suppkey AS ps_suppkey_4, ps_availqty AS ps_availqty_5, ps_supplycost AS ps_supplycost_6
800000 800000 │TABLESCAN partsupp
400 377 DISTRIBUTEHASHON s_suppkey_9
400 377 INNERJOINHASHON s_nationkey_12 = n_nationkey_17
10000 10000 │└PROJECT s_suppkey AS s_suppkey_9, s_nationkey AS s_nationkey_12
10000 10000 │TABLESCAN supplier
1 1 DISTRIBUTE GATHER
25 1 PROJECT n_nationkey AS n_nationkey_17
25 1 FILTER n_name = 'JAPAN'
25 25 TABLESCAN nation
SQL Server
Estimate Actual Operator 9031 1225 SORT Expr1009
9031 1225 INNERJOINLOOPON n_name = 'JAPAN' AND n_name = 'JAPAN'
30106 28515 │└AGGREGATESUM(Expr1021) AS Expr1009 GROUPBYHASH ps_partkey
32002 30160 │INNERJOINHASHON ps_suppkey = s_suppkey
400 377 ││└INNERJOINHASHON s_nationkey = n_nationkey
1 1 │││└TABLESCAN nation WHERE n_name = 'JAPAN'
1000 377 ││TABLESCAN supplier WHERE BLOOM(s_nationkey)
80000 30160 │PROJECT ps_supplycost * CONVERT_IMPLICIT(decimal(10,0),ps_availqty,0) AS Expr1021
80000 30160 │TABLESCAN partsupp WHERE BLOOM(ps_suppkey)
1 1 PROJECTCASEWHEN Expr1042 = 0 THEN NULL ELSE Expr1043 END AS Expr1019
1 1 AGGREGATECOUNT(Expr1022) AS Expr1042, SUM(Expr1022) AS Expr1043
32002 30160 INNERJOINHASHON ps_suppkey = s_suppkey
400 377 │└INNERJOINHASHON s_nationkey = n_nationkey
1 1 ││└TABLESCAN nation WHERE n_name = 'JAPAN'
1000 377 │TABLESCAN supplier WHERE BLOOM(s_nationkey)
80000 30160 PROJECT ps_supplycost * CONVERT_IMPLICIT(decimal(10,0),ps_availqty,0) AS Expr1022
80000 30160 TABLESCAN partsupp WHERE BLOOM(ps_suppkey)
Commentary
Statistics (Histograms and MCV)
To pick the optimal bushy join order in Query 11, the optimizer needs accurate statistics like histograms and Most Common Values (MCV). These allow it to know that a filter on n_name = 'JAPAN' will result in a very small number of matching rows in the supplier table.
Bushy Join
Query 11 benefits from a bushy join tree: partsupp ⨝ (supplier ⨝ nation). Since the filtered result of supplier ⨝ nation is tiny, the hash table built for this join is very small, making this more memory-efficient and faster than a traditional left-deep tree.
Canonical Forms vs. Init Plans
How a database handles non-correlated subqueries (like the one in the HAVING clause) affects the simplicity of its execution engine. Some databases use "init plans" to execute the subquery first, while others canonicalize the subquery into a simple cross-join to a single row.