QUERY EXECUTION
- Recompile would be the 1st step to influence to optimize to recompile the code again.
- Compilation is the most important stage
- Cost based optimizer == need to evaluate a query reasonably quickly
- When writing a query, there are many ways to write it but do not focus to that first. Focus the result.
- WHERE msal * 12 > literal -> Bad
- WHERE msal > literal / 12 -> Good
- All about statistics / estimation -> Query performance
- Most of the cases the issue is not because of out of date stats or indexes, it could be parameter sniffing
- EXEC <proc name> <parameters> -> uses the existing plan
- EXEC <proc name> <parameters> WITH RECOMPILE -> generates a new plan
- Using above two, you can identify the parameter sensitiveness (sniffing)
- Update stats invalidate the plan
- Estimates vs Actual -> If it is different, it may not be the stat problem always. It may be parameter sniffing.
STATISTICS
- Selectivity
- Optimizer loves highly selective predicates
- Low no.of rows -> high selectivity
- High no.of rows -> low selectivity
- Optimizer loves highly selective predicates
- Statistics -> Summarized info about the data distribution of table columns
- DBCC AUTOPILOT -> Undocumented
- Hypothetical Indexes -> just the index structure.
- What if analysis using AUTOPILOT with >= SQL Server 2008
- What if analysis using AUTOPILOT with >= SQL Server 2008
- sp_helpstats ‘<table name>’, ‘all’
- Histogram -> 200 steps + 1 row for the null if the column allows null
- SQL Server 7.0 had 300 rows
- When the histogram is large, it increases the compilation time because histogram is not like an index
- EQ_ROWS – Equal rows for the index key
- Rows * All density = Avg no.of rows returns for that column or the combination of cols
- If the table is huge, the data distribution present in histogram is not quite accurate
- Step compression -> When building the histogram, if it finds the value approximately similar to each adjacent steps, then the algorithm will compress those steps and create 1 step
- RANGE_ROWS -> values between two HI_KEYs and excluding the HI_KEYs at both ends
- AVG_RANGE_ROWS = RANGE_ROWS / DISTINCT_RANGE_ROWS
- SQL Server 7.0 had 300 rows
- Exec sp_autostats ‘<table name>’
- Sys.stats -> shows all the stats including index stats
- Sys.dm_db_stats_properties -> gives lot of details that you can use to update stats more effectively programmatically.
- Histogram -> direct hit
- DBCC SHOWSTATISTICS WITH HISTOGRAM
- Stats will update on Index Rebuild but not on Index re-org.
- Entire stats structure stored in db as a BLOB (Header, Density Vector and Histogram)
- Partitioned table uses table level stats which has 200 steps by each by each partition
- Online partition level index rebuild in SQL Server 2014, but when MS introduced partitioning? It was in SQL Server 2005. So MS took 9 years to get online partition level index rebuild
- Tuple cardinality -> used to estimate distinct cardinality
- Optimization | Compilation | Execution
- Local variable values are not known at optimization time
- Local variable values are not known at optimization time
- Parameters and literals can be sniffed -> uses histogram
- Variables cannot be sniffed -> density vector
- UDTATE STATS with any % will not be parallelized but FULL SCAN can be parallelized.
- OPTION (QUERYTRACEON 3602, 9204, RECOMPILE)
- Dynamic auto updating threshold
- For large tables to have 20% change , you need to wait long time to update stats.
- For large tables to have 20% change , you need to wait long time to update stats.
- SQL Server does not understand the correlation between columns
- Calculation direction in query plan is from right to left.
CONCERNS AROUND THE HISTOGRAM
- Problem is always with monster tables
- Is 200 steps enough for such tables?
- Even the table is partitioned, still the histogram is @ table level
- Sp_recompile for a table is an expensive operation. It needs Sch-M lock on the table
- Filtered stats – consider creating filtered stats, even daily basis to tackle and solve the estimates problems due to skewed data. So that you will get better execution plans
- QUERYTRACEON (2353) -> Additional info.