SQL Server Torque: October 2016

QUERY EXECUTION

Recompile would be the 1^st step to influence to optimize to recompile the code again.
Compilation is the most important stage
Cost based optimizer == need to evaluate a query reasonably quickly
When writing a query, there are many ways to write it but do not focus to that first. Focus the result.
WHERE msal * 12 > literal -> Bad
WHERE msal > literal / 12 -> Good
All about statistics / estimation -> Query performance
Most of the cases the issue is not because of out of date stats or indexes, it could be parameter sniffing
EXEC <proc name> <parameters> -> uses the existing plan
EXEC <proc name> <parameters> WITH RECOMPILE -> generates a new plan
Using above two, you can identify the parameter sensitiveness (sniffing)
Update stats invalidate the plan
Estimates vs Actual -> If it is different, it may not be the stat problem always. It may be parameter sniffing.

STATISTICS

Selectivity
- Optimizer loves highly selective predicates
- Low no.of rows -> high selectivity
- High no.of rows -> low selectivity
Statistics -> Summarized info about the data distribution of table columns
DBCC AUTOPILOT -> Undocumented
Hypothetical Indexes -> just the index structure.
- What if analysis using AUTOPILOT with >= SQL Server 2008
sp_helpstats ‘<table name>’, ‘all’
Histogram -> 200 steps + 1 row for the null if the column allows null
- SQL Server 7.0 had 300 rows
- When the histogram is large, it increases the compilation time because histogram is not like an index
- EQ_ROWS – Equal rows for the index key
- Rows * All density = Avg no.of rows returns for that column or the combination of cols
- If the table is huge, the data distribution present in histogram is not quite accurate
- Step compression -> When building the histogram, if it finds the value approximately similar to each adjacent steps, then the algorithm will compress those steps and create 1 step
- RANGE_ROWS -> values between two HI_KEYs and excluding the HI_KEYs at both ends
- AVG_RANGE_ROWS = RANGE_ROWS / DISTINCT_RANGE_ROWS
Exec sp_autostats ‘<table name>’
Sys.stats -> shows all the stats including index stats
Sys.dm_db_stats_properties -> gives lot of details that you can use to update stats more effectively programmatically.
Histogram -> direct hit
DBCC SHOWSTATISTICS WITH HISTOGRAM
Stats will update on Index Rebuild but not on Index re-org.
Entire stats structure stored in db as a BLOB (Header, Density Vector and Histogram)
Partitioned table uses table level stats which has 200 steps by each by each partition
Online partition level index rebuild in SQL Server 2014, but when MS introduced partitioning? It was in SQL Server 2005. So MS took 9 years to get online partition level index rebuild
Tuple cardinality -> used to estimate distinct cardinality
Optimization | Compilation | Execution
- Local variable values are not known at optimization time
Parameters and literals can be sniffed -> uses histogram
Variables cannot be sniffed -> density vector
UDTATE STATS with any % will not be parallelized but FULL SCAN can be parallelized.
OPTION (QUERYTRACEON 3602, 9204, RECOMPILE)
Dynamic auto updating threshold
- For large tables to have 20% change , you need to wait long time to update stats.
SQL Server does not understand the correlation between columns
Calculation direction in query plan is from right to left.

CONCERNS AROUND THE HISTOGRAM

Problem is always with monster tables
Is 200 steps enough for such tables?
Even the table is partitioned, still the histogram is @ table level
Sp_recompile for a table is an expensive operation. It needs Sch-M lock on the table
Filtered stats – consider creating filtered stats, even daily basis to tackle and solve the estimates problems due to skewed data. So that you will get better execution plans
QUERYTRACEON (2353) -> Additional info.

SQL Server Torque

Friday, October 21, 2016

Short notes on Statistics

How to interpret Disk Latency

Search This Blog