This can happen for many reasons: missing indexes, out-of-date statistics, a query that returns a large number of records, and so on.
I observed similar behavior recently in a production environment. The stored procedure had code similar to the following:
SELECT Col3
FROM dbo.Tab1
WHERE Col1 = @Col1
AND Col2 = @Col2
AND RTRIM(LTRIM(REPLACE(REPLACE(REPLACE(REPLACE(CAST(Comments AS VARCHAR(2000)),'<br />',''),CHAR(10),''),CHAR(13),''),' ',''))) <> ''
Let me give you some background information about this table.
- Col2 and Col3 are UNIQUEIDENTIFIER columns
- The Comments column is ntext and has no indexes
- All three columns Col1, Col2, and Col3 have indexes
- The PK, which includes Col2, is the clustered index of the table
- The table has 825,452,256 records (825 million)
When the stored procedure was executed for the very first time, the application received timeouts, which prompted us to check on the database side what was happening. When I checked, I noticed SQL Server was executing a statistics-creation statement for the Comments column. (This column had never been used in a WHERE clause before.)
SELECT StatMan([SC0], [LC0], [SB0000])
FROM (SELECT TOP 100 PERCENT [SC0], [LC0], step_direction([SC0]) OVER (ORDER BY NULL) AS [SB0000]
      FROM (SELECT SUBSTRING([Comments], 1, 100) + SUBSTRING([Comments],
                   CASE WHEN DATALENGTH([Comments]) <= 400 THEN 101
                        ELSE DATALENGTH([Comments]) / 2 - 99 END,
                   DATALENGTH([Comments])) AS [SC0],
                   DATALENGTH([Comments]) AS [LC0]
            FROM [dbo].[Tab1] TABLESAMPLE SYSTEM (1.517414e-001 PERCENT)
            WITH (READUNCOMMITTED)) AS _MS_UPDSTATS_TBL_HELPER
      ORDER BY [SC0], [SB0000]) AS _MS_UPDSTATS_TBL
OPTION (MAXDOP 1)
Also, no plan had been created yet, which means that before it can build the plan, SQL Server needs statistics on the Comments column. That is why it was executing the statement above, and that is what slowed down the execution.
However, this behavior seems unnecessary to me: the Comments column is wrapped in many string functions, so SQL Server will use a CLUSTERED INDEX SEEK or SCAN anyway, depending on the number of rows the other predicates return.
I was able to bypass the statistics-creation step by creating the statistics manually with 0 rows sampled. I chose 0 rows because sampling this huge table would otherwise take a long time.
CREATE STATISTICS Comments_Stats ON Tab1 (Comments) WITH SAMPLE 0 ROWS, NORECOMPUTE;
After creating these statistics, SQL Server could build a plan for the stored procedure; it compiled quickly and ran as expected.
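To confirm that the manually created statistics are in place, and whether any auto-created statistics appear on the table later, you can query sys.stats. This is a minimal sketch using the table and statistics names from the example above; sys.dm_db_stats_properties requires SQL Server 2008 R2 SP2 or later:

```sql
-- List the statistics on dbo.Tab1, showing which were auto-created,
-- which are marked NORECOMPUTE, and how many rows were sampled.
SELECT s.name,
       s.auto_created,
       s.no_recompute,
       sp.last_updated,
       sp.rows_sampled
FROM sys.stats AS s
CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
WHERE s.object_id = OBJECT_ID('dbo.Tab1');
```

For the Comments_Stats entry created with SAMPLE 0 ROWS, you would expect no_recompute = 1 and rows_sampled = 0.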
Alon Zamir
DBA at QualiTest Group
When you see stored procedures taking a long time to execute, examine them with the Database Engine Tuning Advisor; it will give you recommendations you can use to improve performance. Save the recommendations to a file and apply only the most effective ones. Keep in mind that too many indexes, or indexes containing too many columns, will slow down data loads, so it is not a good idea to base your decisions on the recommendations for one stored procedure only. It is better to create a workload file that executes the most frequent and heaviest stored procedures and then analyze that workload with the Database Engine Tuning Advisor; the recommendations you get will then be the most relevant.

Statistics do not slow down loads, but they can consume a lot of disk space, so their number should also be considered. When you create indexes and statistics, use meaningful names so that you do not end up with many near-identical indexes, or several indexes containing exactly the same columns.

Partitioning tables and indexes can also improve performance, but it is not trivial: choose your partitioning criteria carefully. For further details on partitioning, see http://msdn.microsoft.com/en-us/library/ms190787.aspx.
Abdul Ghaffar
Offshore Database Administrator at Urban Lending Solutions (USA)
Execution plans are based on statistics. It is good practice, whenever a large object is added, imported, or amended, to create or update the statistics on that object. A SQL Agent job is the best way to update statistics during off-peak hours; large and fast-growing databases need this frequently.
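A minimal sketch of such an off-peak maintenance step; the table name dbo.Tab1 is taken from the original post, and the 5 percent sample rate is an arbitrary illustration you would tune for your own data:

```sql
-- Refresh statistics on one large table with a modest sample,
-- e.g. as a nightly SQL Agent job step.
UPDATE STATISTICS dbo.Tab1 WITH SAMPLE 5 PERCENT;

-- Or refresh every statistic in the current database that has
-- accumulated enough row modifications to need it.
EXEC sp_updatestats;
```

sp_updatestats only touches statistics it considers stale, so it is usually cheap enough to run regularly; the per-table UPDATE STATISTICS form gives finer control over sample size for very large tables.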
It seems that partitioning could be the solution in this case: consider partitioning the relevant table(s) according to the most frequently used WHERE conditions. For an example of how to partition an existing table, see http://sqlandme.com/2011/08/09/sql-server-how-to-partition-an-existing-table/.
Partitioning (if done effectively using optimal criteria, which you should test in your test environment) would cause the query to run only against part of the data (the relevant partitions only) instead of all of it, and the running time should decrease significantly. Apply only the partitioning schemes that reduce the running time significantly, and test with as large a data volume as possible.
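A minimal sketch of what such a setup could look like. The partitioning column (a hypothetical CreatedDate), the boundary values, and the table name Tab1_Partitioned are all assumptions for illustration, since the original post does not show a date column:

```sql
-- Partition function: yearly ranges on a date column.
CREATE PARTITION FUNCTION pf_ByYear (DATE)
AS RANGE RIGHT FOR VALUES ('2012-01-01', '2013-01-01', '2014-01-01');

-- Partition scheme: map every partition to the primary filegroup
-- (in practice, separate filegroups per partition are common).
CREATE PARTITION SCHEME ps_ByYear
AS PARTITION pf_ByYear ALL TO ([PRIMARY]);

-- A table created on the scheme is physically split by CreatedDate,
-- so queries filtering on that column touch only relevant partitions.
CREATE TABLE dbo.Tab1_Partitioned
(
    Col1        INT NOT NULL,
    Col2        UNIQUEIDENTIFIER NOT NULL,
    Col3        UNIQUEIDENTIFIER NOT NULL,
    CreatedDate DATE NOT NULL,          -- hypothetical partitioning column
    Comments    NVARCHAR(MAX) NULL
) ON ps_ByYear (CreatedDate);
```

Partition elimination only helps queries that actually filter on the partitioning column, which is why the choice of criteria matters so much.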
By Alon Zamir
Why don't you change the column in the WHERE clause and use a more suitable column with a better data type?
That way you can have an index applicable to your SQL statement.
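One hedged way to act on this advice, assuming the schema can be changed: materialize the expensive "comments are non-empty" test from the original WHERE clause as a persisted computed column and index it, so the REPLACE/TRIM chain is evaluated at write time instead of per query. HasRealComment and the index name are invented for illustration, and ntext cannot be used in a computed column, so this assumes Comments has first been migrated to NVARCHAR(MAX):

```sql
-- Persisted flag: 1 when Comments contains anything besides
-- <br /> tags, line breaks, and spaces; 0 otherwise.
ALTER TABLE dbo.Tab1
ADD HasRealComment AS
    (CASE WHEN RTRIM(LTRIM(REPLACE(REPLACE(REPLACE(REPLACE(
               CAST(Comments AS VARCHAR(2000)),
               '<br />', ''), CHAR(10), ''), CHAR(13), ''), ' ', ''))) <> ''
          THEN 1 ELSE 0 END) PERSISTED;

-- Cover the original query with a sargable index.
CREATE NONCLUSTERED INDEX IX_Tab1_Col1_Col2_HasRealComment
ON dbo.Tab1 (Col1, Col2, HasRealComment)
INCLUDE (Col3);

-- The predicate then becomes a simple equality test:
-- WHERE Col1 = @Col1 AND Col2 = @Col2 AND HasRealComment = 1
```

On a table with 825 million rows the ALTER itself would be a substantial operation, so this would need careful testing and an off-peak window.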
By ezeizu john
Ward Pond
Mentor, SQL Relational Engine at SolidQ
If you'd rather examine all of the data, consider the CONTAINS predicate with an appropriate full-text index on the Comments column.
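A minimal sketch of that approach. The catalog name and the unique key index name PK_Tab1 are assumptions (a full-text index requires a unique, single-column, non-nullable key index), and note that CONTAINS searches for terms rather than reproducing the original "non-empty" test:

```sql
-- One-time setup: a full-text catalog plus an index on Comments.
CREATE FULLTEXT CATALOG ftc_Tab1;

CREATE FULLTEXT INDEX ON dbo.Tab1 (Comments)
KEY INDEX PK_Tab1
ON ftc_Tab1;

-- The filter can then use CONTAINS instead of the REPLACE chain.
SELECT Col3
FROM dbo.Tab1
WHERE Col1 = @Col1
  AND Col2 = @Col2
  AND CONTAINS(Comments, '"some search term"');
```

Full-text indexing of an 825-million-row table is itself a significant population job, so the initial crawl should be scheduled off-peak.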
Omkaar Sawant
SQL Server DBA at Bluestem Brands, Inc.
Susantha Bathige, please check the NOLOCK and index hints for all the tables used in the joins. If the hints are not specified, specify them appropriately.
After that, you can check for missing indexes using the DMVs, and add them based on the improvement measure and user impact. Also, only add indexes to tables with more than 9,000 rows, as indexing small tables doesn't always improve performance.
If you want, I can forward you my DMV code that lists missing, duplicate, and redundant indexes.
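A hedged sketch of such a missing-index query, built on the standard sys.dm_db_missing_index_* DMVs; the "improvement measure" formula here is the commonly used heuristic, not an official metric, and these suggestions are reset on every instance restart:

```sql
-- Rank missing-index suggestions recorded since the last restart.
SELECT TOP (25)
       mid.statement AS table_name,
       mid.equality_columns,
       mid.inequality_columns,
       mid.included_columns,
       migs.user_seeks,
       migs.avg_user_impact,
       migs.avg_total_user_cost * migs.avg_user_impact
           * (migs.user_seeks + migs.user_scans) AS improvement_measure
FROM sys.dm_db_missing_index_details AS mid
JOIN sys.dm_db_missing_index_groups AS mig
     ON mig.index_handle = mid.index_handle
JOIN sys.dm_db_missing_index_group_stats AS migs
     ON migs.group_handle = mig.index_group_handle
ORDER BY improvement_measure DESC;
```

As with any DMV-based suggestion list, the output should be reviewed against existing indexes before anything is created, since the DMVs do not detect duplicates or overlaps.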
Ward Pond
Mentor, SQL Relational Engine at SolidQ
Omkaar Sawant, I agree with everything you say, but that WHERE clause in Susantha Bathige's original post is likely also the primary concern. As a general rule in a data retrieval query like Susantha's, NOLOCK is going to be more of a concurrency issue than a single query performance issue. The WHERE clause is the only semantic outlier here IMHO.
I think we're both right. :)
Performance tuning often involves making multiple adjustments. Remember to make changes one at a time, measuring the results each time. Good performance tuning often results in code and even schema that is unrecognizable from its starting state.
Good luck!