Monday, January 31, 2011

Query Tuning - At the Grassroots Level


Query Tuning Steps

Most DBAs who are new to query tuning wonder where to start when asked to tune a query. I hope this article guides them through the steps to begin with.


Step 1:
Run the query in SQL Server Management Studio (SSMS) and view the actual execution plan. To include the actual execution plan, press Ctrl+M and then execute the query.
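
Alongside the graphical plan, the I/O and time statistics give numbers you can compare before and after each tuning change. The following is a minimal sketch, assuming the AdventureWorks sample database; the SELECT is only a placeholder for the query you are actually tuning.

```sql
USE AdventureWorks;
GO
-- Report logical/physical reads and CPU/elapsed time in the Messages tab.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;
GO
-- Placeholder query: replace with the query being tuned.
SELECT AddressID, City, PostalCode
FROM Person.Address
WHERE City = 'Seattle';
GO
-- Switch the extra output off again for this session.
SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
GO
```

The "logical reads" figure in the output is a handy baseline for the later steps.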

Step 2:
In the Execution plan tab next to Results, check whether the plan contains any Table Scan, Clustered Index Scan or Index Scan operators. If it does, analyze the tables involved in those scans thoroughly.

Step 3:
Find the actual number of rows in each table that is being scanned. If the table is reasonably large, say more than 2000 rows, I would suggest checking whether it has proper indexes. If the table has fewer than 2000 rows, a table scan is not a problem, and I would actually prefer a table scan on such tables.
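
One possible way to get both pieces of information, the row count and the existing indexes, is to read the catalog views. This is a sketch that assumes the "Person.Address" table in AdventureWorks; substitute the table that shows the scan in your plan.

```sql
-- Approximate row count from metadata (does not scan the table).
SELECT SUM(p.rows) AS row_count
FROM sys.partitions AS p
WHERE p.object_id = OBJECT_ID('Person.Address')
  AND p.index_id IN (0, 1);   -- 0 = heap, 1 = clustered index

-- Existing indexes with their key columns and included columns.
SELECT i.name AS index_name, i.type_desc, c.name AS column_name,
       ic.key_ordinal, ic.is_included_column
FROM sys.indexes AS i
JOIN sys.index_columns AS ic
  ON ic.object_id = i.object_id AND ic.index_id = i.index_id
JOIN sys.columns AS c
  ON c.object_id = ic.object_id AND c.column_id = ic.column_id
WHERE i.object_id = OBJECT_ID('Person.Address')
ORDER BY i.index_id, ic.key_ordinal;
```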

Step 4:
If an index already exists, analyze why the optimizer preferred a Clustered Index Scan or an Index Scan over a Seek. Possible reasons include fragmentation, outdated statistics, low selectivity of the predicate, or simply a lower estimated cost for the scan.

Step 5:
The following query reports the percentage of fragmentation for each index on a particular table. Here it shows the fragmentation status of the table “Person.Address” in the AdventureWorks database. Run it in the context of that database, since OBJECT_ID and OBJECT_NAME resolve names in the current database.

SELECT CAST(DB_NAME(database_id) AS varchar(20)) AS [Database Name],
CAST(OBJECT_NAME(object_id) AS varchar(20)) AS [TABLE NAME], Index_id, Index_type_desc, Avg_fragmentation_in_percent, Avg_page_space_used_in_percent
FROM sys.dm_db_index_physical_stats(DB_ID('AdventureWorks'),OBJECT_ID('person.address'),NULL,NULL,'Detailed')

If avg_fragmentation_in_percent is greater than 40%, rebuild the index (using the ALTER INDEX ... REBUILD command) to eliminate the fragmentation. It’s recommended to schedule a weekly index rebuild job for all the tables. Please NOTE that rebuilding an index is an expensive operation, so ensure that it’s done only during off-production hours.
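
As a rough illustration of the rebuild command, assuming the same "Person.Address" table (ALL targets every index on the table; name a single index instead if only one is fragmented):

```sql
-- Rebuild every index on the table (expensive; offline by default).
ALTER INDEX ALL ON Person.Address REBUILD;

-- Lighter alternative for moderate fragmentation; runs as an online operation.
ALTER INDEX ALL ON Person.Address REORGANIZE;
```

REORGANIZE is not mentioned in the steps above; it is shown here only as an option to consider when a full rebuild is too costly.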


Step 6:
If the indexes are fine, then check the statistics. Sometimes the indexes are fine but the query is still slow because outdated statistics prevent the optimizer from choosing the right index. The following query shows when the statistics on a table were last updated.

SELECT Name AS Stats_Name, STATS_DATE(object_id, stats_id) AS Statistics_update_date
FROM sys.stats
WHERE object_id=OBJECT_ID('person.address')

The statistics should be updated weekly, daily or on alternate days, depending on how often the table is modified. The more frequently the table is modified, the more frequently the statistics should be updated. For highly transactional tables you can schedule a job to update the statistics on a regular basis.
Please NOTE that rebuilding an index automatically updates the statistics on that index as well. Hence avoid updating those statistics separately if you are rebuilding the index.
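
A minimal sketch of a manual statistics update, again assuming the "Person.Address" table; whether FULLSCAN is affordable depends on the table size and the maintenance window.

```sql
-- Update all statistics on one table, reading every row.
UPDATE STATISTICS Person.Address WITH FULLSCAN;

-- Or update the statistics that need it across the current database
-- (uses the default sampling rather than a full scan).
EXEC sp_updatestats;
```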



Step 7:
If you see any Key Lookups in the execution plan, use included columns to create a covering nonclustered index and avoid the expensive lookup operation. This improves query performance because the number of logical reads drops considerably.
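
For illustration, suppose a query filters on City and also returns PostalCode from "Person.Address". A covering index for it might look like the sketch below; the index name is hypothetical and the column choice is an assumption about the query, not a recommendation for the sample database.

```sql
-- Query being covered (assumed for this example):
--   SELECT City, PostalCode FROM Person.Address WHERE City = 'Seattle';

CREATE NONCLUSTERED INDEX IX_Address_City_Covering   -- hypothetical name
ON Person.Address (City)        -- key column used by the WHERE clause
INCLUDE (PostalCode);           -- extra column stored at the leaf level
```

Because PostalCode is stored in the index leaf pages, the query can be answered from the index alone, without a lookup into the clustered index.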

Step 8:
Ensure that each table has a clustered index, preferably on the primary key columns (a primary key creates one by default unless you explicitly specify NONCLUSTERED) or on an identity column. Ideally the clustered index should be defined on columns with unique values, such as a primary key or an identity column.
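
One way to find tables that have no clustered index (heaps) in the current database is sketched below, using the catalog views.

```sql
-- A row in sys.indexes with index_id = 0 means the table is a heap.
SELECT SCHEMA_NAME(t.schema_id) AS table_schema,
       t.name AS table_name
FROM sys.tables AS t
JOIN sys.indexes AS i
  ON i.object_id = t.object_id
WHERE i.index_id = 0
ORDER BY table_schema, table_name;
```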

Step 9:
If you have a composite index, make sure the most selective column (the one with the most distinct values) is the leading column of the index.
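
A quick way to compare the selectivity of candidate columns is to count their distinct values. The sketch below uses three columns of "Person.Address" purely as an example.

```sql
-- The closer a distinct count is to total_rows, the more selective the column.
SELECT COUNT(*) AS total_rows,
       COUNT(DISTINCT City) AS distinct_city,
       COUNT(DISTINCT StateProvinceID) AS distinct_state,
       COUNT(DISTINCT PostalCode) AS distinct_postal_code
FROM Person.Address;
```

Keep in mind that the leading column should also be one that the query actually filters on; otherwise the index cannot be used for a seek.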

Step 10:
If you cannot tune the query any further, or if you are out of ideas, try the Database Engine Tuning Advisor (DTA). Provide the SQL query as the input file and run the DTA. It will produce a list of recommendations to reduce the query cost.

Please do NOT blindly implement the suggestions. Doing so may well improve the query performance, but you would end up with numerous indexes that are difficult to manage during maintenance operations. You have to make the call on each index suggested by the DTA: check whether the index will be used in most cases, or whether you can rewrite the query to make use of the existing indexes.

Step 11:
While tuning stored procedures, you need to ensure that the query plan for each stored procedure is cached. The following query returns the caching information for stored procedures.

SELECT usecounts, cacheobjtype, objtype, [text]
FROM sys.dm_exec_cached_plans P
CROSS APPLY sys.dm_exec_sql_text(plan_handle) S
WHERE cacheobjtype = 'Compiled Plan' AND objtype='Proc'
AND [text] NOT LIKE '%dm_exec_cached_plans%'
AND S.dbid = DB_ID('AdventureWorks')
--REPLACE 'AdventureWorks' WITH THE RESPECTIVE DATABASE NAME, OR HARD-CODE THE DATABASE ID (USE SP_HELPDB TO GET THE DBID)


The value of usecounts increases every time you run the same stored procedure. If there is a problem with caching, check whether the procedure changes any SET options, as many of them cause the query plan to be recompiled. The plan is also flushed every time you run DBCC FREEPROCCACHE or DBCC FLUSHPROCINDB. Avoid running either of them in a production environment: they remove the cached plans for all procedures (server-wide or for the whole database, respectively), and every stored procedure has to be recompiled the next time it runs.

If you suspect a problem with the cached query plan, you can try the WITH RECOMPILE option, which recompiles the stored procedure every time it runs, and see how the performance compares.


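CREATE PROC Test
WITH RECOMPILE -- no plan is cached; a fresh plan is compiled on every execution
AS
BEGIN
    -- Placeholders: replace with the actual statements of the procedure
    SELECT 1 AS Statement1;
    SELECT 2 AS Statement2;
END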
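
If you would rather not change the procedure definition, two less intrusive options exist. The sketch below assumes the procedure is named Test, as in the example above.

```sql
-- One-off: compile a fresh plan for this execution only.
EXEC Test WITH RECOMPILE;

-- Mark the procedure so that it is recompiled the next time it runs.
EXEC sp_recompile N'Test';
```

Both leave the normal plan caching in place for later executions, which makes them convenient for testing whether a stale plan is the problem.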

Step 12:
Finally, if everything above checks out and the query still cannot be tuned, try to rewrite it. In a few cases, such as the ones below, it is obvious as soon as you look at the query that it needs to be rewritten:

  • Creating a view with TOP 100 PERCENT in order to include an ORDER BY clause in the view definition. The result of the view is not guaranteed to be sorted unless we sort it explicitly when querying the view:
Select * from view order by column1 --Result will be sorted
Select * from view --Result will NOT be sorted
Thus the ORDER BY in the view definition gives no ordering guarantee and may add an unnecessary sort. Hence we should avoid ORDER BY in the view definition and instead use Select * from view order by column1
  • Using correlated subqueries can cause RBAR (Row By Agonizing Row) processing and hurt performance.
  • Avoid using scalar functions in SELECT statements and use an inline table-valued function instead. Since a scalar function behaves like a cursor, being invoked once per row, we should avoid referencing it in the SELECT statement.
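
The last two bullets can be illustrated with a rough sketch. It assumes the Sales.Customer and Sales.SalesOrderHeader tables of AdventureWorks, and the function name is hypothetical. The optimizer sometimes flattens a correlated subquery on its own, so compare the plans rather than assuming a gain.

```sql
-- Correlated subquery: the inner query is logically evaluated per outer row.
SELECT c.CustomerID,
       (SELECT SUM(h.TotalDue)
        FROM Sales.SalesOrderHeader AS h
        WHERE h.CustomerID = c.CustomerID) AS total_due
FROM Sales.Customer AS c;

-- Set-based rewrite: aggregate once, then join.
SELECT c.CustomerID, t.total_due
FROM Sales.Customer AS c
LEFT JOIN (SELECT CustomerID, SUM(TotalDue) AS total_due
           FROM Sales.SalesOrderHeader
           GROUP BY CustomerID) AS t
  ON t.CustomerID = c.CustomerID;
GO

-- Inline table-valued function in place of a scalar function.
CREATE FUNCTION dbo.ufn_CustomerTotalDue (@CustomerID int)   -- hypothetical name
RETURNS TABLE
AS
RETURN
(
    SELECT SUM(TotalDue) AS total_due
    FROM Sales.SalesOrderHeader
    WHERE CustomerID = @CustomerID
);
GO

-- The function body is expanded into the calling query's plan.
SELECT c.CustomerID, f.total_due
FROM Sales.Customer AS c
CROSS APPLY dbo.ufn_CustomerTotalDue(c.CustomerID) AS f;
```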
