THE DBA DELIGHT: January 2011

Monday, January 31, 2011

Query Tuning - At grassroot level

Query Tuning Steps

Most of the DBA’s new to query tuning would wonder where to start in order to tune a query. I hope this article would probably guide them in understanding the steps to begin with.

Step 1:

Run the query in Management Studio and view the actual execution plan. To view the execution plan, press Ctrl+M and then execute the query in SSMS.

Step 2:

In Execition plan next to results ,Check if there are any table scans or Clustered index scan or Index scan involved in the execution plan. If yes, then you should analyze that table’s info thoroughly in the execution.

Step 3:

Identify the actual rows in the table where there is scan involved. If the table is slightly larger i.e. greater than 2000 rows I would suggest you to check if there are proper indexes in the table. If the table has less than 2000 records table scan wouldn’t be a problem and I would rather prefer a table scan on those tables.

Step 4:

If there is already an index you have to analyze why the optimizer preferred a Clustered index scan or an Index scan rather than Seeks. The reason may be due to fragmentation or outdated statistics or due to the least selectivity or the query cost.

Step 5:

The following query will give the exact % of fragmentation in the indexes for a particular table. The below query will display the fragmentation status in the table “Person.Address” in Adventureworks database.

SELECT CAST(DB_NAME(database_id) AS varchar(20)) AS [Database Name],

CAST(OBJECT_NAME(object_id) AS varchar(20)) AS [TABLE NAME], Index_id, Index_type_desc, Avg_fragmentation_in_percent, Avg_page_space_used_in_percent

FROM sys.dm_db_index_physical_stats(DB_ID('AdventureWorks'),OBJECT_ID('person.address'),NULL,NULL,'Detailed')

If the avg_fragmentation_in_percent is > 40% rebuild the index (using Alter index rebuild command) to eliminate fragmentation. It’s recommended to have a rebuild index job for all the tables scheduled to run on a weekly basis. Please NOTE that rebuilding an index is an expensive operation and ensure that it’s done only during OFF-Production hours.

Step 6:

If the indexes are fine, then check the statistics. Sometimes the index will be fine but the query would still continue to be slow since the optimizer wouldn’t be able to use the correct indexes due to outdated statistics. The following query gives the last time when the statistics for an index was last updated.

SELECT Name AS Stats_Name, STATS_DATE(object_id, stats_id) AS Statistics_update_date

FROM sys.stats

WHERE object_id=OBJECT_ID('person.address')

The statistics should be updated either weekly or daily or on alternate days depending on the frequency of modifications in the table. The more frequent the table is modified the more frequent the statistics should be updated. Sometimes for high transactional tables you can schedule a job to update the statistics on a regular basis.

Please NOTE that rebuilding the index will automatically update the statistics as well. Hence avoid updating the statistics if you are rebuilding the index.

Step 7:

If you see any key lookups happening in the execution plan, make use of Included columns to create a covering Nonclustered index to avoid expensive lookup operation. This will help in improving the query performance as the logical reads would be reduced to a great extent.

Step8:

Ensure that each table has a clustered index preferably on primary key columns (by default there is one unless you explicitly mention Nonclustered) or on Identity columns. The clustered index should always be defined on unique valued columns like primary keys or identity.

Step9:

If you have a composite index, ensure to have the most selective field (the ones which have unique values) as the leading column in the index.

Step10:

If you couldn’t tune the query further or if you are clueless, try to use Database Tuning Advisor (DTA). Provide the SQL query as input file and run the DTA. It will provide a list of recommendations to reduce the query cost.

Please do NOT blindly implement the suggestions doing so would certainly improve the query performance but you would end up creating numerous indexes which will be difficult to maintain during maintenance operations. You have to take the call of creating indexes as suggested by DTA, check whether the index will be used in most cases or if you can rewrite the query to make use of the existing indexes.

Step11:

While tuning stored procedures you need to ensure that the query plan for stored procedures is cached. The following query will help in providing the caching info for the stored procedures.

SELECT usecounts, cacheobjtype, objtype, [text]

FROM sys.dm_exec_cached_plans P

CROSS APPLY sys.dm_exec_sql_text(plan_handle) S

WHERE cacheobjtype = 'Compiled Plan' AND objtype='Proc'

AND [text] NOT LIKE '%dm_exec_cached_plans%'

AND S.DBID=11

--MENTION THE DATABASE ID FOR THE RESPECTIVE DATABASE (USE SP_HELPDB TO GET THE DBID)

The value of usecounts will increase every time you run the same stored procedure.If there is a problem in caching check if there is any SET options as most of them will cause a recompile in query plan. Also the plan will be flushed out every time you run DBCC Freeproccache or DBCC FlushprocinDB. Never use both of them in production environment as it will remove the cache for all the procedures and they (SP) will have to be recompiled the next time they are run.

If you suspect there might be some problem in the query plan, you can try to use WITH RECOMPILE option which will recompile the particular stored procedure every time it runs and see how the performance is.

CREATE PROC Test

WITH RECOMPILE

Statement 1

Statement 2

Step12:

Finally if all the above options are fine and the query couldn’t be tuned, try to rewrite the query. In few cases as soon as you view the query such as the ones below we need to rewrite the query:

Creating a view with TOP 100% in order to include the ORDERBY clause in view definition where the view will not be sorted unless we explicitly sort the view by issuing

Select * from view order by column1 –Result will be sorted

Select * from view – Result will NOT be sorted

Thus there is a extra cost involved in sorting by using the ORDER BY clause in view definition even though the result is NOT sorted. Hence we should avoid ORDER BY in view definition and instead use it as Select * from view order by column1

Using correlated sub queries will cause RBAR – Row by agonizing Row and will affect the performance.
Avoid using Scalar functions in select statements and instead use Inline or Table valued function. Since Scalar function behaves like a cursor we need to avoid it being referenced in the Select statement

Changing server collation

How to change server collation

The most clear and easy way is to completely reinstall the sql server instance. In sql server 2000 and earlier versions there is rebuild master utility - rebuildm.exe which has very clear interface but there are many cases when this utility hangs and you lose your master database forever.

Starting from sql server 2005 rebuild master utility is integrated into installation package. So you need to load your installation drive and run from command line the next command

setup.exe /q /ACTION=RebuildDatabase /INSTANCENAME=MSSQLSERVER /SAPWD="password"

/SQLSYSADMINACCOUNTS="BUILTIN\ADMINISTRATORS" /SqlCollation=Latin1_General_CI_AS

(type this command in a single line in your cmd console)

Adjust this params for your needs - INSTANCENAME - put the instance name you want to change collation of. INSTANCENAME=MSSQLSERVER - default instance. SAPWD - new sa password, SqlCollation - new sql server instance collation. This query may be used to rebuild master of sql server 2005 or to rebuild master of sql server 2008

Consider that after rebuild you will have a fresh sql server instance. You will need to attach or restore your databases, recreate logins and so on.

How to change database collation

To change the database collation you can use alter database operator. For instance to change collation of previously created database collationtest you can run the script

alter database collationtest collate Cyrillic_General_CI_AS

Script that returns all your databases' collations

select name, collation_name

from sys.databases

Query that returns your sql server instance's collation

select SERVERPROPERTY('collation')

Saturday, January 29, 2011

Script to find the job details and timings

/* script to find the server jobs and their details
This script is for SQL server 2005
Author: Arun S Prasad
Mail    : arprasaad@gmail.com
Developed for thedbadelight.blogspot.com
*/

set nocount on
create table #joblisting(jobname varchar(100), enabled varchar(10), description varchar(100), frequency varchar(200),lastrundate varchar(100),Lastrun_status varchar(100))

declare @name varchar(100)
declare @enabled varchar(100)
declare @description varchar(100)
declare @freq_type varchar(100)
declare @freq_interval varchar(100)
declare @freq_subday_type varchar(100)
declare @freq_subday_interval varchar(100)
declare @freq_relative_interval varchar(100)
declare @freq_recurrence_factor varchar(100)
declare @active_start_date varchar(100)
declare @active_end_date varchar(100)
declare @active_start_time varchar(100)
declare @active_end_time varchar(100)
declare @schedule_description varchar(8000)
declare @rundate varchar(100)
declare @status varchar(10)
declare @jobid varchar(100)

DECLARE job_cursor CURSOR
FOR
select a.[name],
a.enabled,
a.description,
c.freq_type,
c.freq_interval,
c.freq_subday_type,
c.freq_subday_interval,
c.freq_relative_interval,
c.freq_recurrence_factor,
c.active_start_date,
c.active_end_date,
c.active_start_time,
c.active_end_time
from msdb..sysjobs a,
msdb..sysjobschedules b,
msdb..sysschedules c
where a.job_id=b.job_id and b.schedule_id = c.schedule_id

OPEN job_cursor

FETCH NEXT FROM job_cursor
INTO   @name,
@enabled,
@description,
@freq_type,
@freq_interval,
@freq_subday_type,
@freq_subday_interval,
@freq_relative_interval,
@freq_recurrence_factor,
@active_start_date,
@active_end_date,
@active_start_time,
@active_end_time

while @@fetch_status = 0
begin

EXEC msdb.dbo.sp_get_schedule_description
@freq_type = @freq_type,
@freq_interval = @freq_interval,
@freq_subday_type = @freq_subday_type,
@freq_subday_interval = @freq_subday_interval,
@freq_relative_interval = @freq_relative_interval,
@freq_recurrence_factor = @freq_recurrence_factor,
@active_start_date = @active_start_date,
@active_end_date = @active_end_date,
@active_start_time = @active_start_time,
@active_end_time = @active_end_time,
@schedule_description = @schedule_description out

select @jobid = job_id from sysjobs where name = @name
select @status=run_status, @rundate=convert(varchar(30),max(run_date))+' '+convert(varchar(30),max(run_time)) from sysjobhistory where job_id = @jobid group by run_status

insert into #joblisting values
(@name,@enabled,@description,@schedule_description,@rundate,case @status when 1 then 'Success' else 'Failure' end)

FETCH NEXT FROM job_cursor
INTO   @name,
@enabled,
@description,
@freq_type,
@freq_interval,
@freq_subday_type,
@freq_subday_interval,
@freq_relative_interval,
@freq_recurrence_factor,
@active_start_date,
@active_end_date,
@active_start_time,
@active_end_time

end

close job_cursor
deallocate job_cursor
--change line number 108 as select * from #joblisting to get the complete detail of the job. Only our requirement is met by this script.
select jobname as JOBNAME,frequency as SCHEDULE from #joblisting

drop table #joblisting

Monday, January 10, 2011

Dynamic Management Views and Functions

Dynamic management views and functions return server state information that can be used to monitor the health of a server instance, diagnose problems, and tune performance.

Dynamic management views and functions return internal, implementation-specific state data. Their schemas and the data they return may change in future releases of SQL Server. Therefore, dynamic management views and functions in future releases may not be compatible with the dynamic management views and functions in SQL Server 2008.

There are two types of dynamic management views and functions:

Server-scoped dynamic management views and functions. These require VIEW SERVER STATE permission on the server.
Database-scoped dynamic management views and functions. These require VIEW DATABASE STATE permission on the database.

Most Important DMVs

1)  sys.dm_exec_sessions and sys.dm_exec_requests. These two combine effectively as the replacement of sysprocesses

2)  sys.dm_exec_query_stats. Current performance statistics for cached objects and queries

3)  sys.dm_os_wait_stats. Find out where the waits are occuring in the engine. Replaces DBCC SQLPERF('WAITSTATS')

4)  sys.dm_db_missing_index_details. See what indexes could have helped performance of previous queries. A must to look at for any

   server that is having performance problems

5)  sys.dm_exec_sql_text. A dynamic management function to get the text of a SQL query based on its SQL handle

6)  sys.dm_os_ring_buffers  - for advanced use 

TIP:

To “join” the use of a function like sys.dm_exec_sql_text you need to use the CROSS APPLY operator.

For example, here is a query to get the “input buffer” for sessions found in sys.dm_exec_requests:

Select session_id, s2.text from sys.dm_exec_requests

Cross apply sys.dm_exec_sql_text(sql_handle) s2

Don’t run this query in the context of a database in a compatibility mode < 90. It will fail because the APPLY operator is new to SQL 2005

The following query allows you to identify what SQL statements are currently executing. This information can be useful in debugging the cause of both long running queries and blocking, and also showing the execution progress of a stored procedure or batch of SQL.

SELECT [Spid] = session_Id

            , ecid

            , [Database] = DB_NAME(sp.dbid)

            , [User] = nt_username

            , [Status] = er.status

            , [Wait] = wait_type

            , [Individual Query] = SUBSTRING (qt.text, er.statement_start_offset/2, 

                   (CASE WHEN er.statement_end_offset = -1 

                        THEN LEN(CONVERT(NVARCHAR(MAX), qt.text)) * 2 

                    ELSE er.statement_end_offset END - er.statement_start_offset)/2)

            ,[Parent Query] = qt.text

            , Program = program_name

            , Hostname

            , nt_domain

            , start_time

      FROM sys.dm_exec_requests er 

      INNER JOIN sys.sysprocesses sp ON er.session_id = sp.spid

      CROSS APPLY sys.dm_exec_sql_text(er.sql_handle) as qt

      WHERE session_Id > 50               -- Ignore system spids. 

      AND session_Id NOT IN (@@SPID)      -- Ignore this current statement.

      ORDER BY 1, 2

Column name	Data type	Description
spid	smallint	SQL Server process ID.
ecid	smallint	Execution context ID used to uniquely identify the subthreads operating on behalf of a single process.
dbid	smallint	ID of the database currently being used by the process.
nt_username	nchar(128)	Windows user name for the process, if using Windows Authentication, or a trusted connection.
status	nchar(30)	Process ID status. For example, running and sleeping.
wait_type	bigint	Current wait time in milliseconds.
Individual Query	varchar	SQL Statement currently running.
Parent Query	varchar	Routine that contains the Individual Query.
program_name	nchar(128)	Name of the application program.
Hostname	nchar(128)	Name of the workstation.
nt_domain	nchar(128)	Microsoft Windows domain for the client, if using Windows Authentication, or a trusted connection.
Start_time	datetime	Time when the request is scheduled to run.

Yet we often hear confusion about their use.  People ask, " there are so many of them—which are the most important ones?  Where do I begin?"

You can begin right here.

I have drafted a team of five first-round picks which comprise DMV All-Stars Dream Team. They are simple to use & provide a rich source of useful information.

The chosen queries provide information on the following topics:

·         Expensive Queries (CPU, reads, frequency, etc.)

·         Wait Stats

·         Virtual File Stats (including calculations for virtual file latency)

·         Plan Cache 

·         Blocking (real-time)

I have provided backward-compatible versions for three of them. The first has no equivalent in SQL Server 2000.  The fifth has an equivalent, but would require a big post all its own.

These are among the most frequent DMVs I use—that's why I put them in one place.  I hope you find them helpful as well.  There’s not much unique in what I’ve provided (though I do have a surprise for a forthcoming post).  In fact, you're likely to recognize all of them.  Some of them I've liberated right out of BOL (2005 2008).  (And sh-h-h-h—don't tell anyone but I've copied-&-pasted one directly from the SQL Server Customer Advisory Team's (SQL CAT) Best Practices Toolbox).  Yet, the point is, ladies & gents, goyles & geeks, you have here in one place many of the DMVs you’ll need for much of your routine work.

There were many worthy queries which I have not included.  Perhaps someday I’ll draft a team of bench warmers which might include, e.g., DMVs for fragmentation.  If you don’t like my picks, nominate DMVs for your own  team.

--<<<<<<<<<<----------------------------------------------------------------->>>>>>>>>>--

--DMV_All-Stars.sql

--Table of Contents

--1. expensive queries

--2. wait stats

--3. virtual file stats (& virtual file latency)

--4. plan cache interrogation

--5. real-time blockers

--<<<<<<<<<<----------------------------------------------------------------->>>>>>>>>>--

--Weasel Clause: This script is provided "AS IS" with no warranties, and confers no rights. 

--  Use of included script samples are subject to the terms specified at 

--  http://www.microsoft.com/info/cpyright.htm

--<<<<<<<<<<----------------------------------------------------------------->>>>>>>>>>--

--1. expensive queries

    --text *and* statement

    --usage: modify WHERE & ORDER BY clauses to suit circumstances

SELECT TOP 25

      -- the following four columns are NULL for ad hoc and prepared batches

      DB_Name(qp.dbid) as dbname , qp.dbid , qp.objectid , qp.number 

    --, qp.query_plan --the query plan can be *very* useful; enable if desired

    , qt.text 

    , SUBSTRING(qt.text, (qs.statement_start_offset/2) + 1,

        ((CASE statement_end_offset 

            WHEN -1 THEN DATALENGTH(qt.text)

            ELSE qs.statement_end_offset END 

                - qs.statement_start_offset)/2) + 1) as statement_text

    , qs.creation_time , qs.last_execution_time , qs.execution_count 

    , qs.total_worker_time    / qs.execution_count as avg_worker_time

    , qs.total_physical_reads / qs.execution_count as avg_physical_reads 

    , qs.total_logical_reads  / qs.execution_count as avg_logical_reads 

    , qs.total_logical_writes / qs.execution_count as avg_logical_writes 

    , qs.total_elapsed_time   / qs.execution_count as avg_elapsed_time 

    , qs.total_clr_time       / qs.execution_count as avg_clr_time

    , qs.total_worker_time , qs.last_worker_time , qs.min_worker_time , qs.max_worker_time 

    , qs.total_physical_reads , qs.last_physical_reads , qs.min_physical_reads , qs.max_physical_reads 

    , qs.total_logical_reads , qs.last_logical_reads , qs.min_logical_reads , qs.max_logical_reads 

    , qs.total_logical_writes , qs.last_logical_writes , qs.min_logical_writes , qs.max_logical_writes 

    , qs.total_elapsed_time , qs.last_elapsed_time , qs.min_elapsed_time , qs.max_elapsed_time

    , qs.total_clr_time , qs.last_clr_time , qs.min_clr_time , qs.max_clr_time 

    --, qs.sql_handle , qs.statement_start_offset , qs.statement_end_offset 

    , qs.plan_generation_num  -- , qp.encrypted 

    FROM sys.dm_exec_query_stats as qs 

    CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) as qp

    CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) as qt

    --WHERE...

    --ORDER BY qs.execution_count      DESC  --Frequency

      ORDER BY qs.total_worker_time    DESC  --CPU

    --ORDER BY qs.total_elapsed_time   DESC  --Durn

    --ORDER BY qs.total_logical_reads  DESC  --Reads 

    --ORDER BY qs.total_logical_writes DESC  --Writes

    --ORDER BY qs.total_physical_reads DESC  --PhysicalReads    

    --ORDER BY avg_worker_time         DESC  --AvgCPU

    --ORDER BY avg_elapsed_time        DESC  --AvgDurn     

    --ORDER BY avg_logical_reads       DESC  --AvgReads

    --ORDER BY avg_logical_writes      DESC  --AvgWrites

    --ORDER BY avg_physical_reads      DESC  --AvgPhysicalReads

    --sample WHERE clauses

    --WHERE last_execution_time > '20070507 15:00'

    --WHERE execution_count = 1

    --  WHERE SUBSTRING(qt.text, (qs.statement_start_offset/2) + 1,

    --    ((CASE statement_end_offset 

    --        WHEN -1 THEN DATALENGTH(qt.text)

    --        ELSE qs.statement_end_offset END 

    --            - qs.statement_start_offset)/2) + 1)

    --      LIKE '%MyText%'

--<<<<<<<<<<----------------------------------------------------------------->>>>>>>>>>--

--2. wait stats

--2000

--DBCC sqlperf(waitstats,CLEAR)               --re-initialize waitstats

DBCC sqlperf(waitstats)

--2005

--DBCC sqlperf('sys.dm_os_wait_stats',CLEAR)  --re-initialize waitstats

SELECT * , (wait_time_ms - signal_wait_time_ms) as resource_wait_time_ms

    FROM sys.dm_os_wait_stats

    ORDER BY resource_wait_time_ms DESC 

    --ORDER BY wait_time_ms DESC

    --ORDER BY signal_wait_time_ms DESC

    --ORDER BY waiting_tasks_count DESC

    --ORDER BY max_wait_time_ms DESC

--<<<<<<<<<<----------------------------------------------------------------->>>>>>>>>>--

--3. virtual file stats

--2000

SELECT DB_NAME(DBID) as DbName 

    , (IoStallMS/(NumberReads + NumberWrites)) as IsStall

    , * FROM ::fn_virtualfilestats(-1,-1)

--2005

SELECT 

    -- virtual file latency calcs

    IsReadStall

        = CASE WHEN num_of_reads = 0 

               THEN 0 ELSE (io_stall_read_ms/num_of_reads) END

  , IsWriteStall

       = CASE WHEN io_stall_write_ms = 0 

              THEN 0 ELSE (io_stall_write_ms/num_of_writes) END

  , IsStall

      = CASE WHEN (num_of_reads = 0 AND num_of_writes = 0)

             THEN 0 ELSE (io_stall/(num_of_reads + num_of_writes)) END

  , LEFT(mf.physical_name,2) as Drive

  , DB_NAME(vfs.database_id) as DB

  --, mf.name AS FileName

  , vfs.*

  , mf.physical_name

  FROM sys.dm_io_virtual_file_stats(NULL,NULL) as vfs

    JOIN sys.master_files as mf ON vfs.database_id = mf.database_id AND vfs.file_id = mf.file_id

  --WHERE vfs.file_id = 2 -- log files

  ORDER BY IsStall DESC

  -- ORDER BY IsReadStall DESC

  -- ORDER BY IsWriteStall DESC

--<<<<<<<<<<----------------------------------------------------------------->>>>>>>>>>--

--4. plan cache interrogation

-- note: sys.dm_exec_cached_plans is diminutive version of syscacheobjects

    -- no dbid, setopts

-- we want reusable code, absence of ad hoc SQL

-- we want relatively few rows with low usecounts

--2000   

SELECT cacheobjtype , objtype , usecounts , pagesused , dbid , sql

    FROM master.dbo.syscacheobjects

    WHERE cacheobjtype = 'Compiled Plan'

    ORDER BY usecounts DESC

    --ORDER BY sql 

--2005

SELECT c.cacheobjtype , c.objtype , c.usecounts , c.size_in_bytes , t.dbid , t.text 

    FROM sys.dm_exec_cached_plans as c

        CROSS APPLY sys.dm_exec_sql_text(plan_handle) as t

    WHERE c.cacheobjtype = 'Compiled Plan'

    ORDER BY c.usecounts DESC

    --ORDER BY t.text 

--<<<<<<<<<<----------------------------------------------------------------->>>>>>>>>>--

--5. real-time blockers

    --Report Blocker and Waiter SQL Statements

    --http://www.microsoft.com/technet/scriptcenter/scripts/sql/sql2005/trans/sql05vb044.mspx?mfr=true

    -- SQLCAT BPT

SELECT 

    t1.resource_type as lock_type

  , db_name(resource_database_id) as DB

  , t1.resource_associated_entity_id as blkd_obj

  , t1.request_mode as lock_req          -- lock requested

  , t1.request_session_id as waiter_sid-- spid of waiter

  , t2.wait_duration_ms as waittime

  , (SELECT text FROM sys.dm_exec_requests as r  --- get sql for waiter

        CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) 

            WHERE r.session_id = t1.request_session_id) as waiter_batch

  , (SELECT SUBSTRING(qt.text , r.statement_start_offset/2

          , (CASE WHEN r.statement_end_offset = -1 

                THEN LEN(CONVERT(nvarchar(MAX), qt.text)) * 2 

                ELSE r.statement_end_offset END - r.statement_start_offset)/2) 

        FROM sys.dm_exec_requests as r

            CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) as qt

            WHERE r.session_id = t1.request_session_id) as waiter_stmt    --- this is the statement executing right now

   , t2.blocking_session_id as blocker_sid -- spid of blocker

   , (SELECT text FROM sys.sysprocesses as p       --- get sql for blocker

            CROSS APPLY sys.dm_exec_sql_text(p.sql_handle) 

            WHERE p.spid = t2.blocking_session_id) as blocker_stmt

FROM sys.dm_tran_locks as t1 

    JOIN sys.dm_os_waiting_tasks as t2 

        ON t1.lock_owner_address = t2.resource_address

Script Repository: SQL Server 2005

http://www.microsoft.com/technet/scriptcenter/scripts/sql/sql2005/default.mspx?mfr=true 

THE DBA DELIGHT