dutyDBA.com

Practical solutions from a real DBA

, , ,

sp_WhoIsActive – What is it? How to deploy & configure it? How to maintain the collected data?

If you’ve been an SQL Server DBA for a while, chances are you are already familiar with sp_WhoIsActive, a comprehensive, free, lightweight stored procedure developed by Adam Machanic for monitoring and logging SQL Server activity. If you aren’t already familiar with it, then you are missing out on a very useful tool.

Many DBAs over the years have made this a part of their standard DBA tool kit. They deploy it as part of their SQL Server build. Some run it on an ad hoc basis, and some schedule it as an SQL Agent job that runs at regular intervals and logs the output to a table.

Whilst this stored procedure has been widely documented on SQL Server websites, I couldn’t find much information on how to manage the captured data, how to purge or archive the data, and the job history that it generates (when scheduled as a job). In this post, I am going to go over all aspects of successfully deploying, managing, archiving/purging the data as well as job history.

Feel free to jump to the specific section of this post:

What is sp_WhoIsActive?

sp_WhoIsActive is a free stored procedure developed by Adam Mechanic. It queries the SQL Server Dynamic Management Views (DMVs) to gather information about the currently active sessions in your SQL Server. It gathers all the currently running SQL statements, and additional details about those statements like CPU utlisation, disk IO, tempdb usage, blocking information, query plan and a lot more.

DBAs normally configure it to run as a scheduled SQL Agent job that runs at regular intervals like every 10 seconds or every half minute or a minute. It stores the information into a specified table, which can be queried later.

What can sp_WhoIsActive do for me?

sp_WhoIsActive is is very useful, especially when its scheduled to run at frequent intervals. On my servers, I prefer to run it every 10 seconds. Every 10 seconds might sound excessive, but it has not caused any issues on my servers. Running it with higher frequency gives me more continuous insight into what’s going on in my SQL Server at any given point in time.

It can typically help you answer questions like below (this is not the full list by any means):

  • An application support engineer comaplains that an important batch process had timed out last night at 11 PM. He wants to know what was running in the production database at that time.

  • Your system administrator informs you that every day around 2 AM, your SQL Server is maxing out the CPU and runing at 100% for 5 minutes. Now you want to now what was running at that time.

  • You are the on call DBA, and you get an alert saying the production tempdb database has filled up. By the time you login and check, tempdb space utilisation has cleared. You want to know what has caused the tempdb to fill up.

  • Application users are experiencing slow performance. The first line DBA informs you that there was a massive blocking chain, but it cleared before they can make sense of it. You want to see who was the head blocker and what was the blocking SQL command.

  • The development team has been tasked with optimising the database code in oder to improve performance for users during the peak operational period of 2 to 5 PM. You are asked to produce a list of all SQL statements that took longer than 5 seconds between 2 and 5 PM.

How to download and create sp_WhoIsActive?

Follow the below steps to create sp_WhoIsActive stored procedure on your SQL Server:

  • Download the latest source code zip file or tar/gzip from the above link. As of this writing, the zip file contains two folders name 2008 and 2019. The 2008 folder contains old code for SQL Server versions 2005 and 2008. The 2019 folder contains new code for SQL Server versions 2012 onwars. Use the code from 2019 folder as you are very likely on a newever version of SQL Server.

  • Open the sp_WhoIsActive.SQL file in SQL Server Management Studio. Connect to your SQL Server and change context to your desired database (typically a DBA tools database where you keep your DBA scripts).

  • Run the script to create sp_WhoIsActive.

    How to set up sp_WhoIsActive and execute it?

    Now that you have successfully created the procedure sp_WhoIsActive on your SQL Server database, next step is to run it. You can just execute it with no input parameters. This will run the procedure in default mode and display output. You might not get any rows, if nothing is running on your SQL Server.

    The main use case is to run sp_WhoIsActive at regular intervals and capture its output into a table for analysis, troubleshooting and monitoring trends.

    sp_WhoIsActive is highly customisable – you can add or remove various pieces of functionality by using the provided input parameters. Depending on which options you choose and which you omit, the output of the stored procedure varies. You have to match the output table structure accordingly.

    To make this easier, you can specify all the input parameters as per your requirements, and additionally pass the parameter @return_schema as 1. When you run this, the procedure will return the suitable table structure as an output parameter named @schema. You can then use the CREATE TABLE statement stored in the output parameter @schema to create the output table. Example below:

    T-SQL code to create the target table
    DECLARE @schema nvarchar(max) = ''
    
    --Executing with @return_schema = 1 
    --to output the table structure
    EXEC dbo.sp_WhoIsActive
        @get_plans  = 1,
        @get_outer_command  = 1,
        @find_block_leaders = 1,
        @format_output = 0,
        @destination_table  = 'dbo.WhoIsActive',
        @return_schema = 1,
        @schema = @schema OUTPUT
    
    --Replace the <table_name> place holder 
    --with your preferred table name
    SET @schema = REPLACE(@schema, '<table_name>', 'dbo.WhoIsActive')
    
    --Print the table structure
    PRINT @schema
    
    --Create the table
    EXEC sp_executesql @schema
    GO

    Now that the output table dbo.WhoIsActive is created, you can start running the stored procedure dbo.sp_WhoIsActive to start logging its output to the output table as shown below:

    T-SQL code to log the output to the target table
    EXEC dbo.sp_WhoIsActive
        @get_plans  = 1,
        @get_outer_command  = 1,
        @find_block_leaders = 1,
        @format_output = 0,
        @destination_table  = 'dbo.WhoIsActive'
    GO

    How to schedule sp_WhoIsActive as an SQL Agent job?

    This is the easy part. Simply create an SQL Agent job, add a T-SQL job step with the below code:

    T-SQL code to add to SQL Agent job step
    EXEC dbo.sp_WhoIsActive
        @get_plans  = 1,
        @get_outer_command  = 1,
        @find_block_leaders = 1,
        @format_output = 0,
        @destination_table  = 'dbo.WhoIsActive'
    GO

    Add a schedule to this SQL Agent job as per your requirement. I quite like to schedule my job to run every 10 seconds. Some DBAs run it every 30 seconds or every one minute.

    How to keep the data and space in check, and how to archive/purge the data?

    As a DBA, you know how important table indexes are for keeping the queries running at their best. The destination table dbo.WhoIsActive could get quite big over time, with all the query text and execution plan xml. The first and most important thing is to add a clustered index to the destination table. As the table can get quite big on busy servers, I strongly recommend compressing the table. I achieve this by creating a non-unique clustered index on the collection_time column, with page compression. The datatype of the collection_time column is datetime, and the data is added to the table in increasing order of collection_time – hence it makes an excellent choice for a clustered index key and helps speed up your queries, especially when looking for specific date ranges.

    Adding clustered index with page compression
    CREATE CLUSTERED INDEX NU_C_WhoIsActive_collection_time 
      ON dbo.WhoIsActive(collection_time) 
      WITH (DATA_COMPRESSION = PAGE)
    GO

    Page compression helps to reduce the table size dramatically. This helps keep the destination table small.

    The next thing to tackle the amount of data stored in the destination table. There is no point keeping months and months of data in this table. I typically keep the data for 7 days or a couple of weeks. I also delete some unwanted data from the destination table, that is of no interest to me. This helps reduce the noise.

    If you configure the SQL Agent job to run too frequently, like I do, the job history table msdb..sysjobhistory also gets pretty big. You need to keep purging this table as well to reduce msdb space issues.

    In summary, these are the things I purge:

    • Data older than 1 week.

    • Delete unwanted data. Eg. an application constantly runs SELECT GETDATE(). This is just noise and I delete it.

    • Delete SQL Agent job history for this job only, from that msdb..sysjobhistory table, that is older than 72 hours

    Here is an updated version of the SQL Agent job step code, that incorporates the above data clean up and archiving activities.

    SQL Agent job step code incorporating purging
    SET NOCOUNT ON
    GO
    
    --Delete data older than 7 days
    DECLARE @retain_days int = 7
    
    --Get the time stamp for last execution of this job
    DECLARE @collection_time datetime = 
    (
        SELECT MAX(collection_time) FROM dbo.WhoIsActive
    )
    
    --Execute sp_WhoIsActive
    EXEC dbo.sp_WhoIsActive
        @get_plans  = 1,
        @get_outer_command  = 1,
        @find_block_leaders = 1,
        @format_output = 0,
        @destination_table  = 'dbo.WhoIsActive'
    
    --Purge data older than @retain_days
    DELETE dbo.WhoIsActive 
    WHERE collection_time < DATEADD(dd, -@retain_days, CURRENT_TIMESTAMP)
    
    RAISERROR('**** %d rows have been purged from WhoIsActive, that are older than %d days ****', 0, 1, @@ROWCOUNT, @retain_days)
    
    --Delete unwanted data
    --Customise the WHERE claus to delete
    --unwanted noise from your output
    DELETE dbo.WhoIsActive 
    WHERE
    (
        (COALESCE(sql_text,'') IN ('sp_server_diagnostics'))
        OR (COALESCE(sql_command,'') IN ('sys.sp_reset_connection;1'))
        OR (COALESCE(sql_command,'') IN ('select getDate()'))
        OR (COALESCE(sql_text,'') IN ('select getDate()'))
    )
    AND
    (
        collection_time >= @collection_time
    )
    
    RAISERROR('***** %d unwanted rows have been deleted from WhoIsActive *****', 0, 1, @@ROWCOUNT)
    GO
    
    --Purge sysjobhistory in msdb
    --Delete job history older than @RetainHours which is set to 72 hours
    
    --Notice the use of SQL Agent token $(ESCAPE_NONE(JOBID)) that can 
    --only be used inside SQL Agent job steps.
    --This one dynamically returns the job_id of the current job
    DECLARE @JobID uniqueidentifier = $(ESCAPE_NONE(JOBID))
    DECLARE @BatchSize int = 1000, @RetainHours int = 72, @RCTR int
    
    --Deletes the job hisotry in smaller chunks of 1000 rows
    --Deleting in smaller batches helps avoid filling up msdb transaction log
    WHILE 1=1
    BEGIN
        DELETE TOP (@BatchSize) FROM msdb..sysjobhistory
        WHERE
            job_id = @JobID
            AND CAST(
                CAST(run_date AS varchar(8)) + ' ' + 
                STUFF(STUFF(RIGHT('000000' + CAST(run_time AS varchar(6)), 6), 3, 0, ':'), 6, 0, ':')
                AS datetime
            ) < DATEADD(hh, -@RetainHours, CURRENT_TIMESTAMP)
    
        SET @RCTR = @@ROWCOUNT
    
        RAISERROR('Deleted %d rows from msdb..sysjobhistory that are older than %d hours', 0, 1, @RCTR, @RetainHours) WITH NOWAIT
    
        IF @RCTR < @BatchSize BREAK
    END

    Hope I’ve given you a good overview on the setup and configuration of sp_WhoIsActive!

    In a future post, I will showcase some of the queries that I run against the dbo.WhoIsActive table to get the information I need when troubleshooting different scenarios. Thanks for reading, and feel free to leave a comment below.

    Leave a Reply

    Your email address will not be published. Required fields are marked *