Dismiss Notice

Register now to be one of the first members of this SharePoint Community! Click here it just takes seconds!

Dismiss Notice
Welcome Guest from Country Flag

Considerations when tuning your queries with columnstore indexes on clone databases

Discussion in 'Official Microsoft News' started by Parikshit Savjani, Apr 3, 2017.

Thread Status:
Not open for further replies.
  1. Parikshit Savjani

    Parikshit Savjani Guest

    Blog Posts:
    0
    As discussed in my previous blog post, one of the primary scenario for DBCC CLONEDATABASE is to assist dbas, developers and support teams in troubleshooting sub-optimal query plans by creating fast, minimally invasive and transaction ally consistent database clones of their production databases. The database clone created using DBCC CLONEDATABASE contains the copy of schema and statistics which allows the optimizer to generate same query plan as observed on the production database without the actual data. While this is true for queries involving traditional rowstore indexes, there are some special considerations for queries involving Columnstore indexes due to difference in the way statistics are generated for these indexes. In this blog post, I will explain you this behavior and in the end, provide you with the scripts required to handle this scenario to generate same query plan in database clone as observed on the production databases.

    Unlike traditional Btree indexes, when a columnstore index is created, there is no index statistics created on the columns of the columnstore indexes. However, there is an empty stats object created with the same name as columnstore index and an entry is added to sys.stats at the time of index creation. The stats object is populated on the fly when a query is executed against the columnstore index or when executing DBCC SHOW_STATISTICS against the columnstore index, but the columnstore index statistics aren’t persisted in the storage. The index statistics is different from the auto created statistics on the individual columns of columnstore indexes which is generated on the fly and persisted in the statistics object. Since the index statistics is
    not persisted in storage, the clonedatabase will not contain those statistics leading to inaccurate stats and different query plans when same query has run against database clone as opposed to production database.

    Let me illustrate this behavior with a sample script below

    set nocount on
    go
    create database db1

    go
    use db1
    go
    create table t1(a varchar(8000));
    go
    insert t1 values(replicate(newid(), 200));
    go 1000
    create clustered Columnstore index cci on t1
    go


    — Initial stats object with name cci created at the time of index creation
    select * from sys.stats where object_id=OBJECT_ID(‘t1’)
    go


    — Adding more 3000 rows to the table
    insert t1 values(replicate(newid(), 200));
    go 3000


    — Query with predicate to generate auto created statistics on the column
    select a from t1 where a = ‘aaaaaaaaaaaaaa’
    GO


    –Verify if auto created statistics got added to the table
    — Here you will see 2 statistics objects. 1 with the index name and other auto created statistics on column a due to the earlier query


    select * from sys.stats where object_id=OBJECT_ID(‘t1’)
    go


    — Turn on Actual Exec Plan and it accurately displays 4000 rows
    — before cloning
    select a from t1
    GO


    [​IMG]

    — Run DBCC SHOW_STATISTICS against the index statistics of columnstore index and it accurately reflects 4000 rows & 6000 Data pages (generated on the fly)

    dbcc show_statistics(‘t1’,‘cci’) with stats_stream
    go


    [​IMG]

    dbcc clonedatabase(‘db1’,‘db2’)
    go


    use db2
    go


    — Both the stats are copied in clone
    select * from sys.stats where object_id=OBJECT_ID(‘t1’)
    go


    /* If you turn on Actual Execution Plan only 1000 rows are displayed in statistics clone which is the same as the number of
    rows at the time of creation of cci but never updated.*/
    select a from db2.dbo.t1
    GO


    [​IMG]

    — Run DBCC SHOW_STATISTICS against the index statistics and it reflects 1000 rows & 0 Data pages which was at the time of index generation (not updated)
    dbcc show_statistics(‘t1′,’cci’) with stats_stream
    go


    [​IMG]

    — DROP DATABASE after tests

    use master
    go
    DROP DATABASE db1
    DROP DATABASE db2


    This is by design behavior of Columnstore indexes in SQL Server. To handle this behavior and to be able to accurately capture the columnstore index statistics in the clone database, we have created and shared a script in our Tiger Github repository, which can be used to update the columnstore index statistics on the source database before running DBCC CLONEDATABASE.

    The script usp_update_CI_stats_before_cloning.sql should be run on source production database which needs to be cloned before running DBCC CLONEDATABASE. The script basically performs the following

    1. Runs DBCC SHOW_STATISTICS WITH STATS_STREAM against all the columnstore indexes on the source database to capture the up to date stats blob generated on the fly.
    2. Updates the persisted stats object with the most recent stats blob captured in step 1.

    Note: The above is script is required to be run only if you would like to clone columnstore index statistics provided the query plan on the database clone is different from the query plan on source database. This script is not required to be run otherwise since database engine is designed to generate and handle the non-persisted columnstore index statistics for efficient query plans.



    Parikshit Savjani
    Senior PM, SQL Server Tiger Team
    Twitter | LinkedIn
    Follow us on Twitter: @mssqltiger | Team Blog: Aka.ms/sqlserverteam


    Continue reading...
     
Thread Status:
Not open for further replies.

Share This Page

LiveZilla Live Chat Software