Wednesday, 9 November 2011

Fragmentation and its Resolution Steps

Measuring and Dealing with Object-Level Fragmentation
Within ASE
 Table and Index fragmentations occur as a result of heavy activities, especially deletes and updates. As tables grow in size, this problem becomes more acute.

There is no doubt of the benefits of adding indexes to tables, and for the most part we have to do little work to keep the indexes maintained. However, some maintenance is required as indexes can become fragmented during data modifications, especially deletes and updates. This fragmentation can become a source of performance issues with queries.

What is Object-Level Fragmentation?

Object (table/index)-level fragmentation is essentially made up of extents whose pages are not sequential, extents that contain empty pages, pages that have empty rows, or rows that have only been logically deleted thus far. All of these conditions will result in more I/Os being required to return the required data than would be necessary if there was no fragmentation.
So what exactly is index fragmentation? Index fragmentation actually comes in two different forms: external fragmentation and internal fragmentation. These two forms of fragmentation result in the inefficient use of pages within an index. This inefficient use may be because the logical order of the pages are wrong (external fragmentation) or because the amount of data stored within each page is less than the data page can contain (internal fragmentation). Whichever type of fragmentation occurs in the index, we could face performance issues with our queries because of the fragmentation.
Simply put fragmentation is defined as any condition which causes more than the optimal amount of disk I/O to be performed in accessing a table, or causes the disk I/Os that are performed to take longer than they optimally would. Optimal performance of select queries occurs when data pages are as contiguous as possible within the database, and the data pages are packed as fully as possible."

Type of ASE tables

When a table is created, by default the locking scheme is the server-wide setting for the configuration parameter "lock scheme". The default value for this parameter is "allpages" or APL. In addition, ASE allows you to set the table locking scheme to "datapages" DPL or "datarows" DOL. These are explained below:
1. In APL, the data page and the index pages are locked during the operation.
2. In DPL, only the data page is locked, no index page locking.
3. In DOL, only the rows are locked, no index page locking.
Parameters to Check for the Degree of Index Fragmentation

ASE provides a number of parameters to indicate the level of fragmentation in the indexes:
1. data-page-cluster-ratio, dpcr
2. index-page-cluster-ratio, ipcr
3. data-row-cluster-ratio, drcr
4. space utilization, sput
5. large I/O efficiency, lgio
- The "data-page-cluster-ratio" of the table indicates how many times a sequential read of the page chain will have to jump to one extent and then back to continue the read. The more jumps, the less clustered pages are within extents and thus the more fragmented. These jumps require extra I/O and over time can become expensive to perform.
- The "index-page-cluster-ratio" is the same as the "data-page-cluster-ratio" except that it measures how well clustered index leaf pages are in relation to their extents. It is a very good indicator of fragmentation of the index pages.
- The "data row cluster ratio", is another cluster ratio that is used to cost the number of data pages that need to be accessed during scans using a particular index.
- The Space Utilization, "sput" gives you an idea of how efficiently space is being used by the object—in other words, whether it’s fragmented or not. The value uses the average row size and the number of rows to estimate the minimum number of pages for the object and then compares this to the current actual number of pages.
- Large I/O efficiency is used to estimate the number of useful pages that can be read in a single large I/O. Put another way, how many pages in a single large I/O will contain needed rows?.
For DPL or DOL tables with clustered indexes, dpcr and drcr indicate the level of data fragmentation. For DPL or DOL tables with non clustered indexes, both ipcr and sput figures are useful parameters. For APL tables with clustered indexes, both sput and lgio efficiency will be good indicators of data fragmentation. For APL and non-clustered indexes, look at the parameter "sput". The rule of thumb is that the lower these values, the higher the degree of fragmentation.

Working Out Index Fragmentation

In versions prior to ASE 12.5.1, we could get information about the fragmented indexes through the ASE utility
 
> OptDiag/12.0.0.8/P/EBF 11689 ESD1/Sun_svr4/OS 5.6/2047/32bit/OPT/Thu
> Feb 12 21:33:41 2004
> Adaptive Server Enterprise/12.0.0.8/P/EBF 11689 ESD1/Sun_svr4/OS
> 5.6/2047/32bit/FBO/Fri Feb 13 03:52:51 2004
>
> Server name: "PROD_SERVER"
>
> Specified database: "PROD"
> Specified table owner: not specified
> Specified table: "op_services_log_detail_history"
> Specified column: not specified
>
> Table owner: "dbo"
> Table name: "op_services_log_detail_history"
>
> Statistics for index: "op_services_log_detail_hst_idx"
> (clustered)
> Index column list: "op_services_run_id"
> Data page count: 877600
> Empty data page count: 0
> Data row count: 31346435.0000000000000000
> Forwarded row count: 0.0000000000000000
> Deleted row count: 0.0000000000000000
> Data page CR count: 111527.0000000000000000
> OAM + allocation page count: 10561
> First extent leaf pages: 0
> Data row size: 46.0000000000000000
> Index height: 3
>
> Derived statistics:
> Data page cluster ratio: 0.9976207839562443
> Space utilization: 0.8150023207684015
> Large I/O efficiency: 0.9836183166408136
>
For this example, we chose a fairly large APL table in excess of 31 million records. This table had a non-unique clustered index with a large number of overflow pages. The section under "Derived statistics" provides the information required to determine the level of fragmentation in this table. The two parameters that are important in this case are sput and lgio.
The low "space utilization" figure is normally associated with an APL table with a non-unique clustered index. This can lead to lots of overflow pages as each key can have many rows. The non-unique clustered index is built on the column "op_services_run_id".

Using DERIVED_STAT Function to Estimate Fragmentation

ASE 12.5.0.3 introduced the function:

DERIVED_STAT("table_name","index_name","option")

By using this function, one saves a good bit of time and effort in picking up (a) specific fragmentation indicator(s) without going into optdiag. The function derived_stat can provide a number of indicators. However, it is left to the DBA on how to go about identifying the tables and indexes that need to be defragged.
In general, the closer the value of these parameters to 1 (range 0-1), the better, and the less fragmentation you have. As explained earlier, for different table and index types, these parameters have different significance.
As a DBA, I wrote a shell script that carried out table and index "fragmentation analysis" for a given database. Note that if you have a pre ASE 12.5.0.3 database, as long as you have a test environment with ASE 12.5.0.3 or higher installed you can load your pre ASE 12.5.0.3 database there and run your analysis against the 12.5.0.3 loaded database.
We used the algorithm explained under section "Parameters to Check for the Degree of Index Fragmentation" to recommend which tables and indexes we need to defrag.
The parameters we look at are:
'table_type = DOL or DPL and index type = clustered, examine ====> dpcr & drcr’ 'table_type = DOL or DPL and index_type = non-clustered, examine ====> ipcr & sput' 'table_type = APL and index_type = clustered, examine ====> sput & lgio' 'table_type = APL and index_type = non-clustered examine ====> sput'

The actions are twofold. If you have a DPL or DOL table, then carrying out "reorg rebuild" will do the trick. If you have an APL table with a clustered index, drop and recreate the clustered index. That action will rebuild all other indexes. If you have an APL table without a clustered index, then drop and recreate the index affected on that table.
So what criteria do you use to select indexes for defragging? Obviously given the nature of defragging most often a maintenance window is required. Taking into account the time factors, the objectives are to select large tables (read tables with the high number of records) which show the appropriate derived stats parameters having below optimal values.
 
Name TableType Records IndexName IndexType
------------------------------ --------- ----------- ------------------------------ ---------------
QMCASEEVENTAUDIT DPL 41924732 idx_nc_caseid non-clustered
QMCASEEVENTAUDIT DPL 41924732 idx_u_QMCASEEVENTAUDIT non-clustered
QMAUTHORISEREQUESTAUDIT APL 33911989 idx_nc_caseid non-clustered
QMAUTHORISEREQUESTAUDIT APL 33911989 idx_nc_created non-clustered
QMAUTHORISEREQUESTAUDIT APL 33911989 idx_u_QMAUTHORISEREQUESTAUDIT non-clustered
QMEXCEPTION DOL 10882587 idx_nc_state non-clustered
QMEXCEPTION DOL 10882587 idx_u_exceptionid clustered
QMEXCEPTION DOL 10882587 idx_nc_tradeid non-clustered
QMEXCEPTION DOL 10882587 idx_u_QMEXCEPTION non-clustered
QMEXCEPTION DOL 10882587 idx_nc_exceptiontypeid non-clustered
TMTMSTREAM APL 1405635 idx_uc_TMTMSTREAM clustered
TMTMSTREAM APL 1405635 idx_nc_TMTMSTREAM non-clustered
QMWORKQUEUETOVIEW APL 1122526 idx_u_QMWORKQUEUETOVIEW non-clustered
QMWORKQUEUETOVIEW APL 1122526 idx_unc_QMWORKQUEUETOVIEW non-clustered
TMTMTRADE APL 783232 idx_uc_TMTMTRADE clustered
TMTMTRADE APL 783232 idx_nc_TMTMTRADE non-clustered
SCHEDULED_EVENTS DOL 448277 idx_c_SCHEDULED_EVENTS_2 clustered
SCHEDULED_EVENTS DOL 448277 idx_u_SCHEDULED_EVENTS_1 non-clustered
SCHEDULED_EVENTS DOL 448277 idx_nc_SCHEDULED_EVENTS_3 non-clustered
SCHEDULED_EVENTS DOL 448277 idx_nc_SCHEDULED_EVENTS_4 non-clustered
QMCASEATTRIBUTE APL 225735 idx_u_QMCASEATTRIBUTE clustered
QMTRANSITION APL 189327 stepid non-clustered
QMTRANSITION APL 189327 idx_u_QMTRANSITION non-clustered
QMROLETOSTEP APL 152061 idx_u_QMROLETOSTEP non-clustered
QMROLETOSTEP APL 152061 idx_unc_roleid_stepid non-clustered
EVT_MESSAGE DOL 100547 idx_uc_EVT_MESSAGE_1 clustered
QMCASE DPL 77554 idx_u_QMCASE non-clustered
QMCASE DPL 77554 idx_u_nc_exceptionid clustered
QMCASE DPL 77554 idx_nc_workqueueid_state non-clustered
QMUSERGROUPMAPAUDIT APL 57790 idx_nc_created non-clustered
QMUSERGROUPMAPAUDIT APL 57790 idx_nc_useraccountid non-clustered
QMUSERGROUPMAPAUDIT APL 57790 idx_u_QMUSERGROUPMAPAUDIT non-clustered
.As an example we can take index "

This table is a data-page-locked table and has in excess of 41 million records. The index itself is non-clustered. Earlier we identified derived_stats paramerters "ipcr" and "sput" as the fragmentation indicators for this type of table and index. The actual parameters are shown below:
 
name index_name records size ipcr sput
------------------------------ ------------------------------ ----------- ----------- -------------------- --------------------
QMCASEEVENTAUDIT idx_nc_caseid 41924732 3515414
 
Note that both "ipcr" and "sput" are below their optimal values indicating that this index is fragmented.
The same principle applies to other tables as well. By creating an algorithm we can easily identify tables and indexes that need defragging.

INSTALLATION OF DEFRAGGING ROUTINES
See the cron example below.

#----------------------------------------------------------------------------------------------------------#
##
## Fragmentation report on all user databases daily
## Mich Talebzadeh 23/12/2004
##
00 6 * * 1-6 /local/usts02_sybase/sybase/dba/frag_analysis.ksh USTSAC2 > /local/usts02_sybase/sybase/logs/USTSAC2/frag_analysis.err 2>&1
00 8 * * 1-6 /local/usts02_sybase/sybase/dba/frag_analysis.ksh USTSAC2 STS_CS > /local/usts02_sybase/sybase/logs/USTSAC2/frag_analysis_STS_CS.err 2
>&1
#----------------------------------------------------------------------------------------------------------#

frag_analysis.ksh routine will update DBA_CONTROL..frag_databases. If record exists for this database then the column WhenFragLastRun will be timestamped. Otherwise a new record for this database will be created in DBA_CONTROL..frag_databases with default value of 10 for TopNTables to be defragged. If you want to exclude a database from defragging, set Exclude = 1 in this database in frag_databases.

1> use DBA_CONTROL
2> go
1> select * from frag_databases
2> go
DBName TopNTables WhenFragLastRun WhenRebuildLastDone Exclude
------------------------------ ----------- -------------------------- -------------------------- -------
test 10 Jan 14 2005 4:49PM Jan 14 2005 2:06PM 0
CORD_CORPORATE 10 Jan 25 2005 6:00AM NULL 0
DBA_CONTROL 10 Jan 25 2005 6:00AM NULL 0
PINGTEST 10 Jan 25 2005 6:00AM NULL 0
PUBLISH_SUBSCRIBE 10 Jan 25 2005 6:01AM NULL 0
STS_DEUSD 10 Jan 25 2005 6:02AM Jan 19 2005 8:33PM 0
sqlcqr_master 10 Jan 25 2005 6:02AM NULL 0
(7 rows affected)

The other table of interest is DBA_CONTROL..frag_info table. This is the main table which holds information about which tables and indexes need to be defragged. frag_analysis.ksh will populate this table. For a given database, it will simply delete the old records and create new records. After running frag_analyis.ksh YOU NEED TO EXCLUDE TABLES THAT SHOULD NOT BE PICKED UP BY rebuilt_indexes.ksh
because there may not be enough free space in the database or you may not have a large enough time window. For example for STS_QM we had to exclude the following three tables and their indexes from reorg rebuild etc

1> update frag_info set Exclude = 1 where DBName = 'STS_QM'
2> and Name in ('QMEXCEPTION','SCHEDULED_EVENTS','EVT_MESSAGE')
3> go
(10 rows affected)


Upon completion of this routine a summary email of work done will be sent to the list of emails in the function "forward_email" in routine ~sybase/dba/ common_functions.ksh

Output files
The main output file is the report file. The other is the log file both in ~/logs/server_name directory

sybaseAC@ubnsts-s02:/local/usts02_sybase/sybase/logs/USTSAC2% ltr frag*
-rw------- 1 sybaseAC sybase 0 Jan 12 06:00 frag_analysis.err
-rw------- 1 sybaseAC sybase 12213 Jan 12 11:59 frag_analysis_USTSAC2.report
-rw------- 1 sybaseAC sybase 1502 Jan 12 11:59 frag_analysis_USTSAC2.log

WHAT TO WATCH!

Anytime that you run frag_analysis.ksh the routine deletes the old records for a database in DBA_CONTROL..frag_info and populates them with new records. If you intend to exclude any table from rebuild index or reorg rebuild, set the Exclude column = 1 for that table in database before running the rebuild_indexs.ksh routine
frag_analysis.ksh is non destructive and can be run anytime. It will only update DBA_CONTROL tables.

Doing Reorg rebuild and index rebuild

To do the actual reorg and rebuild of indexes run ~/dba/rebuild_indexes.ksh. This routine expects as a minimum a server_name and a database_name to be supplied. You can give multiple databases after server name. They will be done in the order entered. You can also run reorg_rebuild for different servers and databases or the same server and different databases simultaneously. Examples below should help you

00 16 * * 6 /local/usts02_sybase/sybase/dba/rebuild_indexes.ksh USTSAC2 test > /local/usts02_sybase/sybase/logs/USTSAC2/rebuild_indexes_USTSAC2_tes
t.err 2>&1
00 16 * * 6 /local/usts02_sybase/sybase/dba/rebuild_indexes.ksh USTSAC2 test STS_CS > /local/usts02_sybase/sybase/logs/USTSAC2/rebuild_indexes_USTS
AC2.err 2>&1

This table relies on the information in DBA_CONTROL..frag_info table as discussed before. There is a log file for the server itself and for a given database there is a sql file and a log file. Examples are given below.

sybaseAC@ubnsts-s02:/local/usts02_sybase/sybase/logs/USTSAC2% ltr rebuild_indexes_USTSAC2_*12Jan2005_*
-rw------- 1 sybaseAC sybase 1828 Jan 12 12:11 rebuild_indexes_USTSAC2_SERVER_WIDE_12Jan2005_12:11_67.log
-rw------- 1 sybaseAC sybase 28791 Jan 12 12:11 rebuild_indexes_USTSAC2_DATABASE_test_index_rebuild_script_12Jan2005_12:11_67.sql
-rw------- 1 sybaseAC sybase 27091 Jan 12 12:36 rebuild_indexes_USTSAC2_DATABASE_test_12Jan2005_12:11_67_progress.log

Notes
Before running this routine ensure that both DBA_CONTROL..frag_databases and frag_info have correct information. The top n tables will be picked up from frag_databases table. The index names to be done will be picked up from frag_info table.
rebuild_indexes.ksh will set the Server configuration option 'number of sot buffers' to 1500 for the duration of index rebuild and set it back to its original value upon completion
In order to perform reorg rebuild the database option 'select into' will be turned on for that database. So you need to:
1. Stop the transaction log dumps
2. Back up the database
Once the operation for that database finished the option "select into" will be turned off. If the database already has 'select into' turned on no action will be done. Ensure that you backup the database and restart the transaction log dump if applicable.
1. Extract files from frags.tar. This will create a directory called "frags".
2. Put frag_analysis.ksh, common_function.ksh and rebuild_indexes.ksh under ~sybase/dba directory
3. Put environment.ksh under directory ~/<SERVER_NAME>. This is the environment file for the SQL Server <SERVER_NAME>
4. Modify this file to set the environment variables SYBASE and DSQUERY correctly. For example set
export SYBASE=/local/usts02_sybase/sybase/ase12_52-11942-64
export DSQUERY=USTSAC2
5. Run DBA_CONTROL.sql. This code creates two tables namely; DBA_CONTROL..frag_info and DBA_CONTROL..frag_databases
6. To get information for tables and indexes to be defragged, run $SYBASE/dba/frag_analysis.ksh. An example would be a cronjob installed similar to below. This routine takes two parameters, server_name and database_name respectively. database_name is optional. With server_name given like in the example below, it will do fragmenattion analysis for all user databases. You can run this routine with <server_name> <database_name>. In that case it will update frag info for that database only.
0.729916 0.763700
idx_nc_caseid" on table "QMCASEEVENTAUDIT".
As a good indicator I go for tables with more than 50,000 records having the applicable derived stats parameters below 0.9. An example of indexes chosen for defragging is shown in table below:
optdiag. The output from optdiag contains all sorts of information including the "Derived statistics":

No comments:

Post a Comment