New DBQL logging options USECOUNT and STATSUSAGE, introduced in Teradata Database 14.10, enable the logging of used and missing statistics. The output of this logging can be utilized to find used, unused, and missing statistics globally (for all queries) or for just a subset of queries.
The USECOUNT logging option, which starting in Teradata Database 14.10 supports the object use count (OUC) feature, is associated with the database that owns the table. USECOUNT logs usage from all requests against that table irrespective of the user who submitted the query. This option logs the object usage from such objects as databases, tables, indexes, join indexes, and statistics. Additionally, it logs insert, update, and delete counts against the database’s tables. The STATSUSAGE logging option, similar to other DBQL logging options, is associated directly to a user and logs the used and missing statistics at the query level within an XML document.
In this article, we focus on the logging of the used statistics, how to retrieve them and join them to other dictionary tables to find the global list of used and unused statistics and also the missing statistics at query level.
Enabling the DBQL USECOUNT Option
Each statistic you define becomes an object that USECOUNT logging can track. Each time the optimizer actually makes use of a specific statistic, an access counter in the new DBC.ObjectUsage table gets incremented. So at any point in time you can look in the data dictionary and evaluate how frequently (or even if ever) a particular statistic is being used.
The descriptions of the statements to enable object use count (OUC) logging on a given database or a user are given below. It is especially useful to enable the USECOUNT option on the databases that own permanent tables. It can be enabled on users or user accounts also, but you would do that only when these accounts have tables on which you want the logging enabled.
Description
|
Command
|
To enable on a database (which is not a user). Note that other DBQL options are not allowed to be enabled on a database.
|
BEGIN QUERY LOGGING WITH USECOUNT ON <database Name>;
|
To enable on a user with no DBQL options enabled.
|
BEGIN QUERY LOGGING WITH USECOUNT ON <User Name>;
|
To enable on a user having some existing DBQL options enabled (use SHOW QUERY LOGGING ON <User> to find the current DBQL options).
|
REPLACE QUERY LOGGING WITH <current options>, USECOUNT ON <User Name>;
|
The following are exceptions and special scenarios in OUC logging:
- OUC logging is not applicable for DBC tables. In other words, the OUC feature doesn’t log the use counts for statistics you may be collecting on dictionary objects.
- OUC logging is not applicable for temporary tables.
- Dropping statistics also drops the corresponding entries in DBC.ObjectUsage table. So, if you are dropping and recollecting statistics, the use counts reflect usage from the last collection only.
- Unlike other DBQL options, disabling USECOUNT invalidates (resets the counts and timestamp) the corresponding rows in DBC.ObjectUsage. It is not recommended to disable this option unless there is a specific reason to do so.
The following sections describe how to find these statistics for each category along with examples.
Identifying Used Statistics
After enabling DBQL USECOUNT and letting some time pass, the following query can be used to query the dictionary table DBC.ObjectUsage to get the use counts of the existing statistics. The number of days a statistic has been continuously under the control of USECOUNT logging is given by the column DaysStatLogged; a value of -1 for this column indicates that USECOUNT option has been disabled after being enabled for some time in the past. You need to consider the number of days a statistic was a candidate for logging along with use counts to normalize the comparison across different statistics.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
|
SELECT DBC.DBase.DatabaseName AS DatabaseName
,DBC.TVM.TVMName AS TableName
, COALESCE (DBC.StatsTbl.StatsName
,DBC.StatsTbl.ExpressionList
, 'SUMMARY' ) AS StatName
,OU.UserAccessCnt AS UseCount
, CAST (OU.LastAccessTimeStamp AS DATE ) AS DateLastUsed
, CASE
WHEN DBC.DBQLRuleTbl.TimeCreated IS NULL
THEN -1
WHEN DBC.DBQLRuleTbl.TimeCreated > DBC.StatsTbl.CreateTimeStamp
THEN CURRENT_DATE - CAST (DBC.DBQLRuleTbl.TimeCreated AS DATE )
ELSE CURRENT_DATE - CAST (DBC.StatsTbl.CreateTimeStamp AS DATE )
END AS DaysStatLogged
FROM DBC.ObjectUsage OU
,DBC.Dbase
,DBC.TVM
,DBC.StatsTbl LEFT JOIN DBC.DBQLRuleTbl
ON DBC.StatsTbl.DatabaseId = DBC.DBQLRuleTbl.UserID
AND DBQLRuleTbl.ExtraField5 = 'T' /* Comment this line if TD 15.0 or above */
/* AND DBQLRuleTbl.ObjectUsage = 'T' */ /* Uncomment this line if TD 15.0 or above */
WHERE DBC.Dbase.DatabaseId = OU.DatabaseId
AND DBC.TVM.TVMId = OU.ObjectId
AND DBC.StatsTbl.DatabaseId = OU.DatabaseId
AND DBC.StatsTbl.ObjectId = OU.ObjectId
AND DBC.StatsTbl.StatsId = OU.FieldId
AND OU.UsageType = 'STA'
ORDER BY 1, 2, 3;
|
Customize the above query for your databases and tables of interest, number of days a statistic is logged, etc. A sample output of the above query is given below.
Database
Name
|
TableName
|
StatName
|
Use
Count
|
DateLast
Used
|
DaysStat
Logged
|
DBA_TEST
|
PARTY_DATA
|
CURR_FLAG
|
209
|
11/15/2014
|
-1
|
PNR_DB
|
CODE_SHARE
|
FK_ITIODPAD_ITINERARY_INFO_ODP
|
253
|
11/1/2014
|
34
|
PNR_DB
|
ITINERARY_TBL
|
AD_ITINERARY_INFO_ODP
|
203
|
11/1/2014
|
34
|
PNR_DB
|
ITINERARY_TBL
|
CD_ID_PRODUCT
|
203
|
11/1/2014
|
34
|
PNR_DB
|
ITINERARY_TBL
|
CD_ORIGINAL_STATUS_CODE,CD_STATUS,
FK_PNRODP_AD_PNR_HEADER_ODP
|
203
|
11/2/2014
|
34
|
PNR_DB
|
PNR_HEADER_ODP
|
AD_PNR_HEADER_ODP
|
600
|
11/28/2014
|
34
|
PNR_DB
|
PNR_HEADER_ODP
|
AD_PNR_HEADER_ODP,CD_CONTROL_NUMBER_LOC
|
700
|
11/28/2014
|
34
|
Identifying Unused Statistics
Using USECOUNT to find unused statistics requires consideration of multiple time dimensions. The basic question being answered here is: Am I collecting statistics that have not been used for some duration? To answer this question, you need to consider not only the usage counts, but also how long the USECOUNT logging was active, the most recent usage time stamp and also the age of the statistics. For example, if USECOUNT logging has been enabled only for the past few days, some statistics which get used on a monthly/quarterly workload may not yet have been logged as used. Similarly, if you collect new statistics (not re-collections), you need to let them get exposed to different workloads for certain period of time to properly identify whether they are being used or not.
The following query is designed to consider these aspects and to list out the statistics that meet the following criteria.
- The age of the statistic is more than N days (first collected more than N days ago).
- DBQL USECOUNT logging is active for more than N days on the owning database.
- The statistic has not been used in the last N days.
The value of N can be customized based on your requirements. In the example query given below, the value of N is set to 30 (note that N is used in two predicates; both need to be updated if you customize this value).
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
|
SELECT DBC.DBase.DatabaseName AS DatabaseName
,DBC.TVM.TVMName AS TableName
, COALESCE (DBC.StatsTbl.StatsName
,DBC.StatsTbl.ExpressionList
, 'SUMMARY' ) AS StatName
, CURRENT_DATE - CAST (DBC.StatsTbl.CreateTimeStamp AS DATE ) AS StatAge
, CASE
WHEN DatabaseName = 'DBC'
THEN -2
WHEN DBC.StatsTbl.StatsType IN ( 'B' , 'M' )
THEN -2
WHEN DBC.DBQLRuleTbl.TimeCreated IS NULL
THEN -1
WHEN DBC.DBQLRuleTbl.TimeCreated > DBC.StatsTbl.CreateTimeStamp
THEN CURRENT_DATE - CAST (DBC.DBQLRuleTbl.TimeCreated AS DATE )
ELSE CURRENT_DATE - CAST (DBC.StatsTbl.CreateTimeStamp AS DATE )
END AS DaysStatLogged
FROM DBC.StatsTbl LEFT JOIN DBC.DBQLRuleTbl
ON DBC.StatsTbl.DatabaseId = DBC.DBQLRuleTbl.UserID
AND DBQLRuleTbl.ExtraField5 = 'T' /* Comment this line if TD 15.0 or above */
/* AND DBQLRuleTbl.ObjectUsage = 'T' */ /* Uncomment this line if TD 15.0 or above */
,DBC.Dbase
,DBC.TVM
WHERE DBC.StatsTbl.DatabaseId = DBC.DBASE.DatabaseId
AND DBC.StatsTbl.ObjectId = DBC.TVM.TVMId
AND NOT EXISTS ( SELECT 100 FROM DBC.ObjectUsage OU
WHERE OU.UsageType = 'STA'
AND OU.DatabaseId = DBC.StatsTbl.DatabaseId
AND OU.ObjectId = DBC.StatsTbl.ObjectId
AND OU.FieldId = DBC.StatsTbl.StatsId
AND CURRENT_DATE - CAST (OU.LastAccessTimeStamp AS DATE ) < 30
)
AND DaysStatLogged > 30
AND DBC.StatsTbl.StatsId <> 0
ORDER BY 1, 2, 3;
|
You can customize the above query to adjust the number of days the statistic has not been used, for the databases and tables of your interest, the age of the statistic, etc.
DatabaseName
|
TableName
|
StatName
|
Stat
Age
|
DaysStat
Logged
|
PNR_DB
|
ITINERARY_TBL
|
PARTITION
|
34
|
34
|
PNR_DB
|
ITINERARY_TBL
|
ST_281020293276_0_ITINERARY_INFO_ODP
|
34
|
34
|
PNR_DB
|
ITINERARY_TBL
|
TS_ULT_ALZ
|
34
|
34
|
PNR_DB
|
PNR_HEADER_ODP
|
CD_CREATION_OFFICE_ID
|
34
|
34
|
PNR_DB
|
PNR_HEADER_ODP
|
CD_CREATOR_IATA_CODE
|
34
|
34
|
PNR_DB
|
PNR_HEADER_ODP
|
PARTITION
|
34
|
34
|
Identifying Missing Statistics
Two additional DBQL logging options, STATSUSAGE and XMLPLAN, create XML documents that identify which statistics were used, and ones that the optimizer looked for but did not find. Contrary to USECOUNT, these two logging options should be turned on temporarily and only as needed for analysis. These two options are enabled the same way as other DBQL logging is, by User or Account. Their output can be found in DBC.DBQLXMLTbl.
STATSUSAGE logs the usage of existing statistics within a query, as well as recommendations for new statistics that were found to be missing when the query was optimized. It does this without tying the recommendation to a particular query step in the plan. The relationship to query steps can only be seen if XMLPLAN logging is also enabled. XMLPLAN logs detailed step data in XML format.
Enable STATSUSAGE when looking for missing statistics always, with or without XMLPLAN. If you enable XMLPLAN by itself without STATSUSAGE, no statistics-related information of any kind is logged into DBQLXMLTbl. STATSUSAGE provides all the statistics recommendations and, if both are being logged, those statistic recommendations can be attached to specific steps in the plan.
Because XMLPLAN comes with increased overhead, for the purposes of identifying missing statistics, it usually is sufficient to start with STATSUSAGE logging without XMPLPLAN. The step information available in XMLPLAN is more useful when in analysis mode.
(Courtesy: Teradata Forum )