Friday, April 12, 2019

How to collect necessary info for session stuck/hang issue?

Objective
How to collect necessary info for session stuck/hang issue(including issue wherein session remains IDLE state)?  
 
Session stuck/hang can occur in various state (e.g. Aborting, PARSING, RESPONSE etc)

This article describes general step to collect necessary info in any session stuck/hung issue.
 
Point:
 It may not be DBS issue when session state remains in RESPONSE state.
 The RESPONSE state indicates a response to a session request is in process wherein DBS internal activity already completed.
  (e.g. session state remains RESPONSE until JDBC application closes query execution object)

Procedure
1) Collect perflook (/opt/teradata/gsctools/bin/perflook)
   Ideally, please collect perflook twice with 2~3 min interval.
   We can see transition of session/system state(really hang or slowly moving) if we have multiple perflooks.
 
 
 
2) Collect session status with qrysessn
   qrysessn output is collected by perflook.
   Please make sure hang session is captured there. 
 
  (qrysessn output sample)
 
  Host   Session  PE     DBC User ID
  -----  -------  -----  -------------------------------
      1     1005  30718  User1
  State details : PARSING
 
 
3) Collect session status with gtwglobal (/usr/tgtw/bin/gtwglobal)
    Please collect this info before trying to abort session if you did not try to abort yet. (session state in gtwglobal may change due to abort attempt)
 
 (run following command on TPA node)
  # /usr/tgtw/bin/xgtwglobal
 se ho <host_no>                 <<<<<< specify host number found in qrysessn (usually 1)
 1> di se <session_no> long      <<<<<< specify sessionno found in qrysessn
 
  Command and output sample:
    Enter gateway command or enter h for Help:
   se ho 1
   Host 1 has been selected.
    Enter gateway command or enter h for Help:
   1>di se 29989  long
   Session 29989 connected to GTW 22528 is assigned to PE 30718
   of host 1
   User Name            Account              IP Addr                                       Port
   -------------------- -------------------- --------------------------------------------- ----------
   DBC                                       10.210.190.203                                35119
   State                                   Event                    Action       
   --------------------------------------- ------------------------ ------------------------
   CS_CLIENTWAITNOTRAN                     CE_STARTMSGRSPNOTRAN     CA_SENDDBSRSP  
   Partition        Authentication
   ---------------- --------------
   DBC/SQL          DATABASE
      :

  Review "State" of the state in the output. "Utilities" guide has description for each "State".
  In sample above, CS_CLIENTWAITNOTRAN
 means the session is waiting for a request from client.
  (GSO/Client needs to look into what the client App is doing)

  If we have multiple hung/stuck sessions, please repeat the command above for all hung sessions.
 
 
4) Run lokdisp and collect blocking info
    Blocking info may be helpful even when session is NOT in BLOCKED status.
    Session status remains in PARSING when express request is blocked.
    (express request is for PE to retrieve data from AMP during optimizing process)
 
     Type following command in lokdisp:     
    > bl                            <<<< display blocking info
    > a                              <<<< request to all AMP
    > tr                            <<<< display transaction lock
    > (AMP No)          <<<< specify AMP # where blocking is found
 
     Please repeat  "tr" command on all AMPs where blocking is found.
 
 
5) (If session is in aborting state) See if rollback is running
    Aborted session will remain in aborting state until rollback is completed.
    Run this command in rcvmanager utility:
    LIST ROLLBACK TABLES; 
 
 
6) See if ABORT SESSION can abort hang session
   Please get customer approval before running ABORT SESSION.
   If ABORT SESSION works, you don't need to move on to next step.
 
 
7) Run tpareset -d if it's unable to abort session.
    Please make sure to get customer approval before tpareset.
    Engage GSO/DBS and let them review system before tpareset so we can identify the node to be uploaded. 
 
 
8) Upload dump to TD GSC team
    Don't upload full dump if system has more than 4 nodes. Please upload node(s) with hang session tasks only.
   
   

Tuesday, February 2, 2016

When your mailx command is not sending mails ?

Then check for the file /etc/postfix/main.cf for relayhost is set to right server. then do  restart service.
command to restart mailx services : service postfix restart

Monday, November 16, 2015

How to recover left over Spool space in Teradata

The Teradata system automatically creates and drops spool files which it uses as temporary work tables to execute queries. Sometimes, one of two potential problems can occur with the spool files:

Leftover spool: Leftover spool occurs when the system fails to properly drop the spool file after the execution of a query completes. The spool that the query was using appears frozen without a session. This may result in error 2667 which aborts the transaction. Note that with error 2667, the system terminates the query not because the query was faulty, but as a result of the leftover spool problem.

Phantom spool: Phantom spool occurs when the DBC.DatabaseSpace table is not properly updated to reflect current and actual space usage.

To check for either types of spool problems, run the following query when the system is most quiescent and when there are the least amount of users logged on:

To Identify the left over spool, use the following SQL.

/* LEFT OVER SPOOL */
sel databasename
,sum(currentspool)
from DBC.DiskSpace
where databasename NOT IN
(sel username from DBC.SessionInfo)
group by 1 having sum(currentspool) > 0
order by 2 desc;

/* Left over spool - by vprocs */
SELECT DATABASENAME, VPROC, CURRENTSPOOL
FROM DBC.DISKSPACE
WHERE DATABASENAME NOT IN (SEL USERNAME FROM DBC.SESSIONINFO)
AND CURRENTSPOOL > 0
ORDER BY 1,2 with sum(currentspool);

If the above query returns rows, run the Update Space (updatespace) utility to update the spool space accounting in DBC.DatabaseSpace for each of the databases.
If the problem is fixed using Update Space, then the problem was phantom spool. If running Update Space does not fix the problem, you have leftover spool and the only way to fix the problem is to restart the system.


To clear the same follow the below procedure:
1. Login as root in the unix box (NODE)
2. cnsterm 6 (open a suprvsr session)
4. start updatespace (start the utility)
5. Ctrl+c to exit this window
6. cnsterm x (where x is whatever window it's started in - output from the cnsterm 6 screen)

You will see screen with

The format of the input command is:

UPDATE [SPOOL | TEMPORARY | ALL] SPACE FOR {ALL DATABASES | dbname} ;

7. If u decide to do it db by db, then type continue after each update.
8. type quit to exit
9. ctrl+c to exit the window
10. cnsterm 6
11. stop x (where x is the update space utility run window)


Log from the activity ;

Enter Command
> update spool space for myDb;
update spool space for myDb;

Updating space for myDb
Space updated for myDb
Enter QUIT or CONTINUE.
> continue
continue


Enter Command
> update spool space for myDb2;
update spool space for myDb2;

Updating space for myDb2
Space updated for myDb2
Enter QUIT or CONTINUE.

Monday, February 16, 2015

Handling informatica failure with an MLoad job

Some of you may have experienced a failure in an MLoad job.  Due to the way the Informatica session was set up, when the job was re-run, a new error, similar to the following was given:

[ERROR] Type:(Teradata DBS Error), Error: (Worktable is missing for xxx during MLoad restart.)

This is happening because the "Drop Log/Error/Work Tables" attribute on the target in the session is checked.  This attribute tells Informatica to drop those tables before it starts the MLoad regardless if an MLoad was in progress in the prior run.  Unfortunately, by dropping the work and log tables, an MLoad that needed to be restarted cannot.  This is when the DBA gets called to drop and recreate the target table.  Well, we shouldn't need to do that anymore!!!

Here are the steps that should be followed when a failure occurs:

1)      Do a "select *" from the target table with a "sample 1" to see if a lock exist on the table.  Note: a "select count(*)" will not work for this purpose.
a.       If the query returns no error, then the job can be restarted.  Proceed to #2.
b.      If the query returns an error that a lock exists on the table, then a DBA needs to be engaged to drop that lock.  While waiting for that, you can do #2, but do not do #3 until the lock has been released.
Ex: Release mload on <tablename> ; --> if failed in aquisition phase.
release mload <tabname> in apply; -- if failed in appln phase.

2)      Ready the job - Copy the job to the support folder to make sure the "Drop Log/Error/Work Tables" attribute is unchecked.  The standard will change for development to have this box unchecked.  If the session already has this box unchecked, then there will not be a need to copy it to the support folder first.

3)      Re-run the job - Remember, Informatica has no idea that the MLoad was in progress and that an MLoad restart will commence.  Informatica is simply submitting the MLoad to Teradata.  Teradata, on the other hand, due to the data in the log table and work table, knows to do an MLoad restart.  Rerun the job.


4)      If there is another failure after this, start at #1 again.

Friday, February 13, 2015

Identifying Used, Unused and Missing Statistics in Teradata14.10

New DBQL logging options USECOUNT and STATSUSAGE, introduced in Teradata Database 14.10, enable the logging of used and missing statistics.  The output of this logging can be utilized to find used, unused, and missing statistics globally (for all queries) or for just a subset of queries.
The USECOUNT logging option, which starting in Teradata Database 14.10 supports the object use count (OUC) feature, is associated with the database that owns the table.  USECOUNT logs usage from all requests against that table irrespective of the user who submitted the query.  This option logs the object usage from such objects as databases, tables, indexes, join indexes, and statistics. Additionally, it logs insert, update, and delete counts against the database’s tables.  The STATSUSAGE logging option, similar to other DBQL logging options, is associated directly to a user and logs the used and missing statistics at the query level within an XML document.
In this article, we focus on the logging of the used statistics, how to retrieve them and join them to other dictionary tables to find the global list of used and unused statistics and also the missing statistics at query level.
Enabling the DBQL USECOUNT Option
Each statistic you define becomes an object that USECOUNT logging can track.  Each time the optimizer actually makes use of a specific statistic, an access counter in the new DBC.ObjectUsage table gets incremented.  So at any point in time you can look in the data dictionary and evaluate how frequently (or even if ever) a particular statistic is being used.
The descriptions of the statements to enable object use count (OUC) logging on a given database or a user are given below.  It is especially useful to enable the USECOUNT option on the databases that own permanent tables.  It can be enabled on users or user accounts also, but you would do that only when these accounts have tables on which you want the logging enabled. 
Description
Command
To enable on a database (which is not a user). Note that other DBQL options are not allowed to be enabled on a database.
BEGIN QUERY LOGGING WITH USECOUNT ON <database Name>;
To enable on a user with no DBQL options enabled.
BEGIN QUERY LOGGING WITH USECOUNT ON <User Name>;
To enable on a user having some existing DBQL options enabled (use SHOW QUERY LOGGING ON <User> to find the current DBQL options).
REPLACE QUERY LOGGING WITH <current options>, USECOUNT ON <User Name>;
 
The following are exceptions and special scenarios in OUC logging:
  1. OUC logging is not applicable for DBC tables.  In other words, the OUC feature doesn’t log the use counts for statistics you may be collecting on dictionary objects.
  2. OUC logging is not applicable for temporary tables.
  3. Dropping statistics also drops the corresponding entries in DBC.ObjectUsage table.  So, if you are dropping and recollecting statistics, the use counts reflect usage from the last collection only.
  4. Unlike other DBQL options, disabling USECOUNT invalidates (resets the counts and timestamp) the corresponding rows in DBC.ObjectUsage. It is not recommended to disable this option unless there is a specific reason to do so.
The following sections describe how to find these statistics for each category along with examples.
Identifying Used Statistics
After enabling DBQL USECOUNT and letting some time pass, the following query can be used to query the dictionary table DBC.ObjectUsage to get the use counts of the existing statistics.  The number of days a statistic has been continuously under the control of USECOUNT logging is given by the column DaysStatLogged; a value of -1 for this column indicates that USECOUNT option has been disabled after being enabled for some time in the past.  You need to consider the number of days a statistic was a candidate for logging along with use counts to normalize the comparison across different statistics.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
SELECT DBC.DBase.DatabaseName AS DatabaseName
      ,DBC.TVM.TVMName        AS TableName
      ,COALESCE(DBC.StatsTbl.StatsName
         ,DBC.StatsTbl.ExpressionList
               ,'SUMMARY')    AS StatName
      ,OU.UserAccessCnt       AS UseCount
      ,CAST(OU.LastAccessTimeStamp AS DATE) AS DateLastUsed
      ,CASE
       WHEN DBC.DBQLRuleTbl.TimeCreated IS NULL
       THEN -1  -- Logging disabled
       WHEN DBC.DBQLRuleTbl.TimeCreated > DBC.StatsTbl.CreateTimeStamp
       THEN CURRENT_DATE - CAST(DBC.DBQLRuleTbl.TimeCreated AS DATE)
       ELSE CURRENT_DATE - CAST(DBC.StatsTbl.CreateTimeStamp AS DATE)
       END AS DaysStatLogged
FROM DBC.ObjectUsage OU
    ,DBC.Dbase
    ,DBC.TVM
    ,DBC.StatsTbl LEFT JOIN DBC.DBQLRuleTbl
           ON DBC.StatsTbl.DatabaseId = DBC.DBQLRuleTbl.UserID
          AND DBQLRuleTbl.ExtraField5 = 'T'   /* Comment this line if TD 15.0 or above   */
        /*AND DBQLRuleTbl.ObjectUsage = 'T'*/ /* Uncomment this line if TD 15.0 or above */
WHERE DBC.Dbase.DatabaseId    = OU.DatabaseId
  AND DBC.TVM.TVMId           = OU.ObjectId
  AND DBC.StatsTbl.DatabaseId = OU.DatabaseId
  AND DBC.StatsTbl.ObjectId   = OU.ObjectId
  AND DBC.StatsTbl.StatsId    = OU.FieldId
  AND OU.UsageType            = 'STA'
ORDER BY 1, 2, 3;
Customize the above query for your databases and tables of interest, number of days a statistic is logged, etc.  A sample output of the above query is given below. 
Database
Name
TableName
StatName
Use
Count
DateLast
Used
DaysStat
Logged
DBA_TEST
PARTY_DATA
CURR_FLAG
209
11/15/2014
-1
PNR_DB
CODE_SHARE
FK_ITIODPAD_ITINERARY_INFO_ODP
253
11/1/2014
34
PNR_DB
ITINERARY_TBL
AD_ITINERARY_INFO_ODP
203
11/1/2014
34
PNR_DB
ITINERARY_TBL
CD_ID_PRODUCT
203
11/1/2014
34
PNR_DB
ITINERARY_TBL
CD_ORIGINAL_STATUS_CODE,CD_STATUS,
FK_PNRODP_AD_PNR_HEADER_ODP
203
11/2/2014
34
PNR_DB
PNR_HEADER_ODP
AD_PNR_HEADER_ODP
600
11/28/2014
34
PNR_DB
PNR_HEADER_ODP
AD_PNR_HEADER_ODP,CD_CONTROL_NUMBER_LOC
700
11/28/2014
34

 Identifying Unused Statistics
Using USECOUNT to find unused statistics requires consideration of multiple time dimensions.  The basic question being answered here is:  Am I collecting statistics that have not been used for some duration?  To answer this question, you need to consider not only the usage counts, but also how long the USECOUNT logging was active, the most recent usage time stamp and also the age of the statistics.  For example, if USECOUNT logging has been enabled only for the past few days, some statistics which get used on a monthly/quarterly workload may not yet have been logged as used.  Similarly, if you collect new statistics (not re-collections), you need to let them get exposed to different workloads for certain period of time to properly identify whether they are being used or not.
The following query is designed to consider these aspects and to list out the statistics that meet the following criteria.
  1. The age of the statistic is more than N days (first collected more than N days ago).
  2. DBQL USECOUNT logging is active for more than N days on the owning database.
  3. The statistic has not been used in the last N days.
The value of N can be customized based on your requirements. In the example query given below, the value of N is set to 30 (note that N is used in two predicates; both need to be updated if you customize this value).
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
SELECT DBC.DBase.DatabaseName AS DatabaseName
      ,DBC.TVM.TVMName        AS TableName
      ,COALESCE(DBC.StatsTbl.StatsName
               ,DBC.StatsTbl.ExpressionList
               ,'SUMMARY')    AS StatName
      ,CURRENT_DATE - CAST(DBC.StatsTbl.CreateTimeStamp AS DATE) AS StatAge
      ,CASE
       WHEN DatabaseName = 'DBC'
       THEN -2  -- Logging Not Applicable
       WHEN DBC.StatsTbl.StatsType IN ('B', 'M')
       THEN -2  -- Logging Not Applicable on Temp tables (base and materialized)
       WHEN DBC.DBQLRuleTbl.TimeCreated IS NULL
       THEN -1  -- Logging Not Enabled
       WHEN DBC.DBQLRuleTbl.TimeCreated > DBC.StatsTbl.CreateTimeStamp
       THEN CURRENT_DATE - CAST(DBC.DBQLRuleTbl.TimeCreated AS DATE)
       ELSE CURRENT_DATE - CAST(DBC.StatsTbl.CreateTimeStamp AS DATE)
 END AS DaysStatLogged 
FROM   DBC.StatsTbl LEFT JOIN  DBC.DBQLRuleTbl
           ON DBC.StatsTbl.DatabaseId = DBC.DBQLRuleTbl.UserID
          AND DBQLRuleTbl.ExtraField5 = 'T'    /* Comment this line if TD 15.0 or above   */
        /*AND DBQLRuleTbl.ObjectUsage = 'T'*/  /* Uncomment this line if TD 15.0 or above */
      ,DBC.Dbase
      ,DBC.TVM     
WHERE DBC.StatsTbl.DatabaseId = DBC.DBASE.DatabaseId
  AND DBC.StatsTbl.ObjectId   = DBC.TVM.TVMId
  AND NOT EXISTS (SELECT 100 FROM DBC.ObjectUsage OU
                  WHERE OU.UsageType  = 'STA'
                    AND OU.DatabaseId = DBC.StatsTbl.DatabaseId
                    AND OU.ObjectId   = DBC.StatsTbl.ObjectId
                    AND OU.FieldId    = DBC.StatsTbl.StatsId
                    AND CURRENT_DATE - CAST(OU.LastAccessTimeStamp AS DATE) < 30
                 )
  AND DaysStatLogged > 30
  AND DBC.StatsTbl.StatsId <> 0  -- Do not qualify table-level SUMMARY statistics as unused
                                 -- May get implicitly used but not recorded as used
ORDER BY 1, 2, 3;

You can customize the above query to adjust the number of days the statistic has not been used, for the databases and tables of your interest, the age of the statistic, etc.
DatabaseName
TableName
StatName
Stat
Age
DaysStat
Logged
PNR_DB
ITINERARY_TBL
PARTITION
34
34
PNR_DB
ITINERARY_TBL
ST_281020293276_0_ITINERARY_INFO_ODP
34
34
PNR_DB
ITINERARY_TBL
TS_ULT_ALZ
34
34
PNR_DB
PNR_HEADER_ODP
CD_CREATION_OFFICE_ID
34
34
PNR_DB
PNR_HEADER_ODP
CD_CREATOR_IATA_CODE
34
34
PNR_DB
PNR_HEADER_ODP
PARTITION
34
34


 Identifying Missing Statistics
Two additional DBQL logging options, STATSUSAGE and XMLPLAN, create XML documents that identify which statistics were used, and ones that the optimizer looked for but did not find.  Contrary to USECOUNT, these two logging options should be turned on temporarily and only as needed for analysis.  These two options are enabled the same way as other DBQL logging is, by User or Account.  Their output can be found in DBC.DBQLXMLTbl.
STATSUSAGE logs the usage of existing statistics within a query, as well as recommendations for new statistics that were found to be missing when the query was optimized.  It does this without tying the recommendation to a particular query step in the plan.  The relationship to query steps can only be seen if XMLPLAN logging is also enabled.  XMLPLAN logs detailed step data in XML format.
Enable STATSUSAGE when looking for missing statistics always, with or without XMLPLAN.  If you enable XMLPLAN by itself without STATSUSAGE, no statistics-related information of any kind is logged into DBQLXMLTbl.  STATSUSAGE provides all the statistics recommendations and, if both are being logged, those statistic recommendations can be attached to specific steps in the plan.
Because XMLPLAN comes with increased overhead, for the purposes of identifying missing statistics, it usually is sufficient to start with STATSUSAGE logging without XMPLPLAN.  The step information available in XMLPLAN is more useful when in analysis mode.

(Courtesy: Teradata Forum )