Friday, April 12, 2019

How to collect the necessary info for a session stuck/hang issue?

Objective
How do you collect the necessary info for a session stuck/hang issue (including issues where the session remains in the IDLE state)?
 
A session can become stuck or hung in various states (e.g. ABORTING, PARSING, RESPONSE).

This article describes the general steps for collecting the necessary info for any session stuck/hang issue.
 
Point:
 A session that remains in the RESPONSE state does not necessarily indicate a DBS issue.
 The RESPONSE state indicates that a response to a session request is in progress and that DBS internal activity has already completed.
  (e.g. the session state remains RESPONSE until the JDBC application closes the query execution object)

Procedure
1) Collect perflook (/opt/teradata/gsctools/bin/perflook)
   Ideally, collect perflook twice, 2 to 3 minutes apart.
   With multiple perflooks we can see how the session/system state changes over time (truly hung, or just moving slowly).
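
   The two-collection approach above can be sketched as a small Python helper. This is only an illustration: the helper name collect_twice is hypothetical, and starting perflook with no arguments is an assumption, so check how perflook is invoked on your system before using anything like this.

```python
import subprocess
import time

# Path taken from this article. Running it with no arguments is an assumption.
PERFLOOK = "/opt/teradata/gsctools/bin/perflook"


def collect_twice(command, interval_seconds=150, run=subprocess.run, sleep=time.sleep):
    """Run `command` twice, pausing `interval_seconds` between the two runs.

    `run` and `sleep` are injectable so the helper can be exercised
    without actually launching perflook or waiting.
    """
    results = []
    for attempt in range(2):
        if attempt > 0:
            sleep(interval_seconds)  # default 150 s falls inside the 2 to 3 minute window
        results.append(run(command, check=False))
    return results


# Example (not executed here): collect_twice([PERFLOOK])
```

   The default interval of 150 seconds is simply the midpoint of the 2 to 3 minute range suggested above.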
 
 
 
2) Collect session status with qrysessn
   qrysessn output is collected by perflook.
   Please make sure the hung session is captured there.
 
  (qrysessn output sample)
 
  Host   Session  PE     DBC User ID
  -----  -------  -----  -------------------------------
      1     1005  30718  User1
  State details : PARSING
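
  To pick the hung session out of a saved qrysessn output, a small parser along these lines may help. It is a minimal sketch that assumes the output looks exactly like the sample above (one session row followed by a "State details" line); real qrysessn output may contain more fields, and the names parse_qrysessn, ROW and STATE are hypothetical.

```python
import re

# Matches a session row such as "      1     1005  30718  User1".
ROW = re.compile(r"^\s*(\d+)\s+(\d+)\s+(\d+)\s+(\S+)\s*$")
# Matches a state line such as "  State details : PARSING".
STATE = re.compile(r"^\s*State details\s*:\s*(.+?)\s*$")


def parse_qrysessn(text):
    """Return one dict per session row found in `text`."""
    sessions = []
    for line in text.splitlines():
        row = ROW.match(line)
        if row:
            host, session, pe, user = row.groups()
            sessions.append({"host": int(host), "session": int(session),
                             "pe": int(pe), "user": user, "state": None})
            continue
        state = STATE.match(line)
        if state and sessions:
            # Attach the state to the most recently seen session row.
            sessions[-1]["state"] = state.group(1)
    return sessions


SAMPLE = """\
  Host   Session  PE     DBC User ID
  -----  -------  -----  -------------------------------
      1     1005  30718  User1
  State details : PARSING
"""

print(parse_qrysessn(SAMPLE))
```

  The host and session numbers returned here are the values needed for gtwglobal in the next step.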
 
 
3) Collect session status with gtwglobal (/usr/tgtw/bin/gtwglobal)
    If you have not tried to abort the session yet, please collect this info before doing so. (The session state in gtwglobal may change because of the abort attempt.)
 
 (run the following commands on a TPA node)
  # /usr/tgtw/bin/gtwglobal
 se ho <host_no>                 <<<<<< specify the host number found in qrysessn (usually 1)
 1> di se <session_no> long      <<<<<< specify the session number found in qrysessn
 
  Command and output sample:
    Enter gateway command or enter h for Help:
   se ho 1
   Host 1 has been selected.
    Enter gateway command or enter h for Help:
   1>di se 29989  long
   Session 29989 connected to GTW 22528 is assigned to PE 30718
   of host 1
   User Name            Account              IP Addr                                       Port
   -------------------- -------------------- --------------------------------------------- ----------
   DBC                                       10.210.190.203                                35119
   State                                   Event                    Action       
   --------------------------------------- ------------------------ ------------------------
   CS_CLIENTWAITNOTRAN                     CE_STARTMSGRSPNOTRAN     CA_SENDDBSRSP  
   Partition        Authentication
   ---------------- --------------
   DBC/SQL          DATABASE
      :

  Review the "State" column in the output. The "Utilities" guide has a description of each "State".
  In the sample above, CS_CLIENTWAITNOTRAN means the session is waiting for a request from the client.
  (GSO/Client needs to look into what the client application is doing)

  If there are multiple hung/stuck sessions, please repeat the commands above for each of them.
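
  When several sessions are hung, the command list and the state check can be prepared with a short script. The sketch below only builds the text of the commands shown above and reads the State/Event/Action row from saved output; it does not drive gtwglobal itself. The function names are hypothetical, and it assumes that a host stays selected after "se ho" and that the output layout matches the sample.

```python
def gtwglobal_commands(sessions):
    """Build gtwglobal commands for an iterable of (host_no, session_no) pairs."""
    commands = []
    current_host = None
    for host, session in sessions:
        if host != current_host:
            # Select the host only when it changes.
            commands.append(f"se ho {host}")
            current_host = host
        commands.append(f"di se {session} long")
    return commands


def extract_state(output):
    """Return (state, event, action) from 'di se ... long' output, or None."""
    lines = output.splitlines()
    for i, line in enumerate(lines):
        # The values sit two lines below the header, after the dashed rule.
        if line.split()[:3] == ["State", "Event", "Action"] and i + 2 < len(lines):
            fields = lines[i + 2].split()
            if len(fields) >= 3:
                return tuple(fields[:3])
    return None


SAMPLE = """\
   State                                   Event                    Action
   --------------------------------------- ------------------------ ------------------------
   CS_CLIENTWAITNOTRAN                     CE_STARTMSGRSPNOTRAN     CA_SENDDBSRSP
"""

print(gtwglobal_commands([(1, 29989), (1, 1005)]))
print(extract_state(SAMPLE))
```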
 
 
4) Run lokdisp and collect blocking info
    Blocking info may be helpful even when the session is NOT in BLOCKED status.
    The session status remains PARSING when an express request is blocked.
    (An express request is used by the PE to retrieve data from the AMPs during the optimization process.)
 
     Type the following commands in lokdisp:
    > bl                <<<< display blocking info
    > a                 <<<< send the request to all AMPs
    > tr                <<<< display transaction locks
    > (AMP No)          <<<< specify the AMP # where blocking is found
 
     Please repeat the "tr" command for every AMP where blocking is found.
 
 
5) (If the session is in the ABORTING state) Check whether a rollback is running
    An aborted session remains in the ABORTING state until the rollback is completed.
    Run this command in the rcvmanager utility:
    LIST ROLLBACK TABLES; 
 
 
6) Check whether ABORT SESSION can abort the hung session
   Please get customer approval before running ABORT SESSION.
   If ABORT SESSION works, you do not need to move on to the next step.
 
 
7) Run tpareset -d if the session cannot be aborted.
    Please make sure to get customer approval before running tpareset.
    Engage GSO/DBS and have them review the system before tpareset so that we can identify the node to be uploaded.
 
 
8) Upload the dump to the TD GSC team
    Do not upload the full dump if the system has more than 4 nodes. Please upload only the node(s) that have the hung session's tasks.