1) Collect perflook (/opt/teradata/gsctools/bin/perflook)
Ideally, collect perflook twice, 2 to 3 minutes apart.
With multiple perflooks we can see how the session/system state changes over time and tell a true hang from slow progress.
2) Collect session status with qrysessn
The qrysessn output is collected by perflook.
Please make sure the hung session is captured there.
(qrysessn output sample)
Host  Session PE    DBC User ID
----- ------- ----- -------------------------------
1     1005    30718 User1
State details : PARSING
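To check that the hung session really appears in the captured output, the sample above can be parsed with a few lines of Python. This is only a minimal sketch: it assumes the four-column row and the "State details" line look exactly like the sample, and the names `parse_sessions` and `is_captured` are hypothetical helpers, not part of any Teradata tool. Real qrysessn output may carry more columns and detail lines.

```python
import re

# Sample text copied from the qrysessn excerpt above (layout is an assumption).
SAMPLE = """\
Host Session PE DBC User ID
----- ------- ----- -------------------------------
1 1005 30718 User1
State details : PARSING
"""

# A session row: host number, session number, PE number, user name.
ROW = re.compile(r"^\s*(\d+)\s+(\d+)\s+(\d+)\s+(\S+)\s*$")
# The detail line that follows a session row.
STATE = re.compile(r"^\s*State details\s*:\s*(\S.*?)\s*$")


def parse_sessions(text):
    """Return one dict per session row, with the state that follows it."""
    sessions = []
    for line in text.splitlines():
        row = ROW.match(line)
        if row:
            host, session, pe, user = row.groups()
            sessions.append({
                "host": int(host),
                "session": int(session),
                "pe": int(pe),
                "user": user,
                "state": None,
            })
            continue
        state = STATE.match(line)
        if state and sessions:
            # Attach the state to the most recent session row.
            sessions[-1]["state"] = state.group(1)
    return sessions


def is_captured(text, session_no):
    """True if the given session number appears in the parsed output."""
    return any(s["session"] == session_no for s in parse_sessions(text))
```

For example, `is_captured(SAMPLE, 1005)` is true for the sample, and `parse_sessions(SAMPLE)[0]["state"]` gives the state string to compare between two perflook collections.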
3) Collect session status with gtwglobal (/usr/tgtw/bin/gtwglobal)
Collect this before attempting to abort the session, if you have not tried already. (An abort attempt may change the session state shown in gtwglobal.)
(run the following commands on a TPA node)
# /usr/tgtw/bin/gtwglobal
se ho <host_no> <<<<<< specify the host number found in qrysessn (usually 1)
1> di se <session_no> long <<<<<< specify the session number found in qrysessn
Command and output sample:
Enter gateway command or enter h for Help:
se ho 1
Host 1 has been selected.
Enter gateway command or enter h for Help:
1>di se 29989 long
Session 29989 connected to GTW 22528 is assigned to PE 30718 of host 1
User Name            Account              IP Addr                                       Port
-------------------- -------------------- --------------------------------------------- ----------
DBC                                       10.210.190.203                                35119
State                                   Event                    Action
--------------------------------------- ------------------------ ------------------------
CS_CLIENTWAITNOTRAN                     CE_STARTMSGRSPNOTRAN     CA_SENDDBSRSP
Partition        Authentication
---------------- --------------
DBC/SQL          DATABASE
:
Review the "State" column in the output. The "Utilities" guide describes each state.
In the sample above, CS_CLIENTWAITNOTRAN means the session is waiting for a request from the client.
(GSO/Client needs to look into what the client application is doing.)
If there are multiple hung/stuck sessions, repeat the commands above for each of them.
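When several sessions are stuck, it can help to prepare the gateway commands up front and to pull the State/Event/Action row out of each saved output. The sketch below assumes the output layout matches the sample above; `gateway_commands` and `gateway_state` are hypothetical helper names, and the code only builds and reads text, it does not drive gtwglobal itself.

```python
# Sample text based on the "di se <session_no> long" output shown above.
SAMPLE = """\
Session 29989 connected to GTW 22528 is assigned to PE 30718 of host 1
State Event Action
--------------------------------------- ------------------------ ------------------------
CS_CLIENTWAITNOTRAN CE_STARTMSGRSPNOTRAN CA_SENDDBSRSP
"""


def gateway_commands(host_no, session_numbers):
    """Build the commands to type: select the host once, then one
    'di se <n> long' per hung session."""
    commands = ["se ho %d" % host_no]
    for session_no in session_numbers:
        commands.append("di se %d long" % session_no)
    return commands


def gateway_state(text):
    """Return the state, event and action from the first data row under the
    'State Event Action' header, or None if the header or row is missing."""
    lines = text.splitlines()
    for i, line in enumerate(lines):
        if line.split() != ["State", "Event", "Action"]:
            continue
        for candidate in lines[i + 1:]:
            fields = candidate.split()
            # Skip blank lines and the dashed separator under the header.
            if not fields or set(candidate.strip()) <= {"-", " "}:
                continue
            if len(fields) == 3:
                return dict(zip(("state", "event", "action"), fields))
            return None
    return None
```

For the sample, `gateway_state(SAMPLE)["state"]` is `CS_CLIENTWAITNOTRAN`, which is the value to look up in the "Utilities" guide.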
4) Run lokdisp and collect blocking info
Blocking info may be helpful even when the session is NOT in BLOCKED status.
The session status remains PARSING when an express request is blocked.
(An express request is one the PE issues to retrieve data from the AMPs during optimization.)
Type the following commands in lokdisp:
> bl       <<<< display blocking info
> a        <<<< send the request to all AMPs
> tr       <<<< display transaction locks
> (AMP No) <<<< specify the AMP number where blocking is found
Repeat the "tr" command on every AMP where blocking is found.
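The lokdisp sequence above can be written out as a checklist of inputs, so that no blocking AMP is missed when "tr" has to be repeated. This is a minimal sketch: `lokdisp_inputs` is a hypothetical helper, the AMP numbers in the usage note are made up, and the code only produces the list of inputs to type. Whether lokdisp accepts them from a script is not covered here.

```python
def lokdisp_inputs(blocking_amps):
    """Return the inputs to type in lokdisp: 'bl' against all AMPs first,
    then one 'tr' plus AMP number for each AMP where blocking was found."""
    inputs = ["bl", "a"]
    # De-duplicate and sort so each blocking AMP is queried exactly once.
    for amp in sorted(set(blocking_amps)):
        inputs.append("tr")
        inputs.append(str(amp))
    return inputs
```

For example, if "bl" reports blocking on AMPs 3 and 1, `lokdisp_inputs([3, 1, 3])` yields `bl`, `a`, `tr`, `1`, `tr`, `3`.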
5) (If the session is in aborting state) Check whether rollback is running
An aborted session remains in the aborting state until rollback completes.
Run this command in the rcvmanager utility:
> LIST ROLLBACK TABLES;
6) See if ABORT SESSION can abort the hung session
Please get customer approval before running ABORT SESSION.
If ABORT SESSION works, you do not need to move on to the next step.
7) Run tpareset -d if the session cannot be aborted
Please make sure to get customer approval before running tpareset.
Engage GSO/DBS and let them review the system before tpareset so we can identify which node's dump should be uploaded.
8) Upload dump to TD GSC team
Do not upload the full dump if the system has more than 4 nodes. In that case, upload only the node(s) running the hung session's tasks.