Calculating psutils ps etimes for Java Threads
10 | 31 Jan 2017In older versions of psutils, the application ps does not have etimes. This is a quick documentation on how to gather etimes from the procfs.
What is etimes?
Man:
etimes ELAPSED elapsed time since the process was started, in seconds.
Example:
$ ps -Lo pid,etimes,comm --pid 4398 --sort +etimes
PID ELAPSED COMMAND
4398 359 java
4398 358 java
4398 358 java
4398 354 java
4398 351 java
4398 351 java
4398 349 gmain
4398 349 gdbus
4398 348 java
4398 329 java
4398 52 java
...
The etimes is the number of seconds that have passed since a process was created. Checking the source code of psutils we can see;
etimes = seconds_since_boot - (unsigned long)(pp->start_time / Hertz);
1. seconds_since_boot we can get from /proc/stat
2. start_time we can get from /proc//task//stat
3. Hertz we can get with a c program or the utility getconf
Seconds Since Boot
we can calculate the seconds since boot by subtracting the boot time from the current time. In the file /proc/stat we can find the following;
btime 769041601
boot time, in seconds since the Epoch, 1970-01-01 00:00:00 +0000 (UTC).
with:
$ cat /proc/stat | grep btime | awk '{print $2}'
1485890751
and we can get the current time since epoch using:
$ date +%s
1485893129
so we know this system has been up for a total of 2378 seconds.
Process Start Time
The process start time can be taken from /proc//task//stat;
(22) starttime %llu
The time the process started after system boot. In kernels before Linux 2.6, this value was expressed in jiffies. Since Linux 2.6, the value is expressed in
clock ticks (divide by sysconf(_SC_CLK_TCK)).
with this we can get the value:
$ cat /proc/$JAVA_PID/task/$LWP/stat | awk '{print $22}'
140205
Hertz
For the purpose of this bash script we can get the hertz of the system using:
$ getconf CLK_TCK
100
Calculating
With all the numbers gatherd we can now calculate:
$ bc <<< "scale=2; 2378 - (140205/100)"
975.95
Comparing and Validating
$ bc <<< "scale=2; $(($(date +"%s")-$(cat /proc/stat | grep btime | awk '{print $2}'))) - ($(cat /proc/4398/task/4616/stat | awk '{print $22}')/$(getconf CLK_TCK))"; ps -Lo lwp,etime,etimes --pid 4398 | grep 4616
4250.95
4616 01:10:51 4251
We can see the value we've pulled from procfs matches the value provided from psutil.
Why do all of this rather that formating etime?
Getting to know my system better... it also opens the door to a vast array of other stats that can be pulled for monitoring purposes including per thread netstat.
Full Bash Script For Java Thread Dumps:
kept up to date in github: https://github.com/mikejonesey/take-jstacks/blob/master/take-stacks.sh
#!/bin/bash
################################################
# take-stacks.sh
# take stack traces with the pcpu mapped to each thread
# this can be used to identify which threads are consuming cpu
# Additional Info added to JAVA Thread Dumps:
# - pcpu (percentage of cpu usage on the current core)
# - cpu core
# - page faults major + minor
# - etimes (how long the process has been running)
# Author Michael Jones
# Date 07/08/2015
################################################
#
TRACE_COUNT=1
TRACE_INTERVAL=1
JAVA_PID=$(ps -ef | grep java | grep -v grep | tail -1 | awk '{print $2}')
TMP_DIR="/tmp/$(date +"%Y-%m-%d")"
mkdir $TMP_DIR 2>/dev/null
ps -o etimes &>/dev/null
if [ "$?" == "0" ]; then
etimes_valid=true
else
etimes_valid=false
fi
for ((i=0; i<$TRACE_COUNT; i++)); do
TRACETIME=$(date +"%s")
if [ $etimes_valid == true ]; then
EXINF=$(ps -eLo pcpu,lwp,sgi_p,maj_flt,min_flt,etimes,cmd | grep java | grep -v grep | sort -k1,1 -n | while read al; do nid=$(printf "0x%xn" $(echo "$al" | awk '{print $2}')); echo "$nid $al"; done)
else
SYS_BOOTTIME=$(cat /proc/stat | grep btime | awk '{print $2}') # boot time, in seconds since the Epoch (January 1, 1970)
SYS_JIFFPS=$(getconf CLK_TCK) # jiffies ps
SYS_CURTIME=$(date +"%s") # time since epo
SYS_UPTIME=$(($SYS_CURTIME-$SYS_BOOTTIME)) # uptime in secs
EXINF=$(ps -eLo pcpu,lwp,sgi_p,maj_flt,min_flt,cmd | grep java | grep -v grep | sort -k1,1 -n | while read al; do
LWP=$(echo "$al" | awk '{print $2}')
if [ -f "/proc/$JAVA_PID/task/$LWP/stat" ]; then
nid=$(printf "0x%xn" $LWP)
THREAD_STIME=$(cat /proc/$JAVA_PID/task/$LWP/stat | awk '{print $22}')
THREAD_ETIME=$(bc <<< "scale=2; $SYS_UPTIME - ($THREAD_STIME/$SYS_JIFFPS)")
TIMESTAMP=$(date -d@$(($SYS_CURTIME-$(echo "$THREAD_ETIME" | sed 's/..*//'))) +"%Y-%m-%d_%H:%M:%S")
etimes="$THREAD_STIME $THREAD_ETIME $TIMESTAMP"
else
etimes="-1 -1 -1"
fi
echo "$nid $etimes $al"
done)
fi
jstack -l "$JAVA_PID" > $TMP_DIR/stack-$TRACETIME.out
echo "$EXINF" | while read al; do
NID=$(echo "$al" | awk '{print $1}')
if [ $etimes_valid == true ]; then
ESTR="$(echo "$al" | awk '{print "pcpu=" $2 " core=" $4 " page=" $5 " mpage=" $6 " secs=" $7}')"
else
ESTR="$(echo "$al" | awk '{print "pcpu=" $5 " core=" $7 " page=" $8 " mpage=" $9 " jifftime=" $2 " secs=" $3 " time=" $4}')"
fi
sed -i "s/nid=$NID/nid=$NID $ESTR/" $TMP_DIR/stack-$TRACETIME.out
done
sleep $TRACE_INTERVAL
done