January 13th, 2011 10:00
pkill with new server_stats
I've been using the (super! cool! long-awaited!!) new server_stats facility in DART 6.0 to collect all kinds of cool stuff. I'm trying to build a script to collect statistics via cron, then run a cron script to kill the server_stats collection.
My problem is, when I kill or pkill the parent PID of my script, it doesn't kill all of the child processes, because a new child seems to be forked by the underlying Java. Let's watch my pax script in action:
[nasadmin@NS80-CS0 scripts]$ ./start-pax-stats.sh (I capture the PID in this script, and write it as /tmp/paxstat.pid)
[nasadmin@NS80-CS0 scripts]$ cat /tmp/paxstat.pid
15996
[nasadmin@NS80-CS0 scripts]$ ps -ef|grep stats
nasadmin 15996 18335 0 13:04 pts/0 00:00:00 /bin/bash ./start-pax-stats.sh
nasadmin 16000 15996 0 13:04 pts/0 00:00:00 /bin/sh /nas/bin/server_stats server_2 -monitor ndmp.pax.session.ALL-ELEMENTS.dataOperator.ALL-ELEMENTS.backupRootDir,ndmp.pax.session.ALL-ELEMENTS.dataOperator.ALL-ELEMENTS.bytes -interval 5 -te no -format csv -file /nas/quota/slot_2/ndmp_logs/pax/pax.1101131304
nasadmin 16002 16000 19 13:04 pts/0 00:00:03 /usr/java/bin/java -DNAS_DB=/nas -server -Xmx256M -cp /nas/stats/lib/server_stats.jar:/nas/http/webui/tools/tomcat/webapps/ROOT/WEB-INF/lib/ccmd-support_en_US.jar:/nas/j_lib/db.jar server_stats server_2 -monitor ndmp.pax.session.ALL-ELEMENTS.dataOperator.ALL-ELEMENTS.backupRootDir,ndmp.pax.session.ALL-ELEMENTS.dataOperator.ALL-ELEMENTS.bytes -interval 5 -te no -format csv -file /nas/quota/slot_2/ndmp_logs/pax/pax.1101131304
Now when I pkill the PPID with pkill -P 15996, I get:
[nasadmin@NS80-CS0 scripts]$ pkill -P 15996
[nasadmin@NS80-CS0 scripts]$ ps -ef|grep stats
nasadmin 16002 1 2 13:04 pts/0 00:00:03 /usr/java/bin/java -DNAS_DB=/nas -server -Xmx256M -cp /nas/stats/lib/server_stats.jar:/nas/http/webui/tools/tomcat/webapps/ROOT/WEB-INF/lib/ccmd-support_en_US.jar:/nas/j_lib/db.jar server_stats server_2 -monitor ndmp.pax.session.ALL-ELEMENTS.dataOperator.ALL-ELEMENTS.backupRootDir,ndmp.pax.session.ALL-ELEMENTS.dataOperator.ALL-ELEMENTS.bytes -interval 5 -te no -format csv -file /nas/quota/slot_2/ndmp_logs/pax/pax.110113130
The processes forked by my script are killed, but the process presumably forked by Java is not. Any ideas how to capture and kill that forked Java process?
Thanks!
Karl
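For what it's worth, the ps output above suggests why the Java process survives: pkill -P 15996 only signals direct children of 15996 (the /bin/sh wrapper, PID 16000), while Java (PID 16002) is a grandchild, so it gets orphaned and re-parented to PID 1. A minimal sketch of one way around that is to walk the tree and kill every descendant. The function name kill_tree is made up for illustration, and this assumes pgrep is available on the Control Station:

```shell
# kill_tree PID (hypothetical helper, not part of server_stats)
# Terminate PID and all of its descendants, deepest first.
kill_tree() {
    local pid child
    pid=$1
    # pgrep -P lists direct children only, so recurse to reach grandchildren
    for child in $(pgrep -P "$pid"); do
        kill_tree "$child"
    done
    # The process may already have exited once its children died
    kill "$pid" 2>/dev/null
}
```

With the PID file from the start script, that would be called as kill_tree "$(cat /tmp/paxstat.pid)".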
dynamox
January 13th, 2011 11:00
Your script and the Java process both have "ndmp" in the command string, so is that going to be unique to the object you are monitoring?
dynamox
January 13th, 2011 10:00
Grep for the server_stats command and kill that PID?
umichklewis_ac7b91
January 13th, 2011 11:00
The problem is, I get multiple hits for that command:
[nasadmin@NS80-CS0 scripts]$ ps -ef|grep server_stats
nasadmin 421 733 0 14:04 pts/1 00:00:00 grep server_stats
nasadmin 32556 32550 0 14:04 pts/0 00:00:00 /bin/sh /nas/bin/server_stats server_2 -monitor ndmp.pax.session.ALL-ELEMENTS.dataOperator.ALL-ELEMENTS.backupRootDir,ndmp.pax.session.ALL-ELEMENTS.dataOperator.ALL-ELEMENTS.bytes -interval 5 -te no -format csv -file /nas/quota/slot_2/ndmp_logs/pax/pax.1101131404
nasadmin 32557 32556 30 14:04 pts/0 00:00:01 /usr/java/bin/java -DNAS_DB=/nas -server -Xmx256M -cp /nas/stats/lib/server_stats.jar:/nas/http/webui/tools/tomcat/webapps/ROOT/WEB-INF/lib/ccmd-support_en_US.jar:/nas/j_lib/db.jar server_stats server_2 -monitor ndmp.pax.session.ALL-ELEMENTS.dataOperator.ALL-ELEMENTS.backupRootDir,ndmp.pax.session.ALL-ELEMENTS.dataOperator.ALL-ELEMENTS.bytes -interval 5 -te no -format csv -file /nas/quota/slot_2/ndmp_logs/pax/pax.1101131404
My script and the forked Java process both have the same name. Worse yet, if I had multiple server_stats instances running (which is highly possible), pkill would kill multiple instances in one go.
dynamox
January 13th, 2011 14:00
No problem. By the way, if you want to filter out your grep command, you can do this:
ps -ef | grep '[s]erver_stats'
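The usual form of this trick puts the first character of the search term in brackets. The pattern [s]erver_stats still matches the text "server_stats", but grep's own command line now contains the literal "[s]erver_stats", which the pattern does not match, so grep drops out of its own results. A small sketch, where self_excluding_pattern is a made-up helper name:

```shell
# self_excluding_pattern NAME (hypothetical helper)
# Print NAME with its first character wrapped in brackets,
# e.g. server_stats becomes [s]erver_stats.
self_excluding_pattern() {
    first=$(printf '%s' "$1" | cut -c1)
    rest=$(printf '%s' "$1" | cut -c2-)
    printf '[%s]%s\n' "$first" "$rest"
}
```

Usage would look like: ps -ef | grep "$(self_excluding_pattern server_stats)"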
umichklewis_ac7b91
January 13th, 2011 14:00
Looking for the unique string got me pointed in the right direction! I ended up using 'pgrep' to find the PID of the spawned Java instance and 'pkill' to kill said PID. Worked like a charm.
Thanks for the help!
Karl
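The exact commands aren't shown in the thread, so the following is only a sketch of what that approach could look like. It assumes the unique string is something that appears only on one collection's command line (the -file output path is a natural candidate), and stop_collector is a made-up name:

```shell
# stop_collector PATTERN (hypothetical helper)
# Kill every process whose full command line matches PATTERN (an extended regex).
stop_collector() {
    pattern=$1
    # -f matches against the whole command line, not just the process name
    pids=$(pgrep -f "$pattern") || return 1   # nothing matched
    # $pids is left unquoted on purpose so each PID becomes its own argument
    kill $pids
}
```

For the collection started earlier in the thread, something like stop_collector 'java.*pax/pax\.1101131304' should hit only that one Java instance and leave any other server_stats collections running.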