Date: Mon, 02 Feb 2009 13:46:39 -0700
From: Dwight Tovey <dwight.tovey@hp.com>
Subject: [users:01089] New TN too fast
To: "users@tahi.org" <users@tahi.org>
Message-Id: <1233607599.4295.45.camel@dwight-laptop.dtovey.local>
X-Mail-Count: 01089

I've run into a rather odd problem.

Since we run most of our other tests from Linux, some time back I
installed the Linux patches for Tahi and I've been running the tests on
a Linux system ever since then.  For the most part everything was
working, but every so often I would have a test fail with a "no test
sequence" error.  On rerunning the failed test though everything was
fine, so chasing down that issue got put on my "TODO" list with a low
priority.

Recently though I installed a new, faster TN system, and suddenly
instead of failing just occasionally, almost every test was failing with
that error, so finding the cause became a higher priority.

It turns out that the tests are passing, but in the END procedure in
V6evalTool.pm, when it was attempting to clean up pktbuf and tcpdump,
the 'kill' system command was failing.  The module sends termination
signals to the processes with this code:
        foreach(@PktbufPids){
                prTrace("Exiting... sending SIGTERM to $_");
                kill('TERM',$_) || system("kill -n 15 $_") ||
                    prOut("Error in killing pktbuf pid=$_");
        }
        sleep 1;
        foreach(@TcpdumpPids){
                prTrace("Exiting... sending SIGTERM to $_");
                kill('INT',$_) || system("kill -n 15 $_") ||
                    prOut("Error in killing tcpdump pid=$_");
        }


On my old, slower system, the script was usually able to use the perl
'kill' function for both processes.  On the new system though, once the
first signal was sent to the pktbuf process, it was dead and gone (and
took tcpdump with it since the pipe was closed) before the kill for
tcpdump was called.  When the perl 'kill' failed for tcpdump, the code
fell back to the system("kill") call.  Since 'kill -n' is invalid syntax
on Linux, the script would abort, which caused autorun to see a failure
that it interpreted as 'no test sequence'.
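
A minimal sketch of that failure mode, not taken from V6evalTool.pm: it
assumes the target pid has already exited and been reaped, which is how
tcpdump appeared on the faster TN.  In that case the perl 'kill' returns
0 and sets $! to ESRCH, so the left side of the '||' chain is false and
the fallback runs.  The variable names here are only for illustration.

```perl
use strict;
use warnings;
use POSIX ();

# Fork a child that exits at once, then reap it so the pid no longer exists.
my $pid = fork();
die "fork failed: $!" unless defined $pid;
if ($pid == 0) {
    POSIX::_exit(0);    # child: leave immediately, skipping END blocks
}
waitpid($pid, 0);

# kill returns the number of processes signalled; 0 means nothing was sent.
my $sent = kill('INT', $pid);

# ESRCH ("no such process") distinguishes "already gone" from other errors.
my $gone = ($sent == 0 && $!{ESRCH}) ? 1 : 0;

print "signalled=$sent gone=$gone\n";
```

A return of 0 with ESRCH means there was nothing left to signal, which
is different from a signal that could not be delivered.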

My question is, why the call to system("kill")?  The perl 'kill'
function will send the signal, so why try to call the external command?
I don't see a problem with sending the signal to tcpdump even though it
is already gone (the pipe close might not have killed it) but I don't
think the 'system()' call is necessary.  Or am I missing something?
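
One way the cleanup could look without the external command, as a
sketch only: signal_pid is a hypothetical helper, not part of Tahi, and
it assumes that a process which has already exited should be treated as
success rather than as an error.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper: send a signal using only perl's built-in kill.
# Returns 'sent', 'gone' (process already exited), or 'error'.
sub signal_pid {
    my ($sig, $pid) = @_;
    return 'sent' if kill($sig, $pid);
    return 'gone' if $!{ESRCH};
    return 'error';
}

# Demonstration with a throwaway child instead of pktbuf or tcpdump.
my $pid = fork();
die "fork failed: $!" unless defined $pid;
if ($pid == 0) {
    sleep 60;           # child: wait until it is signalled
    POSIX::_exit(0);
}

my $first = signal_pid('TERM', $pid);    # child is still alive
waitpid($pid, 0);                        # reap it
my $second = signal_pid('TERM', $pid);   # pid no longer exists

print "$first $second\n";
```

With a helper like this, only the 'error' case would need to be
reported through prOut, and no external 'kill' is involved.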

	/dwight

-- 
Dwight Tovey
IO Test Engineer
email: dwight.tovey@hp.com
(208) 396-4645