Hi, Dwight.
I'm sorry for having kept you waiting.
I realized that we handle child processes poorly, not only in v6eval but also in our other tools.
I also understand that this issue is critical,
but fixing it will be costly because the affected code is a core part of our tools.
In any case, I'll investigate this.
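
As a starting point for that investigation, here is a minimal sketch of a cleanup
that relies only on Perl's built-in kill and treats an already-exited process as
success, instead of falling back to system("kill"). The helper name
signal_children is hypothetical and is not part of V6evalTool.pm. The sketch
assumes that ESRCH from kill simply means the process is already gone, and it
has not been tried against the real tools.

```perl
use strict;
use warnings;
use Errno qw(ESRCH);

# Hypothetical helper (not part of V6evalTool.pm).
# Sends $signal to every pid with Perl's built-in kill and returns the list
# of pids that could not be signalled for a reason other than "no such process".
sub signal_children {
    my ($signal, @pids) = @_;
    my @failed;
    foreach my $pid (@pids) {
        next if kill($signal, $pid);   # signal was delivered
        next if $! == ESRCH;           # process has already exited: not an error
        push @failed, $pid;            # anything else (for example EPERM) is reported
    }
    return @failed;
}

# Possible use in the END procedure (assumption: prOut, @PktbufPids and
# @TcpdumpPids are the existing names shown in the quoted code below):
#   prOut("Error in killing pktbuf pid=$_")  for signal_children('TERM', @PktbufPids);
#   sleep 1;
#   prOut("Error in killing tcpdump pid=$_") for signal_children('INT', @TcpdumpPids);
```

With this shape, a tcpdump that has already died because its pipe was closed is
skipped quietly, and no external kill command is involved.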
Thanks,
On Mon, 02 Feb 2009 13:46:39 -0700
Dwight Tovey <dwight.tovey@hp.com> wrote:
> I've run into a rather odd problem.
>
> Since we run most of our other tests from Linux, sometime back I
> installed the Linux patches for Tahi and I've been running the tests on
> a Linux system ever since then. For the most part everything was
> working, but every so often I would have a test fail with a "no test
> sequence" error. On rerunning the failed test though everything was
> fine, so chasing down that issue got put on my "TODO" list with a low
> priority.
>
> Recently though I installed a new, faster TN system, and suddenly
> instead of failing just occasionally, almost every test was failing with
> that error, so finding the cause became a higher priority.
>
> It turns out that the tests are passing, but in the END procedure in
> V6evalTool.pm, when it was attempting to clean up pktbuf and tcpdump,
> the 'kill' system command was failing. The module is sending
> termination signals to the processes with this code:
>     foreach(@PktbufPids){
>         prTrace("Exiting... sending SIGTERM to $_");
>         kill('TERM',$_) || system("kill -n 15 $_") ||
>             prOut("Error in killing pktbuf pid=$_");
>     }
>     sleep 1;
>     foreach(@TcpdumpPids){
>         prTrace("Exiting... sending SIGTERM to $_");
>         kill('INT',$_) || system("kill -n 15 $_") ||
>             prOut("Error in killing tcpdump pid=$_");
>     }
>
>
> On my old slower system, the script was usually able to just use the
> perl 'kill' function for both processes. On the new system though, when
> the first kill signal was sent to the pktbuf process, it was dead and
> gone (and took tcpdump with it since the pipe was closed) before the
> kill for tcpdump was called. When that function failed, the code would
> use the system("kill") call. Since 'kill -n' is invalid syntax for
> Linux, the script would abort which caused autorun to see a failure
> which it interpreted as 'no test sequence'.
>
> My question is, why the call to system("kill")? The perl 'kill'
> function will send the signal, so why try to call the external command?
> I don't see a problem with sending the signal to tcpdump even though it
> is already gone (the pipe close might not have killed it) but I don't
> think the 'system()' call is necessary. Or am I missing something?
>
> /dwight
>
> --
> Dwight Tovey
> IO Test Engineer
> email: dwight.tovey@hp.com
> (208) 396-4645
>
>
>
--
Yukiyo Akisada <akisada@tahi.org>