Hi again,
On Tue, Apr 15, 2008 at 10:29:16PM +0200, Karsten Keil wrote:
...
> > I run in some problems with the ct-2.1.1/default-addr-select tests.
> > Randomly some tests fail and the log says that the ICMP6 echo was
> > not received, but the packet log itself shows that the packet was
> > received and also in time.
> >
...
> >From the log:
> NUT <------------------- DefaultRouter : ICMPv6 Echo Request
> 02:47:37 vRecv(Link0,ns_NUT2DEFAULT_ROUTER_SLLA ns_NUT2DEFAULT_ROUTER_noOPT ns_NUT2DESTINATION_SLLA ns_NUT2DESTINATION_noOPT echo_reply_SOURCE12DESTINATION) timeout:7
> cntLimit:0 seektime:0 cmd:/usr/local/v6eval//bin/pktrecv -t /usr/local/v6eval//etc//tn.def -n /usr/local/v6eval//etc//nut.def -p /var/tmp/tmp.0.vq6mYQ -l1 -i Link0 -e 7
> ns_NUT2DEFAULT_ROUTER_SLLA ns_NUT2DEFAULT_ROUTER_noOPT ns_NUT2DESTINATION_SLLA ns_NUT2DESTINATION_noOPT echo_reply_SOURCE12DESTINATION
> NUT(known Source) -----> Destination(offLink) : NS with SLLA
> 02:47:37 vSend(Link0,na_DESTINATION_R_2NUT_TLLA)
> NUT <------------------- Destination(offLink) : NA with TLLA & no R flag
> 02:47:37 vRecv(Link0,ns_NUT2DEFAULT_ROUTER_SLLA ns_NUT2DEFAULT_ROUTER_noOPT ns_NUT2DESTINATION_SLLA ns_NUT2DESTINATION_noOPT echo_reply_SOURCE12DESTINATION) timeout:7
> cntLimit:0 seektime:1208134057.659195 cmd:/usr/local/v6eval//bin/pktrecv -t /usr/local/v6eval//etc//tn.def -n /usr/local/v6eval//etc//nut.def -p /var/tmp/tmp.0.vq6mYQ
> -l1 -i Link0 -e 7 -s 1208134057.659195 ns_NUT2DEFAULT_ROUTER_SLLA ns_NUT2DEFAULT_ROUTER_noOPT ns_NUT2DESTINATION_SLLA ns_NUT2DESTINATION_noOPT
> echo_reply_SOURCE12DESTINATION
> vRecv() return status=1
> --- Can not receive Echo Reply or NS from NUT ---
>
> the vRECv is restarted with a seektime of 1208134057.659195 (which computes to
> 02:47:37.659195)
>
> If you look at the tcpdump log:
> 02:47:37.280154 IP6 fec0::3 > fec0::1: ICMP6, echo request, seq 0, length 16
> 02:47:37.281859 IP6 fec0::1 > ff02::1:ff00:3: ICMP6, neighbor solicitation, who has fec0::3, length 32
> 02:47:37.658973 IP6 fec0::3 > fec0::1: ICMP6, neighbor advertisement, tgt is fec0::3, length 32
> 02:47:37.659137 IP6 fec0::1 > fec0::3: ICMP6, echo reply, seq 0, length 16
>
> The valid echo reply is received at 02:47:37.659137, which is before the seektime, so it is ignored.
> The question is, what did set the seektime to this wrong value ?
> Seems to me to be a real bug in v6eval code.
>
OK I found an explanation and a workaround:
The reason is, that the the ADDR_SELECT.pm use the send timestamp from vSend
as seektime for the next vRecv call. The problem is, that vSend (or better the
pkksend program) does not take the real send timestamp, it calls gettimeofday
at the end of all things it is doing, so the program maybe get rescheduled
(several times) between sending the packet and taking the actual time.
So the send timestamp is allways late to the real packet send (you can see
that if you compare send timestamps with the valid tcpdump timestamps).
In some cases the send timestamp is behind of the next received packet, then
the test fails. Since the scheduling times and so the time gap depends on
many things, the time gap is random, this explains why we see this only
sometimes and that on some hardware it never shows up.
Since I never saw such error on newer TAHI tests ( e.g. Selftest-1.4.13) I
looked at similar tests and found, that all newer tests doesn't use the
seektime parameter in any vRecv calls, it use 0 in all cases.
So I changed ADDR_SELECT.pm to use 0 as well and now I get the expected
results.
Can please somebody from TAHI confirm, that this analysis and the workaround
is OK ?
--
Karsten Keil
SuSE Labs
ISDN and VOIP development
SUSE LINUX Products GmbH, Maxfeldstr.5 90409 Nuernberg, GF: Markus Rex, HRB 16746 (AG Nuernberg)