* [B.A.T.M.A.N.] Possible bad interaction between BLA2 and TT?
From: Gui Iribarren @ 2013-06-23 21:51 UTC (permalink / raw)
  To: The list for a Better Approach To Mobile Ad-hoc Networking

Hey everyone!
fiiinally got back home, a week ago, and got time to debug a strange 
issue here. The report i had from a few users was "intermittent 
connectivity", with "waves" of traffic, with random pauses lasting from 
a few seconds to a minute or so.

I initially dismissed it as interference, or even OS problems, but it turns 
out they were right! And sadly, batman seems to be in the way.

From what i've seen, watching "batctl tg -w" on every node along the 
way, i could determine the window of time where the traffic gets lost:
from the moment when there's a TT change on one side of the network,
to the moment that change is propagated to the other side.

On ordex's advice, i ran "batctl ll tt ; batctl l" on the nodes along the way 
and i'm sending the pastebin results at the end of this mail.

Some (hopefully) useful context follows, and a batctl vd graph is attached

The IPv6 of tdorado is pinged (to rule out DAT interactions) from 
labanda-este (works fine always) and from labanda-oeste (suffers the 
issue, as well as all nodes "behind" it, i.e. casapuente & alfredo).
Both labandas are TL-WDR3500s connected by 2.4 GHz, 5 GHz, and an ethernet 
cable. The ethernet carries only batadv packets (eth0.1 is added to 
bat0); there's no "lan backbone" (the eth0.2 that appears under br-lan 
is not connected to anything)

root@tdorado:~# opkg list kmod-batman-adv # same in all nodes
kmod-batman-adv - 3.8.3+2013.2.0-2
root@tdorado:~# ip a s br-lan
6: br-lan: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
     link/ether 64:70:02:3d:a0:f7 brd ff:ff:ff:ff:ff:ff
     inet6 2a00:1508:1:f004:6670:2ff:fe3d:a0f7/64 scope global dynamic
        valid_lft 6985sec preferred_lft 6985sec
root@tdorado:~# batctl tl -n |grep f7
  * 64:70:02:3d:a0:f7 [.....]   1.140
root@tdorado:~# batctl o |head -n 1
[B.A.T.M.A.N. adv 2013.2.0, MainIF/MAC: wlan0-1/66:70:02:3d:a0:f9 (bat0)]

labanda-este
http://pastebin.com/R1kHQCQG

labanda-oeste
http://pastebin.com/b1Uc23VZ

Both ping6s were started at the same time, so the seq numbers are 
synchronized, and can be used as timestamps.
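Since the seq numbers double as timestamps, the gaps can be pulled out of a saved ping6 log mechanically. Below is a minimal sketch, not something that was run on these nodes: the find_gaps helper and the sample lines are hypothetical, and the regex assumes each reply line carries either "seq=N" (BusyBox ping) or "icmp_seq=N" (iputils ping).

```python
import re

# Matches both "seq=73" (BusyBox ping) and "icmp_seq=73" (iputils ping).
SEQ_RE = re.compile(r"(?:icmp_)?seq=(\d+)")


def find_gaps(ping_output):
    """Return (first_missing, last_missing) ranges of unanswered seq numbers."""
    seqs = sorted({int(m.group(1)) for m in SEQ_RE.finditer(ping_output)})
    gaps = []
    for prev, cur in zip(seqs, seqs[1:]):
        if cur - prev > 1:
            gaps.append((prev + 1, cur - 1))
    return gaps


# Illustrative sample only (not the real pastebin content): replies stop
# after seq=73 and resume at seq=89.
sample = """\
64 bytes from 2a00:1508:1:f004:6670:2ff:fe3d:a0f7: seq=72 ttl=64 time=3.1 ms
64 bytes from 2a00:1508:1:f004:6670:2ff:fe3d:a0f7: seq=73 ttl=64 time=2.9 ms
64 bytes from 2a00:1508:1:f004:6670:2ff:fe3d:a0f7: seq=89 ttl=64 time=3.4 ms
64 bytes from 2a00:1508:1:f004:6670:2ff:fe3d:a0f7: seq=90 ttl=64 time=3.0 ms
"""
print(find_gaps(sample))  # [(74, 88)]
```

With ping6 sending one probe per second (the default), each missing seq number corresponds to roughly one second of lost traffic.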

The "gap" in labanda-oeste is between seq=73 and seq=89.
In labanda-oeste there were no messages or traffic for 25 secs, and then 
the "TT inconsistency" came up, was resolved, seq=89 succeeded, and traffic 
was restored.
At the start of that gap, seq=74, labanda-este got a TT update:
[  23161800] Deleting tdorado from global tt entry 44:d8:84:b0:d2:f5: tt removed by changes
and (AFAIU) dropped traffic coming from labanda-oeste until 
labanda-oeste finally got the TT update and increased the ttvn to 129
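To line up such TT events across nodes, the "Deleting ... from global tt entry" lines can be extracted from the saved "batctl l" output. This is only a sketch: parse_tt_delete is a hypothetical helper, the pattern is modelled on the single log line quoted above, and reading the bracketed number as a millisecond uptime counter is an assumption.

```python
import re

# Pattern modelled on the one log line quoted in this mail; other batman-adv
# versions may word the message differently.
LOG_RE = re.compile(
    r"\[\s*(?P<ts>\d+)\]\s+Deleting (?P<orig>\S+) from global tt entry "
    r"(?P<mac>(?:[0-9a-f]{2}:){5}[0-9a-f]{2}): (?P<reason>.*)",
    re.IGNORECASE,
)


def parse_tt_delete(line):
    """Parse one TT deletion log line; return a dict, or None if no match."""
    # Collapse whitespace so a line wrapped by the mail client still matches.
    flat = " ".join(line.split())
    m = LOG_RE.search(flat)
    if m is None:
        return None
    return {
        "timestamp": int(m.group("ts")),  # assumed to be milliseconds
        "originator": m.group("orig"),
        "client_mac": m.group("mac").lower(),
        "reason": m.group("reason"),
    }


line = ("[  23161800] Deleting tdorado from global tt entry "
        "44:d8:84:b0:d2:f5: tt \nremoved by changes")
event = parse_tt_delete(line)
print(event["timestamp"], event["originator"], event["client_mac"])
```

Subtracting the timestamps of matching events on two nodes would give a rough estimate of how long a TT change takes to propagate, provided the counters are compared against a common reference such as the synchronized ping6 seq numbers.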

Does any of this make sense? I imagine a tcpdump would help, so i'll try 
getting one, but maybe this debug output is enough to give some insight?

As you can imagine, any further pointer will be greatly appreciated,

I hope you're having a great week, ...and that i'm not ruining it as 
always :D

Gui

[-- Attachment #2: dl.svg --]
[-- Type: image/svg+xml, Size: 15801 bytes --]
