public inbox for b.a.t.m.a.n@lists.open-mesh.org
* [B.A.T.M.A.N.] can batctl ping but not ping in 2014.1.0
@ 2014-04-08 17:57 Gui Iribarren
  2014-04-08 18:34 ` Gui Iribarren
From: Gui Iribarren @ 2014-04-08 17:57 UTC
  To: The list for a Better Approach To Mobile Ad-hoc Networking

Hello again friendly devs,
here we are, after a long "running stable" hiatus, back into the 
bleeding edge for a ride \o/

running a small cloud of recent openwrt trunk (r40361)
(OT: kmod-ath9k is running surprisingly smooth! yay!!)
with kmod-batman-adv - 3.10.34+2014.1.0-2

and, well... i have bat news :P
1) yesterday i saw something vaguely reminiscent of the old OGM starving 
issue: in a line of 4 guinea-pig nodes that flow through a river of 
DeltaLibre, the 4th node would get TQ=1 for the 1st node, and would not 
even ping it (i can't remember the result of batctl ping, maybe it did), 
even though the links were really solid (TQ>220 on every one-hop-link of 
the chain)
(the 3rd node was seeing the 1st with TQ>200, and could batctl ping / 
ping perfectly)
at that point i found out kmod-batman-adv was inadvertently compiled 
without log support :( so that's as much as i can report for now, i'll 
recompile with that enabled and follow up.

2) this morning, in a 2-node cloud testbed at home, uptime=22hs, The 
Bizarre Behaviour showed up and is sharing breakfast with me.

    on one side, lying calmly on the floor...

root@rockm5:~# batctl o
[B.A.T.M.A.N. adv 2014.1.0, MainIF/MAC: wlan0_adhoc.11/dc:9f:db:9c:37:54 (bat0 BATMAN_IV)]
   Originator      last-seen (#/255)           Nexthop [outgoingIF]: Potential nexthops ...
64:70:02:ed:f8:ea    0.770s   (255) 64:70:02:ed:f8:ea [wlan0_adhoc.11]: 64:70:02:ed:f8:ea (255)
02:00:49:ed:f8:e8    0.320s   (255) 64:70:02:ed:f8:ea [wlan0_adhoc.11]: 64:70:02:ed:f8:ea (255)
root@rockm5:~# batctl if
wlan0_adhoc.11: active

     2 meters away, a TL-WDR3600 lurks...

root@planit:~# batctl o
[B.A.T.M.A.N. adv 2014.1.0, MainIF/MAC: eth0.1.11/02:00:49:ed:f8:e8 (bat0 BATMAN_IV)]
   Originator      last-seen (#/255)           Nexthop [outgoingIF]: Potential nexthops ...
dc:9f:db:9c:37:54    0.360s   (255) dc:9f:db:9c:37:54 [wlan1_adhoc.11]: dc:9f:db:9c:37:54 (255)
root@planit:~# batctl if
eth0.1.11: active
wlan1_adhoc.11: active
wlan0_adhoc.11: active

### rockm5 global ip over br-lan: gave its last breath
root@planit:~# ip -6 r get 2a00:1508:1:f804::9d:3754/64
2a00:1508:1:f804::9d:3754 from :: dev br-lan  src 2a00:1508:1:f804::ed:f8e8  metric 0
root@planit:~# ping 2a00:1508:1:f804::9d:3754
PING 2a00:1508:1:f804::9d:3754 (2a00:1508:1:f804::9d:3754): 56 data bytes
--- 2a00:1508:1:f804::9d:3754 ping statistics ---
4 packets transmitted, 0 packets received, 100% packet loss

### rockm5 link-local over br-lan: feeding the daisies
root@planit:~# ping6 fe80::de9f:dbff:fe9d:3754%br-lan
PING fe80::de9f:dbff:fe9d:3754%br-lan(fe80::de9f:dbff:fe9d:3754) 56 data bytes
--- fe80::de9f:dbff:fe9d:3754%br-lan ping statistics ---
3 packets transmitted, 0 received, 100% packet loss, time 2001ms

### lower level link-local works fine (avoiding batman-adv)
root@planit:~# ping6 fe80::de9f:dbff:fe9c:3754%wlan1_adhoc.11
PING fe80::de9f:dbff:fe9c:3754%wlan1_adhoc.11(fe80::de9f:dbff:fe9c:3754) 56 data bytes
64 bytes from fe80::de9f:dbff:fe9c:3754: icmp_seq=1 ttl=64 time=2.70 ms
64 bytes from fe80::de9f:dbff:fe9c:3754: icmp_seq=2 ttl=64 time=1.39 ms
--- fe80::de9f:dbff:fe9c:3754%wlan1_adhoc.11 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
rtt min/avg/max/mdev = 1.398/2.053/2.708/0.655 ms
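
(side note on the two link-local addresses pinged above: both look like the usual modified EUI-64 form, i.e. the interface MAC with the universal/local bit flipped and ff:fe inserted in the middle. below is a minimal python sketch of that derivation, assuming plain EUI-64 addressing on both interfaces; the helper name mac_to_link_local is made up for illustration, and the br-lan MAC is the one listed in bat-hosts in the follow-up message.)

```python
import ipaddress


def mac_to_link_local(mac: str) -> str:
    """Derive the modified EUI-64 IPv6 link-local address for a MAC (hypothetical helper)."""
    octets = [int(part, 16) for part in mac.split(":")]
    if len(octets) != 6:
        raise ValueError(f"expected 6 octets, got {mac!r}")
    octets[0] ^= 0x02  # flip the universal/local bit of the first octet
    eui64 = octets[:3] + [0xFF, 0xFE] + octets[3:]  # insert ff:fe in the middle
    # fe80::/64 prefix (fe80 followed by 6 zero bytes) plus the 8-byte interface id
    addr = bytes([0xFE, 0x80] + [0] * 6 + eui64)
    return str(ipaddress.IPv6Address(addr))


print(mac_to_link_local("dc:9f:db:9c:37:54"))  # rockm5 wlan0_adhoc
print(mac_to_link_local("dc:9f:db:9d:37:54"))  # rockm5 br-lan
```

the two printed addresses should match the ones used in the ping6 commands above, which is how you can tell which ping goes through the bridge on top of bat0 and which one stays on the raw adhoc interface.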

### batctl ping to rockm5 enjoys excellent health
root@planit:~# batctl ping dc:9f:db:9c:37:54
PING dc:9f:db:9c:37:54 (dc:9f:db:9c:37:54) 20(48) bytes of data
20 bytes from dc:9f:db:9c:37:54 icmp_seq=1 ttl=50 time=1.16 ms
20 bytes from dc:9f:db:9c:37:54 icmp_seq=2 ttl=50 time=0.90 ms
20 bytes from dc:9f:db:9c:37:54 icmp_seq=3 ttl=50 time=0.90 ms
^C--- dc:9f:db:9c:37:54 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss
rtt min/avg/max/mdev = 0.902/0.989/1.162/0.122 ms


well, as said before, i have no "batctl l" output to show, but will 
collect and write chapter two.
With a bit of luck, what i described so far rings a bell for someone 
who can give an early insight
(maybe it's due to the way we are using vlans?)
(maybe it's because routing_algo = BATMAN_IV?)
(maybe the rewritten code is designed to work this way? yay!)
(maybe it's our ugly hacky ebtables droppings / anygw magic that are 
interacting badly in some way? can describe them in detail next time)

i must say tho, that this was running fine yesterday, and it broke 
spontaneously without any manual intervention or config change.

oh, BLA2 and DAT are disabled on all nodes.

thanks as always,
and hope a giggle cheers up your day :)

gui


* Re: [B.A.T.M.A.N.] can batctl ping but not ping in 2014.1.0
  2014-04-08 17:57 [B.A.T.M.A.N.] can batctl ping but not ping in 2014.1.0 Gui Iribarren
@ 2014-04-08 18:34 ` Gui Iribarren
From: Gui Iribarren @ 2014-04-08 18:34 UTC
  To: b.a.t.m.a.n

On 04/08/2014 02:57 PM, Gui Iribarren wrote:
> [...]
> (maybe it's due to the way we are using vlans?)

Mh... speaking of which, maybe there's something TT-fishy about vlans?

root@rockm5:~# batctl tl
Locally retrieved addresses (from bat0) announced via TT (TTVN: 2):
       Client         VID Flags    Last seen (CRC       )
 *     rockm5_br-lan   -1 [......]   3.220   (0xbfe4b7db)
 *       rockm5_bat0   -1 [.P....]   0.000   (0xbfe4b7db)
 *       rockm5_bat0    0 [.P....]   0.000   (0x453da959)
root@rockm5:~# batctl tg
Globally announced TT entries received via the mesh bat0
       Client         VID  (TTVN)       Originator      (Curr TTVN) (CRC       ) Flags
 *       planit_bat0   -1   (  2) via  planit_eth0.1.11     (  2)   (0x8f4039e4) [....]
 *       planit_bat0    0   (  2) via  planit_eth0.1.11     (  2)   (0x29283c0f) [....]


root@planit:~# batctl tl
Locally retrieved addresses (from bat0) announced via TT (TTVN: 2):
       Client         VID Flags    Last seen (CRC       )
 *       planit_bat0   -1 [.P....]   0.000   (0x8f4039e4)
 *       planit_bat0    0 [.P....]   0.000   (0x29283c0f)
root@planit:~# batctl tg
Globally announced TT entries received via the mesh bat0
       Client         VID  (TTVN)       Originator      (Curr TTVN) (CRC       ) Flags
 *     rockm5_br-lan   -1   (  2) via rockm5_wlan0_adhoc     (  2)   (0xbfe4b7db) [....]
 *       rockm5_bat0   -1   (  1) via rockm5_wlan0_adhoc     (  2)   (0xbfe4b7db) [....]
 *       rockm5_bat0    0   (  2) via rockm5_wlan0_adhoc     (  2)   (0x453da959) [....]

i understand vid -1 means "no tag"... but what's vid=0 then?

relevant bat-hosts
dc:9f:db:9d:37:54 rockm5_br-lan
96:65:b0:4c:6b:44 rockm5_bat0
dc:9f:db:9c:37:54 rockm5_wlan0_adhoc
92:5c:d9:b1:8f:df planit_bat0
02:00:49:ed:f8:e8 planit_eth0.1.11
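
one detail that is easy to miss in the planit "batctl tg" table above: the rockm5_bat0 / VID -1 row carries TTVN 1 while the originator's current TTVN is 2. that may well be harmless, but it is the kind of thing worth spotting mechanically when the tables get longer. a minimal python sketch for that, assuming the column layout stays exactly as pasted here (other batctl versions may print it differently); the regex and the function name ttvn_mismatches are made up for illustration:

```python
import re

# Matches one "batctl tg" row as pasted in this thread (layout is an assumption):
#  * <client> <vid> ( <ttvn>) via <originator> ( <curr ttvn>) ...
TG_ROW = re.compile(
    r"^\s*\*\s+(?P<client>\S+)\s+(?P<vid>-?\d+)\s+"
    r"\(\s*(?P<ttvn>\d+)\)\s+via\s+(?P<orig>\S+)\s+"
    r"\(\s*(?P<curr>\d+)\)"
)


def ttvn_mismatches(tg_output: str):
    """Return (client, vid, ttvn, curr_ttvn) for rows whose TTVN differs from the originator's."""
    rows = []
    for line in tg_output.splitlines():
        m = TG_ROW.match(line)
        if m and int(m["ttvn"]) != int(m["curr"]):
            rows.append((m["client"], int(m["vid"]), int(m["ttvn"]), int(m["curr"])))
    return rows


SAMPLE = """\
 *     rockm5_br-lan   -1   (  2) via rockm5_wlan0_adhoc     (  2)   (0xbfe4b7db) [....]
 *       rockm5_bat0   -1   (  1) via rockm5_wlan0_adhoc     (  2)   (0xbfe4b7db) [....]
 *       rockm5_bat0    0   (  2) via rockm5_wlan0_adhoc     (  2)   (0x453da959) [....]
"""

print(ttvn_mismatches(SAMPLE))
```

header lines and anything else that does not look like a table row are simply skipped, so the whole "batctl tg" output can be fed in as-is.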


