From: Philipp Psurek <philipp.psurek@gmail.com>
To: b.a.t.m.a.n@lists.open-mesh.org
Subject: Re: [B.A.T.M.A.N.] kernel BUG at net/core/skbuff.c:100
Date: Thu, 20 Nov 2014 13:22:29 +0100
Message-ID: <1416486149.2747.9.camel@gmail.com>
In-Reply-To: <546DC214.6050908@hundeboll.net>
Hi Martin,
/usr/src/linux/net/batman-adv/fragmentation.c is patched. I'm sorry I
overlooked your attachment. The new module is running; its size differs:
# lsmod
[ … ]
batman_adv 147774 0 # old
batman_adv 148030 0 # new
[ … ]
Batman-adv runs with
# batctl if
fastd0: active
# batctl it
5000
# batctl ap
disabled
# batctl bl
enabled
# batctl dat
enabled
# batctl ag
enabled
# batctl b
disabled
# batctl f
enabled
# batctl nc
enabled
# batctl mark
0x00000000/0x00000000
# batctl mm
enabled
# batctl ll
Error - can't open file '/sys/class/net/bat0/mesh/log_level': No such
file or directory [ … ]
# batctl gw
server (announced bw: 100.0/100.0 MBit)
These were also the settings at the time of the kernel panic.
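A minimal sketch of how the settings above could be collected in one go for a bug report. The function name dump_batctl_settings and the BATCTL override variable are only illustrative; the subcommand abbreviations are the ones used above.

```shell
#!/bin/sh
# Sketch: print every batman-adv setting with a label, one per line.
# BATCTL may be overridden, e.g. to point at a different binary.
BATCTL="${BATCTL:-batctl}"

dump_batctl_settings() {
    for opt in if it ap bl dat ag b f nc mark mm gw; do
        # Do not abort if one subcommand fails (like "ll" above);
        # record the error text instead.
        value=$("$BATCTL" "$opt" 2>&1) || value="(unavailable: $value)"
        printf '%-5s %s\n' "$opt" "$value"
    done
}
```

Calling dump_batctl_settings as root and redirecting the output to a file gives a snapshot that can be compared before and after a crash.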
On Thursday, 20.11.2014, at 11:27 +0100, Martin Hundebøll wrote:
> On 2014-11-20 10:48, Philipp Psurek wrote:
[ … ]
> Yeah, most people compile out network coding. Has the bug disappeared
> after disabling NC ?
I can't tell for sure. NC has been disabled for 20 hours. The bug has
appeared after anywhere from 1 minute to 72 hours; it depends on our
users. To reproduce the bug, NC is enabled again.
> > On Thursday, 20.11.2014, at 09:32 +0100, Martin Hundebøll wrote:
> >> Thanks for your report. The bug is probably triggered by some bogus data
> >> in an incoming packet. I have created a small debug patch that will
> >> detect if this is the case, and print some debug info if so.
> >
> > Thank you for your work. I didn't find your patch on
> > http://git.open-mesh.org/batman-adv.git
>
> It was attached to my previous mail :)
I'm so sorry ;-) my fault
> > I cannot analyse the packets because the gateway is part of an ISP
> > infrastructure and data privacy applies. But if your patch is able to
> > capture only the bogus data packet during the kernel panic, there
> > shouldn't be any problems, I think.
>
> My debug patch should only print the header of the packet causing the
> panic, so no problems with privacy here. (But you should probably check
> the output before mailing it to a public list...)
OK, thanks for that
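A rough sketch of one way to check and sanitise such output before posting it. It assumes the debug lines contain MAC addresses in the usual colon-separated hex notation, which I have not verified for this patch; the function name redact_macs is made up.

```shell
#!/bin/sh
# Sketch: mask the device-specific half of every MAC address on stdin.
# The first three octets (vendor part) are kept, the rest is replaced.
redact_macs() {
    sed -E 's/([0-9A-Fa-f]{2}(:[0-9A-Fa-f]{2}){2})(:[0-9A-Fa-f]{2}){3}/\1:xx:xx:xx/g'
}

# Possible use: dmesg | redact_macs > report.txt
```

Timestamps such as 12:22:29 are left alone because the pattern only matches six colon-separated octets.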
[ … ]
> I am running with NC on my machines in the lab and haven't seen this
> frag-issue before. I have seen a similar issue (wrong size value in the
> header) in another context though, but this wasn't due to either network
> coding or fragmentation.
Well, the lab is peaceful, but out in the wild there are evil data
packets.
> Would you mind sending me your fastd config (without the key), so that I
> can try to reproduce this in my VMs?
Not at all. Here is the censored /etc/fastd/fastd.conf
#---8<---8<---8<---8<---8<---8<----
bind <my_publicIP>:<my_fastdPORT>;
include "secret.conf";
include peers from "peers/wupper";
include peers from "testpeers/wupper";
include peers from "servers/wupper";
interface "fastd0";
log level warn;
method "salsa2012+gmac";
#### doesn't have anything to do with the bug, also seen with fastd v14
#### not used yet but with the new firmware:
method "salsa2012+umac";
mtu 1426;
on up "
ip link set address <MAC_ADDRESS> dev $INTERFACE
ip link set up dev $INTERFACE
modprobe batman-adv
batctl if add fastd0
batctl it 5000
batctl bl enable
batctl gw client
### gw will be changed later to server 100000/100000
ip link set up dev bat0
ip addr add 10.3.<IP>/16 broadcast 10.3.255.255 dev bat0
ip addr add 10.3.<anotherIP>/16 broadcast 10.3.255.255 dev bat0
ip addr add fda0:747e:ab29:e1ba:<IPv6_IP>/64 dev bat0
ip route add 10.3.0.0/16 dev bat0 proto kernel scope link src 10.3.<wrong_IP*)>
alfred -i bat0 -m > /dev/null 2>&1 &
batadv-vis -i bat0 -s > /dev/null 2>&1 &
";
#---8<---8<---8<---8<---8<---8<----EOF
*) Now I see that this is a different IP. It does not belong to this
machine, and neither at the time of the kernel panic nor now does it
belong to any machine in the Batman cloud.
wolke linux # /etc/init.d/fastd start fastd ...
RTNETLINK answers: Invalid argument
#### now I know why ;-) but to reproduce the bug I don't change it
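A small sketch of how the "on up" script could guard against this: only use an address as src if it is actually configured on the interface. The helper name has_local_addr and the variable SRC are hypothetical; the check parses the one-line output of "ip -o -4 addr show", where the fourth field is address/prefix.

```shell
#!/bin/sh
# Sketch: succeed only if address $1 appears in the output of
# `ip -o -4 addr show dev <iface>` supplied on stdin.
has_local_addr() {
    awk -v want="$1" '
        { split($4, a, "/"); if (a[1] == want) found = 1 }
        END { exit !found }
    '
}

# Possible use inside the "on up" script (SRC is a placeholder):
# if ip -o -4 addr show dev bat0 | has_local_addr "$SRC"; then
#     ip route add 10.3.0.0/16 dev bat0 proto kernel scope link src "$SRC"
# else
#     echo "refusing to use $SRC as src: not configured on bat0" >&2
# fi
```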
Then these commands are executed:
#---8<---8<---8<---8<---8<---8<----
ip tunnel add tun-ffw-w07 mode ipip remote <remoteIP> local <myIP>
ip addr add <some_ISP_IP>/31 dev tun-ffw-w07
ip tunnel change tun-ffw-w07 ttl 64
ip link set mtu 1400 dev tun-ffw-w07
ip link set dev tun-ffw-w07 up
ip rule add from <some_ISP_IP>/31 table 16
ip rule add iif bat0 table 16
ip rule add from all to <some_ISP_IP_for_this_machine> lookup 16
ip route add default via <some_ISP_IP_on_the_other_side> \
dev tun-ffw-w07 table 16
ip route add <some_ISP_IP>/31 dev tun-ffw-w07 table 16
# bat doesn't need any address, but the error occurs also with scope
# link
ip addr flush dev fastd0
iptables -t nat \
-A POSTROUTING \
-o tun-ffw-w07 ! -s <some_ISP_IP>/31 \
-j SNAT --to <some_ISP_IP_for_this_machine>
iptables -A FORWARD -p tcp \
--tcp-flags SYN,RST SYN -j TCPMSS --clamp-mss-to-pmtu
# yes, I know … but some services in the net do not like ICMP
# http://lartc.org/howto/lartc.cookbook.mtu-mss.html
sysctl -w net.ipv4.ip_forward=1
sysctl -w net.ipv4.conf.default.rp_filter=0
sysctl -w net.ipv4.conf.all.rp_filter=0
/etc/local.d/kdump.start
/etc/init.d/dhcpd restart
/etc/init.d/vnstatd restart
/etc/init.d/named restart
/etc/init.d/apache2 restart
batctl gw server 100000/100000
#---8<---8<---8<---8<---8<---8<----EOF
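To make the effect of the TCPMSS rule above concrete, here is a short worked example. With --clamp-mss-to-pmtu the MSS is clamped to the path MTU minus 40 bytes for IPv4 (minus 60 for IPv6). The function name mss_for_mtu is only illustrative, and the calculation assumes headers without options.

```shell
#!/bin/sh
# Sketch: largest TCP MSS that fits into one packet of the given MTU.
# $1 = MTU in bytes, $2 = address family (4 or 6, default 4).
mss_for_mtu() {
    mtu=$1
    family=${2:-4}
    case "$family" in
        4) overhead=40 ;;   # 20-byte IPv4 header + 20-byte TCP header
        6) overhead=60 ;;   # 40-byte IPv6 header + 20-byte TCP header
        *) echo "unknown address family: $family" >&2; return 1 ;;
    esac
    echo $((mtu - overhead))
}

mss_for_mtu 1400    # MTU of tun-ffw-w07 above; prints 1360
```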
Now we have to wait until “prime time” or the weekend. I used to hope
“please don't crash”, but now it's different ;-) I hope that afterwards
you can reproduce the bug and fix it.
Best regards
Philipp
________________________
Freifunk Rheinland e. V.
– Funkzelle Wuppertal –