From: Philipp Psurek <philipp.psurek@gmail.com>
To: b.a.t.m.a.n@lists.open-mesh.org
Subject: Re: [B.A.T.M.A.N.] kernel BUG at net/core/skbuff.c:100
Date: Thu, 20 Nov 2014 13:22:29 +0100
Message-ID: <1416486149.2747.9.camel@gmail.com>
In-Reply-To: <546DC214.6050908@hundeboll.net>
Hi Martin,
/usr/src/linux/net/batman-adv/fragmentation.c is patched. I'm sorry I
overlooked your attachment. The new module is running; its size differs:
# lsmod
[ … ]
batman_adv 147774 0 # old
batman_adv 148030 0 # new
[ … ]
batman-adv is running with these settings:
# batctl if
fastd0: active
# batctl it
5000
# batctl ap
disabled
# batctl bl
enabled
# batctl dat
enabled
# batctl ag
enabled
# batctl b
disabled
# batctl f
enabled
# batctl nc
enabled
# batctl mark
0x00000000/0x00000000
# batctl mm
enabled
# batctl ll
Error - can't open file '/sys/class/net/bat0/mesh/log_level': No such
file or directory [ … ]
# batctl gw
server (announced bw: 100.0/100.0 MBit)
These were also the settings in effect when the kernel panicked.
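For future reports, the settings listed above can be collected in one pass. This is only a minimal sketch, assuming the installed batctl accepts the same short subcommands used above; the function name dump_batctl_settings and its optional first argument are made up for this illustration.

```shell
#!/bin/sh
# Print every setting queried above as "subcommand: value".
# $1 is the command to run (defaults to batctl). This parameter is
# hypothetical and only exists so the loop can be tried without a mesh.
dump_batctl_settings() {
    cmd=${1:-batctl}
    opts="if it ap bl dat ag b f nc mark mm gw"
    for opt in $opts; do
        # Keep going when a subcommand fails (as 'll' did above).
        val=$("$cmd" "$opt" 2>&1) || val="(failed) $val"
        printf '%s: %s\n' "$opt" "$val"
    done
}

# Usage on the gateway:
#   dump_batctl_settings > batctl-settings.txt
```

Running it once before and once after a crash makes it easy to diff the two states.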
On Thursday, 20.11.2014, 11:27 +0100, Martin Hundebøll wrote:
> On 2014-11-20 10:48, Philipp Psurek wrote:
[ … ]
> Yeah, most people compile out network coding. Has the bug disappeared
> after disabling NC ?
I can't tell for sure. NC was disabled for 20 hours, but the bug has
taken anywhere from 1 minute to 72 hours to appear. It depends on our
users. To reproduce the bug, NC is enabled again.
> > On Thursday, 20.11.2014, 09:32 +0100, Martin Hundebøll wrote:
> >> Thanks for your report. The bug is probably triggered by some bogus data
> >> in an incoming packet. I have created a small debug patch that will
> >> detect if this is the case, and print some debug info if so.
> >
> > Thank you for your work. I didn't find your Patch on
> > http://git.open-mesh.org/batman-adv.git
>
> It was attached to my previous mail :)
I'm so sorry ;-) my fault
> > I cannot analyse the packets because the gateway is part of an ISP
> > infrastructure and data privacy rules apply. But if your patch can
> > capture only the bogus packet during the kernel panic, there
> > shouldn't be any problems, I think.
>
> My debug patch should only print the header of the packet causing the
> panic, so no problems with privacy here. (But you should probably check
> the output before mailing it to a public list...)
OK, thanks for that
[ … ]
> I am running with NC on my machines in the lab and haven't seen this
> frag-issue before. I have seen a similar issue (wrong size value in the
> header) in another context though, but this wasn't due to either network
> coding or fragmentation.
Well, the lab is peaceful, but out in the wild there are evil
packets.
> Would you mind sending me your fastd config (without the key), so that I
> can try to reproduce this in my VMs?
Not at all. Here is the censored /etc/fastd/fastd.conf
#---8<---8<---8<---8<---8<---8<----
bind <my_publicIP>:<my_fastdPORT>;
include "secret.conf";
include peers from "peers/wupper";
include peers from "testpeers/wupper";
include peers from "servers/wupper";
interface "fastd0";
log level warn;
method "salsa2012+gmac";
#### doesn't have anything to do with the bug, also seen with fastd v14
#### not used yet but with the new firmware:
method "salsa2012+umac";
mtu 1426;
on up "
ip link set address <MAC_ADDRESS> dev $INTERFACE
ip link set up dev $INTERFACE
modprobe batman-adv
batctl if add fastd0
batctl it 5000
batctl bl enable
batctl gw client
### gw will be changed later to server 100000/100000
ip link set up dev bat0
ip addr add 10.3.<IP>/16 broadcast 10.3.255.255 dev bat0
ip addr add 10.3.<anotherIP>/16 broadcast 10.3.255.255 dev bat0
ip addr add fda0:747e:ab29:e1ba:<IPv6_IP>/64 dev bat0
ip route add 10.3.0.0/16 dev bat0 proto kernel scope link src 10.3.<wrong_IP*)>
alfred -i bat0 -m > /dev/null 2>&1 &
batadv-vis -i bat0 -s > /dev/null 2>&1 &
";
#---8<---8<---8<---8<---8<---8<----EOF
*) I now see that this is a different IP. It does not belong to this
machine, and neither during the kernel panic nor now did it belong to
any machine in the Batman cloud.
wolke linux # /etc/init.d/fastd start fastd ...
RTNETLINK answers: Invalid argument
#### Now I know why ;-) but I won't change it, so that the bug can be reproduced.
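The RTNETLINK error is consistent with the kernel refusing a route whose src address is not configured locally. Below is a small sketch of a pre-flight check, assuming that is indeed the cause here. The helper name src_is_local is invented for this example, and the address in the usage comment is a placeholder.

```shell
# Return 0 if address $1 appears among the CIDR entries that follow it.
src_is_local() {
    src=$1
    shift
    for cidr in "$@"; do
        # Strip the "/prefix" part before comparing.
        [ "${cidr%/*}" = "$src" ] && return 0
    done
    return 1
}

# Usage inside "on up", before adding the route (10.3.0.1 is a placeholder):
#   addrs=$(ip -o -4 addr show dev bat0 | awk '{print $4}')
#   src_is_local 10.3.0.1 $addrs || echo "src is not assigned to bat0" >&2
```

The comparison is a plain string match, so it only catches addresses written in the same form as `ip` prints them.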
Then these commands are executed:
#---8<---8<---8<---8<---8<---8<----
ip tunnel add tun-ffw-w07 mode ipip remote <remoteIP> local <myIP>
ip addr add <some_ISP_IP>/31 dev tun-ffw-w07
ip tunnel change tun-ffw-w07 ttl 64
ip link set mtu 1400 dev tun-ffw-w07
ip link set dev tun-ffw-w07 up
ip rule add from <some_ISP_IP>/31 table 16
ip rule add iif bat0 table 16
ip rule add from all to <some_ISP_IP_for_this_machine> lookup 16
ip route add default via <some_ISP_IP_on_the_other_side> \
dev tun-ffw-w07 table 16
ip route add <some_ISP_IP>/31 dev tun-ffw-w07 table 16
# bat doesn't need any address, but the error occurs also with scope
# link
ip addr flush dev fastd0
iptables -t nat \
-A POSTROUTING \
-o tun-ffw-w07 ! -s <some_ISP_IP>/31 \
-j SNAT --to <some_ISP_IP_for_this_machine>
iptables -A FORWARD -p tcp \
--tcp-flags SYN,RST SYN -j TCPMSS --clamp-mss-to-pmtu
# yes, I know … but some services on the net do not like ICMP
# http://lartc.org/howto/lartc.cookbook.mtu-mss.html
sysctl -w net.ipv4.ip_forward=1
sysctl -w net.ipv4.conf.default.rp_filter=0
sysctl -w net.ipv4.conf.all.rp_filter=0
/etc/local.d/kdump.start
/etc/init.d/dhcpd restart
/etc/init.d/vnstatd restart
/etc/init.d/named restart
/etc/init.d/apache2 restart
batctl gw server 100000/100000
#---8<---8<---8<---8<---8<---8<----EOF
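For reference, --clamp-mss-to-pmtu rewrites the MSS in forwarded SYN packets so that segments fit the path MTU. Here is a rough sketch of the arithmetic for plain IPv4 TCP without IP options (20 bytes of IP header plus 20 bytes of TCP header). The function name mss_for_mtu is made up for this example.

```shell
# MSS that fits a given MTU for IPv4 TCP without IP options.
mss_for_mtu() {
    # 20 bytes IPv4 header + 20 bytes TCP header
    echo $(( $1 - 20 - 20 ))
}

mss_for_mtu 1400   # tun-ffw-w07 MTU from above, prints 1360
```

So clients behind the gateway should end up with an MSS of at most 1360 on connections that leave via tun-ffw-w07, provided the path MTU really is 1400.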
Now we have to wait until “prime time” or the weekend. I always used to
hope “please don't crash”, but now it's different ;-) I hope that after
that you can reproduce the bug and fix it.
Best regards
Philipp
________________________
Freifunk Rheinland e. V.
– Funkzelle Wuppertal –