From: Philipp Psurek <philipp.psurek@gmail.com>
To: The list for a Better Approach To Mobile Ad-hoc Networking
	<b.a.t.m.a.n@lists.open-mesh.org>
Subject: Re: [B.A.T.M.A.N.] kernel BUG at net/core/skbuff.c:100
Date: Mon, 24 Nov 2014 11:44:33 +0100
Message-ID: <1416825873.2678.2.camel@gmail.com>
In-Reply-To: <5472EB56.8000700@hundeboll.net>

Hi Martin

On Monday, 24.11.2014 at 09:24 +0100, Martin Hundebøll wrote:
> Can you help me do a quick sum-up?

At the beginning of the month a bug occurred on our two Arch Linux
gateways, with MM and NC enabled. At first they ran batman-adv 2014.2.0,
later the batman-adv shipped with the kernel. The kernel was
3.14.23-ARCH, later 3.17.1-ARCH. The bug cannot be analysed on those VMs
because I don't have access to their consoles.

I set up a Gentoo Linux VM (batman-adv as shipped with kernel 3.16.6)
with console access to make sure this bug is not Arch-related. The bug
occurred after three days. After a kernel panic the console cannot be
scrolled, so capturing a complete backtrace is impossible. The end of
the backtrace was the same as the one I reported. I recompiled the
kernel (3.16.7) with debug symbols enabled and switched to it after one
more crash.

> 1) At first it crashed with regular intervals (0 - 72 hours) with the 
> backtrace you posted initially.

No, at irregular intervals (0 - 72 hours). I think it has nothing to do
with time. With the Arch VMs I tried the following: one machine ran as
gw server, the other as gw client. After the first VM crashed I
immediately switched the other one to gw server. Almost immediately
that machine crashed as well. I think it has to be a bogus user packet.

I don't know which user sends bogus packets, and I also cannot ask our
users what they are doing to crash our gateways.

I also don't know whether the crash on the Arch VMs is the same as the
one on the Gentoo VM, with the backtrace I reported, but I assume so.

> 2) Then you disabled NC. Did it stop crashing at that point?

NC had been disabled for 20 h before I patched the kernel, so I can't
say for sure that disabling it stops the crashes.

> 3) Then we enabled NC and added my patch, and it still does not crash?

After patching, I enabled NC again to reproduce the bug. The VM crashed
after 27 h. I could not retrieve the backtrace because I had set the
'crashkernel' option too low. The next crash happened after 32:38:59.
There was no batadv_frag_merge_packets entry in the kernel ring buffer.

> I remember you said it crashed with the distro-provided batman-adv
> module. Did you ensure to use the same version when running with my patch?

Yes. I patched /usr/src/linux/net/batman-adv/fragmentation.c.
I use the batman-adv shipped with the kernel to reproduce all the steps.
'make modules' recompiled only the batman-adv module, which I then
reloaded.

> I haven't had time to dig into the reproduction of the crash, but I 
> think I will do regardless.

Please tell me if you need more information.

The VM's uptime is now 39 h. It survived Saturday evening and Sunday
without a crash. I think the bug is NC-related, but let's wait a few
more days, until next Monday, to tell for sure. In that time the users
might do what they did in the past and trigger the bug.

Thank you for your time and for making B.A.T.M.A.N.-adv better.

Best regards

Philipp

________________________
Freifunk Rheinland e. V.
– Funkzelle Wuppertal –

Thread overview: 15+ messages
2014-11-18 21:58 [B.A.T.M.A.N.] kernel BUG at net/core/skbuff.c:100 Philipp Psurek
2014-11-20  8:32 ` Martin Hundebøll
2014-11-20  9:48   ` Philipp Psurek
2014-11-20 10:27     ` Martin Hundebøll
2014-11-20 12:22       ` Philipp Psurek
2014-11-20 12:36         ` Martin Hundebøll
2014-11-21  8:40           ` Philipp Psurek
2014-11-22 20:39           ` Philipp Psurek
2014-11-24  8:24             ` Martin Hundebøll
2014-11-24 10:44               ` Philipp Psurek [this message]
2014-11-24 12:14                 ` Philipp Psurek
2014-11-24 21:15                   ` Philipp Psurek
2014-11-24 22:26                     ` Philipp Psurek
2014-11-25  0:22                       ` Philipp Psurek
2014-11-25 10:17                         ` Philipp Psurek
