From: Sven Eckelmann <sven@narfation.org>
To: b.a.t.m.a.n@lists.open-mesh.org
Subject: Re: [B.A.T.M.A.N.] [PATCH v3 1/2] batman-adv: fix race conditions on interface removal
Date: Fri, 21 Oct 2016 14:30:10 +0200
Message-ID: <2315506.rV8PSJo6DZ@bentobox>
In-Reply-To: <20161005234308.29871-2-linus.luessing@c0d3.blue>
On Thursday, 6 October 2016 01:43:07 CEST Linus Lüssing wrote:
> The most prominent general protection fault I was experiencing when
> quickly removing and adding interfaces to batman-adv is the following:
I am personally not sure whether this should go through net.git or through
net-next.git. If you think it should go through net-next then maybe it would
be good to state quite early in the commit message that mdelay(...) is
required to trigger the problem?
> ~~~~~~
> [ 1137.316136] general protection fault: 0000 [#1] SMP
[...]
> [ 1137.320038] Call Trace:
> [ 1137.320038] [<ffffffffa0363294>] batadv_hardif_disable_interface+0x29a/0x3a6 [batman_adv]
> [ 1137.320038] [<ffffffffa0373db4>] batadv_softif_destroy_netlink+0x4b/0xa4 [batman_adv]
> [ 1137.320038] [<ffffffff813b52f3>] __rtnl_link_unregister+0x48/0x92
> [ 1137.320038] [<ffffffff813b9240>] rtnl_link_unregister+0xc1/0xdb
> [ 1137.320038] [<ffffffff8108547c>] ? bit_waitqueue+0x87/0x87
> [ 1137.320038] [<ffffffffa03850d2>] batadv_exit+0x1a/0xf48 [batman_adv]
> [ 1137.320038] [<ffffffff810c26f9>] SyS_delete_module+0x136/0x1b0
> [ 1137.320038] [<ffffffff8144dc65>] entry_SYSCALL_64_fastpath+0x18/0xa8
> [ 1137.320038] [<ffffffff8108aaca>] ? trace_hardirqs_off_caller+0x37/0xa6
> [ 1137.320038] Code: 89 f7 e8 21 bd 0d e1 4d 85 e4 75 0e 31 f6 48 c7 c7 50 d7 3b a0 e8 50 16 f2 e0 49 8b 9c 24 28 01 00 00 48 85 db 0f 84 b2 00 00 00 <48> 8b 03 4d 85 ed 48 89 45 c8 74 09 4c 39 ab f8 00 00 00 75 1c
> [ 1137.320038] RIP [<ffffffffa0371852>] batadv_purge_outstanding_packets+0x1c8/0x291 [batman_adv]
> [ 1137.320038] RSP <ffff88001da5fd78>
> [ 1137.451885] ---[ end trace 803b9bdc6a4a952b ]---
> [ 1137.453154] Kernel panic - not syncing: Fatal exception in interrupt
> [ 1137.457143] Kernel Offset: disabled
> [ 1137.457143] ---[ end Kernel panic - not syncing: Fatal exception in interrupt
> ~~~~~~
Can we reduce the length of some lines here? Especially the modules line
(which is not really interesting - I hope) could be shortened to something
like "Modules linked in: batman-adv(O-) <...>". Also please remove the
"[ 1137.457143] " prefixes and just use 2/4 spaces in front of the snippet.
>
> It can be easily reproduced with some carefully placed
> msleeps()/mdelay()s.
>
> The issue is, that on interface removal, any still running worker thread
> of a forwarding packet will race with the interface purging routine to
> free a forwarding packet. Temporarilly giving up a spin-lock to be able
s/Temporarilly/Temporarily/
[...]
>
> PS: checkpatch throws the following at me, but seems to be bogus?
>
> ~~~~
> -------------------------------------------------------------------
> /tmp/0001-batman-adv-fix-race-conditions-on-interface-removal.patch
> -------------------------------------------------------------------
> CHECK: spinlock_t definition without comment
> + spinlock_t *lock);
>
> total: 0 errors, 0 warnings, 1 checks, 411 lines checked
>
> NOTE: For some of the reported defects, checkpatch may be able to
> mechanically convert to the typical style using --fix or --fix-inplace.
>
> /tmp/0001-batman-adv-fix-race-conditions-on-interface-removal.patch has style problems, please review.
> ~~~~~
Yes, this is bogus and a deficiency of checkpatch.pl. But since we run
checkpatch each day and I don't want to find a way to fix it in checkpatch.pl -
maybe you can shorten the declaration in send.h to a single line?
bool batadv_forw_packet_steal(struct batadv_forw_packet *packet, spinlock_t *l);
[...]
> +bool batadv_forw_packet_steal(struct batadv_forw_packet *forw_packet,
> + spinlock_t *lock)
> +{
> + struct hlist_head head = HLIST_HEAD_INIT;
> +
> + /* did purging routine steal it earlier? */
> + spin_lock_bh(lock);
> + if (batadv_forw_packet_was_stolen(forw_packet)) {
> + spin_unlock_bh(lock);
> + return false;
> + }
> +
> + hlist_del(&forw_packet->list);
> +
> + /* Just to spot misuse of this function */
> + hlist_add_head(&forw_packet->bm_list, &head);
> + hlist_add_fake(&forw_packet->bm_list);
Sorry, I don't get how this extra hlist_add_head should spot misuse.
You first add the packet to the list (on the stack) and then set its pprev
pointer to point at its own next pointer. So you basically have a fake hashed
node with the next pointer set to NULL. Wouldn't it be better to use
INIT_HLIST_NODE here instead of hlist_add_head? I would even say that
INIT_HLIST_NODE isn't needed here because you already did this during
batadv_forw_packet_alloc.
But I would assume that you actually only wanted hlist_add_fake for the
WARN_ONCE in batadv_forw_packet_queue, right?
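To illustrate the point about the list helpers, here is a minimal userspace
sketch. It assumes simplified copies of the hlist helpers from
include/linux/list.h (no list poisoning, no WRITE_ONCE); the two mark_*()
functions are hypothetical names used only for this comparison and are not
part of the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified userspace copies of the kernel hlist types and helpers. */
struct hlist_node { struct hlist_node *next, **pprev; };
struct hlist_head { struct hlist_node *first; };

static inline void INIT_HLIST_NODE(struct hlist_node *n)
{
	n->next = NULL;
	n->pprev = NULL;
}

static inline bool hlist_unhashed(const struct hlist_node *n)
{
	return !n->pprev;
}

static inline void hlist_add_head(struct hlist_node *n, struct hlist_head *h)
{
	struct hlist_node *first = h->first;

	n->next = first;
	if (first)
		first->pprev = &n->next;
	h->first = n;
	n->pprev = &h->first;
}

static inline void hlist_add_fake(struct hlist_node *n)
{
	n->pprev = &n->next;
}

static inline bool hlist_fake(struct hlist_node *n)
{
	return n->pprev == &n->next;
}

/* Hypothetical: the sequence used in the patch (stack head + fake). */
static bool mark_with_add_head_then_fake(struct hlist_node *n)
{
	struct hlist_head head = { .first = NULL };

	hlist_add_head(n, &head);	/* next = NULL, pprev = &head.first */
	hlist_add_fake(n);		/* pprev = &n->next, head is forgotten */
	return hlist_fake(n) && n->next == NULL;
}

/* Hypothetical: the alternative without the stack head. */
static bool mark_with_fake_only(struct hlist_node *n)
{
	INIT_HLIST_NODE(n);		/* next = NULL, pprev = NULL */
	hlist_add_fake(n);		/* pprev = &n->next */
	return hlist_fake(n) && n->next == NULL;
}
```

In this sketch both variants leave the node in the same state (fake hashed,
next set to NULL), so the only visible effect of hlist_add_head is resetting
the next pointer.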
[...]
> +/**
> + * batadv_forw_packet_queue - try to queue a forwarding packet
> + * @forw_packet: the forwarding packet to queue
> + * @lock: a key to the store (e.g. forw_{bat,bcast}_list_lock)
> + * @head: the shelve to queue it on (e.g. forw_{bat,bcast}_list)
> + * @send_time: timestamp (jiffies) when the packet is to be sent
> + *
> + * This function tries to (re)queue a forwarding packet. If packet was stolen
> + * earlier then the shop owner will (usually) keep quiet about it.
Can "shop owner" please be replaced with some information relevant to
batman-adv?
> + *
> + * Caller needs to ensure that forw_packet->delayed_work was initialized.
> + */
> +static void batadv_forw_packet_queue(struct batadv_forw_packet *forw_packet,
> + spinlock_t *lock, struct hlist_head *head,
> + unsigned long send_time)
> +{
> + spin_lock_bh(lock);
> +
> + /* did purging routine steal it from us? */
> + if (batadv_forw_packet_was_stolen(forw_packet)) {
> + /* If you got it for free() without trouble, then
> + * don't get back into the queue after stealing...
> + */
> + WARN_ONCE(hlist_fake(&forw_packet->bm_list),
> + "Oh oh... the kernel OOPs are on our tail now... Jim won't bail us out this time!\n");
Can this be replaced with a less funny but more helpful message?
[...]
>
> +/**
> + * batadv_purge_outstanding_packets - stop/purge scheduled bcast/OGMv1 packets
> + * @bat_priv: the bat priv with all the soft interface information
> + * @hard_iface: the hard interface to cancel and purge bcast/ogm packets on
Please replace the tab between " @hard_iface:" and "the hard in" with a space.
[...]
> @@ -21,6 +21,7 @@
> #include "main.h"
>
> #include <linux/compiler.h>
> +#include <linux/spinlock_types.h>
> #include <linux/types.h>
This include is actually correct - but I am currently mapping
linux/spinlock_types.h to linux/spinlock.h in iwyu. So it would be easier for
me if this include were changed to linux/spinlock.h.
I am not sure about all the crime-related puns in this patch, but the idea
makes sense and also cleans up some of the forwarding packet code.
Kind regards,
Sven