From: Aaron Conole <aconole@bytheb.org>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Florian Westphal <fw@strlen.de>,
Al Viro <viro@zeniv.linux.org.uk>,
Andrew Morton <akpm@linux-foundation.org>,
Jens Axboe <axboe@fb.com>, "Ted Ts'o" <tytso@mit.edu>,
Christoph Lameter <cl@linux.com>,
David Miller <davem@davemloft.net>,
Pablo Neira Ayuso <pablo@netfilter.org>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
linux-fsdevel <linux-fsdevel@vger.kernel.org>,
Network Development <netdev@vger.kernel.org>,
NetFilter <netfilter-devel@vger.kernel.org>
Subject: Re: slab corruption with current -git (was Re: [git pull] vfs pile 1 (splice))
Date: Mon, 10 Oct 2016 15:18:13 -0400
Message-ID: <f7td1j8xbbe.fsf@redhat.com>
In-Reply-To: <CA+55aFy0szySf+SnysjXTyfiU=RMBo9U1sHAVaTKG=tUTF+XGw@mail.gmail.com> (Linus Torvalds's message of "Mon, 10 Oct 2016 12:05:17 -0700")

Linus Torvalds <torvalds@linux-foundation.org> writes:

> On Mon, Oct 10, 2016 at 9:28 AM, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
>>
>> So as I already answered to Dave, I'm not actually sure that this was
>> the buggy code, or that my patch would make any difference at all.
>
> My patch does seem to fix things, and in fact the warning about "hook
> not found" now triggers.
>
> So I think the bug really was that the singly-linked list handling
> code did not correctly handle the case of not finding the entry, and
> then freed (incorrectly) the last one that wasn't actually unlinked.
>
> In fact, I get quite a few warnings (56 total) about 30 seconds after
> logging in:
>
> [ 54.213170] WARNING: CPU: 1 PID: 111 at net/netfilter/core.c:151
> nf_unregister_net_hook+0x8e/0x170
> ... repeat 54 times ...
> [ 54.445520] WARNING: CPU: 7 PID: 111 at net/netfilter/core.c:151
> nf_unregister_net_hook+0x8e/0x170
>
> and looking in the journal, the first one is (again) immediately
> preceded by that systemd-hostnamed service stopping:
>
> Oct 10 11:45:47 i7 audit[1546]: USER_LOGIN
> ...
> Oct 10 11:46:11 i7 audit[1]: SERVICE_STOP pid=1 uid=0
> auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0
> msg='unit=fprintd comm="systemd" exe="/usr/lib/systemd/systemd"
> hostname=? addr=? terminal=? res=success'
> Oct 10 11:46:13 i7 pulseaudio[1697]: [pulseaudio] bluez5-util.c:
> GetManagedObjects() failed: org.freedesktop.DBus.Error.NoReply: Did
> not receive a reply. Possible causes include: the remote application
> did not send a reply, the message bus security policy blocked the
> reply, the reply timeout expir
> Oct 10 11:46:13 i7 dbus-daemon[1003]: [system] Failed to activate
> service 'org.bluez': timed out
> Oct 10 11:46:20 i7 audit[1]: SERVICE_STOP pid=1 uid=0
> auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0
> msg='unit=systemd-hostnamed comm="systemd"
> exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=?
> res=success'
> Oct 10 11:46:20 i7 kernel: ------------[ cut here ]------------
> Oct 10 11:46:20 i7 kernel: WARNING: CPU: 1 PID: 111 at
> net/netfilter/core.c:151 nf_unregister_net_hook+0x8e/0x170
>
> so I do think it's something to do with some network startup service
> thing (perhaps dhcp, perhaps chrome, who knows) as I do my initial
> login.
>
> David - I think that also explains what was wrong with the old code.
> In the old code, this loop:
>
> while (hooks_entry && nf_entry_dereference(hooks_entry->next)) {
>
> would exit with "hooks_entry" pointing to the last list entry (because
> ->next was NULL). Nothing was ever unlinked in the loop itself,
> because it never actually found a matching entry, but then after the
> loop it would free that last entry because it *thought* that was the
> match.
>
> My list rewrite fixes that.
>
> Anyway, I'm assuming it will come to me from the networking tree after
> more testing by the maintainers. You can add my
>
> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
>
> to the patch, though.
>
> David, if you want me to just commit that thing directly, I can
> obviously do so, but I do think somebody should look at
>
> (a) that I actually got the priority list ordering right on the
> insertion side

It looks correct.

Reviewed-by: Aaron Conole <aconole@bytheb.org>

> (b) what it is that makes it try to unregister that hook that isn't
> on the list in the first place

This is still a problem, I think. I wasn't able to reproduce the issue
on a Fedora 23 VM. My Fedora 24 bare-metal system does trigger this,
though. I'm not sure what changed on the userspace/kernel interaction
side (not an excuse, just an observation).
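
For anyone following along, the miss case described above (unregistering
an entry that is not on the list) can be sketched in plain userspace C.
This is only an illustration under stated assumptions, not the code in
net/netfilter/core.c: struct entry, buggy_unlink() and fixed_unlink() are
hypothetical names, and RCU and locking are left out entirely. The first
function mirrors the shape of the old loop, which ends with its cursor on
the tail when nothing matches; the second walks a pointer-to-pointer, one
common way to make "not found" an explicit NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a hook list entry; not the kernel's struct. */
struct entry {
	int id;
	struct entry *next;
};

/*
 * Buggy shape: the loop runs while cur->next is non-NULL, so on a miss
 * it exits with cur pointing at the last entry. That entry was never
 * unlinked, yet it is returned as if it were the match, and a caller
 * that frees the return value corrupts the list.
 */
static struct entry *buggy_unlink(struct entry **head, int id)
{
	struct entry *cur = *head;

	if (cur && cur->id == id) {
		*head = cur->next;
		return cur;
	}
	while (cur && cur->next) {
		struct entry *next = cur->next;

		if (next->id != id) {
			cur = next;
			continue;
		}
		cur->next = next->next;	/* unlink the real match */
		cur = next;
		break;
	}
	return cur;	/* on a miss: the still-linked tail */
}

/*
 * Fixed shape: pp always points at the link that refers to the entry
 * being examined, so unlinking is a single store and a miss falls out
 * of the loop and returns NULL.
 */
static struct entry *fixed_unlink(struct entry **head, int id)
{
	struct entry **pp;

	assert(head != NULL);
	for (pp = head; *pp; pp = &(*pp)->next) {
		if ((*pp)->id == id) {
			struct entry *found = *pp;

			*pp = found->next;
			found->next = NULL;
			return found;
		}
	}
	return NULL;	/* caller can warn "hook not found" and free nothing */
}
```

With a three-entry list and an id that is not present, buggy_unlink()
hands back the third entry while it is still linked, whereas
fixed_unlink() returns NULL and leaves the list untouched.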
> but on the whole I consider this issue explained and solved. I'll
> continue to run with my patch on my machine (just not committed).

Okay. Very sorry for this, again.

> Linus