From: Florian Westphal <fw@strlen.de>
To: eric gisse <jowr.pi@gmail.com>
Cc: Florian Westphal <fw@strlen.de>,
netfilter-devel@vger.kernel.org,
Mathias Krause <minipli@googlemail.com>
Subject: Re: bug report: use after free bug leading to kernel panic
Date: Fri, 31 Oct 2014 23:00:55 +0100
Message-ID: <20141031220055.GO10069@breakpoint.cc>
In-Reply-To: <CAHZ_AjvkAgcgkwjFzSfzMhkAOHBNPRiA2ywTyv=ORBO8RRZAaw@mail.gmail.com>
eric gisse <jowr.pi@gmail.com> wrote:
> On Fri, Oct 31, 2014 at 4:50 PM, Florian Westphal <fw@strlen.de> wrote:
> >> + if (pax_sanitize_slab && !(s->flags & SLAB_NO_SANITIZE)) {
> >> + memset(x, PAX_MEMORY_SANITIZE_VALUE, s->object_size);
> >> + if (s->ctor)
> >> + s->ctor(x);
> >> + }
> >> +
> >
> > I am no SLUB expert, but this looks wrong.
> > slab_free() is called directly via kmem_cache_free().
>
> I can't help with that one. My competence does not extend to kernel
> memory managment / allocation issues :)
It seems Mathias Krause will work on improving the PaX poisoning so that
it treats SLAB_DESTROY_BY_RCU caches specially.
> > conntrack objects are alloc'd/free'd from a SLAB_DESTROY_BY_RCU cache.
> >
> > It is therefore legal to access a conntrack object from another
> > CPU even after kmem_cache_free() was invoked on another cpu, provided all
> > readers that do so hold rcu_read_lock, and verify that object has not been
> > freed yet by issuing appropriate atomic_inc_not_zero calls.
> >
> > Therefore, object poisoning will only be safe from rcu callback, after
> > accesses are known to be illegal/invalid.
>
> Can you expand on that? The term "object poisoning" to me means an
> object (you are talking about the conntract tuple, right?) with
Yes.
> problematic values is put into memory, but the way you phrase it seems
> more like the hash table itself is being manipulated improperly.
No, afaics the conntrack object accesses are correct.
> I'm still trying to work out what the actual ISSUE is. My
> understanding is this, thus far:
>
> It seems like an object in the connection track hash table is being
> improperly marked as free, which then is sanitized, and is then later
> being accessed by the netfilter codepath that loops through the table.
No. Conntrack objects are free'd when the last reference goes away,
i.e. when the reference count drops to zero. However, because lookups in
the conntrack hash table are lockless, another CPU might be accessing the
conntrack object that is being free'd right now.
Normally such an access would be invalid. However, in the conntrack
case, the conntrack objects are allocated from a special cache
(SLAB_DESTROY_BY_RCU) that delays freeing of the underlying pages until
we know that no other cpu is still accessing them.
So there are 2 possible cases:
1 - the conntrack object that is being looked at is alive (refcnt > 0).
2 - the conntrack object that is being looked at is being free'd RIGHT NOW
on another cpu. RCU protects us from a page fault, since the underlying
memory page cannot be free'd.
So, we're safe to look at the memory contents of the tuple and decide
whether it's the object (conntrack tuple) we're trying to find or not.
If it is, we try to obtain a reference. This will only succeed if the
reference count is not 0 already, so we can detect the "it's free'd"
case.
If we obtained a reference, we still need to re-validate the tuple
address, since it's possible that the object was free'd on cpu x and
almost instantly reallocated for use by a different tuple.
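As a rough illustration of the lookup steps above, here is a userspace sketch. It is not the kernel code: C11 atomics stand in for atomic_t and atomic_inc_not_zero(), there is no real RCU or hash table, and struct ct_obj, ct_get_unless_zero(), ct_put() and ct_lookup() are made-up names for this example only.

```c
/* Userspace model only. All names here are hypothetical; the kernel
 * uses atomic_t, rcu_read_lock() and a real hash table instead. */
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct ct_obj {
    atomic_int refcnt;  /* 0 means "free'd, do not use" */
    int tuple;          /* stand-in for the conntrack tuple */
};

/* Take a reference unless the count already dropped to zero
 * (models atomic_inc_not_zero()). */
static bool ct_get_unless_zero(struct ct_obj *ct)
{
    int old = atomic_load(&ct->refcnt);

    while (old != 0) {
        /* On failure, old is refreshed with the current value. */
        if (atomic_compare_exchange_weak(&ct->refcnt, &old, old + 1))
            return true;
    }
    return false;
}

/* Drop a reference again. */
static void ct_put(struct ct_obj *ct)
{
    atomic_fetch_sub(&ct->refcnt, 1);
}

/* Examine one slot. In the kernel this would run under rcu_read_lock(),
 * which is what keeps the memory of a free'd object readable. */
static struct ct_obj *ct_lookup(struct ct_obj *slot, int wanted)
{
    if (slot->tuple != wanted)       /* step 1: compare the tuple */
        return NULL;
    if (!ct_get_unless_zero(slot))   /* step 2: detect "it's free'd" */
        return NULL;
    if (slot->tuple != wanted) {     /* step 3: re-validate after the get */
        ct_put(slot);                /* object was recycled for another tuple */
        return NULL;
    }
    return slot;
}

/* Small self-check: a live object is found, a dying one is rejected. */
static void ct_lookup_demo(void)
{
    struct ct_obj live = { .refcnt = 1, .tuple = 42 };
    struct ct_obj dying = { .refcnt = 0, .tuple = 42 };

    assert(ct_lookup(&live, 42) == &live);
    assert(ct_lookup(&dying, 42) == NULL);
}
```

The second tuple comparison is the part that is easy to forget: a successful reference only proves that the memory currently holds some live object, not that it is still the one we were looking for.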
If you are interested in this, have a look at the bug fixes made
in that area; there are some more explanations there:
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/net/netfilter/nf_conntrack_core.c?id=e53376bef2cd97d3e3f61fdc677fb8da7d03d0da
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/net/netfilter/nf_conntrack_core.c?id=c6825c0976fa7893692e0e43b09740b419b23c09
> > If you use different allocator, please tell us which one (check kernel
> > config, slub is default).
>
> SLAB allocator, though I do not remember making the choice.
>
> From the kernel config that's causing issues:
>
> # egrep 'SLAB|SLUB' .config
> CONFIG_SLAB=y
> # CONFIG_SLUB is not set
> CONFIG_SLABINFO=y
> # CONFIG_DEBUG_SLAB is not set
> CONFIG_PAX_USERCOPY_SLABS=y
Ok, from a quick glance the PaX SLAB kfree path is also zapping
objects before the grace period has elapsed.
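To show one way such early zapping can confuse a lockless reader, here is a small userspace model. It is only a sketch under assumptions: the poison byte 0xfe, the flag CACHE_TYPESAFE_BY_RCU (standing in for SLAB_DESTROY_BY_RCU), and the names struct obj, struct cache, get_unless_zero() and cache_free() are all made up for this example and are not taken from PaX or the kernel. Skipping the sanitize step for such caches is just one possible policy; deferring it to an RCU callback would be another.

```c
/* Userspace model only; every name and the poison byte are assumptions. */
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <string.h>

#define SANITIZE_BYTE 0xfe           /* hypothetical nonzero poison value */
#define CACHE_TYPESAFE_BY_RCU 0x1u   /* models SLAB_DESTROY_BY_RCU */

struct obj {
    atomic_int refcnt;  /* 0 means "free'd" to lockless readers */
    int tuple;
};

struct cache {
    unsigned int flags;
};

/* Models atomic_inc_not_zero(): succeeds only if refcnt is nonzero. */
static bool get_unless_zero(struct obj *o)
{
    int old = atomic_load(&o->refcnt);

    while (old != 0) {
        if (atomic_compare_exchange_weak(&o->refcnt, &old, old + 1))
            return true;
    }
    return false;
}

/* Models the free path. Objects from a typesafe-by-RCU cache are left
 * untouched, since readers may still legally inspect them. */
static void cache_free(const struct cache *c, struct obj *o)
{
    if (!(c->flags & CACHE_TYPESAFE_BY_RCU))
        memset(o, SANITIZE_BYTE, sizeof(*o));
}

/* Self-check: poisoning turns refcnt 0 into a nonzero pattern, so a
 * late reader wrongly obtains a reference to a free'd object. */
static void poison_demo(void)
{
    struct cache plain = { .flags = 0 };
    struct cache rcu = { .flags = CACHE_TYPESAFE_BY_RCU };
    struct obj a = { .refcnt = 0, .tuple = 1 };
    struct obj b = { .refcnt = 0, .tuple = 1 };

    cache_free(&plain, &a);
    assert(get_unless_zero(&a));    /* bogus reference after poisoning */

    cache_free(&rcu, &b);
    assert(!get_unless_zero(&b));   /* reader still sees "it's free'd" */
}
```

In this model the reader's only signal that an object is gone is refcnt == 0, so anything that overwrites the count before the grace period ends breaks that signal.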
> > If its reproduceable with poisoning done after the RCU grace periods
> > have elapsed (i.e., where its not legal anymore to access the memory),
> > please let us know and we can have another look at it.
> >
> > Thanks.
>
> Reproducability is an issue since I don't know what's triggering it in
> the first place. Just that it happens after a variable length of time
> along the same code path, subject to differences between the two
> kernel versions I've seen this issue with.
>
> The machine itself is pushing 20-25 megabytes (~50k packets) per
> second at any given time and has smacked the default conntrack hash
> table maximums. So the netfilter system is under nontrivial stresses.
It should be able to handle a lot more.
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/net/netfilter/nf_conntrack_core.c?id=93bb0ceb75be2fdfa9fc0dd1fb522d9ada515d9c
> I'll happily work with you guys to isolate this as this is an
> interesting problem and I'm bored, but I need a bit of help and
> prompting to get this done properly.
Sure, my understanding is that someone from the PaX team is working on
making the object poisoning handle SLAB_DESTROY_BY_RCU properly.
Please don't hesitate to report back with newer PaX versions if you
still see invalid accesses.