From: Alexei Potashnik <alexei@purestorage.com>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Joern Engel <joern@purestorage.com>,
"David S. Miller" <davem@davemloft.net>,
netdev@vger.kernel.org
Subject: RE: neigh use-after-free
Date: Fri, 3 Apr 2015 17:57:24 -0700 [thread overview]
Message-ID: <520964f8c9b65ad9a49545a30ecf6065@mail.gmail.com> (raw)
In-Reply-To: <1428108249.25985.197.camel@edumazet-glaptop2.roam.corp.google.com>
> -----Original Message-----
> From: Eric Dumazet [mailto:eric.dumazet@gmail.com]
> Sent: Friday, April 03, 2015 5:44 PM
> To: Alexei Potashnik
> Cc: Joern Engel; David S. Miller; netdev@vger.kernel.org
> Subject: Re: neigh use-after-free
>
> On Fri, 2015-04-03 at 17:27 -0700, Alexei Potashnik wrote:
> > > -----Original Message-----
> > > From: Eric Dumazet [mailto:eric.dumazet@gmail.com]
> > >
> > > On Fri, 2015-04-03 at 13:52 -0700, Alexei Potashnik wrote:
> > > > Would this be an appropriate solution:
> > > >
> > > > diff --git a/net/core/neighbour.c b/net/core/neighbour.c index
> > > > b49e8ba..38265f2 100644
> > > > --- a/net/core/neighbour.c
> > > > +++ b/net/core/neighbour.c
> > > > @@ -168,7 +168,8 @@ static int neigh_forced_gc(struct neigh_table
> > > > *tbl)
> > > >
> > > > static void neigh_add_timer(struct neighbour *n, unsigned long
> > > > when) {
> > > > - neigh_hold(n);
> > > > + if (!atomic_inc_not_zero(&n->refcnt))
> > > > + return;
> > > > if (unlikely(mod_timer(&n->timer, when))) {
> > > > printk("NEIGH: BUG, double timer add, state is
> > > > %x\n",
> > > > n->nud_state);
> > >
> > > This is a very ugly hack. Please find root cause.
> > >
> > > The correct way to implement refcount on a timer is :
> > >
> > > if (!mod_timer(&n->timer, when))
> > > neigh_hold(&n->refcnt);
> > >
> > > And current code seems to do that (It dumps a printk() and stack
> > > trace
> > > otherwise)
> >
> > The printk() and stack trace don't get printed because this is a
> > different case -- it’s not double timer add, but rather add of the
> > timer after neighbor object is deleted.
> >
> > > If refcnt is 0 at this point, it means there is another bug, and we
> > > should fix it instead of trying to work around it.
> >
> > I agree. And it seems to be clear what code does it:
> >
> > ip_finish_output2
> > +-> __ipv4_neigh_lookup_noref
> >
> > Lookup name is clearly indicating that noref is intentional. What's
> > not clear is why.
> > The change seems to have originated in
> >
> > commit a263b3093641fb1ec377582c90986a7fd0625184
> > Author: David S. Miller <davem@davemloft.net>
> > Date: Mon Jul 2 02:02:15 2012
> >
> > ipv4: Make neigh lookups directly in output packet path.
> >
> > Do not use the dst cached neigh, we'll be getting rid of that.
> >
> > Signed-off-by: David S. Miller <davem@davemloft.net>
> >
> > Before that neigh object came from dst_get_neighbour_noref(), which
> > used to return dst->_neighbour. Now, if dst->_neighbour used own a
> > reference, than the above commit has introduced the bug. Otherwise,
> > the issue has always been there and, perhaps, David Miller can explain
> > why it was deemed safe to operate on neigh object here without extra
> > ref.
> >
> > Alexei
>
> Your debugging code might be wrong.
>
> At entry of neigh_add_timer(), n->refcnt can be 0 if caller holds the
> correct
> lock, preventing the timer handler to proceed and decrement refcnt.
>
> neigh_timer_handler() does :
>
> write_lock(&neigh->lock);
> ...
> write_unlock(&neigh->lock);
> neigh_release(neigh);
>
My understanding of the state when issue occurs is:
- neigh_destroy() has finished executing
- no timer for this neigh entry is pending or running
(which makes neigh_timer_handler behavior irrelevant)
- but kfree_rcu callback for the neigh entry has not executed yet
- at this point IP code finds the entry with __ipv4_neigh_lookup_noref
and starts operating on it.
Alexei
next prev parent reply other threads:[~2015-04-04 0:57 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-04-03 20:27 neigh use-after-free Jörn Engel
2015-04-03 20:52 ` Alexei Potashnik
2015-04-03 23:34 ` Eric Dumazet
2015-04-04 0:27 ` Alexei Potashnik
2015-04-04 0:44 ` Eric Dumazet
2015-04-04 0:57 ` Alexei Potashnik [this message]
2015-04-04 1:10 ` Eric Dumazet
2015-04-04 1:20 ` Eric Dumazet
2015-04-04 1:32 ` Alexei Potashnik
2015-04-04 16:06 ` Julian Anastasov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=520964f8c9b65ad9a49545a30ecf6065@mail.gmail.com \
--to=alexei@purestorage.com \
--cc=davem@davemloft.net \
--cc=eric.dumazet@gmail.com \
--cc=joern@purestorage.com \
--cc=netdev@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).