From: Jesper Dangaard Brouer <brouer@redhat.com>
To: Florian Westphal <fw@strlen.de>
Cc: netfilter-devel@vger.kernel.org,
	Eric Dumazet <eric.dumazet@gmail.com>,
	Pablo Neira Ayuso <pablo@netfilter.org>,
	netdev@vger.kernel.org, "David S. Miller" <davem@davemloft.net>,
	Patrick McHardy <kaber@trash.net>
Subject: Re: [nf-next PATCH 3/5] netfilter: avoid race with exp->master ct
Date: Fri, 28 Feb 2014 12:30:37 +0100
Message-ID: <20140228123037.568fdcd9@redhat.com>
In-Reply-To: <20140227213452.GE9965@breakpoint.cc>

On Thu, 27 Feb 2014 22:34:52 +0100
Florian Westphal <fw@strlen.de> wrote:

> Jesper Dangaard Brouer <brouer@redhat.com> wrote:
> > Preparation for disconnecting the nf_conntrack_lock from the
> > expectations code.  Once the nf_conntrack_lock is lifted, a race
> > condition is exposed.
> > 
> > The expectations master conntrack exp->master, can race with
> > delete operations, as the refcnt increment happens too late in
> > init_conntrack().  Race is against other CPUs invoking
> > ->destroy() (destroy_conntrack()), or nf_ct_delete() (via timeout
> > or early_drop()).
> > 
> > Avoid this race in nf_ct_find_expectation() by using atomic_inc_not_zero(),
> > and checking if nf_ct_is_dying() (path via nf_ct_delete()).
> > 
> > Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
> > Signed-off-by: Florian Westphal <fw@strlen.de>
> > ---
> > 
> >  net/netfilter/nf_conntrack_core.c   |    2 +-
> >  net/netfilter/nf_conntrack_expect.c |   16 +++++++++++++++-
> >  2 files changed, 16 insertions(+), 2 deletions(-)
> > 
> > diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
> > index ac85fd1..a822720 100644
> > --- a/net/netfilter/nf_conntrack_core.c
> > +++ b/net/netfilter/nf_conntrack_core.c
> > @@ -898,6 +898,7 @@ init_conntrack(struct net *net, struct nf_conn *tmpl,
> >  			 ct, exp);
> >  		/* Welcome, Mr. Bond.  We've been expecting you... */
> >  		__set_bit(IPS_EXPECTED_BIT, &ct->status);
> > +		/* exp->master safe, refcnt bumped in nf_ct_find_expectation */
> >  		ct->master = exp->master;
> >  		if (exp->helper) {
> >  			help = nf_ct_helper_ext_add(ct, exp->helper,
> > @@ -912,7 +913,6 @@ init_conntrack(struct net *net, struct nf_conn *tmpl,
> >  #ifdef CONFIG_NF_CONNTRACK_SECMARK
> >  		ct->secmark = exp->master->secmark;
> >  #endif
> > -		nf_conntrack_get(&ct->master->ct_general);
> >  		NF_CT_STAT_INC(net, expect_new);
> >  	} else {
> >  		__nf_ct_try_assign_helper(ct, tmpl, GFP_ATOMIC);
> > diff --git a/net/netfilter/nf_conntrack_expect.c b/net/netfilter/nf_conntrack_expect.c
> > index 4fd1ca9..2c4ffdb 100644
> > --- a/net/netfilter/nf_conntrack_expect.c
> > +++ b/net/netfilter/nf_conntrack_expect.c
> > @@ -147,13 +147,27 @@ nf_ct_find_expectation(struct net *net, u16 zone,
> >  	if (!exp)
> >  		return NULL;
> >  
> > +	/* Avoid race with other CPUs, that for exp->master ct, is
> > +	 * about to invoke ->destroy(), or nf_ct_delete() via timeout
> > +	 * or early_drop().
> > +	 *
> > +	 * The atomic_inc_not_zero() check tells:  If that fails, we
> > +	 * know that the ct is being destroyed.  If it succeeds, we
> > +	 * can be sure the ct cannot disappear underneath.
> > +	 */
> > +	if (unlikely(nf_ct_is_dying(exp->master) ||
> > +		     !atomic_inc_not_zero(&exp->master->ct_general.use)))
> > +		return NULL;
> > +
> >  	/* If master is not in hash table yet (ie. packet hasn't left
> >  	   this machine yet), how can other end know about expected?
> >  	   Hence these are not the droids you are looking for (if
> >  	   master ct never got confirmed, we'd hold a reference to it
> >  	   and weird things would happen to future packets). */
> > -	if (!nf_ct_is_confirmed(exp->master))
> > +	if (!nf_ct_is_confirmed(exp->master)) {
> > +		atomic_dec(&exp->master->ct_general.use);
> >  		return NULL;
> > +	}
> 
> Not sure if this is safe.
> 
> What about:
> CPU0: atomic_inc_not_zero()
> CPU1: calls nf_conntrack_put()
> CPU0: atomic_dec() -> zero refcnt without invocation of ->destroy

Okay, so you are saying CPU0 should use nf_ct_put() or nf_conntrack_put()
instead of a bare atomic_dec().
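
To make the interleaving above concrete, here is a small userspace sketch
that replays it serially with C11 atomics.  It is only a model, not kernel
code: struct obj, obj_get_not_zero(), obj_put(), obj_dec_raw() and replay()
are hypothetical names standing in for the conntrack object,
atomic_inc_not_zero(), nf_conntrack_put() and the bare atomic_dec().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct nf_conn; only the refcount matters. */
struct obj {
	atomic_int use;		/* models ct_general.use */
	bool destroyed;		/* set when the last reference is dropped */
};

/* Models atomic_inc_not_zero(): take a reference unless the count is 0. */
static bool obj_get_not_zero(struct obj *o)
{
	int old = atomic_load(&o->use);

	while (old != 0) {
		/* On failure, 'old' is reloaded and the loop re-checks it. */
		if (atomic_compare_exchange_weak(&o->use, &old, old + 1))
			return true;
	}
	return false;
}

/* Models nf_conntrack_put(): drop a reference, destroy on the last one. */
static void obj_put(struct obj *o)
{
	if (atomic_fetch_sub(&o->use, 1) == 1)
		o->destroyed = true;
}

/* Models the bare atomic_dec(): the count can hit zero with no destroy. */
static void obj_dec_raw(struct obj *o)
{
	atomic_fetch_sub(&o->use, 1);
}

/* Replays the CPU0/CPU1 sequence; returns whether the object was destroyed. */
bool replay(bool cpu0_uses_put)
{
	struct obj o = { .destroyed = false };
	bool got;

	atomic_init(&o.use, 1);		/* the single reference owned by "CPU1" */

	got = obj_get_not_zero(&o);	/* CPU0: 1 -> 2 */
	assert(got);
	obj_put(&o);			/* CPU1: 2 -> 1, not the last reference */
	if (cpu0_uses_put)
		obj_put(&o);		/* CPU0: 1 -> 0, destroy runs */
	else
		obj_dec_raw(&o);	/* CPU0: 1 -> 0, nobody runs destroy */

	return o.destroyed;
}
```

In this model replay(false) leaves the count at zero with the object never
destroyed, which is the leak described above, while replay(true) destroys it
on the final put.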

> [ Cannot happen now because of nf_conntrack_lock ]
> 
> I'd suggest to test nf_ct_is_confirmed() first, it avoids the need to
> undo the atomic_inc_not_zero.

Okay, I guess that should work.

> You also need to deal with the "timer-deletion-fails" a bit later in the same
> function:
> 
>         if (exp->flags & NF_CT_EXPECT_PERMANENT) {
>                 atomic_inc(&exp->use);
>                 return exp;
>         } else if (del_timer(&exp->timeout)) {
>                 nf_ct_unlink_expect(exp);
>                 return exp;
>         }
> 	// Problem: exp->master ref was bumped
> 	nf_ct_put(exp->master); // missing
>         return NULL;

True, and yes, using nf_ct_put() or nf_conntrack_put() would be
necessary here instead of a manual refcnt decrement.
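
Putting both review points together, the lookup could be shaped roughly as
in the userspace sketch below.  This is an illustration under assumptions,
not the follow-up patch: struct master, struct expect, master_get_not_zero(),
master_put() and find_expectation() are hypothetical models, and the
timer_pending flag merely imitates del_timer() succeeding.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical models; field names only echo the kernel ones. */
struct master {
	atomic_int use;		/* models ct_general.use */
	bool confirmed;		/* models nf_ct_is_confirmed() */
	bool dying;		/* models nf_ct_is_dying() */
	bool destroyed;
};

struct expect {
	struct master *master;
	bool permanent;		/* models NF_CT_EXPECT_PERMANENT */
	bool timer_pending;	/* del_timer() succeeds only while pending */
	int use;		/* models exp->use */
};

/* Models atomic_inc_not_zero(). */
static bool master_get_not_zero(struct master *m)
{
	int old = atomic_load(&m->use);

	while (old != 0) {
		if (atomic_compare_exchange_weak(&m->use, &old, old + 1))
			return true;
	}
	return false;
}

/* Models nf_ct_put(): destroy when the last reference goes away. */
static void master_put(struct master *m)
{
	if (atomic_fetch_sub(&m->use, 1) == 1)
		m->destroyed = true;
}

struct expect *find_expectation(struct expect *exp)
{
	struct master *m;

	assert(exp && exp->master);
	m = exp->master;

	/* Flag checks come first, so a failure here has nothing to undo. */
	if (!m->confirmed || m->dying)
		return NULL;
	if (!master_get_not_zero(m))
		return NULL;

	if (exp->permanent) {
		exp->use++;
		return exp;
	}
	if (exp->timer_pending) {	/* models del_timer() returning true */
		exp->timer_pending = false;
		return exp;
	}

	/* Timer deletion failed: hand the master reference back via put. */
	master_put(m);
	return NULL;
}
```

Every path that returns NULL leaves the master refcount as it found it, and
the only release goes through the put helper, so a drop to zero always
triggers the destroy step.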

Thanks for your review, I will fix it up...

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Sr. Network Kernel Developer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer
