public inbox for netdev@vger.kernel.org
 help / color / mirror / Atom feed
From: dormando <dormando@rydia.net>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: "Alexey Preobrazhensky" <preobr@google.com>,
	"Steffen Klassert" <steffen.klassert@secunet.com>,
	"David Miller" <davem@davemloft.net>,
	paulmck@linux.vnet.ibm.com, netdev@vger.kernel.org,
	"Kostya Serebryany" <kcc@google.com>,
	"Dmitry Vyukov" <dvyukov@google.com>,
	"Lars Bull" <larsbull@google.com>,
	"Eric Dumazet" <edumazet@google.com>,
	"Bruce Curtis" <brutus@google.com>,
	"Maciej Żenczykowski" <maze@google.com>,
	"Alexei Starovoitov" <alexei.starovoitov@gmail.com>
Subject: Re: [PATCH] ipv4: fix a race in ip4_datagram_release_cb()
Date: Mon, 23 Jun 2014 01:55:06 -0700 (PDT)	[thread overview]
Message-ID: <alpine.DEB.2.02.1406230151480.19472@dtop> (raw)
In-Reply-To: <1403512412.16682.23.camel@edumazet-glaptop2.roam.corp.google.com>

On Mon, 23 Jun 2014, Eric Dumazet wrote:

> On Sun, 2014-06-22 at 12:07 -0700, dormando wrote:
>
> > Update on testing:
> >
> > I only have two machines that crash on their own frequently (more like
> > one, even). Unfortunately something happened to the datacenter it's in and
> > it was offline for a week. The machine normally crashes after 1.5-4d,
> > averaging 2d.
> >
> > It's done about three days total time without a new crash. I also have the
> > kernel running in another datacenter for ~10 days.. but it takes 30-150
> > days to crash in that one.
> >
> > So, inconclusive, but still promising. If the machine survives the week it
> > probably means it's fixed, or at least greatly reduced.
> >
> > I saw that one of your patches got queued for stable, but all three were
> > necessary to fix udpkill. What's your plan for cleanup/upstreaming?
> >
> > Did you folks end up running udpkill under the tester thing?
>
> I did not test udpkill, as the known problem is the DST_NOCACHE flag.
>
> We end up calling sk_dst_set(sk, dst) with a dst having this flag set.
>
> So maybe DST_NOCACHE should be renamed, if we _can _ cache a dst like
> this. Its meaning is really that dst_release() has to track when
> refcount reaches 0 so that last owner fress dst, but we need to respect
> rcu grace period.
>
> Fixing sk_dst_set() as I did is not enough, as it is only reducing race
> window.

Hrm. I'll have to spend more time trying to understand how to test this
(beyond just putting a kernel into production and seeing if it crashes).
Outside of udpkill it's been slightly hard to reproduce, and udpkill ran
for over five hours with your previous patches.

Do you or other folks have any methods for testing this?

thanks!

> Something like following :
>
>  include/net/sock.h   |    4 ++--
>  net/core/dst.c       |   16 +++++++++++-----
>  net/ipv4/ip_tunnel.c |    7 +------
>  3 files changed, 14 insertions(+), 13 deletions(-)
>
> diff --git a/include/net/sock.h b/include/net/sock.h
> index 07b7fcd60d80..173cae485de1 100644
> --- a/include/net/sock.h
> +++ b/include/net/sock.h
> @@ -1730,8 +1730,8 @@ sk_dst_get(struct sock *sk)
>
>  	rcu_read_lock();
>  	dst = rcu_dereference(sk->sk_dst_cache);
> -	if (dst)
> -		dst_hold(dst);
> +	if (dst && !atomic_inc_not_zero(&dst->__refcnt))
> +		dst = NULL;
>  	rcu_read_unlock();
>  	return dst;
>  }
> diff --git a/net/core/dst.c b/net/core/dst.c
> index 80d6286c8b62..a028409ee438 100644
> --- a/net/core/dst.c
> +++ b/net/core/dst.c
> @@ -269,6 +269,15 @@ again:
>  }
>  EXPORT_SYMBOL(dst_destroy);
>
> +static void dst_destroy_rcu(struct rcu_head *head)
> +{
> +	struct dst_entry *dst = container_of(head, struct dst_entry, rcu_head);
> +
> +	dst = dst_destroy(dst);
> +	if (dst)
> +		__dst_free(dst);
> +}
> +
>  void dst_release(struct dst_entry *dst)
>  {
>  	if (dst) {
> @@ -276,11 +285,8 @@ void dst_release(struct dst_entry *dst)
>
>  		newrefcnt = atomic_dec_return(&dst->__refcnt);
>  		WARN_ON(newrefcnt < 0);
> -		if (unlikely(dst->flags & DST_NOCACHE) && !newrefcnt) {
> -			dst = dst_destroy(dst);
> -			if (dst)
> -				__dst_free(dst);
> -		}
> +		if (unlikely(dst->flags & DST_NOCACHE) && !newrefcnt)
> +			call_rcu(&dst->rcu_head, dst_destroy_rcu);
>  	}
>  }
>  EXPORT_SYMBOL(dst_release);
> diff --git a/net/ipv4/ip_tunnel.c b/net/ipv4/ip_tunnel.c
> index 097b3e7c1e8f..cc3b7fd34555 100644
> --- a/net/ipv4/ip_tunnel.c
> +++ b/net/ipv4/ip_tunnel.c
> @@ -73,12 +73,7 @@ static void __tunnel_dst_set(struct ip_tunnel_dst *idst,
>  {
>  	struct dst_entry *old_dst;
>
> -	if (dst) {
> -		if (dst->flags & DST_NOCACHE)
> -			dst = NULL;
> -		else
> -			dst_clone(dst);
> -	}
> +	dst_clone(dst);
>  	old_dst = xchg((__force struct dst_entry **)&idst->dst, dst);
>  	dst_release(old_dst);
>  }
>
>
>

  reply	other threads:[~2014-06-23  8:55 UTC|newest]

Thread overview: 41+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-06-06 11:29 Potential race in ip4_datagram_release_cb Alexey Preobrazhensky
2014-06-06 12:56 ` Eric Dumazet
2014-06-06 15:59   ` Alexei Starovoitov
2014-06-06 16:16     ` Eric Dumazet
2014-06-06 17:44       ` Alexei Starovoitov
2014-06-06 17:56         ` Eric Dumazet
2014-06-06 18:13           ` Alexei Starovoitov
2014-06-10 13:43 ` [PATCH] ipv4: fix a race in ip4_datagram_release_cb() Eric Dumazet
2014-06-11  0:32   ` dormando
2014-06-11  0:55     ` Eric Dumazet
2014-06-11  1:12       ` Eric Dumazet
2014-06-11  1:26         ` Eric Dumazet
2014-06-11  4:16           ` dormando
2014-06-11  5:54             ` Eric Dumazet
2014-06-11  7:20               ` dormando
2014-06-11  7:26                 ` dormando
2014-06-11  7:38                   ` dormando
2014-06-11 12:41                     ` Eric Dumazet
2014-06-11 13:12                       ` Eric Dumazet
2014-06-12  1:55                         ` dormando
2014-06-12  3:43                           ` Eric Dumazet
2014-06-12  4:05                             ` dormando
2014-06-22 19:07                             ` dormando
2014-06-23  8:33                               ` Eric Dumazet
2014-06-23  8:55                                 ` dormando [this message]
2014-06-23 16:57                                   ` Dmitry Vyukov
2014-06-24 17:05                                 ` [PATCH net] ipv4: fix dst race in sk_dst_get() Eric Dumazet
2014-06-26  0:42                                   ` David Miller
2014-06-11 13:38             ` [PATCH] ipv4: fix a race in ip4_datagram_release_cb() Kostya Serebryany
2014-06-29  0:25           ` dormando
2014-06-30  6:38             ` Eric Dumazet
2014-06-30  8:15               ` dormando
2014-06-30  8:30                 ` Eric Dumazet
2014-07-08  1:41                   ` dormando
2014-07-08  6:47                     ` Eric Dumazet
2014-07-08  7:01                       ` dormando
2014-07-16 21:03                       ` dormando
2014-07-25  8:11                         ` dormando
2014-06-30  8:26           ` [PATCH] ipv4: irq safe sk_dst_[re]set() and ipv4_sk_update_pmtu() fix Eric Dumazet
2014-07-01  6:43             ` David Miller
2014-06-11 22:39   ` [PATCH] ipv4: fix a race in ip4_datagram_release_cb() David Miller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=alpine.DEB.2.02.1406230151480.19472@dtop \
    --to=dormando@rydia.net \
    --cc=alexei.starovoitov@gmail.com \
    --cc=brutus@google.com \
    --cc=davem@davemloft.net \
    --cc=dvyukov@google.com \
    --cc=edumazet@google.com \
    --cc=eric.dumazet@gmail.com \
    --cc=kcc@google.com \
    --cc=larsbull@google.com \
    --cc=maze@google.com \
    --cc=netdev@vger.kernel.org \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=preobr@google.com \
    --cc=steffen.klassert@secunet.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox