From: Wei Wang <weiwan@google.com>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: "Paweł Staszewski" <pstaszewski@itcare.pl>,
	"Cong Wang" <xiyou.wangcong@gmail.com>,
	"Linux Kernel Network Developers" <netdev@vger.kernel.org>,
	"Eric Dumazet" <edumazet@google.com>
Subject: Re: [PATCH net] net: prevent dst uses after free
Date: Thu, 21 Sep 2017 09:49:56 -0700
Message-ID: <CAEA6p_Dxaei6f0khmgSZoz8F8ePStFQSvi7L7Namvoh7V7BtfA@mail.gmail.com>
In-Reply-To: <1506010546.29839.148.camel@edumazet-glaptop3.roam.corp.google.com>

On Thu, Sep 21, 2017 at 9:15 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> From: Eric Dumazet <edumazet@google.com>
>
> In linux-4.13, Wei worked hard to convert dst to a traditional
> refcounted model, removing GC.
>
> We now want to make sure a dst refcount cannot transition from 0 back
> to 1.
>
> The problem here is that the input path attached a non-refcounted dst
> to an skb. Later, because the packet is forwarded and hits
> skb_dst_force() before exiting the RCU section, we might try to take a
> refcount on a dst that is about to be freed, if another CPU saw the
> 1 -> 0 transition in dst_release() and queued the dst for freeing after
> one RCU grace period.
>
> Let's unify skb_dst_force() and skb_dst_force_safe(), since we should
> always perform the complete check against the dst refcount, and not
> assume it is not zero.
>
> Bugzilla : https://bugzilla.kernel.org/show_bug.cgi?id=197005
>
> [  989.919496]  skb_dst_force+0x32/0x34
> [  989.919498]  __dev_queue_xmit+0x1ad/0x482
> [  989.919501]  ? eth_header+0x28/0xc6
> [  989.919502]  dev_queue_xmit+0xb/0xd
> [  989.919504]  neigh_connected_output+0x9b/0xb4
> [  989.919507]  ip_finish_output2+0x234/0x294
> [  989.919509]  ? ipt_do_table+0x369/0x388
> [  989.919510]  ip_finish_output+0x12c/0x13f
> [  989.919512]  ip_output+0x53/0x87
> [  989.919513]  ip_forward_finish+0x53/0x5a
> [  989.919515]  ip_forward+0x2cb/0x3e6
> [  989.919516]  ? pskb_trim_rcsum.part.9+0x4b/0x4b
> [  989.919518]  ip_rcv_finish+0x2e2/0x321
> [  989.919519]  ip_rcv+0x26f/0x2eb
> [  989.919522]  ? vlan_do_receive+0x4f/0x289
> [  989.919523]  __netif_receive_skb_core+0x467/0x50b
> [  989.919526]  ? tcp_gro_receive+0x239/0x239
> [  989.919529]  ? inet_gro_receive+0x226/0x238
> [  989.919530]  __netif_receive_skb+0x4d/0x5f
> [  989.919532]  netif_receive_skb_internal+0x5c/0xaf
> [  989.919533]  napi_gro_receive+0x45/0x81
> [  989.919536]  ixgbe_poll+0xc8a/0xf09
> [  989.919539]  ? kmem_cache_free_bulk+0x1b6/0x1f7
> [  989.919540]  net_rx_action+0xf4/0x266
> [  989.919543]  __do_softirq+0xa8/0x19d
> [  989.919545]  irq_exit+0x5d/0x6b
> [  989.919546]  do_IRQ+0x9c/0xb5
> [  989.919548]  common_interrupt+0x93/0x93
> [  989.919548]  </IRQ>
>
>
> Similarly dst_clone() can use dst_hold() helper to have additional
> debugging, as a follow up to commit 44ebe79149ff ("net: add debug
> atomic_inc_not_zero() in dst_hold()")
>
> In net-next we will convert dst atomic_t to refcount_t for peace of
> mind.
>
> Fixes: a4c2fd7f7891 ("net: remove DST_NOCACHE flag")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Wei Wang <weiwan@google.com>
> Reported-by: Paweł Staszewski <pstaszewski@itcare.pl>
> Bisected-by: Paweł Staszewski <pstaszewski@itcare.pl>
> ---

Thanks a lot for the fix, Eric. It makes sense to unify all uses of
skb_dst_force() so that they always check that the refcnt is not 0.
And thank you, Paweł, for reporting this and testing the fix.

Acked-by: Wei Wang <weiwan@google.com>


>  include/net/dst.h   |   22 ++++------------------
>  include/net/route.h |    2 +-
>  include/net/sock.h  |    2 +-
>  3 files changed, 6 insertions(+), 20 deletions(-)
>
> diff --git a/include/net/dst.h b/include/net/dst.h
> index 93568bd0a3520bb7402f04d90cf04ac99c81cfbe..06a6765da074449e6f1fe42ee05e711e898ad372 100644
> --- a/include/net/dst.h
> +++ b/include/net/dst.h
> @@ -271,7 +271,7 @@ static inline void dst_use_noref(struct dst_entry *dst, unsigned long time)
>  static inline struct dst_entry *dst_clone(struct dst_entry *dst)
>  {
>         if (dst)
> -               atomic_inc(&dst->__refcnt);
> +               dst_hold(dst);
>         return dst;
>  }
>
> @@ -311,21 +311,6 @@ static inline void skb_dst_copy(struct sk_buff *nskb, const struct sk_buff *oskb
>         __skb_dst_copy(nskb, oskb->_skb_refdst);
>  }
>
> -/**
> - * skb_dst_force - makes sure skb dst is refcounted
> - * @skb: buffer
> - *
> - * If dst is not yet refcounted, let's do it
> - */
> -static inline void skb_dst_force(struct sk_buff *skb)
> -{
> -       if (skb_dst_is_noref(skb)) {
> -               WARN_ON(!rcu_read_lock_held());
> -               skb->_skb_refdst &= ~SKB_DST_NOREF;
> -               dst_clone(skb_dst(skb));
> -       }
> -}
> -
>  /**
>   * dst_hold_safe - Take a reference on a dst if possible
>   * @dst: pointer to dst entry
> @@ -339,16 +324,17 @@ static inline bool dst_hold_safe(struct dst_entry *dst)
>  }
>
>  /**
> - * skb_dst_force_safe - makes sure skb dst is refcounted
> + * skb_dst_force - makes sure skb dst is refcounted
>   * @skb: buffer
>   *
>   * If dst is not yet refcounted and not destroyed, grab a ref on it.
>   */
> -static inline void skb_dst_force_safe(struct sk_buff *skb)
> +static inline void skb_dst_force(struct sk_buff *skb)
>  {
>         if (skb_dst_is_noref(skb)) {
>                 struct dst_entry *dst = skb_dst(skb);
>
> +               WARN_ON(!rcu_read_lock_held());
>                 if (!dst_hold_safe(dst))
>                         dst = NULL;
>
> diff --git a/include/net/route.h b/include/net/route.h
> index 1b09a9368c68d46f0c5ee8ce3cefe566000c1ec1..57dfc6850d378e4b96f13b140eef554d66c24cdf 100644
> --- a/include/net/route.h
> +++ b/include/net/route.h
> @@ -190,7 +190,7 @@ static inline int ip_route_input(struct sk_buff *skb, __be32 dst, __be32 src,
>         rcu_read_lock();
>         err = ip_route_input_noref(skb, dst, src, tos, devin);
>         if (!err) {
> -               skb_dst_force_safe(skb);
> +               skb_dst_force(skb);
>                 if (!skb_dst(skb))
>                         err = -EINVAL;
>         }
> diff --git a/include/net/sock.h b/include/net/sock.h
> index 03a362568357acc7278a318423dd3873103f90ca..a6b9a8d1a6df3f72df8f1aac0f577257fa6452d0 100644
> --- a/include/net/sock.h
> +++ b/include/net/sock.h
> @@ -856,7 +856,7 @@ void sk_stream_write_space(struct sock *sk);
>  static inline void __sk_add_backlog(struct sock *sk, struct sk_buff *skb)
>  {
>         /* dont let skb dst not refcounted, we are going to leave rcu lock */
> -       skb_dst_force_safe(skb);
> +       skb_dst_force(skb);
>
>         if (!sk->sk_backlog.tail)
>                 sk->sk_backlog.head = skb;
>
>
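
For readers following the thread, the "take a reference only if the count is not already zero" check that the unified skb_dst_force() relies on (via dst_hold_safe()) can be illustrated with a minimal userspace sketch. This is not the kernel implementation: struct entry, entry_hold_safe() and entry_release() are hypothetical names chosen for illustration, and C11 atomics stand in for the kernel's atomic_t helpers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a refcounted object such as a dst entry. */
struct entry {
	atomic_int refcnt;
};

/*
 * Take a reference only if the count is still non-zero.
 * Returns false when the object has already reached 0, so a
 * 0 -> 1 transition can never happen through this helper.
 */
static bool entry_hold_safe(struct entry *e)
{
	int old = atomic_load(&e->refcnt);

	while (old != 0) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&e->refcnt, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; returns true when the caller dropped the last one. */
static bool entry_release(struct entry *e)
{
	int old = atomic_fetch_sub(&e->refcnt, 1);

	assert(old > 0);
	return old == 1;
}
```

A plain increment, as in the old skb_dst_force() path through dst_clone(), would instead move the count from 0 to 1 on an object already queued for freeing. With the checked variant the caller sees a failure and can drop the pointer, which is what the patch does by setting dst to NULL.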
