From mboxrd@z Thu Jan  1 00:00:00 1970
From: stranche@codeaurora.org
Subject: Re: [PATCH net] af_key: free SKBs under RCU protection
Date: Thu, 20 Sep 2018 13:25:18 -0600
Message-ID: <357e28c3fa0c7bacaffde4e960f58a87@codeaurora.org>
References: <1537402712-12875-1-git-send-email-stranche@codeaurora.org>
 <6d194ac2-15e8-76d7-31d0-b4c4eed68d5a@gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII;
 format=flowed
Content-Transfer-Encoding: 7bit
Cc: netdev@vger.kernel.org, steffen.klassert@secunet.com
To: Eric Dumazet <eric.dumazet@gmail.com>
Return-path: <netdev-owner@vger.kernel.org>
Received: from smtp.codeaurora.org ([198.145.29.96]:53888 "EHLO
        smtp.codeaurora.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S2387485AbeIUBKU (ORCPT
        <rfc822;netdev@vger.kernel.org>); Thu, 20 Sep 2018 21:10:20 -0400
In-Reply-To: <6d194ac2-15e8-76d7-31d0-b4c4eed68d5a@gmail.com>
Sender: netdev-owner@vger.kernel.org
List-ID: <netdev.vger.kernel.org>

> 
> I do not believe the changelog or the patch makes sense.
> 
> Having skb still referencing a socket prevents this socket being 
> released.
> 
> If you think about it, what would prevent the freeing happening
> _before_ the rcu_read_lock() in pfkey_broadcast() ?
> 
> Maybe the correct fix is that pfkey_broadcast_one() should ensure the
> socket is still valid.
> 
> I would suggest something like :
> 
> diff --git a/net/key/af_key.c b/net/key/af_key.c
> index
> 9d61266526e767770d9a1ce184ac8cdd59de309a..5ce309d020dda5e46e4426c4a639bfb551e2260d
> 100644
> --- a/net/key/af_key.c
> +++ b/net/key/af_key.c
> @@ -201,7 +201,9 @@ static int pfkey_broadcast_one(struct sk_buff
> *skb, struct sk_buff **skb2,
>  {
>         int err = -ENOBUFS;
> 
> -       sock_hold(sk);
> +       if (!refcount_inc_not_zero(&sk->sk_refcnt))
> +               return -ENOENT;
> +
>         if (*skb2 == NULL) {
>                 if (refcount_read(&skb->users) != 1) {
>                         *skb2 = skb_clone(skb, allocation);

Hi Eric,

I'm not sure that the socket getting freed before the rcu_read_lock()
would be an issue, since then it would no longer be in the
net_pfkey->table that we loop through (since we call pfkey_remove()
from pfkey_release()). Because of that, all the sockets processed in
pfkey_broadcast_one() have valid refcounts, so checking for zero there
doesn't prevent the crash that I'm seeing.

However, after going over the call flow again, I see that the actual
problem occurs because of pfkey_broadcast_one(). Specifically, because
of this check:

	if (*skb2 == NULL) {
		if (refcount_read(&skb->users) != 1) {
			*skb2 = skb_clone(skb, allocation);
		} else {
			*skb2 = skb;
			refcount_inc(&skb->users);
		}
	}

Since we always pass a freshly cloned SKB to this function, skb->users
is always 1, and *skb2 just becomes skb. We then set *skb2 (and thus
skb) to belong to the socket.
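
To make the aliasing concrete, here is a rough userspace sketch of the
check above. The toy_* types and helpers are hypothetical stand-ins I
made up for illustration, not the kernel definitions, and the owner
assignment is only a plain pointer store standing in for whatever the
real function does when it hands the SKB to the socket.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-ins for the kernel types; NOT the real definitions. */
struct toy_sock { int refcnt; };
struct toy_skb  { int users; struct toy_sock *sk; };

/* Hypothetical stand-in for skb_clone(): new buffer, one user, no owner. */
static struct toy_skb *toy_skb_clone(const struct toy_skb *skb)
{
	struct toy_skb *c = malloc(sizeof(*c));

	(void)skb; /* payload sharing is not modelled here */
	if (c) {
		c->users = 1;
		c->sk = NULL;
	}
	return c;
}

/* Mirrors the shape of the quoted check: reuse skb when users == 1. */
static struct toy_skb *toy_pick_skb2(struct toy_skb *skb)
{
	if (skb->users != 1)
		return toy_skb_clone(skb);
	skb->users++;		/* same buffer, extra reference */
	return skb;
}

/*
 * Returns 1 if giving skb2 an owner also changes the caller's skb,
 * 0 if it does not, -1 on allocation failure.
 */
int toy_owner_leaks_to_caller(void)
{
	struct toy_sock sk = { .refcnt = 1 };
	struct toy_skb skb = { .users = 1, .sk = NULL }; /* fresh clone */
	struct toy_skb *skb2 = toy_pick_skb2(&skb);
	int leaked;

	if (!skb2)
		return -1;
	skb2->sk = &sk;		/* stand-in for setting the owner */
	leaked = (skb.sk == &sk);
	if (skb2 != &skb)
		free(skb2);
	return leaked;
}
```

With users == 1 on entry, toy_pick_skb2() hands back the caller's own
buffer, so the owner written through skb2 is also visible through skb.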

If the socket we queue skb2 to frees this SKB (thereby decrementing its
refcount to 1) and the socket is freed before pfkey_broadcast() can
execute the kfree_skb(skb) on line 284, we will then attempt to run
sock_rfree() on an SKB with a dangling reference to this socket.

Perhaps a cleaner solution here is to always clone the SKB in
pfkey_broadcast_one(). That will ensure that the two kfree_skb() calls
in pfkey_broadcast() will never be passed an SKB with sock_rfree() as
its destructor, and we can avoid this race condition.
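
For comparison, a minimal userspace sketch of the always-clone idea,
under the same caveats: the demo_* names are hypothetical, the clone
helper only models "new buffer, one user, no owner", and this is not
a proposed patch, just an illustration of why the caller's SKB would
stay unowned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-ins for the kernel types; NOT the real definitions. */
struct demo_sock { int refcnt; };
struct demo_skb  { int users; struct demo_sock *sk; };

/* Hypothetical stand-in for skb_clone(): new buffer, one user, no owner. */
static struct demo_skb *demo_skb_clone(const struct demo_skb *skb)
{
	struct demo_skb *c = malloc(sizeof(*c));

	(void)skb; /* payload sharing is not modelled here */
	if (c) {
		c->users = 1;
		c->sk = NULL;
	}
	return c;
}

/* Always-clone variant: the caller's skb is never handed to a socket. */
static struct demo_skb *demo_pick_skb2_always_clone(struct demo_skb *skb)
{
	return demo_skb_clone(skb);
}

/*
 * Returns 1 if the caller's skb still has no owner and a single user
 * after skb2 was given an owner, 0 otherwise, -1 on allocation failure.
 */
int demo_caller_skb_stays_unowned(void)
{
	struct demo_sock sk = { .refcnt = 1 };
	struct demo_skb skb = { .users = 1, .sk = NULL };
	struct demo_skb *skb2 = demo_pick_skb2_always_clone(&skb);
	int unowned;

	if (!skb2)
		return -1;
	skb2->sk = &sk;		/* stand-in for setting the owner */
	unowned = (skb.sk == NULL && skb.users == 1);
	free(skb2);
	return unowned;
}
```

The cost is one extra clone per broadcast in the case where the
original buffer could have been reused directly.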