Re: [PATCH net v4] l2tp: Serialize access to sk_user_data with sk_callback_lock

From: Hangbin Liu <liuhangbin@gmail.com>
To: Jakub Sitnicki <jakub@cloudflare.com>
Cc: netdev@vger.kernel.org, "David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	Tom Parkin <tparkin@katalix.com>,
	Haowei Yan <g1042620637@gmail.com>,
	Roopa Prabhu <roopa@nvidia.com>,
	Nikolay Aleksandrov <nikolay@nvidia.com>
Subject: Re: [PATCH net v4] l2tp: Serialize access to sk_user_data with sk_callback_lock
Date: Fri, 2 Dec 2022 17:50:23 +0800
Message-ID: <Y4nKX8IXjHLSVHnz@Laptop-X1>
In-Reply-To: <20221114191619.124659-1-jakub@cloudflare.com>

On Mon, Nov 14, 2022 at 08:16:19PM +0100, Jakub Sitnicki wrote:
> sk->sk_user_data has multiple users, which are not compatible with each
> other. Writers must synchronize by grabbing the sk->sk_callback_lock.
> 
> l2tp currently fails to grab the lock when modifying the underlying tunnel
> socket fields. Fix it by adding appropriate locking.
> 
> We err on the side of safety and grab the sk_callback_lock also inside the
> sk_destruct callback overridden by l2tp, even though there should be no
> refs allowing access to the sock at the time when sk_destruct gets called.
> 
> v4:
> - serialize write to sk_user_data in l2tp sk_destruct
> 
> v3:
> - switch from sock lock to sk_callback_lock
> - document write-protection for sk_user_data
> 
> v2:
> - update Fixes to point to origin of the bug
> - use real names in Reported/Tested-by tags
> 
> Cc: Tom Parkin <tparkin@katalix.com>
> Fixes: 3557baabf280 ("[L2TP]: PPP over L2TP driver core")
> Reported-by: Haowei Yan <g1042620637@gmail.com>
> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
> ---
> 
> This took me forever. Sorry about that.
> 
>  include/net/sock.h   |  2 +-
>  net/l2tp/l2tp_core.c | 19 +++++++++++++------
>  2 files changed, 14 insertions(+), 7 deletions(-)
> 
> diff --git a/include/net/sock.h b/include/net/sock.h
> index 5db02546941c..e0517ecc6531 100644
> --- a/include/net/sock.h
> +++ b/include/net/sock.h
> @@ -323,7 +323,7 @@ struct sk_filter;
>    *	@sk_tskey: counter to disambiguate concurrent tstamp requests
>    *	@sk_zckey: counter to order MSG_ZEROCOPY notifications
>    *	@sk_socket: Identd and reporting IO signals
> -  *	@sk_user_data: RPC layer private data
> +  *	@sk_user_data: RPC layer private data. Write-protected by @sk_callback_lock.
>    *	@sk_frag: cached page frag
>    *	@sk_peek_off: current peek_offset value
>    *	@sk_send_head: front of stuff to transmit
> diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c
> index 7499c51b1850..754fdda8a5f5 100644
> --- a/net/l2tp/l2tp_core.c
> +++ b/net/l2tp/l2tp_core.c
> @@ -1150,8 +1150,10 @@ static void l2tp_tunnel_destruct(struct sock *sk)
>  	}
>  
>  	/* Remove hooks into tunnel socket */
> +	write_lock_bh(&sk->sk_callback_lock);
>  	sk->sk_destruct = tunnel->old_sk_destruct;
>  	sk->sk_user_data = NULL;
> +	write_unlock_bh(&sk->sk_callback_lock);
>  
>  	/* Call the original destructor */
>  	if (sk->sk_destruct)
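
The quoted hunk takes the callback lock around the two writes that unhook the tunnel from its socket. Below is a minimal user-space sketch of that pattern, not kernel code: every mock_* name is hypothetical, a C11 atomic_flag spinlock stands in for write_lock_bh()/write_unlock_bh(), and the attach-side check is an assumption added here to show why writers need to serialize.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for struct sock; not the kernel definition. */
struct mock_sock {
	atomic_flag callback_lock;               /* models sk_callback_lock */
	void *user_data;                         /* models sk_user_data */
	void (*destruct)(struct mock_sock *sk);  /* models sk_destruct */
};

/* Hypothetical stand-in for the tunnel that hooks into the socket. */
struct mock_tunnel {
	void (*old_destruct)(struct mock_sock *sk);
};

static void mock_sock_init(struct mock_sock *sk)
{
	atomic_flag_clear(&sk->callback_lock);
	sk->user_data = NULL;
	sk->destruct = NULL;
}

/* Exclusive spinlock used here in place of the kernel's write lock. */
static void mock_write_lock(struct mock_sock *sk)
{
	while (atomic_flag_test_and_set_explicit(&sk->callback_lock,
						 memory_order_acquire))
		; /* spin until the current holder releases the lock */
}

static void mock_write_unlock(struct mock_sock *sk)
{
	atomic_flag_clear_explicit(&sk->callback_lock, memory_order_release);
}

/* Mirrors the shape of the quoted hunk: unhook under the lock, then
 * call the original destructor outside of it.
 */
static void mock_tunnel_destruct(struct mock_sock *sk)
{
	struct mock_tunnel *tunnel;

	mock_write_lock(sk);
	tunnel = sk->user_data;
	if (tunnel) {
		sk->destruct = tunnel->old_destruct;
		sk->user_data = NULL;
	}
	mock_write_unlock(sk);

	if (tunnel && sk->destruct)
		sk->destruct(sk);
}

/* Assumed attach path: claim user_data only if nobody else owns it.
 * Returns 0 on success, -1 if the socket is already in use.
 */
static int mock_tunnel_attach(struct mock_sock *sk, struct mock_tunnel *tunnel)
{
	int ret = -1;

	mock_write_lock(sk);
	if (!sk->user_data) {
		tunnel->old_destruct = sk->destruct;
		sk->destruct = mock_tunnel_destruct;
		sk->user_data = tunnel;
		ret = 0;
	}
	mock_write_unlock(sk);
	return ret;
}
```

Because the check and the store in mock_tunnel_attach() happen under the same lock, a second would-be owner sees the first owner's pointer and backs off instead of overwriting it.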

Hi Jakub,

I have a similar issue with the vxlan driver. As with commit
ad6c9986bcb6 ("vxlan: Fix GRO cells race condition between receive and link
delete"), there is still a race condition in vxlan when a packet is received
while a VXLAN device is being deleted. In vxlan_ecn_decapsulate(), the
vxlan_get_sk_family() call panics because sk is NULL.

 #0 [ffffa25ec6978a38] machine_kexec at ffffffff8c669757
 #1 [ffffa25ec6978a90] __crash_kexec at ffffffff8c7c0a4d
 #2 [ffffa25ec6978b58] crash_kexec at ffffffff8c7c1c48
 #3 [ffffa25ec6978b60] oops_end at ffffffff8c627f2b
 #4 [ffffa25ec6978b80] page_fault_oops at ffffffff8c678fcb
 #5 [ffffa25ec6978bd8] exc_page_fault at ffffffff8d109542
 #6 [ffffa25ec6978c00] asm_exc_page_fault at ffffffff8d200b62
    [exception RIP: vxlan_ecn_decapsulate+0x3b]
    RIP: ffffffffc1014e7b  RSP: ffffa25ec6978cb0  RFLAGS: 00010246
    RAX: 0000000000000008  RBX: ffff8aa000888000  RCX: 0000000000000000
    RDX: 000000000000000e  RSI: ffff8a9fc7ab803e  RDI: ffff8a9fd1168700
    RBP: ffff8a9fc7ab803e   R8: 0000000000700000   R9: 00000000000010ae
    R10: ffff8a9fcb748980  R11: 0000000000000000  R12: ffff8a9fd1168700
    R13: ffff8aa000888000  R14: 00000000002a0000  R15: 00000000000010ae
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #7 [ffffa25ec6978ce8] vxlan_rcv at ffffffffc10189cd [vxlan]
 #8 [ffffa25ec6978d90] udp_queue_rcv_one_skb at ffffffff8cfb6507
 #9 [ffffa25ec6978dc0] udp_unicast_rcv_skb at ffffffff8cfb6e45
#10 [ffffa25ec6978dc8] __udp4_lib_rcv at ffffffff8cfb8807
#11 [ffffa25ec6978e20] ip_protocol_deliver_rcu at ffffffff8cf76951
#12 [ffffa25ec6978e48] ip_local_deliver at ffffffff8cf76bde
#13 [ffffa25ec6978ea0] __netif_receive_skb_one_core at ffffffff8cecde9b
#14 [ffffa25ec6978ec8] process_backlog at ffffffff8cece139
#15 [ffffa25ec6978f00] __napi_poll at ffffffff8ceced1a
#16 [ffffa25ec6978f28] net_rx_action at ffffffff8cecf1f3
#17 [ffffa25ec6978fa0] __softirqentry_text_start at ffffffff8d4000ca
#18 [ffffa25ec6978ff0] do_softirq at ffffffff8c6fbdc3
--- <IRQ stack> ---

> struct socket ffff8a9fd1168700
struct socket {
  state = SS_FREE,
  type = 0,
  flags = 0,
  file = 0xffff8a9fcb748000,
  sk = 0x0,
  ops = 0x0,

So I'm wondering if we should also take locks in udp_tunnel_sock_release().
Or should we add a check on the sk state before calling vxlan_get_sk_family()?
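
To make the second option concrete, here is a rough user-space sketch of a checked family lookup. It assumes the helper reads the family through sock->sk, which is what the crash dump above suggests. All mock_* types and names are hypothetical, not the vxlan structures. A bare NULL check only narrows the window; without synchronization against the release path the pointer could still change after the check.

```c
#include <stddef.h>

/* Hypothetical stand-ins for struct sock, struct socket and vxlan_sock. */
struct mock_sk {
	unsigned short family;
};

struct mock_socket {
	struct mock_sk *sk;
};

struct mock_vxlan_sock {
	struct mock_socket *sock;
};

#define MOCK_AF_UNSPEC 0

/* Returns the socket family, or MOCK_AF_UNSPEC when the socket has
 * already been released (sk == NULL), instead of dereferencing NULL.
 */
static unsigned short
mock_get_sk_family_checked(const struct mock_vxlan_sock *vs)
{
	const struct mock_sk *sk;

	if (!vs || !vs->sock)
		return MOCK_AF_UNSPEC;

	sk = vs->sock->sk;	/* load the pointer once, then test the copy */
	if (!sk)
		return MOCK_AF_UNSPEC;

	return sk->family;
}
```

A caller on the receive path would then drop the packet when it gets MOCK_AF_UNSPEC back rather than continue with decapsulation.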

Thanks
Hangbin
