All of lore.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Jakub Kicinski <kuba@kernel.org>,
	Wen Gu <guwen@linux.alibaba.com>,
	Hawkins Jiawei <yin31149@gmail.com>,
	Jakub Sitnicki <jakub@cloudflare.com>,
	syzbot+5f26f85569bd179c18ce@syzkaller.appspotmail.com
Subject: [PATCH 5.10 13/37] net: fix refcount bug in sk_psock_get (2)
Date: Fri,  2 Sep 2022 14:19:35 +0200	[thread overview]
Message-ID: <20220902121359.580930358@linuxfoundation.org> (raw)
In-Reply-To: <20220902121359.177846782@linuxfoundation.org>

From: Hawkins Jiawei <yin31149@gmail.com>

commit 2a0133723f9ebeb751cfce19f74ec07e108bef1f upstream.

Syzkaller reports refcount bug as follows:
------------[ cut here ]------------
refcount_t: saturated; leaking memory.
WARNING: CPU: 1 PID: 3605 at lib/refcount.c:19 refcount_warn_saturate+0xf4/0x1e0 lib/refcount.c:19
Modules linked in:
CPU: 1 PID: 3605 Comm: syz-executor208 Not tainted 5.18.0-syzkaller-03023-g7e062cda7d90 #0
 <TASK>
 __refcount_add_not_zero include/linux/refcount.h:163 [inline]
 __refcount_inc_not_zero include/linux/refcount.h:227 [inline]
 refcount_inc_not_zero include/linux/refcount.h:245 [inline]
 sk_psock_get+0x3bc/0x410 include/linux/skmsg.h:439
 tls_data_ready+0x6d/0x1b0 net/tls/tls_sw.c:2091
 tcp_data_ready+0x106/0x520 net/ipv4/tcp_input.c:4983
 tcp_data_queue+0x25f2/0x4c90 net/ipv4/tcp_input.c:5057
 tcp_rcv_state_process+0x1774/0x4e80 net/ipv4/tcp_input.c:6659
 tcp_v4_do_rcv+0x339/0x980 net/ipv4/tcp_ipv4.c:1682
 sk_backlog_rcv include/net/sock.h:1061 [inline]
 __release_sock+0x134/0x3b0 net/core/sock.c:2849
 release_sock+0x54/0x1b0 net/core/sock.c:3404
 inet_shutdown+0x1e0/0x430 net/ipv4/af_inet.c:909
 __sys_shutdown_sock net/socket.c:2331 [inline]
 __sys_shutdown_sock net/socket.c:2325 [inline]
 __sys_shutdown+0xf1/0x1b0 net/socket.c:2343
 __do_sys_shutdown net/socket.c:2351 [inline]
 __se_sys_shutdown net/socket.c:2349 [inline]
 __x64_sys_shutdown+0x50/0x70 net/socket.c:2349
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x46/0xb0
 </TASK>

During SMC fallback process in connect syscall, kernel will
replaces TCP with SMC. In order to forward wakeup
smc socket waitqueue after fallback, kernel will sets
clcsk->sk_user_data to origin smc socket in
smc_fback_replace_callbacks().

Later, in shutdown syscall, kernel will calls
sk_psock_get(), which treats the clcsk->sk_user_data
as psock type, triggering the refcnt warning.

So, the root cause is that smc and psock, both will use
sk_user_data field. So they will mismatch this field
easily.

This patch solves it by using another bit(defined as
SK_USER_DATA_PSOCK) in PTRMASK, to mark whether
sk_user_data points to a psock object or not.
This patch depends on a PTRMASK introduced in commit f1ff5ce2cd5e
("net, sk_msg: Clear sk_user_data pointer on clone if tagged").

For there will possibly be more flags in the sk_user_data field,
this patch also refactor sk_user_data flags code to be more generic
to improve its maintainability.

Reported-and-tested-by: syzbot+5f26f85569bd179c18ce@syzkaller.appspotmail.com
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Wen Gu <guwen@linux.alibaba.com>
Signed-off-by: Hawkins Jiawei <yin31149@gmail.com>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 include/linux/skmsg.h |    3 +-
 include/net/sock.h    |   68 +++++++++++++++++++++++++++++++++++---------------
 net/core/skmsg.c      |    4 ++
 3 files changed, 53 insertions(+), 22 deletions(-)

--- a/include/linux/skmsg.h
+++ b/include/linux/skmsg.h
@@ -281,7 +281,8 @@ static inline void sk_msg_sg_copy_clear(
 
 static inline struct sk_psock *sk_psock(const struct sock *sk)
 {
-	return rcu_dereference_sk_user_data(sk);
+	return __rcu_dereference_sk_user_data_with_flags(sk,
+							 SK_USER_DATA_PSOCK);
 }
 
 static inline void sk_psock_queue_msg(struct sk_psock *psock,
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -527,14 +527,26 @@ enum sk_pacing {
 	SK_PACING_FQ		= 2,
 };
 
-/* Pointer stored in sk_user_data might not be suitable for copying
- * when cloning the socket. For instance, it can point to a reference
- * counted object. sk_user_data bottom bit is set if pointer must not
- * be copied.
+/* flag bits in sk_user_data
+ *
+ * - SK_USER_DATA_NOCOPY:      Pointer stored in sk_user_data might
+ *   not be suitable for copying when cloning the socket. For instance,
+ *   it can point to a reference counted object. sk_user_data bottom
+ *   bit is set if pointer must not be copied.
+ *
+ * - SK_USER_DATA_BPF:         Mark whether sk_user_data field is
+ *   managed/owned by a BPF reuseport array. This bit should be set
+ *   when sk_user_data's sk is added to the bpf's reuseport_array.
+ *
+ * - SK_USER_DATA_PSOCK:       Mark whether pointer stored in
+ *   sk_user_data points to psock type. This bit should be set
+ *   when sk_user_data is assigned to a psock object.
  */
 #define SK_USER_DATA_NOCOPY	1UL
-#define SK_USER_DATA_BPF	2UL	/* Managed by BPF */
-#define SK_USER_DATA_PTRMASK	~(SK_USER_DATA_NOCOPY | SK_USER_DATA_BPF)
+#define SK_USER_DATA_BPF	2UL
+#define SK_USER_DATA_PSOCK	4UL
+#define SK_USER_DATA_PTRMASK	~(SK_USER_DATA_NOCOPY | SK_USER_DATA_BPF |\
+				  SK_USER_DATA_PSOCK)
 
 /**
  * sk_user_data_is_nocopy - Test if sk_user_data pointer must not be copied
@@ -547,24 +559,40 @@ static inline bool sk_user_data_is_nocop
 
 #define __sk_user_data(sk) ((*((void __rcu **)&(sk)->sk_user_data)))
 
+/**
+ * __rcu_dereference_sk_user_data_with_flags - return the pointer
+ * only if argument flags all has been set in sk_user_data. Otherwise
+ * return NULL
+ *
+ * @sk: socket
+ * @flags: flag bits
+ */
+static inline void *
+__rcu_dereference_sk_user_data_with_flags(const struct sock *sk,
+					  uintptr_t flags)
+{
+	uintptr_t sk_user_data = (uintptr_t)rcu_dereference(__sk_user_data(sk));
+
+	WARN_ON_ONCE(flags & SK_USER_DATA_PTRMASK);
+
+	if ((sk_user_data & flags) == flags)
+		return (void *)(sk_user_data & SK_USER_DATA_PTRMASK);
+	return NULL;
+}
+
 #define rcu_dereference_sk_user_data(sk)				\
+	__rcu_dereference_sk_user_data_with_flags(sk, 0)
+#define __rcu_assign_sk_user_data_with_flags(sk, ptr, flags)		\
 ({									\
-	void *__tmp = rcu_dereference(__sk_user_data((sk)));		\
-	(void *)((uintptr_t)__tmp & SK_USER_DATA_PTRMASK);		\
-})
-#define rcu_assign_sk_user_data(sk, ptr)				\
-({									\
-	uintptr_t __tmp = (uintptr_t)(ptr);				\
-	WARN_ON_ONCE(__tmp & ~SK_USER_DATA_PTRMASK);			\
-	rcu_assign_pointer(__sk_user_data((sk)), __tmp);		\
-})
-#define rcu_assign_sk_user_data_nocopy(sk, ptr)				\
-({									\
-	uintptr_t __tmp = (uintptr_t)(ptr);				\
-	WARN_ON_ONCE(__tmp & ~SK_USER_DATA_PTRMASK);			\
+	uintptr_t __tmp1 = (uintptr_t)(ptr),				\
+		  __tmp2 = (uintptr_t)(flags);				\
+	WARN_ON_ONCE(__tmp1 & ~SK_USER_DATA_PTRMASK);			\
+	WARN_ON_ONCE(__tmp2 & SK_USER_DATA_PTRMASK);			\
 	rcu_assign_pointer(__sk_user_data((sk)),			\
-			   __tmp | SK_USER_DATA_NOCOPY);		\
+			   __tmp1 | __tmp2);				\
 })
+#define rcu_assign_sk_user_data(sk, ptr)				\
+	__rcu_assign_sk_user_data_with_flags(sk, ptr, 0)
 
 /*
  * SK_CAN_REUSE and SK_NO_REUSE on a socket mean that the socket is OK
--- a/net/core/skmsg.c
+++ b/net/core/skmsg.c
@@ -612,7 +612,9 @@ struct sk_psock *sk_psock_init(struct so
 	sk_psock_set_state(psock, SK_PSOCK_TX_ENABLED);
 	refcount_set(&psock->refcnt, 1);
 
-	rcu_assign_sk_user_data_nocopy(sk, psock);
+	__rcu_assign_sk_user_data_with_flags(sk, psock,
+					     SK_USER_DATA_NOCOPY |
+					     SK_USER_DATA_PSOCK);
 	sock_hold(sk);
 
 out:



  parent reply	other threads:[~2022-09-02 13:13 UTC|newest]

Thread overview: 46+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-09-02 12:19 [PATCH 5.10 00/37] 5.10.141-rc1 review Greg Kroah-Hartman
2022-09-02 12:19 ` [PATCH 5.10 01/37] mm: Force TLB flush for PFNMAP mappings before unlink_file_vma() Greg Kroah-Hartman
2022-09-02 12:19 ` [PATCH 5.10 02/37] x86/nospec: Unwreck the RSB stuffing Greg Kroah-Hartman
2022-09-02 12:19 ` [PATCH 5.10 03/37] x86/nospec: Fix i386 " Greg Kroah-Hartman
2022-09-02 12:19 ` [PATCH 5.10 04/37] crypto: lib - remove unneeded selection of XOR_BLOCKS Greg Kroah-Hartman
2022-09-02 12:19 ` [PATCH 5.10 05/37] s390/mm: do not trigger write fault when vma does not allow VM_WRITE Greg Kroah-Hartman
2022-09-02 12:19 ` [PATCH 5.10 06/37] kbuild: Fix include path in scripts/Makefile.modpost Greg Kroah-Hartman
2022-09-02 12:19 ` [PATCH 5.10 07/37] Bluetooth: L2CAP: Fix build errors in some archs Greg Kroah-Hartman
2022-09-02 12:19 ` [PATCH 5.10 08/37] Revert "PCI/portdrv: Dont disable AER reporting in get_port_device_capability()" Greg Kroah-Hartman
2022-09-02 12:19 ` [PATCH 5.10 09/37] HID: steam: Prevent NULL pointer dereference in steam_{recv,send}_report Greg Kroah-Hartman
2022-09-02 12:19 ` [PATCH 5.10 10/37] udmabuf: Set the DMA mask for the udmabuf device (v2) Greg Kroah-Hartman
2022-09-02 12:19 ` [PATCH 5.10 11/37] media: pvrusb2: fix memory leak in pvr_probe Greg Kroah-Hartman
2022-09-02 12:19 ` [PATCH 5.10 12/37] HID: hidraw: fix memory leak in hidraw_release() Greg Kroah-Hartman
2022-09-02 12:19 ` Greg Kroah-Hartman [this message]
2022-09-02 12:19 ` [PATCH 5.10 14/37] fbdev: fb_pm2fb: Avoid potential divide by zero error Greg Kroah-Hartman
2022-09-02 12:19 ` [PATCH 5.10 15/37] ftrace: Fix NULL pointer dereference in is_ftrace_trampoline when ftrace is dead Greg Kroah-Hartman
2022-09-02 12:19 ` [PATCH 5.10 16/37] bpf: Dont redirect packets with invalid pkt_len Greg Kroah-Hartman
2022-09-02 12:19 ` [PATCH 5.10 17/37] mm/rmap: Fix anon_vma->degree ambiguity leading to double-reuse Greg Kroah-Hartman
2022-09-02 12:19 ` [PATCH 5.10 18/37] mmc: mtk-sd: Clear interrupts when cqe off/disable Greg Kroah-Hartman
2022-09-02 12:19 ` [PATCH 5.10 19/37] drm/amd/display: Avoid MPC infinite loop Greg Kroah-Hartman
2022-09-02 12:19 ` [PATCH 5.10 20/37] drm/amd/display: For stereo keep "FLIP_ANY_FRAME" Greg Kroah-Hartman
2022-09-02 12:19 ` [PATCH 5.10 21/37] drm/amd/display: clear optc underflow before turn off odm clock Greg Kroah-Hartman
2022-09-02 12:19 ` [PATCH 5.10 22/37] neigh: fix possible DoS due to net iface start/stop loop Greg Kroah-Hartman
2022-09-02 12:19 ` [PATCH 5.10 23/37] s390/hypfs: avoid error message under KVM Greg Kroah-Hartman
2022-09-02 12:19 ` [PATCH 5.10 24/37] drm/amd/pm: add missing ->fini_microcode interface for Sienna Cichlid Greg Kroah-Hartman
2022-09-02 12:19 ` [PATCH 5.10 25/37] drm/amd/display: Fix pixel clock programming Greg Kroah-Hartman
2022-09-02 12:19 ` [PATCH 5.10 26/37] drm/amdgpu: Increase tlb flush timeout for sriov Greg Kroah-Hartman
2022-09-02 12:19 ` [PATCH 5.10 27/37] netfilter: conntrack: NF_CONNTRACK_PROCFS should no longer default to y Greg Kroah-Hartman
2022-09-02 12:19 ` [PATCH 5.10 28/37] lib/vdso: Mark do_hres_timens() and do_coarse_timens() __always_inline() Greg Kroah-Hartman
2022-09-02 12:19 ` [PATCH 5.10 29/37] kprobes: dont call disarm_kprobe() for disabled kprobes Greg Kroah-Hartman
2022-09-02 12:19 ` [PATCH 5.10 30/37] io_uring: disable polling pollfree files Greg Kroah-Hartman
2022-09-02 12:19 ` [PATCH 5.10 31/37] xfs: remove infinite loop when reserving free block pool Greg Kroah-Hartman
2022-09-02 12:19 ` [PATCH 5.10 32/37] xfs: always succeed at setting the reserve pool size Greg Kroah-Hartman
2022-09-02 12:19 ` [PATCH 5.10 33/37] xfs: fix overfilling of reserve pool Greg Kroah-Hartman
2022-09-02 12:19 ` [PATCH 5.10 34/37] xfs: fix soft lockup via spinning in filestream ag selection loop Greg Kroah-Hartman
2022-09-02 12:19 ` [PATCH 5.10 35/37] xfs: revert "xfs: actually bump warning counts when we send warnings" Greg Kroah-Hartman
2022-09-02 12:19 ` [PATCH 5.10 36/37] net/af_packet: check len when min_header_len equals to 0 Greg Kroah-Hartman
2022-09-02 12:19 ` [PATCH 5.10 37/37] net: neigh: dont call kfree_skb() under spin_lock_irqsave() Greg Kroah-Hartman
2022-09-02 16:36 ` [PATCH 5.10 00/37] 5.10.141-rc1 review Jon Hunter
2022-09-02 17:36 ` Florian Fainelli
2022-09-02 22:16 ` Shuah Khan
2022-09-03  0:36 ` Guenter Roeck
2022-09-03  3:15 ` Rudi Heitbaum
2022-09-03  6:06 ` Naresh Kamboju
2022-09-03 10:43 ` Sudip Mukherjee
2022-09-05  7:44 ` Pavel Machek

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220902121359.580930358@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=guwen@linux.alibaba.com \
    --cc=jakub@cloudflare.com \
    --cc=kuba@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=stable@vger.kernel.org \
    --cc=syzbot+5f26f85569bd179c18ce@syzkaller.appspotmail.com \
    --cc=yin31149@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.