Archive-only list for patches
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	patches@lists.linux.dev, Kuniyuki Iwashima <kuniyu@amazon.com>,
	Paolo Abeni <pabeni@redhat.com>, Sasha Levin <sashal@kernel.org>
Subject: [PATCH 5.15 45/79] udp: Update reuse->has_conns under reuseport_lock.
Date: Thu, 27 Oct 2022 18:55:43 +0200	[thread overview]
Message-ID: <20221027165056.404109613@linuxfoundation.org> (raw)
In-Reply-To: <20221027165054.917467648@linuxfoundation.org>

From: Kuniyuki Iwashima <kuniyu@amazon.com>

[ Upstream commit 69421bf98482d089e50799f45e48b25ce4a8d154 ]

When we call connect() for a UDP socket in a reuseport group, we have
to update sk->sk_reuseport_cb->has_conns to 1.  Otherwise, the kernel
could select a unconnected socket wrongly for packets sent to the
connected socket.

However, the current way to set has_conns is illegal and possible to
trigger that problem.  reuseport_has_conns() changes has_conns under
rcu_read_lock(), which upgrades the RCU reader to the updater.  Then,
it must do the update under the updater's lock, reuseport_lock, but
it doesn't for now.

For this reason, there is a race below where we fail to set has_conns
resulting in the wrong socket selection.  To avoid the race, let's split
the reader and updater with proper locking.

 cpu1                               cpu2
+----+                             +----+

__ip[46]_datagram_connect()        reuseport_grow()
.                                  .
|- reuseport_has_conns(sk, true)   |- more_reuse = __reuseport_alloc(more_socks_size)
|  .                               |
|  |- rcu_read_lock()
|  |- reuse = rcu_dereference(sk->sk_reuseport_cb)
|  |
|  |                               |  /* reuse->has_conns == 0 here */
|  |                               |- more_reuse->has_conns = reuse->has_conns
|  |- reuse->has_conns = 1         |  /* more_reuse->has_conns SHOULD BE 1 HERE */
|  |                               |
|  |                               |- rcu_assign_pointer(reuse->socks[i]->sk_reuseport_cb,
|  |                               |                     more_reuse)
|  `- rcu_read_unlock()            `- kfree_rcu(reuse, rcu)
|
|- sk->sk_state = TCP_ESTABLISHED

Note the likely(reuse) in reuseport_has_conns_set() is always true,
but we put the test there for ease of review.  [0]

For the record, usually, sk_reuseport_cb is changed under lock_sock().
The only exception is reuseport_grow() & TCP reqsk migration case.

  1) shutdown() TCP listener, which is moved into the latter part of
     reuse->socks[] to migrate reqsk.

  2) New listen() overflows reuse->socks[] and call reuseport_grow().

  3) reuse->max_socks overflows u16 with the new listener.

  4) reuseport_grow() pops the old shutdown()ed listener from the array
     and update its sk->sk_reuseport_cb as NULL without lock_sock().

shutdown()ed TCP sk->sk_reuseport_cb can be changed without lock_sock(),
but, reuseport_has_conns_set() is called only for UDP under lock_sock(),
so likely(reuse) never be false in reuseport_has_conns_set().

[0]: https://lore.kernel.org/netdev/CANn89iLja=eQHbsM_Ta2sQF0tOGU8vAGrh_izRuuHjuO1ouUag@mail.gmail.com/

Fixes: acdcecc61285 ("udp: correct reuseport selection with connected sockets")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://lore.kernel.org/r/20221014182625.89913-1-kuniyu@amazon.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 include/net/sock_reuseport.h | 11 +++++------
 net/core/sock_reuseport.c    | 16 ++++++++++++++++
 net/ipv4/datagram.c          |  2 +-
 net/ipv4/udp.c               |  2 +-
 net/ipv6/datagram.c          |  2 +-
 net/ipv6/udp.c               |  2 +-
 6 files changed, 25 insertions(+), 10 deletions(-)

diff --git a/include/net/sock_reuseport.h b/include/net/sock_reuseport.h
index 473b0b0fa4ab..efc9085c6892 100644
--- a/include/net/sock_reuseport.h
+++ b/include/net/sock_reuseport.h
@@ -43,21 +43,20 @@ struct sock *reuseport_migrate_sock(struct sock *sk,
 extern int reuseport_attach_prog(struct sock *sk, struct bpf_prog *prog);
 extern int reuseport_detach_prog(struct sock *sk);
 
-static inline bool reuseport_has_conns(struct sock *sk, bool set)
+static inline bool reuseport_has_conns(struct sock *sk)
 {
 	struct sock_reuseport *reuse;
 	bool ret = false;
 
 	rcu_read_lock();
 	reuse = rcu_dereference(sk->sk_reuseport_cb);
-	if (reuse) {
-		if (set)
-			reuse->has_conns = 1;
-		ret = reuse->has_conns;
-	}
+	if (reuse && reuse->has_conns)
+		ret = true;
 	rcu_read_unlock();
 
 	return ret;
 }
 
+void reuseport_has_conns_set(struct sock *sk);
+
 #endif  /* _SOCK_REUSEPORT_H */
diff --git a/net/core/sock_reuseport.c b/net/core/sock_reuseport.c
index 5daa1fa54249..fb90e1e00773 100644
--- a/net/core/sock_reuseport.c
+++ b/net/core/sock_reuseport.c
@@ -21,6 +21,22 @@ static DEFINE_IDA(reuseport_ida);
 static int reuseport_resurrect(struct sock *sk, struct sock_reuseport *old_reuse,
 			       struct sock_reuseport *reuse, bool bind_inany);
 
+void reuseport_has_conns_set(struct sock *sk)
+{
+	struct sock_reuseport *reuse;
+
+	if (!rcu_access_pointer(sk->sk_reuseport_cb))
+		return;
+
+	spin_lock_bh(&reuseport_lock);
+	reuse = rcu_dereference_protected(sk->sk_reuseport_cb,
+					  lockdep_is_held(&reuseport_lock));
+	if (likely(reuse))
+		reuse->has_conns = 1;
+	spin_unlock_bh(&reuseport_lock);
+}
+EXPORT_SYMBOL(reuseport_has_conns_set);
+
 static int reuseport_sock_index(struct sock *sk,
 				const struct sock_reuseport *reuse,
 				bool closed)
diff --git a/net/ipv4/datagram.c b/net/ipv4/datagram.c
index 4a8550c49202..112c6e892d30 100644
--- a/net/ipv4/datagram.c
+++ b/net/ipv4/datagram.c
@@ -70,7 +70,7 @@ int __ip4_datagram_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len
 	}
 	inet->inet_daddr = fl4->daddr;
 	inet->inet_dport = usin->sin_port;
-	reuseport_has_conns(sk, true);
+	reuseport_has_conns_set(sk);
 	sk->sk_state = TCP_ESTABLISHED;
 	sk_set_txhash(sk);
 	inet->inet_id = prandom_u32();
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 75d1977ecc07..79d5425bed07 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -446,7 +446,7 @@ static struct sock *udp4_lib_lookup2(struct net *net,
 			result = lookup_reuseport(net, sk, skb,
 						  saddr, sport, daddr, hnum);
 			/* Fall back to scoring if group has connections */
-			if (result && !reuseport_has_conns(sk, false))
+			if (result && !reuseport_has_conns(sk))
 				return result;
 
 			result = result ? : sk;
diff --git a/net/ipv6/datagram.c b/net/ipv6/datagram.c
index 206f66310a88..f4559e5bc84b 100644
--- a/net/ipv6/datagram.c
+++ b/net/ipv6/datagram.c
@@ -256,7 +256,7 @@ int __ip6_datagram_connect(struct sock *sk, struct sockaddr *uaddr,
 		goto out;
 	}
 
-	reuseport_has_conns(sk, true);
+	reuseport_has_conns_set(sk);
 	sk->sk_state = TCP_ESTABLISHED;
 	sk_set_txhash(sk);
 out:
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 07726a51a3f0..19b6c4da0f42 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -180,7 +180,7 @@ static struct sock *udp6_lib_lookup2(struct net *net,
 			result = lookup_reuseport(net, sk, skb,
 						  saddr, sport, daddr, hnum);
 			/* Fall back to scoring if group has connections */
-			if (result && !reuseport_has_conns(sk, false))
+			if (result && !reuseport_has_conns(sk))
 				return result;
 
 			result = result ? : sk;
-- 
2.35.1




  parent reply	other threads:[~2022-10-27 17:02 UTC|newest]

Thread overview: 87+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-10-27 16:54 [PATCH 5.15 00/79] 5.15.76-rc1 review Greg Kroah-Hartman
2022-10-27 16:54 ` [PATCH 5.15 01/79] r8152: add PID for the Lenovo OneLink+ Dock Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.15 02/79] arm64/mm: Consolidate TCR_EL1 fields Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.15 03/79] usb: gadget: uvc: consistently use define for headerlen Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.15 04/79] usb: gadget: uvc: use on returned header len in video_encode_isoc_sg Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.15 05/79] usb: gadget: uvc: rework uvcg_queue_next_buffer to uvcg_complete_buffer Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.15 06/79] usb: gadget: uvc: giveback vb2 buffer on req complete Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.15 07/79] usb: gadget: uvc: improve sg exit condition Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.15 08/79] arm64: errata: Remove AES hwcap for COMPAT tasks Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.15 09/79] perf/x86/intel/pt: Relax address filter validation Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.15 10/79] btrfs: enhance unsupported compat RO flags handling Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.15 11/79] ocfs2: clear dinode links count in case of error Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.15 12/79] ocfs2: fix BUG when iput after ocfs2_mknod fails Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.15 13/79] selinux: enable use of both GFP_KERNEL and GFP_ATOMIC in convert_context() Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.15 14/79] cpufreq: qcom: fix writes in read-only memory region Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.15 15/79] i2c: qcom-cci: Fix ordering of pm_runtime_xx and i2c_add_adapter Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.15 16/79] cpufreq: tegra194: Fix module loading Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.15 17/79] x86/microcode/AMD: Apply the patch early on every logical thread Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.15 18/79] hwmon/coretemp: Handle large core ID value Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.15 19/79] ata: ahci-imx: Fix MODULE_ALIAS Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.15 20/79] ata: ahci: Match EM_MAX_SLOTS with SATA_PMP_MAX_PORTS Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.15 21/79] x86/resctrl: Fix min_cbm_bits for AMD Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.15 22/79] cpufreq: qcom: fix memory leak in error path Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.15 23/79] drm/amdgpu: fix sdma doorbell init ordering on APUs Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.15 24/79] mm,hugetlb: take hugetlb_lock before decrementing h->resv_huge_pages Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.15 25/79] kvm: Add support for arch compat vm ioctls Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.15 26/79] KVM: arm64: vgic: Fix exit condition in scan_its_table() Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.15 27/79] media: ipu3-imgu: Fix NULL pointer dereference in active selection access Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.15 28/79] media: mceusb: set timeout to at least timeout provided Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.15 29/79] media: venus: dec: Handle the case where find_format fails Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.15 30/79] x86/topology: Fix multiple packages shown on a single-package system Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.15 31/79] x86/topology: Fix duplicated core ID within a package Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.15 32/79] btrfs: fix processing of delayed data refs during backref walking Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.15 33/79] btrfs: fix processing of delayed tree block " Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.15 34/79] drm/vc4: Add module dependency on hdmi-codec Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.15 35/79] ACPI: extlog: Handle multiple records Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.15 36/79] tipc: Fix recognition of trial period Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.15 37/79] tipc: fix an information leak in tipc_topsrv_kern_subscr Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.15 38/79] i40e: Fix DMA mappings leak Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.15 39/79] HID: magicmouse: Do not set BTN_MOUSE on double report Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.15 40/79] sfc: Change VF mac via PF as first preference if available Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.15 41/79] net/atm: fix proc_mpc_write incorrect return value Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.15 42/79] net: phy: dp83867: Extend RX strap quirk for SGMII mode Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.15 43/79] net: phylink: add mac_managed_pm in phylink_config structure Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.15 44/79] scsi: lpfc: Fix memory leak in lpfc_create_port() Greg Kroah-Hartman
2022-10-27 16:55 ` Greg Kroah-Hartman [this message]
2022-10-27 16:55 ` [PATCH 5.15 46/79] cifs: Fix xid leak in cifs_create() Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.15 47/79] cifs: Fix xid leak in cifs_copy_file_range() Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.15 48/79] cifs: Fix xid leak in cifs_flock() Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.15 49/79] cifs: Fix xid leak in cifs_ses_add_channel() Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.15 50/79] dm: remove unnecessary assignment statement in alloc_dev() Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.15 51/79] net: hsr: avoid possible NULL deref in skb_clone() Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.15 52/79] ionic: catch NULL pointer issue on reconfig Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.15 53/79] netfilter: nf_tables: relax NFTA_SET_ELEM_KEY_END set flags requirements Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.15 54/79] nvme-hwmon: consistently ignore errors from nvme_hwmon_init Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.15 55/79] nvme-hwmon: kmalloc the NVME SMART log buffer Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.15 56/79] nvmet: fix workqueue MEM_RECLAIM flushing dependency Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.15 57/79] net: sched: cake: fix null pointer access issue when cake_init() fails Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.15 58/79] net: sched: delete duplicate cleanup of backlog and qlen Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.15 59/79] net: sched: sfb: fix null pointer access issue when sfb_init() fails Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.15 60/79] sfc: include vport_id in filter spec hash and equal() Greg Kroah-Hartman
2022-10-27 16:55 ` [PATCH 5.15 61/79] wwan_hwsim: fix possible memory leak in wwan_hwsim_dev_new() Greg Kroah-Hartman
2022-10-27 16:56 ` [PATCH 5.15 62/79] net: hns: fix possible memory leak in hnae_ae_register() Greg Kroah-Hartman
2022-10-27 16:56 ` [PATCH 5.15 63/79] net: sched: fix race condition in qdisc_graft() Greg Kroah-Hartman
2022-10-27 16:56 ` [PATCH 5.15 64/79] net: phy: dp83822: disable MDI crossover status change interrupt Greg Kroah-Hartman
2022-10-27 16:56 ` [PATCH 5.15 65/79] iommu/vt-d: Allow NVS regions in arch_rmrr_sanity_check() Greg Kroah-Hartman
2022-10-27 16:56 ` [PATCH 5.15 66/79] iommu/vt-d: Clean up si_domain in the init_dmars() error path Greg Kroah-Hartman
2022-10-27 16:56 ` [PATCH 5.15 67/79] fs: dlm: fix invalid derefence of sb_lvbptr Greg Kroah-Hartman
2022-10-27 16:56 ` [PATCH 5.15 68/79] arm64: mte: move register initialization to C Greg Kroah-Hartman
2022-10-27 16:56 ` [PATCH 5.15 69/79] ksmbd: handle smb2 query dir request for OutputBufferLength that is too small Greg Kroah-Hartman
2022-10-27 16:56 ` [PATCH 5.15 70/79] ksmbd: fix incorrect handling of iterate_dir Greg Kroah-Hartman
2022-10-27 16:56 ` [PATCH 5.15 71/79] tracing: Simplify conditional compilation code in tracing_set_tracer() Greg Kroah-Hartman
2022-10-27 16:56 ` [PATCH 5.15 72/79] tracing: Do not free snapshot if tracer is on cmdline Greg Kroah-Hartman
2022-10-27 16:56 ` [PATCH 5.15 73/79] mmc: sdhci-tegra: Use actual clock rate for SW tuning correction Greg Kroah-Hartman
2022-10-27 16:56 ` [PATCH 5.15 74/79] perf: Skip and warn on unknown format configN attrs Greg Kroah-Hartman
2022-10-27 16:56 ` [PATCH 5.15 75/79] [PATCH v3] ACPI: video: Force backlight native for more TongFang devices Greg Kroah-Hartman
2022-10-27 16:56 ` [PATCH 5.15 76/79] x86/Kconfig: Drop check for -mabi=ms for CONFIG_EFI_STUB Greg Kroah-Hartman
2022-10-27 16:56 ` [PATCH 5.15 77/79] Makefile.debug: re-enable debug info for .S files Greg Kroah-Hartman
2022-10-27 16:56 ` [PATCH 5.15 78/79] mmc: core: Add SD card quirk for broken discard Greg Kroah-Hartman
2022-10-27 16:56 ` [PATCH 5.15 79/79] mm: /proc/pid/smaps_rollup: fix no vmas null-deref Greg Kroah-Hartman
2022-10-27 18:18 ` [PATCH 5.15 00/79] 5.15.76-rc1 review Guenter Roeck
2022-10-28 10:42   ` Greg Kroah-Hartman
2022-10-28  3:46 ` Bagas Sanjaya
2022-10-28  9:58 ` Naresh Kamboju
2022-10-28 10:40 ` Sudip Mukherjee (Codethink)
2022-10-28 10:44   ` Sudip Mukherjee
2022-10-28 11:58 ` Jon Hunter

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20221027165056.404109613@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=kuniyu@amazon.com \
    --cc=pabeni@redhat.com \
    --cc=patches@lists.linux.dev \
    --cc=sashal@kernel.org \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox