From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org, Joseph Huang <Joseph.Huang@garmin.com>,
Nikolay Aleksandrov <nikolay@nvidia.com>,
Jakub Kicinski <kuba@kernel.org>
Subject: [PATCH 5.9 12/49] bridge: Fix a deadlock when enabling multicast snooping
Date: Sat, 19 Dec 2020 13:58:16 +0100
Message-ID: <20201219125345.277579762@linuxfoundation.org>
In-Reply-To: <20201219125344.671832095@linuxfoundation.org>
From: Joseph Huang <Joseph.Huang@garmin.com>
[ Upstream commit 851d0a73c90e6c8c63fef106c6c1e73df7e05d9d ]
When enabling multicast snooping, the bridge module deadlocks on
multicast_lock if 1) IPv6 is enabled, and 2) there is an existing querier
on the same L2 network.
The deadlock was caused by the following sequence: While holding the lock,
br_multicast_open calls br_multicast_join_snoopers, which eventually causes
the IP stack to (attempt to) send out a Listener Report (in
igmp6_join_group). Since the destination Ethernet address is a multicast
address, br_dev_xmit feeds the packet back to the bridge via
br_multicast_rcv, which in turn calls br_multicast_add_group, which then
tries to take multicast_lock again on the same CPU and deadlocks.
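
The re-entry described above can be reproduced in miniature in user space.
The following is only a sketch, not bridge code: it assumes a POSIX
error-checking mutex as a stand-in for the multicast_lock spinlock, and all
chk_* names are hypothetical. Where the kernel spinlock would spin forever,
the error-checking mutex reports the second acquisition as EDEADLK.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for br->multicast_lock (a mutex, not a spinlock). */
static pthread_mutex_t chk_lock;

/* Stand-in for br_multicast_add_group(): it takes the lock itself. */
static int chk_add_group(void)
{
	int err = pthread_mutex_lock(&chk_lock);

	if (err)
		return err;	/* EDEADLK if this thread already owns the lock */
	pthread_mutex_unlock(&chk_lock);
	return 0;
}

/* Stand-in for the pre-fix path: call back in while the lock is held. */
int chk_toggle_locked(void)
{
	pthread_mutexattr_t attr;
	int rc, err;

	pthread_mutexattr_init(&attr);
	rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	assert(rc == 0);
	pthread_mutex_init(&chk_lock, &attr);
	pthread_mutexattr_destroy(&attr);

	pthread_mutex_lock(&chk_lock);
	err = chk_add_group();	/* same thread re-enters the locked region */
	pthread_mutex_unlock(&chk_lock);

	pthread_mutex_destroy(&chk_lock);
	return err;
}
```

Calling chk_toggle_locked() returns the error from the nested acquisition
instead of hanging, which makes the ordering problem visible without a
kernel.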
The fix is to move the call to br_multicast_join_snoopers outside of the
critical section. This works since br_multicast_join_snoopers only deals
with IP and does not modify any multicast data structures of the bridge,
so there's no need to hold the lock.
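
The shape of the fix, recording the decision under the lock and acting on it
only after the lock is dropped, can be sketched in user space as follows.
This is an illustration under stated assumptions, not the patched function:
a pthread mutex stands in for multicast_lock, all demo_* names are
hypothetical, and the sketch branches on the caller's value where the patch
re-reads BROPT_MULTICAST_ENABLED after unlocking.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-ins for multicast_lock and BROPT_MULTICAST_ENABLED. */
static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;
static bool demo_enabled;
static int demo_members;	/* group memberships currently held */

/* Stand-in for br_multicast_join_snoopers(): may end up taking the lock. */
static void demo_join(void)
{
	pthread_mutex_lock(&demo_lock);
	demo_members++;
	pthread_mutex_unlock(&demo_lock);
}

/* Stand-in for br_multicast_leave_snoopers(). */
static void demo_leave(void)
{
	pthread_mutex_lock(&demo_lock);
	assert(demo_members > 0);
	demo_members--;
	pthread_mutex_unlock(&demo_lock);
}

/* Mirrors the structure of the patched br_multicast_toggle(). */
int demo_toggle(bool val)
{
	bool change_snoopers = false;

	pthread_mutex_lock(&demo_lock);
	if (demo_enabled != val) {
		demo_enabled = val;
		change_snoopers = true;	/* only record the decision here */
	}
	pthread_mutex_unlock(&demo_lock);

	/* Safe to call back into code that takes demo_lock: it is released. */
	if (change_snoopers) {
		if (val)
			demo_join();
		else
			demo_leave();
	}
	return 0;
}
```

Toggling to the value already set leaves demo_members unchanged, matching
the early exit in the real function when the option already has the
requested value.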
Steps to reproduce:
1. sysctl net.ipv6.conf.all.force_mld_version=1
2. have another querier
3. ip link set dev bridge type bridge mcast_snooping 0 && \
ip link set dev bridge type bridge mcast_snooping 1 < deadlock >
A typical call trace looks like the following:
[ 936.251495] _raw_spin_lock+0x5c/0x68
[ 936.255221] br_multicast_add_group+0x40/0x170 [bridge]
[ 936.260491] br_multicast_rcv+0x7ac/0xe30 [bridge]
[ 936.265322] br_dev_xmit+0x140/0x368 [bridge]
[ 936.269689] dev_hard_start_xmit+0x94/0x158
[ 936.273876] __dev_queue_xmit+0x5ac/0x7f8
[ 936.277890] dev_queue_xmit+0x10/0x18
[ 936.281563] neigh_resolve_output+0xec/0x198
[ 936.285845] ip6_finish_output2+0x240/0x710
[ 936.290039] __ip6_finish_output+0x130/0x170
[ 936.294318] ip6_output+0x6c/0x1c8
[ 936.297731] NF_HOOK.constprop.0+0xd8/0xe8
[ 936.301834] igmp6_send+0x358/0x558
[ 936.305326] igmp6_join_group.part.0+0x30/0xf0
[ 936.309774] igmp6_group_added+0xfc/0x110
[ 936.313787] __ipv6_dev_mc_inc+0x1a4/0x290
[ 936.317885] ipv6_dev_mc_inc+0x10/0x18
[ 936.321677] br_multicast_open+0xbc/0x110 [bridge]
[ 936.326506] br_multicast_toggle+0xec/0x140 [bridge]
Fixes: 4effd28c1245 ("bridge: join all-snoopers multicast address")
Signed-off-by: Joseph Huang <Joseph.Huang@garmin.com>
Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Link: https://lore.kernel.org/r/20201204235628.50653-1-Joseph.Huang@garmin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
net/bridge/br_device.c | 6 ++++++
net/bridge/br_multicast.c | 34 +++++++++++++++++++++++++---------
net/bridge/br_private.h | 10 ++++++++++
3 files changed, 41 insertions(+), 9 deletions(-)
--- a/net/bridge/br_device.c
+++ b/net/bridge/br_device.c
@@ -177,6 +177,9 @@ static int br_dev_open(struct net_device
br_stp_enable_bridge(br);
br_multicast_open(br);
+ if (br_opt_get(br, BROPT_MULTICAST_ENABLED))
+ br_multicast_join_snoopers(br);
+
return 0;
}
@@ -197,6 +200,9 @@ static int br_dev_stop(struct net_device
br_stp_disable_bridge(br);
br_multicast_stop(br);
+ if (br_opt_get(br, BROPT_MULTICAST_ENABLED))
+ br_multicast_leave_snoopers(br);
+
netif_stop_queue(dev);
return 0;
--- a/net/bridge/br_multicast.c
+++ b/net/bridge/br_multicast.c
@@ -1848,7 +1848,7 @@ static inline void br_ip6_multicast_join
}
#endif
-static void br_multicast_join_snoopers(struct net_bridge *br)
+void br_multicast_join_snoopers(struct net_bridge *br)
{
br_ip4_multicast_join_snoopers(br);
br_ip6_multicast_join_snoopers(br);
@@ -1879,7 +1879,7 @@ static inline void br_ip6_multicast_leav
}
#endif
-static void br_multicast_leave_snoopers(struct net_bridge *br)
+void br_multicast_leave_snoopers(struct net_bridge *br)
{
br_ip4_multicast_leave_snoopers(br);
br_ip6_multicast_leave_snoopers(br);
@@ -1898,9 +1898,6 @@ static void __br_multicast_open(struct n
void br_multicast_open(struct net_bridge *br)
{
- if (br_opt_get(br, BROPT_MULTICAST_ENABLED))
- br_multicast_join_snoopers(br);
-
__br_multicast_open(br, &br->ip4_own_query);
#if IS_ENABLED(CONFIG_IPV6)
__br_multicast_open(br, &br->ip6_own_query);
@@ -1916,9 +1913,6 @@ void br_multicast_stop(struct net_bridge
del_timer_sync(&br->ip6_other_query.timer);
del_timer_sync(&br->ip6_own_query.timer);
#endif
-
- if (br_opt_get(br, BROPT_MULTICAST_ENABLED))
- br_multicast_leave_snoopers(br);
}
void br_multicast_dev_del(struct net_bridge *br)
@@ -2049,6 +2043,7 @@ static void br_multicast_start_querier(s
int br_multicast_toggle(struct net_bridge *br, unsigned long val)
{
struct net_bridge_port *port;
+ bool change_snoopers = false;
spin_lock_bh(&br->multicast_lock);
if (!!br_opt_get(br, BROPT_MULTICAST_ENABLED) == !!val)
@@ -2057,7 +2052,7 @@ int br_multicast_toggle(struct net_bridg
br_mc_disabled_update(br->dev, val);
br_opt_toggle(br, BROPT_MULTICAST_ENABLED, !!val);
if (!br_opt_get(br, BROPT_MULTICAST_ENABLED)) {
- br_multicast_leave_snoopers(br);
+ change_snoopers = true;
goto unlock;
}
@@ -2068,9 +2063,30 @@ int br_multicast_toggle(struct net_bridg
list_for_each_entry(port, &br->port_list, list)
__br_multicast_enable_port(port);
+ change_snoopers = true;
+
unlock:
spin_unlock_bh(&br->multicast_lock);
+ /* br_multicast_join_snoopers has the potential to cause
+ * an MLD Report/Leave to be delivered to br_multicast_rcv,
+ * which would in turn call br_multicast_add_group, which would
+ * attempt to acquire multicast_lock. This function should be
+ * called after the lock has been released to avoid deadlocks on
+ * multicast_lock.
+ *
+ * br_multicast_leave_snoopers does not have the problem since
+ * br_multicast_rcv first checks BROPT_MULTICAST_ENABLED, and
+ * returns without calling br_multicast_ipv4/6_rcv if it's not
+ * enabled. Moved both functions out just for symmetry.
+ */
+ if (change_snoopers) {
+ if (br_opt_get(br, BROPT_MULTICAST_ENABLED))
+ br_multicast_join_snoopers(br);
+ else
+ br_multicast_leave_snoopers(br);
+ }
+
return 0;
}
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -745,6 +745,8 @@ void br_multicast_del_port(struct net_br
void br_multicast_enable_port(struct net_bridge_port *port);
void br_multicast_disable_port(struct net_bridge_port *port);
void br_multicast_init(struct net_bridge *br);
+void br_multicast_join_snoopers(struct net_bridge *br);
+void br_multicast_leave_snoopers(struct net_bridge *br);
void br_multicast_open(struct net_bridge *br);
void br_multicast_stop(struct net_bridge *br);
void br_multicast_dev_del(struct net_bridge *br);
@@ -872,6 +874,14 @@ static inline void br_multicast_init(str
{
}
+static inline void br_multicast_join_snoopers(struct net_bridge *br)
+{
+}
+
+static inline void br_multicast_leave_snoopers(struct net_bridge *br)
+{
+}
+
static inline void br_multicast_open(struct net_bridge *br)
{
}