From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	patches@lists.linux.dev, Rafael Richter <rafael.richter@gin.de>,
	Vladimir Oltean <vladimir.oltean@nxp.com>,
	"David S. Miller" <davem@davemloft.net>,
	xu.xin16@zte.com.cn, Vladimir Oltean <olteanv@gmail.com>,
	Paolo Abeni <pabeni@redhat.com>
Subject: [PATCH 5.15 01/57] net: dsa: fix panic when DSA master device unbinds on shutdown
Date: Thu, 11 Apr 2024 11:57:09 +0200
Message-ID: <20240411095408.029941837@linuxfoundation.org>
In-Reply-To: <20240411095407.982258070@linuxfoundation.org>

5.15-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Vladimir Oltean <vladimir.oltean@nxp.com>

commit ee534378f00561207656663d93907583958339ae upstream.

Rafael reports that on a system with LX2160A and Marvell DSA switches,
if a reboot occurs while the DSA master (dpaa2-eth) is up, the following
panic can be seen:

systemd-shutdown[1]: Rebooting.
Unable to handle kernel paging request at virtual address 00a0000800000041
[00a0000800000041] address between user and kernel address ranges
Internal error: Oops: 96000004 [#1] PREEMPT SMP
CPU: 6 PID: 1 Comm: systemd-shutdow Not tainted 5.16.5-00042-g8f5585009b24 #32
pc : dsa_slave_netdevice_event+0x130/0x3e4
lr : raw_notifier_call_chain+0x50/0x6c
Call trace:
 dsa_slave_netdevice_event+0x130/0x3e4
 raw_notifier_call_chain+0x50/0x6c
 call_netdevice_notifiers_info+0x54/0xa0
 __dev_close_many+0x50/0x130
 dev_close_many+0x84/0x120
 unregister_netdevice_many+0x130/0x710
 unregister_netdevice_queue+0x8c/0xd0
 unregister_netdev+0x20/0x30
 dpaa2_eth_remove+0x68/0x190
 fsl_mc_driver_remove+0x20/0x5c
 __device_release_driver+0x21c/0x220
 device_release_driver_internal+0xac/0xb0
 device_links_unbind_consumers+0xd4/0x100
 __device_release_driver+0x94/0x220
 device_release_driver+0x28/0x40
 bus_remove_device+0x118/0x124
 device_del+0x174/0x420
 fsl_mc_device_remove+0x24/0x40
 __fsl_mc_device_remove+0xc/0x20
 device_for_each_child+0x58/0xa0
 dprc_remove+0x90/0xb0
 fsl_mc_driver_remove+0x20/0x5c
 __device_release_driver+0x21c/0x220
 device_release_driver+0x28/0x40
 bus_remove_device+0x118/0x124
 device_del+0x174/0x420
 fsl_mc_bus_remove+0x80/0x100
 fsl_mc_bus_shutdown+0xc/0x1c
 platform_shutdown+0x20/0x30
 device_shutdown+0x154/0x330
 __do_sys_reboot+0x1cc/0x250
 __arm64_sys_reboot+0x20/0x30
 invoke_syscall.constprop.0+0x4c/0xe0
 do_el0_svc+0x4c/0x150
 el0_svc+0x24/0xb0
 el0t_64_sync_handler+0xa8/0xb0
 el0t_64_sync+0x178/0x17c

It can be seen from the stack trace that the problem is that the
deregistration of the master causes a dev_close(), which gets notified
as NETDEV_GOING_DOWN to dsa_slave_netdevice_event().
But dsa_switch_shutdown() has already run, and this has unregistered the
DSA slave interfaces, and yet, the NETDEV_GOING_DOWN handler attempts to
call dev_close_many() on those slave interfaces, leading to the problem.

The previous attempt to avoid the NETDEV_GOING_DOWN on the master after
dsa_switch_shutdown() was called seems improper. Unregistering the slave
interfaces is unnecessary and unhelpful. Instead, after the slaves have
stopped being uppers of the DSA master, we can now reset to NULL the
master->dsa_ptr pointer, which will make DSA start ignoring all future
notifier events on the master.
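
The effect of clearing master->dsa_ptr can be sketched in a small userspace
model. This is only an illustration, not the kernel code: every type and
function name below (net_device_model, model_uses_dsa() and so on) is a
hypothetical stand-in, and the handler is reduced to the single check that
matters here, namely that a device whose dsa_ptr is NULL is no longer treated
as a DSA master.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for the kernel structures. */
struct dsa_port_model {
	int index;
};

struct net_device_model {
	struct dsa_port_model *dsa_ptr;	/* non-NULL while acting as DSA master */
	int slave_close_calls;		/* how often the handler acted on slaves */
};

/* Models the idea behind netdev_uses_dsa(): false once dsa_ptr is NULL. */
static bool model_uses_dsa(const struct net_device_model *dev)
{
	return dev->dsa_ptr != NULL;
}

/* Simplified NETDEV_GOING_DOWN handling: only DSA masters are acted upon. */
static int model_going_down_event(struct net_device_model *dev)
{
	if (!model_uses_dsa(dev))
		return 0;		/* event ignored */

	dev->slave_close_calls++;	/* the real handler would close the slaves */
	return 1;
}

/* Simplified shutdown step: drop the back-pointer from the master. */
static void model_switch_shutdown(struct net_device_model *master)
{
	master->dsa_ptr = NULL;
}

/* Walks through the sequence: event, shutdown, event again. */
int model_demo(void)
{
	struct dsa_port_model cpu = { .index = 0 };
	struct net_device_model master = { .dsa_ptr = &cpu, .slave_close_calls = 0 };

	assert(model_going_down_event(&master) == 1);	/* handled before shutdown */
	model_switch_shutdown(&master);
	assert(model_going_down_event(&master) == 0);	/* ignored after shutdown */

	return master.slave_close_calls;
}
```

In this model the handler acts on the slaves exactly once, before the
shutdown step, which is the behaviour the change aims for on the real
master device.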

Fixes: 0650bf52b31f ("net: dsa: be compatible with masters which unregister on shutdown")
Reported-by: Rafael Richter <rafael.richter@gin.de>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: xu.xin16@zte.com.cn
Cc: Vladimir Oltean <olteanv@gmail.com>
Cc: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/dsa/dsa2.c |   25 ++++++-------------------
 1 file changed, 6 insertions(+), 19 deletions(-)

--- a/net/dsa/dsa2.c
+++ b/net/dsa/dsa2.c
@@ -1634,7 +1634,6 @@ EXPORT_SYMBOL_GPL(dsa_unregister_switch)
 void dsa_switch_shutdown(struct dsa_switch *ds)
 {
 	struct net_device *master, *slave_dev;
-	LIST_HEAD(unregister_list);
 	struct dsa_port *dp;
 
 	mutex_lock(&dsa2_mutex);
@@ -1655,25 +1654,13 @@ void dsa_switch_shutdown(struct dsa_swit
 		slave_dev = dp->slave;
 
 		netdev_upper_dev_unlink(master, slave_dev);
-		/* Just unlinking ourselves as uppers of the master is not
-		 * sufficient. When the master net device unregisters, that will
-		 * also call dev_close, which we will catch as NETDEV_GOING_DOWN
-		 * and trigger a dev_close on our own devices (dsa_slave_close).
-		 * In turn, that will call dev_mc_unsync on the master's net
-		 * device. If the master is also a DSA switch port, this will
-		 * trigger dsa_slave_set_rx_mode which will call dev_mc_sync on
-		 * its own master. Lockdep will complain about the fact that
-		 * all cascaded masters have the same dsa_master_addr_list_lock_key,
-		 * which it normally would not do if the cascaded masters would
-		 * be in a proper upper/lower relationship, which we've just
-		 * destroyed.
-		 * To suppress the lockdep warnings, let's actually unregister
-		 * the DSA slave interfaces too, to avoid the nonsensical
-		 * multicast address list synchronization on shutdown.
-		 */
-		unregister_netdevice_queue(slave_dev, &unregister_list);
 	}
-	unregister_netdevice_many(&unregister_list);
+
+	/* Disconnect from further netdevice notifiers on the master,
+	 * since netdev_uses_dsa() will now return false.
+	 */
+	dsa_switch_for_each_cpu_port(dp, ds)
+		dp->master->dsa_ptr = NULL;
 
 	rtnl_unlock();
 out:



