From: Colin Foster <colin.foster@in-advantage.com>
To: Vladimir Oltean <vladimir.oltean@nxp.com>
Cc: netdev@vger.kernel.org, "David S. Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
Andrew Lunn <andrew@lunn.ch>,
Florian Fainelli <f.fainelli@gmail.com>,
Claudiu Manoil <claudiu.manoil@nxp.com>,
Alexandre Belloni <alexandre.belloni@bootlin.com>,
UNGLinuxDriver@microchip.com
Subject: Re: [PATCH net] net: dsa: ocelot: call dsa_tag_8021q_unregister() under rtnl_lock() on driver remove
Date: Fri, 4 Aug 2023 09:03:33 -0700
Message-ID: <ZM0hVTA7nHuRCSXa@euler>
In-Reply-To: <20230803134253.2711124-1-vladimir.oltean@nxp.com>
Hi Vladimir,
On Thu, Aug 03, 2023 at 04:42:53PM +0300, Vladimir Oltean wrote:
> When the tagging protocol in current use is "ocelot-8021q" and we unbind
> the driver, we see this splat:
>
> $ echo '0000:00:00.2' > /sys/bus/pci/drivers/fsl_enetc/unbind
> mscc_felix 0000:00:00.5 swp0: left promiscuous mode
> sja1105 spi2.0: Link is Down
> DSA: tree 1 torn down
> mscc_felix 0000:00:00.5 swp2: left promiscuous mode
> sja1105 spi2.2: Link is Down
> DSA: tree 3 torn down
> fsl_enetc 0000:00:00.2 eno2: left promiscuous mode
> mscc_felix 0000:00:00.5: Link is Down
> ------------[ cut here ]------------
> RTNL: assertion failed at net/dsa/tag_8021q.c (409)
> WARNING: CPU: 1 PID: 329 at net/dsa/tag_8021q.c:409 dsa_tag_8021q_unregister+0x12c/0x1a0
> Modules linked in:
> CPU: 1 PID: 329 Comm: bash Not tainted 6.5.0-rc3+ #771
> pc : dsa_tag_8021q_unregister+0x12c/0x1a0
> lr : dsa_tag_8021q_unregister+0x12c/0x1a0
> Call trace:
> dsa_tag_8021q_unregister+0x12c/0x1a0
> felix_tag_8021q_teardown+0x130/0x150
> felix_teardown+0x3c/0xd8
> dsa_tree_teardown_switches+0xbc/0xe0
> dsa_unregister_switch+0x168/0x260
> felix_pci_remove+0x30/0x60
> pci_device_remove+0x4c/0x100
> device_release_driver_internal+0x188/0x288
> device_links_unbind_consumers+0xfc/0x138
> device_release_driver_internal+0xe0/0x288
> device_driver_detach+0x24/0x38
> unbind_store+0xd8/0x108
> drv_attr_store+0x30/0x50
> ---[ end trace 0000000000000000 ]---
> ------------[ cut here ]------------
> RTNL: assertion failed at net/8021q/vlan_core.c (376)
> WARNING: CPU: 1 PID: 329 at net/8021q/vlan_core.c:376 vlan_vid_del+0x1b8/0x1f0
> CPU: 1 PID: 329 Comm: bash Tainted: G W 6.5.0-rc3+ #771
> pc : vlan_vid_del+0x1b8/0x1f0
> lr : vlan_vid_del+0x1b8/0x1f0
> dsa_tag_8021q_unregister+0x8c/0x1a0
> felix_tag_8021q_teardown+0x130/0x150
> felix_teardown+0x3c/0xd8
> dsa_tree_teardown_switches+0xbc/0xe0
> dsa_unregister_switch+0x168/0x260
> felix_pci_remove+0x30/0x60
> pci_device_remove+0x4c/0x100
> device_release_driver_internal+0x188/0x288
> device_links_unbind_consumers+0xfc/0x138
> device_release_driver_internal+0xe0/0x288
> device_driver_detach+0x24/0x38
> unbind_store+0xd8/0x108
> drv_attr_store+0x30/0x50
> DSA: tree 0 torn down
>
> This was somewhat not so easy to spot, because "ocelot-8021q" is not the
> default tagging protocol, and thus, not everyone who tests the unbinding
> path may have switched to it beforehand. The default
> felix_tag_npi_teardown() does not require rtnl_lock() to be held.
I ran this unbind test (with just the default ocelot tagging) on my
currently running system (6.5.0-rc1 plus 8 commits). It doesn't include
your patch, but I suspect the splat below is an entirely different issue,
because I'm not using ocelot-8021q.
# echo spi0.0 > /sys/bus/spi/drivers/ocelot-soc/unbind
br0: port 1(swp1) entered disabled state
ocelot-ext-switch ocelot-ext-switch.5.auto swp1 (unregistering): left allmulticast mode
ocelot-ext-switch ocelot-ext-switch.5.auto swp1 (unregistering): left promiscuous mode
br0: port 1(swp1) entered disabled state
br0: port 2(swp2) entered disabled state
ocelot-ext-switch ocelot-ext-switch.5.auto swp2 (unregistering): left allmulticast mode
ocelot-ext-switch ocelot-ext-switch.5.auto swp2 (unregistering): left promiscuous mode
br0: port 2(swp2) entered disabled state
br0: port 3(swp3) entered disabled state
ocelot-ext-switch ocelot-ext-switch.5.auto swp3 (unregistering): left allmulticast mode
ocelot-ext-switch ocelot-ext-switch.5.auto swp3 (unregistering): left promiscuous mode
br0: port 3(swp3) entered disabled state
br0: port 4(swp4) entered disabled state
ocelot-ext-switch ocelot-ext-switch.5.auto swp4 (unregistering): left allmulticast mode
ocelot-ext-switch ocelot-ext-switch.5.auto swp4 (unregistering): left promiscuous mode
br0: port 4(swp4) entered disabled state
br0: port 5(swp5) entered disabled state
ocelot-ext-switch ocelot-ext-switch.5.auto swp5 (unregistering): left allmulticast mode
ocelot-ext-switch ocelot-ext-switch.5.auto swp5 (unregistering): left promiscuous mode
br0: port 5(swp5) entered disabled state
br0: port 6(swp6) entered disabled state
ocelot-ext-switch ocelot-ext-switch.5.auto swp6 (unregistering): left allmulticast mode
ocelot-ext-switch ocelot-ext-switch.5.auto swp6 (unregistering): left promiscuous mode
br0: port 6(swp6) entered disabled state
br0: port 7(swp7) entered disabled state
ocelot-ext-switch ocelot-ext-switch.5.auto swp7 (unregistering): left allmulticast mode
cpsw-switch 4a100000.switch eth0: left allmulticast mode
ocelot-ext-switch ocelot-ext-switch.5.auto swp7 (unregistering): left promiscuous mode
cpsw-switch 4a100000.switch eth0: left promiscuous mode
br0: port 7(swp7) entered disabled state
ocelot-ext-switch ocelot-ext-switch.5.auto: Link is Down
DSA: tree 0 torn down
------------[ cut here ]------------
WARNING: CPU: 0 PID: 157 at net/dsa/dsa.c:1490 dsa_switch_release_ports+0x104/0x12c
Modules linked in:
CPU: 0 PID: 157 Comm: bash Not tainted 6.5.0-rc1-00008-ga5ed09af118a #1324
Hardware name: Generic AM33XX (Flattened Device Tree)
Backtrace:
dump_backtrace from show_stack+0x20/0x24
r7:00000009 r6:00000000 r5:c18c0a8c r4:000e0113
show_stack from dump_stack_lvl+0x60/0x78
dump_stack_lvl from dump_stack+0x18/0x1c
r7:00000009 r6:c1186e10 r5:000005d2 r4:c1a06270
dump_stack from __warn+0x88/0x160
__warn from warn_slowpath_fmt+0xe4/0x1e0
r8:00000009 r7:000005d2 r6:c1a06270 r5:c1d05590 r4:c1c978a4
warn_slowpath_fmt from dsa_switch_release_ports+0x104/0x12c
r10:c1ea8b7c r9:c4290da8 r8:00000100 r7:c1a06270 r6:c4288380 r5:c427f800
r4:c427f600
dsa_switch_release_ports from dsa_unregister_switch+0x38/0x18c
r9:c4290da8 r8:00000044 r7:c4255c54 r6:c4290db0 r5:c4290d80 r4:c4288380
dsa_unregister_switch from ocelot_ext_remove+0x28/0x40
r9:c1f6ec1c r8:00000044 r7:c4255c54 r6:c1ec5454 r5:00000000 r4:c26db800
ocelot_ext_remove from platform_remove+0x50/0x6c
r5:00000000 r4:c4255c10
platform_remove from device_remove+0x50/0x74
r5:00000000 r4:c4255c10
device_remove from device_release_driver_internal+0x190/0x204
r5:00000000 r4:c4255c10
device_release_driver_internal from device_release_driver+0x20/0x24
r9:c1f6ec1c r8:c2146940 r7:c2146938 r6:c214690c r5:c4255c10 r4:c2146930
device_release_driver from bus_remove_device+0xd0/0xf4
bus_remove_device from device_del+0x164/0x454
r9:c1f6ec1c r8:c424d800 r7:c47b4700 r6:00000000 r5:c4255c10 r4:c4255c54
device_del from platform_device_del.part.0+0x20/0x84
r10:c1ea8b7c r9:c4292e80 r8:00000100 r7:00000122 r6:c4255c00 r5:c4255c00
r4:c4255c00
platform_device_del.part.0 from platform_device_unregister+0x28/0x34
r5:c4255c10 r4:c4255c00
platform_device_unregister from mfd_remove_devices_fn+0xe8/0xf4
r5:c4255c10 r4:c1ea8b7c
mfd_remove_devices_fn from device_for_each_child_reverse+0x80/0xc8
r10:c47b4700 r9:c1d04d5c r8:c1f099a8 r7:c424d800 r6:c0a98f74 r5:e0c55d78
r4:00000000 r3:00000001
device_for_each_child_reverse from devm_mfd_dev_release+0x40/0x68
r6:e0c55dd4 r5:c4270e00 r4:c4270f00
devm_mfd_dev_release from release_nodes+0x78/0x104
release_nodes from devres_release_all+0x90/0xe0
r10:c4b05b10 r9:00000000 r8:c424d444 r7:c424d9b0 r6:80030013 r5:00000039
r4:c424d800
devres_release_all from device_unbind_cleanup+0x1c/0x70
r7:c424d844 r6:c1ea8b94 r5:c424d400 r4:c424d800
device_unbind_cleanup from device_release_driver_internal+0x1c0/0x204
r5:c424d400 r4:c424d800
device_release_driver_internal from device_driver_detach+0x20/0x24
r9:00000000 r8:00000000 r7:c1ea8b94 r6:00000007 r5:c424d800 r4:c1eb9108
device_driver_detach from unbind_store+0x64/0xa0
unbind_store from drv_attr_store+0x34/0x40
r7:e0c55f08 r6:c4b05b00 r5:c471d040 r4:c0a53410
drv_attr_store from sysfs_kf_write+0x48/0x54
r5:c471d040 r4:c0a5266c
sysfs_kf_write from kernfs_fop_write_iter+0x11c/0x1dc
r5:c471d040 r4:00000007
kernfs_fop_write_iter from vfs_write+0x2d0/0x41c
r10:00000000 r9:00004004 r8:00000000 r7:00000007 r6:005c9ef8 r5:e0c55f68
r4:c4958cc0
vfs_write from ksys_write+0x70/0xf4
r10:00000004 r9:c47b4700 r8:c03002f4 r7:00000000 r6:00000000 r5:c4958cc0
r4:c4958cc0
ksys_write from sys_write+0x18/0x1c
r7:00000004 r6:b6fad550 r5:005c9ef8 r4:00000007
sys_write from ret_fast_syscall+0x0/0x1c
Exception stack(0xe0c55fa8 to 0xe0c55ff0)
5fa0: 00000007 005c9ef8 00000001 005c9ef8 00000007 00000000
5fc0: 00000007 005c9ef8 b6fad550 00000004 00000007 00000001 00000000 be8e4a6c
5fe0: 00000004 be8e49c8 b6e56767 b6de1e06
---[ end trace 0000000000000000 ]---
gpio_stub_drv gpiochip6: REMOVING GPIOCHIP WITH GPIOS STILL REQUESTED
BUG: scheduling while atomic: bash/157/0x00000002
Modules linked in:
Preemption disabled at:
[<c03b8f98>] __wake_up_klogd.part.0+0x20/0xb4
CPU: 0 PID: 157 Comm: bash Tainted: G W 6.5.0-rc1-00008-ga5ed09af118a #1324
Hardware name: Generic AM33XX (Flattened Device Tree)
Backtrace:
dump_backtrace from show_stack+0x20/0x24
r7:c47b4700 r6:00000000 r5:c18c0a8c r4:000e0113
show_stack from dump_stack_lvl+0x60/0x78
dump_stack_lvl from dump_stack+0x18/0x1c
r7:c47b4700 r6:c47b4700 r5:c03b8f98 r4:c47b4700
dump_stack from __schedule_bug+0x94/0xa4
__schedule_bug from __schedule+0x8fc/0xc48
r5:00000000 r4:df99a400
__schedule from schedule+0x60/0xf4
r10:e0c55ab4 r9:00000002 r8:e0c55a3c r7:c47b4700 r6:e0c55ab0 r5:e0c55aac
r4:c47b4700
schedule from schedule_timeout+0xd8/0x190
r5:e0c55aac r4:7fffffff
schedule_timeout from wait_for_completion+0xa0/0x124
r8:e0c55a3c r7:c47b4700 r6:e0c55ab0 r5:e0c55aac r4:7fffffff
wait_for_completion from devtmpfs_submit_req+0x70/0x80
r10:c47b4700 r9:c1f6ec1c r8:c424e810 r7:00000000 r6:e0c55aac r5:e0c55aa8
r4:c1f6ed78
devtmpfs_submit_req from devtmpfs_delete_node+0x84/0xb4
r7:c47b4700 r6:c4250264 r5:c4250000 r4:00000000
devtmpfs_delete_node from device_del+0x3b8/0x454
r5:c4250000 r4:c4250044
device_del from cdev_device_del+0x24/0x54
r10:c47b4700 r9:c1d04d5c r8:00000040 r7:c4250234 r6:c4250264 r5:c42501e0
r4:c4250000
cdev_device_del from gpiolib_cdev_unregister+0x20/0x24
r5:c4250000 r4:00000000
gpiolib_cdev_unregister from gpiochip_remove+0x100/0x130
gpiochip_remove from devm_gpio_chip_release+0x18/0x1c
r9:c1d04d5c r8:c1f099a8 r7:c424e810 r6:e0c55bf4 r5:c427e700 r4:c427ea80
devm_gpio_chip_release from devm_action_release+0x1c/0x20
devm_action_release from release_nodes+0x78/0x104
release_nodes from devres_release_all+0x90/0xe0
r10:c1ea8b7c r9:c1f6ec1c r8:00000044 r7:c424e9c0 r6:800e0113 r5:00000093
r4:c424e810
devres_release_all from device_unbind_cleanup+0x1c/0x70
r7:c424e854 r6:c1dd9a80 r5:00000000 r4:c424e810
device_unbind_cleanup from device_release_driver_internal+0x1c0/0x204
r5:00000000 r4:c424e810
device_release_driver_internal from device_release_driver+0x20/0x24
r9:c1f6ec1c r8:c2146940 r7:c2146938 r6:c214690c r5:c424e810 r4:c2146930
device_release_driver from bus_remove_device+0xd0/0xf4
bus_remove_device from device_del+0x164/0x454
r9:c1f6ec1c r8:c424d800 r7:c47b4700 r6:00000000 r5:c424e810 r4:c424e854
device_del from platform_device_del.part.0+0x20/0x84
r10:c1ea8b7c r9:c4274f00 r8:00000100 r7:00000122 r6:c424e800 r5:c424e800
r4:c424e800
platform_device_del.part.0 from platform_device_unregister+0x28/0x34
r5:c424e810 r4:c424e800
platform_device_unregister from mfd_remove_devices_fn+0xe8/0xf4
r5:c424e810 r4:c1ea8b7c
mfd_remove_devices_fn from device_for_each_child_reverse+0x80/0xc8
r10:c47b4700 r9:c1d04d5c r8:c1f099a8 r7:c424d800 r6:c0a98f74 r5:e0c55d78
r4:00000000 r3:00000001
device_for_each_child_reverse from devm_mfd_dev_release+0x40/0x68
r6:e0c55dd4 r5:c4270e00 r4:c4270f00
devm_mfd_dev_release from release_nodes+0x78/0x104
release_nodes from devres_release_all+0x90/0xe0
r10:c4b05b10 r9:00000000 r8:c424d444 r7:c424d9b0 r6:80030013 r5:00000039
r4:c424d800
devres_release_all from device_unbind_cleanup+0x1c/0x70
r7:c424d844 r6:c1ea8b94 r5:c424d400 r4:c424d800
device_unbind_cleanup from device_release_driver_internal+0x1c0/0x204
r5:c424d400 r4:c424d800
device_release_driver_internal from device_driver_detach+0x20/0x24
r9:00000000 r8:00000000 r7:c1ea8b94 r6:00000007 r5:c424d800 r4:c1eb9108
device_driver_detach from unbind_store+0x64/0xa0
unbind_store from drv_attr_store+0x34/0x40
r7:e0c55f08 r6:c4b05b00 r5:c471d040 r4:c0a53410
drv_attr_store from sysfs_kf_write+0x48/0x54
r5:c471d040 r4:c0a5266c
sysfs_kf_write from kernfs_fop_write_iter+0x11c/0x1dc
r5:c471d040 r4:00000007
kernfs_fop_write_iter from vfs_write+0x2d0/0x41c
r10:00000000 r9:00004004 r8:00000000 r7:00000007 r6:005c9ef8 r5:e0c55f68
r4:c4958cc0
vfs_write from ksys_write+0x70/0xf4
r10:00000004 r9:c47b4700 r8:c03002f4 r7:00000000 r6:00000000 r5:c4958cc0
r4:c4958cc0
ksys_write from sys_write+0x18/0x1c
r7:00000004 r6:b6fad550 r5:005c9ef8 r4:00000007
sys_write from ret_fast_syscall+0x0/0x1c
Exception stack(0xe0c55fa8 to 0xe0c55ff0)
5fa0: 00000007 005c9ef8 00000001 005c9ef8 00000007 00000000
5fc0: 00000007 005c9ef8 b6fad550 00000004 00000007 00000001 00000000 be8e4a6c
5fe0: 00000004 be8e49c8 b6e56767 b6de1e06
cpsw-switch 4a100000.switch eth0: Link is Down
It looks to me like I have some things to fix :)
Is it still worth my trying to reproduce this and test your patch? I
haven't really used ocelot-8021q at all.
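For anyone wanting to confirm which tagging protocol is active before running the unbind test, here is a minimal sketch. It assumes the conduit interface (eth0 on my board, eno2 in the quoted log) exposes the dsa/tagging sysfs attribute; the helper name dsa_tagging and the SYSFS_ROOT override are hypothetical and exist only for illustration.

```shell
# Hypothetical helper: print the DSA tagging protocol of a conduit interface.
# SYSFS_ROOT is an illustrative override; on a real system it defaults to /sys.
dsa_tagging() {
    conduit="$1"
    file="${SYSFS_ROOT:-/sys}/class/net/${conduit}/dsa/tagging"
    # The attribute only exists on interfaces acting as a DSA conduit.
    if [ ! -r "$file" ]; then
        echo "no DSA tagging attribute for ${conduit}" >&2
        return 1
    fi
    cat "$file"
}
```

Calling "dsa_tagging eth0" should print the protocol name, such as ocelot or ocelot-8021q. Switching protocols would then presumably be a write of the new name to the same file while the conduit is down, but I have not tried that on this hardware.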
Colin