* [PATCH net-next] net: Restore napi threaded state only when it is enabled @ 2025-07-21 23:36 Samiullah Khawaja 2025-07-23 1:36 ` Jakub Kicinski 0 siblings, 1 reply; 5+ messages in thread From: Samiullah Khawaja @ 2025-07-21 23:36 UTC (permalink / raw) To: Jakub Kicinski, David S . Miller , Eric Dumazet, Paolo Abeni, almasrymina, willemb Cc: netdev, skhawaja Commit 2677010e7793 ("Add support to set NAPI threaded for individual NAPI") added support to enable/disable threaded napi using netlink. This also extended the napi config save/restore functionality to set the napi threaded state. This breaks the netdev reset when threaded napi is enabled at device level as the napi_restore_config tries to stop the kthreads as napi->config->thread is false when threaded is enabled at device level. The napi_restore_config should only restore the napi threaded state when threaded is enabled at NAPI level. The issue can be reproduced on virtio-net device using qemu. To reproduce the issue run following, echo 1 > /sys/class/net/threaded ethtool -L eth0 combined 1 Fixes: 2677010e7793 ("Add support to set NAPI threaded for individual NAPI") Signed-off-by: Samiullah Khawaja <skhawaja@google.com> --- net/core/dev.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/net/core/dev.c b/net/core/dev.c index 621a639aeba1..1cab04b8f60a 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -7244,7 +7244,8 @@ static void napi_restore_config(struct napi_struct *n) n->config->napi_id = n->napi_id; } - WARN_ON_ONCE(napi_set_threaded(n, n->config->threaded)); + if (n->config->threaded) + WARN_ON_ONCE(napi_set_threaded(n, n->config->threaded)); } static void napi_save_config(struct napi_struct *n) base-commit: c3886ccaadf8fdc2c91bfbdcdca36ccdc6ef8f70 -- 2.50.0.727.gbf7dc18ff4-goog ^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH net-next] net: Restore napi threaded state only when it is enabled 2025-07-21 23:36 [PATCH net-next] net: Restore napi threaded state only when it is enabled Samiullah Khawaja @ 2025-07-23 1:36 ` Jakub Kicinski 2025-07-23 2:36 ` Samiullah Khawaja 0 siblings, 1 reply; 5+ messages in thread From: Jakub Kicinski @ 2025-07-23 1:36 UTC (permalink / raw) To: Samiullah Khawaja Cc: David S . Miller , Eric Dumazet, Paolo Abeni, almasrymina, willemb, netdev On Mon, 21 Jul 2025 23:36:08 +0000 Samiullah Khawaja wrote: > Commit 2677010e7793 ("Add support to set NAPI threaded for individual > NAPI") added support to enable/disable threaded napi using netlink. This > also extended the napi config save/restore functionality to set the napi > threaded state. This breaks the netdev reset when threaded napi is "This breaks the netdev reset" is very vague. > enabled at device level as the napi_restore_config tries to stop the > kthreads as napi->config->thread is false when threaded is enabled at > device level. My reading of the commit message is that the WARN triggers, but looking at the code I think you mean that we fail to update the config when we set at the device level? > The napi_restore_config should only restore the napi threaded state when > threaded is enabled at NAPI level. > > The issue can be reproduced on virtio-net device using qemu. To > reproduce the issue run following, > > echo 1 > /sys/class/net/threaded > ethtool -L eth0 combined 1 Maybe we should add that as a test under tools/testing/drivers/net - it will run against netdevsim but also all drivers we test which currently means virtio and fbnic, but hopefully soon more. Up to you. I'm not sure I agree with the semantics, tho. IIUC you're basically making the code prefer threaded option. If user enables threading for the device, then disables it for a NAPI - reconfiguration will lose the fact that specific NAPI instance was supposed to have threading disabled. Why not update napi->config in netif_set_threaded() instead? -- pw-bot: cr ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH net-next] net: Restore napi threaded state only when it is enabled 2025-07-23 1:36 ` Jakub Kicinski @ 2025-07-23 2:36 ` Samiullah Khawaja 2025-07-23 14:56 ` Jakub Kicinski 0 siblings, 1 reply; 5+ messages in thread From: Samiullah Khawaja @ 2025-07-23 2:36 UTC (permalink / raw) To: Jakub Kicinski Cc: David S . Miller, Eric Dumazet, Paolo Abeni, almasrymina, willemb, netdev On Tue, Jul 22, 2025 at 6:36 PM Jakub Kicinski <kuba@kernel.org> wrote: > > On Mon, 21 Jul 2025 23:36:08 +0000 Samiullah Khawaja wrote: > > Commit 2677010e7793 ("Add support to set NAPI threaded for individual > > NAPI") added support to enable/disable threaded napi using netlink. This > > also extended the napi config save/restore functionality to set the napi > > threaded state. This breaks the netdev reset when threaded napi is > > "This breaks the netdev reset" is very vague. Basically on netdev reset inside napi_enable when it calls napi_restore_config it tries to stop the NAPI kthread. Since during napi_enable, the NAPI has STATE_SCHED set on it, the stop_kthread gets stuck waiting for the STATE_SCHED to be unset. It should not be destroying the kthread since threaded is enabled at device level. But I think your point below is valid, we should probably set napi->config->threaded in netif_set_threaded. I should add this to the commit message. > > > enabled at device level as the napi_restore_config tries to stop the > > kthreads as napi->config->thread is false when threaded is enabled at > > device level. > > My reading of the commit message is that the WARN triggers, but > looking at the code I think you mean that we fail to update the > config when we set at the device level? > > > The napi_restore_config should only restore the napi threaded state when > > threaded is enabled at NAPI level. > > > > The issue can be reproduced on virtio-net device using qemu. To > > reproduce the issue run following, > > > > echo 1 > /sys/class/net/threaded > > ethtool -L eth0 combined 1 > > Maybe we should add that as a test under tools/testing/drivers/net - > it will run against netdevsim but also all drivers we test which > currently means virtio and fbnic, but hopefully soon more. Up to you. +1 I do want to add a test for it. I was thinking of extending nl_netdev.py but that doesn't seem suitable as it only looks at the state. I will look at the driver directory. Should I send a test with this fix or should I send that later after the next reopen? > > I'm not sure I agree with the semantics, tho. IIUC you're basically > making the code prefer threaded option. If user enables threading > for the device, then disables it for a NAPI - reconfiguration will > lose the fact that specific NAPI instance was supposed to have threading > disabled. Why not update napi->config in netif_set_threaded() instead? I think this discrepancy is orthogonal to the stopping of threads in restore. This is because the napi_enable is calling restore_config and then setting the STATE_THREADED bit on napis based on dev->threaded. Even if restore unsets the THREADED bit (which would have already been unset during napi_disable), it will be set again based on dev->threaded. I think napi_restore_config should be called after setting up STATE bits? > -- > pw-bot: cr ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH net-next] net: Restore napi threaded state only when it is enabled 2025-07-23 2:36 ` Samiullah Khawaja @ 2025-07-23 14:56 ` Jakub Kicinski 2025-07-29 19:30 ` Samiullah Khawaja 0 siblings, 1 reply; 5+ messages in thread From: Jakub Kicinski @ 2025-07-23 14:56 UTC (permalink / raw) To: Samiullah Khawaja Cc: David S . Miller, Eric Dumazet, Paolo Abeni, almasrymina, willemb, netdev, Joe Damato On Tue, 22 Jul 2025 19:36:31 -0700 Samiullah Khawaja wrote: > On Tue, Jul 22, 2025 at 6:36 PM Jakub Kicinski <kuba@kernel.org> wrote: > > > > On Mon, 21 Jul 2025 23:36:08 +0000 Samiullah Khawaja wrote: > > > Commit 2677010e7793 ("Add support to set NAPI threaded for individual > > > NAPI") added support to enable/disable threaded napi using netlink. This > > > also extended the napi config save/restore functionality to set the napi > > > threaded state. This breaks the netdev reset when threaded napi is > > > > "This breaks the netdev reset" is very vague. > Basically on netdev reset inside napi_enable when it calls > napi_restore_config it tries to stop the NAPI kthread. Since during > napi_enable, the NAPI has STATE_SCHED set on it, the stop_kthread gets > stuck waiting for the STATE_SCHED to be unset. It should not be > destroying the kthread since threaded is enabled at device level. But > I think your point below is valid, we should probably set > napi->config->threaded in netif_set_threaded. > > I should add this to the commit message. Ah, I see! The impermanence of the DISABLED bit strikes again :( > > > enabled at device level as the napi_restore_config tries to stop the > > > kthreads as napi->config->thread is false when threaded is enabled at > > > device level. > > > > My reading of the commit message is that the WARN triggers, but > > looking at the code I think you mean that we fail to update the > > config when we set at the device level? > > > > > The napi_restore_config should only restore the napi threaded state when > > > threaded is enabled at NAPI level. > > > > > > The issue can be reproduced on virtio-net device using qemu. To > > > reproduce the issue run following, > > > > > > echo 1 > /sys/class/net/threaded > > > ethtool -L eth0 combined 1 > > > > Maybe we should add that as a test under tools/testing/drivers/net - > > it will run against netdevsim but also all drivers we test which > > currently means virtio and fbnic, but hopefully soon more. Up to you. > +1 > > I do want to add a test for it. I was thinking of extending > nl_netdev.py but that doesn't seem suitable as it only looks at the > state. I will look at the driver directory. Should I send a test with > this fix or should I send that later after the next reopen? We recommend sending the patch and the fix in one series, but up to you. > > I'm not sure I agree with the semantics, tho. IIUC you're basically > > making the code prefer threaded option. If user enables threading > > for the device, then disables it for a NAPI - reconfiguration will > > lose the fact that specific NAPI instance was supposed to have threading > > disabled. Why not update napi->config in netif_set_threaded() instead? > I think this discrepancy is orthogonal to the stopping of threads in > restore. This is because the napi_enable is calling restore_config and > then setting the STATE_THREADED bit on napis based on dev->threaded. > Even if restore unsets the THREADED bit (which would have already been > unset during napi_disable), it will be set again based on > dev->threaded. > > I think napi_restore_config should be called after setting up STATE bits? Yes, I think that would work. In fact I think it makes more sense in the first place to "list" the NAPI in the hash table accessible by user space only after it's actually enabled. I'd also be tempted to move some of the config restoration to netif_napi_add(), so that we don't start the thread just to stop it. But we'd need it in both places, IIRC netlink only holds the instance lock, so theoretically the config may change between "add" and "enable" :( cc: Joe, who's probably offline, but note his non-Fastly address ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH net-next] net: Restore napi threaded state only when it is enabled 2025-07-23 14:56 ` Jakub Kicinski @ 2025-07-29 19:30 ` Samiullah Khawaja 0 siblings, 0 replies; 5+ messages in thread From: Samiullah Khawaja @ 2025-07-29 19:30 UTC (permalink / raw) To: Jakub Kicinski Cc: David S . Miller, Eric Dumazet, Paolo Abeni, almasrymina, willemb, netdev, Joe Damato On Wed, Jul 23, 2025 at 7:56 AM Jakub Kicinski <kuba@kernel.org> wrote: > > On Tue, 22 Jul 2025 19:36:31 -0700 Samiullah Khawaja wrote: > > On Tue, Jul 22, 2025 at 6:36 PM Jakub Kicinski <kuba@kernel.org> wrote: > > > > > > On Mon, 21 Jul 2025 23:36:08 +0000 Samiullah Khawaja wrote: > > > > Commit 2677010e7793 ("Add support to set NAPI threaded for individual > > > > NAPI") added support to enable/disable threaded napi using netlink. This > > > > also extended the napi config save/restore functionality to set the napi > > > > threaded state. This breaks the netdev reset when threaded napi is > > > > > > "This breaks the netdev reset" is very vague. > > Basically on netdev reset inside napi_enable when it calls > > napi_restore_config it tries to stop the NAPI kthread. Since during > > napi_enable, the NAPI has STATE_SCHED set on it, the stop_kthread gets > > stuck waiting for the STATE_SCHED to be unset. It should not be > > destroying the kthread since threaded is enabled at device level. But > > I think your point below is valid, we should probably set > > napi->config->threaded in netif_set_threaded. > > > > I should add this to the commit message. > > Ah, I see! The impermanence of the DISABLED bit strikes again :( > > > > > enabled at device level as the napi_restore_config tries to stop the > > > > kthreads as napi->config->thread is false when threaded is enabled at > > > > device level. > > > > > > My reading of the commit message is that the WARN triggers, but > > > looking at the code I think you mean that we fail to update the > > > config when we set at the device level? > > > > > > > The napi_restore_config should only restore the napi threaded state when > > > > threaded is enabled at NAPI level. > > > > > > > > The issue can be reproduced on virtio-net device using qemu. To > > > > reproduce the issue run following, > > > > > > > > echo 1 > /sys/class/net/threaded > > > > ethtool -L eth0 combined 1 > > > > > > Maybe we should add that as a test under tools/testing/drivers/net - > > > it will run against netdevsim but also all drivers we test which > > > currently means virtio and fbnic, but hopefully soon more. Up to you. > > +1 > > > > I do want to add a test for it. I was thinking of extending > > nl_netdev.py but that doesn't seem suitable as it only looks at the > > state. I will look at the driver directory. Should I send a test with > > this fix or should I send that later after the next reopen? > > We recommend sending the patch and the fix in one series, but up to you. > > > > I'm not sure I agree with the semantics, tho. IIUC you're basically > > > making the code prefer threaded option. If user enables threading > > > for the device, then disables it for a NAPI - reconfiguration will > > > lose the fact that specific NAPI instance was supposed to have threading > > > disabled. Why not update napi->config in netif_set_threaded() instead? I will do this. > > I think this discrepancy is orthogonal to the stopping of threads in > > restore. This is because the napi_enable is calling restore_config and > > then setting the STATE_THREADED bit on napis based on dev->threaded. > > Even if restore unsets the THREADED bit (which would have already been > > unset during napi_disable), it will be set again based on > > dev->threaded. Looking at this again, it seems the current mechanism should be fine as if the napi thread was disabled for a napi then the kthread associated with it will be destroyed also. This means that the STATE_THREADED will not be enabled for it. I will set the threaded config for each napi in netif_set_threaded. > > > > I think napi_restore_config should be called after setting up STATE bits? This is not needed. > > Yes, I think that would work. In fact I think it makes more sense in > the first place to "list" the NAPI in the hash table accessible by user > space only after it's actually enabled. > > I'd also be tempted to move some of the config restoration to > netif_napi_add(), so that we don't start the thread just to stop it. > But we'd need it in both places, IIRC netlink only holds the instance > lock, so theoretically the config may change between "add" and "enable" > :( > > cc: Joe, who's probably offline, but note his non-Fastly address ^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2025-07-29 19:31 UTC | newest] Thread overview: 5+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2025-07-21 23:36 [PATCH net-next] net: Restore napi threaded state only when it is enabled Samiullah Khawaja 2025-07-23 1:36 ` Jakub Kicinski 2025-07-23 2:36 ` Samiullah Khawaja 2025-07-23 14:56 ` Jakub Kicinski 2025-07-29 19:30 ` Samiullah Khawaja
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).