netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net-next] net: Restore napi threaded state only when it is enabled
@ 2025-07-21 23:36 Samiullah Khawaja
  2025-07-23  1:36 ` Jakub Kicinski
  0 siblings, 1 reply; 5+ messages in thread
From: Samiullah Khawaja @ 2025-07-21 23:36 UTC (permalink / raw)
  To: Jakub Kicinski, David S . Miller , Eric Dumazet, Paolo Abeni,
	almasrymina, willemb
  Cc: netdev, skhawaja

Commit 2677010e7793 ("Add support to set NAPI threaded for individual
NAPI") added support to enable/disable threaded napi using netlink. This
also extended the napi config save/restore functionality to set the napi
threaded state. This breaks the netdev reset when threaded napi is
enabled at device level as the napi_restore_config tries to stop the
kthreads as napi->config->thread is false when threaded is enabled at
device level.

The napi_restore_config should only restore the napi threaded state when
threaded is enabled at NAPI level.

The issue can be reproduced on virtio-net device using qemu. To
reproduce the issue run following,

  echo 1 > /sys/class/net/threaded
  ethtool -L eth0 combined 1

Fixes: 2677010e7793 ("Add support to set NAPI threaded for individual NAPI")
Signed-off-by: Samiullah Khawaja <skhawaja@google.com>

---
 net/core/dev.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 621a639aeba1..1cab04b8f60a 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -7244,7 +7244,8 @@ static void napi_restore_config(struct napi_struct *n)
 		n->config->napi_id = n->napi_id;
 	}
 
-	WARN_ON_ONCE(napi_set_threaded(n, n->config->threaded));
+	if (n->config->threaded)
+		WARN_ON_ONCE(napi_set_threaded(n, n->config->threaded));
 }
 
 static void napi_save_config(struct napi_struct *n)

base-commit: c3886ccaadf8fdc2c91bfbdcdca36ccdc6ef8f70
-- 
2.50.0.727.gbf7dc18ff4-goog


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH net-next] net: Restore napi threaded state only when it is enabled
  2025-07-21 23:36 [PATCH net-next] net: Restore napi threaded state only when it is enabled Samiullah Khawaja
@ 2025-07-23  1:36 ` Jakub Kicinski
  2025-07-23  2:36   ` Samiullah Khawaja
  0 siblings, 1 reply; 5+ messages in thread
From: Jakub Kicinski @ 2025-07-23  1:36 UTC (permalink / raw)
  To: Samiullah Khawaja
  Cc: David S . Miller , Eric Dumazet, Paolo Abeni, almasrymina,
	willemb, netdev

On Mon, 21 Jul 2025 23:36:08 +0000 Samiullah Khawaja wrote:
> Commit 2677010e7793 ("Add support to set NAPI threaded for individual
> NAPI") added support to enable/disable threaded napi using netlink. This
> also extended the napi config save/restore functionality to set the napi
> threaded state. This breaks the netdev reset when threaded napi is

"This breaks the netdev reset" is very vague.

> enabled at device level as the napi_restore_config tries to stop the
> kthreads as napi->config->thread is false when threaded is enabled at
> device level.

My reading of the commit message is that the WARN triggers, but
looking at the code I think you mean that we fail to update the
config when we set at the device level?

> The napi_restore_config should only restore the napi threaded state when
> threaded is enabled at NAPI level.
> 
> The issue can be reproduced on virtio-net device using qemu. To
> reproduce the issue run following,
> 
>   echo 1 > /sys/class/net/threaded
>   ethtool -L eth0 combined 1

Maybe we should add that as a test under tools/testing/drivers/net -
it will run against netdevsim but also all drivers we test which
currently means virtio and fbnic, but hopefully soon more. Up to you.

I'm not sure I agree with the semantics, tho. IIUC you're basically
making the code prefer threaded option. If user enables threading
for the device, then disables it for a NAPI - reconfiguration will
lose the fact that specific NAPI instance was supposed to have threading
disabled. Why not update napi->config in netif_set_threaded() instead?
-- 
pw-bot: cr

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH net-next] net: Restore napi threaded state only when it is enabled
  2025-07-23  1:36 ` Jakub Kicinski
@ 2025-07-23  2:36   ` Samiullah Khawaja
  2025-07-23 14:56     ` Jakub Kicinski
  0 siblings, 1 reply; 5+ messages in thread
From: Samiullah Khawaja @ 2025-07-23  2:36 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: David S . Miller, Eric Dumazet, Paolo Abeni, almasrymina, willemb,
	netdev

On Tue, Jul 22, 2025 at 6:36 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Mon, 21 Jul 2025 23:36:08 +0000 Samiullah Khawaja wrote:
> > Commit 2677010e7793 ("Add support to set NAPI threaded for individual
> > NAPI") added support to enable/disable threaded napi using netlink. This
> > also extended the napi config save/restore functionality to set the napi
> > threaded state. This breaks the netdev reset when threaded napi is
>
> "This breaks the netdev reset" is very vague.
Basically on netdev reset inside napi_enable when it calls
napi_restore_config it tries to stop the NAPI kthread. Since during
napi_enable, the NAPI has STATE_SCHED set on it, the stop_kthread gets
stuck waiting for the STATE_SCHED to be unset. It should not be
destroying the kthread since threaded is enabled at device level. But
I think your point below is valid, we should probably set
napi->config->threaded in netif_set_threaded.

I should add this to the commit message.
>
> > enabled at device level as the napi_restore_config tries to stop the
> > kthreads as napi->config->thread is false when threaded is enabled at
> > device level.
>
> My reading of the commit message is that the WARN triggers, but
> looking at the code I think you mean that we fail to update the
> config when we set at the device level?
>
> > The napi_restore_config should only restore the napi threaded state when
> > threaded is enabled at NAPI level.
> >
> > The issue can be reproduced on virtio-net device using qemu. To
> > reproduce the issue run following,
> >
> >   echo 1 > /sys/class/net/threaded
> >   ethtool -L eth0 combined 1
>
> Maybe we should add that as a test under tools/testing/drivers/net -
> it will run against netdevsim but also all drivers we test which
> currently means virtio and fbnic, but hopefully soon more. Up to you.
+1

I do want to add a test for it. I was thinking of extending
nl_netdev.py but that doesn't seem suitable as it only looks at the
state. I will look at the driver directory. Should I send a test with
this fix or should I send that later after the next reopen?
>
> I'm not sure I agree with the semantics, tho. IIUC you're basically
> making the code prefer threaded option. If user enables threading
> for the device, then disables it for a NAPI - reconfiguration will
> lose the fact that specific NAPI instance was supposed to have threading
> disabled. Why not update napi->config in netif_set_threaded() instead?
I think this discrepancy is orthogonal to the stopping of threads in
restore. This is because the napi_enable is calling restore_config and
then setting the STATE_THREADED bit on napis based on dev->threaded.
Even if restore unsets the THREADED bit (which would have already been
unset during napi_disable), it will be set again based on
dev->threaded.

I think napi_restore_config should be called after setting up STATE bits?
> --
> pw-bot: cr

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH net-next] net: Restore napi threaded state only when it is enabled
  2025-07-23  2:36   ` Samiullah Khawaja
@ 2025-07-23 14:56     ` Jakub Kicinski
  2025-07-29 19:30       ` Samiullah Khawaja
  0 siblings, 1 reply; 5+ messages in thread
From: Jakub Kicinski @ 2025-07-23 14:56 UTC (permalink / raw)
  To: Samiullah Khawaja
  Cc: David S . Miller, Eric Dumazet, Paolo Abeni, almasrymina, willemb,
	netdev, Joe Damato

On Tue, 22 Jul 2025 19:36:31 -0700 Samiullah Khawaja wrote:
> On Tue, Jul 22, 2025 at 6:36 PM Jakub Kicinski <kuba@kernel.org> wrote:
> >
> > On Mon, 21 Jul 2025 23:36:08 +0000 Samiullah Khawaja wrote:  
> > > Commit 2677010e7793 ("Add support to set NAPI threaded for individual
> > > NAPI") added support to enable/disable threaded napi using netlink. This
> > > also extended the napi config save/restore functionality to set the napi
> > > threaded state. This breaks the netdev reset when threaded napi is  
> >
> > "This breaks the netdev reset" is very vague.  
> Basically on netdev reset inside napi_enable when it calls
> napi_restore_config it tries to stop the NAPI kthread. Since during
> napi_enable, the NAPI has STATE_SCHED set on it, the stop_kthread gets
> stuck waiting for the STATE_SCHED to be unset. It should not be
> destroying the kthread since threaded is enabled at device level. But
> I think your point below is valid, we should probably set
> napi->config->threaded in netif_set_threaded.
> 
> I should add this to the commit message.

Ah, I see! The impermanence of the DISABLED bit strikes again :(

> > > enabled at device level as the napi_restore_config tries to stop the
> > > kthreads as napi->config->thread is false when threaded is enabled at
> > > device level.  
> >
> > My reading of the commit message is that the WARN triggers, but
> > looking at the code I think you mean that we fail to update the
> > config when we set at the device level?
> >  
> > > The napi_restore_config should only restore the napi threaded state when
> > > threaded is enabled at NAPI level.
> > >
> > > The issue can be reproduced on virtio-net device using qemu. To
> > > reproduce the issue run following,
> > >
> > >   echo 1 > /sys/class/net/threaded
> > >   ethtool -L eth0 combined 1  
> >
> > Maybe we should add that as a test under tools/testing/drivers/net -
> > it will run against netdevsim but also all drivers we test which
> > currently means virtio and fbnic, but hopefully soon more. Up to you.  
> +1
> 
> I do want to add a test for it. I was thinking of extending
> nl_netdev.py but that doesn't seem suitable as it only looks at the
> state. I will look at the driver directory. Should I send a test with
> this fix or should I send that later after the next reopen?

We recommend sending the patch and the fix in one series, but up to you.

> > I'm not sure I agree with the semantics, tho. IIUC you're basically
> > making the code prefer threaded option. If user enables threading
> > for the device, then disables it for a NAPI - reconfiguration will
> > lose the fact that specific NAPI instance was supposed to have threading
> > disabled. Why not update napi->config in netif_set_threaded() instead?  
> I think this discrepancy is orthogonal to the stopping of threads in
> restore. This is because the napi_enable is calling restore_config and
> then setting the STATE_THREADED bit on napis based on dev->threaded.
> Even if restore unsets the THREADED bit (which would have already been
> unset during napi_disable), it will be set again based on
> dev->threaded.
> 
> I think napi_restore_config should be called after setting up STATE bits?

Yes, I think that would work. In fact I think it makes more sense in
the first place to "list" the NAPI in the hash table accessible by user
space only after it's actually enabled.

I'd also be tempted to move some of the config restoration to
netif_napi_add(), so that we don't start the thread just to stop it.
But we'd need it in both places, IIRC netlink only holds the instance
lock, so theoretically the config may change between "add" and "enable"
:(

cc: Joe, who's probably offline, but note his non-Fastly address

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH net-next] net: Restore napi threaded state only when it is enabled
  2025-07-23 14:56     ` Jakub Kicinski
@ 2025-07-29 19:30       ` Samiullah Khawaja
  0 siblings, 0 replies; 5+ messages in thread
From: Samiullah Khawaja @ 2025-07-29 19:30 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: David S . Miller, Eric Dumazet, Paolo Abeni, almasrymina, willemb,
	netdev, Joe Damato

On Wed, Jul 23, 2025 at 7:56 AM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Tue, 22 Jul 2025 19:36:31 -0700 Samiullah Khawaja wrote:
> > On Tue, Jul 22, 2025 at 6:36 PM Jakub Kicinski <kuba@kernel.org> wrote:
> > >
> > > On Mon, 21 Jul 2025 23:36:08 +0000 Samiullah Khawaja wrote:
> > > > Commit 2677010e7793 ("Add support to set NAPI threaded for individual
> > > > NAPI") added support to enable/disable threaded napi using netlink. This
> > > > also extended the napi config save/restore functionality to set the napi
> > > > threaded state. This breaks the netdev reset when threaded napi is
> > >
> > > "This breaks the netdev reset" is very vague.
> > Basically on netdev reset inside napi_enable when it calls
> > napi_restore_config it tries to stop the NAPI kthread. Since during
> > napi_enable, the NAPI has STATE_SCHED set on it, the stop_kthread gets
> > stuck waiting for the STATE_SCHED to be unset. It should not be
> > destroying the kthread since threaded is enabled at device level. But
> > I think your point below is valid, we should probably set
> > napi->config->threaded in netif_set_threaded.
> >
> > I should add this to the commit message.
>
> Ah, I see! The impermanence of the DISABLED bit strikes again :(
>
> > > > enabled at device level as the napi_restore_config tries to stop the
> > > > kthreads as napi->config->thread is false when threaded is enabled at
> > > > device level.
> > >
> > > My reading of the commit message is that the WARN triggers, but
> > > looking at the code I think you mean that we fail to update the
> > > config when we set at the device level?
> > >
> > > > The napi_restore_config should only restore the napi threaded state when
> > > > threaded is enabled at NAPI level.
> > > >
> > > > The issue can be reproduced on virtio-net device using qemu. To
> > > > reproduce the issue run following,
> > > >
> > > >   echo 1 > /sys/class/net/threaded
> > > >   ethtool -L eth0 combined 1
> > >
> > > Maybe we should add that as a test under tools/testing/drivers/net -
> > > it will run against netdevsim but also all drivers we test which
> > > currently means virtio and fbnic, but hopefully soon more. Up to you.
> > +1
> >
> > I do want to add a test for it. I was thinking of extending
> > nl_netdev.py but that doesn't seem suitable as it only looks at the
> > state. I will look at the driver directory. Should I send a test with
> > this fix or should I send that later after the next reopen?
>
> We recommend sending the patch and the fix in one series, but up to you.
>
> > > I'm not sure I agree with the semantics, tho. IIUC you're basically
> > > making the code prefer threaded option. If user enables threading
> > > for the device, then disables it for a NAPI - reconfiguration will
> > > lose the fact that specific NAPI instance was supposed to have threading
> > > disabled. Why not update napi->config in netif_set_threaded() instead?
I will do this.
> > I think this discrepancy is orthogonal to the stopping of threads in
> > restore. This is because the napi_enable is calling restore_config and
> > then setting the STATE_THREADED bit on napis based on dev->threaded.
> > Even if restore unsets the THREADED bit (which would have already been
> > unset during napi_disable), it will be set again based on
> > dev->threaded.
Looking at this again, it seems the current mechanism should be fine
as if the napi thread was disabled for a napi then the kthread
associated with it will be destroyed also. This means that the
STATE_THREADED will not be enabled for it. I will set the threaded
config for each napi in netif_set_threaded.
> >
> > I think napi_restore_config should be called after setting up STATE bits?
This is not needed.
>
> Yes, I think that would work. In fact I think it makes more sense in
> the first place to "list" the NAPI in the hash table accessible by user
> space only after it's actually enabled.
>
> I'd also be tempted to move some of the config restoration to
> netif_napi_add(), so that we don't start the thread just to stop it.
> But we'd need it in both places, IIRC netlink only holds the instance
> lock, so theoretically the config may change between "add" and "enable"
> :(
>
> cc: Joe, who's probably offline, but note his non-Fastly address

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2025-07-29 19:31 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-07-21 23:36 [PATCH net-next] net: Restore napi threaded state only when it is enabled Samiullah Khawaja
2025-07-23  1:36 ` Jakub Kicinski
2025-07-23  2:36   ` Samiullah Khawaja
2025-07-23 14:56     ` Jakub Kicinski
2025-07-29 19:30       ` Samiullah Khawaja

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).