Netdev List
* [PATCH net] net/sched: re-enable queue reset on root qdisc graft
From: Jiayuan Chen @ 2026-05-14  3:12 UTC
  To: netdev
  Cc: Jiayuan Chen, syzbot+9744ccaabe337c6fb123, Jamal Hadi Salim,
	Jiri Pirko, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Simon Horman, Toke Høiland-Jørgensen,
	Victor Nogueira, linux-kernel

Commit 47e8dbb6e763 ("net/sched: do not reset queues in graft
operations") changed dev_deactivate() in qdisc_graft() from
reset_needed=true to false. This was the right call for graft paths
where the new qdisc has an ->attach op (e.g. mq): the new root
takes over per-tx-queue state via attach, and a blanket reset would
needlessly drop packets in unrelated leaves on every graft.

For the path where the new qdisc has no ->attach (e.g. HTB, sfq
as root, or qdisc_graft() called for deletion with new == NULL), the
old root subtree is going to be torn down anyway: every leaf will be
freed shortly via __qdisc_destroy(). Skipping the early reset there
provides no benefit, but it leaves leaf qdiscs with their queues
intact during the window between rcu_assign_pointer(dev->qdisc, new)
and the per-leaf sfq_destroy()/timer_delete_sync(). If a leaf has a
self-armed timer that walks the parent chain (sfq_perturbation ->
sfq_rehash -> qdisc_tree_reduce_backlog), the timer can fire after the
old root has been swapped out, find dev->qdisc no longer matching,
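and trigger WARN_ON_ONCE(parentid != TC_H_ROOT).

Restore the queue reset for the no-attach and deletion paths, and keep
skipping it only when a non-ingress graft installs a root that has an
->attach op.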

Reported-by: syzbot+9744ccaabe337c6fb123@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/6a0175e0.a00a0220.1c3806.0016.GAE@google.com/T/
Fixes: 47e8dbb6e763 ("net/sched: do not reset queues in graft operations")
Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
---
 net/sched/sch_api.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)
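
The decision the diff below encodes can be sketched as a small
standalone C model. This is only an illustration under stated
assumptions: the struct and function names (qdisc_model,
qdisc_ops_model, root_uses_attach, graft_reset_needed) are hypothetical
stand-ins and not kernel APIs; only the condition itself is taken from
the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; only the fields that
 * the graft decision reads are modelled here.
 */
struct qdisc_ops_model {
	void (*attach)(void);	/* non-NULL for mq-style roots */
};

struct qdisc_model {
	const struct qdisc_ops_model *ops;
};

/* True when the new root takes over per-tx-queue state via ->attach.
 * This mirrors the "new && new->ops->attach && !ingress" test in the
 * patch.
 */
static bool root_uses_attach(const struct qdisc_model *new, bool ingress)
{
	return new && new->ops->attach && !ingress;
}

/* Value the patched code would pass as the second argument of
 * dev_deactivate(): reset unless the attach path is taken.
 */
static bool graft_reset_needed(const struct qdisc_model *new, bool ingress)
{
	return !root_uses_attach(new, ingress);
}

static void dummy_attach(void)
{
}

/* Two example roots: one with an attach op, one without. */
static const struct qdisc_ops_model mq_like_ops = { .attach = dummy_attach };
static const struct qdisc_ops_model htb_like_ops = { .attach = NULL };
static const struct qdisc_model mq_like = { .ops = &mq_like_ops };
static const struct qdisc_model htb_like = { .ops = &htb_like_ops };
```

In this model, graft_reset_needed(&mq_like, false) is the only case
that yields false; a root without ->attach, an ingress graft, and
deletion (new == NULL) all yield true.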

diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
index 6f7847c5536f..932cd1144b2b 100644
--- a/net/sched/sch_api.c
+++ b/net/sched/sch_api.c
@@ -1097,6 +1097,7 @@ static int qdisc_graft(struct net_device *dev, struct Qdisc *parent,
 	struct net *net = dev_net(dev);
 
 	if (parent == NULL) {
+		bool need_skip = false;
 		unsigned int i, num_q, ingress;
 		struct netdev_queue *dev_queue;
 
@@ -1123,12 +1124,15 @@ static int qdisc_graft(struct net_device *dev, struct Qdisc *parent,
 			}
 		}
 
+		if (new && new->ops->attach && !ingress)
+			need_skip = true;
+
 		if (dev->flags & IFF_UP)
-			dev_deactivate(dev, false);
+			dev_deactivate(dev, !need_skip);
 
 		qdisc_offload_graft_root(dev, new, old, extack);
 
-		if (new && new->ops->attach && !ingress)
+		if (need_skip)
 			goto skip;
 
 		if (!ingress) {
-- 
2.43.0



* Re: [PATCH net] net/sched: re-enable queue reset on root qdisc graft
From: Jakub Kicinski @ 2026-05-19  1:16 UTC
  To: Jiayuan Chen
  Cc: netdev, syzbot+9744ccaabe337c6fb123, Jamal Hadi Salim, Jiri Pirko,
	David S. Miller, Eric Dumazet, Paolo Abeni, Simon Horman,
	Toke Høiland-Jørgensen, Victor Nogueira, linux-kernel

On Thu, 14 May 2026 11:12:41 +0800 Jiayuan Chen wrote:
> Commit 47e8dbb6e763 ("net/sched: do not reset queues in graft
> operations") changed dev_deactivate() in qdisc_graft() from
> reset_needed=true to false. This was the right call for graft paths
> where the new qdisc has an ->attach op (e.g. mq): the new root
> takes over per-tx-queue state via attach, and a blanket reset would
> needlessly drop packets in unrelated leaves on every graft.
> 
> For the path where the new qdisc has no ->attach (e.g. HTB, sfq
> as root, or qdisc_graft() called for deletion with new == NULL), the
> old root subtree is going to be torn down anyway: every leaf will be
> freed shortly via __qdisc_destroy(). Skipping the early reset there
> provides no benefit, but it leaves leaf qdiscs with their queues
> intact during the window between rcu_assign_pointer(dev->qdisc, new)
> and the per-leaf sfq_destroy()/timer_delete_sync(). If a leaf has a
> self-armed timer that walks the parent chain (sfq_perturbation ->
> sfq_rehash -> qdisc_tree_reduce_backlog), the timer can fire after the
> old root has been swapped out, find dev->qdisc no longer matching,
> and trigger WARN_ON_ONCE(parentid != TC_H_ROOT).

Is not resetting root really worth the extra code?
We care about mq not resetting for each child N times
more than we care about the mq itself?

If we do care: you're breaking reverse xmas tree ordering,
and "need_skip" is quite a poor choice of name for
readability.
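
For readers unfamiliar with the convention: "reverse xmas tree" means
ordering local variable declarations from the longest line to the
shortest. A minimal sketch of the same block with the new flag declared
last might look like the following. The names attach_root,
netdev_queue_model and graft_reset_needed_rxt are hypothetical and
chosen only for illustration; they are not proposed anywhere in this
thread.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical placeholder for struct netdev_queue. */
struct netdev_queue_model {
	int unused;
};

/* Models the declaration block touched by the patch. The flag is named
 * attach_root here as an illustrative alternative to need_skip.
 */
static bool graft_reset_needed_rxt(bool new_has_attach,
				   unsigned int ingress_arg)
{
	/* Declarations sorted longest line first, shortest last. */
	struct netdev_queue_model *dev_queue = NULL;
	unsigned int i = 0, num_q = 0, ingress;
	bool attach_root = false;

	ingress = ingress_arg;

	/* Not used by this sketch; declared only to show the ordering. */
	(void)dev_queue;
	(void)i;
	(void)num_q;

	if (new_has_attach && !ingress)
		attach_root = true;

	/* Reset the queues unless the attach path handles them. */
	return !attach_root;
}
```

The patch as posted declares the bool first, ahead of two longer
declaration lines, which is what breaks the ordering.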

> diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
> index 6f7847c5536f..932cd1144b2b 100644
> --- a/net/sched/sch_api.c
> +++ b/net/sched/sch_api.c
> @@ -1097,6 +1097,7 @@ static int qdisc_graft(struct net_device *dev, struct Qdisc *parent,
>  	struct net *net = dev_net(dev);
>  
>  	if (parent == NULL) {
> +		bool need_skip = false;
>  		unsigned int i, num_q, ingress;
>  		struct netdev_queue *dev_queue;
>  
> @@ -1123,12 +1124,15 @@ static int qdisc_graft(struct net_device *dev, struct Qdisc *parent,
>  			}
>  		}
>  
> +		if (new && new->ops->attach && !ingress)
> +			need_skip = true;
> +
>  		if (dev->flags & IFF_UP)
> -			dev_deactivate(dev, false);
> +			dev_deactivate(dev, !need_skip);
>  
>  		qdisc_offload_graft_root(dev, new, old, extack);
>  
> -		if (new && new->ops->attach && !ingress)
> +		if (need_skip)
>  			goto skip;
>  
>  		if (!ingress) {

