* [bug report] net/sched: sch_qfq: Fix race condition on qfq_aggregate
@ 2025-07-17 16:51 Dan Carpenter
2025-07-17 20:06 ` Xiang Mei
0 siblings, 1 reply; 3+ messages in thread
From: Dan Carpenter @ 2025-07-17 16:51 UTC (permalink / raw)
To: Xiang Mei; +Cc: netdev
Hello Xiang Mei,
Commit 5e28d5a3f774 ("net/sched: sch_qfq: Fix race condition on
qfq_aggregate") from Jul 10, 2025 (linux-next), leads to the
following Smatch static checker warning:
net/sched/sch_generic.c:1107 qdisc_put()
warn: sleeping in atomic context
547 static int qfq_delete_class(struct Qdisc *sch, unsigned long arg,
548 struct netlink_ext_ack *extack)
549 {
550 struct qfq_sched *q = qdisc_priv(sch);
551 struct qfq_class *cl = (struct qfq_class *)arg;
552
553 if (qdisc_class_in_use(&cl->common)) {
554 NL_SET_ERR_MSG_MOD(extack, "QFQ class in use");
555 return -EBUSY;
556 }
557
558 sch_tree_lock(sch);
559
560 qdisc_purge_queue(cl->qdisc);
561 qdisc_class_hash_remove(&q->clhash, &cl->common);
562 qfq_destroy_class(sch, cl);
^^^^^^^^^^^^^^^^^
We used to unlock first and then did the destroy but the patch moved
this qfq_destroy_class() under the sch_tree_unlock() to solve a race
condition. Unfortunately, it introduces a sleeping in atomic context.
563
564 sch_tree_unlock(sch);
565
566 return 0;
567 }
The call tree is:
qfq_delete_class() <- disables preempt
-> qfq_destroy_class()
-> qdisc_put() <- sleeps
net/sched/sch_generic.c
1098 void qdisc_put(struct Qdisc *qdisc)
1099 {
1100 if (!qdisc)
1101 return;
1102
1103 if (qdisc->flags & TCQ_F_BUILTIN ||
1104 !refcount_dec_and_test(&qdisc->refcnt))
1105 return;
1106
--> 1107 __qdisc_destroy(qdisc);
It's the lockdep_unregister_key() call which sleeps.
1108 }
regards,
dan carpenter
^ permalink raw reply [flat|nested] 3+ messages in thread* Re: [bug report] net/sched: sch_qfq: Fix race condition on qfq_aggregate
2025-07-17 16:51 [bug report] net/sched: sch_qfq: Fix race condition on qfq_aggregate Dan Carpenter
@ 2025-07-17 20:06 ` Xiang Mei
2025-07-17 20:44 ` Dan Carpenter
0 siblings, 1 reply; 3+ messages in thread
From: Xiang Mei @ 2025-07-17 20:06 UTC (permalink / raw)
To: Dan Carpenter; +Cc: netdev
On Thu, Jul 17, 2025 at 11:51:43AM -0500, Dan Carpenter wrote:
> Hello Xiang Mei,
>
> Commit 5e28d5a3f774 ("net/sched: sch_qfq: Fix race condition on
> qfq_aggregate") from Jul 10, 2025 (linux-next), leads to the
> following Smatch static checker warning:
>
> net/sched/sch_generic.c:1107 qdisc_put()
> warn: sleeping in atomic context
>
> 547 static int qfq_delete_class(struct Qdisc *sch, unsigned long arg,
> 548 struct netlink_ext_ack *extack)
> 549 {
> 550 struct qfq_sched *q = qdisc_priv(sch);
> 551 struct qfq_class *cl = (struct qfq_class *)arg;
> 552
> 553 if (qdisc_class_in_use(&cl->common)) {
> 554 NL_SET_ERR_MSG_MOD(extack, "QFQ class in use");
> 555 return -EBUSY;
> 556 }
> 557
> 558 sch_tree_lock(sch);
> 559
> 560 qdisc_purge_queue(cl->qdisc);
> 561 qdisc_class_hash_remove(&q->clhash, &cl->common);
> 562 qfq_destroy_class(sch, cl);
> ^^^^^^^^^^^^^^^^^
> We used to unlock first and then did the destroy but the patch moved
> this qfq_destroy_class() under the sch_tree_unlock() to solve a race
> condition. Unfortunately, it introduces a sleeping in atomic context.
>
> 563
> 564 sch_tree_unlock(sch);
> 565
> 566 return 0;
> 567 }
>
> The call tree is:
>
> qfq_delete_class() <- disables preempt
> -> qfq_destroy_class()
> -> qdisc_put() <- sleeps
>
> net/sched/sch_generic.c
> 1098 void qdisc_put(struct Qdisc *qdisc)
> 1099 {
> 1100 if (!qdisc)
> 1101 return;
> 1102
> 1103 if (qdisc->flags & TCQ_F_BUILTIN ||
> 1104 !refcount_dec_and_test(&qdisc->refcnt))
> 1105 return;
> 1106
> --> 1107 __qdisc_destroy(qdisc);
>
> It's the lockdep_unregister_key() call which sleeps.
>
> 1108 }
>
> regards,
> dan carpenter
Thanks Dan for the explanations.
What do you think about this solution: We split qfq_destory_class to two
parts: qfq_rm_from_agg(q, cl) and the left calls. Since the race condition
is about agg, we can keep the left calls out of the lock but moving
qfq_rm_from_agg into the lock.
This could avoid calling __qdisc_destroy in the lock. Please let me know
if it works, I can help to deliever a new version of patch.
Best,
Xiang
^ permalink raw reply [flat|nested] 3+ messages in thread* Re: [bug report] net/sched: sch_qfq: Fix race condition on qfq_aggregate
2025-07-17 20:06 ` Xiang Mei
@ 2025-07-17 20:44 ` Dan Carpenter
0 siblings, 0 replies; 3+ messages in thread
From: Dan Carpenter @ 2025-07-17 20:44 UTC (permalink / raw)
To: Xiang Mei; +Cc: netdev
On Thu, Jul 17, 2025 at 01:06:15PM -0700, Xiang Mei wrote:
> Thanks Dan for the explanations.
>
> What do you think about this solution: We split qfq_destory_class to two
> parts: qfq_rm_from_agg(q, cl) and the left calls. Since the race condition
> is about agg, we can keep the left calls out of the lock but moving
> qfq_rm_from_agg into the lock.
>
> This could avoid calling __qdisc_destroy in the lock. Please let me know
> if it works, I can help to deliever a new version of patch.
What you're saying sounds reasonable to me, but I don't know this code
at all so my opinions on this are a bit worthless.
The patch was already merged into the net tree so you can't send a new
version. It has a be a fixup patch based on top of the current tree
with a Fixes tag etc.
regards,
dan carpenter
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2025-07-17 20:45 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-07-17 16:51 [bug report] net/sched: sch_qfq: Fix race condition on qfq_aggregate Dan Carpenter
2025-07-17 20:06 ` Xiang Mei
2025-07-17 20:44 ` Dan Carpenter
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox