public inbox for netdev@vger.kernel.org
 help / color / mirror / Atom feed
* [bug report] net/sched: sch_qfq: Fix race condition on qfq_aggregate
@ 2025-07-17 16:51 Dan Carpenter
  2025-07-17 20:06 ` Xiang Mei
  0 siblings, 1 reply; 3+ messages in thread
From: Dan Carpenter @ 2025-07-17 16:51 UTC (permalink / raw)
  To: Xiang Mei; +Cc: netdev

Hello Xiang Mei,

Commit 5e28d5a3f774 ("net/sched: sch_qfq: Fix race condition on
qfq_aggregate") from Jul 10, 2025 (linux-next), leads to the
following Smatch static checker warning:

	net/sched/sch_generic.c:1107 qdisc_put()
	warn: sleeping in atomic context

   547  static int qfq_delete_class(struct Qdisc *sch, unsigned long arg,
   548                              struct netlink_ext_ack *extack)
   549  {
   550          struct qfq_sched *q = qdisc_priv(sch);
   551          struct qfq_class *cl = (struct qfq_class *)arg;
   552  
   553          if (qdisc_class_in_use(&cl->common)) {
   554                  NL_SET_ERR_MSG_MOD(extack, "QFQ class in use");
   555                  return -EBUSY;
   556          }
   557  
   558          sch_tree_lock(sch);
   559  
   560          qdisc_purge_queue(cl->qdisc);
   561          qdisc_class_hash_remove(&q->clhash, &cl->common);
   562          qfq_destroy_class(sch, cl);
                ^^^^^^^^^^^^^^^^^
We used to unlock first and then did the destroy but the patch moved
this qfq_destroy_class() under the sch_tree_unlock() to solve a race
condition.  Unfortunately, it introduces a sleeping in atomic context.

   563  
   564          sch_tree_unlock(sch);
   565  
   566          return 0;
   567  }

The call tree is:

qfq_delete_class() <- disables preempt
-> qfq_destroy_class()
   -> qdisc_put() <- sleeps

net/sched/sch_generic.c
    1098 void qdisc_put(struct Qdisc *qdisc)
    1099 {
    1100         if (!qdisc)
    1101                 return;
    1102 
    1103         if (qdisc->flags & TCQ_F_BUILTIN ||
    1104             !refcount_dec_and_test(&qdisc->refcnt))
    1105                 return;
    1106 
--> 1107         __qdisc_destroy(qdisc);

It's the lockdep_unregister_key() call which sleeps.

    1108 }

regards,
dan carpenter

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [bug report] net/sched: sch_qfq: Fix race condition on qfq_aggregate
  2025-07-17 16:51 [bug report] net/sched: sch_qfq: Fix race condition on qfq_aggregate Dan Carpenter
@ 2025-07-17 20:06 ` Xiang Mei
  2025-07-17 20:44   ` Dan Carpenter
  0 siblings, 1 reply; 3+ messages in thread
From: Xiang Mei @ 2025-07-17 20:06 UTC (permalink / raw)
  To: Dan Carpenter; +Cc: netdev

On Thu, Jul 17, 2025 at 11:51:43AM -0500, Dan Carpenter wrote:
> Hello Xiang Mei,
> 
> Commit 5e28d5a3f774 ("net/sched: sch_qfq: Fix race condition on
> qfq_aggregate") from Jul 10, 2025 (linux-next), leads to the
> following Smatch static checker warning:
> 
> 	net/sched/sch_generic.c:1107 qdisc_put()
> 	warn: sleeping in atomic context
> 
>    547  static int qfq_delete_class(struct Qdisc *sch, unsigned long arg,
>    548                              struct netlink_ext_ack *extack)
>    549  {
>    550          struct qfq_sched *q = qdisc_priv(sch);
>    551          struct qfq_class *cl = (struct qfq_class *)arg;
>    552  
>    553          if (qdisc_class_in_use(&cl->common)) {
>    554                  NL_SET_ERR_MSG_MOD(extack, "QFQ class in use");
>    555                  return -EBUSY;
>    556          }
>    557  
>    558          sch_tree_lock(sch);
>    559  
>    560          qdisc_purge_queue(cl->qdisc);
>    561          qdisc_class_hash_remove(&q->clhash, &cl->common);
>    562          qfq_destroy_class(sch, cl);
>                 ^^^^^^^^^^^^^^^^^
> We used to unlock first and then did the destroy but the patch moved
> this qfq_destroy_class() under the sch_tree_unlock() to solve a race
> condition.  Unfortunately, it introduces a sleeping in atomic context.
> 
>    563  
>    564          sch_tree_unlock(sch);
>    565  
>    566          return 0;
>    567  }
> 
> The call tree is:
> 
> qfq_delete_class() <- disables preempt
> -> qfq_destroy_class()
>    -> qdisc_put() <- sleeps
> 
> net/sched/sch_generic.c
>     1098 void qdisc_put(struct Qdisc *qdisc)
>     1099 {
>     1100         if (!qdisc)
>     1101                 return;
>     1102 
>     1103         if (qdisc->flags & TCQ_F_BUILTIN ||
>     1104             !refcount_dec_and_test(&qdisc->refcnt))
>     1105                 return;
>     1106 
> --> 1107         __qdisc_destroy(qdisc);
> 
> It's the lockdep_unregister_key() call which sleeps.
> 
>     1108 }
> 
> regards,
> dan carpenter
Thanks Dan for the explanations. 

What do you think about this solution: We split qfq_destory_class to two 
parts: qfq_rm_from_agg(q, cl) and the left calls. Since the race condition
is about agg, we can keep the left calls out of the lock but moving 
qfq_rm_from_agg into the lock.

This could avoid calling __qdisc_destroy in the lock. Please let me know 
if it works, I can help to deliever a new version of patch.

Best,
Xiang

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [bug report] net/sched: sch_qfq: Fix race condition on qfq_aggregate
  2025-07-17 20:06 ` Xiang Mei
@ 2025-07-17 20:44   ` Dan Carpenter
  0 siblings, 0 replies; 3+ messages in thread
From: Dan Carpenter @ 2025-07-17 20:44 UTC (permalink / raw)
  To: Xiang Mei; +Cc: netdev

On Thu, Jul 17, 2025 at 01:06:15PM -0700, Xiang Mei wrote:
> Thanks Dan for the explanations. 
> 
> What do you think about this solution: We split qfq_destory_class to two 
> parts: qfq_rm_from_agg(q, cl) and the left calls. Since the race condition
> is about agg, we can keep the left calls out of the lock but moving 
> qfq_rm_from_agg into the lock.
> 
> This could avoid calling __qdisc_destroy in the lock. Please let me know 
> if it works, I can help to deliever a new version of patch.

What you're saying sounds reasonable to me, but I don't know this code
at all so my opinions on this are a bit worthless.

The patch was already merged into the net tree so you can't send a new
version.  It has a be a fixup patch based on top of the current tree
with a Fixes tag etc.

regards,
dan carpenter

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2025-07-17 20:45 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-07-17 16:51 [bug report] net/sched: sch_qfq: Fix race condition on qfq_aggregate Dan Carpenter
2025-07-17 20:06 ` Xiang Mei
2025-07-17 20:44   ` Dan Carpenter

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox