All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net v2 1/1] net: sched: ets: avoid deficit wrap and bound empty dequeue  rounds
@ 2026-06-26  8:32 Ren Wei
  2026-06-26  9:54 ` Jamal Hadi Salim
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Ren Wei @ 2026-06-26  8:32 UTC (permalink / raw)
  To: netdev
  Cc: jhs, jiri, davem, petrm, yuantan098, yifanwucs, tomapufckgml,
	zcliangcn, bird, bronzed_45_vested, n05ec

From: Wyatt Feng <bronzed_45_vested@icloud.com>

ETS keeps each DRR-style deficit in a u32 and replenishes it with
the configured quantum whenever the head packet is too large. Both
the quantum and qdisc_pkt_len() are user-controlled inputs: a large
quantum can wrap the deficit counter, while a tiny quantum combined
with an inflated qdisc_pkt_len() can force billions of iterations in
softirq context before any packet becomes eligible.

Store the deficit in u64 so replenishment cannot wrap the counter.
This keeps the existing dequeue logic unchanged while fixing the
overflow condition.

Bound one dequeue attempt to at most nbands * 2 ETS rotations, as
suggested in review. This avoids the livelock without adding heavier
logic to the fast path.

Fixes: dcc68b4d8084 ("net: sch_ets: Add a new Qdisc")
Cc: stable@vger.kernel.org
Reported-by: Yuan Tan <yuantan098@gmail.com>
Reported-by: Yifan Wu <yifanwucs@gmail.com>
Reported-by: Juefei Pu <tomapufckgml@gmail.com>
Reported-by: Zhengchuan Liang <zcliangcn@gmail.com>
Reported-by: Xin Liu <bird@lzu.edu.cn>
Suggested-by: Jamal Hadi Salim <jhs@mojatatu.com>
Assisted-by: Codex:GPT-5.4
Signed-off-by: Wyatt Feng <bronzed_45_vested@icloud.com>
Signed-off-by: Ren Wei <n05ec@lzu.edu.cn>
---
changes in v2:
  - Instead of doing a div() in the fast path, simply bound the loop per
    dequeue
  - v1 Link: https://lore.kernel.org/all/20260615103759.2404228-2-n05ec@lzu.edu.cn/


 net/sched/sch_ets.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/net/sched/sch_ets.c b/net/sched/sch_ets.c
index cb8cf437ce87..12a156ccb0a6 100644
--- a/net/sched/sch_ets.c
+++ b/net/sched/sch_ets.c
@@ -40,7 +40,7 @@ struct ets_class {
 	struct list_head alist; /* In struct ets_sched.active. */
 	struct Qdisc *qdisc;
 	u32 quantum;
-	u32 deficit;
+	u64 deficit;
 	struct gnet_stats_basic_sync bstats;
 	struct gnet_stats_queue qstats;
 };
@@ -463,6 +463,8 @@ ets_qdisc_dequeue_skb(struct Qdisc *sch, struct sk_buff *skb)
 static struct sk_buff *ets_qdisc_dequeue(struct Qdisc *sch)
 {
 	struct ets_sched *q = qdisc_priv(sch);
+	unsigned int max_loops = READ_ONCE(q->nbands) * 2;
+	unsigned int loops = 0;
 	struct ets_class *cl;
 	struct sk_buff *skb;
 	unsigned int band;
@@ -499,6 +501,8 @@ static struct sk_buff *ets_qdisc_dequeue(struct Qdisc *sch)
 
 		cl->deficit += READ_ONCE(cl->quantum);
 		list_move_tail(&cl->alist, &q->active);
+		if (++loops > max_loops)
+			goto out;
 	}
 out:
 	return NULL;
-- 
2.47.3


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH net v2 1/1] net: sched: ets: avoid deficit wrap and bound empty dequeue rounds
  2026-06-26  8:32 [PATCH net v2 1/1] net: sched: ets: avoid deficit wrap and bound empty dequeue rounds Ren Wei
@ 2026-06-26  9:54 ` Jamal Hadi Salim
  2026-06-26 17:34   ` Yuan Tan
  2026-06-27 22:14 ` Jakub Kicinski
  2026-06-27 22:15 ` Jakub Kicinski
  2 siblings, 1 reply; 5+ messages in thread
From: Jamal Hadi Salim @ 2026-06-26  9:54 UTC (permalink / raw)
  To: Ren Wei
  Cc: netdev, jiri, davem, petrm, yuantan098, yifanwucs, tomapufckgml,
	zcliangcn, bird, bronzed_45_vested

On Fri, Jun 26, 2026 at 4:32 AM Ren Wei <n05ec@lzu.edu.cn> wrote:
>
> From: Wyatt Feng <bronzed_45_vested@icloud.com>
>
> ETS keeps each DRR-style deficit in a u32 and replenishes it with
> the configured quantum whenever the head packet is too large. Both
> the quantum and qdisc_pkt_len() are user-controlled inputs: a large
> quantum can wrap the deficit counter, while a tiny quantum combined
> with an inflated qdisc_pkt_len() can force billions of iterations in
> softirq context before any packet becomes eligible.
>
> Store the deficit in u64 so replenishment cannot wrap the counter.
> This keeps the existing dequeue logic unchanged while fixing the
> overflow condition.
>
> Bound one dequeue attempt to at most nbands * 2 ETS rotations, as
> suggested in review. This avoids the livelock without adding heavier
> logic to the fast path.
>
> Fixes: dcc68b4d8084 ("net: sch_ets: Add a new Qdisc")
> Cc: stable@vger.kernel.org
> Reported-by: Yuan Tan <yuantan098@gmail.com>
> Reported-by: Yifan Wu <yifanwucs@gmail.com>
> Reported-by: Juefei Pu <tomapufckgml@gmail.com>
> Reported-by: Zhengchuan Liang <zcliangcn@gmail.com>
> Reported-by: Xin Liu <bird@lzu.edu.cn>
> Suggested-by: Jamal Hadi Salim <jhs@mojatatu.com>
> Assisted-by: Codex:GPT-5.4
> Signed-off-by: Wyatt Feng <bronzed_45_vested@icloud.com>
> Signed-off-by: Ren Wei <n05ec@lzu.edu.cn>

Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>

Note, you did not Cc many maintainers. Next time, or if you have to
resend this patch make sure you Cc the stakeholders (see
scripts/get_maintainers.pl)

cheers,
jamal

> ---
> changes in v2:
>   - Instead of doing a div() in the fast path, simply bound the loop per
>     dequeue
>   - v1 Link: https://lore.kernel.org/all/20260615103759.2404228-2-n05ec@lzu.edu.cn/
>
>
>  net/sched/sch_ets.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/net/sched/sch_ets.c b/net/sched/sch_ets.c
> index cb8cf437ce87..12a156ccb0a6 100644
> --- a/net/sched/sch_ets.c
> +++ b/net/sched/sch_ets.c
> @@ -40,7 +40,7 @@ struct ets_class {
>         struct list_head alist; /* In struct ets_sched.active. */
>         struct Qdisc *qdisc;
>         u32 quantum;
> -       u32 deficit;
> +       u64 deficit;
>         struct gnet_stats_basic_sync bstats;
>         struct gnet_stats_queue qstats;
>  };
> @@ -463,6 +463,8 @@ ets_qdisc_dequeue_skb(struct Qdisc *sch, struct sk_buff *skb)
>  static struct sk_buff *ets_qdisc_dequeue(struct Qdisc *sch)
>  {
>         struct ets_sched *q = qdisc_priv(sch);
> +       unsigned int max_loops = READ_ONCE(q->nbands) * 2;
> +       unsigned int loops = 0;
>         struct ets_class *cl;
>         struct sk_buff *skb;
>         unsigned int band;
> @@ -499,6 +501,8 @@ static struct sk_buff *ets_qdisc_dequeue(struct Qdisc *sch)
>
>                 cl->deficit += READ_ONCE(cl->quantum);
>                 list_move_tail(&cl->alist, &q->active);
> +               if (++loops > max_loops)
> +                       goto out;
>         }
>  out:
>         return NULL;
> --
> 2.47.3
>

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH net v2 1/1] net: sched: ets: avoid deficit wrap and bound empty dequeue rounds
  2026-06-26  9:54 ` Jamal Hadi Salim
@ 2026-06-26 17:34   ` Yuan Tan
  0 siblings, 0 replies; 5+ messages in thread
From: Yuan Tan @ 2026-06-26 17:34 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: Ren Wei, netdev, jiri, davem, petrm, yifanwucs, tomapufckgml,
	zcliangcn, bird, bronzed_45_vested

On Fri, Jun 26, 2026 at 2:54 AM Jamal Hadi Salim <jhs@mojatatu.com> wrote:
>
> On Fri, Jun 26, 2026 at 4:32 AM Ren Wei <n05ec@lzu.edu.cn> wrote:
> >
> > From: Wyatt Feng <bronzed_45_vested@icloud.com>
> >
> > ETS keeps each DRR-style deficit in a u32 and replenishes it with
> > the configured quantum whenever the head packet is too large. Both
> > the quantum and qdisc_pkt_len() are user-controlled inputs: a large
> > quantum can wrap the deficit counter, while a tiny quantum combined
> > with an inflated qdisc_pkt_len() can force billions of iterations in
> > softirq context before any packet becomes eligible.
> >
> > Store the deficit in u64 so replenishment cannot wrap the counter.
> > This keeps the existing dequeue logic unchanged while fixing the
> > overflow condition.
> >
> > Bound one dequeue attempt to at most nbands * 2 ETS rotations, as
> > suggested in review. This avoids the livelock without adding heavier
> > logic to the fast path.
> >
> > Fixes: dcc68b4d8084 ("net: sch_ets: Add a new Qdisc")
> > Cc: stable@vger.kernel.org
> > Reported-by: Yuan Tan <yuantan098@gmail.com>
> > Reported-by: Yifan Wu <yifanwucs@gmail.com>
> > Reported-by: Juefei Pu <tomapufckgml@gmail.com>
> > Reported-by: Zhengchuan Liang <zcliangcn@gmail.com>
> > Reported-by: Xin Liu <bird@lzu.edu.cn>
> > Suggested-by: Jamal Hadi Salim <jhs@mojatatu.com>
> > Assisted-by: Codex:GPT-5.4
> > Signed-off-by: Wyatt Feng <bronzed_45_vested@icloud.com>
> > Signed-off-by: Ren Wei <n05ec@lzu.edu.cn>
>
> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
>
> Note, you did not Cc many maintainers. Next time, or if you have to
> resend this patch make sure you Cc the stakeholders (see
> scripts/get_maintainers.pl)

Thank you very much for your advice. We were previously concerned
about bothering too many maintainers, so we used the pattern-depth=1
option when running scripts/get_maintainers.pl. We’ll relax this
parameter a bit in future submissions.

>
> cheers,
> jamal
>
> > ---
> > changes in v2:
> >   - Instead of doing a div() in the fast path, simply bound the loop per
> >     dequeue
> >   - v1 Link: https://lore.kernel.org/all/20260615103759.2404228-2-n05ec@lzu.edu.cn/
> >
> >
> >  net/sched/sch_ets.c | 6 +++++-
> >  1 file changed, 5 insertions(+), 1 deletion(-)
> >
> > diff --git a/net/sched/sch_ets.c b/net/sched/sch_ets.c
> > index cb8cf437ce87..12a156ccb0a6 100644
> > --- a/net/sched/sch_ets.c
> > +++ b/net/sched/sch_ets.c
> > @@ -40,7 +40,7 @@ struct ets_class {
> >         struct list_head alist; /* In struct ets_sched.active. */
> >         struct Qdisc *qdisc;
> >         u32 quantum;
> > -       u32 deficit;
> > +       u64 deficit;
> >         struct gnet_stats_basic_sync bstats;
> >         struct gnet_stats_queue qstats;
> >  };
> > @@ -463,6 +463,8 @@ ets_qdisc_dequeue_skb(struct Qdisc *sch, struct sk_buff *skb)
> >  static struct sk_buff *ets_qdisc_dequeue(struct Qdisc *sch)
> >  {
> >         struct ets_sched *q = qdisc_priv(sch);
> > +       unsigned int max_loops = READ_ONCE(q->nbands) * 2;
> > +       unsigned int loops = 0;
> >         struct ets_class *cl;
> >         struct sk_buff *skb;
> >         unsigned int band;
> > @@ -499,6 +501,8 @@ static struct sk_buff *ets_qdisc_dequeue(struct Qdisc *sch)
> >
> >                 cl->deficit += READ_ONCE(cl->quantum);
> >                 list_move_tail(&cl->alist, &q->active);
> > +               if (++loops > max_loops)
> > +                       goto out;
> >         }
> >  out:
> >         return NULL;
> > --
> > 2.47.3
> >

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH net v2 1/1] net: sched: ets: avoid deficit wrap and bound empty dequeue  rounds
  2026-06-26  8:32 [PATCH net v2 1/1] net: sched: ets: avoid deficit wrap and bound empty dequeue rounds Ren Wei
  2026-06-26  9:54 ` Jamal Hadi Salim
@ 2026-06-27 22:14 ` Jakub Kicinski
  2026-06-27 22:15 ` Jakub Kicinski
  2 siblings, 0 replies; 5+ messages in thread
From: Jakub Kicinski @ 2026-06-27 22:14 UTC (permalink / raw)
  To: Ren Wei
  Cc: netdev, jhs, jiri, davem, petrm, yuantan098, yifanwucs,
	tomapufckgml, zcliangcn, bird, bronzed_45_vested

On Fri, 26 Jun 2026 16:32:00 +0800 Ren Wei wrote:
> From: Wyatt Feng <bronzed_45_vested@icloud.com>
> 
> ETS keeps each DRR-style deficit in a u32 and replenishes it with
> the configured quantum whenever the head packet is too large. Both
> the quantum and qdisc_pkt_len() are user-controlled inputs: a large
> quantum can wrap the deficit counter, while a tiny quantum combined
> with an inflated qdisc_pkt_len() can force billions of iterations in
> softirq context before any packet becomes eligible.

Do you mean when packet is gigabytes in size?
Where do such packets originate?

> Store the deficit in u64 so replenishment cannot wrap the counter.
> This keeps the existing dequeue logic unchanged while fixing the
> overflow condition.
> 
> Bound one dequeue attempt to at most nbands * 2 ETS rotations, as
> suggested in review. This avoids the livelock without adding heavier
> logic to the fast path.
> 
> Fixes: dcc68b4d8084 ("net: sch_ets: Add a new Qdisc")
> Cc: stable@vger.kernel.org
> Reported-by: Yuan Tan <yuantan098@gmail.com>
> Reported-by: Yifan Wu <yifanwucs@gmail.com>
> Reported-by: Juefei Pu <tomapufckgml@gmail.com>
> Reported-by: Zhengchuan Liang <zcliangcn@gmail.com>
> Reported-by: Xin Liu <bird@lzu.edu.cn>
> Suggested-by: Jamal Hadi Salim <jhs@mojatatu.com>
> Assisted-by: Codex:GPT-5.4
> Signed-off-by: Wyatt Feng <bronzed_45_vested@icloud.com>
> Signed-off-by: Ren Wei <n05ec@lzu.edu.cn>
> ---
> changes in v2:
>   - Instead of doing a div() in the fast path, simply bound the loop per
>     dequeue
>   - v1 Link: https://lore.kernel.org/all/20260615103759.2404228-2-n05ec@lzu.edu.cn/
> 
> 
>  net/sched/sch_ets.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/net/sched/sch_ets.c b/net/sched/sch_ets.c
> index cb8cf437ce87..12a156ccb0a6 100644
> --- a/net/sched/sch_ets.c
> +++ b/net/sched/sch_ets.c
> @@ -40,7 +40,7 @@ struct ets_class {
>  	struct list_head alist; /* In struct ets_sched.active. */
>  	struct Qdisc *qdisc;
>  	u32 quantum;
> -	u32 deficit;
> +	u64 deficit;
>  	struct gnet_stats_basic_sync bstats;
>  	struct gnet_stats_queue qstats;
>  };
> @@ -463,6 +463,8 @@ ets_qdisc_dequeue_skb(struct Qdisc *sch, struct sk_buff *skb)
>  static struct sk_buff *ets_qdisc_dequeue(struct Qdisc *sch)
>  {
>  	struct ets_sched *q = qdisc_priv(sch);
> +	unsigned int max_loops = READ_ONCE(q->nbands) * 2;
> +	unsigned int loops = 0;
>  	struct ets_class *cl;
>  	struct sk_buff *skb;
>  	unsigned int band;
> @@ -499,6 +501,8 @@ static struct sk_buff *ets_qdisc_dequeue(struct Qdisc *sch)
>  
>  		cl->deficit += READ_ONCE(cl->quantum);
>  		list_move_tail(&cl->alist, &q->active);
> +		if (++loops > max_loops)
> +			goto out;
>  	}
>  out:
>  	return NULL;


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH net v2 1/1] net: sched: ets: avoid deficit wrap and bound empty dequeue  rounds
  2026-06-26  8:32 [PATCH net v2 1/1] net: sched: ets: avoid deficit wrap and bound empty dequeue rounds Ren Wei
  2026-06-26  9:54 ` Jamal Hadi Salim
  2026-06-27 22:14 ` Jakub Kicinski
@ 2026-06-27 22:15 ` Jakub Kicinski
  2 siblings, 0 replies; 5+ messages in thread
From: Jakub Kicinski @ 2026-06-27 22:15 UTC (permalink / raw)
  To: Ren Wei
  Cc: netdev, jhs, jiri, davem, petrm, yuantan098, yifanwucs,
	tomapufckgml, zcliangcn, bird, bronzed_45_vested

On Fri, 26 Jun 2026 16:32:00 +0800 Ren Wei wrote:
> @@ -499,6 +501,8 @@ static struct sk_buff *ets_qdisc_dequeue(struct Qdisc *sch)
>  
>  		cl->deficit += READ_ONCE(cl->quantum);
>  		list_move_tail(&cl->alist, &q->active);
> +		if (++loops > max_loops)
> +			goto out;

BTW sashiko says that this will permanently stall the qdisc.
-- 
pw-bot: cr

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2026-06-27 22:15 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-06-26  8:32 [PATCH net v2 1/1] net: sched: ets: avoid deficit wrap and bound empty dequeue rounds Ren Wei
2026-06-26  9:54 ` Jamal Hadi Salim
2026-06-26 17:34   ` Yuan Tan
2026-06-27 22:14 ` Jakub Kicinski
2026-06-27 22:15 ` Jakub Kicinski

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.