netdev.vger.kernel.org archive mirror
From: Jakub Kicinski <kuba@kernel.org>
To: William Liu <will@willsroot.io>
Cc: netdev@vger.kernel.org, jhs@mojatatu.com,
	xiyou.wangcong@gmail.com, pabeni@redhat.com, jiri@resnulli.us,
	davem@davemloft.net, edumazet@google.com, horms@kernel.org,
	savy@syst3mfailure.io, victor@mojatatu.com
Subject: Re: [PATCH net v4 1/2] net/sched: Fix backlog accounting in qdisc_dequeue_internal
Date: Mon, 11 Aug 2025 08:29:58 -0700
Message-ID: <20250811082958.489df3fa@kernel.org>
In-Reply-To: <n-GjVW0_1R1-ujkLgZIEgnaQKSsNtQ9-7UZiTmDCJsy1EutoUtiGOSahNSxpz2yANsp5olbxItT2X9apTC9btIRepMGAZZVBqWx6ueYE5O4=@willsroot.io>

On Sun, 10 Aug 2025 21:06:57 +0000 William Liu wrote:
> > On Sun, 27 Jul 2025 23:56:32 +0000 William Liu wrote:
> >   
> > > Special care is taken for fq_codel_dequeue to account for the
> > > qdisc_tree_reduce_backlog call in its dequeue handler. The
> > > cstats reset is moved from the end to the beginning of
> > > fq_codel_dequeue, so the change handler can use cstats for
> > > proper backlog reduction accounting purposes. The drop_len and
> > > drop_count fields are not used elsewhere so this reordering in
> > > fq_codel_dequeue is ok.  
> > 
> > 
> > Using local variables like we do in other qdiscs will not work?
> > I think your change will break drop accounting during normal dequeue?  
> 
> Can you elaborate on this? 
> 
> I just moved the reset of two cstats fields from the dequeue handler
> epilogue to the prologue. Those specific cstats fields are not used
> elsewhere so they should be fine, 

That's the disconnect. AFAICT they are passed to codel_dequeue(),
and will be used during normal dequeue, as part of normal active
queue management under traffic.
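
To make the ordering question concrete, here is a toy model of the two reset placements under discussion. It is a sketch only: it is not the fq_codel code, and every name in it (sketch_stats, sketch_report, sketch_aqm_drop and the two dequeue variants) is hypothetical. The assumption is that the drop counters are filled in during each dequeue and then reported upward once per call.

```c
#include <assert.h>

/* Hypothetical per-qdisc drop counters, loosely modelled on the
 * drop_count/drop_len pair mentioned in the thread. */
struct sketch_stats {
	unsigned int drop_count;
	unsigned int drop_len;
};

/* Hypothetical running total of what was reported to the parent. */
struct sketch_report {
	unsigned int count;
	unsigned int len;
};

/* Pretend AQM step: drops n packets of len bytes each. */
static void sketch_aqm_drop(struct sketch_stats *st,
			    unsigned int n, unsigned int len)
{
	st->drop_count += n;
	st->drop_len += n * len;
}

/* Variant 1: report, then clear the counters in the epilogue. */
static void sketch_dequeue_reset_last(struct sketch_stats *st,
				      struct sketch_report *up,
				      unsigned int n, unsigned int len)
{
	assert(st && up);
	sketch_aqm_drop(st, n, len);
	if (st->drop_count) {
		up->count += st->drop_count;
		up->len += st->drop_len;
		st->drop_count = 0;
		st->drop_len = 0;
	}
}

/* Variant 2: clear the counters in the prologue, report at the end.
 * The counters of the most recent call stay readable afterwards. */
static void sketch_dequeue_reset_first(struct sketch_stats *st,
				       struct sketch_report *up,
				       unsigned int n, unsigned int len)
{
	assert(st && up);
	st->drop_count = 0;
	st->drop_len = 0;
	sketch_aqm_drop(st, n, len);
	if (st->drop_count) {
		up->count += st->drop_count;
		up->len += st->drop_len;
	}
}
```

In this toy both variants report the same totals upward; the only visible difference is that after the second variant returns, the caller can still read the drops of the last call. Whether the real codel_dequeue() path tolerates the reordering is the open question in this thread, and the sketch does not settle it.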

> but we need to accumulate their
> values during limit adjustment. Otherwise the limit adjustment loop
> could perform erroneous accounting in the final
> qdisc_tree_reduce_backlog because the dequeue path could have already
> triggered qdisc_tree_reduce_backlog calls.
>
> > > diff --git a/net/sched/sch_fq.c b/net/sched/sch_fq.c
> > > index 902ff5470607..986e71e3362c 100644
> > > --- a/net/sched/sch_fq.c
> > > +++ b/net/sched/sch_fq.c
> > > @@ -1014,10 +1014,10 @@ static int fq_change(struct Qdisc *sch, struct nlattr *opt,
> > > struct netlink_ext_ack *extack)
> > > {
> > > struct fq_sched_data *q = qdisc_priv(sch);
> > > + unsigned int prev_qlen, prev_backlog;
> > > struct nlattr *tb[TCA_FQ_MAX + 1];
> > > - int err, drop_count = 0;
> > > - unsigned drop_len = 0;
> > > u32 fq_log;
> > > + int err;
> > > 
> > > err = nla_parse_nested_deprecated(tb, TCA_FQ_MAX, opt, fq_policy,
> > > NULL);
> > > @@ -1135,16 +1135,16 @@ static int fq_change(struct Qdisc *sch, struct nlattr *opt,
> > > err = fq_resize(sch, fq_log);
> > > sch_tree_lock(sch);
> > > }
> > > +
> > > + prev_qlen = sch->q.qlen;
> > > + prev_backlog = sch->qstats.backlog;
> > > while (sch->q.qlen > sch->limit) {
> > > struct sk_buff *skb = qdisc_dequeue_internal(sch, false);
> > > 
> > > - if (!skb)
> > > - break;  
> > 
> > 
> > The break condition is removed to align the code across the qdiscs?  
> 
> That break is no longer needed because qdisc_dequeue_internal handles
> all the length and backlog size adjustments. The check existed there
> because of the qdisc_pkt_len call.

Ack, tho, theoretically the break also prevents an infinite loop.
Change is fine, but worth calling this out in the commit message,
I reckon.
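
A minimal userspace sketch of the infinite-loop concern, assuming a toy queue whose packet counter can disagree with its list. The types and helpers (toy_pkt, toy_qdisc, toy_dequeue, toy_trim) are hypothetical stand-ins, not the kernel's struct Qdisc or qdisc_dequeue_internal():

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for an skb and a qdisc. */
struct toy_pkt {
	unsigned int len;
	struct toy_pkt *next;
};

struct toy_qdisc {
	struct toy_pkt *head;
	unsigned int qlen;	/* packet count the trim loop tests */
	unsigned int backlog;	/* bytes queued */
	unsigned int limit;	/* maximum packets allowed */
};

/* Pop the head packet and keep qlen/backlog in sync.
 * Returns NULL when the list is empty, whatever qlen claims. */
static struct toy_pkt *toy_dequeue(struct toy_qdisc *q)
{
	struct toy_pkt *p = q->head;

	if (!p)
		return NULL;
	assert(q->backlog >= p->len);
	q->head = p->next;
	p->next = NULL;
	q->qlen--;
	q->backlog -= p->len;
	return p;
}

/* Trim the queue down to its limit; returns packets removed. */
static unsigned int toy_trim(struct toy_qdisc *q)
{
	unsigned int dropped = 0;

	while (q->qlen > q->limit) {
		struct toy_pkt *p = toy_dequeue(q);

		/* Without this guard, a stale qlen with an empty
		 * list would keep the loop spinning forever. */
		if (!p)
			break;
		dropped++;
	}
	return dropped;
}
```

With consistent counters the guard never fires. It only matters if qlen is ever larger than the number of packets the dequeue can actually return.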

> > > - drop_len += qdisc_pkt_len(skb);
> > > rtnl_kfree_skbs(skb, skb);
> > > - drop_count++;
> > > }
> > > - qdisc_tree_reduce_backlog(sch, drop_count, drop_len);
> > > + qdisc_tree_reduce_backlog(sch, prev_qlen - sch->q.qlen,
> > > + prev_backlog - sch->qstats.backlog);  
> > 
> > 
> > There is no real change in the math here, right?
> > Again, you're just changing this to align across the qdiscs?  
> 
> Yep, aside from using a properly updated qlen and backlog from the
> revamped qdisc_dequeue_internal.

Personal preference, but my choice would be to follow the FQ code,
and count the skbs as they are freed. But up to you: since we hold
the lock, the changes to backlog can supposedly only be due to our
purging.
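
For comparison, a toy sketch of the two accounting styles: counting packets as they are freed versus taking the difference of qlen and backlog snapshots. All names here (demo_pkt, demo_qdisc, demo_drops and the demo_* helpers) are hypothetical, and the sketch assumes the dequeue helper keeps qlen and backlog in sync and that nothing else touches the queue while it is being trimmed:

```c
#include <assert.h>
#include <stddef.h>

struct demo_pkt {
	unsigned int len;
	struct demo_pkt *next;
};

struct demo_qdisc {
	struct demo_pkt *head;
	unsigned int qlen;
	unsigned int backlog;
	unsigned int limit;
};

/* What would be handed to a reduce-backlog style call. */
struct demo_drops {
	unsigned int count;
	unsigned int len;
};

/* Pop the head packet, updating qlen and backlog. */
static struct demo_pkt *demo_dequeue(struct demo_qdisc *q)
{
	struct demo_pkt *p = q->head;

	if (!p)
		return NULL;
	assert(q->backlog >= p->len);
	q->head = p->next;
	p->next = NULL;
	q->qlen--;
	q->backlog -= p->len;
	return p;
}

/* Style A: count each packet as it is removed. */
static struct demo_drops demo_trim_counting(struct demo_qdisc *q)
{
	struct demo_drops d = { 0, 0 };

	while (q->qlen > q->limit) {
		struct demo_pkt *p = demo_dequeue(q);

		if (!p)
			break;
		d.len += p->len;
		d.count++;
	}
	return d;
}

/* Style B: snapshot the counters, trim, then take the difference. */
static struct demo_drops demo_trim_snapshot(struct demo_qdisc *q)
{
	unsigned int prev_qlen = q->qlen;
	unsigned int prev_backlog = q->backlog;
	struct demo_drops d;

	while (q->qlen > q->limit) {
		if (!demo_dequeue(q))
			break;
	}
	d.count = prev_qlen - q->qlen;
	d.len = prev_backlog - q->backlog;
	return d;
}
```

Under those assumptions the two styles produce the same numbers. They only diverge if something other than the trim loop changes qlen or backlog between the snapshot and the final subtraction.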

Thread overview: 14+ messages
2025-07-27 23:56 [PATCH net v4 1/2] net/sched: Fix backlog accounting in qdisc_dequeue_internal William Liu
2025-07-27 23:57 ` [PATCH net v4 2/2] selftests/tc-testing: Check backlog stats in gso_skb case William Liu
2025-07-30 17:54   ` Cong Wang
2025-07-30 17:59     ` William Liu
2025-08-08 21:27 ` [PATCH net v4 1/2] net/sched: Fix backlog accounting in qdisc_dequeue_internal Jakub Kicinski
2025-08-10 21:06   ` William Liu
2025-08-11 15:29     ` Jakub Kicinski [this message]
2025-08-11 16:52       ` William Liu
2025-08-11 17:24         ` Jakub Kicinski
2025-08-11 17:51           ` William Liu
2025-08-12  0:51             ` Jakub Kicinski
2025-08-12  2:10               ` William Liu
2025-08-12 14:38                 ` Jakub Kicinski
2025-08-12 16:59                   ` William Liu
