netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jamal Hadi Salim <jhs@mojatatu.com>
To: Lion Ackermann <nnamrec@gmail.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>,
	netdev@vger.kernel.org,  Jiri Pirko <jiri@resnulli.us>,
	Mingi Cho <mincho@theori.io>
Subject: Re: Incomplete fix for recent bug in tc / hfsc
Date: Mon, 30 Jun 2025 07:34:08 -0400	[thread overview]
Message-ID: <CAM0EoMmoQuRER=eBUO+Th02yJUYvfCKu_g7Ppcg0trnA_m6v1Q@mail.gmail.com> (raw)
In-Reply-To: <8e19395d-b6d6-47d4-9ce0-e2b59e109b2b@gmail.com>

Hi,

On Mon, Jun 30, 2025 at 5:04 AM Lion Ackermann <nnamrec@gmail.com> wrote:
>
> Hi,
>
> On 6/29/25 9:50 PM, Cong Wang wrote:
> > On Sun, Jun 29, 2025 at 10:29:44AM -0400, Jamal Hadi Salim wrote:
> >>> On "What do you think the root cause is here?"
> >>>
> >>> I believe the root cause is that qdiscs like hfsc and qfq are dropping
> >>> all packets in enqueue (mostly in relation to peek()) and that result
> >>> is not being reflected in the return code returned to its parent
> >>> qdisc.
> >>> So, in the example you described in this thread, drr is oblivious to
> >>> the fact that the child qdisc dropped its packet because the call to
> >>> its child enqueue returned NET_XMIT_SUCCESS. This causes drr to
> >>> activate a class that shouldn't have been activated at all.
> >>>
> >>> You can argue that drr (and other similar qdiscs) may detect this by
> >>> checking the call to qlen_notify (as the drr patch was
> >>> doing), but that seems really counter-intuitive. Imagine writing a new
> >>> qdisc and having to check for that every time you call a child's
> >>> enqueue. Sure  your patch solves this, but it also seems like it's not
> >>> fixing the underlying issue (which is drr activating the class in the
> >>> first place). Your patch is simply removing all the classes from their
> >>> active lists when you delete them. And your patch may seem ok for now,
> >>> but I am worried it might break something else in the future that we
> >>> are not seeing.
> >>>
> >>> And do note: All of the examples of the hierarchy I have seen so far,
> >>> that put us in this situation, are nonsensical
> >>>
> >>
> >> At this point my thinking is to apply your patch and then we discuss a
> >> longer term solution. Cong?
> >
> > I agree. If Lion's patch works, it is certainly much better as a bug fix
> > for both -net and -stable.
> >
> > Also for all of those ->qlen_notify() craziness, I think we need to
> > rethink about the architecture, _maybe_ there are better architectural
> > solutions.
> >
> > Thanks!
>
> Just for the record, I agree with all your points and as was stated this
> patch really only does damage prevention. Your proposal of preventing
> hierarchies sounds useful in the long run to keep the backlogs sane.
>
> I did run all the tdc tests on the latest net tree and they passed. Also
> my HFSC reproducer does not trigger with the proposed patch. I do not have
> a simple reproducer at hand for the QFQ tree case that you mentioned. So
> please verify this too if you can.
>
> Otherwise please feel free to go forward with the patch. If I can add
> anything else to the discussion please let me know.
>

Please post the patch formally as per Cong request. A tdc test case of
the reproducer would also help.

cheers,
jamal

  reply	other threads:[~2025-06-30 11:34 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-06-23 10:41 Incomplete fix for recent bug in tc / hfsc Lion Ackermann
2025-06-23 14:43 ` Jamal Hadi Salim
2025-06-24  4:41 ` Cong Wang
2025-06-24  9:24   ` Lion Ackermann
2025-06-24 10:43     ` Lion Ackermann
2025-06-25 14:22       ` Jamal Hadi Salim
2025-06-26  8:08         ` Lion Ackermann
2025-06-28 21:43           ` Jamal Hadi Salim
2025-06-29 14:29             ` Jamal Hadi Salim
2025-06-29 19:50               ` Cong Wang
2025-06-30  9:04                 ` Lion Ackermann
2025-06-30 11:34                   ` Jamal Hadi Salim [this message]
2025-06-30 13:36                     ` Lion Ackermann
2025-06-30 14:57                       ` Jamal Hadi Salim
2025-06-30 17:52                         ` Victor Nogueira
2025-06-30 21:42                           ` Cong Wang
2025-07-01 12:41                             ` Lion Ackermann
2025-07-01 12:58                               ` Victor Nogueira
2025-06-30 11:47               ` Victor Nogueira
2025-06-30 13:27                 ` [PATCH] net/sched: Always pass notifications when child class becomes empty Lion Ackermann
2025-06-30 14:56                   ` Jamal Hadi Salim
2025-06-30 21:38                   ` Cong Wang
2025-07-01 14:03                     ` Jamal Hadi Salim
2025-07-02 21:50                   ` patchwork-bot+netdevbpf
2025-06-28  0:35       ` Incomplete fix for recent bug in tc / hfsc Cong Wang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAM0EoMmoQuRER=eBUO+Th02yJUYvfCKu_g7Ppcg0trnA_m6v1Q@mail.gmail.com' \
    --to=jhs@mojatatu.com \
    --cc=jiri@resnulli.us \
    --cc=mincho@theori.io \
    --cc=netdev@vger.kernel.org \
    --cc=nnamrec@gmail.com \
    --cc=xiyou.wangcong@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).