From: Cong Wang <xiyou.wangcong@gmail.com>
To: William Liu <will@willsroot.io>
Cc: "netdev@vger.kernel.org" <netdev@vger.kernel.org>,
Jiri Pirko <jiri@resnulli.us>,
Jamal Hadi Salim <jhs@mojatatu.com>, Savy <savy@syst3mfailure.io>,
Paolo Abeni <pabeni@redhat.com>
Subject: Re: [BUG] Inconsistency between qlen and backlog in hhf, fq, fq_codel, and fq_pie causing WARNING in qdisc_tree_reduce_backlog
Date: Mon, 7 Jul 2025 11:51:25 -0700 [thread overview]
Message-ID: <aGwXLe8djsE0H3Ed@pop-os.localdomain> (raw)
In-Reply-To: <X6Q8WFnkbyNTRaSQ07hgoBUIihJJdm7GIDvCCY0prplSeVDZPKXquiH2as4hPXAWn1J1a2tyP5RnBm8tjKWNM881yDlMlx0pMg9vioBRY1w=@willsroot.io>
On Sun, Jul 06, 2025 at 01:43:08PM +0000, William Liu wrote:
> On Saturday, July 5th, 2025 at 1:04 AM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
>
> >
> >
> > On Wed, Jul 02, 2025 at 08:39:45PM +0000, William Liu wrote:
> >
> > > Hi,
> > >
> > > We write to report a bug in qlen and backlog consistency affecting hhf, fq, fq_codel, and fq_pie when acting as a child of tbf. The cause of this bug was introduced by the following fix last month designed to address a null dereference bug caused by gso segmentation and a temporary inconsistency in queue state when tbf peeks at its child while running out of tokens during tbf_dequeue [1]. We actually reported that bug but did not realize the subtle problem in the fix until now. We are aware of bugs with similar symptoms reported by Mingi [3] and Lion [4], but those are of a different root cause (at least what we can see of Mingi's report).
> >
> >
> > Thanks for your report.
> >
> > > This works on the upstream kernel, and we have the following reproducer.
> > >
> > > ./tc qdisc del dev lo root
> > > ./tc qdisc add dev lo root handle 1: tbf rate 8bit burst 100b latency 1ms || echo TBF
> > > ./tc qdisc add dev lo handle 3: parent 1:1 hhf limit 1000 || echo HH
> > > ping -I lo -f -c1 -s32 -W0.001 127.0.0.1 2>&1 >/dev/null
> > > ./tc qdisc change dev lo handle 3: parent 1:1 hhf limit 0 || echo HH
> > > ./tc qdisc replace dev lo handle 2: parent 1:1 sfq || echo SFQ
> > >
> > > Note that a patched version of tc that supports 0 limits must be built. The symptom of the bug arises in the WARN_ON_ONCE check in qdisc_tree_reduce_backlog [2], where n is 0. You can replace hhf with fq, fq_codel, and fq_pie to trigger warnings as well, though the success rate may vary.
> > >
> > > The root cause comes from the newly introduced function qdisc_dequeue_internal, which the change handler will trigger in the affected qdiscs [5]. When dequeuing from a non empty gso in this peek function, only qlen is decremented, and backlog is not considered. The gso insertion is triggered by qdisc_peek_dequeued, which tbf calls for these qdiscs when they are its child.
> > >
> > > When replacing the qdisc, tbf_graft triggers, and qdisc_purge_queue triggers qdisc_tree_reduce_backlog with the inconsistent values, which one can observe by adding printk to the passed qlen backlog values.
> >
> >
> > If I understand you correctly, the problem is the inconsistent behavior
> > between qdisc_purge_queue() and qdisc_dequeue_internal()? And it is
> > because the former does not take care of ->gso_skb?
> >
>
> No, there are 2 points of inconsistent behavior.
>
> 1. qdisc_dequeue_internal and qdisc_peek_dequeued. In qdisc_peek_dequeued, when a skb comes from the dequeue handler, it gets added to the gso_skb with qlen and backlog increased. In qdisc_dequeue_internal, only qlen is decreased when removing from gso.
>
Yes, because qlen is handled by qdisc_dequeue_internal()'s callers to
control their loop of sch->limit.
> 2. The dequeue limit loop in change handlers and qdisc_dequeue_internal. Every time those loops call qdisc_dequeue_internal, the loops track their own version of dropped items and total dropped packet sizes, before using those values for qdisc_tree_reduce_backlog. The hhf qdisc is an exception as it just uses a before and after loop in the limit change loop. Would this not lead to double counting in the gso_skb dequeue case for qdisc_dequeue_internal?
Right, I think hhf qdisc should align with other Qdisc's to track the
qlen.
Note: my patch didn't change its behavior, so this issue existed even
before my patch.
>
> Also, I took a look at some of the dequeue handlers (which qdisc_dequeue_internal call when direct is false), and fq_codel_dequeue touches the drop_count and drop_len statistics, which the limit adjustment loop also uses.
Right, this is why it is hard to move the qlen tracking into
qdisc_dequeue_internal().
Thanks.
next prev parent reply other threads:[~2025-07-07 18:51 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-07-02 20:39 [BUG] Inconsistency between qlen and backlog in hhf, fq, fq_codel, and fq_pie causing WARNING in qdisc_tree_reduce_backlog William Liu
2025-07-05 1:04 ` Cong Wang
2025-07-06 13:43 ` William Liu
2025-07-07 18:51 ` Cong Wang [this message]
2025-07-08 20:21 ` William Liu
2025-07-10 22:40 ` Cong Wang
2025-07-12 0:08 ` William Liu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aGwXLe8djsE0H3Ed@pop-os.localdomain \
--to=xiyou.wangcong@gmail.com \
--cc=jhs@mojatatu.com \
--cc=jiri@resnulli.us \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=savy@syst3mfailure.io \
--cc=will@willsroot.io \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox