public inbox for netdev@vger.kernel.org
 help / color / mirror / Atom feed
From: Cong Wang <xiyou.wangcong@gmail.com>
To: William Liu <will@willsroot.io>
Cc: "netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	Jiri Pirko <jiri@resnulli.us>,
	Jamal Hadi Salim <jhs@mojatatu.com>, Savy <savy@syst3mfailure.io>,
	Paolo Abeni <pabeni@redhat.com>
Subject: Re: [BUG]  Inconsistency between qlen and backlog in hhf, fq, fq_codel, and fq_pie causing WARNING in qdisc_tree_reduce_backlog
Date: Thu, 10 Jul 2025 15:40:41 -0700	[thread overview]
Message-ID: <aHBBaXPPQvBCejoM@pop-os.localdomain> (raw)
In-Reply-To: <rt_47kiO_w5I_HyL1B4RuHKclPjWvmSJqnlZwSZB5YKxStxbDAsb7lTae0yhrRLJkh9yb7JjV-LpU_xzNCd031YI23Z4Yy83u8-DgYuPsEI=@willsroot.io>

On Tue, Jul 08, 2025 at 08:21:31PM +0000, William Liu wrote:
> On Monday, July 7th, 2025 at 6:51 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
> 
> > 
> > 
> > On Sun, Jul 06, 2025 at 01:43:08PM +0000, William Liu wrote:
> > 
> > > On Saturday, July 5th, 2025 at 1:04 AM, Cong Wang xiyou.wangcong@gmail.com wrote:
> > > 
> > > > On Wed, Jul 02, 2025 at 08:39:45PM +0000, William Liu wrote:
> > > > 
> > > > > Hi,
> > > > > 
> > > > > We write to report a bug in qlen and backlog consistency affecting hhf, fq, fq_codel, and fq_pie when acting as a child of tbf. The cause of this bug was introduced by the following fix last month designed to address a null dereference bug caused by gso segmentation and a temporary inconsistency in queue state when tbf peeks at its child while running out of tokens during tbf_dequeue [1]. We actually reported that bug but did not realize the subtle problem in the fix until now. We are aware of bugs with similar symptoms reported by Mingi [3] and Lion [4], but those are of a different root cause (at least what we can see of Mingi's report).
> > > > 
> > > > Thanks for your report.
> > > > 
> > > > > This works on the upstream kernel, and we have the following reproducer.
> > > > > 
> > > > > ./tc qdisc del dev lo root
> > > > > ./tc qdisc add dev lo root handle 1: tbf rate 8bit burst 100b latency 1ms || echo TBF
> > > > > ./tc qdisc add dev lo handle 3: parent 1:1 hhf limit 1000 || echo HH
> > > > > ping -I lo -f -c1 -s32 -W0.001 127.0.0.1 2>&1 >/dev/null
> > > > > ./tc qdisc change dev lo handle 3: parent 1:1 hhf limit 0 || echo HH
> > > > > ./tc qdisc replace dev lo handle 2: parent 1:1 sfq || echo SFQ
> > > > > 
> > > > > Note that a patched version of tc that supports 0 limits must be built. The symptom of the bug arises in the WARN_ON_ONCE check in qdisc_tree_reduce_backlog [2], where n is 0. You can replace hhf with fq, fq_codel, and fq_pie to trigger warnings as well, though the success rate may vary.
> > > > > 
> > > > > The root cause comes from the newly introduced function qdisc_dequeue_internal, which the change handler will trigger in the affected qdiscs [5]. When dequeuing from a non empty gso in this peek function, only qlen is decremented, and backlog is not considered. The gso insertion is triggered by qdisc_peek_dequeued, which tbf calls for these qdiscs when they are its child.
> > > > > 
> > > > > When replacing the qdisc, tbf_graft triggers, and qdisc_purge_queue triggers qdisc_tree_reduce_backlog with the inconsistent values, which one can observe by adding printk to the passed qlen backlog values.
> > > > 
> > > > If I understand you correctly, the problem is the inconsistent behavior
> > > > between qdisc_purge_queue() and qdisc_dequeue_internal()? And it is
> > > > because the former does not take care of ->gso_skb?
> > > 
> > > No, there are 2 points of inconsistent behavior.
> > > 
> > > 1. qdisc_dequeue_internal and qdisc_peek_dequeued. In qdisc_peek_dequeued, when a skb comes from the dequeue handler, it gets added to the gso_skb with qlen and backlog increased. In qdisc_dequeue_internal, only qlen is decreased when removing from gso.
> > 
> > 
> > Yes, because qlen is handled by qdisc_dequeue_internal()'s callers to
> > control their loop of sch->limit.
> > 
> 
> This makes sense. However, if backlog is not adjusted in that helper, then they would go out of sync. qdisc_tree_reduce_backlog only adjusts counters for parent/ancestral qdiscs. 

Oh, you mean some callers miss qdisc_qstats_backlog_dec()? If you can
confirm this is an issue and adding it could solve it, please go ahead
to send out a patch. It makes sense to me so far, at least for aligning
with other callers.

> 
> This should help elucidate the problem:
> export TARGET=hhf
> ./tc qdisc del dev lo root
> ./tc qdisc add dev lo root handle 1: tbf rate 8bit burst 100b latency 1ms || echo TBF
> ./tc qdisc replace dev lo handle 3: parent 1:1 $TARGET limit 1000 || echo HH
> echo ''; echo 'add child'; tc -s -d qdisc show dev lo
> ping -I lo -f -c1 -s32 -W0.001 127.0.0.1 2>&1 >/dev/null
> echo ''; echo 'after ping'; tc -s -d qdisc show dev lo
> ./tc qdisc change dev lo handle 3: parent 1:1 $TARGET limit 0 || echo HH
> echo ''; echo 'after limit drop'; tc -s -d qdisc show dev lo
> ./tc qdisc replace dev lo handle 2: parent 1:1 sfq || echo SFQ
> echo ''; echo 'post graft'; tc -s -d qdisc show dev lo
> 
> Perhaps the original WARNING in qdisc_tree_reduce_backlog (which Lion's patch removed) regarding an inconsistent backlog was not a real problem then? But this backlog adjustment can be accounted for pretty easily I think. Let me know if I can help with this.

I think Lion's patch fixes a completely different issue? At least I
don't immediately connect it with this sch->limit issue.

> 
> On another note, asides from hhf (because the backlog value to qdisc_tree_reduce_backlog is 0 due to its different limit change loop), the other qdiscs cause an underflow in the backlog of tbf by the final graft.
> 

Does adding qdisc_qstats_backlog_dec() solve this too?

Thanks.

  reply	other threads:[~2025-07-10 22:40 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-07-02 20:39 [BUG] Inconsistency between qlen and backlog in hhf, fq, fq_codel, and fq_pie causing WARNING in qdisc_tree_reduce_backlog William Liu
2025-07-05  1:04 ` Cong Wang
2025-07-06 13:43   ` William Liu
2025-07-07 18:51     ` Cong Wang
2025-07-08 20:21       ` William Liu
2025-07-10 22:40         ` Cong Wang [this message]
2025-07-12  0:08           ` William Liu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aHBBaXPPQvBCejoM@pop-os.localdomain \
    --to=xiyou.wangcong@gmail.com \
    --cc=jhs@mojatatu.com \
    --cc=jiri@resnulli.us \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=savy@syst3mfailure.io \
    --cc=will@willsroot.io \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox