From: "Toke Høiland-Jørgensen" <toke@redhat.com>
To: Jesper Dangaard Brouer <hawk@kernel.org>,
atenart@redhat.com, netdev@vger.kernel.org,
Eric Dumazet <eric.dumazet@gmail.com>,
Paolo Abeni <pabeni@redhat.com>
Cc: bpf@vger.kernel.org, Jakub Kicinski <kuba@kernel.org>,
horms@kernel.org, jiri@resnulli.us, edumazet@google.com,
xiyou.wangcong@gmail.com, jhs@mojatatu.com,
carges@cloudflare.com, kernel-team@cloudflare.com,
"David S. Miller" <davem@davemloft.net>
Subject: Re: [PATCH net-next v2 1/6] net: sched: introduce qdisc-specific drop reason tracing
Date: Fri, 06 Feb 2026 17:31:08 +0100 [thread overview]
Message-ID: <87pl6hzveb.fsf@toke.dk> (raw)
In-Reply-To: <26e54152-e576-4292-b6ba-d50cf7cc8aa2@kernel.org>
Jesper Dangaard Brouer <hawk@kernel.org> writes:
> On 06/02/2026 10.31, Toke Høiland-Jørgensen wrote:
>> Jesper Dangaard Brouer <hawk@kernel.org> writes:
>>
>>> Create new enum qdisc_drop_reason and trace_qdisc_drop tracepoint
>>> for qdisc layer drop diagnostics with direct qdisc context visibility.
>>>
>>> The new tracepoint includes qdisc handle, parent, kind (name), and
>>> device information. Existing SKB_DROP_REASON_QDISC_DROP is retained
>>> for backwards compatibility via kfree_skb_reason().
>>>
>>> Convert FQ, FQ_CoDel, CoDel, SFB, and pfifo_fast to use the new
>>> infrastructure.
>>>
>>> Signed-off-by: Jesper Dangaard Brouer <hawk@kernel.org>
> [...]
>>> diff --git a/include/net/dropreason-core.h b/include/net/dropreason-core.h
>>> index a7b7abd66e21..3d8d284e05c8 100644
>>> --- a/include/net/dropreason-core.h
>>> +++ b/include/net/dropreason-core.h
>>> @@ -68,12 +68,6 @@
>>> FN(SECURITY_HOOK) \
>>> FN(QDISC_DROP) \
>>> FN(QDISC_BURST_DROP) \
>>> - FN(QDISC_OVERLIMIT) \
>>> - FN(QDISC_CONGESTED) \
>>> - FN(CAKE_FLOOD) \
>>> - FN(FQ_BAND_LIMIT) \
>>> - FN(FQ_HORIZON_LIMIT) \
>>> - FN(FQ_FLOW_LIMIT) \
>>> FN(CPU_BACKLOG) \
>>> FN(XDP) \
>>> FN(TC_INGRESS) \
>>> @@ -371,8 +365,10 @@ enum skb_drop_reason {
>>> /** @SKB_DROP_REASON_SECURITY_HOOK: dropped due to security HOOK */
>>> SKB_DROP_REASON_SECURITY_HOOK,
>>> /**
>>> - * @SKB_DROP_REASON_QDISC_DROP: dropped by qdisc when packet outputting (
>>> - * failed to enqueue to current qdisc)
>>> + * @SKB_DROP_REASON_QDISC_DROP: dropped by qdisc during enqueue or
>>> + * dequeue. More specific drop reasons are available via the
>>> + * qdisc:qdisc_drop tracepoint, which also provides qdisc handle
>>> + * and name for identifying the source.
>>
>> IIUC, this is not needed, see below:
>>
>> [...]
>>
>>> @@ -37,6 +37,24 @@
>>> const struct Qdisc_ops *default_qdisc_ops = &pfifo_fast_ops;
>>> EXPORT_SYMBOL(default_qdisc_ops);
>>>
>>> +void tcf_kfree_skb_list(struct sk_buff *skb, struct Qdisc *q,
>>> + struct netdev_queue *txq,
>>> + struct net_device *dev)
>>> +{
>>> + while (unlikely(skb)) {
>>> + struct sk_buff *next = skb->next;
>>> + enum qdisc_drop_reason reason = tcf_get_qdisc_drop_reason(skb);
>>> +
>>> + prefetch(next);
>>> + /* Catch wrong enum: skb_drop_reason vs qdisc_drop_reason */
>>> + WARN_ON_ONCE(reason && reason < __QDISC_DROP_REASON);
>>> + trace_qdisc_drop(q, txq, dev, skb, reason);
>>> + kfree_skb_reason(skb, SKB_DROP_REASON_QDISC_DROP);
>>
>> AFAIU, the idea here is that you just pass the same 'reason' value to
>> kfree_skb_reason(). Because of the subsys shift and offset, these will
>> all be unique values, so anyone just watching the old tracepoint will
>> get QDISC_DROP_* reasons with no context, and if you want the context
>> you listen to the qdisc_drop tracepoint.
>
> I'm already changing this code in V3, as syzbot revealed that some
> existing skb_drop_reason's do arrive here, which is the TC filter drop
> reasons (SKB_DROP_REASON_TC_*). We primarily need subsys encoding as
> we share the SKB->cb area.
>
> This is the new code, that pass 'reason' through:
>
> /* TC classifier and qdisc share drop_reason storage.
> * Check subsystem mask to identify qdisc drop reasons,
> * else pass through skb_drop_reason set by TC classifier.
> */
> if ((reason & SKB_DROP_REASON_SUBSYS_MASK) == __QDISC_DROP_REASON) {
> trace_qdisc_drop(q, txq, dev, skb, reason);
> skb_reason = SKB_DROP_REASON_QDISC_DROP;
> } else {
> skb_reason = (enum skb_drop_reason)reason;
> }
> kfree_skb_reason(skb, skb_reason);
>
> I do realize that the reason is now a unique value, that in principle
> could be passed through, *BUT* that will break existing userspace
> tools. E.g. the `perf trace -e skb:kfree_skb` cannot decode these to
> strings. (The net/core/drop_monitor.c also need special handling.)
Ah, that's a bit unfortunate. I assumed the tools would handle these
correctly automatically :(
> I actually want to give userspace a *single* SKB_DROP_REASON_QDISC_DROP
> reason (that existing tools already knows).
>
> Leaking extra qdisc drop reasons is not helpful to consume as it is not
> actionable without knowing the qdisc. A QDISC_DROP_OVERLIMIT from FQ is
> very different from one in CAKE - knowing just the reason without the
> qdisc type/config is insufficient for debugging. The trace_qdisc_drop
> tracepoint provides both the reason AND the qdisc context together
Depends; if you only have one of those installed on your system it'll do
just fine :)
In any case we're changing things here (removing reasons, or changing
them). I would lean towards having more information available (i.e.,
passing through the reason), but if common tools can't decode them,
that's a bit of a bummer.
I suppose it'll be too much churn to start out with the single reason,
then fix the tools to understand subsystem drop reasons, then change it
again?
-Toke
next prev parent reply other threads:[~2026-02-06 16:31 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-02-05 21:21 [PATCH net-next v2 0/6] net: sched: refactor qdisc drop reasons into dedicated tracepoint Jesper Dangaard Brouer
2026-02-05 21:21 ` [PATCH net-next v2 1/6] net: sched: introduce qdisc-specific drop reason tracing Jesper Dangaard Brouer
2026-02-06 9:31 ` Toke Høiland-Jørgensen
2026-02-06 14:14 ` Jesper Dangaard Brouer
2026-02-06 16:31 ` Toke Høiland-Jørgensen [this message]
2026-02-09 13:45 ` Jesper Dangaard Brouer
2026-02-05 21:21 ` [PATCH net-next v2 2/6] net: sched: sfq: convert to qdisc drop reasons Jesper Dangaard Brouer
2026-02-05 21:21 ` [PATCH net-next v2 3/6] net: sched: sch_cake: use enum qdisc_drop_reason for cobalt_should_drop Jesper Dangaard Brouer
2026-02-05 21:21 ` [PATCH net-next v2 4/6] net: sched: rename QDISC_DROP_CAKE_FLOOD to QDISC_DROP_FLOOD_PROTECTION Jesper Dangaard Brouer
2026-02-05 21:21 ` [PATCH net-next v2 5/6] net: sched: rename QDISC_DROP_FQ_* to generic names Jesper Dangaard Brouer
2026-02-05 21:22 ` [PATCH net-next v2 6/6] net: sched: sch_dualpi2: use qdisc_dequeue_drop() for dequeue drops Jesper Dangaard Brouer
2026-02-06 4:09 ` kernel test robot
2026-02-06 4:50 ` kernel test robot
2026-02-06 8:32 ` [syzbot ci] Re: net: sched: refactor qdisc drop reasons into dedicated tracepoint syzbot ci
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87pl6hzveb.fsf@toke.dk \
--to=toke@redhat.com \
--cc=atenart@redhat.com \
--cc=bpf@vger.kernel.org \
--cc=carges@cloudflare.com \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=eric.dumazet@gmail.com \
--cc=hawk@kernel.org \
--cc=horms@kernel.org \
--cc=jhs@mojatatu.com \
--cc=jiri@resnulli.us \
--cc=kernel-team@cloudflare.com \
--cc=kuba@kernel.org \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=xiyou.wangcong@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox