public inbox for netdev@vger.kernel.org
* [PATCH net-next v1] net: sched: sfq: add detailed drop reasons for monitoring
@ 2026-01-15 12:23 Jesper Dangaard Brouer
  2026-01-16 10:40 ` Jesper Dangaard Brouer
  0 siblings, 1 reply; 10+ messages in thread
From: Jesper Dangaard Brouer @ 2026-01-15 12:23 UTC (permalink / raw)
  To: netdev
  Cc: Jesper Dangaard Brouer, bpf, Eric Dumazet, David S. Miller,
	Jakub Kicinski, Paolo Abeni, Toke Høiland-Jørgensen,
	carges, kernel-team

Add specific drop reasons to SFQ qdisc to improve packet drop observability
and monitoring capabilities. This change replaces generic qdisc_drop()
calls with qdisc_drop_reason() to provide granular metrics about different
drop scenarios in production environments.

Two new drop reasons are introduced:

- SKB_DROP_REASON_QDISC_MAXFLOWS: Used when a new flow cannot be created
  because the maximum number of flows (flows parameter) has been
  reached and no free flow slots are available.

- SKB_DROP_REASON_QDISC_MAXDEPTH: Used when a flow's queue length exceeds
  the per-flow depth limit (depth parameter), triggering either tail drop
  or head drop depending on headdrop configuration.

The existing SKB_DROP_REASON_QDISC_OVERLIMIT is used in sfq_drop() when
the overall qdisc limit is exceeded and packets are dropped from the
longest queue.

These detailed drop reasons enable production monitoring systems to
distinguish between different SFQ drop scenarios and generate specific
metrics for:
- Flow table exhaustion (flows exceeded)
- Per-flow congestion (depth limit exceeded)
- Global qdisc congestion (overall limit exceeded)

This granular visibility allows operators to identify issues related
to traffic patterns and to optimize SFQ configuration based on
real-world drop behavior.
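As a rough userspace sketch (not part of the patch; the function and
parameter names are simplified stand-ins for the real sch_sfq.c logic),
the three drop scenarios map to a decision like:

```c
#include <assert.h>

/* Simplified stand-in for the three SFQ drop points described above.
 * new_flow: packet hashed to a currently-empty slot;
 * free_slots: unused flow slots left in the flow table;
 * flow_qlen: this flow's queue length; maxdepth/limit: the SFQ
 * "depth" and overall "limit" parameters. */
enum sfq_drop {
	SFQ_NO_DROP,
	SFQ_DROP_MAXFLOWS,	/* flow table exhausted */
	SFQ_DROP_MAXDEPTH,	/* per-flow depth exceeded */
	SFQ_DROP_OVERLIMIT,	/* overall qdisc limit exceeded */
};

static enum sfq_drop sfq_drop_scenario(int new_flow, int free_slots,
				       int flow_qlen, int maxdepth,
				       int qlen, int limit)
{
	if (new_flow && free_slots == 0)
		return SFQ_DROP_MAXFLOWS;
	if (flow_qlen >= maxdepth)
		return SFQ_DROP_MAXDEPTH;
	if (qlen >= limit)
		return SFQ_DROP_OVERLIMIT;
	return SFQ_NO_DROP;
}
```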

Signed-off-by: Jesper Dangaard Brouer <hawk@kernel.org>
---
 include/net/dropreason-core.h |   12 ++++++++++++
 net/sched/sch_sfq.c           |    8 ++++----
 2 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/include/net/dropreason-core.h b/include/net/dropreason-core.h
index 58d91ccc56e0..e395d0ff9904 100644
--- a/include/net/dropreason-core.h
+++ b/include/net/dropreason-core.h
@@ -69,6 +69,8 @@
 	FN(QDISC_DROP)			\
 	FN(QDISC_OVERLIMIT)		\
 	FN(QDISC_CONGESTED)		\
+	FN(QDISC_MAXFLOWS)		\
+	FN(QDISC_MAXDEPTH)		\
 	FN(CAKE_FLOOD)			\
 	FN(FQ_BAND_LIMIT)		\
 	FN(FQ_HORIZON_LIMIT)		\
@@ -384,6 +386,16 @@ enum skb_drop_reason {
 	 * due to congestion.
 	 */
 	SKB_DROP_REASON_QDISC_CONGESTED,
+	/**
+	 * @SKB_DROP_REASON_QDISC_MAXFLOWS: dropped by qdisc when the maximum
+	 * number of flows is exceeded.
+	 */
+	SKB_DROP_REASON_QDISC_MAXFLOWS,
+	/**
+	 * @SKB_DROP_REASON_QDISC_MAXDEPTH: dropped by qdisc when a flow
+	 * exceeds its maximum queue depth limit.
+	 */
+	SKB_DROP_REASON_QDISC_MAXDEPTH,
 	/**
 	 * @SKB_DROP_REASON_CAKE_FLOOD: dropped by the flood protection part of
 	 * CAKE qdisc AQM algorithm (BLUE).
diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
index 96eb2f122973..e91d74127600 100644
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -302,7 +302,7 @@ static unsigned int sfq_drop(struct Qdisc *sch, struct sk_buff **to_free)
 		sfq_dec(q, x);
 		sch->q.qlen--;
 		qdisc_qstats_backlog_dec(sch, skb);
-		qdisc_drop(skb, sch, to_free);
+		qdisc_drop_reason(skb, sch, to_free, SKB_DROP_REASON_QDISC_OVERLIMIT);
 		return len;
 	}
 
@@ -363,7 +363,7 @@ sfq_enqueue(struct sk_buff *skb, struct Qdisc *sch, struct sk_buff **to_free)
 	if (x == SFQ_EMPTY_SLOT) {
 		x = q->dep[0].next; /* get a free slot */
 		if (x >= SFQ_MAX_FLOWS)
-			return qdisc_drop(skb, sch, to_free);
+			return qdisc_drop_reason(skb, sch, to_free, SKB_DROP_REASON_QDISC_MAXFLOWS);
 		q->ht[hash] = x;
 		slot = &q->slots[x];
 		slot->hash = hash;
@@ -420,14 +420,14 @@ sfq_enqueue(struct sk_buff *skb, struct Qdisc *sch, struct sk_buff **to_free)
 	if (slot->qlen >= q->maxdepth) {
 congestion_drop:
 		if (!sfq_headdrop(q))
-			return qdisc_drop(skb, sch, to_free);
+			return qdisc_drop_reason(skb, sch, to_free, SKB_DROP_REASON_QDISC_MAXDEPTH);
 
 		/* We know we have at least one packet in queue */
 		head = slot_dequeue_head(slot);
 		delta = qdisc_pkt_len(head) - qdisc_pkt_len(skb);
 		sch->qstats.backlog -= delta;
 		slot->backlog -= delta;
-		qdisc_drop(head, sch, to_free);
+		qdisc_drop_reason(head, sch, to_free, SKB_DROP_REASON_QDISC_MAXDEPTH);
 
 		slot_queue_add(slot, skb);
 		qdisc_tree_reduce_backlog(sch, 0, delta);



^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH net-next v1] net: sched: sfq: add detailed drop reasons for monitoring
  2026-01-15 12:23 [PATCH net-next v1] net: sched: sfq: add detailed drop reasons for monitoring Jesper Dangaard Brouer
@ 2026-01-16 10:40 ` Jesper Dangaard Brouer
  2026-01-16 11:00   ` Toke Høiland-Jørgensen
  2026-01-21  0:26   ` Jakub Kicinski
  0 siblings, 2 replies; 10+ messages in thread
From: Jesper Dangaard Brouer @ 2026-01-16 10:40 UTC (permalink / raw)
  To: netdev
  Cc: bpf, Eric Dumazet, David S. Miller, Jakub Kicinski, Paolo Abeni,
	Toke Høiland-Jørgensen, carges, kernel-team

Hi Eric,

I need an opinion on naming for drop_reasons below.

On 15/01/2026 13.23, Jesper Dangaard Brouer wrote:
> Add specific drop reasons to SFQ qdisc to improve packet drop observability
> and monitoring capabilities. This change replaces generic qdisc_drop()
> calls with qdisc_drop_reason() to provide granular metrics about different
> drop scenarios in production environments.
> 
> Two new drop reasons are introduced:
> 
> - SKB_DROP_REASON_QDISC_MAXFLOWS: Used when a new flow cannot be created
>    because the maximum number of flows (flows parameter) has been
>    reached and no free flow slots are available.
> 
> - SKB_DROP_REASON_QDISC_MAXDEPTH: Used when a flow's queue length exceeds
>    the per-flow depth limit (depth parameter), triggering either tail drop
>    or head drop depending on headdrop configuration.

I noticed commit 5765c7f6e317 ("net_sched: sch_fq: add three 
drop_reason") (Author: Eric Dumazet).

  SKB_DROP_REASON_FQ_BAND_LIMIT: Per-band packet limit exceeded
  SKB_DROP_REASON_FQ_HORIZON_LIMIT: Packet timestamp too far in future
  SKB_DROP_REASON_FQ_FLOW_LIMIT: Per-flow packet limit exceeded

Should I/we make SKB_DROP_REASON_QDISC_MAXDEPTH specific for SFQ ?
Like naming it = SKB_DROP_REASON_SFQ_MAXDEPTH ?

Currently SKB_DROP_REASON_QDISC_MAXDEPTH is only used in SFQ, but it
might be usable in other qdiscs as well.  However, I noticed that the
meaning of SKB_DROP_REASON_FQ_FLOW_LIMIT is basically the same, which
made me think that perhaps I should also make it qdisc specific.
I'm considering adding a per-flow limit to fq_codel, as I'm seeing prod
issues with the global 10240 packet limit. That would also need a
similar flow-depth-limit drop reason. I'm undecided which way to go;
please advise.


> The existing SKB_DROP_REASON_QDISC_OVERLIMIT is used in sfq_drop() when
> the overall qdisc limit is exceeded and packets are dropped from the
> longest queue.
> 
> These detailed drop reasons enable production monitoring systems to
> distinguish between different SFQ drop scenarios and generate specific
> metrics for:
> - Flow table exhaustion (flows exceeded)
> - Per-flow congestion (depth limit exceeded)
> - Global qdisc congestion (overall limit exceeded)
> 
> This granular visibility allows operators to identify issues related
> to traffic patterns, and optimize SFQ configuration based on
> real-world drop patterns.
> 
> Signed-off-by: Jesper Dangaard Brouer <hawk@kernel.org>
> ---
>   include/net/dropreason-core.h |   12 ++++++++++++
>   net/sched/sch_sfq.c           |    8 ++++----
>   2 files changed, 16 insertions(+), 4 deletions(-)
> 
> diff --git a/include/net/dropreason-core.h b/include/net/dropreason-core.h
> index 58d91ccc56e0..e395d0ff9904 100644
> --- a/include/net/dropreason-core.h
> +++ b/include/net/dropreason-core.h
> @@ -69,6 +69,8 @@
>   	FN(QDISC_DROP)			\
>   	FN(QDISC_OVERLIMIT)		\
>   	FN(QDISC_CONGESTED)		\
> +	FN(QDISC_MAXFLOWS)		\
> +	FN(QDISC_MAXDEPTH)		\
>   	FN(CAKE_FLOOD)			\
>   	FN(FQ_BAND_LIMIT)		\
>   	FN(FQ_HORIZON_LIMIT)		\

FQ drop reason names can be seen above.

> @@ -384,6 +386,16 @@ enum skb_drop_reason {
>   	 * due to congestion.
>   	 */
>   	SKB_DROP_REASON_QDISC_CONGESTED,
> +	/**
> +	 * @SKB_DROP_REASON_QDISC_MAXFLOWS: dropped by qdisc when the maximum
> +	 * number of flows is exceeded.
> +	 */
> +	SKB_DROP_REASON_QDISC_MAXFLOWS,
> +	/**
> +	 * @SKB_DROP_REASON_QDISC_MAXDEPTH: dropped by qdisc when a flow
> +	 * exceeds its maximum queue depth limit.
> +	 */
> +	SKB_DROP_REASON_QDISC_MAXDEPTH,
>   	/**
>   	 * @SKB_DROP_REASON_CAKE_FLOOD: dropped by the flood protection part of
>   	 * CAKE qdisc AQM algorithm (BLUE).
> diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
> index 96eb2f122973..e91d74127600 100644
> --- a/net/sched/sch_sfq.c
> +++ b/net/sched/sch_sfq.c
> @@ -302,7 +302,7 @@ static unsigned int sfq_drop(struct Qdisc *sch, struct sk_buff **to_free)
>   		sfq_dec(q, x);
>   		sch->q.qlen--;
>   		qdisc_qstats_backlog_dec(sch, skb);
> -		qdisc_drop(skb, sch, to_free);
> +		qdisc_drop_reason(skb, sch, to_free, SKB_DROP_REASON_QDISC_OVERLIMIT);
>   		return len;
>   	}
>   
> @@ -363,7 +363,7 @@ sfq_enqueue(struct sk_buff *skb, struct Qdisc *sch, struct sk_buff **to_free)
>   	if (x == SFQ_EMPTY_SLOT) {
>   		x = q->dep[0].next; /* get a free slot */
>   		if (x >= SFQ_MAX_FLOWS)
> -			return qdisc_drop(skb, sch, to_free);
> +			return qdisc_drop_reason(skb, sch, to_free, SKB_DROP_REASON_QDISC_MAXFLOWS);
>   		q->ht[hash] = x;
>   		slot = &q->slots[x];
>   		slot->hash = hash;
> @@ -420,14 +420,14 @@ sfq_enqueue(struct sk_buff *skb, struct Qdisc *sch, struct sk_buff **to_free)
>   	if (slot->qlen >= q->maxdepth) {
>   congestion_drop:
>   		if (!sfq_headdrop(q))
> -			return qdisc_drop(skb, sch, to_free);
> +			return qdisc_drop_reason(skb, sch, to_free, SKB_DROP_REASON_QDISC_MAXDEPTH);
>   
>   		/* We know we have at least one packet in queue */
>   		head = slot_dequeue_head(slot);
>   		delta = qdisc_pkt_len(head) - qdisc_pkt_len(skb);
>   		sch->qstats.backlog -= delta;
>   		slot->backlog -= delta;
> -		qdisc_drop(head, sch, to_free);
> +		qdisc_drop_reason(head, sch, to_free, SKB_DROP_REASON_QDISC_MAXDEPTH);
>   
>   		slot_queue_add(slot, skb);
>   		qdisc_tree_reduce_backlog(sch, 0, delta);
> 
> 



* Re: [PATCH net-next v1] net: sched: sfq: add detailed drop reasons for monitoring
  2026-01-16 10:40 ` Jesper Dangaard Brouer
@ 2026-01-16 11:00   ` Toke Høiland-Jørgensen
  2026-01-16 13:31     ` Jesper Dangaard Brouer
  2026-01-21  0:26   ` Jakub Kicinski
  1 sibling, 1 reply; 10+ messages in thread
From: Toke Høiland-Jørgensen @ 2026-01-16 11:00 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, netdev
  Cc: bpf, Eric Dumazet, David S. Miller, Jakub Kicinski, Paolo Abeni,
	carges, kernel-team

Jesper Dangaard Brouer <hawk@kernel.org> writes:

> Hi Eric,
>
> I need an opinion on naming for drop_reasons below.
>
> On 15/01/2026 13.23, Jesper Dangaard Brouer wrote:
>> Add specific drop reasons to SFQ qdisc to improve packet drop observability
>> and monitoring capabilities. This change replaces generic qdisc_drop()
>> calls with qdisc_drop_reason() to provide granular metrics about different
>> drop scenarios in production environments.
>> 
>> Two new drop reasons are introduced:
>> 
>> - SKB_DROP_REASON_QDISC_MAXFLOWS: Used when a new flow cannot be created
>>    because the maximum number of flows (flows parameter) has been
>>    reached and no free flow slots are available.
>> 
>> - SKB_DROP_REASON_QDISC_MAXDEPTH: Used when a flow's queue length exceeds
>>    the per-flow depth limit (depth parameter), triggering either tail drop
>>    or head drop depending on headdrop configuration.
>
> I noticed commit 5765c7f6e317 ("net_sched: sch_fq: add three 
> drop_reason") (Author: Eric Dumazet).
>
>   SKB_DROP_REASON_FQ_BAND_LIMIT: Per-band packet limit exceeded
>   SKB_DROP_REASON_FQ_HORIZON_LIMIT: Packet timestamp too far in future
>   SKB_DROP_REASON_FQ_FLOW_LIMIT: Per-flow packet limit exceeded
>
> Should I/we make SKB_DROP_REASON_QDISC_MAXDEPTH specific for SFQ ?
> Like naming it = SKB_DROP_REASON_SFQ_MAXDEPTH ?
>
> Currently SKB_DROP_REASON_QDISC_MAXDEPTH is only used in SFQ, but it
> might be usable in other qdisc as well.  Except that I noticed the
> meaning of SKB_DROP_REASON_FQ_FLOW_LIMIT which is basically the same.
> This made me think that perhaps I should also make it qdisc specific.
> I'm considering adding a per-flow limit to fq_codel as I'm seeing prod
> issues with the global 10240 packet limit. This also need a similar flow
> depth limit drop reason. I'm undecided which way to go, please advice.

IMO, we should be reusing drop reasons where it makes sense (so
s/FQ/QDISC/ in SKB_DROP_REASON_FQ_FLOW_LIMIT), but I'm not sure whether
these are considered UAPI (i.e., can we change the name of the existing
one)?

-Toke



* Re: [PATCH net-next v1] net: sched: sfq: add detailed drop reasons for monitoring
  2026-01-16 11:00   ` Toke Høiland-Jørgensen
@ 2026-01-16 13:31     ` Jesper Dangaard Brouer
  0 siblings, 0 replies; 10+ messages in thread
From: Jesper Dangaard Brouer @ 2026-01-16 13:31 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen, netdev
  Cc: bpf, Eric Dumazet, David S. Miller, Jakub Kicinski, Paolo Abeni,
	carges, kernel-team, Alexei Starovoitov, Yan Zhai



On 16/01/2026 12.00, Toke Høiland-Jørgensen wrote:
> Jesper Dangaard Brouer <hawk@kernel.org> writes:
> 
>> Hi Eric,
>>
>> I need an opinion on naming for drop_reasons below.
>>
>> On 15/01/2026 13.23, Jesper Dangaard Brouer wrote:
>>> Add specific drop reasons to SFQ qdisc to improve packet drop observability
>>> and monitoring capabilities. This change replaces generic qdisc_drop()
>>> calls with qdisc_drop_reason() to provide granular metrics about different
>>> drop scenarios in production environments.
>>>
>>> Two new drop reasons are introduced:
>>>
>>> - SKB_DROP_REASON_QDISC_MAXFLOWS: Used when a new flow cannot be created
>>>     because the maximum number of flows (flows parameter) has been
>>>     reached and no free flow slots are available.
>>>
>>> - SKB_DROP_REASON_QDISC_MAXDEPTH: Used when a flow's queue length exceeds
>>>     the per-flow depth limit (depth parameter), triggering either tail drop
>>>     or head drop depending on headdrop configuration.
>>
>> I noticed commit 5765c7f6e317 ("net_sched: sch_fq: add three
>> drop_reason") (Author: Eric Dumazet).
>>
>>    SKB_DROP_REASON_FQ_BAND_LIMIT: Per-band packet limit exceeded
>>    SKB_DROP_REASON_FQ_HORIZON_LIMIT: Packet timestamp too far in future
>>    SKB_DROP_REASON_FQ_FLOW_LIMIT: Per-flow packet limit exceeded
>>
>> Should I/we make SKB_DROP_REASON_QDISC_MAXDEPTH specific for SFQ ?
>> Like naming it = SKB_DROP_REASON_SFQ_MAXDEPTH ?
>>
>> Currently SKB_DROP_REASON_QDISC_MAXDEPTH is only used in SFQ, but it
>> might be usable in other qdisc as well.  Except that I noticed the
>> meaning of SKB_DROP_REASON_FQ_FLOW_LIMIT which is basically the same.
>> This made me think that perhaps I should also make it qdisc specific.
>> I'm considering adding a per-flow limit to fq_codel as I'm seeing prod
>> issues with the global 10240 packet limit. This also need a similar flow
>> depth limit drop reason. I'm undecided which way to go, please advice.
> 
> IMO, we should be reusing drop reasons where it makes sense (so
> s/FQ/QDISC/ SKB_DROP_REASON_FQ_FLOW_LIMIT), but not sure if these are
> considered UAPI (i.e., can we change the name of the existing one)?
> 

The UAPI status of the SKB_DROP_REASON's is interesting. In this patch
(and Eric's commit) we insert in the middle of enum skb_drop_reason (on
purpose), which shows the enum numbers are not UAPI.  This is because we
want to force users to use the BTF info in the running kernel to resolve
these IDs. The resolved names are IMHO UAPI, as e.g. our Rust code (Cc
Yan) matches on these names and configures different sampling rates. So,
renaming SKB_DROP_REASON_FQ_FLOW_LIMIT has a chance of breaking some
userspace tools consuming these.
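To illustrate the enum-number point with a minimal toy example (not
kernel code; the names and start value are made up): inserting new
reasons in the middle renumbers everything after them, which is why the
numeric IDs must be resolved via BTF rather than hard-coded.

```c
#include <assert.h>

/* Toy before/after versions of a drop-reason enum: inserting
 * QDISC_MAXFLOWS and QDISC_MAXDEPTH in the middle shifts the value of
 * every later entry, so only the names are stable, not the numbers. */
enum reasons_before { B_QDISC_CONGESTED = 10, B_CAKE_FLOOD };
enum reasons_after  { A_QDISC_CONGESTED = 10, A_QDISC_MAXFLOWS,
		      A_QDISC_MAXDEPTH, A_CAKE_FLOOD };
```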

Production-wise, it would be easier to keep SKB_DROP_REASON_SFQ_MAXDEPTH
and SKB_DROP_REASON_FQ_FLOW_LIMIT separate, as the qdisc is part of the
name.  Then our Rust code doesn't have to also decode the net_device to
identify the qdisc involved.

--Jesper


* Re: [PATCH net-next v1] net: sched: sfq: add detailed drop reasons for monitoring
  2026-01-16 10:40 ` Jesper Dangaard Brouer
  2026-01-16 11:00   ` Toke Høiland-Jørgensen
@ 2026-01-21  0:26   ` Jakub Kicinski
  2026-01-21 19:13     ` Jesper Dangaard Brouer
  1 sibling, 1 reply; 10+ messages in thread
From: Jakub Kicinski @ 2026-01-21  0:26 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: netdev, bpf, Eric Dumazet, David S. Miller, Paolo Abeni,
	Toke Høiland-Jørgensen, carges, kernel-team

On Fri, 16 Jan 2026 11:40:06 +0100 Jesper Dangaard Brouer wrote:
> On 15/01/2026 13.23, Jesper Dangaard Brouer wrote:
> > Add specific drop reasons to SFQ qdisc to improve packet drop observability
> > and monitoring capabilities. This change replaces generic qdisc_drop()
> > calls with qdisc_drop_reason() to provide granular metrics about different
> > drop scenarios in production environments.
> > 
> > Two new drop reasons are introduced:
> > 
> > - SKB_DROP_REASON_QDISC_MAXFLOWS: Used when a new flow cannot be created
> >    because the maximum number of flows (flows parameter) has been
> >    reached and no free flow slots are available.
> > 
> > - SKB_DROP_REASON_QDISC_MAXDEPTH: Used when a flow's queue length exceeds
> >    the per-flow depth limit (depth parameter), triggering either tail drop
> >    or head drop depending on headdrop configuration.  
> 
> I noticed commit 5765c7f6e317 ("net_sched: sch_fq: add three 
> drop_reason") (Author: Eric Dumazet).
> 
>   SKB_DROP_REASON_FQ_BAND_LIMIT: Per-band packet limit exceeded
>   SKB_DROP_REASON_FQ_HORIZON_LIMIT: Packet timestamp too far in future
>   SKB_DROP_REASON_FQ_FLOW_LIMIT: Per-flow packet limit exceeded
> 
> Should I/we make SKB_DROP_REASON_QDISC_MAXDEPTH specific for SFQ ?
> Like naming it = SKB_DROP_REASON_SFQ_MAXDEPTH ?

FWIW FLOW_LIMIT is more intuitive to me, but I'm mostly dealing with
fq, so that's probably because it's the param name there.

I'd prefer the reuse (just MAXDEPTH; I don't see an equivalent of
MAXFLOWS?). We assign multiple names the same value in the enum to
avoid breaking FQ users.


* Re: [PATCH net-next v1] net: sched: sfq: add detailed drop reasons for monitoring
  2026-01-21  0:26   ` Jakub Kicinski
@ 2026-01-21 19:13     ` Jesper Dangaard Brouer
  2026-01-22  1:21       ` Jakub Kicinski
  0 siblings, 1 reply; 10+ messages in thread
From: Jesper Dangaard Brouer @ 2026-01-21 19:13 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netdev, bpf, Eric Dumazet, David S. Miller, Paolo Abeni,
	Toke Høiland-Jørgensen, carges, kernel-team, Yan Zhai




On 21/01/2026 01.26, Jakub Kicinski wrote:
> On Fri, 16 Jan 2026 11:40:06 +0100 Jesper Dangaard Brouer wrote:
>> On 15/01/2026 13.23, Jesper Dangaard Brouer wrote:
>>> Add specific drop reasons to SFQ qdisc to improve packet drop observability
>>> and monitoring capabilities. This change replaces generic qdisc_drop()
>>> calls with qdisc_drop_reason() to provide granular metrics about different
>>> drop scenarios in production environments.
>>>
>>> Two new drop reasons are introduced:
>>>
>>> - SKB_DROP_REASON_QDISC_MAXFLOWS: Used when a new flow cannot be created
>>>     because the maximum number of flows (flows parameter) has been
>>>     reached and no free flow slots are available.
>>>
>>> - SKB_DROP_REASON_QDISC_MAXDEPTH: Used when a flow's queue length exceeds
>>>     the per-flow depth limit (depth parameter), triggering either tail drop
>>>     or head drop depending on headdrop configuration.
>>
>> I noticed commit 5765c7f6e317 ("net_sched: sch_fq: add three
>> drop_reason") (Author: Eric Dumazet).
>>
>>    SKB_DROP_REASON_FQ_BAND_LIMIT: Per-band packet limit exceeded
>>    SKB_DROP_REASON_FQ_HORIZON_LIMIT: Packet timestamp too far in future
>>    SKB_DROP_REASON_FQ_FLOW_LIMIT: Per-flow packet limit exceeded
>>
>> Should I/we make SKB_DROP_REASON_QDISC_MAXDEPTH specific for SFQ ?
>> Like naming it = SKB_DROP_REASON_SFQ_MAXDEPTH ?
> 
> FWIW FLOW_LIMIT is more intuitive to me, but I'm mostly dealing with
> fq so probably because that's the param name there.
> 
> I'd prefer the reuse (just MAXDEPTH, I don't see equivalent of
> MAXFLOWS?). We assign multiple names the same values in the enum to
> avoid breaking FQ users.

I've taken a detailed look at how we consume this in production and
which Prometheus metrics we generate. My conclusion is that we want to
keep Eric's approach of having qdisc-specific drop reasons (that relate
to qdisc tunables), but extend this with a prefix
"SKB_DROP_REASON_QDISC_xxx". This is what I implemented in [v2], e.g.
"SKB_DROP_REASON_QDISC_SFQ_MAXFLOWS".  The QDISC_ prefix enables pattern
matching for qdisc-related drops, allowing monitoring tools to
categorize new/future drop reasons into the same qdisc category.
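For example, a monitoring tool could bucket the BTF-resolved names by
that prefix (a hypothetical userspace helper, not something in this
patch; it assumes the SKB_DROP_REASON_ part has already been stripped):

```c
#include <assert.h>
#include <string.h>

/* Hypothetical monitoring-side helper: treat any resolved drop-reason
 * name starting with "QDISC_" as belonging to the qdisc category,
 * covering both current and future qdisc-specific reasons. */
static int is_qdisc_reason(const char *name)
{
	return strncmp(name, "QDISC_", 6) == 0;
}
```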

For Prometheus drop_reason metrics the decision was to omit the
net_device label to keep the metric counts manageable and avoid
exploding the number of time series (lower cardinality, less storage).
Without qdisc-specific names in the enum, we'd be forced to add back the
net_device label, which would explode our time series count.

The FQ flow_limit and SFQ depth parameters have different semantics: FQ
uses exact flow classification while SFQ uses stochastic hashing. They
also have different defaults (FQ: 100 packets, SFQ: 127 packets). Having
the same drop_reason for both would obscure which qdisc's tunable needs
adjustment when analyzing production drops.

Having more enum numbers for drop reasons is essentially free, while
keeping metrics per net_device is expensive in terms of storage and
query performance.

--Jesper

[v2] https://lore.kernel.org/all/176885213069.1172320.14270453855073120526.stgit@firesoul/





* Re: [PATCH net-next v1] net: sched: sfq: add detailed drop reasons for monitoring
  2026-01-21 19:13     ` Jesper Dangaard Brouer
@ 2026-01-22  1:21       ` Jakub Kicinski
  2026-01-22 15:33         ` Jesper Dangaard Brouer
  0 siblings, 1 reply; 10+ messages in thread
From: Jakub Kicinski @ 2026-01-22  1:21 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: netdev, bpf, Eric Dumazet, David S. Miller, Paolo Abeni,
	Toke Høiland-Jørgensen, carges, kernel-team, Yan Zhai

On Wed, 21 Jan 2026 20:13:31 +0100 Jesper Dangaard Brouer wrote:
> >> I noticed commit 5765c7f6e317 ("net_sched: sch_fq: add three
> >> drop_reason") (Author: Eric Dumazet).
> >>
> >>    SKB_DROP_REASON_FQ_BAND_LIMIT: Per-band packet limit exceeded
> >>    SKB_DROP_REASON_FQ_HORIZON_LIMIT: Packet timestamp too far in future
> >>    SKB_DROP_REASON_FQ_FLOW_LIMIT: Per-flow packet limit exceeded
> >>
> >> Should I/we make SKB_DROP_REASON_QDISC_MAXDEPTH specific for SFQ ?
> >> Like naming it = SKB_DROP_REASON_SFQ_MAXDEPTH ?  
> > 
> > FWIW FLOW_LIMIT is more intuitive to me, but I'm mostly dealing with
> > fq so probably because that's the param name there.
> > 
> > I'd prefer the reuse (just MAXDEPTH, I don't see equivalent of
> > MAXFLOWS?). We assign multiple names the same values in the enum to
> > avoid breaking FQ users.  
> 
> I've taken a detailed look at how we consume this in production and what
> Prometheus metrics that we generate. My conclusion is that, we want to
> keep the Eric's approach of having qdisc specific drop-reason (that
> relates to qdisc tunables), but extend this with a prefix
> "SKB_DROP_REASON_QDISC_xxx". This is what I implemented in [v2] like
> "SKB_DROP_REASON_QDISC_SFQ_MAXFLOWS".  The QDISC_ prefix enables pattern
> matching for qdisc-related drops allowing monitoring tools to match this
> for categorizing new/future drop reasons into same qdisc category.
> 
> For Prometheus drop_reason metrics the decision was to omit the
> net_device label to keep the metric counts manageable and avoid
> exploding the number of time series (lower cardinality, less storage).
> Without qdisc-specific names in the enum, we'd be forced to add back the
> net_device label, which would explode our time series count.
> 
> The FQ flow_limit and SFQ depth parameters have different semantics: FQ
> uses exact flow classification while SFQ uses stochastic hashing. They
> also have different defaults (FQ: 100 packets, SFQ: 127 packets). Having
> the same drop_reason for both would obscure which qdisc's tunable needs
> adjustment when analyzing production drops.
> 
> Having more enum numbers for drop reasons is essentially free, while
> keeping metrics per net_device is expensive in terms of storage and
> query performance.

I'm not familiar with Prometheus; is there some off-the-shelf plugin
that exports the drops? The kfree_skb tracepoint does not have netdev,
so if you're saying you have the ability to see the netdev then
presumably there's some BPF in the mix BTF'ing the netdev info out.
And if you can read the netdev info, you can as well read 
netdev->qdisc->ops->id and/or netdev->_tx[0]->qdisc->ops->id

It kinda reads to me like you're trying to encode otherwise-reachable
metadata into the drop reason.


* Re: [PATCH net-next v1] net: sched: sfq: add detailed drop reasons for monitoring
  2026-01-22  1:21       ` Jakub Kicinski
@ 2026-01-22 15:33         ` Jesper Dangaard Brouer
  2026-01-23  1:59           ` Jakub Kicinski
  0 siblings, 1 reply; 10+ messages in thread
From: Jesper Dangaard Brouer @ 2026-01-22 15:33 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netdev, bpf, Eric Dumazet, David S. Miller, Paolo Abeni,
	Toke Høiland-Jørgensen, carges, kernel-team, Yan Zhai




On 22/01/2026 02.21, Jakub Kicinski wrote:
> On Wed, 21 Jan 2026 20:13:31 +0100 Jesper Dangaard Brouer wrote:
>>>> I noticed commit 5765c7f6e317 ("net_sched: sch_fq: add three
>>>> drop_reason") (Author: Eric Dumazet).
>>>>
>>>>     SKB_DROP_REASON_FQ_BAND_LIMIT: Per-band packet limit exceeded
>>>>     SKB_DROP_REASON_FQ_HORIZON_LIMIT: Packet timestamp too far in future
>>>>     SKB_DROP_REASON_FQ_FLOW_LIMIT: Per-flow packet limit exceeded
>>>>
>>>> Should I/we make SKB_DROP_REASON_QDISC_MAXDEPTH specific for SFQ ?
>>>> Like naming it = SKB_DROP_REASON_SFQ_MAXDEPTH ?
>>>
>>> FWIW FLOW_LIMIT is more intuitive to me, but I'm mostly dealing with
>>> fq so probably because that's the param name there.
>>>
>>> I'd prefer the reuse (just MAXDEPTH, I don't see equivalent of
>>> MAXFLOWS?). We assign multiple names the same values in the enum to
>>> avoid breaking FQ users.
>>
>> I've taken a detailed look at how we consume this in production and what
>> Prometheus metrics that we generate. My conclusion is that, we want to
>> keep the Eric's approach of having qdisc specific drop-reason (that
>> relates to qdisc tunables), but extend this with a prefix
>> "SKB_DROP_REASON_QDISC_xxx". This is what I implemented in [v2] like
>> "SKB_DROP_REASON_QDISC_SFQ_MAXFLOWS".  The QDISC_ prefix enables pattern
>> matching for qdisc-related drops allowing monitoring tools to match this
>> for categorizing new/future drop reasons into same qdisc category.
>>
>> For Prometheus drop_reason metrics the decision was to omit the
>> net_device label to keep the metric counts manageable and avoid
>> exploding the number of time series (lower cardinality, less storage).
>> Without qdisc-specific names in the enum, we'd be forced to add back the
>> net_device label, which would explode our time series count.
>>
>> The FQ flow_limit and SFQ depth parameters have different semantics: FQ
>> uses exact flow classification while SFQ uses stochastic hashing. They
>> also have different defaults (FQ: 100 packets, SFQ: 127 packets). Having
>> the same drop_reason for both would obscure which qdisc's tunable needs
>> adjustment when analyzing production drops.
>>
>> Having more enum numbers for drop reasons is essentially free, while
>> keeping metrics per net_device is expensive in terms of storage and
>> query performance.
> 
> I'm not familiar with Prometheus is there some off-the-shelf plugin
> which exports the drops? 

(I assume you are familiar with the Prometheus cardinality explosion
problem?) We have written our own drop-reason monitor that exports
drop_reason enums directly to Prometheus (without decoding netdev or
qdisc information). (As an aside: we also collect sampled (e.g. 1/4000)
copies of packet data to userspace, where we collect as much metadata as
possible, but that is too expensive to do for every packet.)

> The kfree_skb tracepoint does not have netdev,

Correct, the kfree_skb tracepoint does not have netdev avail.
I'm also saying that I don't want the netdev, because that would
explode/increase the Prometheus metric data.

> so if you're saying you have the ability to see the netdev then
> presumably there's some BPF in the mix BTF'ing the netdev info out.
> And if you can read the netdev info, you can as well read
> netdev->qdisc->ops->id and/or netdev->_tx[0]->qdisc->ops->id
> 
> It kinda reads to me like you're trying to encode otherwise-reachable
> metadata into the drop reason.

This feels like a misunderstanding. We don't want to (somehow) decode
extra metadata like netdev or qdisc.  First of all, we don't want to
spend extra BPF prog cycles doing this (if it's even possible), and
secondly we want to keep the metrics simple (see cardinality explosion).

I'm simply saying let us keep Eric's current approach of having
qdisc-specific drop reasons *when* those relate to qdisc-specific
tunables. And I'm arguing that FQ flow_limit and SFQ depth are two
different things, and should have their own enums.
   I have a specific production use-case where I want to configure SFQ
with a small flow "depth" to penalize elephant flows, so I'm okay with
SKB_DROP_REASON_SFQ_MAXDEPTH events.  So, I want to be able to tell
these apart from FQ drops (SKB_DROP_REASON_FQ_FLOW_LIMIT). I hope this
makes it clear why I want these to be separate enums(?)

--Jesper



* Re: [PATCH net-next v1] net: sched: sfq: add detailed drop reasons for monitoring
  2026-01-22 15:33         ` Jesper Dangaard Brouer
@ 2026-01-23  1:59           ` Jakub Kicinski
  2026-01-23 10:59             ` Jesper Dangaard Brouer
  0 siblings, 1 reply; 10+ messages in thread
From: Jakub Kicinski @ 2026-01-23  1:59 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: netdev, bpf, Eric Dumazet, David S. Miller, Paolo Abeni,
	Toke Høiland-Jørgensen, carges, kernel-team, Yan Zhai

On Thu, 22 Jan 2026 16:33:00 +0100 Jesper Dangaard Brouer wrote:
> > The kfree_skb tracepoint does not have netdev,  
> 
> Correct, the kfree_skb tracepoint does not have netdev avail.
> I'm also saying that I don't want the netdev, because that would
> explode/increase the Prometheus metric data.

Perhaps cutting someone's reply mid-sentence is not the most effective
way to communicate.

> > so if you're saying you have the ability to see the netdev then
> > presumably there's some BPF in the mix BTF'ing the netdev info out.
> > And if you can read the netdev info, you can as well read
> > netdev->qdisc->ops->id and/or netdev->_tx[0]->qdisc->ops->id
> > 
> > It kinda reads to me like you're trying to encode otherwise-reachable
> > metadata into the drop reason.  
> 
> This feels like a misunderstanding. We don't want to (somehow) decode
> extra metadata like netdev or qdisc.  First of all we don't want to
> spend extra BPF prog cycles doing this (if it's even possible), and
> secondly we want to keep metric simple (see cardinality explosion).
> 
> I'm simply saying let us keep Eric's current approach of having qdisc
> specific drop-reason *when* those relates to qdisc specific tunables.
> And I'm arguing that FQ flow_limit and SFQ depth are two different
> things, and should have their own enums.
>    I have a specific production use-case, where I want to configure SFQ
> to have a small flow "depth" to penalize elephant flows, so I'm okay with
> SKB_DROP_REASON_SFQ_MAXDEPTH events.  So, I want to be able to tell these
> apart from FQ drops (SKB_DROP_REASON_FQ_FLOW_LIMIT). I hope that this
> makes it clear why I want these to be separate enums.

It really sounds to me like _in your setup_ there's some special
meaning associated with some traffic being in SFQ and other
traffic in FQ. And you can get the relevant information with BPF,
you're just saying that you don't want to spend cycles (with no
data to back that up).

I'm not gonna lose sleep over this; if you convince another maintainer
to merge it, that's fine. To me, having per-qdisc drop reasons when
qdiscs are *pluggable modules* and drop reasons are a core thing makes
no sense. You're encoding arbitrary metadata into the drop reason for
the convenience of your narrow use case.


* Re: [PATCH net-next v1] net: sched: sfq: add detailed drop reasons for monitoring
  2026-01-23  1:59           ` Jakub Kicinski
@ 2026-01-23 10:59             ` Jesper Dangaard Brouer
  0 siblings, 0 replies; 10+ messages in thread
From: Jesper Dangaard Brouer @ 2026-01-23 10:59 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netdev, bpf, Eric Dumazet, David S. Miller, Paolo Abeni,
	Toke Høiland-Jørgensen, carges, kernel-team, Yan Zhai



On 23/01/2026 02.59, Jakub Kicinski wrote:
> On Thu, 22 Jan 2026 16:33:00 +0100 Jesper Dangaard Brouer wrote:
>>> The kfree_skb tracepoint does not have netdev,
>>
>> Correct, the kfree_skb tracepoint does not have netdev avail.
>> I'm also saying that I don't want the netdev, because that would
>> explode/increase the Prometheus metric data.
> 
> Perhaps cutting someone's reply mid-sentence is not the most effective
> way to communicate.
> 
>>> so if you're saying you have the ability to see the netdev then
>>> presumably there's some BPF in the mix BTF'ing the netdev info out.
>>> And if you can read the netdev info, you can as well read
>>> netdev->qdisc->ops->id and/or netdev->_tx[0]->qdisc->ops->id
>>>
>>> It kinda reads to me like you're trying to encode otherwise-reachable
>>> metadata into the drop reason.
>>
>> This feels like a misunderstanding. We don't want to (somehow) decode
>> extra metadata like netdev or qdisc.  First of all we don't want to
>> spend extra BPF prog cycles doing this (if it's even possible), and
>> secondly we want to keep metric simple (see cardinality explosion).
>>
>> I'm simply saying let us keep Eric's current approach of having qdisc
>> specific drop-reason *when* those relates to qdisc specific tunables.
>> And I'm arguing that FQ flow_limit and SFQ depth are two different
>> things, and should have their own enums.
>>     I have a specific production use-case, where I want to configure SFQ
>> to have a small flow "depth" to penalize elephant flows, so I'm okay with
>> SKB_DROP_REASON_SFQ_MAXDEPTH events.  So, I want to be able to tell these
>> apart from FQ drops (SKB_DROP_REASON_FQ_FLOW_LIMIT). I hope that this
>> makes it clear why I want these to be separate enums.
> 
> It really sounds to me like _in your setup_ there's some special
> meaning associated with some traffic being in SFQ and other
> traffic in FQ. And you can get the relevant information with BPF,
> you're just saying that you don't want to spend cycles (with no
> data to back that up).
> 
> I'm not gonna lose sleep over this; if you convince another maintainer
> to merge it, that's fine. To me, having per-qdisc drop reasons when
> qdiscs are *pluggable modules* and drop reasons are a core thing makes
> no sense. You're encoding arbitrary metadata into the drop reason for
> the convenience of your narrow use case.

Fair enough. I understand your concern about encoding qdisc identity
into drop reasons. My view is that qdisc-specific drop reasons help
pinpoint exact code paths and tunables, which is valuable for production
debugging. Eric's FQ commit (5765c7f6e317) already established this
pattern with FQ_BAND_LIMIT, FQ_HORIZON_LIMIT, and FQ_FLOW_LIMIT.

I'll send a v3 and expect other maintainers to review this.

--Jesper


end of thread, other threads:[~2026-01-23 10:59 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2026-01-15 12:23 [PATCH net-next v1] net: sched: sfq: add detailed drop reasons for monitoring Jesper Dangaard Brouer
2026-01-16 10:40 ` Jesper Dangaard Brouer
2026-01-16 11:00   ` Toke Høiland-Jørgensen
2026-01-16 13:31     ` Jesper Dangaard Brouer
2026-01-21  0:26   ` Jakub Kicinski
2026-01-21 19:13     ` Jesper Dangaard Brouer
2026-01-22  1:21       ` Jakub Kicinski
2026-01-22 15:33         ` Jesper Dangaard Brouer
2026-01-23  1:59           ` Jakub Kicinski
2026-01-23 10:59             ` Jesper Dangaard Brouer
