* [PATCH net-next v2 1/2] net, sched: Make tc-related drop reason more flexible
@ 2023-10-09 9:26 Daniel Borkmann
2023-10-09 9:26 ` [PATCH net-next v2 2/2] net, sched: Add tcf_set_drop_reason for {__,}tcf_classify Daniel Borkmann
` (4 more replies)
0 siblings, 5 replies; 16+ messages in thread
From: Daniel Borkmann @ 2023-10-09 9:26 UTC (permalink / raw)
To: kuba
Cc: netdev, bpf, jhs, victor, martin.lau, dxu, xiyou.wangcong,
Daniel Borkmann
Currently, the kfree_skb_reason() in sch_handle_{ingress,egress}() can only
express a basic SKB_DROP_REASON_TC_INGRESS or SKB_DROP_REASON_TC_EGRESS reason.
Victor kicked-off an initial proposal to make this more flexible by disambiguating
verdict from return code by moving the verdict into struct tcf_result and
letting tcf_classify() return a negative error. If hit, then two new drop
reasons were added in the proposal, that is SKB_DROP_REASON_TC_INGRESS_ERROR
as well as SKB_DROP_REASON_TC_EGRESS_ERROR. Further analysis of the actual
error codes would have required to attach to tcf_classify via kprobe/kretprobe
to more deeply debug skb and the returned error.
In order to make the kfree_skb_reason() in sch_handle_{ingress,egress}() more
extensible, it can be addressed in a more straight forward way, that is: Instead
of placing the verdict into struct tcf_result, we can just put the drop reason
in there, which does not require changes throughout various classful schedulers
given the existing verdict logic can stay as is.
Then, SKB_DROP_REASON_TC_ERROR{,_*} can be added to the enum skb_drop_reason
to disambiguate between an error or an intentional drop. New drop reason error
codes can be added successively to the tc code base.
For internal error locations which have not yet been annotated with a
SKB_DROP_REASON_TC_ERROR{,_*}, the fallback is SKB_DROP_REASON_TC_INGRESS and
SKB_DROP_REASON_TC_EGRESS, respectively. Generic errors could be marked with a
SKB_DROP_REASON_TC_ERROR code until they are converted to more specific ones
if it is found that they would be useful for troubleshooting.
While drop reasons have infrastructure for subsystem specific error codes which
are currently used by mac80211 and ovs, Jakub mentioned that it is preferred
for tc to use the enum skb_drop_reason core codes given it is a better fit and
currently the tooling support is better, too.
With regards to the latter:
[...] I think Alastair (bpftrace) is working on auto-prettifying enums when
bpftrace outputs maps. So we can do something like:
$ bpftrace -e 'tracepoint:skb:kfree_skb { @[args->reason] = count(); }'
Attaching 1 probe...
^C
@[SKB_DROP_REASON_TC_INGRESS]: 2
@[SKB_CONSUMED]: 34
^^^^^^^^^^^^ names!!
Auto-magically. [...]
Add a small helper tcf_set_drop_reason() which can be used to set the drop reason
into the tcf_result.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Victor Nogueira <victor@mojatatu.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/netdev/20231006063233.74345d36@kernel.org
---
v1 -> v2:
- Renamed tc_set_drop_reason -> tcf_set_drop_reason
- Moved tcf_set_drop_reason into pkt_cls.h (Cong)
include/net/pkt_cls.h | 6 ++++++
include/net/sch_generic.h | 3 +--
net/core/dev.c | 15 ++++++++++-----
3 files changed, 17 insertions(+), 7 deletions(-)
diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
index f308e8268651..a76c9171db0e 100644
--- a/include/net/pkt_cls.h
+++ b/include/net/pkt_cls.h
@@ -154,6 +154,12 @@ __cls_set_class(unsigned long *clp, unsigned long cl)
return xchg(clp, cl);
}
+static inline void tcf_set_drop_reason(struct tcf_result *res,
+ enum skb_drop_reason reason)
+{
+ res->drop_reason = reason;
+}
+
static inline void
__tcf_bind_filter(struct Qdisc *q, struct tcf_result *r, unsigned long base)
{
diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index c7318c73cfd6..dcb9160e6467 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -324,7 +324,6 @@ struct Qdisc_ops {
struct module *owner;
};
-
struct tcf_result {
union {
struct {
@@ -332,8 +331,8 @@ struct tcf_result {
u32 classid;
};
const struct tcf_proto *goto_tp;
-
};
+ enum skb_drop_reason drop_reason;
};
struct tcf_chain;
diff --git a/net/core/dev.c b/net/core/dev.c
index 606a366cc209..664426285fa3 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3910,7 +3910,8 @@ EXPORT_SYMBOL_GPL(netdev_xmit_skip_txqueue);
#endif /* CONFIG_NET_EGRESS */
#ifdef CONFIG_NET_XGRESS
-static int tc_run(struct tcx_entry *entry, struct sk_buff *skb)
+static int tc_run(struct tcx_entry *entry, struct sk_buff *skb,
+ enum skb_drop_reason *drop_reason)
{
int ret = TC_ACT_UNSPEC;
#ifdef CONFIG_NET_CLS_ACT
@@ -3922,12 +3923,14 @@ static int tc_run(struct tcx_entry *entry, struct sk_buff *skb)
tc_skb_cb(skb)->mru = 0;
tc_skb_cb(skb)->post_ct = false;
+ res.drop_reason = *drop_reason;
mini_qdisc_bstats_cpu_update(miniq, skb);
ret = tcf_classify(skb, miniq->block, miniq->filter_list, &res, false);
/* Only tcf related quirks below. */
switch (ret) {
case TC_ACT_SHOT:
+ *drop_reason = res.drop_reason;
mini_qdisc_qstats_cpu_drop(miniq);
break;
case TC_ACT_OK:
@@ -3977,6 +3980,7 @@ sch_handle_ingress(struct sk_buff *skb, struct packet_type **pt_prev, int *ret,
struct net_device *orig_dev, bool *another)
{
struct bpf_mprog_entry *entry = rcu_dereference_bh(skb->dev->tcx_ingress);
+ enum skb_drop_reason drop_reason = SKB_DROP_REASON_TC_INGRESS;
int sch_ret;
if (!entry)
@@ -3994,7 +3998,7 @@ sch_handle_ingress(struct sk_buff *skb, struct packet_type **pt_prev, int *ret,
if (sch_ret != TC_ACT_UNSPEC)
goto ingress_verdict;
}
- sch_ret = tc_run(tcx_entry(entry), skb);
+ sch_ret = tc_run(tcx_entry(entry), skb, &drop_reason);
ingress_verdict:
switch (sch_ret) {
case TC_ACT_REDIRECT:
@@ -4011,7 +4015,7 @@ sch_handle_ingress(struct sk_buff *skb, struct packet_type **pt_prev, int *ret,
*ret = NET_RX_SUCCESS;
return NULL;
case TC_ACT_SHOT:
- kfree_skb_reason(skb, SKB_DROP_REASON_TC_INGRESS);
+ kfree_skb_reason(skb, drop_reason);
*ret = NET_RX_DROP;
return NULL;
/* used by tc_run */
@@ -4032,6 +4036,7 @@ static __always_inline struct sk_buff *
sch_handle_egress(struct sk_buff *skb, int *ret, struct net_device *dev)
{
struct bpf_mprog_entry *entry = rcu_dereference_bh(dev->tcx_egress);
+ enum skb_drop_reason drop_reason = SKB_DROP_REASON_TC_EGRESS;
int sch_ret;
if (!entry)
@@ -4045,7 +4050,7 @@ sch_handle_egress(struct sk_buff *skb, int *ret, struct net_device *dev)
if (sch_ret != TC_ACT_UNSPEC)
goto egress_verdict;
}
- sch_ret = tc_run(tcx_entry(entry), skb);
+ sch_ret = tc_run(tcx_entry(entry), skb, &drop_reason);
egress_verdict:
switch (sch_ret) {
case TC_ACT_REDIRECT:
@@ -4054,7 +4059,7 @@ sch_handle_egress(struct sk_buff *skb, int *ret, struct net_device *dev)
*ret = NET_XMIT_SUCCESS;
return NULL;
case TC_ACT_SHOT:
- kfree_skb_reason(skb, SKB_DROP_REASON_TC_EGRESS);
+ kfree_skb_reason(skb, drop_reason);
*ret = NET_XMIT_DROP;
return NULL;
/* used by tc_run */
--
2.34.1
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH net-next v2 2/2] net, sched: Add tcf_set_drop_reason for {__,}tcf_classify
2023-10-09 9:26 [PATCH net-next v2 1/2] net, sched: Make tc-related drop reason more flexible Daniel Borkmann
@ 2023-10-09 9:26 ` Daniel Borkmann
2023-10-09 14:03 ` [PATCH net-next v2 1/2] net, sched: Make tc-related drop reason more flexible Jamal Hadi Salim
` (3 subsequent siblings)
4 siblings, 0 replies; 16+ messages in thread
From: Daniel Borkmann @ 2023-10-09 9:26 UTC (permalink / raw)
To: kuba
Cc: netdev, bpf, jhs, victor, martin.lau, dxu, xiyou.wangcong,
Daniel Borkmann
Add an initial user for the newly added tcf_set_drop_reason() helper to set the
drop reason for internal errors leading to TC_ACT_SHOT inside {__,}tcf_classify().
Right now this only adds a very basic SKB_DROP_REASON_TC_ERROR as a generic
fallback indicator to mark drop locations. Where needed, such locations can be
converted to more specific codes, for example, when hitting the reclassification
limit, etc.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Victor Nogueira <victor@mojatatu.com>
Cc: Jakub Kicinski <kuba@kernel.org>
---
include/net/dropreason-core.h | 3 +++
net/sched/cls_api.c | 26 ++++++++++++++++++++------
2 files changed, 23 insertions(+), 6 deletions(-)
diff --git a/include/net/dropreason-core.h b/include/net/dropreason-core.h
index a587e83fc169..845dce805de7 100644
--- a/include/net/dropreason-core.h
+++ b/include/net/dropreason-core.h
@@ -80,6 +80,7 @@
FN(IPV6_NDISC_BAD_OPTIONS) \
FN(IPV6_NDISC_NS_OTHERHOST) \
FN(QUEUE_PURGE) \
+ FN(TC_ERROR) \
FNe(MAX)
/**
@@ -345,6 +346,8 @@ enum skb_drop_reason {
SKB_DROP_REASON_IPV6_NDISC_NS_OTHERHOST,
/** @SKB_DROP_REASON_QUEUE_PURGE: bulk free. */
SKB_DROP_REASON_QUEUE_PURGE,
+ /** @SKB_DROP_REASON_TC_ERROR: generic internal tc error. */
+ SKB_DROP_REASON_TC_ERROR,
/**
* @SKB_DROP_REASON_MAX: the maximum of core drop reasons, which
* shouldn't be used as a real 'reason' - only for tracing code gen
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index a193cc7b3241..1daeb2182b70 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -1681,12 +1681,16 @@ static inline int __tcf_classify(struct sk_buff *skb,
* time we got here with a cookie from hardware.
*/
if (unlikely(n->tp != tp || n->tp->chain != n->chain ||
- !tp->ops->get_exts))
+ !tp->ops->get_exts)) {
+ tcf_set_drop_reason(res, SKB_DROP_REASON_TC_ERROR);
return TC_ACT_SHOT;
+ }
exts = tp->ops->get_exts(tp, n->handle);
- if (unlikely(!exts || n->exts != exts))
+ if (unlikely(!exts || n->exts != exts)) {
+ tcf_set_drop_reason(res, SKB_DROP_REASON_TC_ERROR);
return TC_ACT_SHOT;
+ }
n = NULL;
err = tcf_exts_exec_ex(skb, exts, act_index, res);
@@ -1712,8 +1716,10 @@ static inline int __tcf_classify(struct sk_buff *skb,
return err;
}
- if (unlikely(n))
+ if (unlikely(n)) {
+ tcf_set_drop_reason(res, SKB_DROP_REASON_TC_ERROR);
return TC_ACT_SHOT;
+ }
return TC_ACT_UNSPEC; /* signal: continue lookup */
#ifdef CONFIG_NET_CLS_ACT
@@ -1723,6 +1729,7 @@ static inline int __tcf_classify(struct sk_buff *skb,
tp->chain->block->index,
tp->prio & 0xffff,
ntohs(tp->protocol));
+ tcf_set_drop_reason(res, SKB_DROP_REASON_TC_ERROR);
return TC_ACT_SHOT;
}
@@ -1759,8 +1766,10 @@ int tcf_classify(struct sk_buff *skb,
if (ext->act_miss) {
n = tcf_exts_miss_cookie_lookup(ext->act_miss_cookie,
&act_index);
- if (!n)
+ if (!n) {
+ tcf_set_drop_reason(res, SKB_DROP_REASON_TC_ERROR);
return TC_ACT_SHOT;
+ }
chain = n->chain_index;
} else {
@@ -1768,8 +1777,10 @@ int tcf_classify(struct sk_buff *skb,
}
fchain = tcf_chain_lookup_rcu(block, chain);
- if (!fchain)
+ if (!fchain) {
+ tcf_set_drop_reason(res, SKB_DROP_REASON_TC_ERROR);
return TC_ACT_SHOT;
+ }
/* Consume, so cloned/redirect skbs won't inherit ext */
skb_ext_del(skb, TC_SKB_EXT);
@@ -1788,8 +1799,11 @@ int tcf_classify(struct sk_buff *skb,
struct tc_skb_cb *cb = tc_skb_cb(skb);
ext = tc_skb_ext_alloc(skb);
- if (WARN_ON_ONCE(!ext))
+ if (WARN_ON_ONCE(!ext)) {
+ tcf_set_drop_reason(res, SKB_DROP_REASON_TC_ERROR);
return TC_ACT_SHOT;
+ }
+
ext->chain = last_executed_chain;
ext->mru = cb->mru;
ext->post_ct = cb->post_ct;
--
2.34.1
^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [PATCH net-next v2 1/2] net, sched: Make tc-related drop reason more flexible
2023-10-09 9:26 [PATCH net-next v2 1/2] net, sched: Make tc-related drop reason more flexible Daniel Borkmann
2023-10-09 9:26 ` [PATCH net-next v2 2/2] net, sched: Add tcf_set_drop_reason for {__,}tcf_classify Daniel Borkmann
@ 2023-10-09 14:03 ` Jamal Hadi Salim
2023-10-13 21:17 ` Jakub Kicinski
` (2 subsequent siblings)
4 siblings, 0 replies; 16+ messages in thread
From: Jamal Hadi Salim @ 2023-10-09 14:03 UTC (permalink / raw)
To: Daniel Borkmann
Cc: kuba, netdev, bpf, victor, martin.lau, dxu, xiyou.wangcong
On Mon, Oct 9, 2023 at 5:27 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
>
> Currently, the kfree_skb_reason() in sch_handle_{ingress,egress}() can only
> express a basic SKB_DROP_REASON_TC_INGRESS or SKB_DROP_REASON_TC_EGRESS reason.
>
> Victor kicked-off an initial proposal to make this more flexible by disambiguating
> verdict from return code by moving the verdict into struct tcf_result and
> letting tcf_classify() return a negative error. If hit, then two new drop
> reasons were added in the proposal, that is SKB_DROP_REASON_TC_INGRESS_ERROR
> as well as SKB_DROP_REASON_TC_EGRESS_ERROR. Further analysis of the actual
> error codes would have required to attach to tcf_classify via kprobe/kretprobe
> to more deeply debug skb and the returned error.
>
> In order to make the kfree_skb_reason() in sch_handle_{ingress,egress}() more
> extensible, it can be addressed in a more straight forward way, that is: Instead
> of placing the verdict into struct tcf_result, we can just put the drop reason
> in there, which does not require changes throughout various classful schedulers
> given the existing verdict logic can stay as is.
>
> Then, SKB_DROP_REASON_TC_ERROR{,_*} can be added to the enum skb_drop_reason
> to disambiguate between an error or an intentional drop. New drop reason error
> codes can be added successively to the tc code base.
>
> For internal error locations which have not yet been annotated with a
> SKB_DROP_REASON_TC_ERROR{,_*}, the fallback is SKB_DROP_REASON_TC_INGRESS and
> SKB_DROP_REASON_TC_EGRESS, respectively. Generic errors could be marked with a
> SKB_DROP_REASON_TC_ERROR code until they are converted to more specific ones
> if it is found that they would be useful for troubleshooting.
>
> While drop reasons have infrastructure for subsystem specific error codes which
> are currently used by mac80211 and ovs, Jakub mentioned that it is preferred
> for tc to use the enum skb_drop_reason core codes given it is a better fit and
> currently the tooling support is better, too.
Daniel - one of us will get to this sometime this week (kind of loaded
on some other stuff atm).
cheers,
jamal
> With regards to the latter:
>
> [...] I think Alastair (bpftrace) is working on auto-prettifying enums when
> bpftrace outputs maps. So we can do something like:
>
> $ bpftrace -e 'tracepoint:skb:kfree_skb { @[args->reason] = count(); }'
> Attaching 1 probe...
> ^C
>
> @[SKB_DROP_REASON_TC_INGRESS]: 2
> @[SKB_CONSUMED]: 34
>
> ^^^^^^^^^^^^ names!!
>
> Auto-magically. [...]
>
> Add a small helper tcf_set_drop_reason() which can be used to set the drop reason
> into the tcf_result.
>
> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
> Cc: Jamal Hadi Salim <jhs@mojatatu.com>
> Cc: Victor Nogueira <victor@mojatatu.com>
> Cc: Jakub Kicinski <kuba@kernel.org>
> Link: https://lore.kernel.org/netdev/20231006063233.74345d36@kernel.org
> ---
> v1 -> v2:
> - Renamed tc_set_drop_reason -> tcf_set_drop_reason
> - Moved tcf_set_drop_reason into pkt_cls.h (Cong)
>
> include/net/pkt_cls.h | 6 ++++++
> include/net/sch_generic.h | 3 +--
> net/core/dev.c | 15 ++++++++++-----
> 3 files changed, 17 insertions(+), 7 deletions(-)
>
> diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
> index f308e8268651..a76c9171db0e 100644
> --- a/include/net/pkt_cls.h
> +++ b/include/net/pkt_cls.h
> @@ -154,6 +154,12 @@ __cls_set_class(unsigned long *clp, unsigned long cl)
> return xchg(clp, cl);
> }
>
> +static inline void tcf_set_drop_reason(struct tcf_result *res,
> + enum skb_drop_reason reason)
> +{
> + res->drop_reason = reason;
> +}
> +
> static inline void
> __tcf_bind_filter(struct Qdisc *q, struct tcf_result *r, unsigned long base)
> {
> diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
> index c7318c73cfd6..dcb9160e6467 100644
> --- a/include/net/sch_generic.h
> +++ b/include/net/sch_generic.h
> @@ -324,7 +324,6 @@ struct Qdisc_ops {
> struct module *owner;
> };
>
> -
> struct tcf_result {
> union {
> struct {
> @@ -332,8 +331,8 @@ struct tcf_result {
> u32 classid;
> };
> const struct tcf_proto *goto_tp;
> -
> };
> + enum skb_drop_reason drop_reason;
> };
>
> struct tcf_chain;
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 606a366cc209..664426285fa3 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -3910,7 +3910,8 @@ EXPORT_SYMBOL_GPL(netdev_xmit_skip_txqueue);
> #endif /* CONFIG_NET_EGRESS */
>
> #ifdef CONFIG_NET_XGRESS
> -static int tc_run(struct tcx_entry *entry, struct sk_buff *skb)
> +static int tc_run(struct tcx_entry *entry, struct sk_buff *skb,
> + enum skb_drop_reason *drop_reason)
> {
> int ret = TC_ACT_UNSPEC;
> #ifdef CONFIG_NET_CLS_ACT
> @@ -3922,12 +3923,14 @@ static int tc_run(struct tcx_entry *entry, struct sk_buff *skb)
>
> tc_skb_cb(skb)->mru = 0;
> tc_skb_cb(skb)->post_ct = false;
> + res.drop_reason = *drop_reason;
>
> mini_qdisc_bstats_cpu_update(miniq, skb);
> ret = tcf_classify(skb, miniq->block, miniq->filter_list, &res, false);
> /* Only tcf related quirks below. */
> switch (ret) {
> case TC_ACT_SHOT:
> + *drop_reason = res.drop_reason;
> mini_qdisc_qstats_cpu_drop(miniq);
> break;
> case TC_ACT_OK:
> @@ -3977,6 +3980,7 @@ sch_handle_ingress(struct sk_buff *skb, struct packet_type **pt_prev, int *ret,
> struct net_device *orig_dev, bool *another)
> {
> struct bpf_mprog_entry *entry = rcu_dereference_bh(skb->dev->tcx_ingress);
> + enum skb_drop_reason drop_reason = SKB_DROP_REASON_TC_INGRESS;
> int sch_ret;
>
> if (!entry)
> @@ -3994,7 +3998,7 @@ sch_handle_ingress(struct sk_buff *skb, struct packet_type **pt_prev, int *ret,
> if (sch_ret != TC_ACT_UNSPEC)
> goto ingress_verdict;
> }
> - sch_ret = tc_run(tcx_entry(entry), skb);
> + sch_ret = tc_run(tcx_entry(entry), skb, &drop_reason);
> ingress_verdict:
> switch (sch_ret) {
> case TC_ACT_REDIRECT:
> @@ -4011,7 +4015,7 @@ sch_handle_ingress(struct sk_buff *skb, struct packet_type **pt_prev, int *ret,
> *ret = NET_RX_SUCCESS;
> return NULL;
> case TC_ACT_SHOT:
> - kfree_skb_reason(skb, SKB_DROP_REASON_TC_INGRESS);
> + kfree_skb_reason(skb, drop_reason);
> *ret = NET_RX_DROP;
> return NULL;
> /* used by tc_run */
> @@ -4032,6 +4036,7 @@ static __always_inline struct sk_buff *
> sch_handle_egress(struct sk_buff *skb, int *ret, struct net_device *dev)
> {
> struct bpf_mprog_entry *entry = rcu_dereference_bh(dev->tcx_egress);
> + enum skb_drop_reason drop_reason = SKB_DROP_REASON_TC_EGRESS;
> int sch_ret;
>
> if (!entry)
> @@ -4045,7 +4050,7 @@ sch_handle_egress(struct sk_buff *skb, int *ret, struct net_device *dev)
> if (sch_ret != TC_ACT_UNSPEC)
> goto egress_verdict;
> }
> - sch_ret = tc_run(tcx_entry(entry), skb);
> + sch_ret = tc_run(tcx_entry(entry), skb, &drop_reason);
> egress_verdict:
> switch (sch_ret) {
> case TC_ACT_REDIRECT:
> @@ -4054,7 +4059,7 @@ sch_handle_egress(struct sk_buff *skb, int *ret, struct net_device *dev)
> *ret = NET_XMIT_SUCCESS;
> return NULL;
> case TC_ACT_SHOT:
> - kfree_skb_reason(skb, SKB_DROP_REASON_TC_EGRESS);
> + kfree_skb_reason(skb, drop_reason);
> *ret = NET_XMIT_DROP;
> return NULL;
> /* used by tc_run */
> --
> 2.34.1
>
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH net-next v2 1/2] net, sched: Make tc-related drop reason more flexible
2023-10-09 9:26 [PATCH net-next v2 1/2] net, sched: Make tc-related drop reason more flexible Daniel Borkmann
2023-10-09 9:26 ` [PATCH net-next v2 2/2] net, sched: Add tcf_set_drop_reason for {__,}tcf_classify Daniel Borkmann
2023-10-09 14:03 ` [PATCH net-next v2 1/2] net, sched: Make tc-related drop reason more flexible Jamal Hadi Salim
@ 2023-10-13 21:17 ` Jakub Kicinski
2023-10-16 20:10 ` patchwork-bot+netdevbpf
2023-10-25 8:59 ` Ido Schimmel
4 siblings, 0 replies; 16+ messages in thread
From: Jakub Kicinski @ 2023-10-13 21:17 UTC (permalink / raw)
To: Daniel Borkmann; +Cc: netdev, bpf, jhs, victor, martin.lau, dxu, xiyou.wangcong
On Mon, 9 Oct 2023 11:26:54 +0200 Daniel Borkmann wrote:
> Currently, the kfree_skb_reason() in sch_handle_{ingress,egress}() can only
> express a basic SKB_DROP_REASON_TC_INGRESS or SKB_DROP_REASON_TC_EGRESS reason.
>
> Victor kicked-off an initial proposal to make this more flexible by disambiguating
> verdict from return code by moving the verdict into struct tcf_result and
> letting tcf_classify() return a negative error. If hit, then two new drop
> reasons were added in the proposal, that is SKB_DROP_REASON_TC_INGRESS_ERROR
> as well as SKB_DROP_REASON_TC_EGRESS_ERROR. Further analysis of the actual
> error codes would have required to attach to tcf_classify via kprobe/kretprobe
> to more deeply debug skb and the returned error.
I guess Jamal said "this week" so he has two more days? :)
While we wait:
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH net-next v2 1/2] net, sched: Make tc-related drop reason more flexible
2023-10-09 9:26 [PATCH net-next v2 1/2] net, sched: Make tc-related drop reason more flexible Daniel Borkmann
` (2 preceding siblings ...)
2023-10-13 21:17 ` Jakub Kicinski
@ 2023-10-16 20:10 ` patchwork-bot+netdevbpf
2023-10-25 8:59 ` Ido Schimmel
4 siblings, 0 replies; 16+ messages in thread
From: patchwork-bot+netdevbpf @ 2023-10-16 20:10 UTC (permalink / raw)
To: Daniel Borkmann
Cc: kuba, netdev, bpf, jhs, victor, martin.lau, dxu, xiyou.wangcong
Hello:
This series was applied to netdev/net-next.git (main)
by Jakub Kicinski <kuba@kernel.org>:
On Mon, 9 Oct 2023 11:26:54 +0200 you wrote:
> Currently, the kfree_skb_reason() in sch_handle_{ingress,egress}() can only
> express a basic SKB_DROP_REASON_TC_INGRESS or SKB_DROP_REASON_TC_EGRESS reason.
>
> Victor kicked-off an initial proposal to make this more flexible by disambiguating
> verdict from return code by moving the verdict into struct tcf_result and
> letting tcf_classify() return a negative error. If hit, then two new drop
> reasons were added in the proposal, that is SKB_DROP_REASON_TC_INGRESS_ERROR
> as well as SKB_DROP_REASON_TC_EGRESS_ERROR. Further analysis of the actual
> error codes would have required to attach to tcf_classify via kprobe/kretprobe
> to more deeply debug skb and the returned error.
>
> [...]
Here is the summary with links:
- [net-next,v2,1/2] net, sched: Make tc-related drop reason more flexible
https://git.kernel.org/netdev/net-next/c/54a59aed395c
- [net-next,v2,2/2] net, sched: Add tcf_set_drop_reason for {__,}tcf_classify
https://git.kernel.org/netdev/net-next/c/39d08b91646d
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH net-next v2 1/2] net, sched: Make tc-related drop reason more flexible
2023-10-09 9:26 [PATCH net-next v2 1/2] net, sched: Make tc-related drop reason more flexible Daniel Borkmann
` (3 preceding siblings ...)
2023-10-16 20:10 ` patchwork-bot+netdevbpf
@ 2023-10-25 8:59 ` Ido Schimmel
2023-10-25 10:01 ` Daniel Borkmann
4 siblings, 1 reply; 16+ messages in thread
From: Ido Schimmel @ 2023-10-25 8:59 UTC (permalink / raw)
To: Daniel Borkmann
Cc: kuba, netdev, bpf, jhs, victor, martin.lau, dxu, xiyou.wangcong
On Mon, Oct 09, 2023 at 11:26:54AM +0200, Daniel Borkmann wrote:
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 606a366cc209..664426285fa3 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -3910,7 +3910,8 @@ EXPORT_SYMBOL_GPL(netdev_xmit_skip_txqueue);
> #endif /* CONFIG_NET_EGRESS */
>
> #ifdef CONFIG_NET_XGRESS
> -static int tc_run(struct tcx_entry *entry, struct sk_buff *skb)
> +static int tc_run(struct tcx_entry *entry, struct sk_buff *skb,
> + enum skb_drop_reason *drop_reason)
> {
> int ret = TC_ACT_UNSPEC;
> #ifdef CONFIG_NET_CLS_ACT
> @@ -3922,12 +3923,14 @@ static int tc_run(struct tcx_entry *entry, struct sk_buff *skb)
>
> tc_skb_cb(skb)->mru = 0;
> tc_skb_cb(skb)->post_ct = false;
> + res.drop_reason = *drop_reason;
>
> mini_qdisc_bstats_cpu_update(miniq, skb);
> ret = tcf_classify(skb, miniq->block, miniq->filter_list, &res, false);
> /* Only tcf related quirks below. */
> switch (ret) {
> case TC_ACT_SHOT:
> + *drop_reason = res.drop_reason;
Daniel,
Getting the following splat [1] with CONFIG_DEBUG_NET=y and this
reproducer [2]. Problem seems to be that classifiers clear 'struct
tcf_result::drop_reason', thereby triggering the warning in
__kfree_skb_reason() due to reason being 'SKB_NOT_DROPPED_YET' (0).
Fixed by maintaining the original drop reason if the one returned from
tcf_classify() is 'SKB_NOT_DROPPED_YET' [3]. I can submit this fix
unless you have a better idea.
Thanks
[1]
WARNING: CPU: 0 PID: 181 at net/core/skbuff.c:1082 kfree_skb_reason+0x38/0x130
Modules linked in:
CPU: 0 PID: 181 Comm: mausezahn Not tainted 6.6.0-rc6-custom-ge43e6d9582e0 #682
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc37 04/01/2014
RIP: 0010:kfree_skb_reason+0x38/0x130
[...]
Call Trace:
<IRQ>
__netif_receive_skb_core.constprop.0+0x837/0xdb0
__netif_receive_skb_one_core+0x3c/0x70
process_backlog+0x95/0x130
__napi_poll+0x25/0x1b0
net_rx_action+0x29b/0x310
__do_softirq+0xc0/0x29b
do_softirq+0x43/0x60
</IRQ>
[2]
#!/bin/bash
ip link add name veth0 type veth peer name veth1
ip link set dev veth0 up
ip link set dev veth1 up
tc qdisc add dev veth1 clsact
tc filter add dev veth1 ingress pref 1 proto all flower dst_mac 00:11:22:33:44:55 action drop
mausezahn veth0 -a own -b 00:11:22:33:44:55 -q -c 1
[3]
diff --git a/net/core/dev.c b/net/core/dev.c
index a37a932a3e14..abd0b13f3f17 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3929,7 +3929,8 @@ static int tc_run(struct tcx_entry *entry, struct sk_buff *skb,
/* Only tcf related quirks below. */
switch (ret) {
case TC_ACT_SHOT:
- *drop_reason = res.drop_reason;
+ if (res.drop_reason != SKB_NOT_DROPPED_YET)
+ *drop_reason = res.drop_reason;
mini_qdisc_qstats_cpu_drop(miniq);
break;
case TC_ACT_OK:
^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [PATCH net-next v2 1/2] net, sched: Make tc-related drop reason more flexible
2023-10-25 8:59 ` Ido Schimmel
@ 2023-10-25 10:01 ` Daniel Borkmann
2023-10-25 11:05 ` Jamal Hadi Salim
0 siblings, 1 reply; 16+ messages in thread
From: Daniel Borkmann @ 2023-10-25 10:01 UTC (permalink / raw)
To: Ido Schimmel
Cc: kuba, netdev, bpf, jhs, victor, martin.lau, dxu, xiyou.wangcong
Hi Ido,
On 10/25/23 10:59 AM, Ido Schimmel wrote:
> On Mon, Oct 09, 2023 at 11:26:54AM +0200, Daniel Borkmann wrote:
>> diff --git a/net/core/dev.c b/net/core/dev.c
>> index 606a366cc209..664426285fa3 100644
>> --- a/net/core/dev.c
>> +++ b/net/core/dev.c
>> @@ -3910,7 +3910,8 @@ EXPORT_SYMBOL_GPL(netdev_xmit_skip_txqueue);
>> #endif /* CONFIG_NET_EGRESS */
>>
>> #ifdef CONFIG_NET_XGRESS
>> -static int tc_run(struct tcx_entry *entry, struct sk_buff *skb)
>> +static int tc_run(struct tcx_entry *entry, struct sk_buff *skb,
>> + enum skb_drop_reason *drop_reason)
>> {
>> int ret = TC_ACT_UNSPEC;
>> #ifdef CONFIG_NET_CLS_ACT
>> @@ -3922,12 +3923,14 @@ static int tc_run(struct tcx_entry *entry, struct sk_buff *skb)
>>
>> tc_skb_cb(skb)->mru = 0;
>> tc_skb_cb(skb)->post_ct = false;
>> + res.drop_reason = *drop_reason;
>>
>> mini_qdisc_bstats_cpu_update(miniq, skb);
>> ret = tcf_classify(skb, miniq->block, miniq->filter_list, &res, false);
>> /* Only tcf related quirks below. */
>> switch (ret) {
>> case TC_ACT_SHOT:
>> + *drop_reason = res.drop_reason;
>
> Daniel,
>
> Getting the following splat [1] with CONFIG_DEBUG_NET=y and this
> reproducer [2]. Problem seems to be that classifiers clear 'struct
> tcf_result::drop_reason', thereby triggering the warning in
> __kfree_skb_reason() due to reason being 'SKB_NOT_DROPPED_YET' (0).
>
> Fixed by maintaining the original drop reason if the one returned from
> tcf_classify() is 'SKB_NOT_DROPPED_YET' [3]. I can submit this fix
> unless you have a better idea.
Thanks for catching this, looks reasonable to me as a fix.
> [1]
> WARNING: CPU: 0 PID: 181 at net/core/skbuff.c:1082 kfree_skb_reason+0x38/0x130
> Modules linked in:
> CPU: 0 PID: 181 Comm: mausezahn Not tainted 6.6.0-rc6-custom-ge43e6d9582e0 #682
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc37 04/01/2014
> RIP: 0010:kfree_skb_reason+0x38/0x130
> [...]
> Call Trace:
> <IRQ>
> __netif_receive_skb_core.constprop.0+0x837/0xdb0
> __netif_receive_skb_one_core+0x3c/0x70
> process_backlog+0x95/0x130
> __napi_poll+0x25/0x1b0
> net_rx_action+0x29b/0x310
> __do_softirq+0xc0/0x29b
> do_softirq+0x43/0x60
> </IRQ>
>
> [2]
> #!/bin/bash
>
> ip link add name veth0 type veth peer name veth1
> ip link set dev veth0 up
> ip link set dev veth1 up
> tc qdisc add dev veth1 clsact
> tc filter add dev veth1 ingress pref 1 proto all flower dst_mac 00:11:22:33:44:55 action drop
> mausezahn veth0 -a own -b 00:11:22:33:44:55 -q -c 1
I didn't know you're using mausezahn, nice :)
> [3]
> diff --git a/net/core/dev.c b/net/core/dev.c
> index a37a932a3e14..abd0b13f3f17 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -3929,7 +3929,8 @@ static int tc_run(struct tcx_entry *entry, struct sk_buff *skb,
> /* Only tcf related quirks below. */
> switch (ret) {
> case TC_ACT_SHOT:
> - *drop_reason = res.drop_reason;
> + if (res.drop_reason != SKB_NOT_DROPPED_YET)
> + *drop_reason = res.drop_reason;
> mini_qdisc_qstats_cpu_drop(miniq);
> break;
> case TC_ACT_OK:
>
When you submit feel free to add:
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH net-next v2 1/2] net, sched: Make tc-related drop reason more flexible
2023-10-25 10:01 ` Daniel Borkmann
@ 2023-10-25 11:05 ` Jamal Hadi Salim
2023-10-25 11:52 ` Daniel Borkmann
0 siblings, 1 reply; 16+ messages in thread
From: Jamal Hadi Salim @ 2023-10-25 11:05 UTC (permalink / raw)
To: Daniel Borkmann
Cc: Ido Schimmel, kuba, netdev, bpf, victor, martin.lau, dxu,
xiyou.wangcong
On Wed, Oct 25, 2023 at 6:01 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
>
> Hi Ido,
>
> On 10/25/23 10:59 AM, Ido Schimmel wrote:
> > On Mon, Oct 09, 2023 at 11:26:54AM +0200, Daniel Borkmann wrote:
> >> diff --git a/net/core/dev.c b/net/core/dev.c
> >> index 606a366cc209..664426285fa3 100644
> >> --- a/net/core/dev.c
> >> +++ b/net/core/dev.c
> >> @@ -3910,7 +3910,8 @@ EXPORT_SYMBOL_GPL(netdev_xmit_skip_txqueue);
> >> #endif /* CONFIG_NET_EGRESS */
> >>
> >> #ifdef CONFIG_NET_XGRESS
> >> -static int tc_run(struct tcx_entry *entry, struct sk_buff *skb)
> >> +static int tc_run(struct tcx_entry *entry, struct sk_buff *skb,
> >> + enum skb_drop_reason *drop_reason)
> >> {
> >> int ret = TC_ACT_UNSPEC;
> >> #ifdef CONFIG_NET_CLS_ACT
> >> @@ -3922,12 +3923,14 @@ static int tc_run(struct tcx_entry *entry, struct sk_buff *skb)
> >>
> >> tc_skb_cb(skb)->mru = 0;
> >> tc_skb_cb(skb)->post_ct = false;
> >> + res.drop_reason = *drop_reason;
> >>
> >> mini_qdisc_bstats_cpu_update(miniq, skb);
> >> ret = tcf_classify(skb, miniq->block, miniq->filter_list, &res, false);
> >> /* Only tcf related quirks below. */
> >> switch (ret) {
> >> case TC_ACT_SHOT:
> >> + *drop_reason = res.drop_reason;
> >
> > Daniel,
> >
> > Getting the following splat [1] with CONFIG_DEBUG_NET=y and this
> > reproducer [2]. Problem seems to be that classifiers clear 'struct
> > tcf_result::drop_reason', thereby triggering the warning in
> > __kfree_skb_reason() due to reason being 'SKB_NOT_DROPPED_YET' (0).
> >
> > Fixed by maintaining the original drop reason if the one returned from
> > tcf_classify() is 'SKB_NOT_DROPPED_YET' [3]. I can submit this fix
> > unless you have a better idea.
>
> Thanks for catching this, looks reasonable to me as a fix.
>
> > [1]
> > WARNING: CPU: 0 PID: 181 at net/core/skbuff.c:1082 kfree_skb_reason+0x38/0x130
> > Modules linked in:
> > CPU: 0 PID: 181 Comm: mausezahn Not tainted 6.6.0-rc6-custom-ge43e6d9582e0 #682
> > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc37 04/01/2014
> > RIP: 0010:kfree_skb_reason+0x38/0x130
> > [...]
> > Call Trace:
> > <IRQ>
> > __netif_receive_skb_core.constprop.0+0x837/0xdb0
> > __netif_receive_skb_one_core+0x3c/0x70
> > process_backlog+0x95/0x130
> > __napi_poll+0x25/0x1b0
> > net_rx_action+0x29b/0x310
> > __do_softirq+0xc0/0x29b
> > do_softirq+0x43/0x60
> > </IRQ>
> >
> > [2]
> > #!/bin/bash
> >
> > ip link add name veth0 type veth peer name veth1
> > ip link set dev veth0 up
> > ip link set dev veth1 up
> > tc qdisc add dev veth1 clsact
> > tc filter add dev veth1 ingress pref 1 proto all flower dst_mac 00:11:22:33:44:55 action drop
> > mausezahn veth0 -a own -b 00:11:22:33:44:55 -q -c 1
>
> I didn't know you're using mausezahn, nice :)
>
> > [3]
> > diff --git a/net/core/dev.c b/net/core/dev.c
> > index a37a932a3e14..abd0b13f3f17 100644
> > --- a/net/core/dev.c
> > +++ b/net/core/dev.c
> > @@ -3929,7 +3929,8 @@ static int tc_run(struct tcx_entry *entry, struct sk_buff *skb,
> > /* Only tcf related quirks below. */
> > switch (ret) {
> > case TC_ACT_SHOT:
> > - *drop_reason = res.drop_reason;
> > + if (res.drop_reason != SKB_NOT_DROPPED_YET)
> > + *drop_reason = res.drop_reason;
> > mini_qdisc_qstats_cpu_drop(miniq);
> > break;
> > case TC_ACT_OK:
> >
Out of curiosity - how does the policy say "drop" but drop_reason does
not reflect it?
cheers,
jamal
> When you submit feel free to add:
>
> Acked-by: Daniel Borkmann <daniel@iogearbox.net>
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH net-next v2 1/2] net, sched: Make tc-related drop reason more flexible
2023-10-25 11:05 ` Jamal Hadi Salim
@ 2023-10-25 11:52 ` Daniel Borkmann
2023-10-25 13:21 ` Jamal Hadi Salim
2023-10-25 13:53 ` Jakub Kicinski
0 siblings, 2 replies; 16+ messages in thread
From: Daniel Borkmann @ 2023-10-25 11:52 UTC (permalink / raw)
To: Jamal Hadi Salim
Cc: Ido Schimmel, kuba, netdev, bpf, victor, martin.lau, dxu,
xiyou.wangcong
On 10/25/23 1:05 PM, Jamal Hadi Salim wrote:
> On Wed, Oct 25, 2023 at 6:01 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
>> On 10/25/23 10:59 AM, Ido Schimmel wrote:
>>> On Mon, Oct 09, 2023 at 11:26:54AM +0200, Daniel Borkmann wrote:
>>>> diff --git a/net/core/dev.c b/net/core/dev.c
>>>> index 606a366cc209..664426285fa3 100644
>>>> --- a/net/core/dev.c
>>>> +++ b/net/core/dev.c
>>>> @@ -3910,7 +3910,8 @@ EXPORT_SYMBOL_GPL(netdev_xmit_skip_txqueue);
>>>> #endif /* CONFIG_NET_EGRESS */
>>>>
>>>> #ifdef CONFIG_NET_XGRESS
>>>> -static int tc_run(struct tcx_entry *entry, struct sk_buff *skb)
>>>> +static int tc_run(struct tcx_entry *entry, struct sk_buff *skb,
>>>> + enum skb_drop_reason *drop_reason)
>>>> {
>>>> int ret = TC_ACT_UNSPEC;
>>>> #ifdef CONFIG_NET_CLS_ACT
>>>> @@ -3922,12 +3923,14 @@ static int tc_run(struct tcx_entry *entry, struct sk_buff *skb)
>>>>
>>>> tc_skb_cb(skb)->mru = 0;
>>>> tc_skb_cb(skb)->post_ct = false;
>>>> + res.drop_reason = *drop_reason;
>>>>
>>>> mini_qdisc_bstats_cpu_update(miniq, skb);
>>>> ret = tcf_classify(skb, miniq->block, miniq->filter_list, &res, false);
>>>> /* Only tcf related quirks below. */
>>>> switch (ret) {
>>>> case TC_ACT_SHOT:
>>>> + *drop_reason = res.drop_reason;
>>>
>>> Daniel,
>>>
>>> Getting the following splat [1] with CONFIG_DEBUG_NET=y and this
>>> reproducer [2]. Problem seems to be that classifiers clear 'struct
>>> tcf_result::drop_reason', thereby triggering the warning in
>>> __kfree_skb_reason() due to reason being 'SKB_NOT_DROPPED_YET' (0).
>>>
>>> Fixed by maintaining the original drop reason if the one returned from
>>> tcf_classify() is 'SKB_NOT_DROPPED_YET' [3]. I can submit this fix
>>> unless you have a better idea.
>>
>> Thanks for catching this, looks reasonable to me as a fix.
>>
>>> [1]
>>> WARNING: CPU: 0 PID: 181 at net/core/skbuff.c:1082 kfree_skb_reason+0x38/0x130
>>> Modules linked in:
>>> CPU: 0 PID: 181 Comm: mausezahn Not tainted 6.6.0-rc6-custom-ge43e6d9582e0 #682
>>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc37 04/01/2014
>>> RIP: 0010:kfree_skb_reason+0x38/0x130
>>> [...]
>>> Call Trace:
>>> <IRQ>
>>> __netif_receive_skb_core.constprop.0+0x837/0xdb0
>>> __netif_receive_skb_one_core+0x3c/0x70
>>> process_backlog+0x95/0x130
>>> __napi_poll+0x25/0x1b0
>>> net_rx_action+0x29b/0x310
>>> __do_softirq+0xc0/0x29b
>>> do_softirq+0x43/0x60
>>> </IRQ>
>>>
>>> [2]
>>> #!/bin/bash
>>>
>>> ip link add name veth0 type veth peer name veth1
>>> ip link set dev veth0 up
>>> ip link set dev veth1 up
>>> tc qdisc add dev veth1 clsact
>>> tc filter add dev veth1 ingress pref 1 proto all flower dst_mac 00:11:22:33:44:55 action drop
>>> mausezahn veth0 -a own -b 00:11:22:33:44:55 -q -c 1
>>
>> I didn't know you're using mausezahn, nice :)
>>
>>> [3]
>>> diff --git a/net/core/dev.c b/net/core/dev.c
>>> index a37a932a3e14..abd0b13f3f17 100644
>>> --- a/net/core/dev.c
>>> +++ b/net/core/dev.c
>>> @@ -3929,7 +3929,8 @@ static int tc_run(struct tcx_entry *entry, struct sk_buff *skb,
>>> /* Only tcf related quirks below. */
>>> switch (ret) {
>>> case TC_ACT_SHOT:
>>> - *drop_reason = res.drop_reason;
>>> + if (res.drop_reason != SKB_NOT_DROPPED_YET)
>>> + *drop_reason = res.drop_reason;
>>> mini_qdisc_qstats_cpu_drop(miniq);
>>> break;
>>> case TC_ACT_OK:
>>>
>
> Out of curiosity - how does the policy say "drop" but drop_reason does
> not reflect it?
Ido, Jamal, wdyt about this alternative approach - these were the locations I could
find from an initial glance (compile-tested) :
From a3d46a55aac484372b60b783cb6a3c98a0fef75c Mon Sep 17 00:00:00 2001
From: Daniel Borkmann <daniel@iogearbox.net>
Date: Wed, 25 Oct 2023 11:43:44 +0000
Subject: [PATCH] net, sched: fix..
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
---
include/net/pkt_cls.h | 12 ++++++++++++
net/sched/cls_basic.c | 2 +-
net/sched/cls_bpf.c | 2 +-
net/sched/cls_flower.c | 2 +-
net/sched/cls_fw.c | 2 +-
net/sched/cls_matchall.c | 2 +-
net/sched/cls_route.c | 4 ++--
net/sched/cls_u32.c | 2 +-
8 files changed, 20 insertions(+), 8 deletions(-)
diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
index a76c9171db0e..31d8e8587824 100644
--- a/include/net/pkt_cls.h
+++ b/include/net/pkt_cls.h
@@ -160,6 +160,18 @@ static inline void tcf_set_drop_reason(struct tcf_result *res,
res->drop_reason = reason;
}
+static inline void tcf_set_result(struct tcf_result *to,
+ const struct tcf_result *from)
+{
+ /* tcf_result's drop_reason which is the last member must be
+ * preserved and cannot be copied from the cls'es tcf_result
+ * template given this is carried all the way and potentially
+ * set to a concrete tc drop reason upon error or intentional
+ * drop. See tcf_set_drop_reason() locations.
+ */
+ memcpy(to, from, offsetof(typeof(*to), drop_reason));
+}
+
static inline void
__tcf_bind_filter(struct Qdisc *q, struct tcf_result *r, unsigned long base)
{
diff --git a/net/sched/cls_basic.c b/net/sched/cls_basic.c
index 1b92c33b5f81..d7ead3fc3c45 100644
--- a/net/sched/cls_basic.c
+++ b/net/sched/cls_basic.c
@@ -50,7 +50,7 @@ TC_INDIRECT_SCOPE int basic_classify(struct sk_buff *skb,
if (!tcf_em_tree_match(skb, &f->ematches, NULL))
continue;
__this_cpu_inc(f->pf->rhit);
- *res = f->res;
+ tcf_set_result(res, &f->res);
r = tcf_exts_exec(skb, &f->exts, res);
if (r < 0)
continue;
diff --git a/net/sched/cls_bpf.c b/net/sched/cls_bpf.c
index 382c7a71f81f..e4620a462bc3 100644
--- a/net/sched/cls_bpf.c
+++ b/net/sched/cls_bpf.c
@@ -124,7 +124,7 @@ TC_INDIRECT_SCOPE int cls_bpf_classify(struct sk_buff *skb,
res->class = 0;
res->classid = filter_res;
} else {
- *res = prog->res;
+ tcf_set_result(res, &prog->res);
}
ret = tcf_exts_exec(skb, &prog->exts, res);
diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
index e5314a31f75a..eb94090fb26c 100644
--- a/net/sched/cls_flower.c
+++ b/net/sched/cls_flower.c
@@ -341,7 +341,7 @@ TC_INDIRECT_SCOPE int fl_classify(struct sk_buff *skb,
f = fl_mask_lookup(mask, &skb_key);
if (f && !tc_skip_sw(f->flags)) {
- *res = f->res;
+ tcf_set_result(res, &f->res);
return tcf_exts_exec(skb, &f->exts, res);
}
}
diff --git a/net/sched/cls_fw.c b/net/sched/cls_fw.c
index c49d6af0e048..70b873f8771f 100644
--- a/net/sched/cls_fw.c
+++ b/net/sched/cls_fw.c
@@ -63,7 +63,7 @@ TC_INDIRECT_SCOPE int fw_classify(struct sk_buff *skb,
for (f = rcu_dereference_bh(head->ht[fw_hash(id)]); f;
f = rcu_dereference_bh(f->next)) {
if (f->id == id) {
- *res = f->res;
+ tcf_set_result(res, &f->res);
if (!tcf_match_indev(skb, f->ifindex))
continue;
r = tcf_exts_exec(skb, &f->exts, res);
diff --git a/net/sched/cls_matchall.c b/net/sched/cls_matchall.c
index c4ed11df6254..a4018db80a60 100644
--- a/net/sched/cls_matchall.c
+++ b/net/sched/cls_matchall.c
@@ -37,7 +37,7 @@ TC_INDIRECT_SCOPE int mall_classify(struct sk_buff *skb,
if (tc_skip_sw(head->flags))
return -1;
- *res = head->res;
+ tcf_set_result(res, &head->res);
__this_cpu_inc(head->pf->rhit);
return tcf_exts_exec(skb, &head->exts, res);
}
diff --git a/net/sched/cls_route.c b/net/sched/cls_route.c
index 1424bfeaca73..cbfaa1d1820f 100644
--- a/net/sched/cls_route.c
+++ b/net/sched/cls_route.c
@@ -109,7 +109,7 @@ static inline int route4_hash_wild(void)
#define ROUTE4_APPLY_RESULT() \
{ \
- *res = f->res; \
+ tcf_set_result(res, &f->res); \
if (tcf_exts_has_actions(&f->exts)) { \
int r = tcf_exts_exec(skb, &f->exts, res); \
if (r < 0) { \
@@ -152,7 +152,7 @@ TC_INDIRECT_SCOPE int route4_classify(struct sk_buff *skb,
goto failure;
}
- *res = f->res;
+ tcf_set_result(res, &f->res);
spin_unlock(&fastmap_lock);
return 0;
}
diff --git a/net/sched/cls_u32.c b/net/sched/cls_u32.c
index 6663e971a13e..f50ae40a29d5 100644
--- a/net/sched/cls_u32.c
+++ b/net/sched/cls_u32.c
@@ -172,7 +172,7 @@ TC_INDIRECT_SCOPE int u32_classify(struct sk_buff *skb,
check_terminal:
if (n->sel.flags & TC_U32_TERMINAL) {
- *res = n->res;
+ tcf_set_result(res, &n->res);
if (!tcf_match_indev(skb, n->ifindex)) {
n = rcu_dereference_bh(n->next);
goto next_knode;
--
2.34.1
^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [PATCH net-next v2 1/2] net, sched: Make tc-related drop reason more flexible
2023-10-25 11:52 ` Daniel Borkmann
@ 2023-10-25 13:21 ` Jamal Hadi Salim
2023-10-25 13:46 ` Daniel Borkmann
2023-10-25 13:53 ` Jakub Kicinski
1 sibling, 1 reply; 16+ messages in thread
From: Jamal Hadi Salim @ 2023-10-25 13:21 UTC (permalink / raw)
To: Daniel Borkmann
Cc: Ido Schimmel, kuba, netdev, bpf, victor, martin.lau, dxu,
xiyou.wangcong
On Wed, Oct 25, 2023 at 7:52 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
>
> On 10/25/23 1:05 PM, Jamal Hadi Salim wrote:
> > On Wed, Oct 25, 2023 at 6:01 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
> >> On 10/25/23 10:59 AM, Ido Schimmel wrote:
> >>> On Mon, Oct 09, 2023 at 11:26:54AM +0200, Daniel Borkmann wrote:
> >>>> diff --git a/net/core/dev.c b/net/core/dev.c
> >>>> index 606a366cc209..664426285fa3 100644
> >>>> --- a/net/core/dev.c
> >>>> +++ b/net/core/dev.c
> >>>> @@ -3910,7 +3910,8 @@ EXPORT_SYMBOL_GPL(netdev_xmit_skip_txqueue);
> >>>> #endif /* CONFIG_NET_EGRESS */
> >>>>
> >>>> #ifdef CONFIG_NET_XGRESS
> >>>> -static int tc_run(struct tcx_entry *entry, struct sk_buff *skb)
> >>>> +static int tc_run(struct tcx_entry *entry, struct sk_buff *skb,
> >>>> + enum skb_drop_reason *drop_reason)
> >>>> {
> >>>> int ret = TC_ACT_UNSPEC;
> >>>> #ifdef CONFIG_NET_CLS_ACT
> >>>> @@ -3922,12 +3923,14 @@ static int tc_run(struct tcx_entry *entry, struct sk_buff *skb)
> >>>>
> >>>> tc_skb_cb(skb)->mru = 0;
> >>>> tc_skb_cb(skb)->post_ct = false;
> >>>> + res.drop_reason = *drop_reason;
> >>>>
> >>>> mini_qdisc_bstats_cpu_update(miniq, skb);
> >>>> ret = tcf_classify(skb, miniq->block, miniq->filter_list, &res, false);
> >>>> /* Only tcf related quirks below. */
> >>>> switch (ret) {
> >>>> case TC_ACT_SHOT:
> >>>> + *drop_reason = res.drop_reason;
> >>>
> >>> Daniel,
> >>>
> >>> Getting the following splat [1] with CONFIG_DEBUG_NET=y and this
> >>> reproducer [2]. Problem seems to be that classifiers clear 'struct
> >>> tcf_result::drop_reason', thereby triggering the warning in
> >>> __kfree_skb_reason() due to reason being 'SKB_NOT_DROPPED_YET' (0).
> >>>
> >>> Fixed by maintaining the original drop reason if the one returned from
> >>> tcf_classify() is 'SKB_NOT_DROPPED_YET' [3]. I can submit this fix
> >>> unless you have a better idea.
> >>
> >> Thanks for catching this, looks reasonable to me as a fix.
> >>
> >>> [1]
> >>> WARNING: CPU: 0 PID: 181 at net/core/skbuff.c:1082 kfree_skb_reason+0x38/0x130
> >>> Modules linked in:
> >>> CPU: 0 PID: 181 Comm: mausezahn Not tainted 6.6.0-rc6-custom-ge43e6d9582e0 #682
> >>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc37 04/01/2014
> >>> RIP: 0010:kfree_skb_reason+0x38/0x130
> >>> [...]
> >>> Call Trace:
> >>> <IRQ>
> >>> __netif_receive_skb_core.constprop.0+0x837/0xdb0
> >>> __netif_receive_skb_one_core+0x3c/0x70
> >>> process_backlog+0x95/0x130
> >>> __napi_poll+0x25/0x1b0
> >>> net_rx_action+0x29b/0x310
> >>> __do_softirq+0xc0/0x29b
> >>> do_softirq+0x43/0x60
> >>> </IRQ>
> >>>
> >>> [2]
> >>> #!/bin/bash
> >>>
> >>> ip link add name veth0 type veth peer name veth1
> >>> ip link set dev veth0 up
> >>> ip link set dev veth1 up
> >>> tc qdisc add dev veth1 clsact
> >>> tc filter add dev veth1 ingress pref 1 proto all flower dst_mac 00:11:22:33:44:55 action drop
> >>> mausezahn veth0 -a own -b 00:11:22:33:44:55 -q -c 1
> >>
> >> I didn't know you're using mausezahn, nice :)
> >>
> >>> [3]
> >>> diff --git a/net/core/dev.c b/net/core/dev.c
> >>> index a37a932a3e14..abd0b13f3f17 100644
> >>> --- a/net/core/dev.c
> >>> +++ b/net/core/dev.c
> >>> @@ -3929,7 +3929,8 @@ static int tc_run(struct tcx_entry *entry, struct sk_buff *skb,
> >>> /* Only tcf related quirks below. */
> >>> switch (ret) {
> >>> case TC_ACT_SHOT:
> >>> - *drop_reason = res.drop_reason;
> >>> + if (res.drop_reason != SKB_NOT_DROPPED_YET)
> >>> + *drop_reason = res.drop_reason;
> >>> mini_qdisc_qstats_cpu_drop(miniq);
> >>> break;
> >>> case TC_ACT_OK:
> >>>
> >
> > Out of curiosity - how does the policy say "drop" but drop_reason does
> > not reflect it?
>
> Ido, Jamal, wdyt about this alternative approach - these were the locations I could
> find from an initial glance (compile-tested) :
>
> From a3d46a55aac484372b60b783cb6a3c98a0fef75c Mon Sep 17 00:00:00 2001
> From: Daniel Borkmann <daniel@iogearbox.net>
> Date: Wed, 25 Oct 2023 11:43:44 +0000
> Subject: [PATCH] net, sched: fix..
>
> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
> ---
> include/net/pkt_cls.h | 12 ++++++++++++
> net/sched/cls_basic.c | 2 +-
> net/sched/cls_bpf.c | 2 +-
> net/sched/cls_flower.c | 2 +-
> net/sched/cls_fw.c | 2 +-
> net/sched/cls_matchall.c | 2 +-
> net/sched/cls_route.c | 4 ++--
> net/sched/cls_u32.c | 2 +-
> 8 files changed, 20 insertions(+), 8 deletions(-)
>
> diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
> index a76c9171db0e..31d8e8587824 100644
> --- a/include/net/pkt_cls.h
> +++ b/include/net/pkt_cls.h
> @@ -160,6 +160,18 @@ static inline void tcf_set_drop_reason(struct tcf_result *res,
> res->drop_reason = reason;
> }
>
> +static inline void tcf_set_result(struct tcf_result *to,
> + const struct tcf_result *from)
> +{
> + /* tcf_result's drop_reason which is the last member must be
> + * preserved and cannot be copied from the cls'es tcf_result
> + * template given this is carried all the way and potentially
> + * set to a concrete tc drop reason upon error or intentional
> + * drop. See tcf_set_drop_reason() locations.
> + */
> + memcpy(to, from, offsetof(typeof(*to), drop_reason));
> +}
> +
Daniel, IMO, doing this at cls_api is best instead (like what Victors
or my original patch did). Iam ~30K feet right now with a lousy
keyboard - you can either do it, or i or Victor can send the patch by
end of day today. There are missing cases which were covered by Victor
and possibly something else will pop up next.
cheers,
jamal
> static inline void
> __tcf_bind_filter(struct Qdisc *q, struct tcf_result *r, unsigned long base)
> {
> diff --git a/net/sched/cls_basic.c b/net/sched/cls_basic.c
> index 1b92c33b5f81..d7ead3fc3c45 100644
> --- a/net/sched/cls_basic.c
> +++ b/net/sched/cls_basic.c
> @@ -50,7 +50,7 @@ TC_INDIRECT_SCOPE int basic_classify(struct sk_buff *skb,
> if (!tcf_em_tree_match(skb, &f->ematches, NULL))
> continue;
> __this_cpu_inc(f->pf->rhit);
> - *res = f->res;
> + tcf_set_result(res, &f->res);
> r = tcf_exts_exec(skb, &f->exts, res);
> if (r < 0)
> continue;
> diff --git a/net/sched/cls_bpf.c b/net/sched/cls_bpf.c
> index 382c7a71f81f..e4620a462bc3 100644
> --- a/net/sched/cls_bpf.c
> +++ b/net/sched/cls_bpf.c
> @@ -124,7 +124,7 @@ TC_INDIRECT_SCOPE int cls_bpf_classify(struct sk_buff *skb,
> res->class = 0;
> res->classid = filter_res;
> } else {
> - *res = prog->res;
> + tcf_set_result(res, &prog->res);
> }
>
> ret = tcf_exts_exec(skb, &prog->exts, res);
> diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
> index e5314a31f75a..eb94090fb26c 100644
> --- a/net/sched/cls_flower.c
> +++ b/net/sched/cls_flower.c
> @@ -341,7 +341,7 @@ TC_INDIRECT_SCOPE int fl_classify(struct sk_buff *skb,
>
> f = fl_mask_lookup(mask, &skb_key);
> if (f && !tc_skip_sw(f->flags)) {
> - *res = f->res;
> + tcf_set_result(res, &f->res);
> return tcf_exts_exec(skb, &f->exts, res);
> }
> }
> diff --git a/net/sched/cls_fw.c b/net/sched/cls_fw.c
> index c49d6af0e048..70b873f8771f 100644
> --- a/net/sched/cls_fw.c
> +++ b/net/sched/cls_fw.c
> @@ -63,7 +63,7 @@ TC_INDIRECT_SCOPE int fw_classify(struct sk_buff *skb,
> for (f = rcu_dereference_bh(head->ht[fw_hash(id)]); f;
> f = rcu_dereference_bh(f->next)) {
> if (f->id == id) {
> - *res = f->res;
> + tcf_set_result(res, &f->res);
> if (!tcf_match_indev(skb, f->ifindex))
> continue;
> r = tcf_exts_exec(skb, &f->exts, res);
> diff --git a/net/sched/cls_matchall.c b/net/sched/cls_matchall.c
> index c4ed11df6254..a4018db80a60 100644
> --- a/net/sched/cls_matchall.c
> +++ b/net/sched/cls_matchall.c
> @@ -37,7 +37,7 @@ TC_INDIRECT_SCOPE int mall_classify(struct sk_buff *skb,
> if (tc_skip_sw(head->flags))
> return -1;
>
> - *res = head->res;
> + tcf_set_result(res, &head->res);
> __this_cpu_inc(head->pf->rhit);
> return tcf_exts_exec(skb, &head->exts, res);
> }
> diff --git a/net/sched/cls_route.c b/net/sched/cls_route.c
> index 1424bfeaca73..cbfaa1d1820f 100644
> --- a/net/sched/cls_route.c
> +++ b/net/sched/cls_route.c
> @@ -109,7 +109,7 @@ static inline int route4_hash_wild(void)
>
> #define ROUTE4_APPLY_RESULT() \
> { \
> - *res = f->res; \
> + tcf_set_result(res, &f->res); \
> if (tcf_exts_has_actions(&f->exts)) { \
> int r = tcf_exts_exec(skb, &f->exts, res); \
> if (r < 0) { \
> @@ -152,7 +152,7 @@ TC_INDIRECT_SCOPE int route4_classify(struct sk_buff *skb,
> goto failure;
> }
>
> - *res = f->res;
> + tcf_set_result(res, &f->res);
> spin_unlock(&fastmap_lock);
> return 0;
> }
> diff --git a/net/sched/cls_u32.c b/net/sched/cls_u32.c
> index 6663e971a13e..f50ae40a29d5 100644
> --- a/net/sched/cls_u32.c
> +++ b/net/sched/cls_u32.c
> @@ -172,7 +172,7 @@ TC_INDIRECT_SCOPE int u32_classify(struct sk_buff *skb,
> check_terminal:
> if (n->sel.flags & TC_U32_TERMINAL) {
>
> - *res = n->res;
> + tcf_set_result(res, &n->res);
> if (!tcf_match_indev(skb, n->ifindex)) {
> n = rcu_dereference_bh(n->next);
> goto next_knode;
> --
> 2.34.1
>
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH net-next v2 1/2] net, sched: Make tc-related drop reason more flexible
2023-10-25 13:21 ` Jamal Hadi Salim
@ 2023-10-25 13:46 ` Daniel Borkmann
2023-10-25 23:13 ` Jamal Hadi Salim
0 siblings, 1 reply; 16+ messages in thread
From: Daniel Borkmann @ 2023-10-25 13:46 UTC (permalink / raw)
To: Jamal Hadi Salim
Cc: Ido Schimmel, kuba, netdev, bpf, victor, martin.lau, dxu,
xiyou.wangcong
On 10/25/23 3:21 PM, Jamal Hadi Salim wrote:
> On Wed, Oct 25, 2023 at 7:52 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
>> On 10/25/23 1:05 PM, Jamal Hadi Salim wrote:
>>> On Wed, Oct 25, 2023 at 6:01 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
>>>> On 10/25/23 10:59 AM, Ido Schimmel wrote:
>>>>> On Mon, Oct 09, 2023 at 11:26:54AM +0200, Daniel Borkmann wrote:
>>>>>> diff --git a/net/core/dev.c b/net/core/dev.c
>>>>>> index 606a366cc209..664426285fa3 100644
>>>>>> --- a/net/core/dev.c
>>>>>> +++ b/net/core/dev.c
>>>>>> @@ -3910,7 +3910,8 @@ EXPORT_SYMBOL_GPL(netdev_xmit_skip_txqueue);
>>>>>> #endif /* CONFIG_NET_EGRESS */
>>>>>>
>>>>>> #ifdef CONFIG_NET_XGRESS
>>>>>> -static int tc_run(struct tcx_entry *entry, struct sk_buff *skb)
>>>>>> +static int tc_run(struct tcx_entry *entry, struct sk_buff *skb,
>>>>>> + enum skb_drop_reason *drop_reason)
>>>>>> {
>>>>>> int ret = TC_ACT_UNSPEC;
>>>>>> #ifdef CONFIG_NET_CLS_ACT
>>>>>> @@ -3922,12 +3923,14 @@ static int tc_run(struct tcx_entry *entry, struct sk_buff *skb)
>>>>>>
>>>>>> tc_skb_cb(skb)->mru = 0;
>>>>>> tc_skb_cb(skb)->post_ct = false;
>>>>>> + res.drop_reason = *drop_reason;
>>>>>>
>>>>>> mini_qdisc_bstats_cpu_update(miniq, skb);
>>>>>> ret = tcf_classify(skb, miniq->block, miniq->filter_list, &res, false);
>>>>>> /* Only tcf related quirks below. */
>>>>>> switch (ret) {
>>>>>> case TC_ACT_SHOT:
>>>>>> + *drop_reason = res.drop_reason;
>>>>>
>>>>> Daniel,
>>>>>
>>>>> Getting the following splat [1] with CONFIG_DEBUG_NET=y and this
>>>>> reproducer [2]. Problem seems to be that classifiers clear 'struct
>>>>> tcf_result::drop_reason', thereby triggering the warning in
>>>>> __kfree_skb_reason() due to reason being 'SKB_NOT_DROPPED_YET' (0).
>>>>>
>>>>> Fixed by maintaining the original drop reason if the one returned from
>>>>> tcf_classify() is 'SKB_NOT_DROPPED_YET' [3]. I can submit this fix
>>>>> unless you have a better idea.
>>>>
>>>> Thanks for catching this, looks reasonable to me as a fix.
>>>>
>>>>> [1]
>>>>> WARNING: CPU: 0 PID: 181 at net/core/skbuff.c:1082 kfree_skb_reason+0x38/0x130
>>>>> Modules linked in:
>>>>> CPU: 0 PID: 181 Comm: mausezahn Not tainted 6.6.0-rc6-custom-ge43e6d9582e0 #682
>>>>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc37 04/01/2014
>>>>> RIP: 0010:kfree_skb_reason+0x38/0x130
>>>>> [...]
>>>>> Call Trace:
>>>>> <IRQ>
>>>>> __netif_receive_skb_core.constprop.0+0x837/0xdb0
>>>>> __netif_receive_skb_one_core+0x3c/0x70
>>>>> process_backlog+0x95/0x130
>>>>> __napi_poll+0x25/0x1b0
>>>>> net_rx_action+0x29b/0x310
>>>>> __do_softirq+0xc0/0x29b
>>>>> do_softirq+0x43/0x60
>>>>> </IRQ>
>>>>>
>>>>> [2]
>>>>> #!/bin/bash
>>>>>
>>>>> ip link add name veth0 type veth peer name veth1
>>>>> ip link set dev veth0 up
>>>>> ip link set dev veth1 up
>>>>> tc qdisc add dev veth1 clsact
>>>>> tc filter add dev veth1 ingress pref 1 proto all flower dst_mac 00:11:22:33:44:55 action drop
>>>>> mausezahn veth0 -a own -b 00:11:22:33:44:55 -q -c 1
>>>>
>>>> I didn't know you're using mausezahn, nice :)
>>>>
>>>>> [3]
>>>>> diff --git a/net/core/dev.c b/net/core/dev.c
>>>>> index a37a932a3e14..abd0b13f3f17 100644
>>>>> --- a/net/core/dev.c
>>>>> +++ b/net/core/dev.c
>>>>> @@ -3929,7 +3929,8 @@ static int tc_run(struct tcx_entry *entry, struct sk_buff *skb,
>>>>> /* Only tcf related quirks below. */
>>>>> switch (ret) {
>>>>> case TC_ACT_SHOT:
>>>>> - *drop_reason = res.drop_reason;
>>>>> + if (res.drop_reason != SKB_NOT_DROPPED_YET)
>>>>> + *drop_reason = res.drop_reason;
>>>>> mini_qdisc_qstats_cpu_drop(miniq);
>>>>> break;
>>>>> case TC_ACT_OK:
>>>>>
>>>
>>> Out of curiosity - how does the policy say "drop" but drop_reason does
>>> not reflect it?
>>
>> Ido, Jamal, wdyt about this alternative approach - these were the locations I could
>> find from an initial glance (compile-tested) :
>>
>> From a3d46a55aac484372b60b783cb6a3c98a0fef75c Mon Sep 17 00:00:00 2001
>> From: Daniel Borkmann <daniel@iogearbox.net>
>> Date: Wed, 25 Oct 2023 11:43:44 +0000
>> Subject: [PATCH] net, sched: fix..
>>
>> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
>> ---
>> include/net/pkt_cls.h | 12 ++++++++++++
>> net/sched/cls_basic.c | 2 +-
>> net/sched/cls_bpf.c | 2 +-
>> net/sched/cls_flower.c | 2 +-
>> net/sched/cls_fw.c | 2 +-
>> net/sched/cls_matchall.c | 2 +-
>> net/sched/cls_route.c | 4 ++--
>> net/sched/cls_u32.c | 2 +-
>> 8 files changed, 20 insertions(+), 8 deletions(-)
>>
>> diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
>> index a76c9171db0e..31d8e8587824 100644
>> --- a/include/net/pkt_cls.h
>> +++ b/include/net/pkt_cls.h
>> @@ -160,6 +160,18 @@ static inline void tcf_set_drop_reason(struct tcf_result *res,
>> res->drop_reason = reason;
>> }
>>
>> +static inline void tcf_set_result(struct tcf_result *to,
>> + const struct tcf_result *from)
>> +{
>> + /* tcf_result's drop_reason which is the last member must be
>> + * preserved and cannot be copied from the cls'es tcf_result
>> + * template given this is carried all the way and potentially
>> + * set to a concrete tc drop reason upon error or intentional
>> + * drop. See tcf_set_drop_reason() locations.
>> + */
>> + memcpy(to, from, offsetof(typeof(*to), drop_reason));
>> +}
>> +
>
> Daniel, IMO, doing this at cls_api is best instead (like what Victors
> or my original patch did). Iam ~30K feet right now with a lousy
> keyboard - you can either do it, or i or Victor can send the patch by
> end of day today. There are missing cases which were covered by Victor
> and possibly something else will pop up next.
Sure, if you have sth clean and simple for today, go for it. Otherwise I
can cook a proper one out of this as a fix and ship it tomorrow AM, so we
have a fix for the splat in CONFIG_DEBUG_NET kernels and you can still
refactor later.
Thanks,
Daniel
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH net-next v2 1/2] net, sched: Make tc-related drop reason more flexible
2023-10-25 11:52 ` Daniel Borkmann
2023-10-25 13:21 ` Jamal Hadi Salim
@ 2023-10-25 13:53 ` Jakub Kicinski
1 sibling, 0 replies; 16+ messages in thread
From: Jakub Kicinski @ 2023-10-25 13:53 UTC (permalink / raw)
To: Daniel Borkmann
Cc: Jamal Hadi Salim, Ido Schimmel, netdev, bpf, victor, martin.lau,
dxu, xiyou.wangcong
On Wed, 25 Oct 2023 13:52:43 +0200 Daniel Borkmann wrote:
> Ido, Jamal, wdyt about this alternative approach - these were the locations I could
> find from an initial glance (compile-tested) :
If we were to vote on which is better, I vote for this approach.
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH net-next v2 1/2] net, sched: Make tc-related drop reason more flexible
2023-10-25 13:46 ` Daniel Borkmann
@ 2023-10-25 23:13 ` Jamal Hadi Salim
2023-10-27 14:01 ` Daniel Borkmann
0 siblings, 1 reply; 16+ messages in thread
From: Jamal Hadi Salim @ 2023-10-25 23:13 UTC (permalink / raw)
To: Daniel Borkmann
Cc: Ido Schimmel, kuba, netdev, bpf, victor, martin.lau, dxu,
xiyou.wangcong
On Wed, Oct 25, 2023 at 9:46 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
>
> On 10/25/23 3:21 PM, Jamal Hadi Salim wrote:
> > On Wed, Oct 25, 2023 at 7:52 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
> >> On 10/25/23 1:05 PM, Jamal Hadi Salim wrote:
> >>> On Wed, Oct 25, 2023 at 6:01 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
> >>>> On 10/25/23 10:59 AM, Ido Schimmel wrote:
> >>>>> On Mon, Oct 09, 2023 at 11:26:54AM +0200, Daniel Borkmann wrote:
> >>>>>> diff --git a/net/core/dev.c b/net/core/dev.c
> >>>>>> index 606a366cc209..664426285fa3 100644
> >>>>>> --- a/net/core/dev.c
> >>>>>> +++ b/net/core/dev.c
> >>>>>> @@ -3910,7 +3910,8 @@ EXPORT_SYMBOL_GPL(netdev_xmit_skip_txqueue);
> >>>>>> #endif /* CONFIG_NET_EGRESS */
> >>>>>>
> >>>>>> #ifdef CONFIG_NET_XGRESS
> >>>>>> -static int tc_run(struct tcx_entry *entry, struct sk_buff *skb)
> >>>>>> +static int tc_run(struct tcx_entry *entry, struct sk_buff *skb,
> >>>>>> + enum skb_drop_reason *drop_reason)
> >>>>>> {
> >>>>>> int ret = TC_ACT_UNSPEC;
> >>>>>> #ifdef CONFIG_NET_CLS_ACT
> >>>>>> @@ -3922,12 +3923,14 @@ static int tc_run(struct tcx_entry *entry, struct sk_buff *skb)
> >>>>>>
> >>>>>> tc_skb_cb(skb)->mru = 0;
> >>>>>> tc_skb_cb(skb)->post_ct = false;
> >>>>>> + res.drop_reason = *drop_reason;
> >>>>>>
> >>>>>> mini_qdisc_bstats_cpu_update(miniq, skb);
> >>>>>> ret = tcf_classify(skb, miniq->block, miniq->filter_list, &res, false);
> >>>>>> /* Only tcf related quirks below. */
> >>>>>> switch (ret) {
> >>>>>> case TC_ACT_SHOT:
> >>>>>> + *drop_reason = res.drop_reason;
> >>>>>
> >>>>> Daniel,
> >>>>>
> >>>>> Getting the following splat [1] with CONFIG_DEBUG_NET=y and this
> >>>>> reproducer [2]. Problem seems to be that classifiers clear 'struct
> >>>>> tcf_result::drop_reason', thereby triggering the warning in
> >>>>> __kfree_skb_reason() due to reason being 'SKB_NOT_DROPPED_YET' (0).
> >>>>>
> >>>>> Fixed by maintaining the original drop reason if the one returned from
> >>>>> tcf_classify() is 'SKB_NOT_DROPPED_YET' [3]. I can submit this fix
> >>>>> unless you have a better idea.
> >>>>
> >>>> Thanks for catching this, looks reasonable to me as a fix.
> >>>>
> >>>>> [1]
> >>>>> WARNING: CPU: 0 PID: 181 at net/core/skbuff.c:1082 kfree_skb_reason+0x38/0x130
> >>>>> Modules linked in:
> >>>>> CPU: 0 PID: 181 Comm: mausezahn Not tainted 6.6.0-rc6-custom-ge43e6d9582e0 #682
> >>>>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc37 04/01/2014
> >>>>> RIP: 0010:kfree_skb_reason+0x38/0x130
> >>>>> [...]
> >>>>> Call Trace:
> >>>>> <IRQ>
> >>>>> __netif_receive_skb_core.constprop.0+0x837/0xdb0
> >>>>> __netif_receive_skb_one_core+0x3c/0x70
> >>>>> process_backlog+0x95/0x130
> >>>>> __napi_poll+0x25/0x1b0
> >>>>> net_rx_action+0x29b/0x310
> >>>>> __do_softirq+0xc0/0x29b
> >>>>> do_softirq+0x43/0x60
> >>>>> </IRQ>
> >>>>>
> >>>>> [2]
> >>>>> #!/bin/bash
> >>>>>
> >>>>> ip link add name veth0 type veth peer name veth1
> >>>>> ip link set dev veth0 up
> >>>>> ip link set dev veth1 up
> >>>>> tc qdisc add dev veth1 clsact
> >>>>> tc filter add dev veth1 ingress pref 1 proto all flower dst_mac 00:11:22:33:44:55 action drop
> >>>>> mausezahn veth0 -a own -b 00:11:22:33:44:55 -q -c 1
> >>>>
> >>>> I didn't know you're using mausezahn, nice :)
> >>>>
> >>>>> [3]
> >>>>> diff --git a/net/core/dev.c b/net/core/dev.c
> >>>>> index a37a932a3e14..abd0b13f3f17 100644
> >>>>> --- a/net/core/dev.c
> >>>>> +++ b/net/core/dev.c
> >>>>> @@ -3929,7 +3929,8 @@ static int tc_run(struct tcx_entry *entry, struct sk_buff *skb,
> >>>>> /* Only tcf related quirks below. */
> >>>>> switch (ret) {
> >>>>> case TC_ACT_SHOT:
> >>>>> - *drop_reason = res.drop_reason;
> >>>>> + if (res.drop_reason != SKB_NOT_DROPPED_YET)
> >>>>> + *drop_reason = res.drop_reason;
> >>>>> mini_qdisc_qstats_cpu_drop(miniq);
> >>>>> break;
> >>>>> case TC_ACT_OK:
> >>>>>
> >>>
> >>> Out of curiosity - how does the policy say "drop" but drop_reason does
> >>> not reflect it?
> >>
> >> Ido, Jamal, wdyt about this alternative approach - these were the locations I could
> >> find from an initial glance (compile-tested) :
> >>
> >> From a3d46a55aac484372b60b783cb6a3c98a0fef75c Mon Sep 17 00:00:00 2001
> >> From: Daniel Borkmann <daniel@iogearbox.net>
> >> Date: Wed, 25 Oct 2023 11:43:44 +0000
> >> Subject: [PATCH] net, sched: fix..
> >>
> >> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
> >> ---
> >> include/net/pkt_cls.h | 12 ++++++++++++
> >> net/sched/cls_basic.c | 2 +-
> >> net/sched/cls_bpf.c | 2 +-
> >> net/sched/cls_flower.c | 2 +-
> >> net/sched/cls_fw.c | 2 +-
> >> net/sched/cls_matchall.c | 2 +-
> >> net/sched/cls_route.c | 4 ++--
> >> net/sched/cls_u32.c | 2 +-
> >> 8 files changed, 20 insertions(+), 8 deletions(-)
> >>
> >> diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
> >> index a76c9171db0e..31d8e8587824 100644
> >> --- a/include/net/pkt_cls.h
> >> +++ b/include/net/pkt_cls.h
> >> @@ -160,6 +160,18 @@ static inline void tcf_set_drop_reason(struct tcf_result *res,
> >> res->drop_reason = reason;
> >> }
> >>
> >> +static inline void tcf_set_result(struct tcf_result *to,
> >> + const struct tcf_result *from)
> >> +{
> >> + /* tcf_result's drop_reason which is the last member must be
> >> + * preserved and cannot be copied from the cls'es tcf_result
> >> + * template given this is carried all the way and potentially
> >> + * set to a concrete tc drop reason upon error or intentional
> >> + * drop. See tcf_set_drop_reason() locations.
> >> + */
> >> + memcpy(to, from, offsetof(typeof(*to), drop_reason));
> >> +}
> >> +
> >
> > Daniel, IMO, doing this at cls_api is best instead (like what Victors
> > or my original patch did). Iam ~30K feet right now with a lousy
> > keyboard - you can either do it, or i or Victor can send the patch by
> > end of day today. There are missing cases which were covered by Victor
> > and possibly something else will pop up next.
>
> Sure, if you have sth clean and simple for today, go for it. Otherwise I
> can cook a proper one out of this as a fix and ship it tomorrow AM, so we
> have a fix for the splat in CONFIG_DEBUG_NET kernels and you can still
> refactor later.
>
I was thinking something along these lines:
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -1662,6 +1662,7 @@ static inline int __tcf_classify(struct sk_buff *skb,
const int max_reclassify_loop = 16;
const struct tcf_proto *first_tp;
int limit = 0;
+ u32 reason_code = res->drop_reason;
reclassify:
#endif
@@ -1712,8 +1713,11 @@ static inline int __tcf_classify(struct sk_buff *skb,
goto reset;
}
#endif
- if (err >= 0)
+ if (err >= 0) {
+ if (err == TC_ACT_SHOT) /* policy drop */
+ tcf_set_drop_reason(res, orig_reason);
return err;
+ }
}
if (unlikely(n)) {
But tbh, i am struggling with the whole approach you took in the
earlier patch - i.e setting the drop reason from cls_act and then
expecting it not to be changed on policy drops; while it works for
clsact, it is not going to work for the rest of the qdiscs - unless we
change all the qdisc enqueues to take an extra param for drop_reason.
Thoughts?
cheers,
jamal
> Thanks,
> Daniel
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH net-next v2 1/2] net, sched: Make tc-related drop reason more flexible
2023-10-25 23:13 ` Jamal Hadi Salim
@ 2023-10-27 14:01 ` Daniel Borkmann
2023-10-27 17:37 ` Jamal Hadi Salim
0 siblings, 1 reply; 16+ messages in thread
From: Daniel Borkmann @ 2023-10-27 14:01 UTC (permalink / raw)
To: Jamal Hadi Salim
Cc: Ido Schimmel, kuba, netdev, bpf, victor, martin.lau, dxu,
xiyou.wangcong
On 10/26/23 1:13 AM, Jamal Hadi Salim wrote:
> On Wed, Oct 25, 2023 at 9:46 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
>> On 10/25/23 3:21 PM, Jamal Hadi Salim wrote:
>>> On Wed, Oct 25, 2023 at 7:52 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
>>>> On 10/25/23 1:05 PM, Jamal Hadi Salim wrote:
>>>>> On Wed, Oct 25, 2023 at 6:01 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
>>>>>> On 10/25/23 10:59 AM, Ido Schimmel wrote:
>>>>>>> On Mon, Oct 09, 2023 at 11:26:54AM +0200, Daniel Borkmann wrote:
>>>>>>>> diff --git a/net/core/dev.c b/net/core/dev.c
>>>>>>>> index 606a366cc209..664426285fa3 100644
>>>>>>>> --- a/net/core/dev.c
>>>>>>>> +++ b/net/core/dev.c
>>>>>>>> @@ -3910,7 +3910,8 @@ EXPORT_SYMBOL_GPL(netdev_xmit_skip_txqueue);
>>>>>>>> #endif /* CONFIG_NET_EGRESS */
>>>>>>>>
>>>>>>>> #ifdef CONFIG_NET_XGRESS
>>>>>>>> -static int tc_run(struct tcx_entry *entry, struct sk_buff *skb)
>>>>>>>> +static int tc_run(struct tcx_entry *entry, struct sk_buff *skb,
>>>>>>>> + enum skb_drop_reason *drop_reason)
>>>>>>>> {
>>>>>>>> int ret = TC_ACT_UNSPEC;
>>>>>>>> #ifdef CONFIG_NET_CLS_ACT
>>>>>>>> @@ -3922,12 +3923,14 @@ static int tc_run(struct tcx_entry *entry, struct sk_buff *skb)
>>>>>>>>
>>>>>>>> tc_skb_cb(skb)->mru = 0;
>>>>>>>> tc_skb_cb(skb)->post_ct = false;
>>>>>>>> + res.drop_reason = *drop_reason;
>>>>>>>>
>>>>>>>> mini_qdisc_bstats_cpu_update(miniq, skb);
>>>>>>>> ret = tcf_classify(skb, miniq->block, miniq->filter_list, &res, false);
>>>>>>>> /* Only tcf related quirks below. */
>>>>>>>> switch (ret) {
>>>>>>>> case TC_ACT_SHOT:
>>>>>>>> + *drop_reason = res.drop_reason;
>>>>>>>
>>>>>>> Daniel,
>>>>>>>
>>>>>>> Getting the following splat [1] with CONFIG_DEBUG_NET=y and this
>>>>>>> reproducer [2]. Problem seems to be that classifiers clear 'struct
>>>>>>> tcf_result::drop_reason', thereby triggering the warning in
>>>>>>> __kfree_skb_reason() due to reason being 'SKB_NOT_DROPPED_YET' (0).
>>>>>>>
>>>>>>> Fixed by maintaining the original drop reason if the one returned from
>>>>>>> tcf_classify() is 'SKB_NOT_DROPPED_YET' [3]. I can submit this fix
>>>>>>> unless you have a better idea.
>>>>>>
>>>>>> Thanks for catching this, looks reasonable to me as a fix.
>>>>>>
>>>>>>> [1]
>>>>>>> WARNING: CPU: 0 PID: 181 at net/core/skbuff.c:1082 kfree_skb_reason+0x38/0x130
>>>>>>> Modules linked in:
>>>>>>> CPU: 0 PID: 181 Comm: mausezahn Not tainted 6.6.0-rc6-custom-ge43e6d9582e0 #682
>>>>>>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc37 04/01/2014
>>>>>>> RIP: 0010:kfree_skb_reason+0x38/0x130
>>>>>>> [...]
>>>>>>> Call Trace:
>>>>>>> <IRQ>
>>>>>>> __netif_receive_skb_core.constprop.0+0x837/0xdb0
>>>>>>> __netif_receive_skb_one_core+0x3c/0x70
>>>>>>> process_backlog+0x95/0x130
>>>>>>> __napi_poll+0x25/0x1b0
>>>>>>> net_rx_action+0x29b/0x310
>>>>>>> __do_softirq+0xc0/0x29b
>>>>>>> do_softirq+0x43/0x60
>>>>>>> </IRQ>
>>>>>>>
>>>>>>> [2]
>>>>>>> #!/bin/bash
>>>>>>>
>>>>>>> ip link add name veth0 type veth peer name veth1
>>>>>>> ip link set dev veth0 up
>>>>>>> ip link set dev veth1 up
>>>>>>> tc qdisc add dev veth1 clsact
>>>>>>> tc filter add dev veth1 ingress pref 1 proto all flower dst_mac 00:11:22:33:44:55 action drop
>>>>>>> mausezahn veth0 -a own -b 00:11:22:33:44:55 -q -c 1
>>>>>>
>>>>>> I didn't know you're using mausezahn, nice :)
>>>>>>
>>>>>>> [3]
>>>>>>> diff --git a/net/core/dev.c b/net/core/dev.c
>>>>>>> index a37a932a3e14..abd0b13f3f17 100644
>>>>>>> --- a/net/core/dev.c
>>>>>>> +++ b/net/core/dev.c
>>>>>>> @@ -3929,7 +3929,8 @@ static int tc_run(struct tcx_entry *entry, struct sk_buff *skb,
>>>>>>> /* Only tcf related quirks below. */
>>>>>>> switch (ret) {
>>>>>>> case TC_ACT_SHOT:
>>>>>>> - *drop_reason = res.drop_reason;
>>>>>>> + if (res.drop_reason != SKB_NOT_DROPPED_YET)
>>>>>>> + *drop_reason = res.drop_reason;
>>>>>>> mini_qdisc_qstats_cpu_drop(miniq);
>>>>>>> break;
>>>>>>> case TC_ACT_OK:
>>>>>>>
>>>>>
>>>>> Out of curiosity - how does the policy say "drop" but drop_reason does
>>>>> not reflect it?
>>>>
>>>> Ido, Jamal, wdyt about this alternative approach - these were the locations I could
>>>> find from an initial glance (compile-tested) :
>>>>
>>>> From a3d46a55aac484372b60b783cb6a3c98a0fef75c Mon Sep 17 00:00:00 2001
>>>> From: Daniel Borkmann <daniel@iogearbox.net>
>>>> Date: Wed, 25 Oct 2023 11:43:44 +0000
>>>> Subject: [PATCH] net, sched: fix..
>>>>
>>>> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
>>>> ---
>>>> include/net/pkt_cls.h | 12 ++++++++++++
>>>> net/sched/cls_basic.c | 2 +-
>>>> net/sched/cls_bpf.c | 2 +-
>>>> net/sched/cls_flower.c | 2 +-
>>>> net/sched/cls_fw.c | 2 +-
>>>> net/sched/cls_matchall.c | 2 +-
>>>> net/sched/cls_route.c | 4 ++--
>>>> net/sched/cls_u32.c | 2 +-
>>>> 8 files changed, 20 insertions(+), 8 deletions(-)
>>>>
>>>> diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
>>>> index a76c9171db0e..31d8e8587824 100644
>>>> --- a/include/net/pkt_cls.h
>>>> +++ b/include/net/pkt_cls.h
>>>> @@ -160,6 +160,18 @@ static inline void tcf_set_drop_reason(struct tcf_result *res,
>>>> res->drop_reason = reason;
>>>> }
>>>>
>>>> +static inline void tcf_set_result(struct tcf_result *to,
>>>> + const struct tcf_result *from)
>>>> +{
>>>> + /* tcf_result's drop_reason which is the last member must be
>>>> + * preserved and cannot be copied from the cls'es tcf_result
>>>> + * template given this is carried all the way and potentially
>>>> + * set to a concrete tc drop reason upon error or intentional
>>>> + * drop. See tcf_set_drop_reason() locations.
>>>> + */
>>>> + memcpy(to, from, offsetof(typeof(*to), drop_reason));
>>>> +}
>>>> +
>>>
>>> Daniel, IMO, doing this at cls_api is best instead (like what Victors
>>> or my original patch did). Iam ~30K feet right now with a lousy
>>> keyboard - you can either do it, or i or Victor can send the patch by
>>> end of day today. There are missing cases which were covered by Victor
>>> and possibly something else will pop up next.
>>
>> Sure, if you have sth clean and simple for today, go for it. Otherwise I
>> can cook a proper one out of this as a fix and ship it tomorrow AM, so we
>> have a fix for the splat in CONFIG_DEBUG_NET kernels and you can still
>> refactor later.
>
> I was thinking something along these lines:
>
> --- a/net/sched/cls_api.c
> +++ b/net/sched/cls_api.c
> @@ -1662,6 +1662,7 @@ static inline int __tcf_classify(struct sk_buff *skb,
> const int max_reclassify_loop = 16;
> const struct tcf_proto *first_tp;
> int limit = 0;
> + u32 reason_code = res->drop_reason;
>
> reclassify:
> #endif
> @@ -1712,8 +1713,11 @@ static inline int __tcf_classify(struct sk_buff *skb,
> goto reset;
> }
> #endif
> - if (err >= 0)
> + if (err >= 0) {
> + if (err == TC_ACT_SHOT) /* policy drop */
> + tcf_set_drop_reason(res, orig_reason);
> return err;
> + }
> }
>
> if (unlikely(n)) {
>
> But tbh, i am struggling with the whole approach you took in the
> earlier patch - i.e setting the drop reason from cls_act and then
> expecting it not to be changed on policy drops; while it works for
> clsact, it is not going to work for the rest of the qdiscs - unless we
> change all the qdisc enqueues to take an extra param for drop_reason.
> Thoughts?
The downside of the above would be that you then cannot make use of
tcf_set_drop_reason() further down the tc engine, for example, within
classifiers/actions, e.g. tcf_action_exec() where you also have drops
like:
[...]
} else if (TC_ACT_EXT_CMP(ret, TC_ACT_GOTO_CHAIN)) {
if (unlikely(!rcu_access_pointer(a->goto_chain))) {
net_warn_ratelimited("can't go to NULL chain!\n");
return TC_ACT_SHOT;
}
tcf_action_goto_chain_exec(a, res);
}
[...]
If I spot this correctly, qdiscs also use tcf_classify() as well, and
so far none of them have kfree_skb_reason() support, but if you plan to
add it you can just set a default res.drop_reason from there as well
since tcf_classify() already takes the res param.
Thanks,
Daniel
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH net-next v2 1/2] net, sched: Make tc-related drop reason more flexible
2023-10-27 14:01 ` Daniel Borkmann
@ 2023-10-27 17:37 ` Jamal Hadi Salim
2023-10-27 18:29 ` Daniel Borkmann
0 siblings, 1 reply; 16+ messages in thread
From: Jamal Hadi Salim @ 2023-10-27 17:37 UTC (permalink / raw)
To: Daniel Borkmann
Cc: Ido Schimmel, kuba, netdev, bpf, victor, martin.lau, dxu,
xiyou.wangcong
On Fri, Oct 27, 2023 at 10:01 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
>
> On 10/26/23 1:13 AM, Jamal Hadi Salim wrote:
> > On Wed, Oct 25, 2023 at 9:46 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
> >> On 10/25/23 3:21 PM, Jamal Hadi Salim wrote:
> >>> On Wed, Oct 25, 2023 at 7:52 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
> >>>> On 10/25/23 1:05 PM, Jamal Hadi Salim wrote:
> >>>>> On Wed, Oct 25, 2023 at 6:01 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
> >>>>>> On 10/25/23 10:59 AM, Ido Schimmel wrote:
> >>>>>>> On Mon, Oct 09, 2023 at 11:26:54AM +0200, Daniel Borkmann wrote:
> >>>>>>>> diff --git a/net/core/dev.c b/net/core/dev.c
> >>>>>>>> index 606a366cc209..664426285fa3 100644
> >>>>>>>> --- a/net/core/dev.c
> >>>>>>>> +++ b/net/core/dev.c
> >>>>>>>> @@ -3910,7 +3910,8 @@ EXPORT_SYMBOL_GPL(netdev_xmit_skip_txqueue);
> >>>>>>>> #endif /* CONFIG_NET_EGRESS */
> >>>>>>>>
> >>>>>>>> #ifdef CONFIG_NET_XGRESS
> >>>>>>>> -static int tc_run(struct tcx_entry *entry, struct sk_buff *skb)
> >>>>>>>> +static int tc_run(struct tcx_entry *entry, struct sk_buff *skb,
> >>>>>>>> + enum skb_drop_reason *drop_reason)
> >>>>>>>> {
> >>>>>>>> int ret = TC_ACT_UNSPEC;
> >>>>>>>> #ifdef CONFIG_NET_CLS_ACT
> >>>>>>>> @@ -3922,12 +3923,14 @@ static int tc_run(struct tcx_entry *entry, struct sk_buff *skb)
> >>>>>>>>
> >>>>>>>> tc_skb_cb(skb)->mru = 0;
> >>>>>>>> tc_skb_cb(skb)->post_ct = false;
> >>>>>>>> + res.drop_reason = *drop_reason;
> >>>>>>>>
> >>>>>>>> mini_qdisc_bstats_cpu_update(miniq, skb);
> >>>>>>>> ret = tcf_classify(skb, miniq->block, miniq->filter_list, &res, false);
> >>>>>>>> /* Only tcf related quirks below. */
> >>>>>>>> switch (ret) {
> >>>>>>>> case TC_ACT_SHOT:
> >>>>>>>> + *drop_reason = res.drop_reason;
> >>>>>>>
> >>>>>>> Daniel,
> >>>>>>>
> >>>>>>> Getting the following splat [1] with CONFIG_DEBUG_NET=y and this
> >>>>>>> reproducer [2]. Problem seems to be that classifiers clear 'struct
> >>>>>>> tcf_result::drop_reason', thereby triggering the warning in
> >>>>>>> __kfree_skb_reason() due to reason being 'SKB_NOT_DROPPED_YET' (0).
> >>>>>>>
> >>>>>>> Fixed by maintaining the original drop reason if the one returned from
> >>>>>>> tcf_classify() is 'SKB_NOT_DROPPED_YET' [3]. I can submit this fix
> >>>>>>> unless you have a better idea.
> >>>>>>
> >>>>>> Thanks for catching this, looks reasonable to me as a fix.
> >>>>>>
> >>>>>>> [1]
> >>>>>>> WARNING: CPU: 0 PID: 181 at net/core/skbuff.c:1082 kfree_skb_reason+0x38/0x130
> >>>>>>> Modules linked in:
> >>>>>>> CPU: 0 PID: 181 Comm: mausezahn Not tainted 6.6.0-rc6-custom-ge43e6d9582e0 #682
> >>>>>>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc37 04/01/2014
> >>>>>>> RIP: 0010:kfree_skb_reason+0x38/0x130
> >>>>>>> [...]
> >>>>>>> Call Trace:
> >>>>>>> <IRQ>
> >>>>>>> __netif_receive_skb_core.constprop.0+0x837/0xdb0
> >>>>>>> __netif_receive_skb_one_core+0x3c/0x70
> >>>>>>> process_backlog+0x95/0x130
> >>>>>>> __napi_poll+0x25/0x1b0
> >>>>>>> net_rx_action+0x29b/0x310
> >>>>>>> __do_softirq+0xc0/0x29b
> >>>>>>> do_softirq+0x43/0x60
> >>>>>>> </IRQ>
> >>>>>>>
> >>>>>>> [2]
> >>>>>>> #!/bin/bash
> >>>>>>>
> >>>>>>> ip link add name veth0 type veth peer name veth1
> >>>>>>> ip link set dev veth0 up
> >>>>>>> ip link set dev veth1 up
> >>>>>>> tc qdisc add dev veth1 clsact
> >>>>>>> tc filter add dev veth1 ingress pref 1 proto all flower dst_mac 00:11:22:33:44:55 action drop
> >>>>>>> mausezahn veth0 -a own -b 00:11:22:33:44:55 -q -c 1
> >>>>>>
> >>>>>> I didn't know you're using mausezahn, nice :)
> >>>>>>
> >>>>>>> [3]
> >>>>>>> diff --git a/net/core/dev.c b/net/core/dev.c
> >>>>>>> index a37a932a3e14..abd0b13f3f17 100644
> >>>>>>> --- a/net/core/dev.c
> >>>>>>> +++ b/net/core/dev.c
> >>>>>>> @@ -3929,7 +3929,8 @@ static int tc_run(struct tcx_entry *entry, struct sk_buff *skb,
> >>>>>>> /* Only tcf related quirks below. */
> >>>>>>> switch (ret) {
> >>>>>>> case TC_ACT_SHOT:
> >>>>>>> - *drop_reason = res.drop_reason;
> >>>>>>> + if (res.drop_reason != SKB_NOT_DROPPED_YET)
> >>>>>>> + *drop_reason = res.drop_reason;
> >>>>>>> mini_qdisc_qstats_cpu_drop(miniq);
> >>>>>>> break;
> >>>>>>> case TC_ACT_OK:
> >>>>>>>
> >>>>>
> >>>>> Out of curiosity - how does the policy say "drop" but drop_reason does
> >>>>> not reflect it?
> >>>>
> >>>> Ido, Jamal, wdyt about this alternative approach - these were the locations I could
> >>>> find from an initial glance (compile-tested) :
> >>>>
> >>>> From a3d46a55aac484372b60b783cb6a3c98a0fef75c Mon Sep 17 00:00:00 2001
> >>>> From: Daniel Borkmann <daniel@iogearbox.net>
> >>>> Date: Wed, 25 Oct 2023 11:43:44 +0000
> >>>> Subject: [PATCH] net, sched: fix..
> >>>>
> >>>> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
> >>>> ---
> >>>> include/net/pkt_cls.h | 12 ++++++++++++
> >>>> net/sched/cls_basic.c | 2 +-
> >>>> net/sched/cls_bpf.c | 2 +-
> >>>> net/sched/cls_flower.c | 2 +-
> >>>> net/sched/cls_fw.c | 2 +-
> >>>> net/sched/cls_matchall.c | 2 +-
> >>>> net/sched/cls_route.c | 4 ++--
> >>>> net/sched/cls_u32.c | 2 +-
> >>>> 8 files changed, 20 insertions(+), 8 deletions(-)
> >>>>
> >>>> diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
> >>>> index a76c9171db0e..31d8e8587824 100644
> >>>> --- a/include/net/pkt_cls.h
> >>>> +++ b/include/net/pkt_cls.h
> >>>> @@ -160,6 +160,18 @@ static inline void tcf_set_drop_reason(struct tcf_result *res,
> >>>> res->drop_reason = reason;
> >>>> }
> >>>>
> >>>> +static inline void tcf_set_result(struct tcf_result *to,
> >>>> + const struct tcf_result *from)
> >>>> +{
> >>>> + /* tcf_result's drop_reason which is the last member must be
> >>>> + * preserved and cannot be copied from the cls'es tcf_result
> >>>> + * template given this is carried all the way and potentially
> >>>> + * set to a concrete tc drop reason upon error or intentional
> >>>> + * drop. See tcf_set_drop_reason() locations.
> >>>> + */
> >>>> + memcpy(to, from, offsetof(typeof(*to), drop_reason));
> >>>> +}
> >>>> +
> >>>
> >>> Daniel, IMO, doing this at cls_api is best instead (like what Victors
> >>> or my original patch did). Iam ~30K feet right now with a lousy
> >>> keyboard - you can either do it, or i or Victor can send the patch by
> >>> end of day today. There are missing cases which were covered by Victor
> >>> and possibly something else will pop up next.
> >>
> >> Sure, if you have sth clean and simple for today, go for it. Otherwise I
> >> can cook a proper one out of this as a fix and ship it tomorrow AM, so we
> >> have a fix for the splat in CONFIG_DEBUG_NET kernels and you can still
> >> refactor later.
> >
> > I was thinking something along these lines:
> >
> > --- a/net/sched/cls_api.c
> > +++ b/net/sched/cls_api.c
> > @@ -1662,6 +1662,7 @@ static inline int __tcf_classify(struct sk_buff *skb,
> > const int max_reclassify_loop = 16;
> > const struct tcf_proto *first_tp;
> > int limit = 0;
> > + u32 reason_code = res->drop_reason;
> >
> > reclassify:
> > #endif
> > @@ -1712,8 +1713,11 @@ static inline int __tcf_classify(struct sk_buff *skb,
> > goto reset;
> > }
> > #endif
> > - if (err >= 0)
> > + if (err >= 0) {
> > + if (err == TC_ACT_SHOT) /* policy drop */
> > + tcf_set_drop_reason(res, orig_reason);
> > return err;
> > + }
> > }
> >
> > if (unlikely(n)) {
> >
> > But tbh, i am struggling with the whole approach you took in the
> > earlier patch - i.e setting the drop reason from cls_act and then
> > expecting it not to be changed on policy drops; while it works for
> > clsact, it is not going to work for the rest of the qdiscs - unless we
> > change all the qdisc enqueues to take an extra param for drop_reason.
> > Thoughts?
>
> The downside of the above would be that you then cannot make use of
> tcf_set_drop_reason() further down the tc engine, for example, within
> classifiers/actions, e.g. tcf_action_exec() where you also have drops
> like:
>
> [...]
> } else if (TC_ACT_EXT_CMP(ret, TC_ACT_GOTO_CHAIN)) {
> if (unlikely(!rcu_access_pointer(a->goto_chain))) {
> net_warn_ratelimited("can't go to NULL chain!\n");
> return TC_ACT_SHOT;
> }
> tcf_action_goto_chain_exec(a, res);
> }
> [...]
Unless i misread your approach it is what you are doing? i.e set a
drop reason you dont expect to be changed on policy decision..
> If I spot this correctly, qdiscs also use tcf_classify() as well, and
> so far none of them have kfree_skb_reason() support, but if you plan to
> add it you can just set a default res.drop_reason from there as well
> since tcf_classify() already takes the res param.
>
The issue is the calling code starting at the dev_queue_xmit will
overwrite the code. You dont have this problem because ingress and
clsact both are self contained. A better solution is to also set it in
some skb->metadata (tc_cb should be enough); i note that looking at
the calling code it is done per batch of skbs with exactly the same
code reason for the whole batch instead per skb (git says Eric added
this change as an optimization)
cheers,
jamal
> Thanks,
> Daniel
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH net-next v2 1/2] net, sched: Make tc-related drop reason more flexible
2023-10-27 17:37 ` Jamal Hadi Salim
@ 2023-10-27 18:29 ` Daniel Borkmann
0 siblings, 0 replies; 16+ messages in thread
From: Daniel Borkmann @ 2023-10-27 18:29 UTC (permalink / raw)
To: Jamal Hadi Salim
Cc: Ido Schimmel, kuba, netdev, bpf, victor, martin.lau, dxu,
xiyou.wangcong
On 10/27/23 7:37 PM, Jamal Hadi Salim wrote:
> On Fri, Oct 27, 2023 at 10:01 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
>> On 10/26/23 1:13 AM, Jamal Hadi Salim wrote:
>>> On Wed, Oct 25, 2023 at 9:46 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
>>>> On 10/25/23 3:21 PM, Jamal Hadi Salim wrote:
>>>>> On Wed, Oct 25, 2023 at 7:52 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
>>>>>> On 10/25/23 1:05 PM, Jamal Hadi Salim wrote:
>>>>>>> On Wed, Oct 25, 2023 at 6:01 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
>>>>>>>> On 10/25/23 10:59 AM, Ido Schimmel wrote:
>>>>>>>>> On Mon, Oct 09, 2023 at 11:26:54AM +0200, Daniel Borkmann wrote:
>>>>>>>>>> diff --git a/net/core/dev.c b/net/core/dev.c
>>>>>>>>>> index 606a366cc209..664426285fa3 100644
>>>>>>>>>> --- a/net/core/dev.c
>>>>>>>>>> +++ b/net/core/dev.c
>>>>>>>>>> @@ -3910,7 +3910,8 @@ EXPORT_SYMBOL_GPL(netdev_xmit_skip_txqueue);
>>>>>>>>>> #endif /* CONFIG_NET_EGRESS */
>>>>>>>>>>
>>>>>>>>>> #ifdef CONFIG_NET_XGRESS
>>>>>>>>>> -static int tc_run(struct tcx_entry *entry, struct sk_buff *skb)
>>>>>>>>>> +static int tc_run(struct tcx_entry *entry, struct sk_buff *skb,
>>>>>>>>>> + enum skb_drop_reason *drop_reason)
>>>>>>>>>> {
>>>>>>>>>> int ret = TC_ACT_UNSPEC;
>>>>>>>>>> #ifdef CONFIG_NET_CLS_ACT
>>>>>>>>>> @@ -3922,12 +3923,14 @@ static int tc_run(struct tcx_entry *entry, struct sk_buff *skb)
>>>>>>>>>>
>>>>>>>>>> tc_skb_cb(skb)->mru = 0;
>>>>>>>>>> tc_skb_cb(skb)->post_ct = false;
>>>>>>>>>> + res.drop_reason = *drop_reason;
>>>>>>>>>>
>>>>>>>>>> mini_qdisc_bstats_cpu_update(miniq, skb);
>>>>>>>>>> ret = tcf_classify(skb, miniq->block, miniq->filter_list, &res, false);
>>>>>>>>>> /* Only tcf related quirks below. */
>>>>>>>>>> switch (ret) {
>>>>>>>>>> case TC_ACT_SHOT:
>>>>>>>>>> + *drop_reason = res.drop_reason;
>>>>>>>>>
>>>>>>>>> Daniel,
>>>>>>>>>
>>>>>>>>> Getting the following splat [1] with CONFIG_DEBUG_NET=y and this
>>>>>>>>> reproducer [2]. Problem seems to be that classifiers clear 'struct
>>>>>>>>> tcf_result::drop_reason', thereby triggering the warning in
>>>>>>>>> __kfree_skb_reason() due to reason being 'SKB_NOT_DROPPED_YET' (0).
>>>>>>>>>
>>>>>>>>> Fixed by maintaining the original drop reason if the one returned from
>>>>>>>>> tcf_classify() is 'SKB_NOT_DROPPED_YET' [3]. I can submit this fix
>>>>>>>>> unless you have a better idea.
>>>>>>>>
>>>>>>>> Thanks for catching this, looks reasonable to me as a fix.
>>>>>>>>
>>>>>>>>> [1]
>>>>>>>>> WARNING: CPU: 0 PID: 181 at net/core/skbuff.c:1082 kfree_skb_reason+0x38/0x130
>>>>>>>>> Modules linked in:
>>>>>>>>> CPU: 0 PID: 181 Comm: mausezahn Not tainted 6.6.0-rc6-custom-ge43e6d9582e0 #682
>>>>>>>>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc37 04/01/2014
>>>>>>>>> RIP: 0010:kfree_skb_reason+0x38/0x130
>>>>>>>>> [...]
>>>>>>>>> Call Trace:
>>>>>>>>> <IRQ>
>>>>>>>>> __netif_receive_skb_core.constprop.0+0x837/0xdb0
>>>>>>>>> __netif_receive_skb_one_core+0x3c/0x70
>>>>>>>>> process_backlog+0x95/0x130
>>>>>>>>> __napi_poll+0x25/0x1b0
>>>>>>>>> net_rx_action+0x29b/0x310
>>>>>>>>> __do_softirq+0xc0/0x29b
>>>>>>>>> do_softirq+0x43/0x60
>>>>>>>>> </IRQ>
>>>>>>>>>
>>>>>>>>> [2]
>>>>>>>>> #!/bin/bash
>>>>>>>>>
>>>>>>>>> ip link add name veth0 type veth peer name veth1
>>>>>>>>> ip link set dev veth0 up
>>>>>>>>> ip link set dev veth1 up
>>>>>>>>> tc qdisc add dev veth1 clsact
>>>>>>>>> tc filter add dev veth1 ingress pref 1 proto all flower dst_mac 00:11:22:33:44:55 action drop
>>>>>>>>> mausezahn veth0 -a own -b 00:11:22:33:44:55 -q -c 1
>>>>>>>>
>>>>>>>> I didn't know you're using mausezahn, nice :)
>>>>>>>>
>>>>>>>>> [3]
>>>>>>>>> diff --git a/net/core/dev.c b/net/core/dev.c
>>>>>>>>> index a37a932a3e14..abd0b13f3f17 100644
>>>>>>>>> --- a/net/core/dev.c
>>>>>>>>> +++ b/net/core/dev.c
>>>>>>>>> @@ -3929,7 +3929,8 @@ static int tc_run(struct tcx_entry *entry, struct sk_buff *skb,
>>>>>>>>> /* Only tcf related quirks below. */
>>>>>>>>> switch (ret) {
>>>>>>>>> case TC_ACT_SHOT:
>>>>>>>>> - *drop_reason = res.drop_reason;
>>>>>>>>> + if (res.drop_reason != SKB_NOT_DROPPED_YET)
>>>>>>>>> + *drop_reason = res.drop_reason;
>>>>>>>>> mini_qdisc_qstats_cpu_drop(miniq);
>>>>>>>>> break;
>>>>>>>>> case TC_ACT_OK:
>>>>>>>>>
>>>>>>>
>>>>>>> Out of curiosity - how does the policy say "drop" but drop_reason does
>>>>>>> not reflect it?
>>>>>>
>>>>>> Ido, Jamal, wdyt about this alternative approach - these were the locations I could
>>>>>> find from an initial glance (compile-tested) :
>>>>>>
>>>>>> From a3d46a55aac484372b60b783cb6a3c98a0fef75c Mon Sep 17 00:00:00 2001
>>>>>> From: Daniel Borkmann <daniel@iogearbox.net>
>>>>>> Date: Wed, 25 Oct 2023 11:43:44 +0000
>>>>>> Subject: [PATCH] net, sched: fix..
>>>>>>
>>>>>> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
>>>>>> ---
>>>>>> include/net/pkt_cls.h | 12 ++++++++++++
>>>>>> net/sched/cls_basic.c | 2 +-
>>>>>> net/sched/cls_bpf.c | 2 +-
>>>>>> net/sched/cls_flower.c | 2 +-
>>>>>> net/sched/cls_fw.c | 2 +-
>>>>>> net/sched/cls_matchall.c | 2 +-
>>>>>> net/sched/cls_route.c | 4 ++--
>>>>>> net/sched/cls_u32.c | 2 +-
>>>>>> 8 files changed, 20 insertions(+), 8 deletions(-)
>>>>>>
>>>>>> diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
>>>>>> index a76c9171db0e..31d8e8587824 100644
>>>>>> --- a/include/net/pkt_cls.h
>>>>>> +++ b/include/net/pkt_cls.h
>>>>>> @@ -160,6 +160,18 @@ static inline void tcf_set_drop_reason(struct tcf_result *res,
>>>>>> res->drop_reason = reason;
>>>>>> }
>>>>>>
>>>>>> +static inline void tcf_set_result(struct tcf_result *to,
>>>>>> + const struct tcf_result *from)
>>>>>> +{
>>>>>> + /* tcf_result's drop_reason which is the last member must be
>>>>>> + * preserved and cannot be copied from the cls'es tcf_result
>>>>>> + * template given this is carried all the way and potentially
>>>>>> + * set to a concrete tc drop reason upon error or intentional
>>>>>> + * drop. See tcf_set_drop_reason() locations.
>>>>>> + */
>>>>>> + memcpy(to, from, offsetof(typeof(*to), drop_reason));
>>>>>> +}
>>>>>> +
>>>>>
>>>>> Daniel, IMO, doing this at cls_api is best instead (like what Victors
>>>>> or my original patch did). Iam ~30K feet right now with a lousy
>>>>> keyboard - you can either do it, or i or Victor can send the patch by
>>>>> end of day today. There are missing cases which were covered by Victor
>>>>> and possibly something else will pop up next.
>>>>
>>>> Sure, if you have sth clean and simple for today, go for it. Otherwise I
>>>> can cook a proper one out of this as a fix and ship it tomorrow AM, so we
>>>> have a fix for the splat in CONFIG_DEBUG_NET kernels and you can still
>>>> refactor later.
>>>
>>> I was thinking something along these lines:
>>>
>>> --- a/net/sched/cls_api.c
>>> +++ b/net/sched/cls_api.c
>>> @@ -1662,6 +1662,7 @@ static inline int __tcf_classify(struct sk_buff *skb,
>>> const int max_reclassify_loop = 16;
>>> const struct tcf_proto *first_tp;
>>> int limit = 0;
>>> + u32 reason_code = res->drop_reason;
>>>
>>> reclassify:
>>> #endif
>>> @@ -1712,8 +1713,11 @@ static inline int __tcf_classify(struct sk_buff *skb,
>>> goto reset;
>>> }
>>> #endif
>>> - if (err >= 0)
>>> + if (err >= 0) {
>>> + if (err == TC_ACT_SHOT) /* policy drop */
>>> + tcf_set_drop_reason(res, orig_reason);
>>> return err;
>>> + }
>>> }
>>>
>>> if (unlikely(n)) {
>>>
>>> But tbh, i am struggling with the whole approach you took in the
>>> earlier patch - i.e setting the drop reason from cls_act and then
>>> expecting it not to be changed on policy drops; while it works for
>>> clsact, it is not going to work for the rest of the qdiscs - unless we
>>> change all the qdisc enqueues to take an extra param for drop_reason.
>>> Thoughts?
>>
>> The downside of the above would be that you then cannot make use of
>> tcf_set_drop_reason() further down the tc engine, for example, within
>> classifiers/actions, e.g. tcf_action_exec() where you also have drops
>> like:
>>
>> [...]
>> } else if (TC_ACT_EXT_CMP(ret, TC_ACT_GOTO_CHAIN)) {
>> if (unlikely(!rcu_access_pointer(a->goto_chain))) {
>> net_warn_ratelimited("can't go to NULL chain!\n");
>> return TC_ACT_SHOT;
>> }
>> tcf_action_goto_chain_exec(a, res);
>> }
>> [...]
>
> Unless i misread your approach it is what you are doing? i.e set a
> drop reason you dont expect to be changed on policy decision..
I meant as just mentioned in the other thread if you were to place a
tcf_set_drop_reason() somewhere near the net_warn_ratelimited(), then
then the above diff will think it's a policy drop (even though it was
not), and use the orig_reason.
>> If I spot this correctly, qdiscs also use tcf_classify() as well, and
>> so far none of them have kfree_skb_reason() support, but if you plan to
>> add it you can just set a default res.drop_reason from there as well
>> since tcf_classify() already takes the res param.
>
> The issue is the calling code starting at the dev_queue_xmit will
> overwrite the code. You dont have this problem because ingress and
> clsact both are self contained. A better solution is to also set it in
> some skb->metadata (tc_cb should be enough); i note that looking at
> the calling code it is done per batch of skbs with exactly the same
> code reason for the whole batch instead per skb (git says Eric added
> this change as an optimization)
Yeah for regular qdiscs you might need to temporarily store this in tc cb
if you wanted to extend the drop reason all the way and support passing
this along via __qdisc_drop() etc.
Thanks,
Daniel
^ permalink raw reply [flat|nested] 16+ messages in thread
end of thread, other threads:[~2023-10-27 18:31 UTC | newest]
Thread overview: 16+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-10-09 9:26 [PATCH net-next v2 1/2] net, sched: Make tc-related drop reason more flexible Daniel Borkmann
2023-10-09 9:26 ` [PATCH net-next v2 2/2] net, sched: Add tcf_set_drop_reason for {__,}tcf_classify Daniel Borkmann
2023-10-09 14:03 ` [PATCH net-next v2 1/2] net, sched: Make tc-related drop reason more flexible Jamal Hadi Salim
2023-10-13 21:17 ` Jakub Kicinski
2023-10-16 20:10 ` patchwork-bot+netdevbpf
2023-10-25 8:59 ` Ido Schimmel
2023-10-25 10:01 ` Daniel Borkmann
2023-10-25 11:05 ` Jamal Hadi Salim
2023-10-25 11:52 ` Daniel Borkmann
2023-10-25 13:21 ` Jamal Hadi Salim
2023-10-25 13:46 ` Daniel Borkmann
2023-10-25 23:13 ` Jamal Hadi Salim
2023-10-27 14:01 ` Daniel Borkmann
2023-10-27 17:37 ` Jamal Hadi Salim
2023-10-27 18:29 ` Daniel Borkmann
2023-10-25 13:53 ` Jakub Kicinski
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).