* Storing sk_buffs as kptrs in map
@ 2024-11-26 17:05 Maciej Fijalkowski
2024-11-26 19:56 ` [External] " Amery Hung
0 siblings, 1 reply; 10+ messages in thread
From: Maciej Fijalkowski @ 2024-11-26 17:05 UTC (permalink / raw)
To: bpf; +Cc: magnus.karlsson, sreedevi.joshi, amery.hung, ast
Hello eBPFers,
I have a use case where I would like to store sk_buff pointers as kptrs in
eBPF map. To do so, I am borrowing skb kfuncs for acquire/release/destroy
from Amery Hung's bpf qdisc set [0], but they are registered for
BPF_PROG_TYPE_SCHED_CLS programs.
TL;DR - due to following callstack:
do_check()
check_kfunc_call()
check_kfunc_args()
get_kfunc_ptr_arg_type()
btf_is_prog_ctx_type()
btf_is_projection_of() -- return true
sk_buff argument is being interpreted as KF_ARG_PTR_TO_CTX, but what we
have there is KF_ARG_PTR_TO_BTF_ID. Verifier is unhappy about it. Should
this be workarounded via some typedef or adding mentioned kfuncs to
special_kfunc_list ? If the latter, then what else needs to be handled?
Commenting out sk_buff part from btf_is_projection_of() makes it work, but
that probably is not a solution:)
Another question is in case bpf qdisc set lands, could we have these
kfuncs not being limited to BPF_PROG_TYPE_STRUCT_OPS ?
I would be thankful for any pointers/stions regarding this issue.
Maciej
[0]: https://lore.kernel.org/bpf/20240714175130.4051012-7-amery.hung@bytedance.com/
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [External] Storing sk_buffs as kptrs in map
2024-11-26 17:05 Storing sk_buffs as kptrs in map Maciej Fijalkowski
@ 2024-11-26 19:56 ` Amery Hung
2024-11-26 20:47 ` Martin KaFai Lau
0 siblings, 1 reply; 10+ messages in thread
From: Amery Hung @ 2024-11-26 19:56 UTC (permalink / raw)
To: Maciej Fijalkowski; +Cc: bpf, magnus.karlsson, sreedevi.joshi, ast
On Tue, Nov 26, 2024 at 9:06 AM Maciej Fijalkowski
<maciej.fijalkowski@intel.com> wrote:
>
> Hello eBPFers,
>
> I have a use case where I would like to store sk_buff pointers as kptrs in
> eBPF map. To do so, I am borrowing skb kfuncs for acquire/release/destroy
> from Amery Hung's bpf qdisc set [0], but they are registered for
> BPF_PROG_TYPE_SCHED_CLS programs.
>
> TL;DR - due to following callstack:
>
> do_check()
> check_kfunc_call()
> check_kfunc_args()
> get_kfunc_ptr_arg_type()
> btf_is_prog_ctx_type()
> btf_is_projection_of() -- return true
>
> sk_buff argument is being interpreted as KF_ARG_PTR_TO_CTX, but what we
> have there is KF_ARG_PTR_TO_BTF_ID. Verifier is unhappy about it. Should
> this be workarounded via some typedef or adding mentioned kfuncs to
> special_kfunc_list ? If the latter, then what else needs to be handled?
>
> Commenting out sk_buff part from btf_is_projection_of() makes it work, but
> that probably is not a solution:)
>
> Another question is in case bpf qdisc set lands, could we have these
> kfuncs not being limited to BPF_PROG_TYPE_STRUCT_OPS ?
>
Hi Maciej,
bpf qdisc will use a different mechanism to acquire skb referenced
kptrs (i.e., no skb-acquiring kfunc). For skb-releasing kfunc, I don't
think it can be made available to SCHED_CLS programs directly either.
The problem is the bpf program could kfree_skb() the skb in the ctx
with this kfunc. However, the kernel is still expecting a valid skb
later in cls_bpf_classify().
Another potential problem to consider in order to enable skb kptrs in
SCHED_CLS is the cleanup. In bpf qdisc case, we are still working on
releasing skb kptrs in maps or graphs automatically when .reset is
called so that we don't hold the resources forever.
I am interested in hearing your use case of storing skb kptr in maps.
Do you mind sharing it?
Thanks,
Amery
> I would be thankful for any pointers/stions regarding this issue.
> Maciej
>
> [0]: https://lore.kernel.org/bpf/20240714175130.4051012-7-amery.hung@bytedance.com/
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [External] Storing sk_buffs as kptrs in map
2024-11-26 19:56 ` [External] " Amery Hung
@ 2024-11-26 20:47 ` Martin KaFai Lau
2024-11-27 19:07 ` Maciej Fijalkowski
0 siblings, 1 reply; 10+ messages in thread
From: Martin KaFai Lau @ 2024-11-26 20:47 UTC (permalink / raw)
To: Amery Hung, Maciej Fijalkowski; +Cc: bpf, magnus.karlsson, sreedevi.joshi, ast
On 11/26/24 11:56 AM, Amery Hung wrote:
>> I have a use case where I would like to store sk_buff pointers as kptrs in
>> eBPF map. To do so, I am borrowing skb kfuncs for acquire/release/destroy
>> from Amery Hung's bpf qdisc set [0], but they are registered for
>> BPF_PROG_TYPE_SCHED_CLS programs.
>>
>> TL;DR - due to following callstack:
>>
>> do_check()
>> check_kfunc_call()
>> check_kfunc_args()
>> get_kfunc_ptr_arg_type()
>> btf_is_prog_ctx_type()
>> btf_is_projection_of() -- return true
>>
>> sk_buff argument is being interpreted as KF_ARG_PTR_TO_CTX, but what we
>> have there is KF_ARG_PTR_TO_BTF_ID. Verifier is unhappy about it. Should
I don't think I fully understand "what we have there is KF_ARG_PTR_TO_BTF_ID". I
am trying to guess you meant what we have there in the reg->type is in
(PTR_TO_BTF_ID | PTR_TRUSTED).
It makes sense to have "struct sk_buff __kptr *" instead of "struct __sk_buff
__kptr *". However, the get_kfunc_ptr_arg_type() is expecting KF_ARG_PTR_TO_CTX
because the prog type is BPF_PROG_TYPE_SCHED_CLS.
From a very quick look, under the "case KF_ARG_PTR_TO_CTX:" in
check_kfunc_args(), I think it needs to teach the verifier that the reg->type
with a trusted PTR_TO_BTF_ID ("struct sk_buff *") can be used as the PTR_TO_CTX.
>> this be workarounded via some typedef or adding mentioned kfuncs to
>> special_kfunc_list ? If the latter, then what else needs to be handled?
>>
>> Commenting out sk_buff part from btf_is_projection_of() makes it work, but
>> that probably is not a solution:)
>>
>> Another question is in case bpf qdisc set lands, could we have these
>> kfuncs not being limited to BPF_PROG_TYPE_STRUCT_OPS ?
Similar to Amery's comment. Please share the patch and user case. It will be
easier to discuss.
> In bpf qdisc case, we are still working on
> releasing skb kptrs in maps or graphs automatically when .reset is
> called so that we don't hold the resources forever.
Regarding specifically the bpf qdisc case, the .reset should do the right thing
to release the queued skb. imo, after sleeping on it, if the bpf prog missed
releasing the skb, it is fine to depend on the map destruction to finally
release them. It is the same as other kptrs type stored in the map which will
also be finally released during map_free.
In the future, for the struct_ops case, it can be improved by allowing to define
the sch->privdata. May be allow to define the layout of this privdata, e.g. the
whole privdata is a one element map backed by a btf id. The implementation will
need to be generic enough for any bpf_struct_ops instead of something specific
to the bpf-qdisc. This can be a follow up improvement as a more seamless per sch
instance cleanup after the core bpf-qdisc pieces landed.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [External] Storing sk_buffs as kptrs in map
2024-11-26 20:47 ` Martin KaFai Lau
@ 2024-11-27 19:07 ` Maciej Fijalkowski
2024-11-27 20:54 ` Martin KaFai Lau
0 siblings, 1 reply; 10+ messages in thread
From: Maciej Fijalkowski @ 2024-11-27 19:07 UTC (permalink / raw)
To: Martin KaFai Lau; +Cc: Amery Hung, bpf, magnus.karlsson, sreedevi.joshi, ast
On Tue, Nov 26, 2024 at 12:47:09PM -0800, Martin KaFai Lau wrote:
> On 11/26/24 11:56 AM, Amery Hung wrote:
> > > I have a use case where I would like to store sk_buff pointers as kptrs in
> > > eBPF map. To do so, I am borrowing skb kfuncs for acquire/release/destroy
> > > from Amery Hung's bpf qdisc set [0], but they are registered for
> > > BPF_PROG_TYPE_SCHED_CLS programs.
> > >
> > > TL;DR - due to following callstack:
> > >
> > > do_check()
> > > check_kfunc_call()
> > > check_kfunc_args()
> > > get_kfunc_ptr_arg_type()
> > > btf_is_prog_ctx_type()
> > > btf_is_projection_of() -- return true
> > >
> > > sk_buff argument is being interpreted as KF_ARG_PTR_TO_CTX, but what we
> > > have there is KF_ARG_PTR_TO_BTF_ID. Verifier is unhappy about it. Should
>
> I don't think I fully understand "what we have there is
> KF_ARG_PTR_TO_BTF_ID". I am trying to guess you meant what we have there in
> the reg->type is in (PTR_TO_BTF_ID | PTR_TRUSTED).
Yes, sorry for taking the shortcuts here.
>
> It makes sense to have "struct sk_buff __kptr *" instead of "struct
> __sk_buff __kptr *". However, the get_kfunc_ptr_arg_type() is expecting
> KF_ARG_PTR_TO_CTX because the prog type is BPF_PROG_TYPE_SCHED_CLS.
Yes I have sk_buff as an kfunc's arg, I am using bpf_cast_to_kern_ctx() to
get __sk_buff->sk_buff conversion.
>
> From a very quick look, under the "case KF_ARG_PTR_TO_CTX:" in
> check_kfunc_args(), I think it needs to teach the verifier that the
> reg->type with a trusted PTR_TO_BTF_ID ("struct sk_buff *") can be used as
> the PTR_TO_CTX.
But kfunc does not work on PTR_TO_CTX - it takes in directly sk_buff, not
__sk_buff. As I mention above we use bpf_cast_to_kern_ctx() and per my
current limited understanding it overwrites the reg->type to
PTR_TO_BTF_ID | PTR_TRUSTED.
However, as you said, due to prog type used and sk_buff as an arg,
get_kfunc_ptr_arg_type() interprets the kfunc's arg as KF_ARG_PTR_TO_CTX.
>
> > > this be workarounded via some typedef or adding mentioned kfuncs to
> > > special_kfunc_list ? If the latter, then what else needs to be handled?
> > >
> > > Commenting out sk_buff part from btf_is_projection_of() makes it work, but
> > > that probably is not a solution:)
> > >
> > > Another question is in case bpf qdisc set lands, could we have these
> > > kfuncs not being limited to BPF_PROG_TYPE_STRUCT_OPS ?
>
> Similar to Amery's comment. Please share the patch and user case. It will be
> easier to discuss.
I tried to simplify the use case that customer has, but I am a bit worried
that it might only confuse people more :/ however, here it is:
On TC egress hook skb is stored in a map - reason for picking it over the
linked list or rbtree is that we want to be able to access skbs via some index,
say a hash. This is where we bump the skb's refcount via acquire kfunc.
During TC ingress hook on the same interface, the skb that was previously
stored in map is retrieved, current skb that resides in the context of
hook carries the timestamp via metadata. We then use the retrieved skb and
tstamp from metadata on skb_tstamp_tx() (another kfunc) and finally
decrement skb's refcount via release kfunc.
Anyways, since we are able to do similar operations on task_struct
(holding it in map via kptr), I don't see a reason why wouldn't we allow
ourselves to do it on sk_buffs, no?
>
> > In bpf qdisc case, we are still working on
> > releasing skb kptrs in maps or graphs automatically when .reset is
> > called so that we don't hold the resources forever.
>
> Regarding specifically the bpf qdisc case, the .reset should do the right
> thing to release the queued skb. imo, after sleeping on it, if the bpf prog
> missed releasing the skb, it is fine to depend on the map destruction to
> finally release them. It is the same as other kptrs type stored in the map
> which will also be finally released during map_free.
>
> In the future, for the struct_ops case, it can be improved by allowing to
> define the sch->privdata. May be allow to define the layout of this
> privdata, e.g. the whole privdata is a one element map backed by a btf id.
> The implementation will need to be generic enough for any bpf_struct_ops
> instead of something specific to the bpf-qdisc. This can be a follow up
> improvement as a more seamless per sch instance cleanup after the core
> bpf-qdisc pieces landed.
>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [External] Storing sk_buffs as kptrs in map
2024-11-27 19:07 ` Maciej Fijalkowski
@ 2024-11-27 20:54 ` Martin KaFai Lau
2024-12-03 20:46 ` Maciej Fijalkowski
0 siblings, 1 reply; 10+ messages in thread
From: Martin KaFai Lau @ 2024-11-27 20:54 UTC (permalink / raw)
To: Maciej Fijalkowski; +Cc: Amery Hung, bpf, magnus.karlsson, sreedevi.joshi, ast
On 11/27/24 11:07 AM, Maciej Fijalkowski wrote:
> But kfunc does not work on PTR_TO_CTX - it takes in directly sk_buff, not
> __sk_buff. As I mention above we use bpf_cast_to_kern_ctx() and per my
> current limited understanding it overwrites the reg->type to
> PTR_TO_BTF_ID | PTR_TRUSTED.
Can you try skip calling the bpf_cast_to_kern_ctx and directly pass the "struct
__sk_buff *skb" to the "struct sk_buff *bpf_skb_acquire(struct __sk_buff *skb).
> I tried to simplify the use case that customer has, but I am a bit worried
> that it might only confuse people more :/ however, here it is:
No. not at all. I suspect the use case has some similarity to the net-timestamp
patches
(https://lore.kernel.org/bpf/20241028110535.82999-1-kerneljasonxing@gmail.com/)
which uses a skb tskey to associate/co-relate different timestamp.
Please share the patch and the test case. It will be easier for others to help.
> On TC egress hook skb is stored in a map - reason for picking it over the
> linked list or rbtree is that we want to be able to access skbs via some index,
> say a hash. This is where we bump the skb's refcount via acquire kfunc.
>
> During TC ingress hook on the same interface, the skb that was previously
> stored in map is retrieved, current skb that resides in the context of
> hook carries the timestamp via metadata. We then use the retrieved skb and
> tstamp from metadata on skb_tstamp_tx() (another kfunc) and finally
> decrement skb's refcount via release kfunc.
>
>
> Anyways, since we are able to do similar operations on task_struct
> (holding it in map via kptr), I don't see a reason why wouldn't we allow
> ourselves to do it on sk_buffs, no?
skb holds other things like dev and dst, like someone may be trying to remove
the netdevice and route...etc. Overall, yes, the skb refcnt will eventually be
decremented when the map is freed like other kptr (e.g. task) do.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [External] Storing sk_buffs as kptrs in map
2024-11-27 20:54 ` Martin KaFai Lau
@ 2024-12-03 20:46 ` Maciej Fijalkowski
2024-12-04 23:24 ` Martin KaFai Lau
0 siblings, 1 reply; 10+ messages in thread
From: Maciej Fijalkowski @ 2024-12-03 20:46 UTC (permalink / raw)
To: Martin KaFai Lau; +Cc: Amery Hung, bpf, magnus.karlsson, sreedevi.joshi, ast
On Wed, Nov 27, 2024 at 12:54:41PM -0800, Martin KaFai Lau wrote:
> On 11/27/24 11:07 AM, Maciej Fijalkowski wrote:
> > But kfunc does not work on PTR_TO_CTX - it takes in directly sk_buff, not
> > __sk_buff. As I mention above we use bpf_cast_to_kern_ctx() and per my
> > current limited understanding it overwrites the reg->type to
> > PTR_TO_BTF_ID | PTR_TRUSTED.
>
> Can you try skip calling the bpf_cast_to_kern_ctx and directly pass the
> "struct __sk_buff *skb" to the "struct sk_buff *bpf_skb_acquire(struct
> __sk_buff *skb).
That didn't help, log below:
libbpf: prog 'bpf_tc_ingress': -- BEGIN PROG LOAD LOG --
0: R1=ctx() R10=fp0
; map_entry = bpf_map_lookup_elem(&skb_map, &get_key); @ bpf_bpf.c:145
0: (18) r6 = 0xffffc9000d186004 ; R6_w=map_value(map=bpf_bpf.bss,ks=4,vs=8,off=4)
2: (18) r1 = 0xffff88984ab8ce00 ; R1_w=map_ptr(map=skb_map,ks=4,vs=8)
4: (18) r2 = 0xffffc9000d186004 ; R2_w=map_value(map=bpf_bpf.bss,ks=4,vs=8,off=4)
6: (85) call bpf_map_lookup_elem#1 ; R0_w=map_value_or_null(id=1,map=skb_map,ks=4,vs=8)
; if (!map_entry) { @ bpf_bpf.c:146
7: (55) if r0 != 0x0 goto pc+7 15: R0_w=map_value(map=skb_map,ks=4,vs=8) R6_w=map_value(map=bpf_bpf.bss,ks=4,vs=8,off=4) R10=fp0
; tmp = bpf_kptr_xchg(&map_entry->skb, tmp); @ bpf_bpf.c:151
15: (bf) r1 = r0 ; R0_w=map_value(map=skb_map,ks=4,vs=8) R1_w=map_value(map=skb_map,ks=4,vs=8)
16: (b7) r2 = 0 ; R2_w=0
17: (85) call bpf_kptr_xchg#194 ; R0_w=ptr_or_null_sk_buff(id=3,ref_obj_id=3) refs=3
18: (bf) r6 = r0 ; R0_w=ptr_or_null_sk_buff(id=3,ref_obj_id=3) R6_w=ptr_or_null_sk_buff(id=3,ref_obj_id=3) refs=3
19: (b4) w7 = -1 ; R7=0xffffffff refs=3
; if (!tmp) @ bpf_bpf.c:152
20: (15) if r6 == 0x0 goto pc+17 ; R6=ptr_sk_buff(ref_obj_id=3) refs=3
; bpf_printk("retrieved skb %p from index %d\n", tmp, get_key); @ bpf_bpf.c:155
21: (18) r8 = 0xffffc9000d186004 ; R8_w=map_value(map=bpf_bpf.bss,ks=4,vs=8,off=4) refs=3
23: (61) r4 = *(u32 *)(r8 +0) ; R4_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) R8_w=map_value(map=bpf_bpf.bss,ks=4,vs=8,off=4) refs=3
24: (18) r1 = 0xffff88984ab8cb20 ; R1_w=map_value(map=bpf_bpf.rodata,ks=4,vs=83,off=24) refs=3
26: (b4) w2 = 32 ; R2_w=32 refs=3
27: (bf) r3 = r6 ; R3_w=ptr_sk_buff(ref_obj_id=3) R6=ptr_sk_buff(ref_obj_id=3) refs=3
28: (85) call bpf_trace_printk#6 ; R0_w=scalar() refs=3
; get_key++; @ bpf_bpf.c:157
29: (61) r1 = *(u32 *)(r8 +0) ; R1_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) R8_w=map_value(map=bpf_bpf.bss,ks=4,vs=8,off=4) refs=3
30: (04) w1 += 1 ; R1_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) refs=3
31: (b4) w7 = 0 ; R7_w=0 refs=3
32: (b4) w2 = 0 ; R2=0 refs=3
; if (get_key >= 16) @ bpf_bpf.c:158
33: (26) if w1 > 0xf goto pc+1 ; R1=scalar(smin=smin32=0,smax=umax=smax32=umax32=15,var_off=(0x0; 0xf)) refs=3
34: (bc) w2 = w1 ; R1=scalar(id=4,smin=smin32=0,smax=umax=smax32=umax32=15,var_off=(0x0; 0xf)) R2_w=scalar(id=4,smin=smin32=0,smax=umax=smax32=umax32=15,var_off=(0x0; 0xf)) refs=3
35: (63) *(u32 *)(r8 +0) = r2 ; R2_w=scalar(id=4,smin=smin32=0,smax=umax=smax32=umax32=15,var_off=(0x0; 0xf)) R8=map_value(map=bpf_bpf.bss,ks=4,vs=8,off=4) refs=3
; bpf_skb_release((struct __sk_buff *)tmp); @ bpf_bpf.c:161
36: (bf) r1 = r6 ; R1_w=ptr_sk_buff(ref_obj_id=3) R6=ptr_sk_buff(ref_obj_id=3) refs=3
37: (85) call bpf_skb_release#102037
arg#0 expected pointer to ctx, but got ptr_
processed 34 insns (limit 1000000) max_states_per_insn 0 total_states 3 peak_states 3 mark_read 1
-- END PROG LOAD LOG --
Still the same problem. Also even it would work it was not very convenient
to cast these types back and forth...I then tried to store __sk_buff as
kptr but I ended up with:
"map 'skb_map' has to have BTF in order to use bpf_kptr_xchg"
which got me lost:) I have a solution though which I'd like to discuss.
>
> Please share the patch and the test case. It will be easier for others to help.
I have come up with rather simple way of achieving what I desired when
starting this thread, how about something like this:
From 0df7760330cccfe71235b56018d0a33d4a3b9863 Mon Sep 17 00:00:00 2001
From: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Date: Tue, 3 Dec 2024 21:00:44 +0100
Subject: [PATCH RFC bpf-next] bpf: add __ctx_ref kfunc argument suffix
The use case is when user wants to use sk_buff pointers as kptrs against
program types that take in __sk_buff struct as context.
A pair of kfuncs for acquiring and releasing skb would look as follows:
__bpf_kfunc struct sk_buff *bpf_skb_acquire(struct sk_buff *skb__ctx_ref)
__bpf_kfunc void bpf_skb_release(struct sk_buff *skb__ctx_ref)
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
---
| 8 ++++++++
1 file changed, 8 insertions(+)
--git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 60938b136365..b16a39d28f8a 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -11303,6 +11303,11 @@ static bool is_kfunc_arg_const_str(const struct btf *btf, const struct btf_param
return btf_param_match_suffix(btf, arg, "__str");
}
+static bool is_kfunc_arg_ctx_ref(const struct btf *btf, const struct btf_param *arg)
+{
+ return btf_param_match_suffix(btf, arg, "__ctx_ref");
+}
+
static bool is_kfunc_arg_scalar_with_name(const struct btf *btf,
const struct btf_param *arg,
const char *name)
@@ -11599,6 +11604,9 @@ get_kfunc_ptr_arg_type(struct bpf_verifier_env *env,
if (meta->func_id == special_kfunc_list[KF_bpf_cast_to_kern_ctx])
return KF_ARG_PTR_TO_CTX;
+ if (is_kfunc_arg_ctx_ref(meta->btf, &args[argno]))
+ return KF_ARG_PTR_TO_BTF_ID;
+
/* In this function, we verify the kfunc's BTF as per the argument type,
* leaving the rest of the verification with respect to the register
* type to our caller. When a set of conditions hold in the BTF type of
--
2.34.1
Then below example progs store the skb on TC tx and retrieve it later on
when TC rx hook is called. This is a ping-pong of packets between veth
pair interfaces. We start with Tx on interface with progs attached and
then receive the packets back. I have stripped the other logic so that we
focus only on kptr storage:
-------------------------------8<-------------------------------
struct sk_buff *bpf_skb_acquire(struct sk_buff *skb) __ksym;
void bpf_skb_release(struct sk_buff *skb) __ksym;
void *bpf_cast_to_kern_ctx(void *) __ksym;
u32 key = 0;
u32 get_key = 0;
struct map_value {
struct sk_buff __kptr * skb;
};
struct {
__uint(type, BPF_MAP_TYPE_ARRAY);
__type(key, int);
__type(value, struct map_value);
__uint(max_entries, 16);
} skb_map SEC(".maps");
SEC("tc") int bpf_tc_ingress(struct __sk_buff *skb)
{
struct map_value *map_entry;
struct sk_buff *tmp = NULL;
map_entry = bpf_map_lookup_elem(&skb_map, &get_key);
if (!map_entry) {
bpf_printk("no entry at get key %d\n", get_key);
return TC_ACT_SHOT;
}
tmp = bpf_kptr_xchg(&map_entry->skb, tmp);
if (!tmp)
return TC_ACT_UNSPEC;
bpf_printk("retrieved skb %p from index %d\n", tmp, get_key);
get_key++;
if (get_key >= 16)
get_key = 0;
bpf_skb_release(tmp);
return TC_ACT_OK;
}
SEC("tc") int bpf_tc_egress(struct __sk_buff *ctx)
{
struct sk_buff *skbk = bpf_cast_to_kern_ctx(ctx);
struct map_value *map_entry;
struct map_value tmp_entry;
struct sk_buff *tmp;
tmp_entry.skb = NULL;
bpf_map_update_elem(&skb_map, &key, &tmp_entry, BPF_ANY);
map_entry = bpf_map_lookup_elem(&skb_map, &key);
if (!map_entry)
return TC_ACT_SHOT;
skbk = bpf_skb_acquire(skbk);
if (!skbk)
return TC_ACT_SHOT;
tmp = bpf_kptr_xchg(&map_entry->skb, skbk);
if (tmp)
bpf_skb_release(tmp);
bpf_printk("stored skb %p at index %d\n", skbk, key);
key++;
if (key >= 16)
key = 0;
return TC_ACT_OK;
}
char LICENSE[] SEC("license") = "GPL";
------------------------------->8-------------------------------
>
> > I tried to simplify the use case that customer has, but I am a bit worried
> > that it might only confuse people more :/ however, here it is:
>
> No. not at all. I suspect the use case has some similarity to the
> net-timestamp patches (https://lore.kernel.org/bpf/20241028110535.82999-1-kerneljasonxing@gmail.com/)
> which uses a skb tskey to associate/co-relate different timestamp.
I was aware of this work and I will give this a deep look, thanks.
>
> > On TC egress hook skb is stored in a map - reason for picking it over the
> > linked list or rbtree is that we want to be able to access skbs via some index,
> > say a hash. This is where we bump the skb's refcount via acquire kfunc.
> >
> > During TC ingress hook on the same interface, the skb that was previously
> > stored in map is retrieved, current skb that resides in the context of
> > hook carries the timestamp via metadata. We then use the retrieved skb and
> > tstamp from metadata on skb_tstamp_tx() (another kfunc) and finally
> > decrement skb's refcount via release kfunc.
> >
> >
> > Anyways, since we are able to do similar operations on task_struct
> > (holding it in map via kptr), I don't see a reason why wouldn't we allow
> > ourselves to do it on sk_buffs, no?
>
> skb holds other things like dev and dst, like someone may be trying to
> remove the netdevice and route...etc. Overall, yes, the skb refcnt will
> eventually be decremented when the map is freed like other kptr (e.g. task)
> do.
>
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [External] Storing sk_buffs as kptrs in map
2024-12-03 20:46 ` Maciej Fijalkowski
@ 2024-12-04 23:24 ` Martin KaFai Lau
2024-12-06 16:24 ` Maciej Fijalkowski
0 siblings, 1 reply; 10+ messages in thread
From: Martin KaFai Lau @ 2024-12-04 23:24 UTC (permalink / raw)
To: Maciej Fijalkowski; +Cc: Amery Hung, bpf, magnus.karlsson, sreedevi.joshi, ast
On 12/3/24 12:46 PM, Maciej Fijalkowski wrote:
> ; bpf_skb_release((struct __sk_buff *)tmp); @ bpf_bpf.c:161
> 36: (bf) r1 = r6 ; R1_w=ptr_sk_buff(ref_obj_id=3) R6=ptr_sk_buff(ref_obj_id=3) refs=3
> 37: (85) call bpf_skb_release#102037
> arg#0 expected pointer to ctx, but got ptr_
ic. The bpf_skb_release() is hitting the KF_ARG_PTR_TO_CTX again.
> processed 34 insns (limit 1000000) max_states_per_insn 0 total_states 3 peak_states 3 mark_read 1
> -- END PROG LOAD LOG --
>
>
> Still the same problem. Also even it would work it was not very convenient
> to cast these types back and forth...I then tried to store __sk_buff as
> kptr but I ended up with:
>
> "map 'skb_map' has to have BTF in order to use bpf_kptr_xchg"
>
> which got me lost:) I have a solution though which I'd like to discuss.
>
>>
>> Please share the patch and the test case. It will be easier for others to help.
>
> I have come up with rather simple way of achieving what I desired when
> starting this thread, how about something like this:
>
> From 0df7760330cccfe71235b56018d0a33d4a3b9863 Mon Sep 17 00:00:00 2001
> From: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> Date: Tue, 3 Dec 2024 21:00:44 +0100
> Subject: [PATCH RFC bpf-next] bpf: add __ctx_ref kfunc argument suffix
>
> The use case is when user wants to use sk_buff pointers as kptrs against
> program types that take in __sk_buff struct as context.
>
> A pair of kfuncs for acquiring and releasing skb would look as follows:
>
> __bpf_kfunc struct sk_buff *bpf_skb_acquire(struct sk_buff *skb__ctx_ref)
> __bpf_kfunc void bpf_skb_release(struct sk_buff *skb__ctx_ref)
>
> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> ---
> kernel/bpf/verifier.c | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 60938b136365..b16a39d28f8a 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -11303,6 +11303,11 @@ static bool is_kfunc_arg_const_str(const struct btf *btf, const struct btf_param
> return btf_param_match_suffix(btf, arg, "__str");
> }
>
> +static bool is_kfunc_arg_ctx_ref(const struct btf *btf, const struct btf_param *arg)
> +{
> + return btf_param_match_suffix(btf, arg, "__ctx_ref");
imo, new tagging is not needed. It does not give new information to
the ptr type. I still think the verifier can be taught to handle
it better.
I took a closer look. I think the issue is btf_is_prog_ctx_type() selectively
treating some kfunc arg type as the uapi prog ctx instead of honoring what
has been written in the kernel source code.
The projection_of test was first added in
commit 2f4643934670 ("bpf: Support "sk_buff" and "xdp_buff" as valid kfunc arg types").
It was originally added to allow the kfunc to accept reg->type == PTR_TO_CTX
as its "struct sk_buff/xdp_buff *" argument.
After commit cce4c40b9606 ("bpf: treewide: Align kfunc signatures to prog point-of-view"),
the need of projection_of in btf_is_prog_ctx_type() should be gone.
However, this projection_of check has an unintentional side effect
for the other btf_is_prog_ctx_type() callers. It treats subprog
(in the bpf prog code) taking the "struct sk_buff *skb" arg as the uapi
prog ctx also. I don't think the bpf prog should do this though. I have a
"call_dynptr_skb" subprog to show this.
For the skb acquire/release kfunc, I think it is better to begin with
the "struct sk_buff *" as its arg type and return type
instead of "struct __sk_buff *" because it is the __kptr type stored
in the map. The kptr_xchg will also return the PTR_TO_BTF_ID.
It will need another cast call for acquire, like
"bpf_skb_acauire(bpf_cast_to_kern_ctx(ctx))" but this should be fine.
The "struct sk_buff __kptr *skb" stored in the map cannot
be passed to the bpf_dynptr_from_skb() also. It shouldn't be
allowed because bpf_dynptr_from_skb will allow skb write
in the tc bpf prog. The same goes for other tc bpf helpers which
takes ARG_PTR_TO_CTX.
I think we can remove the projection_of call from the
bpf_is_prog_ctx_type() such that it honors the exact argument
type written in the kernel source code. Add this particular projection_of
check (renamed to bpf_is_kern_ctx in the diff) to the other callers for
backward compat such that the caller can selectively translate
the argument of a subprog to the corresponding prog ctx type.
Lightly tested only:
diff --git i/kernel/bpf/btf.c w/kernel/bpf/btf.c
index e7a59e6462a9..2d39f91617fb 100644
--- i/kernel/bpf/btf.c
+++ w/kernel/bpf/btf.c
@@ -5914,6 +5914,26 @@ bool btf_is_projection_of(const char *pname, const char *tname)
return false;
}
+static bool btf_is_kern_ctx(const struct btf *btf,
+ const struct btf_type *t,
+ enum bpf_prog_type prog_type)
+{
+ const struct btf_type *ctx_type;
+ const char *tname, *ctx_tname;
+
+ t = btf_type_skip_modifiers(btf, t->type, NULL);
+ if (!btf_type_is_struct(t))
+ return false;
+
+ tname = btf_name_by_offset(btf, t->name_off);
+ if (!tname)
+ return false;
+
+ ctx_type = find_canonical_prog_ctx_type(prog_type);
+ ctx_tname = btf_name_by_offset(btf_vmlinux, ctx_type->name_off);
+ return btf_is_projection_of(ctx_tname, tname);
+}
+
bool btf_is_prog_ctx_type(struct bpf_verifier_log *log, const struct btf *btf,
const struct btf_type *t, enum bpf_prog_type prog_type,
int arg)
@@ -5976,8 +5996,6 @@ bool btf_is_prog_ctx_type(struct bpf_verifier_log *log, const struct btf *btf,
* int socket_filter_bpf_prog(struct __sk_buff *skb)
* { // no fields of skb are ever used }
*/
- if (btf_is_projection_of(ctx_tname, tname))
- return true;
if (strcmp(ctx_tname, tname)) {
/* bpf_user_pt_regs_t is a typedef, so resolve it to
* underlying struct and check name again
@@ -6140,7 +6158,8 @@ static int btf_translate_to_vmlinux(struct bpf_verifier_log *log,
enum bpf_prog_type prog_type,
int arg)
{
- if (!btf_is_prog_ctx_type(log, btf, t, prog_type, arg))
+ if (!btf_is_prog_ctx_type(log, btf, t, prog_type, arg) &&
+ !btf_is_kern_ctx(btf, t, prog_type))
return -ENOENT;
return find_kern_ctx_type_id(prog_type);
}
@@ -7505,7 +7524,8 @@ int btf_prepare_func_args(struct bpf_verifier_env *env, int subprog)
if (!btf_type_is_ptr(t))
goto skip_pointer;
- if ((tags & ARG_TAG_CTX) || btf_is_prog_ctx_type(log, btf, t, prog_type, i)) {
+ if ((tags & ARG_TAG_CTX) || btf_is_prog_ctx_type(log, btf, t, prog_type, i) ||
+ btf_is_kern_ctx(btf, t, prog_type)) {
if (tags & ~ARG_TAG_CTX) {
bpf_log(log, "arg#%d has invalid combination of tags\n", i);
return -EINVAL;
diff --git a/tools/testing/selftests/bpf/progs/skb_acquire.c b/tools/testing/selftests/bpf/progs/skb_acquire.c
new file mode 100644
index 000000000000..65d62fd97905
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/skb_acquire.c
@@ -0,0 +1,59 @@
+#include <vmlinux.h>
+#include <bpf/bpf_helpers.h>
+#include "bpf_misc.h"
+#include "bpf_kfuncs.h"
+#include "bpf_tracing_net.h"
+
+struct sk_buff *dummy_skb;
+
+struct sk_buff *bpf_skb_acquire(struct sk_buff *skb) __ksym;
+void bpf_skb_release(struct sk_buff *skb) __ksym;
+void *bpf_cast_to_kern_ctx(void *) __ksym;
+
+struct map_value {
+ int a;
+ struct sk_buff __kptr *skb;
+};
+
+struct {
+ __uint(type, BPF_MAP_TYPE_ARRAY);
+ __type(key, int);
+ __type(value, struct map_value);
+ __uint(max_entries, 16);
+} skb_map SEC(".maps");
+
+__noinline int call_dynptr_skb(struct sk_buff *skb)
+{
+ struct bpf_dynptr ptr;
+
+ bpf_dynptr_from_skb((struct __sk_buff *)skb, 0, &ptr);
+
+ return 0;
+}
+
+__success
+SEC("tc")
+int bpf_tc_egress(struct __sk_buff *ctx)
+{
+ struct sk_buff *skb;
+ struct map_value *map_entry;
+ u32 zero = 0;
+
+ call_dynptr_skb((struct sk_buff *)ctx);
+
+ map_entry = bpf_map_lookup_elem(&skb_map, &zero);
+ if (!map_entry)
+ return TC_ACT_SHOT;
+
+ skb = bpf_skb_acquire(bpf_cast_to_kern_ctx(ctx));
+ if (!skb)
+ return TC_ACT_SHOT;
+
+ skb = bpf_kptr_xchg(&map_entry->skb, skb);
+ if (skb)
+ bpf_skb_release(skb);
+
+ return TC_ACT_OK;
+}
+
+char LICENSE[] SEC("license") = "GPL";
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [External] Storing sk_buffs as kptrs in map
2024-12-04 23:24 ` Martin KaFai Lau
@ 2024-12-06 16:24 ` Maciej Fijalkowski
2024-12-07 0:36 ` Martin KaFai Lau
0 siblings, 1 reply; 10+ messages in thread
From: Maciej Fijalkowski @ 2024-12-06 16:24 UTC (permalink / raw)
To: Martin KaFai Lau; +Cc: Amery Hung, bpf, magnus.karlsson, sreedevi.joshi, ast
On Wed, Dec 04, 2024 at 03:24:33PM -0800, Martin KaFai Lau wrote:
> On 12/3/24 12:46 PM, Maciej Fijalkowski wrote:
> > ; bpf_skb_release((struct __sk_buff *)tmp); @ bpf_bpf.c:161
> > 36: (bf) r1 = r6 ; R1_w=ptr_sk_buff(ref_obj_id=3) R6=ptr_sk_buff(ref_obj_id=3) refs=3
> > 37: (85) call bpf_skb_release#102037
> > arg#0 expected pointer to ctx, but got ptr_
>
> ic. The bpf_skb_release() is hitting the KF_ARG_PTR_TO_CTX again.
>
> > processed 34 insns (limit 1000000) max_states_per_insn 0 total_states 3 peak_states 3 mark_read 1
> > -- END PROG LOAD LOG --
> >
> >
> > Still the same problem. Also even it would work it was not very convenient
> > to cast these types back and forth...I then tried to store __sk_buff as
> > kptr but I ended up with:
> >
> > "map 'skb_map' has to have BTF in order to use bpf_kptr_xchg"
> >
> > which got me lost:) I have a solution though which I'd like to discuss.
> >
> > >
> > > Please share the patch and the test case. It will be easier for others to help.
> >
> > I have come up with rather simple way of achieving what I desired when
> > starting this thread, how about something like this:
> >
> > From 0df7760330cccfe71235b56018d0a33d4a3b9863 Mon Sep 17 00:00:00 2001
> > From: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> > Date: Tue, 3 Dec 2024 21:00:44 +0100
> > Subject: [PATCH RFC bpf-next] bpf: add __ctx_ref kfunc argument suffix
> >
> > The use case is when user wants to use sk_buff pointers as kptrs against
> > program types that take in __sk_buff struct as context.
> >
> > A pair of kfuncs for acquiring and releasing skb would look as follows:
> >
> > __bpf_kfunc struct sk_buff *bpf_skb_acquire(struct sk_buff *skb__ctx_ref)
> > __bpf_kfunc void bpf_skb_release(struct sk_buff *skb__ctx_ref)
> >
> > Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> > ---
> > kernel/bpf/verifier.c | 8 ++++++++
> > 1 file changed, 8 insertions(+)
> >
> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index 60938b136365..b16a39d28f8a 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> > @@ -11303,6 +11303,11 @@ static bool is_kfunc_arg_const_str(const struct btf *btf, const struct btf_param
> > return btf_param_match_suffix(btf, arg, "__str");
> > }
> > +static bool is_kfunc_arg_ctx_ref(const struct btf *btf, const struct btf_param *arg)
> > +{
> > + return btf_param_match_suffix(btf, arg, "__ctx_ref");
>
> imo, new tagging is not needed. It does not give new information to
> the ptr type. I still think the verifier can be taught to handle
> it better.
>
> I took a closer look. I think the issue is btf_is_prog_ctx_type() selectively
> treating some kfunc arg type as the uapi prog ctx instead of honoring what
> has been written in the kernel source code.
>
> The projection_of test was first added in
> commit 2f4643934670 ("bpf: Support "sk_buff" and "xdp_buff" as valid kfunc arg types").
> It was originally added to allow the kfunc to accept reg->type == PTR_TO_CTX
> as its "struct sk_buff/xdp_buff *" argument.
>
> After commit cce4c40b9606 ("bpf: treewide: Align kfunc signatures to prog point-of-view"),
> the need of projection_of in btf_is_prog_ctx_type() should be gone.
> However, this projection_of check has an unintentional side effect
> for the other btf_is_prog_ctx_type() callers. It treats subprog
> (in the bpf prog code) taking the "struct sk_buff *skb" arg as the uapi
> prog ctx also. I don't think the bpf prog should do this though. I have a
> "call_dynptr_skb" subprog to show this.
Nice that's what I wanted to say from the very beginning but didn't have a
good justification in terms of getting rid of projection_of() call :)
>
> For the skb acquire/release kfunc, I think it is better to begin with
> the "struct sk_buff *" as its arg type and return type
> instead of "struct __sk_buff *" because it is the __kptr type stored
> in the map. The kptr_xchg will also return the PTR_TO_BTF_ID.
Yep that's also what I stumbled upon, that __sk_buff as kptr will not
work.
> It will need another cast call for acquire, like
> "bpf_skb_acauire(bpf_cast_to_kern_ctx(ctx))" but this should be fine.
> The "struct sk_buff __kptr *skb" stored in the map cannot
> be passed to the bpf_dynptr_from_skb() also. It shouldn't be
> allowed because bpf_dynptr_from_skb will allow skb write
> in the tc bpf prog. The same goes for other tc bpf helpers which
> takes ARG_PTR_TO_CTX.
>
> I think we can remove the projection_of call from the
> bpf_is_prog_ctx_type() such that it honors the exact argument
> type written in the kernel source code. Add this particular projection_of
> check (renamed to bpf_is_kern_ctx in the diff) to the other callers for
> backward compat such that the caller can selectively translate
> the argument of a subprog to the corresponding prog ctx type.
>
> Lightly tested only:
I tried the kernel diff on my side and it addressed my needs. Will you
send a patch?
>
> diff --git i/kernel/bpf/btf.c w/kernel/bpf/btf.c
> index e7a59e6462a9..2d39f91617fb 100644
> --- i/kernel/bpf/btf.c
> +++ w/kernel/bpf/btf.c
> @@ -5914,6 +5914,26 @@ bool btf_is_projection_of(const char *pname, const char *tname)
> return false;
> }
> +static bool btf_is_kern_ctx(const struct btf *btf,
> + const struct btf_type *t,
> + enum bpf_prog_type prog_type)
> +{
> + const struct btf_type *ctx_type;
> + const char *tname, *ctx_tname;
> +
> + t = btf_type_skip_modifiers(btf, t->type, NULL);
> + if (!btf_type_is_struct(t))
> + return false;
> +
> + tname = btf_name_by_offset(btf, t->name_off);
> + if (!tname)
> + return false;
> +
> + ctx_type = find_canonical_prog_ctx_type(prog_type);
> + ctx_tname = btf_name_by_offset(btf_vmlinux, ctx_type->name_off);
> + return btf_is_projection_of(ctx_tname, tname);
We're sort of doubling the work that btf_is_prog_ctx_type() is doing also,
maybe add a flag to btf_is_prog_ctx_type() that will allow us to skip
btf_is_projection_of() call when needed? e.g. in get_kfunc_ptr_arg_type().
Thanks for helping here and I look forward for patch submission:)
> +}
> +
> bool btf_is_prog_ctx_type(struct bpf_verifier_log *log, const struct btf *btf,
> const struct btf_type *t, enum bpf_prog_type prog_type,
> int arg)
> @@ -5976,8 +5996,6 @@ bool btf_is_prog_ctx_type(struct bpf_verifier_log *log, const struct btf *btf,
> * int socket_filter_bpf_prog(struct __sk_buff *skb)
> * { // no fields of skb are ever used }
> */
> - if (btf_is_projection_of(ctx_tname, tname))
> - return true;
> if (strcmp(ctx_tname, tname)) {
> /* bpf_user_pt_regs_t is a typedef, so resolve it to
> * underlying struct and check name again
> @@ -6140,7 +6158,8 @@ static int btf_translate_to_vmlinux(struct bpf_verifier_log *log,
> enum bpf_prog_type prog_type,
> int arg)
> {
> - if (!btf_is_prog_ctx_type(log, btf, t, prog_type, arg))
> + if (!btf_is_prog_ctx_type(log, btf, t, prog_type, arg) &&
> + !btf_is_kern_ctx(btf, t, prog_type))
> return -ENOENT;
> return find_kern_ctx_type_id(prog_type);
> }
> @@ -7505,7 +7524,8 @@ int btf_prepare_func_args(struct bpf_verifier_env *env, int subprog)
> if (!btf_type_is_ptr(t))
> goto skip_pointer;
> - if ((tags & ARG_TAG_CTX) || btf_is_prog_ctx_type(log, btf, t, prog_type, i)) {
> + if ((tags & ARG_TAG_CTX) || btf_is_prog_ctx_type(log, btf, t, prog_type, i) ||
> + btf_is_kern_ctx(btf, t, prog_type)) {
> if (tags & ~ARG_TAG_CTX) {
> bpf_log(log, "arg#%d has invalid combination of tags\n", i);
> return -EINVAL;
>
> diff --git a/tools/testing/selftests/bpf/progs/skb_acquire.c b/tools/testing/selftests/bpf/progs/skb_acquire.c
> new file mode 100644
> index 000000000000..65d62fd97905
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/progs/skb_acquire.c
> @@ -0,0 +1,59 @@
> +#include <vmlinux.h>
> +#include <bpf/bpf_helpers.h>
> +#include "bpf_misc.h"
> +#include "bpf_kfuncs.h"
> +#include "bpf_tracing_net.h"
> +
> +struct sk_buff *dummy_skb;
> +
> +struct sk_buff *bpf_skb_acquire(struct sk_buff *skb) __ksym;
> +void bpf_skb_release(struct sk_buff *skb) __ksym;
> +void *bpf_cast_to_kern_ctx(void *) __ksym;
> +
> +struct map_value {
> + int a;
> + struct sk_buff __kptr *skb;
> +};
> +
> +struct {
> + __uint(type, BPF_MAP_TYPE_ARRAY);
> + __type(key, int);
> + __type(value, struct map_value);
> + __uint(max_entries, 16);
> +} skb_map SEC(".maps");
> +
> +__noinline int call_dynptr_skb(struct sk_buff *skb)
> +{
> + struct bpf_dynptr ptr;
> +
> + bpf_dynptr_from_skb((struct __sk_buff *)skb, 0, &ptr);
> +
> + return 0;
> +}
> +
> +__success
> +SEC("tc")
> +int bpf_tc_egress(struct __sk_buff *ctx)
> +{
> + struct sk_buff *skb;
> + struct map_value *map_entry;
> + u32 zero = 0;
> +
> + call_dynptr_skb((struct sk_buff *)ctx);
> +
> + map_entry = bpf_map_lookup_elem(&skb_map, &zero);
> + if (!map_entry)
> + return TC_ACT_SHOT;
> +
> + skb = bpf_skb_acquire(bpf_cast_to_kern_ctx(ctx));
> + if (!skb)
> + return TC_ACT_SHOT;
> +
> + skb = bpf_kptr_xchg(&map_entry->skb, skb);
> + if (skb)
> + bpf_skb_release(skb);
> +
> + return TC_ACT_OK;
> +}
> +
> +char LICENSE[] SEC("license") = "GPL";
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [External] Storing sk_buffs as kptrs in map
2024-12-06 16:24 ` Maciej Fijalkowski
@ 2024-12-07 0:36 ` Martin KaFai Lau
2024-12-09 13:17 ` Maciej Fijalkowski
0 siblings, 1 reply; 10+ messages in thread
From: Martin KaFai Lau @ 2024-12-07 0:36 UTC (permalink / raw)
To: Maciej Fijalkowski; +Cc: Amery Hung, bpf, magnus.karlsson, sreedevi.joshi, ast
On 12/6/24 8:24 AM, Maciej Fijalkowski wrote:
>> I think we can remove the projection_of call from the
>> bpf_is_prog_ctx_type() such that it honors the exact argument
>> type written in the kernel source code. Add this particular projection_of
>> check (renamed to bpf_is_kern_ctx in the diff) to the other callers for
>> backward compat such that the caller can selectively translate
>> the argument of a subprog to the corresponding prog ctx type.
>>
>> Lightly tested only:
> I tried the kernel diff on my side and it addressed my needs. Will you
> send a patch?
There is no real kfunc taking the "struct sk_buff *" now. It is better to make
this change together with your skb acquire/release kfunc introduction. You can
include this patch in your future set.
>
>> diff --git i/kernel/bpf/btf.c w/kernel/bpf/btf.c
>> index e7a59e6462a9..2d39f91617fb 100644
>> --- i/kernel/bpf/btf.c
>> +++ w/kernel/bpf/btf.c
>> @@ -5914,6 +5914,26 @@ bool btf_is_projection_of(const char *pname, const char *tname)
>> return false;
>> }
>> +static bool btf_is_kern_ctx(const struct btf *btf,
>> + const struct btf_type *t,
>> + enum bpf_prog_type prog_type)
>> +{
>> + const struct btf_type *ctx_type;
>> + const char *tname, *ctx_tname;
>> +
>> + t = btf_type_skip_modifiers(btf, t->type, NULL);
>> + if (!btf_type_is_struct(t))
>> + return false;
>> +
>> + tname = btf_name_by_offset(btf, t->name_off);
>> + if (!tname)
>> + return false;
>> +
>> + ctx_type = find_canonical_prog_ctx_type(prog_type);
>> + ctx_tname = btf_name_by_offset(btf_vmlinux, ctx_type->name_off);
>> + return btf_is_projection_of(ctx_tname, tname);
> We're sort of doubling the work that btf_is_prog_ctx_type() is doing also,
> maybe add a flag to btf_is_prog_ctx_type() that will allow us to skip
> btf_is_projection_of() call when needed? e.g. in get_kfunc_ptr_arg_type().
It is pretty cheap to do the btf_is_kern_ctx().
I don't have a strong opinion on either way. may be a "bool check_kern_ctx".
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [External] Storing sk_buffs as kptrs in map
2024-12-07 0:36 ` Martin KaFai Lau
@ 2024-12-09 13:17 ` Maciej Fijalkowski
0 siblings, 0 replies; 10+ messages in thread
From: Maciej Fijalkowski @ 2024-12-09 13:17 UTC (permalink / raw)
To: Martin KaFai Lau; +Cc: Amery Hung, bpf, magnus.karlsson, sreedevi.joshi, ast
On Fri, Dec 06, 2024 at 04:36:03PM -0800, Martin KaFai Lau wrote:
> On 12/6/24 8:24 AM, Maciej Fijalkowski wrote:
> > > I think we can remove the projection_of call from the
> > > bpf_is_prog_ctx_type() such that it honors the exact argument
> > > type written in the kernel source code. Add this particular projection_of
> > > check (renamed to bpf_is_kern_ctx in the diff) to the other callers for
> > > backward compat such that the caller can selectively translate
> > > the argument of a subprog to the corresponding prog ctx type.
> > >
> > > Lightly tested only:
> > I tried the kernel diff on my side and it addressed my needs. Will you
> > send a patch?
>
> There is no real kfunc taking the "struct sk_buff *" now. It is better to
> make this change together with your skb acquire/release kfunc introduction.
> You can include this patch in your future set.
Ack, good point. Will do as suggested.
>
> >
> > > diff --git i/kernel/bpf/btf.c w/kernel/bpf/btf.c
> > > index e7a59e6462a9..2d39f91617fb 100644
> > > --- i/kernel/bpf/btf.c
> > > +++ w/kernel/bpf/btf.c
> > > @@ -5914,6 +5914,26 @@ bool btf_is_projection_of(const char *pname, const char *tname)
> > > return false;
> > > }
> > > +static bool btf_is_kern_ctx(const struct btf *btf,
> > > + const struct btf_type *t,
> > > + enum bpf_prog_type prog_type)
> > > +{
> > > + const struct btf_type *ctx_type;
> > > + const char *tname, *ctx_tname;
> > > +
> > > + t = btf_type_skip_modifiers(btf, t->type, NULL);
> > > + if (!btf_type_is_struct(t))
> > > + return false;
> > > +
> > > + tname = btf_name_by_offset(btf, t->name_off);
> > > + if (!tname)
> > > + return false;
> > > +
> > > + ctx_type = find_canonical_prog_ctx_type(prog_type);
> > > + ctx_tname = btf_name_by_offset(btf_vmlinux, ctx_type->name_off);
> > > + return btf_is_projection_of(ctx_tname, tname);
> > We're sort of doubling the work that btf_is_prog_ctx_type() is doing also,
> > maybe add a flag to btf_is_prog_ctx_type() that will allow us to skip
> > btf_is_projection_of() call when needed? e.g. in get_kfunc_ptr_arg_type().
>
> It is pretty cheap to do the btf_is_kern_ctx().
>
> I don't have a strong opinion on either way. may be a "bool check_kern_ctx".
It's just that this flow to retrieve both struct names is duplicated,
below would serve the same purpose...but I don't have a strong opinion
either:) anyways, thanks for discussion!
diff --git a/include/linux/btf.h b/include/linux/btf.h
index 4214e76c9168..3c6acb993d5b 100644
--- a/include/linux/btf.h
+++ b/include/linux/btf.h
@@ -577,7 +577,7 @@ struct btf_struct_meta *btf_find_struct_meta(const struct btf *btf, u32 btf_id);
bool btf_is_projection_of(const char *pname, const char *tname);
bool btf_is_prog_ctx_type(struct bpf_verifier_log *log, const struct btf *btf,
const struct btf_type *t, enum bpf_prog_type prog_type,
- int arg);
+ int arg, bool check_proj);
int get_kern_ctx_btf_id(struct bpf_verifier_log *log, enum bpf_prog_type prog_type);
bool btf_types_are_same(const struct btf *btf1, u32 id1,
const struct btf *btf2, u32 id2);
@@ -653,7 +653,7 @@ static inline struct btf_struct_meta *btf_find_struct_meta(const struct btf *btf
static inline bool
btf_is_prog_ctx_type(struct bpf_verifier_log *log, const struct btf *btf,
const struct btf_type *t, enum bpf_prog_type prog_type,
- int arg)
+ int arg, bool check_proj)
{
return false;
}
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index e7a59e6462a9..dd27eb35b827 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -5916,7 +5916,7 @@ bool btf_is_projection_of(const char *pname, const char *tname)
bool btf_is_prog_ctx_type(struct bpf_verifier_log *log, const struct btf *btf,
const struct btf_type *t, enum bpf_prog_type prog_type,
- int arg)
+ int arg, bool check_proj)
{
const struct btf_type *ctx_type;
const char *tname, *ctx_tname;
@@ -5976,8 +5976,9 @@ bool btf_is_prog_ctx_type(struct bpf_verifier_log *log, const struct btf *btf,
* int socket_filter_bpf_prog(struct __sk_buff *skb)
* { // no fields of skb are ever used }
*/
- if (btf_is_projection_of(ctx_tname, tname))
- return true;
+ if (check_proj)
+ if (btf_is_projection_of(ctx_tname, tname))
+ return true;
if (strcmp(ctx_tname, tname)) {
/* bpf_user_pt_regs_t is a typedef, so resolve it to
* underlying struct and check name again
@@ -6140,7 +6141,7 @@ static int btf_translate_to_vmlinux(struct bpf_verifier_log *log,
enum bpf_prog_type prog_type,
int arg)
{
- if (!btf_is_prog_ctx_type(log, btf, t, prog_type, arg))
+ if (!btf_is_prog_ctx_type(log, btf, t, prog_type, arg, true))
return -ENOENT;
return find_kern_ctx_type_id(prog_type);
}
@@ -7505,7 +7506,7 @@ int btf_prepare_func_args(struct bpf_verifier_env *env, int subprog)
if (!btf_type_is_ptr(t))
goto skip_pointer;
- if ((tags & ARG_TAG_CTX) || btf_is_prog_ctx_type(log, btf, t, prog_type, i)) {
+ if ((tags & ARG_TAG_CTX) || btf_is_prog_ctx_type(log, btf, t, prog_type, i, true)) {
if (tags & ~ARG_TAG_CTX) {
bpf_log(log, "arg#%d has invalid combination of tags\n", i);
return -EINVAL;
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 60938b136365..bb636f53dfc3 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -11604,7 +11604,8 @@ get_kfunc_ptr_arg_type(struct bpf_verifier_env *env,
* type to our caller. When a set of conditions hold in the BTF type of
* arguments, we resolve it to a known kfunc_ptr_arg_type.
*/
- if (btf_is_prog_ctx_type(&env->log, meta->btf, t, resolve_prog_type(env->prog), argno))
+ if (btf_is_prog_ctx_type(&env->log, meta->btf, t, resolve_prog_type(env->prog),
+ argno, false))
return KF_ARG_PTR_TO_CTX;
if (is_kfunc_arg_nullable(meta->btf, &args[argno]) && register_is_null(reg))
--
2.34.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
end of thread, other threads:[~2024-12-09 13:18 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-11-26 17:05 Storing sk_buffs as kptrs in map Maciej Fijalkowski
2024-11-26 19:56 ` [External] " Amery Hung
2024-11-26 20:47 ` Martin KaFai Lau
2024-11-27 19:07 ` Maciej Fijalkowski
2024-11-27 20:54 ` Martin KaFai Lau
2024-12-03 20:46 ` Maciej Fijalkowski
2024-12-04 23:24 ` Martin KaFai Lau
2024-12-06 16:24 ` Maciej Fijalkowski
2024-12-07 0:36 ` Martin KaFai Lau
2024-12-09 13:17 ` Maciej Fijalkowski
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox