public inbox for xdp-newbies@vger.kernel.org
* Different packet handling after bpf_redirect_map with BPF_F_BROADCAST
@ 2024-07-04 10:19 Florian Kauer
  2024-07-04 11:20 ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 8+ messages in thread
From: Florian Kauer @ 2024-07-04 10:19 UTC (permalink / raw)
  To: xdp-newbies; +Cc: Ferenc Fejes

Hi,
we are currently using bpf_redirect_map with BPF_F_BROADCAST to replicate frames for sending traffic over redundant paths.

See for example https://www.rfc-editor.org/rfc/rfc8655.html#section-3.2.2.2 for background
and https://github.com/EricssonResearch/xdpfrer/blob/5f0845cb2e4c4da325f0e77df3020526ad992aff/src/xdpfrer.bpf.c#L393 for the current implementation.

However, we want to modify the frame after the replication. In the simplest case this means changing the VLAN tag to route the traffic over different VLANs. This is currently done by taking a different egress_ifindex into account after the replication, and that works well so far ( https://github.com/EricssonResearch/xdpfrer/blob/5f0845cb2e4c4da325f0e77df3020526ad992aff/src/xdpfrer.bpf.c#L399 ).
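To make the discussion concrete: the rewrite itself is trivial. Here is a minimal userspace sketch of what such a change_vlan()-style helper boils down to (the function name and frame offsets are illustrative; this is not our actual implementation):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative userspace sketch (NOT the xdpfrer code): rewrite the VLAN ID
 * in an already 802.1Q-tagged Ethernet frame, the way a change_vlan()-style
 * helper might. After the two 6-byte MACs, bytes 12-13 hold the 0x8100 TPID
 * and bytes 14-15 the TCI; the VID is the low 12 bits of the TCI. */
static int rewrite_vlan_id(uint8_t *frame, size_t len, uint16_t new_vid)
{
    if (len < 16)
        return -1;                       /* too short to carry a VLAN tag */
    if (frame[12] != 0x81 || frame[13] != 0x00)
        return -1;                       /* not an 802.1Q-tagged frame */
    uint16_t tci = ((uint16_t)frame[14] << 8) | frame[15];
    tci = (tci & 0xF000) | (new_vid & 0x0FFF);  /* keep PCP/DEI, swap VID */
    frame[14] = (uint8_t)(tci >> 8);
    frame[15] = (uint8_t)(tci & 0xFF);
    return 0;
}
```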

BUT there are cases where the egress interface shall be the same for both replicated packets, and the paths of the replicated frames only diverge at a subsequent switch based on their different VLAN tags. So how could the XDP program differentiate between the replicated frames if the egress interface is the same?

So my basic idea would be to add two (or more) DEVMAP entries with the same ifindex into the same map. And then either

1. Add several xdp/devmap progs to the same loaded bpf and reference them in the DEVMAP entry, like

SEC("xdp/devmap")
int replicate_postprocessing_first(struct xdp_md *pkt)
{
    int ret = change_vlan(pkt, 0, true);
    ...
}

SEC("xdp/devmap")
int replicate_postprocessing_second(struct xdp_md *pkt)
{
    int ret = change_vlan(pkt, 1, true);
    ...
}

This, however, would be quite inflexible.

2. Load the same bpf several times without attaching it to an interface and set e.g. a const to a different value after loading. But can I even reference an xdp/devmap prog from a different loaded bpf, especially when it is not attached?

3. Extend the kernel with a way to let the xdp/devmap prog know from which DEVMAP entry its execution originates (like an additional entry in the bpf_devmap_val that is then set in the xdp_md).
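To sketch what option 3 could look like (purely hypothetical; the extra member, its name and its placement are my invention and not part of the current uapi):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* HYPOTHETICAL layout for option 3, modelled on struct bpf_devmap_val from
 * uapi/linux/bpf.h (which today has only ifindex and bpf_prog). The extra
 * "cookie" member is the invented part: userspace would set it per entry,
 * and the kernel would expose it to the xdp/devmap program via xdp_md. */
struct bpf_devmap_val_ext {
    uint32_t ifindex;        /* egress netdevice */
    union {
        int      fd;         /* prog fd when writing the map entry */
        uint32_t id;         /* prog id when reading it back */
    } bpf_prog;
    uint64_t cookie;         /* NEW: per-entry value visible after redirect */
};
```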

Any other ideas?

By the way: We likely need the same for CPUMAP, too (see https://lore.kernel.org/xdp-newbies/c8072891-6d5c-42c3-8b13-e8ca9ab6c43c@linutronix.de/T/#u for why).

Thanks,
Florian


* Re: Different packet handling after bpf_redirect_map with BPF_F_BROADCAST
  2024-07-04 10:19 Different packet handling after bpf_redirect_map with BPF_F_BROADCAST Florian Kauer
@ 2024-07-04 11:20 ` Toke Høiland-Jørgensen
  2024-07-04 12:00   ` Florian Kauer
  0 siblings, 1 reply; 8+ messages in thread
From: Toke Høiland-Jørgensen @ 2024-07-04 11:20 UTC (permalink / raw)
  To: Florian Kauer, xdp-newbies; +Cc: Ferenc Fejes

Florian Kauer <florian.kauer@linutronix.de> writes:

> Hi,
> we are currently using bpf_redirect_map with BPF_F_BROADCAST to replicate frames for sending traffic over redundant paths.
>
> See for example https://www.rfc-editor.org/rfc/rfc8655.html#section-3.2.2.2 for background
> and https://github.com/EricssonResearch/xdpfrer/blob/5f0845cb2e4c4da325f0e77df3020526ad992aff/src/xdpfrer.bpf.c#L393 for the current implementation.
>
> However, we want to modify the frame after the replication. In the simplest case this means changing the VLAN tag to route the traffic over different VLANs. This is currently done by taking a different egress_ifindex into account after the replication, and that works well so far ( https://github.com/EricssonResearch/xdpfrer/blob/5f0845cb2e4c4da325f0e77df3020526ad992aff/src/xdpfrer.bpf.c#L399 ).
>
> BUT there are cases where the egress interface shall be the same for both replicated packets, and the paths of the replicated frames only diverge at a subsequent switch based on their different VLAN tags. So how could the XDP program differentiate between the replicated frames if the egress interface is the same?
>
> So my basic idea would be to add two (or more) DEVMAP entries with the same ifindex into the same map. And then either
>
> 1. Add several xdp/devmap progs to the same loaded bpf and reference them in the DEVMAP entry, like
>
> SEC("xdp/devmap")
> int replicate_postprocessing_first(struct xdp_md *pkt)
> {
>     int ret = change_vlan(pkt, 0, true);
>     ...
> }
>
> SEC("xdp/devmap")
> int replicate_postprocessing_second(struct xdp_md *pkt)
> {
>     int ret = change_vlan(pkt, 1, true);
>     ...
> }
>
> This, however, would be quite inflexible.

Having multiple entries in the devmap corresponds roughly to how
the stack handles VLANs. I.e., when configuring a VLAN, you create a new
netdevice (which you would then put into the devmap). Unfortunately, XDP
doesn't really know how to deal with stacked devices like VLANs, so you
can't actually add a VLAN device into a devmap. But creating an
interface for this would be one way of dealing with a situation like
this, without having to hardcode things into a BPF program.

> 2. Load the same bpf several times without attaching it to an
> interface and set e.g. a const to a different value after loading.

This would work, I think. Something like:

static volatile const __u32 vlan_id = 1;

SEC("xdp/devmap")
int replicate_postprocessing_second(struct xdp_md *pkt)
{
    int ret = change_vlan(pkt, vlan_id, true);
    ...
}

and then the loader would replace the value of vlan_id before loading;
using skeletons this would look something like:

skel = xdp_program_skeleton__open();
skel->rodata->vlan_id = 2;
xdp_program_skeleton__load(skel);

/* attach to devmap */
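(The attach step is just a map update with a struct bpf_devmap_val holding the ifindex and the prog fd. To illustrate why two entries with the same ifindex but different programs are fine, here is a toy in-memory model of a devmap -- illustrative C only, nothing like the kernel implementation:)

```c
#include <assert.h>
#include <stdint.h>

/* Toy in-memory stand-in for a DEVMAP (illustrative only, not kernel code):
 * each slot pairs an egress ifindex with the devmap program attached to it,
 * so two slots can share an ifindex while running different programs. */
struct toy_devmap_val {
    uint32_t ifindex;
    int prog_fd;    /* stands in for bpf_devmap_val.bpf_prog.fd */
};

#define TOY_MAP_ENTRIES 4
static struct toy_devmap_val toy_map[TOY_MAP_ENTRIES];

static int toy_map_update(uint32_t key, struct toy_devmap_val val)
{
    if (key >= TOY_MAP_ENTRIES)
        return -1;
    toy_map[key] = val;
    return 0;
}
```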

> But can I even reference an xdp/devmap prog from a different loaded
> bpf, especially when it is not attached?

Why do you need to reference it from a different BPF program? The
userspace program just attaches it to the right devmap entry?

> 3. Extend the kernel with a way to let the xdp/devmap prog know from
> which DEVMAP entry its execution originates (like an additional entry
> in the bpf_devmap_val that is then set in the xdp_md).

This could be useful in any case, so I would personally be fine with
adding something like this (for both devmap and cpumap) :)

-Toke



* Re: Different packet handling after bpf_redirect_map with BPF_F_BROADCAST
  2024-07-04 11:20 ` Toke Høiland-Jørgensen
@ 2024-07-04 12:00   ` Florian Kauer
  2024-07-04 12:30     ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 8+ messages in thread
From: Florian Kauer @ 2024-07-04 12:00 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen, xdp-newbies; +Cc: Ferenc Fejes

Hi Toke,
thanks a lot for the prompt response!

On 7/4/24 13:20, Toke Høiland-Jørgensen wrote:
> Florian Kauer <florian.kauer@linutronix.de> writes:
> 
>> Hi,
>> we are currently using bpf_redirect_map with BPF_F_BROADCAST to replicate frames for sending traffic over redundant paths.
>>
>> See for example https://www.rfc-editor.org/rfc/rfc8655.html#section-3.2.2.2 for background
>> and https://github.com/EricssonResearch/xdpfrer/blob/5f0845cb2e4c4da325f0e77df3020526ad992aff/src/xdpfrer.bpf.c#L393 for the current implementation.
>>
>> However, we want to modify the frame after the replication. In the simplest case this means changing the VLAN tag to route the traffic over different VLANs. This is currently done by taking a different egress_ifindex into account after the replication, and that works well so far ( https://github.com/EricssonResearch/xdpfrer/blob/5f0845cb2e4c4da325f0e77df3020526ad992aff/src/xdpfrer.bpf.c#L399 ).
>>
>> BUT there are cases where the egress interface shall be the same for both replicated packets, and the paths of the replicated frames only diverge at a subsequent switch based on their different VLAN tags. So how could the XDP program differentiate between the replicated frames if the egress interface is the same?
>>
>> So my basic idea would be to add two (or more) DEVMAP entries with the same ifindex into the same map. And then either
>>
>> 1. Add several xdp/devmap progs to the same loaded bpf and reference them in the DEVMAP entry, like
>>
>> SEC("xdp/devmap")
>> int replicate_postprocessing_first(struct xdp_md *pkt)
>> {
>>     int ret = change_vlan(pkt, 0, true);
>>     ...
>> }
>>
>> SEC("xdp/devmap")
>> int replicate_postprocessing_second(struct xdp_md *pkt)
>> {
>>     int ret = change_vlan(pkt, 1, true);
>>     ...
>> }
>>
>> This, however, would be quite inflexible.
> 
> Having multiple entries in the devmap corresponds roughly to how
> the stack handles VLANs. I.e., when configuring a VLAN, you create a new
> netdevice (which you would then put into the devmap). Unfortunately, XDP
> doesn't really know how to deal with stacked devices like VLANs, so you
> can't actually add a VLAN device into a devmap. But creating an
> interface for this would be one way of dealing with a situation like
> this, without having to hardcode things into a BPF program.

I see. That would be very nice in general, but for our specific application
referring to a different VLAN interface is likely still too inflexible
(e.g. in addition to changing the VLAN tag, we might also want to
add/remove/modify MPLS labels and other headers).

> 
>> 2. Load the same bpf several times without attaching it to an
>> interface and set e.g. a const to a different value after loading.
> 
> This would work, I think. Something like:
> 
> static volatile const __u32 vlan_id = 1;
> 
> SEC("xdp/devmap")
> int replicate_postprocessing_second(struct xdp_md *pkt)
> {
>     int ret = change_vlan(pkt, vlan_id, true);
>     ...
> }
> 
> and then the loader would replace the value of vlan_id before loading;
> using skeletons this would look something like:
> 
> skel = xdp_program_skeleton__open();
> skel->rodata->vlan_id = 2;
> xdp_program_skeleton__load(skel);
> 
> /* attach to devmap */

Yes, that is exactly what I was imagining, thanks!

> 
>> But can I even reference an xdp/devmap prog from a different loaded
>> bpf, especially when it is not attached?
> 
> Why do you need to reference it from a different BPF program? The
> userspace program just attaches it to the right devmap entry?

What I wanted to imply with this is that the lifetime of the different BPF
programs is unclear to me. So AFAIK (but I might be totally wrong) an
XDP program has the lifetime of the process that loaded the program
(i.e. called xdp_program_skeleton__load()) so it is destroyed/unloaded
as soon as the process ends, UNLESS it is explicitly attached to an interface
by which it inherits the lifetime of the interface it was attached to
(i.e. might outlive the loading process).

If I do what you sketched above, where we do not attach the program explicitly
to an interface but only directly to the devmap, does it then inherit the
lifetime of the interface of the BPF program the devmap belongs to or is
it destroyed as soon as the loading process ends?

The latter would invalidate the bpf_prog inside the devmap (very bad)
while the former seems very complex to handle (especially since I would expect
I can attach the same loaded program to different devmaps of different
interfaces). But you probably have successfully tackled this complexity
already :-)

> 
>> 3. Extend the kernel with a way to let the xdp/devmap prog know from
>> which DEVMAP entry its execution originates (like an additional entry
>> in the bpf_devmap_val that is then set in the xdp_md).
> 
> This could be useful in any case, so I would personally be fine with
> adding something like this (for both devmap and cpumap) :)

Would you prefer a simple u32 (or similar) that could then be used as
index for an array or a more complex data structure/void* to fill
with arbitrary data?

The former would require an additional lookup (which is OK, but a little
overhead), but for the latter it is not clear to me where the data would
actually be located in memory...
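For the simple-index variant, the devmap prog side would be just an array lookup, roughly like this userspace sketch (all names and fields here are invented for illustration):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sketch of the "simple index" variant (all names/fields invented): the
 * kernel-provided DEVMAP slot index keys into a plain array holding the
 * per-replica actions, e.g. which VLAN to tag the clone with. */
struct replica_cfg {
    uint16_t vlan_id;
    uint8_t  strip_tag;   /* illustrative extra per-path action */
};

static const struct replica_cfg cfg_by_slot[] = {
    { .vlan_id = 100, .strip_tag = 0 },  /* first replica's path */
    { .vlan_id = 200, .strip_tag = 1 },  /* second replica's path */
};

static const struct replica_cfg *lookup_cfg(uint32_t slot)
{
    if (slot >= sizeof(cfg_by_slot) / sizeof(cfg_by_slot[0]))
        return NULL;
    return &cfg_by_slot[slot];
}
```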

> 
> -Toke
> 

Thanks,
Florian


* Re: Different packet handling after bpf_redirect_map with BPF_F_BROADCAST
  2024-07-04 12:00   ` Florian Kauer
@ 2024-07-04 12:30     ` Toke Høiland-Jørgensen
  2024-07-04 13:08       ` Florian Kauer
  0 siblings, 1 reply; 8+ messages in thread
From: Toke Høiland-Jørgensen @ 2024-07-04 12:30 UTC (permalink / raw)
  To: Florian Kauer, xdp-newbies; +Cc: Ferenc Fejes

Florian Kauer <florian.kauer@linutronix.de> writes:

> Hi Toke,
> thanks a lot for the prompt response!
>
> On 7/4/24 13:20, Toke Høiland-Jørgensen wrote:
>> Florian Kauer <florian.kauer@linutronix.de> writes:
>> 
>>> Hi,
>>> we are currently using bpf_redirect_map with BPF_F_BROADCAST to replicate frames for sending traffic over redundant paths.
>>>
>>> See for example https://www.rfc-editor.org/rfc/rfc8655.html#section-3.2.2.2 for background
>>> and https://github.com/EricssonResearch/xdpfrer/blob/5f0845cb2e4c4da325f0e77df3020526ad992aff/src/xdpfrer.bpf.c#L393 for the current implementation.
>>>
>>> However, we want to modify the frame after the replication. In the simplest case this means changing the VLAN tag to route the traffic over different VLANs. This is currently done by taking a different egress_ifindex into account after the replication, and that works well so far ( https://github.com/EricssonResearch/xdpfrer/blob/5f0845cb2e4c4da325f0e77df3020526ad992aff/src/xdpfrer.bpf.c#L399 ).
>>>
>>> BUT there are cases where the egress interface shall be the same for both replicated packets, and the paths of the replicated frames only diverge at a subsequent switch based on their different VLAN tags. So how could the XDP program differentiate between the replicated frames if the egress interface is the same?
>>>
>>> So my basic idea would be to add two (or more) DEVMAP entries with the same ifindex into the same map. And then either
>>>
>>> 1. Add several xdp/devmap progs to the same loaded bpf and reference them in the DEVMAP entry, like
>>>
>>> SEC("xdp/devmap")
>>> int replicate_postprocessing_first(struct xdp_md *pkt)
>>> {
>>>     int ret = change_vlan(pkt, 0, true);
>>>     ...
>>> }
>>>
>>> SEC("xdp/devmap")
>>> int replicate_postprocessing_second(struct xdp_md *pkt)
>>> {
>>>     int ret = change_vlan(pkt, 1, true);
>>>     ...
>>> }
>>>
>>> This, however, would be quite inflexible.
>> 
>> Having multiple entries in the devmap corresponds roughly to how
>> the stack handles VLANs. I.e., when configuring a VLAN, you create a new
>> netdevice (which you would then put into the devmap). Unfortunately, XDP
>> doesn't really know how to deal with stacked devices like VLANs, so you
>> can't actually add a VLAN device into a devmap. But creating an
>> interface for this would be one way of dealing with a situation like
>> this, without having to hardcode things into a BPF program.
>
> I see. That would be very nice in general, but for our specific application
> referring to a different VLAN interface is likely still too inflexible
> (e.g. in addition to changing the VLAN tag, we might also want to
> add/remove/modify MPLS labels and other headers).

Yeah, part of the reason this stacked device support never materialised
is that the way the kernel handles it requires a lot of indirection
(MPLS+VLAN would be two stacked devices on top of each other, say), and
for XDP you probably want to short-circuit this anyway. So that means
rolling your own abstraction here; but you can do that with the custom
devmap approach :)

>>> 2. Load the same bpf several times without attaching it to an
>>> interface and set e.g. a const to a different value after loading.
>> 
>> This would work, I think. Something like:
>> 
>> static volatile const __u32 vlan_id = 1;
>> 
>> SEC("xdp/devmap")
>> int replicate_postprocessing_second(struct xdp_md *pkt)
>> {
>>     int ret = change_vlan(pkt, vlan_id, true);
>>     ...
>> }
>> 
>> and then the loader would replace the value of vlan_id before loading;
>> using skeletons this would look something like:
>> 
>> skel = xdp_program_skeleton__open();
>> skel->rodata->vlan_id = 2;
>> xdp_program_skeleton__load(skel);
>> 
>> /* attach to devmap */
>
> Yes, that is exactly what I was imagining, thanks!
>
>> 
>>> But can I even reference an xdp/devmap prog from a different loaded
>>> bpf, especially when it is not attached?
>> 
>> Why do you need to reference it from a different BPF program? The
>> userspace program just attaches it to the right devmap entry?
>
> What I wanted to imply with this is that the lifetime of the different BPF
> programs is unclear to me. So AFAIK (but I might be totally wrong) an
> XDP program has the lifetime of the process that loaded the program
> (i.e. called xdp_program_skeleton__load()) so it is destroyed/unloaded
> as soon as the process ends, UNLESS it is explicitly attached to an interface
> by which it inherits the lifetime of the interface it was attached to
> (i.e. might outlive the loading process).
>
> If I do what you sketched above, where we do not attach the program explicitly
> to an interface but only directly to the devmap, does it then inherit the
> lifetime of the interface of the BPF program the devmap belongs to or is
> it destroyed as soon as the loading process ends?

Ah, right. Devmap attachment works similarly to netlink attachment to an
interface: you pass in an fd as part of the bpf_map_update() from
userspace, and the devmap takes a reference on the BPF program when
inserting it into the map. So the BPF program inserted this way stays
alive as long as the map does, unless it is explicitly removed by
another bpf_map_update().

Normally, the map itself will be referenced by the XDP program running
on ingress, so as long as that stays attached, the map will stay alive.
And when the (ingress) XDP program is removed, it will release its
reference on the map, which in turn will cause the map to release all
references to other BPF programs inserted into it.

So in other words, there is no direct reference between the two XDP
programs, but things should mostly keep working the way you expect them
to...
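As a toy refcount model of that chain (illustrative only, nothing like the real kernel code):

```c
#include <assert.h>

/* Toy model of the reference chain (illustrative only): the devmap holds a
 * reference on each inserted program, and the attached ingress XDP program
 * holds a reference on the devmap. The loader exiting drops only its own
 * fds, which is why the devmap program survives the loading process. */
static int devmap_refs, devmap_prog_refs;

static void insert_prog_into_devmap(void) { devmap_prog_refs++; }
static void attach_ingress_prog(void)     { devmap_refs++; }
static void detach_ingress_prog(void)
{
    if (--devmap_refs == 0)     /* map torn down ...           */
        devmap_prog_refs--;     /* ... releases inserted progs */
}
```

i.e. the loader process exiting does not appear in the chain at all.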

>>> 3. Extend the kernel with a way to let the xdp/devmap prog know from
>>> which DEVMAP entry its execution originates (like an additional entry
>>> in the bpf_devmap_val that is then set in the xdp_md).
>> 
>> This could be useful in any case, so I would personally be fine with
>> adding something like this (for both devmap and cpumap) :)
>
> Would you prefer a simple u32 (or similar) that could then be used as
> index for an array or a more complex data structure/void* to fill
> with arbitrary data?

Well, the API for map indexing specifies a u64 map index, so just
reusing that would be the simplest :)

-Toke



* Re: Different packet handling after bpf_redirect_map with BPF_F_BROADCAST
  2024-07-04 12:30     ` Toke Høiland-Jørgensen
@ 2024-07-04 13:08       ` Florian Kauer
  2024-07-04 14:51         ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 8+ messages in thread
From: Florian Kauer @ 2024-07-04 13:08 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen, xdp-newbies; +Cc: Ferenc Fejes

On 7/4/24 14:30, Toke Høiland-Jørgensen wrote:
> Florian Kauer <florian.kauer@linutronix.de> writes:
> 
>> Hi Toke,
>> thanks a lot for the prompt response!
>>
>> On 7/4/24 13:20, Toke Høiland-Jørgensen wrote:
>>> Florian Kauer <florian.kauer@linutronix.de> writes:
>>>
>>>> Hi,
>>>> we are currently using bpf_redirect_map with BPF_F_BROADCAST to replicate frames for sending traffic over redundant paths.
>>>>
>>>> See for example https://www.rfc-editor.org/rfc/rfc8655.html#section-3.2.2.2 for background
>>>> and https://github.com/EricssonResearch/xdpfrer/blob/5f0845cb2e4c4da325f0e77df3020526ad992aff/src/xdpfrer.bpf.c#L393 for the current implementation.
>>>>
>>>> However, we want to modify the frame after the replication. In the simplest case this means changing the VLAN tag to route the traffic over different VLANs. This is currently done by taking a different egress_ifindex into account after the replication, and that works well so far ( https://github.com/EricssonResearch/xdpfrer/blob/5f0845cb2e4c4da325f0e77df3020526ad992aff/src/xdpfrer.bpf.c#L399 ).
>>>>
>>>> BUT there are cases where the egress interface shall be the same for both replicated packets, and the paths of the replicated frames only diverge at a subsequent switch based on their different VLAN tags. So how could the XDP program differentiate between the replicated frames if the egress interface is the same?
>>>>
>>>> So my basic idea would be to add two (or more) DEVMAP entries with the same ifindex into the same map. And then either
>>>>
>>>> 1. Add several xdp/devmap progs to the same loaded bpf and reference them in the DEVMAP entry, like
>>>>
>>>> SEC("xdp/devmap")
>>>> int replicate_postprocessing_first(struct xdp_md *pkt)
>>>> {
>>>>     int ret = change_vlan(pkt, 0, true);
>>>>     ...
>>>> }
>>>>
>>>> SEC("xdp/devmap")
>>>> int replicate_postprocessing_second(struct xdp_md *pkt)
>>>> {
>>>>     int ret = change_vlan(pkt, 1, true);
>>>>     ...
>>>> }
>>>>
>>>> This, however, would be quite inflexible.
>>>
>>> Having multiple entries in the devmap corresponds roughly to how
>>> the stack handles VLANs. I.e., when configuring a VLAN, you create a new
>>> netdevice (which you would then put into the devmap). Unfortunately, XDP
>>> doesn't really know how to deal with stacked devices like VLANs, so you
>>> can't actually add a VLAN device into a devmap. But creating an
>>> interface for this would be one way of dealing with a situation like
>>> this, without having to hardcode things into a BPF program.
>>
>> I see. That would be very nice in general, but for our specific application
>> referring to a different VLAN interface is likely still too inflexible
>> (e.g. in addition to changing the VLAN tag, we might also want to
>> add/remove/modify MPLS labels and other headers).
> 
> Yeah, part of the reason this stacked device support never materialised
> is that the way the kernel handles it requires a lot of indirection
> (MPLS+VLAN would be two stacked devices on top of each other, say), and
> for XDP you probably want to short-circuit this anyway. So that means
> rolling your own abstraction here; but you can do that with the custom
> devmap approach :)

Sounds good :-)

> 
>>>> 2. Load the same bpf several times without attaching it to an
>>>> interface and set e.g. a const to a different value after loading.
>>>
>>> This would work, I think. Something like:
>>>
>>> static volatile const __u32 vlan_id = 1;
>>>
>>> SEC("xdp/devmap")
>>> int replicate_postprocessing_second(struct xdp_md *pkt)
>>> {
>>>     int ret = change_vlan(pkt, vlan_id, true);
>>>     ...
>>> }
>>>
>>> and then the loader would replace the value of vlan_id before loading;
>>> using skeletons this would look something like:
>>>
>>> skel = xdp_program_skeleton__open();
>>> skel->rodata->vlan_id = 2;
>>> xdp_program_skeleton__load(skel);
>>>
>>> /* attach to devmap */
>>
>> Yes, that is exactly what I was imagining, thanks!
>>
>>>
>>>> But can I even reference an xdp/devmap prog from a different loaded
>>>> bpf, especially when it is not attached?
>>>
>>> Why do you need to reference it from a different BPF program? The
>>> userspace program just attaches it to the right devmap entry?
>>
>> What I wanted to imply with this is that the lifetime of the different BPF
>> programs is unclear to me. So AFAIK (but I might be totally wrong) an
>> XDP program has the lifetime of the process that loaded the program
>> (i.e. called xdp_program_skeleton__load()) so it is destroyed/unloaded
>> as soon as the process ends, UNLESS it is explicitly attached to an interface
>> by which it inherits the lifetime of the interface it was attached to
>> (i.e. might outlive the loading process).
>>
>> If I do what you sketched above, where we do not attach the program explicitly
>> to an interface but only directly to the devmap, does it then inherit the
>> lifetime of the interface of the BPF program the devmap belongs to or is
>> it destroyed as soon as the loading process ends?
> 
> Ah, right. Devmap attachment works similarly to netlink attachment to an
> interface: you pass in an fd as part of the bpf_map_update() from
> userspace, and the devmap takes a reference on the BPF program when
> inserting it into the map. So the BPF program inserted this way stays
> alive as long as the map does, unless it is explicitly removed by
> another bpf_map_update().
> 
> Normally, the map itself will be referenced by the XDP program running
> on ingress, so as long as that stays attached, the map will stay alive.
> And when the (ingress) XDP program is removed, it will release its
> reference on the map, which in turn will cause the map to release all
> references to other BPF programs inserted into it.
> 
> So in other words, there is no direct reference between the two XDP
> programs, but things should mostly keep working the way you expect them
> to...

Ok, that makes a lot of sense, thanks!

> 
>>>> 3. Extend the kernel with a way to let the xdp/devmap prog know from
>>>> which DEVMAP entry its execution originates (like an additional entry
>>>> in the bpf_devmap_val that is then set in the xdp_md).
>>>
>>> This could be useful in any case, so I would personally be fine with
>>> adding something like this (for both devmap and cpumap) :)
>>
>> Would you prefer a simple u32 (or similar) that could then be used as
>> index for an array or a more complex data structure/void* to fill
>> with arbitrary data?
> 
> Well, the API for map indexing specifies a u64 map index, so just
> reusing that would be the simplest :)

u64? Now I am confused:
"The key type is an unsigned 32-bit integer (4 bytes)"
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/Documentation/bpf/map_array.rst?h=next-20240703

> 
> -Toke
> 

Thanks,
Florian


* Re: Different packet handling after bpf_redirect_map with BPF_F_BROADCAST
  2024-07-04 13:08       ` Florian Kauer
@ 2024-07-04 14:51         ` Toke Høiland-Jørgensen
  2024-07-04 15:20           ` Florian Kauer
  0 siblings, 1 reply; 8+ messages in thread
From: Toke Høiland-Jørgensen @ 2024-07-04 14:51 UTC (permalink / raw)
  To: Florian Kauer, xdp-newbies; +Cc: Ferenc Fejes

Florian Kauer <florian.kauer@linutronix.de> writes:

>>>>> 3. Extend the kernel with a way to let the xdp/devmap prog know from
>>>>> which DEVMAP entry its execution originates (like an additional entry
>>>>> in the bpf_devmap_val that is then set in the xdp_md).
>>>>
>>>> This could be useful in any case, so I would personally be fine with
>>>> adding something like this (for both devmap and cpumap) :)
>>>
>>> Would you prefer a simple u32 (or similar) that could then be used as
>>> index for an array or a more complex data structure/void* to fill
>>> with arbitrary data?
>> 
>> Well, the API for map indexing specifies a u64 map index, so just
>> reusing that would be the simplest :)
>
> u64? Now I am confused:
> "The key type is an unsigned 32-bit integer (4 bytes)"
> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/Documentation/bpf/map_array.rst?h=next-20240703

That's the documentation for array maps. Devmap is documented here:

https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/Documentation/bpf/map_devmap.rst?h=next-20240703

The text doesn't say anything explicitly about the key type, but the
function signature for the redirect function has not been updated, it
seems. The key type was changed to u64 in commit:

32637e33003f ("bpf: Expand map key argument of bpf_redirect_map to u64")

-Toke



* Re: Different packet handling after bpf_redirect_map with BPF_F_BROADCAST
  2024-07-04 14:51         ` Toke Høiland-Jørgensen
@ 2024-07-04 15:20           ` Florian Kauer
  2024-07-04 15:36             ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 8+ messages in thread
From: Florian Kauer @ 2024-07-04 15:20 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen, xdp-newbies; +Cc: Ferenc Fejes



On 7/4/24 16:51, Toke Høiland-Jørgensen wrote:
> Florian Kauer <florian.kauer@linutronix.de> writes:
> 
>>>>>> 3. Extend the kernel with a way to let the xdp/devmap prog know from
>>>>>> which DEVMAP entry its execution originates (like an additional entry
>>>>>> in the bpf_devmap_val that is then set in the xdp_md).
>>>>>
>>>>> This could be useful in any case, so I would personally be fine with
>>>>> adding something like this (for both devmap and cpumap) :)
>>>>
>>>> Would you prefer a simple u32 (or similar) that could then be used as
>>>> index for an array or a more complex data structure/void* to fill
>>>> with arbitrary data?
>>>
>>> Well, the API for map indexing specifies a u64 map index, so just
>>> reusing that would be the simplest :)
>>
>> u64? Now I am confused:
>> "The key type is an unsigned 32-bit integer (4 bytes)"
>> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/Documentation/bpf/map_array.rst?h=next-20240703
> 
> That's the documentation for array maps. Devmap is documented here:

Well, an array map would be where I would search for the needed information
(like the VLAN tag) after the redirect.

But I guess you meant not just "reusing" the TYPE of the devmap map index
for another field in bpf_devmap_val, but actually reusing the devmap index
itself and writing it into the xdp_md, right? Then, of course, it would be
a u64 (and needs to be truncated when indexing the array with the VLAN tags).

Makes sense to me. I will try to assemble a patch and send it out for discussion.

Thanks,
Florian

> 
> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/Documentation/bpf/map_devmap.rst?h=next-20240703
> 
> The text doesn't say anything explicitly about the key type, but the
> function signature for the redirect function has not been updated, it
> seems. The key type was changed to u64 in commit:
> 
> 32637e33003f ("bpf: Expand map key argument of bpf_redirect_map to u64")
> 
> -Toke
> 
> 


* Re: Different packet handling after bpf_redirect_map with BPF_F_BROADCAST
  2024-07-04 15:20           ` Florian Kauer
@ 2024-07-04 15:36             ` Toke Høiland-Jørgensen
  0 siblings, 0 replies; 8+ messages in thread
From: Toke Høiland-Jørgensen @ 2024-07-04 15:36 UTC (permalink / raw)
  To: Florian Kauer, xdp-newbies; +Cc: Ferenc Fejes

Florian Kauer <florian.kauer@linutronix.de> writes:

> On 7/4/24 16:51, Toke Høiland-Jørgensen wrote:
>> Florian Kauer <florian.kauer@linutronix.de> writes:
>> 
>>>>>>> 3. Extend the kernel with a way to let the xdp/devmap prog know from
>>>>>>> which DEVMAP entry its execution originates (like an additional entry
>>>>>>> in the bpf_devmap_val that is then set in the xdp_md).
>>>>>>
>>>>>> This could be useful in any case, so I would personally be fine with
>>>>>> adding something like this (for both devmap and cpumap) :)
>>>>>
>>>>> Would you prefer a simple u32 (or similar) that could then be used as
>>>>> index for an array or a more complex data structure/void* to fill
>>>>> with arbitrary data?
>>>>
>>>> Well, the API for map indexing specifies a u64 map index, so just
>>>> reusing that would be the simplest :)
>>>
>>> u64? Now I am confused:
>>> "The key type is an unsigned 32-bit integer (4 bytes)"
>>> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/Documentation/bpf/map_array.rst?h=next-20240703
>> 
>> That's the documentation for array maps. Devmap is documented here:
>
> Well, an array map would be where I would search for the needed information
> (like the VLAN tag) after the redirect.
>
> But I guess you meant not just "reusing" the TYPE of the devmap map index
> for another field in bpf_devmap_val, but actually reusing the devmap index
> itself and writing it into the xdp_md, right? Then, of course, it would be
> a u64 (and needs to be truncated when indexing the array with the VLAN
> tags).

Yes, exactly. The "you are currently running in this devmap slot"
metadata is something the kernel should provide to the running program
as part of the execution context.

> Makes sense to me. I will try to assemble a patch and send it out for
> discussion.

Cool :)

-Toke



end of thread, other threads:[~2024-07-04 15:36 UTC | newest]

Thread overview: 8+ messages
2024-07-04 10:19 Different packet handling after bpf_redirect_map with BPF_F_BROADCAST Florian Kauer
2024-07-04 11:20 ` Toke Høiland-Jørgensen
2024-07-04 12:00   ` Florian Kauer
2024-07-04 12:30     ` Toke Høiland-Jørgensen
2024-07-04 13:08       ` Florian Kauer
2024-07-04 14:51         ` Toke Høiland-Jørgensen
2024-07-04 15:20           ` Florian Kauer
2024-07-04 15:36             ` Toke Høiland-Jørgensen
