public inbox for bpf@vger.kernel.org
* [LSF/MM/BPF TOPIC] BPF local storage for every packet
@ 2026-02-20 14:56 Jakub Sitnicki
  2026-02-20 18:34 ` Martin KaFai Lau
  2026-03-03 15:06 ` Zhu Yanjun
  0 siblings, 2 replies; 8+ messages in thread
From: Jakub Sitnicki @ 2026-02-20 14:56 UTC (permalink / raw)
  To: lsf-pc; +Cc: bpf, kernel-team

In the upcoming days we are going to post an RFC which proposes to
extend the concept of BPF local storage to socket buffers (sk_buff, skb)
as a means to attach arbitrary metadata to packets from BPF programs [1]
(slides 41-55).

Design-wise, BPF local storage is a great fit for a packet metadata
container, as it avoids some of the shortcomings of the XDP
metadata interface:

1. Users interact with storage through BPF maps and can take advantage
   of existing built-in BPF map types, while still being able to
   implement a custom data format,

2. Maps within local storage can have different properties controlled by
   map flags. For example, maps with BPF_F_CLONE set can survive packet
   cloning. Other flags could allow map contents to survive sk_buff
   scrubbing during encapsulation/decapsulation or pass across network
   namespace boundaries.

3. Local storage supports multiple users out of the box - each user
   creates their own map, eliminating the need to coordinate data
   layout,

4. Local storage has its own backing memory, so persisting it across
   network stack layers requires no changes to the network stack.

However, this flexibility comes at a cost. While XDP metadata requires
no allocations [2], an initial write to BPF local storage requires two:
one for bpf_local_storage_elem, and one for bpf_local_storage itself.

We would like to align this work with the needs of other BPF local
storage users (socks, cgroups, tasks, inodes), where allocation overhead
has been a concern as well [3].

Optimization ideas we would like to put up for discussion:
- slimming down bpf_local_storage so it can be embedded as an skb
  extension chunk,
- making the bpf_local_storage cache size configurable,
- allowing bpf_local_storage to be pre-allocated,
- co-allocating bpf_local_storage and bpf_local_storage_elem for the
  single-map case.

Thanks,
-jkbs

[1] https://fosdem.org/2026/schedule/event/DSC9L3-rich-packet-metadata/
[2] Assuming sufficient free headroom in the skb linear buffer.
[3] http://msgid.link/ad835a9b-e544-48d3-b6e2-ffe172fcfa6d@linux.dev

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [LSF/MM/BPF TOPIC] BPF local storage for every packet
  2026-02-20 14:56 [LSF/MM/BPF TOPIC] BPF local storage for every packet Jakub Sitnicki
@ 2026-02-20 18:34 ` Martin KaFai Lau
  2026-02-21 13:42   ` Jakub Sitnicki
  2026-03-03 15:06 ` Zhu Yanjun
  1 sibling, 1 reply; 8+ messages in thread
From: Martin KaFai Lau @ 2026-02-20 18:34 UTC (permalink / raw)
  To: Jakub Sitnicki; +Cc: bpf, kernel-team, lsf-pc

On 2/20/26 6:56 AM, Jakub Sitnicki wrote:
> In the upcoming days we are going to post an RFC which proposes to
> extend the concept of BPF local storage to socket buffers (sk_buff, skb)
> as a means to attach arbitrary metadata to packets from BPF programs [1]
> (slides 41-55).
> 
> Design-wise, BPF local storage is a great fit for a packet metadata
> container, as it avoids some of the shortcomings of the XDP
> metadata interface:
> 
> 1. Users interact with storage through BPF maps and can take advantage
>     of existing built-in BPF map types, while still being able to
>     implement a custom data format,
> 
> 2. Maps within local storage can have different properties controlled by
>     map flags. For example, maps with BPF_F_CLONE set can survive packet
>     cloning. Other flags could allow map contents to survive sk_buff
>     scrubbing during encapsulation/decapsulation or pass across network
>     namespace boundaries.
> 
> 3. Local storage supports multiple users out of the box - each user
>     creates their own map, eliminating the need to coordinate data
>     layout,
> 
> 4. Local storage has its own backing memory, so persisting it across
>     network stack layers requires no changes to the network stack.
> 
> However, this flexibility comes at a cost. While XDP metadata requires
> no allocations [2], an initial write to BPF local storage requires two:
> one for bpf_local_storage_elem, and one for bpf_local_storage itself.
> 
> We would like to align this work with the needs of other BPF local
> storage users (socks, cgroups, tasks, inodes), where allocation overhead
> has been a concern as well [3].
> 
> Optimization ideas we would like to put up for discussion:
> - slimming down bpf_local_storage so it can be embedded as an skb
>    extension chunk,
> - making the bpf_local_storage cache size configurable,
> - allowing bpf_local_storage to be pre-allocated,
> - co-allocating bpf_local_storage and bpf_local_storage_elem for the
>    single-map case.

Sk/cgroup/task storage has a much longer lifetime: once the allocation
is done, the storage stays in the sk until the sk is closed. That
lifetime is quite different from an skb's. I am afraid we are
re-purposing bpf_local_storage for a very different use case where the
skb lifecycle is much shorter.

We are planning to increase 'sizeof(struct sock)' for perf reasons.
Saving an allocation is an upside, but not the major one we are looking
for (or care about) for sk. We are more interested in cacheline
efficiency, and probably in removing the need for
bpf_local_storage[_elem] altogether if the user chooses to use the
in-place space of a sk.

If 'sizeof(struct sk_buff)' can be increased, this would align with
where sk local storage is going. If skb will rely solely on the existing
bpf_local_storage and there is no plan to raise sizeof(struct sk_buff)
for perf purposes, the existing bpf_local_storage may be the wrong place
to repurpose/optimize, because the lifecycle of an skb is very
different.

> [1] https://fosdem.org/2026/schedule/event/DSC9L3-rich-packet-metadata/
> [2] Assuming sufficient free headroom in the skb linear buffer.
> [3] http://msgid.link/ad835a9b-e544-48d3-b6e2-ffe172fcfa6d@linux.dev
> 


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [LSF/MM/BPF TOPIC] BPF local storage for every packet
  2026-02-20 18:34 ` Martin KaFai Lau
@ 2026-02-21 13:42   ` Jakub Sitnicki
  2026-02-23 19:26     ` Martin KaFai Lau
  0 siblings, 1 reply; 8+ messages in thread
From: Jakub Sitnicki @ 2026-02-21 13:42 UTC (permalink / raw)
  To: Martin KaFai Lau; +Cc: bpf, kernel-team, lsf-pc

On Fri, Feb 20, 2026 at 10:34 AM -08, Martin KaFai Lau wrote:
> On 2/20/26 6:56 AM, Jakub Sitnicki wrote:
>> In the upcoming days we are going to post an RFC which proposes to
>> extend the concept of BPF local storage to socket buffers (sk_buff, skb)
>> as a means to attach arbitrary metadata to packets from BPF programs [1]
>> (slides 41-55).
>> Design-wise, BPF local storage is a great fit for a packet metadata
>> container, as it avoids some of the shortcomings of the XDP
>> metadata interface:
>> 1. Users interact with storage through BPF maps and can take advantage
>>     of existing built-in BPF map types, while still being able to
>>     implement a custom data format,
>> 2. Maps within local storage can have different properties controlled by
>>     map flags. For example, maps with BPF_F_CLONE set can survive packet
>>     cloning. Other flags could allow map contents to survive sk_buff
>>     scrubbing during encapsulation/decapsulation or pass across network
>>     namespace boundaries.
>> 3. Local storage supports multiple users out of the box - each user
>>     creates their own map, eliminating the need to coordinate data
>>     layout,
>> 4. Local storage has its own backing memory, so persisting it across
>>     network stack layers requires no changes to the network stack.
>> However, this flexibility comes at a cost. While XDP metadata requires
>> no allocations [2], an initial write to BPF local storage requires two:
>> one for bpf_local_storage_elem, and one for bpf_local_storage itself.
>> We would like to align this work with the needs of other BPF local
>> storage users (socks, cgroups, tasks, inodes), where allocation overhead
>> has been a concern as well [3].
>> Optimization ideas we would like to put up for discussion:
>> - slimming down bpf_local_storage so it can be embedded as an skb
>>    extension chunk,
>> - making the bpf_local_storage cache size configurable,
>> - allowing bpf_local_storage to be pre-allocated,
>> - co-allocating bpf_local_storage and bpf_local_storage_elem for the
>>    single-map case.
>
> Sk/cgroup/task storage has a much longer lifetime: once the allocation
> is done, the storage stays in the sk until the sk is closed. That
> lifetime is quite different from an skb's. I am afraid we are
> re-purposing bpf_local_storage for a very different use case where the
> skb lifecycle is much shorter.
>
> We are planning to increase 'sizeof(struct sock)' for perf reasons.
> Saving an allocation is an upside, but not the major one we are looking
> for (or care about) for sk. We are more interested in cacheline
> efficiency, and probably in removing the need for
> bpf_local_storage[_elem] altogether if the user chooses to use the
> in-place space of a sk.
>
> If 'sizeof(struct sk_buff)' can be increased, this would align with
> where sk local storage is going. If skb will rely solely on the
> existing bpf_local_storage and there is no plan to raise sizeof(struct
> sk_buff) for perf purposes, the existing bpf_local_storage may be the
> wrong place to repurpose/optimize, because the lifecycle of an skb is
> very different.

The lifetime difference is undeniable, but I still see common ground.
To make it more concrete:

1. IIRC you've mentioned wanting more bpf_local_storage->cache entries
   for socks in some scenarios, while for skbs I'd expect we need
   fewer. We could make the cache size configurable via a flexible
   array.

2. Embedding bpf_local_storage is another overlap I had in mind. For
   socks that would be within the same memory blob as struct sock, while
   for skbs we'd want to embed it in skb_ext (once it's small enough).
   This depends on whether you end up dropping bpf_local_storage for
   sk_local_storage entirely, which I didn't know about until now.

3. I've heard the idea of allocating skb_ext memory together with
   sk_buff was floated in the past. While trimming skb_ext at build
   time is hard today (say I need XFRM but don't care about crypto
   offloads keeping state in skb_ext), the idea is similar to what
   you're proposing for struct sock.

Thanks,
-jkbs

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [LSF/MM/BPF TOPIC] BPF local storage for every packet
  2026-02-21 13:42   ` Jakub Sitnicki
@ 2026-02-23 19:26     ` Martin KaFai Lau
  2026-02-24 11:58       ` Jakub Sitnicki
  0 siblings, 1 reply; 8+ messages in thread
From: Martin KaFai Lau @ 2026-02-23 19:26 UTC (permalink / raw)
  To: Jakub Sitnicki; +Cc: bpf, kernel-team, lsf-pc

On 2/21/26 5:42 AM, Jakub Sitnicki wrote:
> On Fri, Feb 20, 2026 at 10:34 AM -08, Martin KaFai Lau wrote:
>> On 2/20/26 6:56 AM, Jakub Sitnicki wrote:
>>> In the upcoming days we are going to post an RFC which proposes to
>>> extend the concept of BPF local storage to socket buffers (sk_buff, skb)
>>> as a means to attach arbitrary metadata to packets from BPF programs [1]
>>> (slides 41-55).
>>> Design-wise, BPF local storage is a great fit for a packet metadata
>>> container, as it avoids some of the shortcomings of the XDP
>>> metadata interface:
>>> 1. Users interact with storage through BPF maps and can take advantage
>>>      of existing built-in BPF map types, while still being able to
>>>      implement a custom data format,
>>> 2. Maps within local storage can have different properties controlled by
>>>      map flags. For example, maps with BPF_F_CLONE set can survive packet
>>>      cloning. Other flags could allow map contents to survive sk_buff
>>>      scrubbing during encapsulation/decapsulation or pass across network
>>>      namespace boundaries.
>>> 3. Local storage supports multiple users out of the box - each user
>>>      creates their own map, eliminating the need to coordinate data
>>>      layout,
>>> 4. Local storage has its own backing memory, so persisting it across
>>>      network stack layers requires no changes to the network stack.
>>> However, this flexibility comes at a cost. While XDP metadata requires
>>> no allocations [2], an initial write to BPF local storage requires two:
>>> one for bpf_local_storage_elem, and one for bpf_local_storage itself.
>>> We would like to align this work with the needs of other BPF local
>>> storage users (socks, cgroups, tasks, inodes), where allocation overhead
>>> has been a concern as well [3].
>>> Optimization ideas we would like to put up for discussion:
>>> - slimming down bpf_local_storage so it can be embedded as an skb
>>>     extension chunk,
>>> - making the bpf_local_storage cache size configurable,
>>> - allowing bpf_local_storage to be pre-allocated,
>>> - co-allocating bpf_local_storage and bpf_local_storage_elem for the
>>>     single-map case.
>>
>> Sk/cgroup/task storage has a much longer lifetime: once the allocation
>> is done, the storage stays in the sk until the sk is closed. That
>> lifetime is quite different from an skb's. I am afraid we are
>> re-purposing bpf_local_storage for a very different use case where the
>> skb lifecycle is much shorter.
>>
>> We are planning to increase 'sizeof(struct sock)' for perf reasons.
>> Saving an allocation is an upside, but not the major one we are looking
>> for (or care about) for sk. We are more interested in cacheline
>> efficiency, and probably in removing the need for
>> bpf_local_storage[_elem] altogether if the user chooses to use the
>> in-place space of a sk.
>>
>> If 'sizeof(struct sk_buff)' can be increased, this would align with
>> where sk local storage is going. If skb will rely solely on the
>> existing bpf_local_storage and there is no plan to raise sizeof(struct
>> sk_buff) for perf purposes, the existing bpf_local_storage may be the
>> wrong place to repurpose/optimize, because the lifecycle of an skb is
>> very different.
> 
> The lifetime difference is undeniable, but I still see common ground.
> To make it more concrete:
> 
> 1. IIRC you've mentioned wanting more bpf_local_storage->cache entries
>     for socks in some scenarios, while for skbs I'd expect we need
>     fewer. We could make the cache size configurable via a flexible
>     array.
> 
> 2. Embedding bpf_local_storage is another overlap I had in mind. For
>     socks that would be within the same memory blob as struct sock, while for
>     skbs we'd want to embed it in skb_ext (once it's small enough). This
>     depends on whether you end up dropping bpf_local_storage for
>     sk_local_storage entirely, which I didn't know about until now.

For the in-place sk storage, it should not need bpf_local_storage or
bpf_local_storage_elem. A stable map_xyz->sk_offset should be enough. If
a storage is needed for all sk, the bpf prog should use the in-place sk
storage instead of going through bpf_local_storage[_elem].

imo, if we manage to pull off a new solution (whatever that is) for skb
but it does not perform close to skb->data_meta, it is probably hard to
use in production. I could be wrong, but I don't see how embedding
local_storage and/or shrinking the cache can get there. I think we need
another solution/design.

> 
> 3. I've heard the idea of allocating skb_ext memory together with
>     sk_buff was floated in the past. While trimming skb_ext at build
>     time is hard today (say I need XFRM but don't care about crypto
>     offloads keeping state in skb_ext), the idea is similar to what
>     you're proposing for struct sock.


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [LSF/MM/BPF TOPIC] BPF local storage for every packet
  2026-02-23 19:26     ` Martin KaFai Lau
@ 2026-02-24 11:58       ` Jakub Sitnicki
  0 siblings, 0 replies; 8+ messages in thread
From: Jakub Sitnicki @ 2026-02-24 11:58 UTC (permalink / raw)
  To: Martin KaFai Lau; +Cc: bpf, kernel-team, lsf-pc

On Mon, Feb 23, 2026 at 11:26 AM -08, Martin KaFai Lau wrote:
> On 2/21/26 5:42 AM, Jakub Sitnicki wrote:
>> On Fri, Feb 20, 2026 at 10:34 AM -08, Martin KaFai Lau wrote:
>>> On 2/20/26 6:56 AM, Jakub Sitnicki wrote:
>>>> In the upcoming days we are going to post an RFC which proposes to
>>>> extend the concept of BPF local storage to socket buffers (sk_buff, skb)
>>>> as a means to attach arbitrary metadata to packets from BPF programs [1]
>>>> (slides 41-55).
>>>> Design-wise, BPF local storage is a great fit for a packet metadata
>>>> container, as it avoids some of the shortcomings of the XDP
>>>> metadata interface:
>>>> 1. Users interact with storage through BPF maps and can take advantage
>>>>      of existing built-in BPF map types, while still being able to
>>>>      implement a custom data format,
>>>> 2. Maps within local storage can have different properties controlled by
>>>>      map flags. For example, maps with BPF_F_CLONE set can survive packet
>>>>      cloning. Other flags could allow map contents to survive sk_buff
>>>>      scrubbing during encapsulation/decapsulation or pass across network
>>>>      namespace boundaries.
>>>> 3. Local storage supports multiple users out of the box - each user
>>>>      creates their own map, eliminating the need to coordinate data
>>>>      layout,
>>>> 4. Local storage has its own backing memory, so persisting it across
>>>>      network stack layers requires no changes to the network stack.
>>>> However, this flexibility comes at a cost. While XDP metadata requires
>>>> no allocations [2], an initial write to BPF local storage requires two:
>>>> one for bpf_local_storage_elem, and one for bpf_local_storage itself.
>>>> We would like to align this work with the needs of other BPF local
>>>> storage users (socks, cgroups, tasks, inodes), where allocation overhead
>>>> has been a concern as well [3].
>>>> Optimization ideas we would like to put up for discussion:
>>>> - slimming down bpf_local_storage so it can be embedded as an skb
>>>>     extension chunk,
>>>> - making the bpf_local_storage cache size configurable,
>>>> - allowing bpf_local_storage to be pre-allocated,
>>>> - co-allocating bpf_local_storage and bpf_local_storage_elem for the
>>>>     single-map case.
>>>
>>> Sk/cgroup/task storage has a much longer lifetime: once the allocation
>>> is done, the storage stays in the sk until the sk is closed. That
>>> lifetime is quite different from an skb's. I am afraid we are
>>> re-purposing bpf_local_storage for a very different use case where the
>>> skb lifecycle is much shorter.
>>>
>>> We are planning to increase 'sizeof(struct sock)' for perf reasons.
>>> Saving an allocation is an upside, but not the major one we are looking
>>> for (or care about) for sk. We are more interested in cacheline
>>> efficiency, and probably in removing the need for
>>> bpf_local_storage[_elem] altogether if the user chooses to use the
>>> in-place space of a sk.
>>>
>>> If 'sizeof(struct sk_buff)' can be increased, this would align with
>>> where sk local storage is going. If skb will rely solely on the
>>> existing bpf_local_storage and there is no plan to raise sizeof(struct
>>> sk_buff) for perf purposes, the existing bpf_local_storage may be the
>>> wrong place to repurpose/optimize, because the lifecycle of an skb is
>>> very different.
>> The lifetime difference is undeniable, but I still see common ground.
>> To make it more concrete:
>> 1. IIRC you've mentioned wanting more bpf_local_storage->cache entries
>>     for socks in some scenarios, while for skbs I'd expect we need
>>     fewer. We could make the cache size configurable via a flexible
>>     array.
>> 2. Embedding bpf_local_storage is another overlap I had in mind. For
>>     socks that would be within the same memory blob as struct sock, while for
>>     skbs we'd want to embed it in skb_ext (once it's small enough). This
>>     depends on whether you end up dropping bpf_local_storage for
>>     sk_local_storage entirely, which I didn't know about until now.
>
> For the in-place sk storage, it should not need the bpf_local_storage and the
> bpf_local_storage_elem. A stable map_xyz->sk_offset should be enough. If a
> storage is needed for all sk, the bpf prog should use the in-place sk storage
> instead of going through the bpf_local_storage[_elem].

Call me overly optimistic, but if we can pull it off for sk storage,
then what stops us from transplanting this pattern to skb_ext and skb
storage?

> imo, if we manage to pull off a new solution (whatever that is) for skb
> but it does not perform close to skb->data_meta, it is probably hard to
> use in production. I could be wrong, but I don't see how embedding
> local_storage and/or shrinking the cache can get there. I think we need
> another solution/design.

skb->data_meta is allocation-free. That would be the ultimate goal.

I could see that happening if we allocate space for skb_ext together
with sk_buff and embed the map storage within the skb_ext chunk.

Apart from the long-term goal, I still see value in a naive BPF local
storage implementation, like we have for sock/task/..., today, because:

1. skb local storage would be available only after GRO, so we're dealing
   with a lower pps rate than XDP.

2. If you have use cases, like we do, where you want to attach metadata
   only to the first packet of an L4 connection, then the skb local
   storage allocation rate is the same as your established-socket
   allocation rate. And we know BPF local storage is good enough for
   that.

3. There's a feature gap: skb->data_meta doesn't survive past TC. Paying
   an allocation cost - the user decides if it's worth the price - is
   better than nothing.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [LSF/MM/BPF TOPIC] BPF local storage for every packet
  2026-02-20 14:56 [LSF/MM/BPF TOPIC] BPF local storage for every packet Jakub Sitnicki
  2026-02-20 18:34 ` Martin KaFai Lau
@ 2026-03-03 15:06 ` Zhu Yanjun
  2026-03-03 21:07   ` Jakub Sitnicki
  1 sibling, 1 reply; 8+ messages in thread
From: Zhu Yanjun @ 2026-03-03 15:06 UTC (permalink / raw)
  To: Jakub Sitnicki, lsf-pc; +Cc: bpf, kernel-team

On 2026/2/20 6:56, Jakub Sitnicki wrote:
> In the upcoming days we are going to post an RFC which proposes to
> extend the concept of BPF local storage to socket buffers (sk_buff, skb)
> as a means to attach arbitrary metadata to packets from BPF programs [1]
> (slides 41-55).
> 
> Design-wise, BPF local storage is a great fit for a packet metadata
> container, as it avoids some of the shortcomings of the XDP
> metadata interface:
> 
> 1. Users interact with storage through BPF maps and can take advantage
>     of existing built-in BPF map types, while still being able to
>     implement a custom data format,
> 
> 2. Maps within local storage can have different properties controlled by
>     map flags. For example, maps with BPF_F_CLONE set can survive packet
>     cloning. Other flags could allow map contents to survive sk_buff
>     scrubbing during encapsulation/decapsulation or pass across network
>     namespace boundaries.
> 
> 3. Local storage supports multiple users out of the box - each user
>     creates their own map, eliminating the need to coordinate data
>     layout,
> 
> 4. Local storage has its own backing memory, so persisting it across
>     network stack layers requires no changes to the network stack.
> 
> However, this flexibility comes at a cost. While XDP metadata requires
> no allocations [2], an initial write to BPF local storage requires two:
> one for bpf_local_storage_elem, and one for bpf_local_storage itself.
> 
> We would like to align this work with the needs of other BPF local
> storage users (socks, cgroups, tasks, inodes), where allocation overhead
> has been a concern as well [3].
> 
> Optimization ideas we would like to put up for discussion:
> - slimming down bpf_local_storage so it can be embedded as an skb
>    extension chunk,

Interested in this topic. I hope to join this meeting.

Zhu Yanjun

> - making the bpf_local_storage cache size configurable,
> - allowing bpf_local_storage to be pre-allocated,
> - co-allocating bpf_local_storage and bpf_local_storage_elem for the
>    single-map case.
> 
> Thanks,
> -jkbs
> 
> [1] https://fosdem.org/2026/schedule/event/DSC9L3-rich-packet-metadata/
> [2] Assuming sufficient free headroom in the skb linear buffer.
> [3] http://msgid.link/ad835a9b-e544-48d3-b6e2-ffe172fcfa6d@linux.dev


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [LSF/MM/BPF TOPIC] BPF local storage for every packet
  2026-03-03 15:06 ` Zhu Yanjun
@ 2026-03-03 21:07   ` Jakub Sitnicki
  2026-03-16  3:02     ` Zhu Yanjun
  0 siblings, 1 reply; 8+ messages in thread
From: Jakub Sitnicki @ 2026-03-03 21:07 UTC (permalink / raw)
  To: Zhu Yanjun; +Cc: lsf-pc, bpf, kernel-team

On Tue, Mar 03, 2026 at 07:06 AM -08, Zhu Yanjun wrote:
> On 2026/2/20 6:56, Jakub Sitnicki wrote:
>> In the upcoming days we are going to post an RFC which proposes to
>> extend the concept of BPF local storage to socket buffers (sk_buff, skb)
>> as a means to attach arbitrary metadata to packets from BPF programs [1]
>> (slides 41-55).
>> Design-wise, BPF local storage is a great fit for a packet metadata
>> container, as it avoids some of the shortcomings of the XDP
>> metadata interface:
>> 1. Users interact with storage through BPF maps and can take advantage
>>     of existing built-in BPF map types, while still being able to
>>     implement a custom data format,
>> 2. Maps within local storage can have different properties controlled by
>>     map flags. For example, maps with BPF_F_CLONE set can survive packet
>>     cloning. Other flags could allow map contents to survive sk_buff
>>     scrubbing during encapsulation/decapsulation or pass across network
>>     namespace boundaries.
>> 3. Local storage supports multiple users out of the box - each user
>>     creates their own map, eliminating the need to coordinate data
>>     layout,
>> 4. Local storage has its own backing memory, so persisting it across
>>     network stack layers requires no changes to the network stack.
>> However, this flexibility comes at a cost. While XDP metadata requires
>> no allocations [2], an initial write to BPF local storage requires two:
>> one for bpf_local_storage_elem, and one for bpf_local_storage itself.
>> We would like to align this work with the needs of other BPF local
>> storage users (socks, cgroups, tasks, inodes), where allocation overhead
>> has been a concern as well [3].
>> Optimization ideas we would like to put up for discussion:
>> - slimming down bpf_local_storage so it can be embedded as an skb
>>    extension chunk,
>
> Interested in this topic. I hope to join this meeting.

Thanks for the interest. I've since posted the RFC for that [1] and the
topic is, at least partially, no longer relevant. We won't be adding new
users of BPF local storage [2].

I've proposed to the PC that we change it to:

1) How to make regular BPF maps work as stash-away storage for skb
metadata. I've highlighted my initial concerns [3] and will give it a
try to get hands-on experience with this approach.

2) Or, if we decide to go with secondary skb metadata embedded in
skb_ext - another direction I wanted to explore - then we could discuss
how to optimize skb_ext (this overlaps with the original proposal).

Thanks,
-jkbs

[1] https://lore.kernel.org/all/20260226-skb-local-storage-v1-0-4ca44f0dd9d1@cloudflare.com/
[2] https://lore.kernel.org/all/CAADnVQKVfyh3_OZshvYf7GJUF-ph2eMfmaQsxNgwBJd1AJgXTQ@mail.gmail.com/
[3] https://lore.kernel.org/all/87wlzydk12.fsf@cloudflare.com/

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [LSF/MM/BPF TOPIC] BPF local storage for every packet
  2026-03-03 21:07   ` Jakub Sitnicki
@ 2026-03-16  3:02     ` Zhu Yanjun
  0 siblings, 0 replies; 8+ messages in thread
From: Zhu Yanjun @ 2026-03-16  3:02 UTC (permalink / raw)
  To: Jakub Sitnicki; +Cc: lsf-pc, bpf, kernel-team


On 2026/3/3 13:07, Jakub Sitnicki wrote:
> On Tue, Mar 03, 2026 at 07:06 AM -08, Zhu Yanjun wrote:
>> On 2026/2/20 6:56, Jakub Sitnicki wrote:
>>> In the upcoming days we are going to post an RFC which proposes to
>>> extend the concept of BPF local storage to socket buffers (sk_buff, skb)
>>> as a means to attach arbitrary metadata to packets from BPF programs [1]
>>> (slides 41-55).
>>> Design-wise, BPF local storage is a great fit for a packet metadata
>>> container, as it avoids some of the shortcomings of the XDP
>>> metadata interface:
>>> 1. Users interact with storage through BPF maps and can take advantage
>>>      of existing built-in BPF map types, while still being able to
>>>      implement a custom data format,
>>> 2. Maps within local storage can have different properties controlled by
>>>      map flags. For example, maps with BPF_F_CLONE set can survive packet
>>>      cloning. Other flags could allow map contents to survive sk_buff
>>>      scrubbing during encapsulation/decapsulation or pass across network
>>>      namespace boundaries.
>>> 3. Local storage supports multiple users out of the box - each user
>>>      creates their own map, eliminating the need to coordinate data
>>>      layout,
>>> 4. Local storage has its own backing memory, so persisting it across
>>>      network stack layers requires no changes to the network stack.
>>> However, this flexibility comes at a cost. While XDP metadata requires
>>> no allocations [2], an initial write to BPF local storage requires two:
>>> one for bpf_local_storage_elem, and one for bpf_local_storage itself.
>>> We would like to align this work with the needs of other BPF local
>>> storage users (socks, cgroups, tasks, inodes), where allocation overhead
>>> has been a concern as well [3].
>>> Optimization ideas we would like to put up for discussion:
>>> - slimming down bpf_local_storage so it can be embedded as an skb
>>>     extension chunk,
>> Interested in this topic. I hope to join this meeting.
> Thanks for the interest. I've since posted the RFC for that [1] and the
> topic is, at least partially, no longer relevant. We won't be adding new
> users of BPF local storage [2].
>
> I've proposed to the PC that we change it to:
>
> 1) How to make regular BPF maps work as stash-away storage for skb
> metadata. I've highlighted my initial concerns [3] and will give it a
> try to get hands-on experience with this approach.
>
> 2) Or, if we decide to go with secondary skb metadata embedded in
> skb_ext - another direction I wanted to explore - then we could
> discuss how to optimize skb_ext (this overlaps with the original
> proposal).

Thanks a lot. I’m very interested in this topic and was wondering 
whether it is on the agenda for the LSF meeting.

Zhu Yanjun

>
> Thanks,
> -jkbs
>
> [1] https://lore.kernel.org/all/20260226-skb-local-storage-v1-0-4ca44f0dd9d1@cloudflare.com/
> [2] https://lore.kernel.org/all/CAADnVQKVfyh3_OZshvYf7GJUF-ph2eMfmaQsxNgwBJd1AJgXTQ@mail.gmail.com/
> [3] https://lore.kernel.org/all/87wlzydk12.fsf@cloudflare.com/

-- 
Best Regards,
Yanjun.Zhu


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2026-03-16  3:02 UTC | newest]

Thread overview: 8+ messages
-- links below jump to the message on this page --
2026-02-20 14:56 [LSF/MM/BPF TOPIC] BPF local storage for every packet Jakub Sitnicki
2026-02-20 18:34 ` Martin KaFai Lau
2026-02-21 13:42   ` Jakub Sitnicki
2026-02-23 19:26     ` Martin KaFai Lau
2026-02-24 11:58       ` Jakub Sitnicki
2026-03-03 15:06 ` Zhu Yanjun
2026-03-03 21:07   ` Jakub Sitnicki
2026-03-16  3:02     ` Zhu Yanjun
