From: Yonghong Song <yonghong.song@linux.dev>
To: Fabian Pfitzner <f.pfitzner@tu-braunschweig.de>, bpf@vger.kernel.org
Subject: Re: No direct copy from ctx to map possible, why?
Date: Wed, 17 Apr 2024 17:09:05 -0700
Message-ID: <f102239c-69cc-4ca7-8e21-7efb66bfaceb@linux.dev>
In-Reply-To: <9c019772-8c21-4eb5-908d-103f0966dc13@tu-braunschweig.de>
On 4/17/24 12:38 PM, Fabian Pfitzner wrote:
>> In your particular example, since you intend to copy xdp_md->data,
>> you can directly access that from xdp_md->data pointer, there is no
>> need to copy ctx which is not what you want.
> Thanks for your answer, but I think you misunderstood me. I need to
> store the packet's payload in a map (not the xdp_md structure itself),
> because my use case forces me to do so.
>
> I write a program that reassembles split packets into a single one.
> Therefore I have to buffer packet fragments until all have arrived.
> The only way in eBPF to realize such a buffer is a map, so I
> have to put the packet's payload in there. My problem is, that I have
> no clue how to do it properly as there is no direct way to put the
> payload into a map.
>
> How would you put a packet with a size of 700 bytes into a map? What
> would be your strategy when you can only access your packet via the
> xdp_md structure? My strategy (and that's the best I have found so
> far) is to split this packet into two packets of size 350 bytes, so
> that I can process them on the stack consecutively.

The map value can be a packet pointer. The verifier message you mentioned
earlier already lists it as an accepted type:

  expecting another type as "ctx" (R3 type=ctx expected=fp, pkt, pkt_meta, .....).

But you need to do a packet range check first, so that the checked packet
range (starting from packet->data) is the same as or greater than the map
value size.
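
As a rough illustration of the range check described above, here is a small
userspace C model. It is not a BPF program and does not call any kernel API;
it only mirrors the condition that has to hold before a packet pointer is
passed as the value argument. The names VALUE_SIZE, frag_value,
packet_covers_value and store_fragment are made up for this sketch, and the
700-byte size is taken from the example in the question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical map value size, taken from the 700-byte example. */
#define VALUE_SIZE 700

/* Hypothetical stand-in for one map value: a raw byte array. */
struct frag_value {
    uint8_t bytes[VALUE_SIZE];
};

/*
 * Models the range check: [data, data + VALUE_SIZE) must lie inside
 * [data, data_end). In an XDP program the same test would look roughly like
 *     if (data + VALUE_SIZE > data_end) return XDP_PASS;
 * followed by
 *     bpf_map_update_elem(&map, &key, data, BPF_ANY);
 * where 'map' and 'key' are placeholders, not names from this thread.
 */
static int packet_covers_value(const uint8_t *data, const uint8_t *data_end)
{
    return data_end >= data && (size_t)(data_end - data) >= VALUE_SIZE;
}

/*
 * Copies VALUE_SIZE bytes of packet data into the slot.
 * Returns 0 on success, -1 if the packet is shorter than VALUE_SIZE.
 */
static int store_fragment(struct frag_value *slot,
                          const uint8_t *data, const uint8_t *data_end)
{
    assert(slot != NULL);
    if (!packet_covers_value(data, data_end))
        return -1;
    memcpy(slot->bytes, data, VALUE_SIZE);
    return 0;
}
```

Note that a packet shorter than VALUE_SIZE fails the check, so a fixed-size
value like this only fits packets that are at least that long.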
>
> On 4/16/24 5:22 AM, Yonghong Song wrote:
>>
>> On 4/15/24 1:25 PM, Fabian Pfitzner wrote:
>>>> Looks like you intend to copy packet data. So from the above,
>>>> 'expected=fp,pkt,pkt_meta...', you can just put the first argument
>>>> with xdp->data, right?
>>> Yes, I intend to copy packet data. What do you mean by "first
>>> argument"? I'd like to put the whole data that is depicted by
>>> xdp->data into a map that stores them as raw bytes (by using a char
>>> array as map element to store the data).
>>
>> Sorry, typo. 'first argument' should be 'third argument'.
>>
>>>
>>>> Verifier rejects 'ctx' since 'ctx' contents are subject to
>>>> verifier rewrite. So actual 'ctx' contents/layouts may not match
>>>> uapi definition.
>>> Sorry but I do not understand what you mean by "subject to verifier
>>> rewrite". What kind of rewrite happens when using the ctx as
>>> argument? Furthermore, am I correct that you assume that the uapi
>>> may dictate the structure of the data that can be stored in a map?
>>> How is it different to the case when first storing it on the stack
>>> and then putting it into a map?
>>
>> The UAPI xdp_md struct:
>>
>> struct xdp_md {
>>     __u32 data;
>>     __u32 data_end;
>>     __u32 data_meta;
>>     /* Below access go through struct xdp_rxq_info */
>>     __u32 ingress_ifindex; /* rxq->dev->ifindex */
>>     __u32 rx_queue_index;  /* rxq->queue_index  */
>>
>>     __u32 egress_ifindex;  /* txq->dev->ifindex */
>> };
>>
>> The actual kernel representation of xdp_md:
>>
>> struct xdp_buff {
>>     void *data;
>>     void *data_end;
>>     void *data_meta;
>>     void *data_hard_start;
>>     struct xdp_rxq_info *rxq;
>>     struct xdp_txq_info *txq;
>>     u32 frame_sz; /* frame size to deduce data_hard_end/reserved tailroom */
>>     u32 flags;    /* supported values defined in xdp_buff_flags */
>> };
>>
>> You can see they are quite different. So to use the pointee of 'ctx'
>> as the key, we would need to allocate a space of sizeof(struct xdp_md)
>> on the stack and copy the necessary stuff into that structure. For
>> example, xdp_md->ingress_ifindex = xdp_buff->rxq->dev->ifindex, etc.
>> Some fields actually do not make sense for copying, e.g.,
>> data/data_end/data_meta on a 64-bit architecture (they are __u32 in
>> xdp_md but pointers in xdp_buff). Since stack allocation is needed
>> anyway, disabling ctx and requiring the user to use the stack
>> explicitly makes sense (if they want to use *ctx as the map update
>> value).
>>
>> In your particular example, since you intend to copy xdp_md->data,
>> you can directly access that from xdp_md->data pointer, there is no
>> need to copy ctx which is not what you want.
>>
>>>
>>> On 4/15/24 6:01 PM, Yonghong Song wrote:
>>>>
>>>> On 4/14/24 2:34 PM, Fabian Pfitzner wrote:
>>>>> Hello,
>>>>>
>>>>> is there a specific reason why it is not allowed to copy data from
>>>>> ctx directly into a map via the bpf_map_update_elem helper?
>>>>> I develop an XDP program where I need to store incoming packets
>>>>> (including the whole payload) into a map in order to buffer them.
>>>>> I thought I could simply put them into a map via the mentioned
>>>>> helper function, but the verifier complains about expecting
>>>>> another type as "ctx" (R3 type=ctx expected=fp, pkt, pkt_meta,
>>>>> .....).
>>>>
>>>> Looks like you intend to copy packet data. So from the above,
>>>> 'expected=fp,pkt,pkt_meta...', you can just put the first argument
>>>> with xdp->data, right?
>>>> Verifier rejects 'ctx' since 'ctx' contents are subject to
>>>> verifier rewrite. So actual 'ctx' contents/layouts may not match
>>>> uapi definition.
>>>>
>>>>>
>>>>> I was able to circumvent this error by first putting the packet
>>>>> onto the stack (via xdp->data) and then write it into the map.
>>>>> The only limitation with this is that I cannot store packets
>>>>> larger than 512 bytes due to the maximum stack size.
>>>>>
>>>>> I was also able to circumvent this by slicing chunks, that are
>>>>> smaller than 512 bytes, out of the packet so that I can use the
>>>>> stack as a clipboard before putting them into the map. This is a
>>>>> really ugly solution, but I have not found a better one yet.
>>>>>
>>>>> So my question is: Why does this limitation exist? I am not sure
>>>>> if it's only related to XDP programs as this restriction is defined
>>>>> inside of the bpf_map_update_elem_proto struct (arg3_type
>>>>> restricts this), so I think it is a general limitation that
>>>>> affects all program types.
>>>>>
>>>>> Best regards,
>>>>> Fabian Pfitzner
>>>>>
>>>>>
>>>>>
>>>>>
>>>
>