BPF List
From: Yonghong Song <yonghong.song@linux.dev>
To: Fabian Pfitzner <f.pfitzner@tu-braunschweig.de>, bpf@vger.kernel.org
Subject: Re: No direct copy from ctx to map possible, why?
Date: Wed, 17 Apr 2024 17:09:05 -0700
Message-ID: <f102239c-69cc-4ca7-8e21-7efb66bfaceb@linux.dev>
In-Reply-To: <9c019772-8c21-4eb5-908d-103f0966dc13@tu-braunschweig.de>


On 4/17/24 12:38 PM, Fabian Pfitzner wrote:
>> In your particular example, since you intend to copy xdp_md->data, 
>> you can directly
>> access that from xdp_md->data pointer, there is no need to copy ctx 
>> which is not
>> what you want. 
> Thanks for your answer, but I think you misunderstood me. I need to 
> store the packet's payload in a map (not the xdp_md structure itself), 
> because my use case forces me to do so.
>
> I am writing a program that reassembles split packets into a single one. 
> Therefore I have to buffer packet fragments until all of them have 
> arrived. The only way in eBPF to realize such a buffer is a map, so I 
> have to put the packet's payload in there. My problem is that I have 
> no clue how to do it properly, as there is no direct way to put the 
> payload into a map.
>
> How would you put a packet with a size of 700 bytes into a map? What 
> would be your strategy when you can only access your packet via the 
> xdp_md structure? My strategy (and that's the best I have found so 
> far) is to split this packet into two packets of size 350 bytes, so 
> that I can process them on the stack consecutively.

The map value argument can be a packet pointer, as the verifier message you mentioned earlier shows:
   expecting another type as "ctx" (R3 type=ctx expected=fp, pkt, pkt_meta, .....).
But you need to do a packet range check to ensure that the packet range (starting from xdp_md->data) is the same as or greater
than the map value size.
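
To make the range check concrete, here is a minimal user-space model in plain C. It is not BPF code, so it can be compiled without a BPF toolchain. VALUE_SIZE, struct pkt_view and store_fragment() are names made up for illustration and do not come from the kernel or libbpf. In an actual XDP program the memcpy() would be the bpf_map_update_elem() call with the packet pointer as the third argument, and the comparison would usually be written as `data + VALUE_SIZE > data_end` before that call.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed map value size; in a BPF program this would be the map's value size. */
#define VALUE_SIZE 700

/* Hypothetical stand-in for the data/data_end pair obtained from xdp_md. */
struct pkt_view {
	const uint8_t *data;
	const uint8_t *data_end;
};

/*
 * Copy exactly VALUE_SIZE bytes from the start of the packet into slot.
 * Returns 0 on success, -1 if the packet is shorter than VALUE_SIZE.
 */
static int store_fragment(uint8_t slot[VALUE_SIZE], struct pkt_view pkt)
{
	/* Range check: the packet must cover at least VALUE_SIZE bytes. */
	if (pkt.data_end - pkt.data < (ptrdiff_t)VALUE_SIZE)
		return -1;

	/* In XDP this step would be: bpf_map_update_elem(&map, &key, data, BPF_ANY) */
	memcpy(slot, pkt.data, VALUE_SIZE);
	return 0;
}
```

Note that with this approach a packet shorter than the map value size cannot be stored at all, since the check fails and nothing is copied.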


>
> On 4/16/24 5:22 AM, Yonghong Song wrote:
>>
>> On 4/15/24 1:25 PM, Fabian Pfitzner wrote:
>>>> Looks like you intend to copy packet data. So from the above, 
>>>> 'expected=fp,pkt,pkt_meta...', you can just put the first argument
>>>> with xdp->data, right? 
>>> Yes, I intend to copy packet data. What do you mean by "first 
>>> argument"? I'd like to put the whole data that is depicted by 
>>> xdp->data into a map that stores them as raw bytes (by using a char 
>>> array as map element to store the data).
>>
>> Sorry, typo. 'first argument' should be 'third argument'.
>>
>>>
>>>> The verifier rejects 'ctx' since 'ctx' contents are subject to 
>>>> verifier rewrite. So actual 'ctx' contents/layouts may not match 
>>>> the uapi definition. 
>>> Sorry but I do not understand what you mean by "subject to verifier 
>>> rewrite". What kind of rewrite happens when using the ctx as 
>>> argument? Furthermore, am I correct that you assume that the uapi 
>>> may dictate the structure of the data that can be stored in a map? 
>>> How is it different to the case when first storing it on the stack 
>>> and then putting it into a map?
>>
>> The UAPI xdp_md struct:
>>
>> struct xdp_md {
>>         __u32 data;
>>         __u32 data_end;
>>         __u32 data_meta;
>>         /* Below access go through struct xdp_rxq_info */
>>         __u32 ingress_ifindex; /* rxq->dev->ifindex */
>>         __u32 rx_queue_index;  /* rxq->queue_index  */
>>
>>         __u32 egress_ifindex;  /* txq->dev->ifindex */
>> };
>>
>> The actual kernel representation of xdp_md:
>>
>> struct xdp_buff {
>>         void *data;
>>         void *data_end;
>>         void *data_meta;
>>         void *data_hard_start;
>>         struct xdp_rxq_info *rxq;
>>         struct xdp_txq_info *txq;
>>         u32 frame_sz; /* frame size to deduce data_hard_end/reserved tailroom */
>>         u32 flags; /* supported values defined in xdp_buff_flags */
>> };
>>
>> You can see they are quite different. So to use the pointee of 'ctx' as
>> the map value, we would need to allocate sizeof(struct xdp_md) bytes on
>> the stack and copy the necessary fields into that structure. For example,
>> xdp_md->ingress_ifindex = xdp_buff->rxq->dev->ifindex, etc.
>> Some fields actually do not make sense to copy, e.g.,
>> data/data_end/data_meta on 64-bit architectures. Since a stack
>> allocation is needed anyway, disabling ctx and requiring the user to
>> use the stack explicitly makes sense (if they want to use *ctx as the
>> map update value).
>>
>> In your particular example, since you intend to copy xdp_md->data, 
>> you can directly
>> access that from xdp_md->data pointer, there is no need to copy ctx 
>> which is not
>> what you want.
>>
>>>
>>> On 4/15/24 6:01 PM, Yonghong Song wrote:
>>>>
>>>> On 4/14/24 2:34 PM, Fabian Pfitzner wrote:
>>>>> Hello,
>>>>>
>>>>> is there a specific reason why it is not allowed to copy data from 
>>>>> ctx directly into a map via the bpf_map_update_elem helper?
>>>>> I am developing an XDP program where I need to store incoming packets 
>>>>> (including the whole payload) in a map in order to buffer them.
>>>>> I thought I could simply put them into a map via the mentioned 
>>>>> helper function, but the verifier complains about expecting 
>>>>> another type as "ctx" (R3 type=ctx expected=fp, pkt, pkt_meta, 
>>>>> .....).
>>>>
>>>> Looks like you intend to copy packet data. So from the above, 
>>>> 'expected=fp,pkt,pkt_meta...', you can just put the first argument
>>>> with xdp->data, right?
>>>> The verifier rejects 'ctx' since 'ctx' contents are subject to 
>>>> verifier rewrite. So actual 'ctx' contents/layouts may not match 
>>>> the uapi definition.
>>>>
>>>>>
>>>>> I was able to circumvent this error by first putting the packet 
>>>>> onto the stack (via xdp->data) and then write it into the map.
>>>>> The only limitation with this is that I cannot store packets 
>>>>> larger than 512 bytes due to the maximum stack size.
>>>>>
>>>>> I was also able to circumvent this by slicing chunks, that are 
>>>>> smaller than 512 bytes, out of the packet so that I can use the 
>>>>> stack as a clipboard before putting them into the map. This is a 
>>>>> really ugly solution, but I have not found a better one yet.
>>>>>
>>>>> So my question is: Why does this limitation exist? I am not sure 
>>>>> if it is only related to XDP programs, as this restriction is defined 
>>>>> inside the bpf_map_update_elem_proto struct (arg3_type 
>>>>> restricts this), so I think it is a general limitation that 
>>>>> affects all program types.
>>>>>
>>>>> Best regards,
>>>>> Fabian Pfitzner
>>>>>
>>>>>
>>>>>
>>>>>
>>>
>
