From: Alexander Lobakin <aleksander.lobakin@intel.com>
To: Amit Cohen <amcohen@nvidia.com>
Cc: "Ido Schimmel" <idosch@idosch.org>,
"David S. Miller" <davem@davemloft.net>,
"Eric Dumazet" <edumazet@google.com>,
"Jakub Kicinski" <kuba@kernel.org>,
"Paolo Abeni" <pabeni@redhat.com>,
"Toke Høiland-Jørgensen" <toke@redhat.com>,
"Alexei Starovoitov" <ast@kernel.org>,
"Daniel Borkmann" <daniel@iogearbox.net>,
"John Fastabend" <john.fastabend@gmail.com>,
"Andrii Nakryiko" <andrii@kernel.org>,
"Maciej Fijalkowski" <maciej.fijalkowski@intel.com>,
"Stanislav Fomichev" <sdf@fomichev.me>,
"Magnus Karlsson" <magnus.karlsson@intel.com>,
"nex.sw.ncis.osdt.itp.upstreaming@intel.com"
<nex.sw.ncis.osdt.itp.upstreaming@intel.com>,
"bpf@vger.kernel.org" <bpf@vger.kernel.org>,
"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH net-next v5 12/19] xdp: add generic xdp_build_skb_from_buff()
Date: Tue, 19 Nov 2024 13:05:11 +0100 [thread overview]
Message-ID: <b37c2b10-ac0c-48e3-9352-77963a6d7c7b@intel.com> (raw)
In-Reply-To: <LV2PR12MB59435D8F548C8DA2E317DC6FCB262@LV2PR12MB5943.namprd12.prod.outlook.com>
From: Amit Cohen <amcohen@nvidia.com>
Date: Sun, 17 Nov 2024 12:42:11 +0000
>
>
>> -----Original Message-----
>> From: Alexander Lobakin <aleksander.lobakin@intel.com>
>> Sent: Friday, 15 November 2024 16:35
>> To: Ido Schimmel <idosch@idosch.org>
>> Cc: David S. Miller <davem@davemloft.net>; Eric Dumazet <edumazet@google.com>; Jakub Kicinski <kuba@kernel.org>; Paolo Abeni
>> <pabeni@redhat.com>; Toke Høiland-Jørgensen <toke@redhat.com>; Alexei Starovoitov <ast@kernel.org>; Daniel Borkmann
>> <daniel@iogearbox.net>; John Fastabend <john.fastabend@gmail.com>; Andrii Nakryiko <andrii@kernel.org>; Maciej Fijalkowski
>> <maciej.fijalkowski@intel.com>; Stanislav Fomichev <sdf@fomichev.me>; Magnus Karlsson <magnus.karlsson@intel.com>;
>> nex.sw.ncis.osdt.itp.upstreaming@intel.com; bpf@vger.kernel.org; netdev@vger.kernel.org; linux-kernel@vger.kernel.org
>> Subject: Re: [PATCH net-next v5 12/19] xdp: add generic xdp_build_skb_from_buff()
>>
>> From: Ido Schimmel <idosch@idosch.org>
>> Date: Thu, 14 Nov 2024 17:16:44 +0200
>>
>>> On Thu, Nov 14, 2024 at 05:06:06PM +0200, Ido Schimmel wrote:
>>>> Looks good (no objections to the patch), but I have a question. See
>>>> below.
>>>>
>>>> On Wed, Nov 13, 2024 at 04:24:35PM +0100, Alexander Lobakin wrote:
>>>>> The code which builds an skb from an &xdp_buff keeps multiplying itself
>>>>> around the drivers with almost no changes. Let's try to stop that by
>>>>> adding a generic function.
>>>>> Unlike __xdp_build_skb_from_frame(), always allocate an skbuff head
>>>>> using napi_build_skb() and make use of the available xdp_rxq pointer to
>>>>> assign the Rx queue index. In case of PP-backed buffer, mark the skb to
>>>>> be recycled, as every PP user's been switched to recycle skbs.
>>>>>
>>>>> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
>>>>> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
>>>>
>>>> Reviewed-by: Ido Schimmel <idosch@nvidia.com>
>>>>
>>>>> ---
>>>>> include/net/xdp.h | 1 +
>>>>> net/core/xdp.c | 55 +++++++++++++++++++++++++++++++++++++++++++++++
>>>>> 2 files changed, 56 insertions(+)
>>>>>
>>>>> diff --git a/include/net/xdp.h b/include/net/xdp.h
>>>>> index 4c19042adf80..b0a25b7060ff 100644
>>>>> --- a/include/net/xdp.h
>>>>> +++ b/include/net/xdp.h
>>>>> @@ -330,6 +330,7 @@ xdp_update_skb_shared_info(struct sk_buff *skb, u8 nr_frags,
>>>>> void xdp_warn(const char *msg, const char *func, const int line);
>>>>> #define XDP_WARN(msg) xdp_warn(msg, __func__, __LINE__)
>>>>>
>>>>> +struct sk_buff *xdp_build_skb_from_buff(const struct xdp_buff *xdp);
>>>>> struct xdp_frame *xdp_convert_zc_to_xdp_frame(struct xdp_buff *xdp);
>>>>> struct sk_buff *__xdp_build_skb_from_frame(struct xdp_frame *xdpf,
>>>>> struct sk_buff *skb,
>>>>> diff --git a/net/core/xdp.c b/net/core/xdp.c
>>>>> index b1b426a9b146..3a9a3c14b080 100644
>>>>> --- a/net/core/xdp.c
>>>>> +++ b/net/core/xdp.c
>>>>> @@ -624,6 +624,61 @@ int xdp_alloc_skb_bulk(void **skbs, int n_skb, gfp_t gfp)
>>>>> }
>>>>> EXPORT_SYMBOL_GPL(xdp_alloc_skb_bulk);
>>>>>
>>>>> +/**
>>>>> + * xdp_build_skb_from_buff - create an skb from an &xdp_buff
>>>>> + * @xdp: &xdp_buff to convert to an skb
>>>>> + *
>>>>> + * Perform common operations to create a new skb to pass up the stack from
>>>>> + * an &xdp_buff: allocate an skb head from the NAPI percpu cache, initialize
>>>>> + * skb data pointers and offsets, set the recycle bit if the buff is PP-backed,
>>>>> + * Rx queue index, protocol and update frags info.
>>>>> + *
>>>>> + * Return: new &sk_buff on success, %NULL on error.
>>>>> + */
>>>>> +struct sk_buff *xdp_build_skb_from_buff(const struct xdp_buff *xdp)
>>>>> +{
>>>>> + const struct xdp_rxq_info *rxq = xdp->rxq;
>>>>> + const struct skb_shared_info *sinfo;
>>>>> + struct sk_buff *skb;
>>>>> + u32 nr_frags = 0;
>>>>> + int metalen;
>>>>> +
>>>>> + if (unlikely(xdp_buff_has_frags(xdp))) {
>>>>> + sinfo = xdp_get_shared_info_from_buff(xdp);
>>>>> + nr_frags = sinfo->nr_frags;
>>>>> + }
>>>>> +
>>>>> + skb = napi_build_skb(xdp->data_hard_start, xdp->frame_sz);
>>>>> + if (unlikely(!skb))
>>>>> + return NULL;
>>>>> +
>>>>> + skb_reserve(skb, xdp->data - xdp->data_hard_start);
>>>>> + __skb_put(skb, xdp->data_end - xdp->data);
>>>>> +
>>>>> + metalen = xdp->data - xdp->data_meta;
>>>>> + if (metalen > 0)
>>>>> + skb_metadata_set(skb, metalen);
>>>>> +
>>>>> + if (is_page_pool_compiled_in() && rxq->mem.type == MEM_TYPE_PAGE_POOL)
>>>>> + skb_mark_for_recycle(skb);
>>>>> +
>>>>> + skb_record_rx_queue(skb, rxq->queue_index);
>>>>> +
>>>>> + if (unlikely(nr_frags)) {
>>>>> + u32 tsize;
>>>>> +
>>>>> + tsize = sinfo->xdp_frags_truesize ? : nr_frags * xdp->frame_sz;
>>>>> + xdp_update_skb_shared_info(skb, nr_frags,
>>>>> + sinfo->xdp_frags_size, tsize,
>>>>> + xdp_buff_is_frag_pfmemalloc(xdp));
>>>>> + }
>>>>> +
>>>>> + skb->protocol = eth_type_trans(skb, rxq->dev);
>>>>
>>>> The device we are working with has more ports (net devices) than Rx
>>>> queues, so each queue can receive packets from different net devices.
>>>> Currently, each Rx queue has its own NAPI instance and its own page
>>>> pool. All the Rx NAPI instances are initialized using the same dummy net
>>>> device which is allocated using alloc_netdev_dummy().
>>>>
>>>> What are our options with regards to the XDP Rx queue info structure? As
>>>> evident by this patch, it does not seem valid to register one such
>>>> structure per Rx queue and pass the dummy net device. Would it be valid
>>>> to register one such structure per port (net device) and pass zero for
>>>> the queue index and NAPI ID?
>>>
>>> Actually, this does not seem to be valid either as we need to associate
>>> an XDP Rx queue info with the correct page pool :/
>>
>> Right.
>> But I'd say, this assoc slowly becomes redundant. For example, idpf has
>> up to 4 page_pools per queue and I only pass 1 of them to rxq_info as
>> there are no other options. Regardless, its frames get processed
>> correctly thanks to that we have struct page::pp pointer + patch 9 from
>> this series which teaches put_page_bulk() to handle mixed bulks.
>>
>> Regarding your usecase -- after calling this function, you are free to
>> overwrite any skb fields as this helper doesn't pass it up the stack.
>> For example, in ice driver we have port reps and sometimes we need to
>> pass a different net_device, not the one saved in rxq_info. So when
>> switching to this function, we'll do eth_type_trans() once again (it's
>> either way under unlikely() in our code as it's swichdev slowpath).
>> Same for the queue number in rxq_info.
>
> With this series, maintaining 'struct xdp_mem_allocator' in hash-table looks unnecessary.
> If so, xdp_reg_mem_model() does not need 'allocator' when mem_type is Page-Pool.
>
> Is there a reason for not removing 'mem_id_ht'? With this patch, the nodes are no longer used.
Let me review this once again since I need to rebase it anyway.
Maybe we really could drop more code.
Thanks,
Olek
next prev parent reply other threads:[~2024-11-19 12:05 UTC|newest]
Thread overview: 41+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-11-13 15:24 [PATCH net-next v5 00/19] xdp: a fistful of generic changes (+libeth_xdp) Alexander Lobakin
2024-11-13 15:24 ` [PATCH net-next v5 01/19] jump_label: export static_key_slow_{inc,dec}_cpuslocked() Alexander Lobakin
2024-11-13 15:24 ` [PATCH net-next v5 02/19] skbuff: allow 2-4-argument skb_frag_dma_map() Alexander Lobakin
2024-11-13 15:24 ` [PATCH net-next v5 03/19] unroll: add generic loop unroll helpers Alexander Lobakin
2024-11-13 15:24 ` [PATCH net-next v5 04/19] bpf, xdp: constify some bpf_prog * function arguments Alexander Lobakin
2024-11-13 15:24 ` [PATCH net-next v5 05/19] xdp, xsk: constify read-only arguments of some static inline helpers Alexander Lobakin
2024-11-13 15:24 ` [PATCH net-next v5 06/19] xdp: allow attaching already registered memory model to xdp_rxq_info Alexander Lobakin
2024-11-13 15:24 ` [PATCH net-next v5 07/19] xdp: register system page pool as an XDP memory model Alexander Lobakin
2024-11-13 15:24 ` [PATCH net-next v5 08/19] page_pool: make page_pool_put_page_bulk() actually handle array of pages Alexander Lobakin
2024-11-13 15:24 ` [PATCH net-next v5 09/19] page_pool: allow mixing PPs within one bulk Alexander Lobakin
2024-11-13 15:24 ` [PATCH net-next v5 10/19] xdp: get rid of xdp_frame::mem.id Alexander Lobakin
2024-11-13 15:24 ` [PATCH net-next v5 11/19] xdp: add generic xdp_buff_add_frag() Alexander Lobakin
2024-11-14 14:07 ` Ido Schimmel
2024-11-16 2:40 ` Jakub Kicinski
2024-11-19 12:03 ` Alexander Lobakin
2024-11-13 15:24 ` [PATCH net-next v5 12/19] xdp: add generic xdp_build_skb_from_buff() Alexander Lobakin
2024-11-14 15:06 ` Ido Schimmel
2024-11-14 15:16 ` Ido Schimmel
2024-11-15 14:34 ` Alexander Lobakin
2024-11-17 12:42 ` Amit Cohen
2024-11-19 12:05 ` Alexander Lobakin [this message]
2024-11-26 16:38 ` Alexander Lobakin
2024-11-13 15:24 ` [PATCH net-next v5 13/19] xsk: align &xdp_buff_xsk harder Alexander Lobakin
2024-11-13 15:24 ` [PATCH net-next v5 14/19] xsk: allow attaching XSk pool via xdp_rxq_info_reg_mem_model() Alexander Lobakin
2024-11-13 15:24 ` [PATCH net-next v5 15/19] xsk: make xsk_buff_add_frag really add a frag via __xdp_buff_add_frag() Alexander Lobakin
2024-11-13 15:24 ` [PATCH net-next v5 16/19] xsk: add generic XSk &xdp_buff -> skb conversion Alexander Lobakin
2024-11-13 15:24 ` [PATCH net-next v5 17/19] xsk: add helper to get &xdp_desc's DMA and meta pointer in one go Alexander Lobakin
2024-11-13 15:24 ` [PATCH net-next v5 18/19] libeth: support native XDP and register memory model Alexander Lobakin
2024-11-13 15:24 ` [PATCH net-next v5 19/19] libeth: add a couple of XDP helpers (libeth_xdp) Alexander Lobakin
2024-11-16 2:43 ` [PATCH net-next v5 00/19] xdp: a fistful of generic changes (+libeth_xdp) Jakub Kicinski
2024-11-16 15:31 ` Willem de Bruijn
2024-11-19 12:28 ` Alexander Lobakin
2024-11-19 15:14 ` Willem de Bruijn
2024-11-20 15:23 ` Alexander Lobakin
2024-11-21 15:43 ` Willem de Bruijn
2024-11-21 18:02 ` Alexander Lobakin
2024-11-21 18:42 ` Willem de Bruijn
2024-11-19 12:06 ` Alexander Lobakin
2024-11-19 14:25 ` Jakub Kicinski
2024-11-20 14:40 ` Alexander Lobakin
2024-11-21 19:26 ` Jacob Keller
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=b37c2b10-ac0c-48e3-9352-77963a6d7c7b@intel.com \
--to=aleksander.lobakin@intel.com \
--cc=amcohen@nvidia.com \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=idosch@idosch.org \
--cc=john.fastabend@gmail.com \
--cc=kuba@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=maciej.fijalkowski@intel.com \
--cc=magnus.karlsson@intel.com \
--cc=netdev@vger.kernel.org \
--cc=nex.sw.ncis.osdt.itp.upstreaming@intel.com \
--cc=pabeni@redhat.com \
--cc=sdf@fomichev.me \
--cc=toke@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox