From: "Toke Høiland-Jørgensen" <toke@redhat.com>
To: Tariq Toukan <ttoukan.linux@gmail.com>,
Stanislav Fomichev <sdf@google.com>,
Tariq Toukan <ttoukan.linux@gmail.com>
Cc: bpf@vger.kernel.org, ast@kernel.org, daniel@iogearbox.net,
andrii@kernel.org, martin.lau@linux.dev, song@kernel.org,
yhs@fb.com, john.fastabend@gmail.com, kpsingh@kernel.org,
haoluo@google.com, jolsa@kernel.org,
Saeed Mahameed <saeedm@nvidia.com>,
David Ahern <dsahern@gmail.com>, Jakub Kicinski <kuba@kernel.org>,
Willem de Bruijn <willemb@google.com>,
Jesper Dangaard Brouer <brouer@redhat.com>,
Anatoly Burakov <anatoly.burakov@intel.com>,
Alexander Lobakin <alexandr.lobakin@intel.com>,
Magnus Karlsson <magnus.karlsson@gmail.com>,
Maryam Tahhan <mtahhan@redhat.com>,
xdp-hints@xdp-project.net, netdev@vger.kernel.org
Subject: Re: [xdp-hints] Re: [PATCH bpf-next v7 15/17] net/mlx5e: Introduce wrapper for xdp_buff
Date: Fri, 13 Jan 2023 22:31:59 +0100 [thread overview]
Message-ID: <87358ef7e8.fsf@toke.dk> (raw)
In-Reply-To: <d83f2193-3fb9-e30f-cfb0-f1098f039b67@gmail.com>
Tariq Toukan <ttoukan.linux@gmail.com> writes:
> On 12/01/2023 23:55, Toke Høiland-Jørgensen wrote:
>> Toke Høiland-Jørgensen <toke@redhat.com> writes:
>>
>>> Stanislav Fomichev <sdf@google.com> writes:
>>>
>>>> On Thu, Jan 12, 2023 at 12:07 AM Tariq Toukan <ttoukan.linux@gmail.com> wrote:
>>>>>
>>>>>
>>>>>
>>>>> On 12/01/2023 2:32, Stanislav Fomichev wrote:
>>>>>> From: Toke Høiland-Jørgensen <toke@redhat.com>
>>>>>>
>>>>>> Preparation for implementing HW metadata kfuncs. No functional change.
>>>>>>
>>>>>> Cc: Tariq Toukan <tariqt@nvidia.com>
>>>>>> Cc: Saeed Mahameed <saeedm@nvidia.com>
>>>>>> Cc: John Fastabend <john.fastabend@gmail.com>
>>>>>> Cc: David Ahern <dsahern@gmail.com>
>>>>>> Cc: Martin KaFai Lau <martin.lau@linux.dev>
>>>>>> Cc: Jakub Kicinski <kuba@kernel.org>
>>>>>> Cc: Willem de Bruijn <willemb@google.com>
>>>>>> Cc: Jesper Dangaard Brouer <brouer@redhat.com>
>>>>>> Cc: Anatoly Burakov <anatoly.burakov@intel.com>
>>>>>> Cc: Alexander Lobakin <alexandr.lobakin@intel.com>
>>>>>> Cc: Magnus Karlsson <magnus.karlsson@gmail.com>
>>>>>> Cc: Maryam Tahhan <mtahhan@redhat.com>
>>>>>> Cc: xdp-hints@xdp-project.net
>>>>>> Cc: netdev@vger.kernel.org
>>>>>> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
>>>>>> Signed-off-by: Stanislav Fomichev <sdf@google.com>
>>>>>> ---
>>>>>> drivers/net/ethernet/mellanox/mlx5/core/en.h | 1 +
>>>>>> .../net/ethernet/mellanox/mlx5/core/en/xdp.c | 3 +-
>>>>>> .../net/ethernet/mellanox/mlx5/core/en/xdp.h | 6 +-
>>>>>> .../ethernet/mellanox/mlx5/core/en/xsk/rx.c | 25 ++++----
>>>>>> .../net/ethernet/mellanox/mlx5/core/en_rx.c | 58 +++++++++----------
>>>>>> 5 files changed, 50 insertions(+), 43 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
>>>>>> index 2d77fb8a8a01..af663978d1b4 100644
>>>>>> --- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
>>>>>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
>>>>>> @@ -469,6 +469,7 @@ struct mlx5e_txqsq {
>>>>>> union mlx5e_alloc_unit {
>>>>>> struct page *page;
>>>>>> struct xdp_buff *xsk;
>>>>>> + struct mlx5e_xdp_buff *mxbuf;
>>>>>
>>>>> In XSK files below you mix usage of both alloc_units[page_idx].mxbuf and
>>>>> alloc_units[page_idx].xsk, while both fields share the memory of a union.
>>>>>
>>>>> As struct mlx5e_xdp_buff wraps struct xdp_buff, I think that you just
>>>>> need to change the existing xsk field type from struct xdp_buff *xsk
>>>>> into struct mlx5e_xdp_buff *xsk and align the usage.
>>>>
>>>> Hmmm, good point. I'm actually not sure how it works currently.
>>>> mlx5e_alloc_unit.mxbuf doesn't seem to be initialized anywhere? Toke,
>>>> am I missing something?
>>>
>>> It's initialised piecemeal in different places; but yeah, we're mixing
>>> things a bit...
>>>
>>>> I'm thinking about something like this:
>>>>
>>>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h
>>>> b/drivers/net/ethernet/mellanox/mlx5/core/en.h
>>>> index af663978d1b4..2d77fb8a8a01 100644
>>>> --- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
>>>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
>>>> @@ -469,7 +469,6 @@ struct mlx5e_txqsq {
>>>> union mlx5e_alloc_unit {
>>>> struct page *page;
>>>> struct xdp_buff *xsk;
>>>> - struct mlx5e_xdp_buff *mxbuf;
>>>> };
>>>
>>> Hmm, for consistency with the non-XSK path we should rather go the other
>>> direction and lose the xsk member, moving everything to mxbuf? Let me
>>> give that a shot...
>>
>> Something like the below?
>>
>> -Toke
>>
>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
>> index 6de02d8aeab8..cb9cdb6421c5 100644
>> --- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
>> @@ -468,7 +468,6 @@ struct mlx5e_txqsq {
>>
>> union mlx5e_alloc_unit {
>> struct page *page;
>> - struct xdp_buff *xsk;
>> struct mlx5e_xdp_buff *mxbuf;
>> };
>>
>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h
>> index cb568c62aba0..95694a25ec31 100644
>> --- a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h
>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h
>> @@ -33,6 +33,7 @@
>> #define __MLX5_EN_XDP_H__
>>
>> #include <linux/indirect_call_wrapper.h>
>> +#include <net/xdp_sock_drv.h>
>>
>> #include "en.h"
>> #include "en/txrx.h"
>> @@ -112,6 +113,21 @@ static inline void mlx5e_xmit_xdp_doorbell(struct mlx5e_xdpsq *sq)
>> }
>> }
>>
>> +static inline struct mlx5e_xdp_buff *mlx5e_xsk_buff_alloc(struct xsk_buff_pool *pool)
>> +{
>> + return (struct mlx5e_xdp_buff *)xsk_buff_alloc(pool);
>
> What about the space needed for the rq / cqe fields? xsk_buff_alloc
> won't allocate it.
It will! See patch 14 in the series that adds a 'cb' field to
xdp_buff_xsk, meaning there's actually space after the xdp_buff struct
being allocated by the xsk_buff_alloc API. The XSK_CHECK_PRIV_TYPE macro
call is there to ensure the cb field is big enough for the struct we're
casting to in the driver.
>> +}
>> +
>> +static inline void mlx5e_xsk_buff_free(struct mlx5e_xdp_buff *mxbuf)
>> +{
>> + xsk_buff_free(&mxbuf->xdp);
>> +}
>> +
>> +static inline dma_addr_t mlx5e_xsk_buff_xdp_get_frame_dma(struct mlx5e_xdp_buff *mxbuf)
>> +{
>> + return xsk_buff_xdp_get_frame_dma(&mxbuf->xdp);
>> +}
>> +
>> /* Enable inline WQEs to shift some load from a congested HCA (HW) to
>> * a less congested cpu (SW).
>> */
>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c
>> index 8bf3029abd3c..1f166dbb7f22 100644
>> --- a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c
>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c
>> @@ -3,7 +3,6 @@
>>
>> #include "rx.h"
>> #include "en/xdp.h"
>> -#include <net/xdp_sock_drv.h>
>> #include <linux/filter.h>
>>
>> /* RX data path */
>> @@ -21,7 +20,7 @@ int mlx5e_xsk_alloc_rx_mpwqe(struct mlx5e_rq *rq, u16 ix)
>> if (unlikely(!xsk_buff_can_alloc(rq->xsk_pool, rq->mpwqe.pages_per_wqe)))
>> goto err;
>>
>> - BUILD_BUG_ON(sizeof(wi->alloc_units[0]) != sizeof(wi->alloc_units[0].xsk));
>> + BUILD_BUG_ON(sizeof(wi->alloc_units[0]) != sizeof(wi->alloc_units[0].mxbuf));
>> XSK_CHECK_PRIV_TYPE(struct mlx5e_xdp_buff);
>> batch = xsk_buff_alloc_batch(rq->xsk_pool, (struct xdp_buff **)wi->alloc_units,
>> rq->mpwqe.pages_per_wqe);
>
> This batching API gets broken as well...
> xsk_buff_alloc_batch fills an array of struct xdp_buff pointers, it
> cannot correctly act on the array of struct mlx5e_xdp_buff, as it
> contains additional fields.
See above for why this does, in fact, work. I agree it's not totally
obvious, and in any case there's going to be a point where the cast
happens where type safety will break, which is what I was alluding to in
my reply to Stanislav.
I guess we could try to rework the API in xdp_sock_drv.h to make this
more obvious instead of using the casting driver-specific wrappers I
suggested here. Or we could go with Stanislav's suggestion of keeping
allocation etc using xdp_buff and only casting to mlx5e_xdp_buff in the
function where it's used; then we can keep the casting localised to that
function, and put a comment there explaining why it works?
> Maybe letting mlx5e_xdp_buff point to its struct xdp_buff (instead of
> wrapping it) will solve the problems here, then we'll loop over the
> xdp_buff * array and copy the pointers into the struct mlx5e_xdp_buff *
> array.
> Need to give it deeper thoughts...
>
> struct mlx5e_xdp_buff {
> struct xdp_buff *xdp;
> struct mlx5_cqe64 *cqe;
> struct mlx5e_rq *rq;
> };
This was actually my original proposal; we discussed this back on v2 of
this patch series. People generally felt that the 'cb' field approach
(originally suggested by Jakub) was better. See the discussion starting
from here:
https://lore.kernel.org/r/20221123111431.7b54668e@kernel.org
-Toke
next prev parent reply other threads:[~2023-01-13 21:33 UTC|newest]
Thread overview: 43+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-01-12 0:32 [PATCH bpf-next v7 00/17] xdp: hints via kfuncs Stanislav Fomichev
2023-01-12 0:32 ` [PATCH bpf-next v7 01/17] bpf: Document XDP RX metadata Stanislav Fomichev
2023-01-16 13:09 ` Jesper Dangaard Brouer
2023-01-17 20:33 ` Stanislav Fomichev
2023-01-18 14:28 ` Jesper Dangaard Brouer
2023-01-18 17:55 ` Stanislav Fomichev
2023-01-12 0:32 ` [PATCH bpf-next v7 02/17] bpf: Rename bpf_{prog,map}_is_dev_bound to is_offloaded Stanislav Fomichev
2023-01-12 0:32 ` [PATCH bpf-next v7 03/17] bpf: Move offload initialization into late_initcall Stanislav Fomichev
2023-01-12 0:32 ` [PATCH bpf-next v7 04/17] bpf: Reshuffle some parts of bpf/offload.c Stanislav Fomichev
2023-01-12 0:32 ` [PATCH bpf-next v7 05/17] bpf: Introduce device-bound XDP programs Stanislav Fomichev
2023-01-12 0:32 ` [PATCH bpf-next v7 06/17] selftests/bpf: Update expected test_offload.py messages Stanislav Fomichev
2023-01-12 0:32 ` [PATCH bpf-next v7 07/17] bpf: XDP metadata RX kfuncs Stanislav Fomichev
2023-01-12 0:32 ` [PATCH bpf-next v7 08/17] bpf: Support consuming XDP HW metadata from fext programs Stanislav Fomichev
2023-01-12 0:32 ` [PATCH bpf-next v7 09/17] veth: Introduce veth_xdp_buff wrapper for xdp_buff Stanislav Fomichev
2023-01-12 0:32 ` [PATCH bpf-next v7 10/17] veth: Support RX XDP metadata Stanislav Fomichev
2023-01-16 16:21 ` Jesper Dangaard Brouer
2023-01-17 20:33 ` Stanislav Fomichev
2023-01-18 15:57 ` Jesper Dangaard Brouer
2023-01-12 0:32 ` [PATCH bpf-next v7 11/17] selftests/bpf: Verify xdp_metadata xdp->af_xdp path Stanislav Fomichev
2023-01-12 0:32 ` [PATCH bpf-next v7 12/17] net/mlx4_en: Introduce wrapper for xdp_buff Stanislav Fomichev
2023-01-12 0:32 ` [PATCH bpf-next v7 13/17] net/mlx4_en: Support RX XDP metadata Stanislav Fomichev
2023-01-12 0:32 ` [PATCH bpf-next v7 14/17] xsk: Add cb area to struct xdp_buff_xsk Stanislav Fomichev
2023-01-12 0:32 ` [PATCH bpf-next v7 15/17] net/mlx5e: Introduce wrapper for xdp_buff Stanislav Fomichev
2023-01-12 8:07 ` Tariq Toukan
2023-01-12 19:10 ` Stanislav Fomichev
2023-01-12 21:09 ` [xdp-hints] " Toke Høiland-Jørgensen
2023-01-12 21:55 ` Toke Høiland-Jørgensen
2023-01-12 22:18 ` Stanislav Fomichev
2023-01-12 22:29 ` Toke Høiland-Jørgensen
2023-01-13 20:55 ` Tariq Toukan
2023-01-13 20:53 ` Tariq Toukan
2023-01-13 21:31 ` Toke Høiland-Jørgensen [this message]
2023-01-15 6:59 ` Tariq Toukan
2023-01-15 11:13 ` Toke Høiland-Jørgensen
2023-01-12 0:32 ` [PATCH bpf-next v7 16/17] net/mlx5e: Support RX XDP metadata Stanislav Fomichev
2023-01-12 8:13 ` Tariq Toukan
2023-01-12 19:09 ` Stanislav Fomichev
2023-01-13 20:25 ` Tariq Toukan
2023-01-12 0:32 ` [PATCH bpf-next v7 17/17] selftests/bpf: Simple program to dump XDP RX metadata Stanislav Fomichev
2023-01-12 7:29 ` [PATCH bpf-next v7 00/17] xdp: hints via kfuncs Martin KaFai Lau
2023-01-12 8:19 ` Tariq Toukan
2023-01-12 18:09 ` Stanislav Fomichev
2023-01-12 18:20 ` Martin KaFai Lau
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87358ef7e8.fsf@toke.dk \
--to=toke@redhat.com \
--cc=alexandr.lobakin@intel.com \
--cc=anatoly.burakov@intel.com \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=brouer@redhat.com \
--cc=daniel@iogearbox.net \
--cc=dsahern@gmail.com \
--cc=haoluo@google.com \
--cc=john.fastabend@gmail.com \
--cc=jolsa@kernel.org \
--cc=kpsingh@kernel.org \
--cc=kuba@kernel.org \
--cc=magnus.karlsson@gmail.com \
--cc=martin.lau@linux.dev \
--cc=mtahhan@redhat.com \
--cc=netdev@vger.kernel.org \
--cc=saeedm@nvidia.com \
--cc=sdf@google.com \
--cc=song@kernel.org \
--cc=ttoukan.linux@gmail.com \
--cc=willemb@google.com \
--cc=xdp-hints@xdp-project.net \
--cc=yhs@fb.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.