From: Yunsheng Lin <yunshenglin0825@gmail.com>
To: Alexander Duyck <alexander.duyck@gmail.com>
Cc: Yunsheng Lin <linyunsheng@huawei.com>,
davem@davemloft.net, kuba@kernel.org, pabeni@redhat.com,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
Lorenzo Bianconi <lorenzo@kernel.org>,
Jesper Dangaard Brouer <hawk@kernel.org>,
Ilias Apalodimas <ilias.apalodimas@linaro.org>,
Eric Dumazet <edumazet@google.com>
Subject: Re: [PATCH net-next v3 3/4] page_pool: introduce page_pool_alloc() API
Date: Sat, 24 Jun 2023 23:39:14 +0800 [thread overview]
Message-ID: <7e7ee6bf-13b6-3194-10df-d8a310778620@gmail.com> (raw)
In-Reply-To: <CAKgT0UeeWhD0_YWHoQe4=vEvKPXdVcFzp5qca2kM3uG7j+U2dg@mail.gmail.com>
On 2023/6/20 23:39, Alexander Duyck wrote:
...
>
>> If I understand it correctly, most hw has a per-queue fixed buffer
>> size. Even mlx5, which supports a per-desc buffer size through
>> mlx5_wqe_data_seg, seems to use the 'per-queue fixed buffer size'
>> model in its driver. I assume using a per-desc buffer size is just
>> not worth the effort?
>
> The problem is the device really has two buffer sizes it is dealing
> with. The wqe size, and the cqe size. What goes in as a 4K page can
> come up as multiple frames depending on the packet sizes being
> received.
Yes, I understand that the buffer associated with a wqe must be large
enough to hold the biggest packet, and that the hw may indicate in the
cqe that only a small portion of that buffer was used when a small
packet is received. The problem is: how much buffer should be
associated with a wqe to allow subdividing within the wqe? With the
biggest packet being 2K, we would need a 4K buffer per wqe, right?
Isn't it wasteful to do that? Not to mention that it exacerbates the
truesize problem for small packets.
And mlx5 does not seem to be using the page_pool_fragment_page() API
in the way you expected.
As I understand it, a mpwqe has multiple strides, and a packet can fit
in one stride or span multiple strides within a mpwqe. A stride seems
to correspond to a frag, and there seems to be no subdividing within a
stride, see mlx5e_handle_rx_cqe_mpwrq():
https://elixir.bootlin.com/linux/v6.4-rc6/source/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c#L2366
...
>
>>>
>>> What I was thinking of was the frag count. That is something the
>>> driver should have the ability to manipulate, be it adding or removing
>>> frags as it takes the section of memory it was given and it decides to
>>> break it up further before handing it out in skb frames.
>>
>> As I understand it, there is no essential difference between frag
>> count and frag offset if we want to do 'subdividing'. Just as we
>> have frag_count for the page pool and _refcount for the page
>> allocator, we may need a third one for this 'subdividing'.
>
> There is a huge difference, and may be part of the reason why you and
> I have such a different understanding of this.
>
> The offset is just local to your fragmentation, whereas the count is
> the global value for the page at which it can finally be freed back to
> the pool. You could have multiple threads all working with different
> offsets as long as they are all bounded within separate regions of the
> page, however they must all agree on the frag count they are working
> with since that is a property specific to the page. This is why
> frag_count must be atomic whereas we keep frag_offset as a local
> variable.
>
> No additional counts needed. We never added another _refcount when we
> were doing splitting in the drivers, and we wouldn't need to in order
> to do splitting with page_pool pages. We would just have to start with
> a frag count of 1.
In that case, we cannot do something like the below with _refcount if
the page pool and the driver share the same frag count, right?
https://elixir.bootlin.com/linux/v6.4-rc6/source/drivers/net/ethernet/intel/iavf/iavf_txrx.c#L1220