From: Yunsheng Lin <linyunsheng@huawei.com>
To: Alexander Lobakin <aleksander.lobakin@intel.com>,
"David S. Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>
Cc: Paul Menzel <pmenzel@molgen.mpg.de>,
Jesper Dangaard Brouer <hawk@kernel.org>,
Larysa Zaremba <larysa.zaremba@intel.com>,
netdev@vger.kernel.org, Alexander Duyck <alexanderduyck@fb.com>,
Ilias Apalodimas <ilias.apalodimas@linaro.org>,
linux-kernel@vger.kernel.org,
Michal Kubiak <michal.kubiak@intel.com>,
intel-wired-lan@lists.osuosl.org,
David Christensen <drc@linux.vnet.ibm.com>
Subject: Re: [Intel-wired-lan] [PATCH RFC net-next v4 6/9] iavf: switch to Page Pool
Date: Thu, 6 Jul 2023 20:47:22 +0800
Message-ID: <6b8bc66f-8a02-b6b4-92cc-f8aaf067abd8@huawei.com>
In-Reply-To: <20230705155551.1317583-7-aleksander.lobakin@intel.com>
On 2023/7/5 23:55, Alexander Lobakin wrote:
> Now that the IAVF driver simply uses dev_alloc_page() + free_page() with
> no custom recycling logic, it can easily be switched to using Page
> Pool / libie API instead.
> This allows removing the whole dance around headroom, HW buffer
> size, and page order. All DMA-for-device is now done in the PP core,
> for-CPU -- in the libie helper.
> Use skb_mark_for_recycle() to bring back the recycling and restore the
> performance. Speaking of performance: on par with the baseline and
> faster with the PP optimization series applied. But the memory usage for
> 1500b MTU is now almost 2x lower (x86_64) thanks to allocating a page
> every second descriptor.
>
> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
> ---
...
> @@ -2562,11 +2541,7 @@ static void iavf_init_config_adapter(struct iavf_adapter *adapter)
>
> netdev->netdev_ops = &iavf_netdev_ops;
> iavf_set_ethtool_ops(netdev);
> - netdev->watchdog_timeo = 5 * HZ;
This seems like an unrelated change here?
> -
> - /* MTU range: 68 - 9710 */
> - netdev->min_mtu = ETH_MIN_MTU;
> - netdev->max_mtu = IAVF_MAX_RXBUFFER - IAVF_PACKET_HDR_PAD;
> + netdev->max_mtu = LIBIE_MAX_MTU;
>
...
> /**
> @@ -766,13 +742,19 @@ void iavf_free_rx_resources(struct iavf_ring *rx_ring)
> **/
> int iavf_setup_rx_descriptors(struct iavf_ring *rx_ring)
> {
> - struct device *dev = rx_ring->dev;
> - int bi_size;
> + struct page_pool *pool;
> +
> + pool = libie_rx_page_pool_create(&rx_ring->q_vector->napi,
> + rx_ring->count);
If a page can be split across more than one descriptor, perhaps the
ptr_ring size does not need to be as large as rx_ring->count.
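To make the sizing point concrete, here is a minimal userspace sketch (not
driver code) of the arithmetic: when each page is split into several Rx
buffers, fewer pages than descriptors are needed to back the ring. The
helper name pages_for_ring() and the 4096/2048 figures are assumptions for
illustration only; how libie actually sizes the pool is not shown here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: pages needed to back `ring_count` descriptors
 * when every page is split into buffers of `truesize` bytes.
 */
static size_t pages_for_ring(size_t ring_count, size_t page_size,
			     size_t truesize)
{
	size_t bufs_per_page = page_size / truesize;

	/* A buffer larger than a page is outside the scope of this sketch. */
	assert(bufs_per_page > 0);

	/* Round up so a partially used last page is still counted. */
	return (ring_count + bufs_per_page - 1) / bufs_per_page;
}
```

With 4 KiB pages and 2 KiB buffers, pages_for_ring(512, 4096, 2048) gives
256, i.e. half of a 512-entry ring, which matches the "a page every second
descriptor" figure from the commit message.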
> + if (IS_ERR(pool))
> + return PTR_ERR(pool);
> +
> + rx_ring->pp = pool;
>
> /* warn if we are about to overwrite the pointer */
> WARN_ON(rx_ring->rx_bi);
> - bi_size = sizeof(struct iavf_rx_buffer) * rx_ring->count;
> - rx_ring->rx_bi = kzalloc(bi_size, GFP_KERNEL);
> + rx_ring->rx_bi = kcalloc(rx_ring->count, sizeof(*rx_ring->rx_bi),
> + GFP_KERNEL);
> if (!rx_ring->rx_bi)
> goto err;
>
...
>
> /**
> * iavf_build_skb - Build skb around an existing buffer
> - * @rx_ring: Rx descriptor ring to transact packets on
> * @rx_buffer: Rx buffer to pull data from
> * @size: size of buffer to add to skb
> *
> * This function builds an skb around an existing Rx buffer, taking care
> * to set up the skb correctly and avoid any memcpy overhead.
> */
> -static struct sk_buff *iavf_build_skb(struct iavf_ring *rx_ring,
> - struct iavf_rx_buffer *rx_buffer,
> +static struct sk_buff *iavf_build_skb(const struct libie_rx_buffer *rx_buffer,
> unsigned int size)
> {
> - void *va;
> -#if (PAGE_SIZE < 8192)
> - unsigned int truesize = iavf_rx_pg_size(rx_ring) / 2;
> -#else
> - unsigned int truesize = SKB_DATA_ALIGN(sizeof(struct skb_shared_info)) +
> - SKB_DATA_ALIGN(IAVF_SKB_PAD + size);
> -#endif
> + struct page *page = rx_buffer->page;
> + u32 hr = page->pp->p.offset;
> struct sk_buff *skb;
> + void *va;
>
> - if (!rx_buffer || !size)
> - return NULL;
> /* prefetch first cache line of first page */
> - va = page_address(rx_buffer->page) + rx_buffer->page_offset;
> - net_prefetch(va);
> + va = page_address(page) + rx_buffer->offset;
> + net_prefetch(va + hr);
>
> /* build an skb around the page buffer */
> - skb = napi_build_skb(va - IAVF_SKB_PAD, truesize);
> - if (unlikely(!skb))
> + skb = napi_build_skb(va, rx_buffer->truesize);
> + if (unlikely(!skb)) {
> + page_pool_put_page(page->pp, page, size, true);
Isn't it more correct to call page_pool_put_full_page() here?
We do not know which frag is used for the rx_buffer, and we depend
on the last released frag to do the syncing. Maybe I should mention
that in Documentation/networking/page_pool.rst.
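A minimal userspace model of the difference, assuming the Page Pool core
clamps the requested sync length to the pool's max_len and that
page_pool_put_full_page() behaves like page_pool_put_page() with a sync
size of -1 (my reading of the core code, worth double-checking against the
tree). model_sync_len(), model_full_sync_len() and the numbers below are
hypothetical.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical model: length synced for device when a page is returned,
 * assuming the core clamps the request to the pool's max_len.
 */
static unsigned int model_sync_len(unsigned int dma_sync_size,
				   unsigned int max_len)
{
	assert(max_len > 0);

	return dma_sync_size < max_len ? dma_sync_size : max_len;
}

/* Models a "full page" put: request everything, i.e. (unsigned int)-1. */
static unsigned int model_full_sync_len(unsigned int max_len)
{
	return model_sync_len(UINT_MAX, max_len);
}
```

For a 128-byte frame in a pool with max_len of 4096, model_sync_len(128,
4096) covers only 128 bytes while model_full_sync_len(4096) covers all
4096. If the page is shared by several frags and only the last release
triggers the sync, the short variant could leave the other frags' regions
unsynced.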
> return NULL;
> + }
...
> struct iavf_queue_stats {
> u64 packets;
> u64 bytes;
> @@ -311,16 +243,19 @@ enum iavf_ring_state_t {
> struct iavf_ring {
> struct iavf_ring *next; /* pointer to next ring in q_vector */
> void *desc; /* Descriptor ring memory */
> - struct device *dev; /* Used for DMA mapping */
> + union {
> + struct page_pool *pp; /* Used on Rx for buffer management */
> + struct device *dev; /* Used on Tx for DMA mapping */
> + };
> struct net_device *netdev; /* netdev ring maps to */
> union {
> + struct libie_rx_buffer *rx_bi;
> struct iavf_tx_buffer *tx_bi;
> - struct iavf_rx_buffer *rx_bi;
> };
> DECLARE_BITMAP(state, __IAVF_RING_STATE_NBITS);
> + u8 __iomem *tail;
Is there a reason to move it here?
> u16 queue_index; /* Queue number of ring */
> u8 dcb_tc; /* Traffic class of ring */
> - u8 __iomem *tail;
>