BPF List
* [PATCH net-next v11 0/6] introduce page_pool_alloc() related API
@ 2023-10-13  6:48 Yunsheng Lin
  2023-10-13  6:48 ` [PATCH net-next v11 5/6] page_pool: update document about fragment API Yunsheng Lin
                   ` (3 more replies)
  0 siblings, 4 replies; 12+ messages in thread
From: Yunsheng Lin @ 2023-10-13  6:48 UTC (permalink / raw)
  To: davem, kuba, pabeni
  Cc: netdev, linux-kernel, Yunsheng Lin, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Matthias Brugger, AngeloGioacchino Del Regno, bpf,
	linux-arm-kernel, linux-mediatek

In [1] & [2] & [3], there are use cases for veth and virtio_net
to use frag support in page pool to reduce memory usage, and they
may request different frag sizes depending on the head/tail
room space for xdp_frame/shinfo and the mtu/packet size. When the
requested frag size is large enough that a single page can not
be split into more than one frag, using frag support only has a
performance penalty because of the extra frag count handling
for frag support.

So this patchset provides a page pool API for the driver to
allocate memory with the least memory utilization and performance
penalty when it doesn't know the size of the memory it needs
beforehand.

1. https://patchwork.kernel.org/project/netdevbpf/patch/d3ae6bd3537fbce379382ac6a42f67e22f27ece2.1683896626.git.lorenzo@kernel.org/
2. https://patchwork.kernel.org/project/netdevbpf/patch/20230526054621.18371-3-liangchen.linux@gmail.com/
3. https://github.com/alobakin/linux/tree/iavf-pp-frag

V11: Repost based on the latest net-next branch and collect
     Tested-by Tag from Alexander.

V10: Use fragment instead of frag in English docs.
     Remove PP_FLAG_PAGE_FRAG usage in idpf driver.

V9: Update some performance info in patch 2.

V8: Store the dma addr on a shifted u32 instead of using
    dma_addr_t explicitly for 32-bit arch with 64-bit DMA.
    Update document according to discussion in v7.

V7: Fix a compile error and a few typos, and use kernel-doc syntax.

V6: Add a PP_FLAG_PAGE_SPLIT_IN_DRIVER flag to fail the page_pool
    creation for 32-bit arch with 64-bit DMA when the driver tries
    to do the page splitting itself, adjust the requested size to
    include head/tail room in veth, and rebase on the latest
    net-next.

v5 RFC: Add a new page_pool_cache_alloc() API, and other minor
        changes as discussed in v4. As there seem to be three
        consumers that might make use of the new API, repost
        it as RFC and CC the relevant authors to see if the
        new API fits their need.

V4: Fix a typo and add a patch to update the document about the
    frag API; PAGE_POOL_DMA_USE_PP_FRAG_COUNT is not renamed yet
    as we may need a different thread to discuss that.

V3: Incorporate changes from the discussion with Alexander,
    mostly the inline wrapper, the PAGE_POOL_DMA_USE_PP_FRAG_COUNT
    change split into a separate patch, and a comment change.
V2: Add a patch to remove the PP_FLAG_PAGE_FRAG flag and mention
    the virtio_net use case in the cover letter.
V1: Drop RFC tag and page_pool_frag patch.

Yunsheng Lin (6):
  page_pool: fragment API support for 32-bit arch with 64-bit DMA
  page_pool: unify frag_count handling in page_pool_is_last_frag()
  page_pool: remove PP_FLAG_PAGE_FRAG
  page_pool: introduce page_pool[_cache]_alloc() API
  page_pool: update document about fragment API
  net: veth: use newly added page pool API for veth with xdp

 Documentation/networking/page_pool.rst        |   4 +-
 drivers/net/ethernet/broadcom/bnxt/bnxt.c     |   2 -
 .../net/ethernet/hisilicon/hns3/hns3_enet.c   |   3 +-
 drivers/net/ethernet/intel/idpf/idpf_txrx.c   |   3 -
 .../marvell/octeontx2/nic/otx2_common.c       |   2 +-
 .../net/ethernet/mellanox/mlx5/core/en_main.c |   2 +-
 drivers/net/veth.c                            |  25 +-
 drivers/net/wireless/mediatek/mt76/mac80211.c |   2 +-
 include/linux/mm_types.h                      |  13 +-
 include/net/page_pool/helpers.h               | 231 +++++++++++++++---
 include/net/page_pool/types.h                 |   6 +-
 net/core/page_pool.c                          |  31 ++-
 net/core/skbuff.c                             |   2 +-
 13 files changed, 243 insertions(+), 83 deletions(-)

-- 
2.33.0


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH net-next v11 5/6] page_pool: update document about fragment API
  2023-10-13  6:48 [PATCH net-next v11 0/6] introduce page_pool_alloc() related API Yunsheng Lin
@ 2023-10-13  6:48 ` Yunsheng Lin
  2023-10-13  6:48 ` [PATCH net-next v11 6/6] net: veth: use newly added page pool API for veth with xdp Yunsheng Lin
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 12+ messages in thread
From: Yunsheng Lin @ 2023-10-13  6:48 UTC (permalink / raw)
  To: davem, kuba, pabeni
  Cc: netdev, linux-kernel, Yunsheng Lin, Lorenzo Bianconi,
	Alexander Duyck, Liang Chen, Alexander Lobakin, Dima Tisnek,
	Jesper Dangaard Brouer, Ilias Apalodimas, Eric Dumazet,
	Jonathan Corbet, Alexei Starovoitov, Daniel Borkmann,
	John Fastabend, linux-doc, bpf

As more drivers begin to use the fragment API, update the
document to help driver authors decide which API to use.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
CC: Lorenzo Bianconi <lorenzo@kernel.org>
CC: Alexander Duyck <alexander.duyck@gmail.com>
CC: Liang Chen <liangchen.linux@gmail.com>
CC: Alexander Lobakin <aleksander.lobakin@intel.com>
CC: Dima Tisnek <dimaqq@gmail.com>
---
 Documentation/networking/page_pool.rst |  4 +-
 include/net/page_pool/helpers.h        | 93 ++++++++++++++++++++++----
 2 files changed, 82 insertions(+), 15 deletions(-)

diff --git a/Documentation/networking/page_pool.rst b/Documentation/networking/page_pool.rst
index 215ebc92752c..0c0705994f51 100644
--- a/Documentation/networking/page_pool.rst
+++ b/Documentation/networking/page_pool.rst
@@ -58,7 +58,9 @@ a page will cause no race conditions is enough.
 
 .. kernel-doc:: include/net/page_pool/helpers.h
    :identifiers: page_pool_put_page page_pool_put_full_page
-		 page_pool_recycle_direct page_pool_dev_alloc_pages
+		 page_pool_recycle_direct page_pool_cache_free
+		 page_pool_dev_alloc_pages page_pool_dev_alloc_frag
+		 page_pool_dev_alloc page_pool_dev_cache_alloc
 		 page_pool_get_dma_addr page_pool_get_dma_dir
 
 .. kernel-doc:: net/core/page_pool.c
diff --git a/include/net/page_pool/helpers.h b/include/net/page_pool/helpers.h
index 674f480d9f2e..7550beeacf3d 100644
--- a/include/net/page_pool/helpers.h
+++ b/include/net/page_pool/helpers.h
@@ -8,23 +8,46 @@
 /**
  * DOC: page_pool allocator
  *
- * The page_pool allocator is optimized for the XDP mode that
- * uses one frame per-page, but it can fallback on the
- * regular page allocator APIs.
+ * The page_pool allocator is optimized for recycling pages or page fragments
+ * used by skb packets and xdp frames.
  *
- * Basic use involves replacing alloc_pages() calls with the
- * page_pool_alloc_pages() call.  Drivers should use
- * page_pool_dev_alloc_pages() replacing dev_alloc_pages().
+ * Basic use involves replacing napi_alloc_frag() and alloc_pages() calls with
+ * page_pool_cache_alloc() and page_pool_alloc(), which allocate memory with or
+ * without page splitting depending on the requested memory size.
  *
- * The API keeps track of in-flight pages, in order to let API users know
- * when it is safe to free a page_pool object.  Thus, API users
- * must call page_pool_put_page() to free the page, or attach
- * the page to a page_pool-aware object like skbs marked with
- * skb_mark_for_recycle().
+ * If the driver knows that it always requires full pages or its allocations are
+ * always smaller than half a page, it can use one of the more specific API
+ * calls:
  *
- * API users must call page_pool_put_page() once on a page, as it
- * will either recycle the page, or in case of refcnt > 1, it will
- * release the DMA mapping and in-flight state accounting.
+ * 1. page_pool_alloc_pages(): allocate memory without page splitting when the
+ * driver knows that the memory it needs is always bigger than half of the page
+ * allocated from the page pool. There is no cache line dirtying for
+ * 'struct page' when a page is recycled back to the page pool.
+ *
+ * 2. page_pool_alloc_frag(): allocate memory with page splitting when the
+ * driver knows that the memory it needs is always smaller than or equal to half
+ * of the page allocated from the page pool. Page splitting enables memory
+ * saving and thus avoids TLB/cache misses for data access, but there is also
+ * some cost to implement page splitting, mainly some cache line
+ * dirtying/bouncing for 'struct page' and atomic operations for page->pp_frag_count.
+ *
+ * The API keeps track of in-flight pages in order to let API users know when
+ * it is safe to free a page_pool object. Thus, API users must call
+ * page_pool_put_page() or page_pool_cache_free() to free the pp page or the pp
+ * buffer, or attach the pp page or the pp buffer to a page_pool-aware object
+ * like skbs marked with skb_mark_for_recycle().
+ *
+ * page_pool_put_page() may be called multiple times on the same page if a page
+ * is split into multiple fragments. For the last fragment, it will either
+ * recycle the page, or in case of page->_refcount > 1, it will release the
+ * DMA mapping and in-flight state accounting.
+ *
+ * dma_sync_single_range_for_device() is only called for the last fragment when
+ * the page_pool is created with the PP_FLAG_DMA_SYNC_DEV flag, so it relies on
+ * the last freed fragment to do the sync_for_device operation for all fragments
+ * in the same page when a page is split. The API user must set up
+ * pool->p.max_len and pool->p.offset correctly and ensure that
+ * page_pool_put_page() is called with dma_sync_size being -1 for the fragment API.
  */
 #ifndef _NET_PAGE_POOL_HELPERS_H
 #define _NET_PAGE_POOL_HELPERS_H
@@ -73,6 +96,17 @@ static inline struct page *page_pool_dev_alloc_pages(struct page_pool *pool)
 	return page_pool_alloc_pages(pool, gfp);
 }
 
+/**
+ * page_pool_dev_alloc_frag() - allocate a page fragment.
+ * @pool: pool from which to allocate
+ * @offset: offset to the allocated page
+ * @size: requested size
+ *
+ * Get a page fragment from the page allocator or page_pool caches.
+ *
+ * Return:
+ * Return the allocated page fragment, otherwise return NULL.
+ */
 static inline struct page *page_pool_dev_alloc_frag(struct page_pool *pool,
 						    unsigned int *offset,
 						    unsigned int size)
@@ -111,6 +145,19 @@ static inline struct page *page_pool_alloc(struct page_pool *pool,
 	return page;
 }
 
+/**
+ * page_pool_dev_alloc() - allocate a page or a page fragment.
+ * @pool: pool from which to allocate
+ * @offset: offset to the allocated page
+ * @size: in as the requested size, out as the allocated size
+ *
+ * Get a page or a page fragment from the page allocator or page_pool caches
+ * depending on the requested size in order to allocate memory with least memory
+ * utilization and performance penalty.
+ *
+ * Return:
+ * Return the allocated page or page fragment, otherwise return NULL.
+ */
 static inline struct page *page_pool_dev_alloc(struct page_pool *pool,
 					       unsigned int *offset,
 					       unsigned int *size)
@@ -133,6 +180,16 @@ static inline void *page_pool_cache_alloc(struct page_pool *pool,
 	return page_address(page) + offset;
 }
 
+/**
+ * page_pool_dev_cache_alloc() - allocate a cache.
+ * @pool: pool from which to allocate
+ * @size: in as the requested size, out as the allocated size
+ *
+ * Get a cache from the page allocator or page_pool caches.
+ *
+ * Return:
+ * Return the addr of the allocated cache, otherwise return NULL.
+ */
 static inline void *page_pool_dev_cache_alloc(struct page_pool *pool,
 					      unsigned int *size)
 {
@@ -280,6 +337,14 @@ static inline void page_pool_recycle_direct(struct page_pool *pool,
 #define PAGE_POOL_32BIT_ARCH_WITH_64BIT_DMA	\
 		(sizeof(dma_addr_t) > sizeof(unsigned long))
 
+/**
+ * page_pool_cache_free() - free a cache into the page_pool
+ * @pool: pool from which cache was allocated
+ * @data: addr of the cache to be freed
+ * @allow_direct: freed by the consumer, allow lockless caching
+ *
+ * Free a cache allocated from page_pool_dev_cache_alloc().
+ */
 static inline void page_pool_cache_free(struct page_pool *pool, void *data,
 					bool allow_direct)
 {
-- 
2.33.0



* [PATCH net-next v11 6/6] net: veth: use newly added page pool API for veth with xdp
  2023-10-13  6:48 [PATCH net-next v11 0/6] introduce page_pool_alloc() related API Yunsheng Lin
  2023-10-13  6:48 ` [PATCH net-next v11 5/6] page_pool: update document about fragment API Yunsheng Lin
@ 2023-10-13  6:48 ` Yunsheng Lin
  2023-10-17  1:27 ` [PATCH net-next v11 0/6] introduce page_pool_alloc() related API Jakub Kicinski
  2023-10-17  4:10 ` patchwork-bot+netdevbpf
  3 siblings, 0 replies; 12+ messages in thread
From: Yunsheng Lin @ 2023-10-13  6:48 UTC (permalink / raw)
  To: davem, kuba, pabeni
  Cc: netdev, linux-kernel, Yunsheng Lin, Lorenzo Bianconi,
	Alexander Duyck, Liang Chen, Alexander Lobakin, Eric Dumazet,
	Alexei Starovoitov, Daniel Borkmann, Jesper Dangaard Brouer,
	John Fastabend, bpf

Use the page_pool[_cache]_alloc() API to allocate memory with
the least memory utilization and performance penalty.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
CC: Lorenzo Bianconi <lorenzo@kernel.org>
CC: Alexander Duyck <alexander.duyck@gmail.com>
CC: Liang Chen <liangchen.linux@gmail.com>
CC: Alexander Lobakin <aleksander.lobakin@intel.com>
---
 drivers/net/veth.c | 25 ++++++++++++++++---------
 1 file changed, 16 insertions(+), 9 deletions(-)

diff --git a/drivers/net/veth.c b/drivers/net/veth.c
index 0deefd1573cf..470791b0b533 100644
--- a/drivers/net/veth.c
+++ b/drivers/net/veth.c
@@ -737,10 +737,11 @@ static int veth_convert_skb_to_xdp_buff(struct veth_rq *rq,
 	if (skb_shared(skb) || skb_head_is_locked(skb) ||
 	    skb_shinfo(skb)->nr_frags ||
 	    skb_headroom(skb) < XDP_PACKET_HEADROOM) {
-		u32 size, len, max_head_size, off;
+		u32 size, len, max_head_size, off, truesize, page_offset;
 		struct sk_buff *nskb;
 		struct page *page;
 		int i, head_off;
+		void *data;
 
 		/* We need a private copy of the skb and data buffers since
 		 * the ebpf program can modify it. We segment the original skb
@@ -753,14 +754,17 @@ static int veth_convert_skb_to_xdp_buff(struct veth_rq *rq,
 		if (skb->len > PAGE_SIZE * MAX_SKB_FRAGS + max_head_size)
 			goto drop;
 
+		size = min_t(u32, skb->len, max_head_size);
+		truesize = SKB_HEAD_ALIGN(size) + VETH_XDP_HEADROOM;
+
 		/* Allocate skb head */
-		page = page_pool_dev_alloc_pages(rq->page_pool);
-		if (!page)
+		data = page_pool_dev_cache_alloc(rq->page_pool, &truesize);
+		if (!data)
 			goto drop;
 
-		nskb = napi_build_skb(page_address(page), PAGE_SIZE);
+		nskb = napi_build_skb(data, truesize);
 		if (!nskb) {
-			page_pool_put_full_page(rq->page_pool, page, true);
+			page_pool_cache_free(rq->page_pool, data, true);
 			goto drop;
 		}
 
@@ -768,7 +772,6 @@ static int veth_convert_skb_to_xdp_buff(struct veth_rq *rq,
 		skb_copy_header(nskb, skb);
 		skb_mark_for_recycle(nskb);
 
-		size = min_t(u32, skb->len, max_head_size);
 		if (skb_copy_bits(skb, 0, nskb->data, size)) {
 			consume_skb(nskb);
 			goto drop;
@@ -783,14 +786,18 @@ static int veth_convert_skb_to_xdp_buff(struct veth_rq *rq,
 		len = skb->len - off;
 
 		for (i = 0; i < MAX_SKB_FRAGS && off < skb->len; i++) {
-			page = page_pool_dev_alloc_pages(rq->page_pool);
+			size = min_t(u32, len, PAGE_SIZE);
+			truesize = size;
+
+			page = page_pool_dev_alloc(rq->page_pool, &page_offset,
+						   &truesize);
 			if (!page) {
 				consume_skb(nskb);
 				goto drop;
 			}
 
-			size = min_t(u32, len, PAGE_SIZE);
-			skb_add_rx_frag(nskb, i, page, 0, size, PAGE_SIZE);
+			skb_add_rx_frag(nskb, i, page, page_offset, size,
+					truesize);
 			if (skb_copy_bits(skb, off, page_address(page),
 					  size)) {
 				consume_skb(nskb);
-- 
2.33.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH net-next v11 0/6] introduce page_pool_alloc() related API
  2023-10-13  6:48 [PATCH net-next v11 0/6] introduce page_pool_alloc() related API Yunsheng Lin
  2023-10-13  6:48 ` [PATCH net-next v11 5/6] page_pool: update document about fragment API Yunsheng Lin
  2023-10-13  6:48 ` [PATCH net-next v11 6/6] net: veth: use newly added page pool API for veth with xdp Yunsheng Lin
@ 2023-10-17  1:27 ` Jakub Kicinski
  2023-10-17  7:56   ` Yunsheng Lin
  2023-10-17  4:10 ` patchwork-bot+netdevbpf
  3 siblings, 1 reply; 12+ messages in thread
From: Jakub Kicinski @ 2023-10-17  1:27 UTC (permalink / raw)
  To: Yunsheng Lin
  Cc: davem, pabeni, netdev, linux-kernel, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Matthias Brugger, AngeloGioacchino Del Regno, bpf,
	linux-arm-kernel, linux-mediatek

On Fri, 13 Oct 2023 14:48:20 +0800 Yunsheng Lin wrote:
> v5 RFC: Add a new page_pool_cache_alloc() API, and other minor
>         changes as discussed in v4. As there seem to be three
>         consumers that might make use of the new API, repost
>         it as RFC and CC the relevant authors to see if the
>         new API fits their need.

I have looked thru the v4 discussion (admittedly it was pretty huge).
I can't find where the "cache" API was suggested.
And I can't figure out now what the "cache" in the name is referring to.
Looks like these are just convenience wrappers which return VA instead
of struct page..


* Re: [PATCH net-next v11 0/6] introduce page_pool_alloc() related API
  2023-10-13  6:48 [PATCH net-next v11 0/6] introduce page_pool_alloc() related API Yunsheng Lin
                   ` (2 preceding siblings ...)
  2023-10-17  1:27 ` [PATCH net-next v11 0/6] introduce page_pool_alloc() related API Jakub Kicinski
@ 2023-10-17  4:10 ` patchwork-bot+netdevbpf
  3 siblings, 0 replies; 12+ messages in thread
From: patchwork-bot+netdevbpf @ 2023-10-17  4:10 UTC (permalink / raw)
  To: Yunsheng Lin
  Cc: davem, kuba, pabeni, netdev, linux-kernel, ast, daniel, hawk,
	john.fastabend, matthias.bgg, angelogioacchino.delregno, bpf,
	linux-arm-kernel, linux-mediatek

Hello:

This series was applied to netdev/net-next.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Fri, 13 Oct 2023 14:48:20 +0800 you wrote:
> In [1] & [2] & [3], there are use cases for veth and virtio_net
> to use frag support in page pool to reduce memory usage, and they
> may request different frag sizes depending on the head/tail
> room space for xdp_frame/shinfo and the mtu/packet size. When the
> requested frag size is large enough that a single page can not
> be split into more than one frag, using frag support only has a
> performance penalty because of the extra frag count handling
> for frag support.
> 
> [...]

Here is the summary with links:
  - [net-next,v11,1/6] page_pool: fragment API support for 32-bit arch with 64-bit DMA
    https://git.kernel.org/netdev/net-next/c/90de47f020db
  - [net-next,v11,2/6] page_pool: unify frag_count handling in page_pool_is_last_frag()
    (no matching commit)
  - [net-next,v11,3/6] page_pool: remove PP_FLAG_PAGE_FRAG
    (no matching commit)
  - [net-next,v11,4/6] page_pool: introduce page_pool[_cache]_alloc() API
    (no matching commit)
  - [net-next,v11,5/6] page_pool: update document about fragment API
    (no matching commit)
  - [net-next,v11,6/6] net: veth: use newly added page pool API for veth with xdp
    (no matching commit)

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html




* Re: [PATCH net-next v11 0/6] introduce page_pool_alloc() related API
  2023-10-17  1:27 ` [PATCH net-next v11 0/6] introduce page_pool_alloc() related API Jakub Kicinski
@ 2023-10-17  7:56   ` Yunsheng Lin
  2023-10-17 15:13     ` Jakub Kicinski
  0 siblings, 1 reply; 12+ messages in thread
From: Yunsheng Lin @ 2023-10-17  7:56 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: davem, pabeni, netdev, linux-kernel, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Matthias Brugger, AngeloGioacchino Del Regno, bpf,
	linux-arm-kernel, linux-mediatek

On 2023/10/17 9:27, Jakub Kicinski wrote:
> On Fri, 13 Oct 2023 14:48:20 +0800 Yunsheng Lin wrote:
>> v5 RFC: Add a new page_pool_cache_alloc() API, and other minor
>>         changes as discussed in v4. As there seem to be three
>>         consumers that might make use of the new API, repost
>>         it as RFC and CC the relevant authors to see if the
>>         new API fits their need.
> 
> I have looked thru the v4 discussion (admittedly it was pretty huge).
> I can't find where the "cache" API was suggested.

Actually, the discussion happened in v3, as the discussions in v3
and v4 seem to have been happening concurrently:

https://lore.kernel.org/all/f8ce176f-f975-af11-641c-b56c53a8066a@redhat.com/

> And I can't figure out now what the "cache" in the name is referring to.
> Looks like these are just convenience wrappers which return VA instead
> of struct page..

Yes, it corresponds to APIs like napi_alloc_frag() that return a va
instead of a 'struct page', as mentioned in patch 5.

Anyway, naming is hard, any suggestion for a better name is always
welcome:)

> .
> 


* Re: [PATCH net-next v11 0/6] introduce page_pool_alloc() related API
  2023-10-17  7:56   ` Yunsheng Lin
@ 2023-10-17 15:13     ` Jakub Kicinski
  2023-10-17 15:14       ` Jakub Kicinski
  2023-10-18 11:47       ` Yunsheng Lin
  0 siblings, 2 replies; 12+ messages in thread
From: Jakub Kicinski @ 2023-10-17 15:13 UTC (permalink / raw)
  To: Yunsheng Lin
  Cc: davem, pabeni, netdev, linux-kernel, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Matthias Brugger, AngeloGioacchino Del Regno, bpf,
	linux-arm-kernel, linux-mediatek, Alexander Duyck

On Tue, 17 Oct 2023 15:56:48 +0800 Yunsheng Lin wrote:
> > And I can't figure out now what the "cache" in the name is referring to.
> > Looks like these are just convenience wrappers which return VA instead
> > of struct page..  
> 
> Yes, it corresponds to APIs like napi_alloc_frag() that return a va
> instead of a 'struct page', as mentioned in patch 5.
> 
> Anyway, naming is hard, any suggestion for a better name is always
> welcome:)

I'd just throw a _va (for virtual address) at the end. And not really
mention it in the documentation. Plus the kdoc of the function should
say that this is just a thin wrapper around other page pool APIs, and
it's safe to mix it with other page pool APIs?


* Re: [PATCH net-next v11 0/6] introduce page_pool_alloc() related API
  2023-10-17 15:13     ` Jakub Kicinski
@ 2023-10-17 15:14       ` Jakub Kicinski
  2023-10-18 11:47       ` Yunsheng Lin
  1 sibling, 0 replies; 12+ messages in thread
From: Jakub Kicinski @ 2023-10-17 15:14 UTC (permalink / raw)
  To: Yunsheng Lin
  Cc: davem, pabeni, netdev, linux-kernel, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Matthias Brugger, AngeloGioacchino Del Regno, bpf,
	linux-arm-kernel, linux-mediatek, Alexander Duyck

On Tue, 17 Oct 2023 08:13:03 -0700 Jakub Kicinski wrote:
> I'd just throw a _va (for virtual address) at the end

To be clear I mean:
  page_pool_alloc_va()
  page_pool_dev_alloc_va()
  page_pool_free_va()


* Re: [PATCH net-next v11 0/6] introduce page_pool_alloc() related API
  2023-10-17 15:13     ` Jakub Kicinski
  2023-10-17 15:14       ` Jakub Kicinski
@ 2023-10-18 11:47       ` Yunsheng Lin
  2023-10-18 15:35         ` Jakub Kicinski
  1 sibling, 1 reply; 12+ messages in thread
From: Yunsheng Lin @ 2023-10-18 11:47 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: davem, pabeni, netdev, linux-kernel, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Matthias Brugger, AngeloGioacchino Del Regno, bpf,
	linux-arm-kernel, linux-mediatek, Alexander Duyck

On 2023/10/17 23:13, Jakub Kicinski wrote:
> On Tue, 17 Oct 2023 15:56:48 +0800 Yunsheng Lin wrote:
>>> And I can't figure out now what the "cache" in the name is referring to.
>>> Looks like these are just convenience wrappers which return VA instead
>>> of struct page..  
>>
>> Yes, it corresponds to APIs like napi_alloc_frag() that return a va
>> instead of a 'struct page', as mentioned in patch 5.
>>
>> Anyway, naming is hard, any suggestion for a better name is always
>> welcome:)
> 
> I'd just throw a _va (for virtual address) at the end. And not really

_va seems fine:)

> mention it in the documentation. Plus the kdoc of the function should
> say that this is just a thin wrapper around other page pool APIs, and
> it's safe to mix it with other page pool APIs?

I am not sure I understand what 'safe' and 'mix' mean here.

For the 'safe' part, I suppose you mean whether there is a va associated with
a 'struct page' without calling some API like kmap()? For that, I suppose
it is safe when the driver is calling the page_pool API without the
__GFP_HIGHMEM flag. Maybe we should mention that in the kdoc and give a
warning if page_pool_*alloc_va() is called with the __GFP_HIGHMEM flag?

For the 'mix', I suppose you mean the below:
1. Allocate a page with the page_pool_*alloc_va() API and free a page with
   page_pool_free() API.
2. Allocate a page with the page_pool_*alloc() API and free a page with
   page_pool_free_va() API.

For 1, it seems ok, as the virt_to_head_page() and page_address() calls
between the va and the 'struct page' do not seem to change anything if we
enforce that page_pool_*alloc_va() is called without the __GFP_HIGHMEM flag.

For 2, it is fine if the va is returned from page_address() and the allocation
API is called without the __GFP_HIGHMEM flag. If not, the va is from kmap*(),
which means we may be calling page_pool_free_va() before kunmap*()? Is that possible?


> .
> 


* Re: [PATCH net-next v11 0/6] introduce page_pool_alloc() related API
  2023-10-18 11:47       ` Yunsheng Lin
@ 2023-10-18 15:35         ` Jakub Kicinski
  2023-10-19 13:22           ` Yunsheng Lin
  0 siblings, 1 reply; 12+ messages in thread
From: Jakub Kicinski @ 2023-10-18 15:35 UTC (permalink / raw)
  To: Yunsheng Lin
  Cc: davem, pabeni, netdev, linux-kernel, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Matthias Brugger, AngeloGioacchino Del Regno, bpf,
	linux-arm-kernel, linux-mediatek, Alexander Duyck

On Wed, 18 Oct 2023 19:47:16 +0800 Yunsheng Lin wrote:
> > mention it in the documentation. Plus the kdoc of the function should
> > say that this is just a thin wrapper around other page pool APIs, and
> > it's safe to mix it with other page pool APIs?  
> 
> I am not sure I understand what 'safe' and 'mix' mean here.
> 
> For the 'safe' part, I suppose you mean whether there is a va associated with
> a 'struct page' without calling some API like kmap()? For that, I suppose
> it is safe when the driver is calling the page_pool API without the
> __GFP_HIGHMEM flag. Maybe we should mention that in the kdoc and give a
> warning if page_pool_*alloc_va() is called with the __GFP_HIGHMEM flag?

Sounds good. Warning wrapped in #if CONFIG_DEBUG_NET perhaps?

> For the 'mix', I suppose you mean the below:
> 1. Allocate a page with the page_pool_*alloc_va() API and free a page with
>    page_pool_free() API.
> 2. Allocate a page with the page_pool_*alloc() API and free a page with
>    page_pool_free_va() API.
> 
> For 1, it seems ok, as the virt_to_head_page() and page_address() calls
> between the va and the 'struct page' do not seem to change anything if we
> enforce that page_pool_*alloc_va() is called without the __GFP_HIGHMEM flag.
> 
> For 2, it is fine if the va is returned from page_address() and the allocation
> API is called without the __GFP_HIGHMEM flag. If not, the va is from kmap*(),
> which means we may be calling page_pool_free_va() before kunmap*()? Is that possible?

Right, if someone passes a kmap()'ed address they are trying quite hard
to break their own driver. Technically possible, but I wouldn't worry.

I just mean that in the common case of non-HIGHMEM page, calling
page_pool_free_va() with the address returned by page_address() 
is perfectly legal.


* Re: [PATCH net-next v11 0/6] introduce page_pool_alloc() related API
  2023-10-18 15:35         ` Jakub Kicinski
@ 2023-10-19 13:22           ` Yunsheng Lin
  2023-10-19 13:56             ` Jakub Kicinski
  0 siblings, 1 reply; 12+ messages in thread
From: Yunsheng Lin @ 2023-10-19 13:22 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: davem, pabeni, netdev, linux-kernel, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Matthias Brugger, AngeloGioacchino Del Regno, bpf,
	linux-arm-kernel, linux-mediatek, Alexander Duyck

On 2023/10/18 23:35, Jakub Kicinski wrote:
> On Wed, 18 Oct 2023 19:47:16 +0800 Yunsheng Lin wrote:
>>> mention it in the documentation. Plus the kdoc of the function should
>>> say that this is just a thin wrapper around other page pool APIs, and
>>> it's safe to mix it with other page pool APIs?  
>>
>> I am not sure I understand what 'safe' and 'mix' mean here.
>>
>> For the 'safe' part, I suppose you mean whether there is a va associated with
>> a 'struct page' without calling some API like kmap()? For that, I suppose
>> it is safe when the driver is calling the page_pool API without the
>> __GFP_HIGHMEM flag. Maybe we should mention that in the kdoc and give a
>> warning if page_pool_*alloc_va() is called with the __GFP_HIGHMEM flag?
> 
> Sounds good. Warning wrapped in #if CONFIG_DEBUG_NET perhaps?

How about doing something like what __get_free_pages() does with gfp flags?
https://elixir.free-electrons.com/linux/v6.4-rc6/source/mm/page_alloc.c#L4818

How about something like the below on top of this patchset:
diff --git a/include/net/page_pool/helpers.h b/include/net/page_pool/helpers.h
index 7550beeacf3d..61cee55606c0 100644
--- a/include/net/page_pool/helpers.h
+++ b/include/net/page_pool/helpers.h
@@ -167,13 +167,13 @@ static inline struct page *page_pool_dev_alloc(struct page_pool *pool,
        return page_pool_alloc(pool, offset, size, gfp);
 }

-static inline void *page_pool_cache_alloc(struct page_pool *pool,
-                                         unsigned int *size, gfp_t gfp)
+static inline void *page_pool_alloc_va(struct page_pool *pool,
+                                      unsigned int *size, gfp_t gfp)
 {
        unsigned int offset;
        struct page *page;

-       page = page_pool_alloc(pool, &offset, size, gfp);
+       page = page_pool_alloc(pool, &offset, size, gfp & ~__GFP_HIGHMEM);
        if (unlikely(!page))
                return NULL;

@@ -181,21 +181,22 @@ static inline void *page_pool_cache_alloc(struct page_pool *pool,
 }

 /**
- * page_pool_dev_cache_alloc() - allocate a cache.
+ * page_pool_dev_alloc_va() - allocate a page or a page fragment.
  * @pool: pool from which to allocate
  * @size: in as the requested size, out as the allocated size
  *
- * Get a cache from the page allocator or page_pool caches.
+ * This is just a thin wrapper around the page_pool_alloc() API, and
+ * it returns the va of the allocated page or page fragment.
  *
  * Return:
- * Return the addr for the allocated cache, otherwise return NULL.
+ * Return the va for the allocated page or page fragment, otherwise return NULL.
  */
-static inline void *page_pool_dev_cache_alloc(struct page_pool *pool,
-                                             unsigned int *size)
+static inline void *page_pool_dev_alloc_va(struct page_pool *pool,
+                                          unsigned int *size)
 {
        gfp_t gfp = (GFP_ATOMIC | __GFP_NOWARN);

-       return page_pool_cache_alloc(pool, size, gfp);
+       return page_pool_alloc_va(pool, size, gfp);
 }

 /**
@@ -338,17 +339,17 @@ static inline void page_pool_recycle_direct(struct page_pool *pool,
                (sizeof(dma_addr_t) > sizeof(unsigned long))

 /**
- * page_pool_cache_free() - free a cache into the page_pool
- * @pool: pool from which cache was allocated
- * @data: addr of cache to be free
+ * page_pool_free_va() - free a va into the page_pool
+ * @pool: pool from which va was allocated
+ * @va: va to be freed
  * @allow_direct: freed by the consumer, allow lockless caching
  *
  * Free a cache allocated from page_pool_dev_cache_alloc().
  */
-static inline void page_pool_cache_free(struct page_pool *pool, void *data,
-                                       bool allow_direct)
+static inline void page_pool_free_va(struct page_pool *pool, void *va,
+                                    bool allow_direct)
 {
-       page_pool_put_page(pool, virt_to_head_page(data), -1, allow_direct);
+       page_pool_put_page(pool, virt_to_head_page(va), -1, allow_direct);
 }





* Re: [PATCH net-next v11 0/6] introduce page_pool_alloc() related API
  2023-10-19 13:22           ` Yunsheng Lin
@ 2023-10-19 13:56             ` Jakub Kicinski
  0 siblings, 0 replies; 12+ messages in thread
From: Jakub Kicinski @ 2023-10-19 13:56 UTC (permalink / raw)
  To: Yunsheng Lin
  Cc: davem, pabeni, netdev, linux-kernel, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Matthias Brugger, AngeloGioacchino Del Regno, bpf,
	linux-arm-kernel, linux-mediatek, Alexander Duyck

On Thu, 19 Oct 2023 21:22:07 +0800 Yunsheng Lin wrote:
> > Sounds good. Warning wrapped in #if CONFIG_DEBUG_NET perhaps?  
> 
> How about something like __get_free_pages() does with gfp flags?
> https://elixir.free-electrons.com/linux/v6.4-rc6/source/mm/page_alloc.c#L4818

Fine by me!

