linux-doc.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Mina Almasry <almasrymina@google.com>
To: netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	 linux-doc@vger.kernel.org, linux-alpha@vger.kernel.org,
	 linux-mips@vger.kernel.org, linux-parisc@vger.kernel.org,
	 sparclinux@vger.kernel.org, linux-trace-kernel@vger.kernel.org,
	 linux-arch@vger.kernel.org, bpf@vger.kernel.org,
	 linux-kselftest@vger.kernel.org, linux-media@vger.kernel.org,
	 dri-devel@lists.freedesktop.org
Cc: "Mina Almasry" <almasrymina@google.com>,
	"David S. Miller" <davem@davemloft.net>,
	"Eric Dumazet" <edumazet@google.com>,
	"Jakub Kicinski" <kuba@kernel.org>,
	"Paolo Abeni" <pabeni@redhat.com>,
	"Donald Hunter" <donald.hunter@gmail.com>,
	"Jonathan Corbet" <corbet@lwn.net>,
	"Richard Henderson" <richard.henderson@linaro.org>,
	"Ivan Kokshaysky" <ink@jurassic.park.msu.ru>,
	"Matt Turner" <mattst88@gmail.com>,
	"Thomas Bogendoerfer" <tsbogend@alpha.franken.de>,
	"James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>,
	"Helge Deller" <deller@gmx.de>,
	"Andreas Larsson" <andreas@gaisler.com>,
	"Jesper Dangaard Brouer" <hawk@kernel.org>,
	"Ilias Apalodimas" <ilias.apalodimas@linaro.org>,
	"Steven Rostedt" <rostedt@goodmis.org>,
	"Masami Hiramatsu" <mhiramat@kernel.org>,
	"Mathieu Desnoyers" <mathieu.desnoyers@efficios.com>,
	"Arnd Bergmann" <arnd@arndb.de>,
	"Alexei Starovoitov" <ast@kernel.org>,
	"Daniel Borkmann" <daniel@iogearbox.net>,
	"Andrii Nakryiko" <andrii@kernel.org>,
	"Martin KaFai Lau" <martin.lau@linux.dev>,
	"Eduard Zingerman" <eddyz87@gmail.com>,
	"Song Liu" <song@kernel.org>,
	"Yonghong Song" <yonghong.song@linux.dev>,
	"John Fastabend" <john.fastabend@gmail.com>,
	"KP Singh" <kpsingh@kernel.org>,
	"Stanislav Fomichev" <sdf@google.com>,
	"Hao Luo" <haoluo@google.com>, "Jiri Olsa" <jolsa@kernel.org>,
	"Steffen Klassert" <steffen.klassert@secunet.com>,
	"Herbert Xu" <herbert@gondor.apana.org.au>,
	"David Ahern" <dsahern@kernel.org>,
	"Willem de Bruijn" <willemdebruijn.kernel@gmail.com>,
	"Shuah Khan" <shuah@kernel.org>,
	"Sumit Semwal" <sumit.semwal@linaro.org>,
	"Christian König" <christian.koenig@amd.com>,
	"Pavel Begunkov" <asml.silence@gmail.com>,
	"David Wei" <dw@davidwei.uk>, "Jason Gunthorpe" <jgg@ziepe.ca>,
	"Yunsheng Lin" <linyunsheng@huawei.com>,
	"Shailend Chand" <shailend@google.com>,
	"Harshitha Ramamurthy" <hramamurthy@google.com>,
	"Shakeel Butt" <shakeel.butt@linux.dev>,
	"Jeroen de Borst" <jeroendb@google.com>,
	"Praveen Kaligineedi" <pkaligineedi@google.com>,
	"Christoph Hellwig" <hch@infradead.org>
Subject: [PATCH net-next v10 02/14] net: page_pool: create hooks for custom page providers
Date: Thu, 30 May 2024 20:16:01 +0000	[thread overview]
Message-ID: <20240530201616.1316526-3-almasrymina@google.com> (raw)
In-Reply-To: <20240530201616.1316526-1-almasrymina@google.com>

From: Jakub Kicinski <kuba@kernel.org>

The page providers which try to reuse the same pages will
need to hold onto the ref, even if page gets released from
the pool - as in releasing the page from the pp just transfers
the "ownership" reference from pp to the provider, and provider
will wait for other references to be gone before feeding this
page back into the pool.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Mina Almasry <almasrymina@google.com>

---

- This is implemented by Jakub in his RFC:
https://lore.kernel.org/netdev/f8270765-a27b-6ccf-33ea-cda097168d79@redhat.com/T/

I take no credit for the idea or implementation; I only added minor
edits to make this workable with device memory TCP, and removed some
hacky test code. This is a critical dependency of device memory TCP
and thus I'm pulling it into this series to make it revewable and
mergeable.

- There is a pending discussion about the acceptance of the page_pool
  memory provider hooks:

https://lore.kernel.org/netdev/20240403002053.2376017-3-almasrymina@google.com/

I'm unsure if the discussion has been resolved yet. Sending the series
anyway to get reviews/feedback on the (unrelated) rest of the series.

Cc: Christoph Hellwig <hch@infradead.org>

v10:
- Renamed alloc_pages -> alloc_netmems. alloc_pages is now
  a preprocessor macro, and reusing the string results in a build error.

RFC v3 -> v1
- Removed unusued mem_provider. (Yunsheng).
- Replaced memory_provider & mp_priv with netdev_rx_queue (Jakub).

---
 include/net/page_pool/types.h | 12 ++++++++++
 net/core/page_pool.c          | 43 +++++++++++++++++++++++++++++++----
 2 files changed, 50 insertions(+), 5 deletions(-)

diff --git a/include/net/page_pool/types.h b/include/net/page_pool/types.h
index b088d131aeb0d..b038b838f042f 100644
--- a/include/net/page_pool/types.h
+++ b/include/net/page_pool/types.h
@@ -51,6 +51,7 @@ struct pp_alloc_cache {
  * @dev:	device, for DMA pre-mapping purposes
  * @netdev:	netdev this pool will serve (leave as NULL if none or multiple)
  * @napi:	NAPI which is the sole consumer of pages, otherwise NULL
+ * @queue:	struct netdev_rx_queue this page_pool is being created for.
  * @dma_dir:	DMA mapping direction
  * @max_len:	max DMA sync memory size for PP_FLAG_DMA_SYNC_DEV
  * @offset:	DMA sync address offset for PP_FLAG_DMA_SYNC_DEV
@@ -64,6 +65,7 @@ struct page_pool_params {
 		int		nid;
 		struct device	*dev;
 		struct napi_struct *napi;
+		struct netdev_rx_queue *queue;
 		enum dma_data_direction dma_dir;
 		unsigned int	max_len;
 		unsigned int	offset;
@@ -127,6 +129,13 @@ struct page_pool_stats {
 };
 #endif
 
+struct memory_provider_ops {
+	int (*init)(struct page_pool *pool);
+	void (*destroy)(struct page_pool *pool);
+	struct page *(*alloc_netmems)(struct page_pool *pool, gfp_t gfp);
+	bool (*release_page)(struct page_pool *pool, struct page *page);
+};
+
 struct page_pool {
 	struct page_pool_params_fast p;
 
@@ -193,6 +202,9 @@ struct page_pool {
 	 */
 	struct ptr_ring ring;
 
+	void *mp_priv;
+	const struct memory_provider_ops *mp_ops;
+
 #ifdef CONFIG_PAGE_POOL_STATS
 	/* recycle stats are per-cpu to avoid locking */
 	struct page_pool_recycle_stats __percpu *recycle_stats;
diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index f4444b4e39e63..251c9356c9202 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -26,6 +26,8 @@
 
 #include "page_pool_priv.h"
 
+static DEFINE_STATIC_KEY_FALSE(page_pool_mem_providers);
+
 #define DEFER_TIME (msecs_to_jiffies(1000))
 #define DEFER_WARN_INTERVAL (60 * HZ)
 
@@ -186,6 +188,7 @@ static int page_pool_init(struct page_pool *pool,
 			  int cpuid)
 {
 	unsigned int ring_qsize = 1024; /* Default */
+	int err;
 
 	page_pool_struct_check();
 
@@ -267,7 +270,22 @@ static int page_pool_init(struct page_pool *pool,
 	if (pool->dma_map)
 		get_device(pool->p.dev);
 
+	if (pool->mp_ops) {
+		err = pool->mp_ops->init(pool);
+		if (err) {
+			pr_warn("%s() mem-provider init failed %d\n", __func__,
+				err);
+			goto free_ptr_ring;
+		}
+
+		static_branch_inc(&page_pool_mem_providers);
+	}
+
 	return 0;
+
+free_ptr_ring:
+	ptr_ring_cleanup(&pool->ring, NULL);
+	return err;
 }
 
 static void page_pool_uninit(struct page_pool *pool)
@@ -569,7 +587,10 @@ struct page *page_pool_alloc_pages(struct page_pool *pool, gfp_t gfp)
 		return page;
 
 	/* Slow-path: cache empty, do real allocation */
-	page = __page_pool_alloc_pages_slow(pool, gfp);
+	if (static_branch_unlikely(&page_pool_mem_providers) && pool->mp_ops)
+		page = pool->mp_ops->alloc_netmems(pool, gfp);
+	else
+		page = __page_pool_alloc_pages_slow(pool, gfp);
 	return page;
 }
 EXPORT_SYMBOL(page_pool_alloc_pages);
@@ -627,10 +648,13 @@ void __page_pool_release_page_dma(struct page_pool *pool, struct page *page)
 void page_pool_return_page(struct page_pool *pool, struct page *page)
 {
 	int count;
+	bool put;
 
-	__page_pool_release_page_dma(pool, page);
-
-	page_pool_clear_pp_info(page);
+	put = true;
+	if (static_branch_unlikely(&page_pool_mem_providers) && pool->mp_ops)
+		put = pool->mp_ops->release_page(pool, page);
+	else
+		__page_pool_release_page_dma(pool, page);
 
 	/* This may be the last page returned, releasing the pool, so
 	 * it is not safe to reference pool afterwards.
@@ -638,7 +662,10 @@ void page_pool_return_page(struct page_pool *pool, struct page *page)
 	count = atomic_inc_return_relaxed(&pool->pages_state_release_cnt);
 	trace_page_pool_state_release(pool, page, count);
 
-	put_page(page);
+	if (put) {
+		page_pool_clear_pp_info(page);
+		put_page(page);
+	}
 	/* An optimization would be to call __free_pages(page, pool->p.order)
 	 * knowing page is not part of page-cache (thus avoiding a
 	 * __page_cache_release() call).
@@ -937,6 +964,12 @@ static void __page_pool_destroy(struct page_pool *pool)
 
 	page_pool_unlist(pool);
 	page_pool_uninit(pool);
+
+	if (pool->mp_ops) {
+		pool->mp_ops->destroy(pool);
+		static_branch_dec(&page_pool_mem_providers);
+	}
+
 	kfree(pool);
 }
 
-- 
2.45.1.288.g0e0cd299f1-goog


  parent reply	other threads:[~2024-05-30 20:16 UTC|newest]

Thread overview: 71+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-05-30 20:15 [PATCH net-next v10 00/14] Device Memory TCP Mina Almasry
2024-05-30 20:16 ` [PATCH net-next v10 01/14] netdev: add netdev_rx_queue_restart() Mina Almasry
2024-05-30 23:51   ` David Wei
2024-06-03 12:52   ` Pavel Begunkov
2024-05-30 20:16 ` Mina Almasry [this message]
2024-06-01  5:35   ` [PATCH net-next v10 02/14] net: page_pool: create hooks for custom page providers Christoph Hellwig
2024-06-03 14:17     ` Mina Almasry
2024-06-03 14:52       ` Pavel Begunkov
2024-06-03 15:43         ` Mina Almasry
2024-06-07 13:42           ` Pavel Begunkov
2024-06-07 14:27             ` David Ahern
2024-06-07 14:52               ` Jason Gunthorpe
2024-06-10  0:37                 ` David Wei
2024-06-10  1:07                   ` Pavel Begunkov
2024-06-10 12:16                     ` Jason Gunthorpe
2024-06-10 12:38                       ` Christian König
2024-06-10 15:41                         ` Mina Almasry
2024-06-10 19:32                           ` Pavel Begunkov
2024-06-10 16:22                         ` Daniel Vetter
2024-06-11  6:25                         ` Christoph Hellwig
2024-06-11  8:21                           ` Christian König
2024-06-10 15:16                       ` David Ahern
2024-06-10 19:20                         ` Pavel Begunkov
2024-06-10 22:15                           ` Jason Gunthorpe
2024-06-11 18:09                             ` Mina Almasry
2024-06-12 12:06                               ` Jason Gunthorpe
2024-06-12 15:47                                 ` David Ahern
2024-06-17 19:15                             ` Pavel Begunkov
2024-06-11  6:26                         ` Christoph Hellwig
2024-06-11 17:48                           ` Mina Almasry
2024-06-07 15:42               ` Pavel Begunkov
2024-06-07 15:46                 ` Pavel Begunkov
2024-06-07 16:59                   ` Mina Almasry
2024-06-10  1:12                     ` Pavel Begunkov
2024-06-10  0:27               ` David Wei
2024-06-05  8:24         ` Christoph Hellwig
2024-06-07 13:45           ` Pavel Begunkov
2024-06-11  6:34             ` Christoph Hellwig
2024-06-17 18:04               ` Pavel Begunkov
2024-06-18  6:43                 ` Christoph Hellwig
2024-06-18 11:40                   ` Pavel Begunkov
2024-06-05  8:23       ` Christoph Hellwig
2024-05-30 20:16 ` [PATCH net-next v10 03/14] net: netdev netlink api to bind dma-buf to a net device Mina Almasry
2024-05-30 20:16 ` [PATCH net-next v10 04/14] netdev: support binding dma-buf to netdevice Mina Almasry
2024-05-30 20:16 ` [PATCH net-next v10 05/14] netdev: netdevice devmem allocator Mina Almasry
2024-06-04 10:13   ` Paolo Abeni
2024-06-04 16:15     ` Steven Rostedt
2024-06-04 16:31       ` Jason Gunthorpe
2024-06-04 16:42         ` Steven Rostedt
2024-06-04 23:44           ` Andrew Lunn
2024-06-05  0:27             ` Steven Rostedt
2024-06-05  0:52               ` Andrew Lunn
2024-06-06  1:43                 ` Steven Rostedt
2024-06-07  7:55               ` Niklas Schnelle
2024-05-30 20:16 ` [PATCH net-next v10 06/14] page_pool: convert to use netmem Mina Almasry
2024-06-04 10:23   ` Paolo Abeni
2024-06-06  1:48   ` Steven Rostedt
2024-06-07 12:31     ` Pavel Begunkov
2024-05-30 20:16 ` [PATCH net-next v10 07/14] page_pool: devmem support Mina Almasry
2024-05-30 20:16 ` [PATCH net-next v10 08/14] memory-provider: dmabuf devmem memory provider Mina Almasry
2024-05-30 20:16 ` [PATCH net-next v10 09/14] net: support non paged skb frags Mina Almasry
2024-05-30 20:16 ` [PATCH net-next v10 10/14] net: add support for skbs with unreadable frags Mina Almasry
2024-06-04 10:46   ` Paolo Abeni
2024-06-06 16:49     ` Mina Almasry
2024-06-06 16:58       ` Mina Almasry
2024-05-30 20:16 ` [PATCH net-next v10 11/14] tcp: RX path for devmem TCP Mina Almasry
2024-06-04 10:53   ` Paolo Abeni
2024-05-30 20:16 ` [PATCH net-next v10 12/14] net: add SO_DEVMEM_DONTNEED setsockopt to release RX frags Mina Almasry
2024-05-30 20:16 ` [PATCH net-next v10 13/14] net: add devmem TCP documentation Mina Almasry
2024-06-01 13:09   ` Bagas Sanjaya
2024-05-30 20:16 ` [PATCH net-next v10 14/14] selftests: add ncdevmem, netcat for devmem TCP Mina Almasry

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240530201616.1316526-3-almasrymina@google.com \
    --to=almasrymina@google.com \
    --cc=James.Bottomley@HansenPartnership.com \
    --cc=andreas@gaisler.com \
    --cc=andrii@kernel.org \
    --cc=arnd@arndb.de \
    --cc=asml.silence@gmail.com \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=christian.koenig@amd.com \
    --cc=corbet@lwn.net \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=deller@gmx.de \
    --cc=donald.hunter@gmail.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=dsahern@kernel.org \
    --cc=dw@davidwei.uk \
    --cc=eddyz87@gmail.com \
    --cc=edumazet@google.com \
    --cc=haoluo@google.com \
    --cc=hawk@kernel.org \
    --cc=hch@infradead.org \
    --cc=herbert@gondor.apana.org.au \
    --cc=hramamurthy@google.com \
    --cc=ilias.apalodimas@linaro.org \
    --cc=ink@jurassic.park.msu.ru \
    --cc=jeroendb@google.com \
    --cc=jgg@ziepe.ca \
    --cc=john.fastabend@gmail.com \
    --cc=jolsa@kernel.org \
    --cc=kpsingh@kernel.org \
    --cc=kuba@kernel.org \
    --cc=linux-alpha@vger.kernel.org \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=linux-media@vger.kernel.org \
    --cc=linux-mips@vger.kernel.org \
    --cc=linux-parisc@vger.kernel.org \
    --cc=linux-trace-kernel@vger.kernel.org \
    --cc=linyunsheng@huawei.com \
    --cc=martin.lau@linux.dev \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=mattst88@gmail.com \
    --cc=mhiramat@kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=pkaligineedi@google.com \
    --cc=richard.henderson@linaro.org \
    --cc=rostedt@goodmis.org \
    --cc=sdf@google.com \
    --cc=shailend@google.com \
    --cc=shakeel.butt@linux.dev \
    --cc=shuah@kernel.org \
    --cc=song@kernel.org \
    --cc=sparclinux@vger.kernel.org \
    --cc=steffen.klassert@secunet.com \
    --cc=sumit.semwal@linaro.org \
    --cc=tsbogend@alpha.franken.de \
    --cc=willemdebruijn.kernel@gmail.com \
    --cc=yonghong.song@linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).