From: Alexander Lobakin <aleksander.lobakin@intel.com>
To: "David S. Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>
Cc: Paul Menzel <pmenzel@molgen.mpg.de>,
Maciej Fijalkowski <maciej.fijalkowski@intel.com>,
Jesper Dangaard Brouer <hawk@kernel.org>,
Larysa Zaremba <larysa.zaremba@intel.com>,
netdev@vger.kernel.org, Alexander Duyck <alexanderduyck@fb.com>,
Ilias Apalodimas <ilias.apalodimas@linaro.org>,
linux-kernel@vger.kernel.org,
Alexander Lobakin <aleksander.lobakin@intel.com>,
Yunsheng Lin <linyunsheng@huawei.com>,
Michal Kubiak <michal.kubiak@intel.com>,
intel-wired-lan@lists.osuosl.org,
David Christensen <drc@linux.vnet.ibm.com>
Subject: [Intel-wired-lan] [PATCH net-next v5 03/14] page_pool: avoid calling no-op externals when possible
Date: Fri, 24 Nov 2023 16:47:21 +0100 [thread overview]
Message-ID: <20231124154732.1623518-4-aleksander.lobakin@intel.com> (raw)
In-Reply-To: <20231124154732.1623518-1-aleksander.lobakin@intel.com>
Turned out page_pool_put{,_full}_page() can burn quite a bunch of cycles
even on DMA-coherent platforms (like x86) with no active IOMMU or
swiotlb, just for the call ladder.
Indeed, it's
page_pool_put_page()
page_pool_put_defragged_page() <- external
__page_pool_put_page()
page_pool_dma_sync_for_device() <- non-inline
dma_sync_single_range_for_device()
dma_sync_single_for_device() <- external
dma_direct_sync_single_for_device()
dev_is_dma_coherent() <- exit
For the inline functions, no guarantees the compiler won't uninline them
(they're clearly not one-liners and sometimes the compilers manage to
uninline 2 + 2). The first external call is necessary, but the rest 2+
are done for nothing each time, plus a bunch of checks here and there.
Since Page Pool mappings are long-term and for one "device + addr" pair
dma_need_sync() will always return the same value (basically, whether it
belongs to an swiotlb pool), addresses can be tested once right after
they're obtained and the result can be reused until the page is
unmapped.
Do this test after a successful DMA mapping and reuse the lowest bit of
page::dma_addr to store the result, since Page Pool addresses are always
page-aligned and the lowest bits are unused.
page_pool_{g,s}et_dma_addr() will then take this bit into account and
exclude it from the actual address. Call page_pool_dma_sync_for_device()
only if the bit is set, shaving off several function calls back and
forth. If pool::dma_sync is not set, i.e. the driver didn't ask to
perform syncs, don't do this test and never touch the lowest bit.
On my x86_64, this gives from 2% to 5% performance benefit with no
negative impact for cases when IOMMU is on and the shortcut can't be
used.
Suggested-by: Yunsheng Lin <linyunsheng@huawei.com> # per-page flag
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
---
include/net/page_pool/helpers.h | 60 +++++++++++++++++++++++++++++----
net/core/page_pool.c | 10 +++---
2 files changed, 58 insertions(+), 12 deletions(-)
diff --git a/include/net/page_pool/helpers.h b/include/net/page_pool/helpers.h
index 4ebd544ae977..28bec368b8e9 100644
--- a/include/net/page_pool/helpers.h
+++ b/include/net/page_pool/helpers.h
@@ -52,6 +52,8 @@
#ifndef _NET_PAGE_POOL_HELPERS_H
#define _NET_PAGE_POOL_HELPERS_H
+#include <linux/dma-mapping.h>
+
#include <net/page_pool/types.h>
#ifdef CONFIG_PAGE_POOL_STATS
@@ -354,6 +356,11 @@ static inline void page_pool_free_va(struct page_pool *pool, void *va,
page_pool_put_page(pool, virt_to_head_page(va), -1, allow_direct);
}
+/* Occupies page::dma_addr[0:0] and indicates whether the mapping needs to be
+ * synchronized (either for device, for CPU, or both).
+ */
+#define PAGE_POOL_DMA_ADDR_NEED_SYNC BIT(0)
+
/**
* page_pool_get_dma_addr() - Retrieve the stored DMA address.
* @page: page allocated from a page pool
@@ -363,27 +370,66 @@ static inline void page_pool_free_va(struct page_pool *pool, void *va,
*/
static inline dma_addr_t page_pool_get_dma_addr(struct page *page)
{
- dma_addr_t ret = page->dma_addr;
+ dma_addr_t ret = page->dma_addr & ~PAGE_POOL_DMA_ADDR_NEED_SYNC;
if (PAGE_POOL_32BIT_ARCH_WITH_64BIT_DMA)
- ret <<= PAGE_SHIFT;
+ ret <<= PAGE_SHIFT - 1;
return ret;
}
-static inline bool page_pool_set_dma_addr(struct page *page, dma_addr_t addr)
+/**
+ * page_pool_set_dma_addr - save the obtained DMA address for @page
+ * @pool: &page_pool this page belongs to and was mapped by
+ * @page: page to save the address for
+ * @addr: DMA address to save
+ *
+ * If the pool was created with the DMA synchronization enabled, it will test
+ * whether the address needs to be synced and store the result as well to
+ * conserve CPU cycles.
+ *
+ * Return: false if the address doesn't fit into the corresponding field and
+ * thus can't be stored, true on success.
+ */
+static inline bool page_pool_set_dma_addr(const struct page_pool *pool,
+ struct page *page,
+ dma_addr_t addr)
{
+ unsigned long val = addr;
+
+ if (unlikely(!addr)) {
+ page->dma_addr = 0;
+ return true;
+ }
+
if (PAGE_POOL_32BIT_ARCH_WITH_64BIT_DMA) {
- page->dma_addr = addr >> PAGE_SHIFT;
+ val = addr >> (PAGE_SHIFT - 1);
/* We assume page alignment to shave off bottom bits,
* if this "compression" doesn't work we need to drop.
*/
- return addr != (dma_addr_t)page->dma_addr << PAGE_SHIFT;
+ if (addr != (dma_addr_t)val << (PAGE_SHIFT - 1))
+ return false;
}
- page->dma_addr = addr;
- return false;
+ if (pool->dma_sync && dma_need_sync(pool->p.dev, addr))
+ val |= PAGE_POOL_DMA_ADDR_NEED_SYNC;
+
+ page->dma_addr = val;
+
+ return true;
+}
+
+/**
+ * page_pool_dma_addr_need_sync - check whether a mapped page needs DMA syncs
+ * @page: page to check
+ *
+ * Return: the result previously saved by page_pool_set_dma_addr() to skip
+ * synchronization when not needed.
+ */
+static inline bool page_pool_dma_addr_need_sync(const struct page *page)
+{
+ return page->dma_addr & PAGE_POOL_DMA_ADDR_NEED_SYNC;
}
static inline bool page_pool_put(struct page_pool *pool)
diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index a8da3ba3b1e8..09bdfdfc4788 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -366,10 +366,10 @@ static bool page_pool_dma_map(struct page_pool *pool, struct page *page)
if (dma_mapping_error(pool->p.dev, dma))
return false;
- if (page_pool_set_dma_addr(page, dma))
+ if (!page_pool_set_dma_addr(pool, page, dma))
goto unmap_failed;
- if (pool->dma_sync)
+ if (page_pool_dma_addr_need_sync(page))
page_pool_dma_sync_for_device(pool, page, pool->p.max_len);
return true;
@@ -545,7 +545,7 @@ static void page_pool_return_page(struct page_pool *pool, struct page *page)
dma_unmap_page_attrs(pool->p.dev, dma,
PAGE_SIZE << pool->p.order, pool->p.dma_dir,
DMA_ATTR_SKIP_CPU_SYNC | DMA_ATTR_WEAK_ORDERING);
- page_pool_set_dma_addr(page, 0);
+ page_pool_set_dma_addr(pool, page, 0);
skip_dma_unmap:
page_pool_clear_pp_info(page);
@@ -622,7 +622,7 @@ __page_pool_put_page(struct page_pool *pool, struct page *page,
if (likely(page_ref_count(page) == 1 && !page_is_pfmemalloc(page))) {
/* Read barrier done in page_ref_count / READ_ONCE */
- if (pool->dma_sync)
+ if (page_pool_dma_addr_need_sync(page))
page_pool_dma_sync_for_device(pool, page,
dma_sync_size);
@@ -735,7 +735,7 @@ static struct page *page_pool_drain_frag(struct page_pool *pool,
return NULL;
if (page_ref_count(page) == 1 && !page_is_pfmemalloc(page)) {
- if (pool->dma_sync)
+ if (page_pool_dma_addr_need_sync(page))
page_pool_dma_sync_for_device(pool, page, -1);
return page;
--
2.42.0
_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan
next prev parent reply other threads:[~2023-11-24 15:50 UTC|newest]
Thread overview: 39+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-11-24 15:47 [Intel-wired-lan] [PATCH net-next v5 00/14] net: intel: start The Great Code Dedup + Page Pool for iavf Alexander Lobakin
2023-11-24 15:47 ` [Intel-wired-lan] [PATCH net-next v5 01/14] page_pool: make sure frag API fields don't span between cachelines Alexander Lobakin
2023-11-25 12:29 ` Yunsheng Lin
2023-11-27 14:08 ` Alexander Lobakin
2023-11-29 2:55 ` Yunsheng Lin
2023-11-29 13:12 ` Alexander Lobakin
2023-11-26 22:54 ` Jakub Kicinski
2023-11-27 14:12 ` Alexander Lobakin
2023-11-24 15:47 ` [Intel-wired-lan] [PATCH net-next v5 02/14] page_pool: don't use driver-set flags field directly Alexander Lobakin
2023-11-24 15:47 ` Alexander Lobakin [this message]
2023-11-25 13:04 ` [Intel-wired-lan] [PATCH net-next v5 03/14] page_pool: avoid calling no-op externals when possible Yunsheng Lin
2023-11-27 14:32 ` Alexander Lobakin
2023-11-27 18:17 ` Jakub Kicinski
2023-11-28 16:50 ` Alexander Lobakin
2023-11-29 3:17 ` Yunsheng Lin
2023-11-29 13:17 ` Alexander Lobakin
2023-11-30 8:46 ` Yunsheng Lin
2023-11-30 11:58 ` Alexander Lobakin
2023-11-30 12:20 ` Yunsheng Lin
2023-12-01 14:37 ` Alexander Lobakin
2023-12-12 15:25 ` Christoph Hellwig
2023-11-24 15:47 ` [Intel-wired-lan] [PATCH net-next v5 04/14] net: intel: introduce Intel Ethernet common library Alexander Lobakin
2023-11-24 15:47 ` [Intel-wired-lan] [PATCH net-next v5 05/14] iavf: kill "legacy-rx" for good Alexander Lobakin
2023-11-24 15:47 ` [Intel-wired-lan] [PATCH net-next v5 06/14] iavf: drop page splitting and recycling Alexander Lobakin
2023-11-24 15:47 ` [Intel-wired-lan] [PATCH net-next v5 07/14] page_pool: constify some read-only function arguments Alexander Lobakin
2023-11-24 15:47 ` [Intel-wired-lan] [PATCH net-next v5 08/14] page_pool: add DMA-sync-for-CPU inline helpers Alexander Lobakin
2023-11-24 15:47 ` [Intel-wired-lan] [PATCH net-next v5 09/14] libie: add Rx buffer management (via Page Pool) Alexander Lobakin
2023-11-24 15:47 ` [Intel-wired-lan] [PATCH net-next v5 10/14] iavf: pack iavf_ring more efficiently Alexander Lobakin
2023-11-24 15:47 ` [Intel-wired-lan] [PATCH net-next v5 11/14] iavf: switch to Page Pool Alexander Lobakin
2023-11-24 15:47 ` [Intel-wired-lan] [PATCH net-next v5 12/14] libie: add common queue stats Alexander Lobakin
2023-11-24 15:47 ` [Intel-wired-lan] [PATCH net-next v5 13/14] libie: add per-queue Page Pool stats Alexander Lobakin
2023-11-29 13:40 ` Alexander Lobakin
2023-11-29 14:29 ` Jakub Kicinski
2023-11-30 16:01 ` Alexander Lobakin
2023-11-30 16:45 ` Alexander Lobakin
2023-12-01 6:55 ` Jakub Kicinski
2023-11-24 15:47 ` [Intel-wired-lan] [PATCH net-next v5 14/14] iavf: switch queue stats to libie Alexander Lobakin
2023-11-27 9:04 ` [Intel-wired-lan] [PATCH net-next v5 00/14] net: intel: start The Great Code Dedup + Page Pool for iavf Jiri Pirko
2023-11-27 10:23 ` Przemek Kitszel
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20231124154732.1623518-4-aleksander.lobakin@intel.com \
--to=aleksander.lobakin@intel.com \
--cc=alexanderduyck@fb.com \
--cc=davem@davemloft.net \
--cc=drc@linux.vnet.ibm.com \
--cc=edumazet@google.com \
--cc=hawk@kernel.org \
--cc=ilias.apalodimas@linaro.org \
--cc=intel-wired-lan@lists.osuosl.org \
--cc=kuba@kernel.org \
--cc=larysa.zaremba@intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linyunsheng@huawei.com \
--cc=maciej.fijalkowski@intel.com \
--cc=michal.kubiak@intel.com \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=pmenzel@molgen.mpg.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox