From: Lorenzo Bianconi <lorenzo@kernel.org>
To: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: netdev@vger.kernel.org, lorenzo.bianconi@redhat.com,
davem@davemloft.net, thomas.petazzoni@bootlin.com,
ilias.apalodimas@linaro.org, matteo.croce@redhat.com
Subject: Re: [PATCH net-next 2/3] net: page_pool: add the possibility to sync DMA memory for non-coherent devices
Date: Mon, 11 Nov 2019 21:11:50 +0200
Message-ID: <20191111191150.GF4197@localhost.localdomain>
In-Reply-To: <20191111174835.7344731b@carbon>
> On Sun, 10 Nov 2019 14:09:09 +0200
> Lorenzo Bianconi <lorenzo@kernel.org> wrote:
>
> > Introduce the following parameters in order to add the possibility to sync
> > DMA memory area before putting allocated buffers in the page_pool caches:
>
> > - sync: set to 1 if device is non cache-coherent and needs to flush DMA area
>
Hi Jesper,
thx for the review
> I don't agree that this is only for non cache-coherent devices.
>
> This change is generally for all device drivers. By setting 'sync'
> (which I would prefer to rename 'dma_sync'), the driver requests that
> page_pool take over doing DMA-sync-for-device. (Very important:
> DMA-sync-for-CPU is still the driver's responsibility.) Drivers can
> benefit from removing their calls to dma_sync_single_for_device().
>
> We need to define meaning/semantics of this setting (my definition):
> - This means that all pages that driver gets from page_pool, will be
> DMA-synced-for-device.
ack, will fix it in v2
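
To make the three parameters concrete, here is a minimal userspace sketch
of how a driver might fill them in. The struct is only a stand-in for the
fields this patch adds to struct page_pool_params, and
model_driver_params(), page_size and headroom are hypothetical names used
for illustration. Setting max_len to the rest of the page after the
headroom is an assumption, not something the patch mandates.

```c
#include <assert.h>

/* Userspace stand-in for the fields this patch adds to
 * struct page_pool_params; NOT the kernel definition. */
struct pp_params_model {
	unsigned int max_len;	/* max DMA sync memory size */
	unsigned int offset;	/* DMA addr offset where RX data starts */
	unsigned char sync;	/* 1: page_pool does DMA-sync-for-device */
};

/* Hypothetical driver helper: page_size and headroom are example inputs. */
static struct pp_params_model model_driver_params(unsigned int page_size,
						  unsigned int headroom)
{
	struct pp_params_model p = { 0 };

	/* The RX headroom has to leave room for data in the page. */
	assert(headroom < page_size);

	p.sync = 1;			/* ask page_pool to sync for device */
	p.offset = headroom;		/* DMA engine writes after the headroom */
	p.max_len = page_size - headroom; /* assumed: rest of the page */
	return p;
}
```

Even with sync set, the driver would still issue its own DMA-sync-for-CPU
before touching the received data.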
>
> > - offset: DMA address offset where the DMA engine starts copying rx data
>
> > - max_len: maximum DMA memory size page_pool is allowed to flush. This
> > is currently used in __page_pool_alloc_pages_slow routine when pages
> > are allocated from page allocator
>
> Implementation-wise, you did as I suggested offlist and do the
> DMA-sync-for-device at return time, in page_pool_put_page(), because
> we (often) know the length that was/can be touched by the CPU. Knowing
> this length is key to the optimization.
Right, when refilling the cache we know the exact length that was/can be
touched by the CPU.
>
> I also think you/we need to explain why this optimization is correct,
> my attempt:
>
> This optimization reduces the length of the DMA-sync-for-device. The
> optimization is valid because the page is initially
> DMA-synced-for-device, as defined via max_len. At driver RX time, the
> driver will do a DMA-sync-for-CPU on the memory for the packet length.
> What is important is the memory occupied by the packet payload, because
> this is the memory the CPU is allowed to read and modify. If the CPU
> has not written into a cache-line, then we know that the CPU will not
> be flushing it, thus it doesn't need a DMA-sync-for-device. As we
> don't track which cache-lines were written into, simply use the full
> packet length as dma_sync_size at page_pool recycle time. This also
> takes into account any tail-extend.
ack, will update it in v2
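
The length argument above can be modelled in userspace as follows. This is
only a sketch: it mirrors the min() clamp from
page_pool_dma_sync_for_device() in the quoted patch, but
model_sync_for_device(), struct model_sync_range and MODEL_PAGE_SIZE are
hypothetical names, and no real DMA call is made.

```c
#include <assert.h>

#define MODEL_PAGE_SIZE 4096u	/* assumed page size for this model */

/* Region handed to the (modelled) DMA-sync-for-device call. */
struct model_sync_range {
	unsigned int start;	/* offset from the page's DMA address */
	unsigned int len;	/* number of bytes synced */
};

/* Same clamp as page_pool_dma_sync_for_device() in the quoted patch:
 * never sync more than max_len, starting at the pool's offset. */
static struct model_sync_range model_sync_for_device(unsigned int offset,
						      unsigned int max_len,
						      unsigned int dma_sync_size)
{
	struct model_sync_range r;

	/* The pool configuration must describe a region inside the page. */
	assert(offset <= MODEL_PAGE_SIZE && max_len <= MODEL_PAGE_SIZE - offset);

	r.start = offset;
	r.len = dma_sync_size < max_len ? dma_sync_size : max_len;
	return r;
}
```

With offset 256 and max_len 3840, a freshly allocated page is modelled as
syncing the full 3840 bytes, while recycling a page that carried a
1500-byte packet syncs only 1500 bytes. A dma_sync_size larger than
max_len is clamped back to 3840.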
Regards,
Lorenzo
>
>
> > These parameters are supposed to be set by device drivers
>
>
>
> > Tested-by: Matteo Croce <mcroce@redhat.com>
> > Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
> > ---
> > include/net/page_pool.h | 11 +++++++----
> > net/core/page_pool.c | 39 +++++++++++++++++++++++++++++++++------
> > 2 files changed, 40 insertions(+), 10 deletions(-)
> >
> > diff --git a/include/net/page_pool.h b/include/net/page_pool.h
> > index 2cbcdbdec254..defbfd90ab46 100644
> > --- a/include/net/page_pool.h
> > +++ b/include/net/page_pool.h
> > @@ -65,6 +65,9 @@ struct page_pool_params {
> > int nid; /* Numa node id to allocate from pages from */
> > struct device *dev; /* device, for DMA pre-mapping purposes */
> > enum dma_data_direction dma_dir; /* DMA mapping direction */
> > + unsigned int max_len; /* max DMA sync memory size */
> > + unsigned int offset; /* DMA addr offset */
> > + u8 sync;
> > };
> >
> > struct page_pool {
> > @@ -150,8 +153,8 @@ static inline void page_pool_destroy(struct page_pool *pool)
> > }
> >
> > /* Never call this directly, use helpers below */
> > -void __page_pool_put_page(struct page_pool *pool,
> > - struct page *page, bool allow_direct);
> > +void __page_pool_put_page(struct page_pool *pool, struct page *page,
> > + unsigned int dma_sync_size, bool allow_direct);
> >
> > static inline void page_pool_put_page(struct page_pool *pool,
> > struct page *page, bool allow_direct)
> > @@ -160,14 +163,14 @@ static inline void page_pool_put_page(struct page_pool *pool,
> > * allow registering MEM_TYPE_PAGE_POOL, but shield linker.
> > */
> > #ifdef CONFIG_PAGE_POOL
> > - __page_pool_put_page(pool, page, allow_direct);
> > + __page_pool_put_page(pool, page, 0, allow_direct);
> > #endif
> > }
> > /* Very limited use-cases allow recycle direct */
> > static inline void page_pool_recycle_direct(struct page_pool *pool,
> > struct page *page)
> > {
> > - __page_pool_put_page(pool, page, true);
> > + __page_pool_put_page(pool, page, 0, true);
> > }
> >
> > /* API user MUST have disconnected alloc-side (not allowed to call
> > diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> > index 5bc65587f1c4..af9514c2d15b 100644
> > --- a/net/core/page_pool.c
> > +++ b/net/core/page_pool.c
> > @@ -112,6 +112,17 @@ static struct page *__page_pool_get_cached(struct page_pool *pool)
> > return page;
> > }
> >
> > +/* Used for non-coherent devices */
> > +static void page_pool_dma_sync_for_device(struct page_pool *pool,
> > + struct page *page,
> > + unsigned int dma_sync_size)
> > +{
> > + dma_sync_size = min(dma_sync_size, pool->p.max_len);
> > + dma_sync_single_range_for_device(pool->p.dev, page->dma_addr,
> > + pool->p.offset, dma_sync_size,
> > + pool->p.dma_dir);
> > +}
> > +
> > /* slow path */
> > noinline
> > static struct page *__page_pool_alloc_pages_slow(struct page_pool *pool,
> > @@ -156,6 +167,10 @@ static struct page *__page_pool_alloc_pages_slow(struct page_pool *pool,
> > }
> > page->dma_addr = dma;
> >
> > + /* non-coherent devices - flush memory */
> > + if (pool->p.sync)
> > + page_pool_dma_sync_for_device(pool, page, pool->p.max_len);
> > +
> > skip_dma_map:
> > /* Track how many pages are held 'in-flight' */
> > pool->pages_state_hold_cnt++;
> > @@ -255,7 +270,8 @@ static void __page_pool_return_page(struct page_pool *pool, struct page *page)
> > }
> >
> > static bool __page_pool_recycle_into_ring(struct page_pool *pool,
> > - struct page *page)
> > + struct page *page,
> > + unsigned int dma_sync_size)
> > {
> > int ret;
> > /* BH protection not needed if current is serving softirq */
> > @@ -264,6 +280,10 @@ static bool __page_pool_recycle_into_ring(struct page_pool *pool,
> > else
> > ret = ptr_ring_produce_bh(&pool->ring, page);
> >
> > + /* non-coherent devices - flush memory */
> > + if (ret == 0 && pool->p.sync)
> > + page_pool_dma_sync_for_device(pool, page, dma_sync_size);
> > +
> > return (ret == 0) ? true : false;
> > }
> >
> > @@ -273,18 +293,23 @@ static bool __page_pool_recycle_into_ring(struct page_pool *pool,
> > * Caller must provide appropriate safe context.
> > */
> > static bool __page_pool_recycle_direct(struct page *page,
> > - struct page_pool *pool)
> > + struct page_pool *pool,
> > + unsigned int dma_sync_size)
> > {
> > if (unlikely(pool->alloc.count == PP_ALLOC_CACHE_SIZE))
> > return false;
> >
> > /* Caller MUST have verified/know (page_ref_count(page) == 1) */
> > pool->alloc.cache[pool->alloc.count++] = page;
> > +
> > + /* non-coherent devices - flush memory */
> > + if (pool->p.sync)
> > + page_pool_dma_sync_for_device(pool, page, dma_sync_size);
> > return true;
> > }
> >
> > -void __page_pool_put_page(struct page_pool *pool,
> > - struct page *page, bool allow_direct)
> > +void __page_pool_put_page(struct page_pool *pool, struct page *page,
> > + unsigned int dma_sync_size, bool allow_direct)
> > {
> > /* This allocator is optimized for the XDP mode that uses
> > * one-frame-per-page, but have fallbacks that act like the
> > @@ -296,10 +321,12 @@ void __page_pool_put_page(struct page_pool *pool,
> > /* Read barrier done in page_ref_count / READ_ONCE */
> >
> > if (allow_direct && in_serving_softirq())
> > - if (__page_pool_recycle_direct(page, pool))
> > + if (__page_pool_recycle_direct(page, pool,
> > + dma_sync_size))
> > return;
> >
> > - if (!__page_pool_recycle_into_ring(pool, page)) {
> > + if (!__page_pool_recycle_into_ring(pool, page,
> > + dma_sync_size)) {
> > /* Cache full, fallback to free pages */
> > __page_pool_return_page(pool, page);
> > }
>
>
>
> --
> Best regards,
> Jesper Dangaard Brouer
> MSc.CS, Principal Kernel Engineer at Red Hat
> LinkedIn: http://www.linkedin.com/in/brouer
>