From: Jesper Dangaard Brouer <brouer@redhat.com>
To: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Cc: mcroce@redhat.com, Lorenzo Bianconi <lorenzo@kernel.org>,
netdev@vger.kernel.org, davem@davemloft.net,
ilias.apalodimas@linaro.org, jonathan.lemon@gmail.com,
brouer@redhat.com
Subject: Re: [PATCH v4 net-next 2/3] net: page_pool: add the possibility to sync DMA memory for device
Date: Tue, 19 Nov 2019 22:17:18 +0100 [thread overview]
Message-ID: <20191119221718.3d050008@carbon> (raw)
In-Reply-To: <20191119152543.GD3449@localhost.localdomain>
On Tue, 19 Nov 2019 17:25:43 +0200
Lorenzo Bianconi <lorenzo.bianconi@redhat.com> wrote:
> > On Tue, 19 Nov 2019 14:14:30 +0200
> > Lorenzo Bianconi <lorenzo.bianconi@redhat.com> wrote:
> >
> > > > On Mon, 18 Nov 2019 15:33:45 +0200
> > > > Lorenzo Bianconi <lorenzo@kernel.org> wrote:
> > > >
> > > > > diff --git a/include/net/page_pool.h b/include/net/page_pool.h
> > > > > index 1121faa99c12..6f684c3a3434 100644
> > > > > --- a/include/net/page_pool.h
> > > > > +++ b/include/net/page_pool.h
> > > > > @@ -34,8 +34,15 @@
> > > > > #include <linux/ptr_ring.h>
> > > > > #include <linux/dma-direction.h>
> > > > >
> > > > > -#define PP_FLAG_DMA_MAP 1 /* Should page_pool do the DMA map/unmap */
> > > > > -#define PP_FLAG_ALL PP_FLAG_DMA_MAP
> > > > > +#define PP_FLAG_DMA_MAP 1 /* Should page_pool do the DMA map/unmap */
> > > > > +#define PP_FLAG_DMA_SYNC_DEV 2 /* if set all pages that the driver gets
> > > > > + * from page_pool will be
> > > > > + * DMA-synced-for-device according to the
> > > > > + * length provided by the device driver.
> > > > > + * Please note DMA-sync-for-CPU is still
> > > > > + * device driver responsibility
> > > > > + */
> > > > > +#define PP_FLAG_ALL (PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV)
> > > > >
> > > > [...]
> > > >
> > > > Can you please change this to use the BIT(x) API.
> > > >
> > > > #include <linux/bits.h>
> > > >
> > > > #define PP_FLAG_DMA_MAP BIT(0)
> > > > #define PP_FLAG_DMA_SYNC_DEV BIT(1)
> > >
> > > Hi Jesper,
> > >
> > > sure, will do in v5
> > >
> > > >
> > > >
> > > >
> > > > > diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> > > > > index dfc2501c35d9..4f9aed7bce5a 100644
> > > > > --- a/net/core/page_pool.c
> > > > > +++ b/net/core/page_pool.c
> > > > > @@ -47,6 +47,13 @@ static int page_pool_init(struct page_pool *pool,
> > > > > (pool->p.dma_dir != DMA_BIDIRECTIONAL))
> > > > > return -EINVAL;
> > > > >
> > > > > + /* In order to request DMA-sync-for-device the page needs to
> > > > > + * be mapped
> > > > > + */
> > > > > + if ((pool->p.flags & PP_FLAG_DMA_SYNC_DEV) &&
> > > > > + !(pool->p.flags & PP_FLAG_DMA_MAP))
> > > > > + return -EINVAL;
> > > > > +
> > > >
> > > > I like that you have moved this check to setup time.
> > > >
> > > > There are two other parameters the DMA_SYNC_DEV depend on:
> > > >
> > > > struct page_pool_params pp_params = {
> > > > .order = 0,
> > > > - .flags = PP_FLAG_DMA_MAP,
> > > > + .flags = PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV,
> > > > .pool_size = size,
> > > > .nid = cpu_to_node(0),
> > > > .dev = pp->dev->dev.parent,
> > > > .dma_dir = xdp_prog ? DMA_BIDIRECTIONAL : DMA_FROM_DEVICE,
> > > > + .offset = pp->rx_offset_correction,
> > > > + .max_len = MVNETA_MAX_RX_BUF_SIZE,
> > > > };
> > > >
> > > > Can you add a check that .max_len must not be zero? The reason is
> > > > that I can easily see people misconfiguring this, and the effect is
> > > > that the DMA-sync-for-device is essentially disabled without the user
> > > > realizing it. The not-realizing part is really bad, especially
> > > > because the bugs that can occur from this are very rare and hard to catch.
> > >
> > > I guess we need to check it only if PP_FLAG_DMA_SYNC_DEV is provided.
> > > Something like:
> > >
> > > if (pool->p.flags & PP_FLAG_DMA_SYNC_DEV) {
> > > if (!(pool->p.flags & PP_FLAG_DMA_MAP))
> > > return -EINVAL;
> > >
> > > if (!pool->p.max_len)
> > > return -EINVAL;
> > > }
> >
> > Yes, exactly.
> >
>
> ack, I will add it to v5
>
> > > >
> > > > I'm up for discussing if there should be a similar check for .offset.
> > > > IMHO we should also check .offset is configured, and then be open to
> > > > remove this check once a driver user want to use offset=0. Does the
> > > > mvneta driver already have a use-case for this (in non-XDP mode)?
> > >
> > > With 'non-XDP mode' do you mean not loading a BPF program? If so, yes: it is
> > > used in __page_pool_alloc_pages_slow, getting pages from the page allocator.
> > > What would be the right minimum value for it? Just 0, or
> > > XDP_PACKET_HEADROOM/NET_SKB_PAD? I guess it matters here whether a BPF
> > > program is loaded or not.
> >
> > I think you are saying that we need to allow .offset == 0, because it is
> > used by mvneta. Did I understand that correctly?
>
> I was just wondering what the right value for the minimum offset is, but
> rethinking about it, yes, there is a condition where mvneta uses an
> offset of 0 (it is the regression reported by Andrew, when mvneta is
> running on a hw bm device but the bm code is not compiled in). Do you
> think we can skip this check for the moment, until we fix XDP on that
> particular board?
Yes. I guess we just accept that .offset can be zero; requiring it to be
non-zero would be an artificial limitation.

The check is not important if the API is used correctly. It comes from my
API design philosophy for page_pool, which is "easy to use, and hard to
misuse". This is a case of catching "misuse" and signaling that the
config was wrong. The check for pool->p.max_len should be enough for a
driver developer to notice that they also need to set .offset.

Maybe a comment next to the pool->p.max_len check mentioning "offset"
will be enough. Given you return the catch-all -EINVAL, you force the
driver developer to read the code in page_pool_init(), which IMHO is
sufficiently clear.
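
To make the conclusion above concrete, here is a user-space sketch of
the setup-time validation being discussed; it is not the real kernel
code, the struct and helper name (check_sync_params) are simplified
stand-ins for illustration only:

```c
/* Hypothetical sketch of the page_pool_init() flag checks discussed
 * above, compiled as plain user-space C for illustration.  The real
 * struct page_pool_params lives in include/net/page_pool.h. */
#include <errno.h>

#define PP_FLAG_DMA_MAP      (1U << 0)
#define PP_FLAG_DMA_SYNC_DEV (1U << 1)

struct page_pool_params {
	unsigned int flags;
	unsigned int max_len;	/* length to DMA-sync-for-device */
	unsigned int offset;	/* start offset of the sync region */
};

static int check_sync_params(const struct page_pool_params *p)
{
	if (p->flags & PP_FLAG_DMA_SYNC_DEV) {
		/* In order to request DMA-sync-for-device the page
		 * needs to be mapped by page_pool itself. */
		if (!(p->flags & PP_FLAG_DMA_MAP))
			return -EINVAL;

		/* max_len == 0 would silently disable the sync, so
		 * catch the misconfiguration at setup time.  Note that
		 * .offset may legitimately be 0 (e.g. mvneta), but
		 * drivers should double-check it is what they intend. */
		if (!p->max_len)
			return -EINVAL;
	}
	return 0;
}
```

A driver that sets PP_FLAG_DMA_SYNC_DEV together with PP_FLAG_DMA_MAP
and a non-zero .max_len passes the check; forgetting either one fails
with -EINVAL at setup, rather than surfacing later as a rare, hard-to-
catch DMA-coherency bug.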
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
LinkedIn: http://www.linkedin.com/in/brouer