From: Jesper Dangaard Brouer <brouer@redhat.com>
To: "Jonathan Lemon" <jonathan.lemon@gmail.com>
Cc: "Toke Høiland-Jørgensen" <toke@redhat.com>,
netdev@vger.kernel.org,
"Ilias Apalodimas" <ilias.apalodimas@linaro.org>,
"Saeed Mahameed" <saeedm@mellanox.com>,
"Matteo Croce" <mcroce@redhat.com>,
"Lorenzo Bianconi" <lorenzo@kernel.org>,
"Tariq Toukan" <tariqt@mellanox.com>,
brouer@redhat.com
Subject: Re: [net-next v1 PATCH 1/2] xdp: revert forced mem allocator removal for page_pool
Date: Sun, 10 Nov 2019 08:59:39 +0100
Message-ID: <20191110085939.23013f83@carbon>
In-Reply-To: <5FDB1D3C-3A80-4F70-A7F0-03D4CD4061EB@gmail.com>
On Sat, 09 Nov 2019 09:34:50 -0800
"Jonathan Lemon" <jonathan.lemon@gmail.com> wrote:
> On 9 Nov 2019, at 8:11, Jesper Dangaard Brouer wrote:
>
> > On Fri, 08 Nov 2019 11:16:43 -0800
> > "Jonathan Lemon" <jonathan.lemon@gmail.com> wrote:
> >
> >>> diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> >>> index 5bc65587f1c4..226f2eb30418 100644
> >>> --- a/net/core/page_pool.c
> >>> +++ b/net/core/page_pool.c
> >>> @@ -346,7 +346,7 @@ static void __warn_in_flight(struct page_pool *pool)
> >>>
> >>> distance = _distance(hold_cnt, release_cnt);
> >>>
> >>> - /* Drivers should fix this, but only problematic when DMA is used */
> >>> + /* BUG but warn as kernel should crash later */
> >>> WARN(1, "Still in-flight pages:%d hold:%u released:%u",
> >>> distance, hold_cnt, release_cnt);
> >
> > Because this is kept as a WARN, I set pool->ring.queue = NULL later.
>
> ... which is also an API violation, reaching into the ring internals.
> I strongly dislike this.
I understand your dislike of reaching into ptr_ring "internals".
My plan was to add this here first, and then in a follow-up patch move
the pool->ring.queue = NULL assignment into ptr_ring itself.
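A minimal userspace sketch of what such a cleanup helper could look
like, so the poisoning lives with the ring instead of in page_pool.
This is not the kernel's ptr_ring API: struct toy_ring and the
toy_ring_*() functions are hypothetical names used only for
illustration, and malloc/free stand in for the kernel allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-in for the ring; NOT the kernel's struct ptr_ring. */
struct toy_ring {
	void **queue;	/* slot array, NULL once cleaned up */
	int size;	/* number of slots */
};

/* Allocate an empty ring; returns 0 on success, -1 on failure. */
static int toy_ring_init(struct toy_ring *r, int size)
{
	assert(r != NULL && size > 0);
	r->queue = calloc((size_t)size, sizeof(*r->queue));
	if (!r->queue)
		return -1;
	r->size = size;
	return 0;
}

/*
 * Hypothetical cleanup helper: free the slot array and clear the
 * pointer, so a late user dereferences NULL (and faults right away)
 * instead of silently touching freed memory.
 */
static void toy_ring_cleanup_poison(struct toy_ring *r)
{
	assert(r != NULL);
	free(r->queue);
	r->queue = NULL;
	r->size = 0;
}

/* Lets a caller test for a cleaned-up ring without touching slots. */
static int toy_ring_is_dead(const struct toy_ring *r)
{
	return r->queue == NULL;
}
```

With the assignment inside the cleanup helper, callers such as
page_pool would no longer need to know the layout of the ring.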
> >>> }
> >>> @@ -360,12 +360,16 @@ void __page_pool_free(struct page_pool *pool)
> >>> WARN(pool->alloc.count, "API usage violation");
> >>> WARN(!ptr_ring_empty(&pool->ring), "ptr_ring is not empty");
> >>>
> >>> - /* Can happen due to forced shutdown */
> >>> if (!__page_pool_safe_to_destroy(pool))
> >>> __warn_in_flight(pool);
> >>
> >> If it's not safe to destroy, we shouldn't be getting here.
> >
> > Don't make such assumptions. The API is going to be used by driver
> > developers, and they are always a little too creative...
>
> If the driver hits this case, the driver has a bug, and it isn't
> safe to continue in any fashion. The developer needs to fix their
> driver in that case. (see stmmac code)
The stmmac driver is NOT broken; they simply use page_pool as their
driver-level page cache. That is exactly what page_pool was designed
for: a generic page cache for drivers to use. They use it to simplify
their driver. They don't use the advanced features, which require
hooking into the memory model registration.
>
> > The page_pool is a separate facility, it is not tied to the
> > xdp_rxq_info memory model. Some drivers use page_pool directly e.g.
> > drivers/net/ethernet/stmicro/stmmac. It can easily trigger this case,
> > when someone extends that driver.
>
> Yes, and I pointed out that the mem_info should likely be completely
> detached from xdp.c since it really has nothing to do with XDP.
> The stmmac driver is actually broken at the moment, as it tries to
> free the pool immediately without a timeout.
>
> What should be happening is that drivers just call page_pool_destroy(),
> which kicks off the shutdown process if this was the last user ref,
> and delays destruction if packets are in flight.
Sorry, but I'm getting frustrated with you. I've already explained to
you (off-list) that the memory model reg/unreg system was created to
support multiple memory models (even per RX-queue). We already have
AF_XDP zero copy, but I actually want to keep the flexibility and add
more in the future.
> >>> ptr_ring_cleanup(&pool->ring, NULL);
> >>>
> >>> + /* Make sure kernel will crash on use-after-free */
> >>> + pool->ring.queue = NULL;
> >>> + pool->alloc.cache[PP_ALLOC_CACHE_SIZE - 1] = NULL;
> >>> + pool->alloc.count = PP_ALLOC_CACHE_SIZE;
> >>
> >> The pool is going to be freed. This is useless code; if we're
> >> really concerned about use-after-free, the correct place for catching
> >> this is with the memory-allocator tools, not scattering things like
> >> this ad-hoc over the codebase.
> >
> > No, I need this code here, because we kept the above WARN() and didn't
> > change that into a BUG(). It is obviously not a full solution for
> > use-after-free detection. The memory subsystem has kmemleak to catch
> > this kind of stuff, but nobody runs this in production. I need this
> > here to catch some obvious runtime cases.
>
> The WARN() indicates something went off the rails already. I really
> don't like half-assed solutions like the above; it may or may not work
> properly. If it doesn't work properly, then what's the point?
So, you are suggesting to use BUG_ON() instead and crash the kernel
immediately... you do know Linus hates when we do that, right?
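For readers following the in-flight accounting that the WARN in the
quoted diff reports, here is a minimal userspace sketch of the check.
It assumes the hold and release counters are free-running 32-bit
values, as the %u format in the WARN suggests. struct toy_pool and the
toy_*() names are hypothetical and are not the page_pool API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in holding only the two counters from the WARN. */
struct toy_pool {
	uint32_t hold_cnt;	/* pages handed out so far */
	uint32_t release_cnt;	/* pages returned so far */
};

/*
 * Signed distance between two free-running counters. The unsigned
 * subtraction wraps modulo 2^32, so the result stays correct even
 * after hold_cnt has wrapped past zero and release_cnt has not.
 */
static int32_t toy_distance(uint32_t hold_cnt, uint32_t release_cnt)
{
	return (int32_t)(hold_cnt - release_cnt);
}

/* The pool may be torn down only when no pages are still in flight. */
static int toy_safe_to_destroy(const struct toy_pool *p)
{
	return toy_distance(p->hold_cnt, p->release_cnt) == 0;
}
```

A non-zero distance at free time is the situation this thread is
arguing about: warn and poison, or crash immediately.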
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
LinkedIn: http://www.linkedin.com/in/brouer