From: Jesper Dangaard Brouer <brouer@redhat.com>
To: "Jonathan Lemon" <jonathan.lemon@gmail.com>
Cc: "Toke Høiland-Jørgensen" <toke@redhat.com>,
	netdev@vger.kernel.org,
	"Ilias Apalodimas" <ilias.apalodimas@linaro.org>,
	"Saeed Mahameed" <saeedm@mellanox.com>,
	"Matteo Croce" <mcroce@redhat.com>,
	"Lorenzo Bianconi" <lorenzo@kernel.org>,
	"Tariq Toukan" <tariqt@mellanox.com>,
	brouer@redhat.com
Subject: Re: [net-next v1 PATCH 1/2] xdp: revert forced mem allocator removal for page_pool
Date: Sun, 10 Nov 2019 08:59:39 +0100
Message-ID: <20191110085939.23013f83@carbon>
In-Reply-To: <5FDB1D3C-3A80-4F70-A7F0-03D4CD4061EB@gmail.com>

On Sat, 09 Nov 2019 09:34:50 -0800
"Jonathan Lemon" <jonathan.lemon@gmail.com> wrote:

> On 9 Nov 2019, at 8:11, Jesper Dangaard Brouer wrote:
> 
> > On Fri, 08 Nov 2019 11:16:43 -0800
> > "Jonathan Lemon" <jonathan.lemon@gmail.com> wrote:
> >  
> >>> diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> >>> index 5bc65587f1c4..226f2eb30418 100644
> >>> --- a/net/core/page_pool.c
> >>> +++ b/net/core/page_pool.c
> >>> @@ -346,7 +346,7 @@ static void __warn_in_flight(struct page_pool
> >>> *pool)
> >>>
> >>>  	distance = _distance(hold_cnt, release_cnt);
> >>>
> >>> -	/* Drivers should fix this, but only problematic when DMA is used */
> >>> +	/* BUG but warn as kernel should crash later */
> >>>  	WARN(1, "Still in-flight pages:%d hold:%u released:%u",
> >>>  	     distance, hold_cnt, release_cnt);  
> >
> > Because this is kept as a WARN, I set pool->ring.queue = NULL later.  
> 
> ... which is also an API violation, reaching into the ring internals.
> I strongly dislike this.

I understand your dislike of reaching into ptr_ring "internals".
My plan was to add this here, and then, in a follow-up patch, move the
pool->ring.queue = NULL assignment into ptr_ring itself.
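
To illustrate the follow-up described above, here is a minimal userspace
sketch of a ring whose own cleanup helper clears the queue pointer, so the
caller never has to touch the field. Everything named toy_* is hypothetical
and only models the idea; it is not the kernel's ptr_ring API.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a pointer ring; NOT the kernel's struct ptr_ring. */
struct toy_ring {
	void **queue;	/* backing array of slots, NULL once torn down */
	int size;	/* number of slots in queue */
};

/* Allocate the slot array; returns 0 on success, -1 on failure. */
static int toy_ring_init(struct toy_ring *r, int size)
{
	r->queue = calloc((size_t)size, sizeof(*r->queue));
	r->size = r->queue ? size : 0;
	return r->queue ? 0 : -1;
}

/*
 * Free the slot array and clear the pointer inside the helper itself.
 * A late user then dereferences NULL instead of freed memory, and
 * callers never need to write to r->queue directly.
 */
static void toy_ring_cleanup(struct toy_ring *r)
{
	free(r->queue);
	r->queue = NULL;
	r->size = 0;
}

/* Lets callers query the teardown state without reading the field. */
static int toy_ring_is_dead(const struct toy_ring *r)
{
	return r->queue == NULL;
}
```

With a helper like this, the pool teardown path would call
toy_ring_cleanup() only and leave the ring's fields alone.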

 
> >>>  }
> >>> @@ -360,12 +360,16 @@ void __page_pool_free(struct page_pool *pool)
> >>>  	WARN(pool->alloc.count, "API usage violation");
> >>>  	WARN(!ptr_ring_empty(&pool->ring), "ptr_ring is not empty");
> >>>
> >>> -	/* Can happen due to forced shutdown */
> >>>  	if (!__page_pool_safe_to_destroy(pool))
> >>>  		__warn_in_flight(pool);  
> >>
> >> If it's not safe to destroy, we shouldn't be getting here.  
> >
> > Don't make such assumptions. The API is going to be used by driver
> > developer and they are always a little too creative...  
> 
> If the driver hits this case, the driver has a bug, and it isn't
> safe to continue in any fashion.  The developer needs to fix their
> driver in that case.  (see stmmac code)

The stmmac driver is NOT broken; it simply uses page_pool as its
driver-level page cache.  That is exactly what page_pool was designed
for: a generic page cache for drivers to use.  They use it to simplify
their driver.  They don't use the advanced features, which require
hooking into the memory model registration.

> 
> > The page_pool is a separate facility, it is not tied to the
> > xdp_rxq_info memory model.  Some drivers use page_pool directly e.g.
> > drivers/net/ethernet/stmicro/stmmac.  It can easily trigger this case,
> > when some extend that driver.  
> 
> Yes, and I pointed out that the mem_info should likely be completely
> detached from xdp.c since it really has nothing to do with XDP.
> The stmmac driver is actually broken at the moment, as it tries to
> free the pool immediately without a timeout.
> 
> What should be happening is that drivers just call page_pool_destroy(),
> which kicks off the shutdown process if this was the last user ref,
> and delays destruction if packets are in flight.

Sorry, but I'm getting frustrated with you. I've already explained to
you (off-list) that the memory model reg/unreg system was created to
support multiple memory models (even per RX-queue).  We already have
AF_XDP zero copy, and I want to keep the flexibility to add more in the
future.

 
> >>>  	ptr_ring_cleanup(&pool->ring, NULL);
> >>>
> >>> +	/* Make sure kernel will crash on use-after-free */
> >>> +	pool->ring.queue = NULL;
> >>> +	pool->alloc.cache[PP_ALLOC_CACHE_SIZE - 1] = NULL;
> >>> +	pool->alloc.count = PP_ALLOC_CACHE_SIZE;  
> >>
> >> The pool is going to be freed.  This is useless code; if we're
> >> really concerned about use-after-free, the correct place for catching
> >> this is with the memory-allocator tools, not scattering things like
> >> this ad-hoc over the codebase.  
> >
> > No, I need this code here, because we kept the above WARN() and didn't
> > change that into a BUG().  It is obviously not a full solution for
> > use-after-free detection.  The memory subsystem have kmemleak to catch
> > this kind of stuff, but nobody runs this in production.  I need this
> > here to catch some obvious runtime cases.  
> 
> The WARN() indicates something went off the rails already.  I really
> don't like half-assed solutions like the above; it may or may not work
> properly.  If it doesn't work properly, then what's the point?

So, you are suggesting to use BUG_ON() instead and crash the kernel
immediately... you do know Linus hates when we do that, right?
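
For readers following the WARN-versus-BUG argument, the check being
discussed can be modelled in userspace roughly as below. This is a sketch
under assumptions: the toy_* names are hypothetical, the counters are taken
to be free-running 32-bit values as the "%u" in the WARN text suggests, and
the signed-difference trick is one common way to stay correct across
counter wraparound. It reports the problem and lets the caller decide,
instead of aborting.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical counters; the names only mirror those in the WARN text. */
struct toy_pool {
	uint32_t hold_cnt;	/* pages handed out so far */
	uint32_t release_cnt;	/* pages returned so far */
};

/* Signed difference of two free-running counters, valid across u32 wrap. */
static int32_t toy_distance(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b);
}

/* Non-zero when no pages are outstanding. */
static int toy_safe_to_destroy(const struct toy_pool *p)
{
	return toy_distance(p->hold_cnt, p->release_cnt) == 0;
}

/*
 * Warn-style check: print a diagnostic and return the in-flight count
 * rather than terminating the program.
 */
static int32_t toy_check_in_flight(const struct toy_pool *p)
{
	int32_t distance = toy_distance(p->hold_cnt, p->release_cnt);

	if (distance != 0)
		fprintf(stderr,
			"Still in-flight pages:%d hold:%u released:%u\n",
			(int)distance, (unsigned)p->hold_cnt,
			(unsigned)p->release_cnt);
	return distance;
}
```

A caller that gets a non-zero result can still poison its state or defer
the free; a BUG-style check would remove that choice.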

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer


