netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jesper Dangaard Brouer <brouer@redhat.com>
To: Saeed Mahameed <saeedm@mellanox.com>
Cc: "jonathan.lemon@gmail.com" <jonathan.lemon@gmail.com>,
	"linyunsheng@huawei.com" <linyunsheng@huawei.com>,
	Li Rongqing <lirongqing@baidu.com>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	"ilias.apalodimas@linaro.org" <ilias.apalodimas@linaro.org>,
	brouer@redhat.com
Subject: Re: [PATCH][v2] page_pool: handle page recycle for NUMA_NO_NODE condition
Date: Mon, 9 Dec 2019 13:14:16 +0100	[thread overview]
Message-ID: <20191209131416.238d4ae4@carbon> (raw)
In-Reply-To: <9fecbff3518d311ec7c3aee9ae0315a73682a4af.camel@mellanox.com>

On Sat, 7 Dec 2019 03:52:41 +0000
Saeed Mahameed <saeedm@mellanox.com> wrote:

> On Fri, 2019-12-06 at 17:32 +0800, Li RongQing wrote:
> > some drivers uses page pool, but not require to allocate
> > pages from bound node, or simply assign pool.p.nid to
> > NUMA_NO_NODE, and the commit d5394610b1ba ("page_pool:
> > Don't recycle non-reusable pages") will block this kind
> > of driver to recycle
> > 
> > so take page as reusable when page belongs to current
> > memory node if nid is NUMA_NO_NODE
> > 
> > v1-->v2: add check with numa_mem_id from Yunsheng
> > 
> > Fixes: d5394610b1ba ("page_pool: Don't recycle non-reusable pages")
> > Signed-off-by: Li RongQing <lirongqing@baidu.com>
> > Suggested-by: Yunsheng Lin <linyunsheng@huawei.com>
> > Cc: Saeed Mahameed <saeedm@mellanox.com>
> > ---
> >  net/core/page_pool.c | 7 ++++++-
> >  1 file changed, 6 insertions(+), 1 deletion(-)
> > 
> > diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> > index a6aefe989043..3c8b51ccd1c1 100644
> > --- a/net/core/page_pool.c
> > +++ b/net/core/page_pool.c
> > @@ -312,12 +312,17 @@ static bool __page_pool_recycle_direct(struct
> > page *page,
> >  /* page is NOT reusable when:
> >   * 1) allocated when system is under some pressure.
> > (page_is_pfmemalloc)
> >   * 2) belongs to a different NUMA node than pool->p.nid.
> > + * 3) belongs to a different memory node than current context
> > + * if pool->p.nid is NUMA_NO_NODE
> >   *
> >   * To update pool->p.nid users must call page_pool_update_nid.
> >   */
> >  static bool pool_page_reusable(struct page_pool *pool, struct page
> > *page)
> >  {
> > -	return !page_is_pfmemalloc(page) && page_to_nid(page) == pool-  
> > >p.nid;  
> > +	return !page_is_pfmemalloc(page) &&
> > +		(page_to_nid(page) == pool->p.nid ||
> > +		(pool->p.nid == NUMA_NO_NODE &&
> > +		page_to_nid(page) == numa_mem_id()));
> >  }
> >    
> 
> Cc'ed Jesper, Ilias & Jonathan.
> 
> I don't think it is correct to check that the page nid is same as
> numa_mem_id() if pool is NUMA_NO_NODE. In such case we should allow all
> pages to recycle, because you can't assume where pages are allocated
> from and where they are being handled.
> 
> I suggest the following:
> 
> return !page_pfmemalloc() && 
> ( page_to_nid(page) == pool->p.nid || pool->p.nid == NUMA_NO_NODE );
> 
> 1) never recycle emergency pages, regardless of pool nid.
> 2) always recycle if pool is NUMA_NO_NODE.
> 
> the above change should not add any overhead, a modest branch predictor
> will handle this with no effort.
> 
> Jesper et al. what do you think?

The patch description doesn't explain the problem very well.

Lets first establish what the problem is.  After I took at closer look,
I do think we have a real problem here...

If function alloc_pages_node() is called with NUMA_NO_NODE (see below
signature), then the nid is re-assigned to numa_mem_id().

Our current code checks: page_to_nid(page) == pool->p.nid which seems
bogus, as pool->p.nid=NUMA_NO_NODE and the page NID will not return
NUMA_NO_NODE... as it was set to the local detect numa node, right?

So, we do need a fix... but the question is that semantics do we want?

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer


/*
 * Allocate pages, preferring the node given as nid. When nid == NUMA_NO_NODE,
 * prefer the current CPU's closest node. Otherwise node must be valid and
 * online.
 */
static inline struct page *alloc_pages_node(int nid, gfp_t gfp_mask,
						unsigned int order)
{
	if (nid == NUMA_NO_NODE)
		nid = numa_mem_id();

	return __alloc_pages_node(nid, gfp_mask, order);
}

static bool pool_page_reusable(struct page_pool *pool, struct page *page)
{
	return !page_is_pfmemalloc(page) && page_to_nid(page) == pool->p.nid;
}


  parent reply	other threads:[~2019-12-09 12:14 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-12-06  9:32 [PATCH][v2] page_pool: handle page recycle for NUMA_NO_NODE condition Li RongQing
2019-12-07  3:52 ` Saeed Mahameed
2019-12-09  1:31   ` Yunsheng Lin
2019-12-09  3:47     ` 答复: " Li,Rongqing
2019-12-09  9:30       ` Ilias Apalodimas
2019-12-09 10:37         ` 答复: " Li,Rongqing
2019-12-09 12:14   ` Jesper Dangaard Brouer [this message]
2019-12-09 23:34     ` Saeed Mahameed
2019-12-10  1:31       ` Yunsheng Lin
2019-12-10  9:39         ` 答复: " Li,Rongqing
2019-12-10 14:52           ` Ilias Apalodimas
2019-12-10 19:56           ` Saeed Mahameed
2019-12-10 19:45         ` Saeed Mahameed
2019-12-11  3:01           ` Yunsheng Lin
2019-12-11  3:06             ` Yunsheng Lin
2019-12-11 20:57             ` Saeed Mahameed
2019-12-12  1:04               ` Yunsheng Lin
2019-12-10 15:02       ` Ilias Apalodimas
2019-12-10 20:02         ` Saeed Mahameed
2019-12-10 20:10           ` Ilias Apalodimas
2019-12-11 18:49   ` Jesper Dangaard Brouer
2019-12-11 21:24     ` Saeed Mahameed
2019-12-12  1:34       ` Yunsheng Lin
2019-12-12 10:18         ` Jesper Dangaard Brouer
2019-12-13  3:40           ` Yunsheng Lin
2019-12-13  6:27             ` 答复: " Li,Rongqing
2019-12-13  6:53               ` Yunsheng Lin
2019-12-13  8:48                 ` Jesper Dangaard Brouer
2019-12-16  1:51                   ` Yunsheng Lin
2019-12-16  4:02                     ` 答复: " Li,Rongqing
2019-12-16 10:13                       ` Ilias Apalodimas
2019-12-16 10:16                         ` Ilias Apalodimas
2019-12-16 10:57                           ` 答复: " Li,Rongqing
2019-12-17 19:38                         ` Saeed Mahameed
2019-12-17 19:35             ` Saeed Mahameed
2019-12-17 19:27           ` Saeed Mahameed
2019-12-16 12:15         ` Michal Hocko
2019-12-16 12:34           ` Ilias Apalodimas
2019-12-16 13:08             ` Michal Hocko
2019-12-16 13:21               ` Ilias Apalodimas
2019-12-17  2:11                 ` Yunsheng Lin
2019-12-17  9:11                   ` Michal Hocko
2019-12-19  2:09                     ` Yunsheng Lin
2019-12-19 11:53                       ` Michal Hocko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20191209131416.238d4ae4@carbon \
    --to=brouer@redhat.com \
    --cc=ilias.apalodimas@linaro.org \
    --cc=jonathan.lemon@gmail.com \
    --cc=linyunsheng@huawei.com \
    --cc=lirongqing@baidu.com \
    --cc=netdev@vger.kernel.org \
    --cc=saeedm@mellanox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).