From: Michal Hocko <mhocko@kernel.org>
To: Yunsheng Lin <linyunsheng@huawei.com>
Cc: Ilias Apalodimas <ilias.apalodimas@linaro.org>,
	Saeed Mahameed <saeedm@mellanox.com>,
	"brouer@redhat.com" <brouer@redhat.com>,
	"jonathan.lemon@gmail.com" <jonathan.lemon@gmail.com>,
	Li Rongqing <lirongqing@baidu.com>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	peterz@infradead.org,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	bhelgaas@google.com,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH][v2] page_pool: handle page recycle for NUMA_NO_NODE condition
Date: Thu, 19 Dec 2019 12:53:38 +0100
Message-ID: <20191219115338.GC26945@dhcp22.suse.cz>
In-Reply-To: <ff280412-bb31-5ffb-99f0-6d49bb470855@huawei.com>

On Thu 19-12-19 10:09:33, Yunsheng Lin wrote:
[...]
> > There is no real guarantee that NUMA_NO_NODE is going to imply local
> > node and we do not want to grow any subtle dependency on that behavior.
> 
> Strictly speaking, using numa_mem_id() also gives no real guarantee
> that it will allocate local memory when local memory is used up, right?
> Because alloc_pages_node() is basically turning the node to numa_mem_id()
> when it is NUMA_NO_NODE.

Yes, both allocations are allowed to fall back to other nodes unless
an explicit nodemask is specified.
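
To illustrate the fallback point, here is a toy userspace model, not
kernel code. The names model_pick_node() and MODEL_NR_NODES are
hypothetical, and the wrap-around fallback order is an assumption made
for brevity (the real zonelist order is distance based).

```c
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NR_NODES 4	/* hypothetical node count for the model */

/*
 * Return the node an allocation would land on, or -1 if no permitted
 * node has free pages. "allowed" plays the role of an explicit
 * nodemask; NULL means no nodemask was given.
 */
static int model_pick_node(int preferred,
			   const int free_pages[MODEL_NR_NODES],
			   const bool *allowed)
{
	for (int i = 0; i < MODEL_NR_NODES; i++) {
		/* Toy fallback order: preferred node first, then wrap around. */
		int nid = (preferred + i) % MODEL_NR_NODES;

		if (allowed != NULL && !allowed[nid])
			continue;	/* the nodemask forbids this node */
		if (free_pages[nid] > 0)
			return nid;
	}
	return -1;
}
```

In this model the preferred node is only a starting point: with node 0
exhausted and no nodemask the allocation lands elsewhere, while a
nodemask containing only node 0 makes it fail instead.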

> Unless we disallow passing NUMA_NO_NODE to alloc_pages_node(), I cannot
> see a difference between NUMA_NO_NODE and numa_mem_id().

The difference is in the presented intention. NUMA_NO_NODE means no node
preference. We turn it into an implicit local node preference because
this is the best assumption we can make in general. If you provide
numa_mem_id() then you explicitly ask for the local node preference
because you know that this is the best for your specific code. See the
difference?

The NUMA_NO_NODE -> local node mapping is a heuristic which might change
(although that is unlikely).
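
A minimal userspace sketch of that mapping, assuming it mirrors the
node resolution done at the top of alloc_pages_node(). The helpers
model_numa_mem_id() and model_resolve_nid() are hypothetical stand-ins,
not kernel functions.

```c
#include <assert.h>

#define NUMA_NO_NODE	(-1)	/* same value the kernel uses */

/* Hypothetical stand-in for numa_mem_id(): node local to the caller. */
static int model_local_node;

static int model_numa_mem_id(void)
{
	return model_local_node;
}

/* Resolve the node an allocation request will prefer. */
static int model_resolve_nid(int nid)
{
	/* Heuristic: "no preference" is treated as "prefer local". */
	if (nid == NUMA_NO_NODE)
		nid = model_numa_mem_id();

	/* After resolution the preferred node is never negative. */
	assert(nid >= 0);
	return nid;
}
```

Both callers end up with the same node today, but only the one passing
numa_mem_id() has stated that it wants the local node. If the
NUMA_NO_NODE branch were ever changed, only the implicit caller would
see different behavior.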
 
> >> And for those drivers, locality is decided by rx interrupt affinity, not
> >> dev_to_node(). So when rx interrupt affinity changes, the old page from old
> >> node will not be recycled (by checking page_to_nid(page) == numa_mem_id()),
> >> new pages will be allocated to replace the old pages and the new pages will
> >> be recycled because allocation and recycling happens in the same context,
> >> which means numa_mem_id() returns the same node of new page allocated, see
> >> [2].
> > 
> > Well, but my understanding is that the generic page pool implementation
> > has a clear means to change the affinity (namely page_pool_update_nid()).
> > So my primary question is, why should NUMA_NO_NODE be used as a
> > bypass for that?
> 
> In that case, page_pool_update_nid() needs to be called explicitly, which
> may not be the reality, because for drivers using page pool now, mlx5 seems
> to be the only one to call page_pool_update_nid(), which may lead to
> copy & paste problems when not careful enough.

The API is quite new AFAIU and I think it would be better to use it in
the intended way. Relying on implicit and undocumented behavior is just
going to bend that API in the future and it will impose an additional
burden on any future changes.
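
A rough userspace model of the explicit route, as a sketch only. The
struct and function names below are hypothetical and merely imitate the
idea of a pool that recycles a page when the page's node matches the
pool's configured node, with page_pool_update_nid() being the real
interface a driver would call.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, heavily reduced models of a page and a page pool. */
struct model_page { int nid; };	/* node the page's memory lives on */
struct model_pool { int nid; };	/* node the pool currently prefers */

/* Recycle only pages that sit on the pool's preferred node. */
static bool model_page_reusable(const struct model_pool *pool,
				const struct model_page *page)
{
	return page->nid == pool->nid;
}

/* Explicit affinity change, in the spirit of page_pool_update_nid(). */
static void model_pool_update_nid(struct model_pool *pool, int new_nid)
{
	assert(new_nid >= 0);	/* the explicit route takes a real node */
	pool->nid = new_nid;
}
```

With this shape, a change of rx interrupt affinity becomes a visible
call in the driver: old pages stop being recycled after the update and
pages from the new node are accepted, without giving NUMA_NO_NODE any
special meaning.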
-- 
Michal Hocko
SUSE Labs

Thread overview: 22+ messages
     [not found] <1575624767-3343-1-git-send-email-lirongqing@baidu.com>
     [not found] ` <9fecbff3518d311ec7c3aee9ae0315a73682a4af.camel@mellanox.com>
     [not found]   ` <20191211194933.15b53c11@carbon>
     [not found]     ` <831ed886842c894f7b2ffe83fe34705180a86b3b.camel@mellanox.com>
2019-12-12  1:34       ` [PATCH][v2] page_pool: handle page recycle for NUMA_NO_NODE condition Yunsheng Lin
2019-12-12 10:18         ` Jesper Dangaard Brouer
2019-12-13  3:40           ` Yunsheng Lin
2019-12-13  6:27             ` Re: " Li,Rongqing
2019-12-13  6:53               ` Yunsheng Lin
2019-12-13  8:48                 ` Jesper Dangaard Brouer
2019-12-16  1:51                   ` Yunsheng Lin
2019-12-16  4:02                     ` Re: " Li,Rongqing
2019-12-16 10:13                       ` Ilias Apalodimas
2019-12-16 10:16                         ` Ilias Apalodimas
2019-12-16 10:57                           ` Re: " Li,Rongqing
2019-12-17 19:38                         ` Saeed Mahameed
2019-12-17 19:35             ` Saeed Mahameed
2019-12-17 19:27           ` Saeed Mahameed
2019-12-16 12:15         ` Michal Hocko
2019-12-16 12:34           ` Ilias Apalodimas
2019-12-16 13:08             ` Michal Hocko
2019-12-16 13:21               ` Ilias Apalodimas
2019-12-17  2:11                 ` Yunsheng Lin
2019-12-17  9:11                   ` Michal Hocko
2019-12-19  2:09                     ` Yunsheng Lin
2019-12-19 11:53                       ` Michal Hocko [this message]
