From: Michal Hocko <mhocko@kernel.org>
To: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Cc: Yunsheng Lin <linyunsheng@huawei.com>,
Saeed Mahameed <saeedm@mellanox.com>,
"brouer@redhat.com" <brouer@redhat.com>,
"jonathan.lemon@gmail.com" <jonathan.lemon@gmail.com>,
Li Rongqing <lirongqing@baidu.com>,
"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
peterz@infradead.org,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
bhelgaas@google.com,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH][v2] page_pool: handle page recycle for NUMA_NO_NODE condition
Date: Mon, 16 Dec 2019 14:08:45 +0100
Message-ID: <20191216130845.GF30281@dhcp22.suse.cz>
In-Reply-To: <20191216123426.GA18663@apalos.home>
On Mon 16-12-19 14:34:26, Ilias Apalodimas wrote:
> Hi Michal,
> On Mon, Dec 16, 2019 at 01:15:57PM +0100, Michal Hocko wrote:
> > On Thu 12-12-19 09:34:14, Yunsheng Lin wrote:
> > > +CC Michal, Peter, Greg and Bjorn
> > > Because there has been discussion before about where and how
> > > NUMA_NO_NODE should be handled.
> >
> > I do not have a full context. What is the question here?
>
> When we allocate pages for the page_pool API, during the init, the driver writer
> decides which NUMA node to use. The API can, in some cases, recycle the memory
> instead of freeing it and re-allocating it. If the NUMA node has changed (irq
> affinity for example), we forbid recycling and free the memory, since recycling
> and using memory on far NUMA nodes is more expensive (more expensive than
> recycling, at least on the architectures we tried anyway).
> Since doing this check per packet would be expensive, the burden falls on the
> driver writer. Drivers that want this check *have* to call
> page_pool_update_nid() or page_pool_nid_changed() once per NAPI cycle.
Thanks for the clarification.
> The current code in the API though does not account for NUMA_NO_NODE. That's
> what this is trying to fix.
> If the page_pool params are initialized with that, we *never* recycle
> the memory. This is happening because the API is allocating memory with
> 'nid = numa_mem_id()' if NUMA_NO_NODE is configured so the current if statement
> 'page_to_nid(page) == pool->p.nid' will never trigger.
OK. There is no explicit mention of the expected behavior for
NUMA_NO_NODE. The semantic is usually that there is no NUMA placement
requirement, and the MM code simply starts the allocation from the local
node in that case. But the memory might come from any node, so there is no
"local node" guarantee.
So the main question is: what is the expected semantic? Do people expect
that NUMA_NO_NODE implies locality? Why don't you simply always reuse
when there was no explicit NUMA requirement?
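To make the options concrete, here is a rough userspace sketch, not the
actual page_pool code. The function names are hypothetical, node ids are
passed in as plain ints instead of calling page_to_nid() or numa_mem_id(),
and combining the proposed condition with the existing one via || is an
assumption about how the patch wires it in.

```c
#include <assert.h>
#include <stdbool.h>

/* Same value the kernel uses for "no node preference". */
#define NUMA_NO_NODE (-1)

/* Current check as described above: recycle only on an exact node match. */
bool recycle_current(int pool_nid, int page_nid)
{
	return page_nid == pool_nid;
}

/* Initially proposed check: for NUMA_NO_NODE, also accept local pages. */
bool recycle_proposed(int pool_nid, int page_nid, int local_nid)
{
	return page_nid == pool_nid ||
	       (pool_nid == NUMA_NO_NODE && page_nid == local_nid);
}

/* Alternative: NUMA_NO_NODE means no placement requirement, so always reuse. */
bool recycle_always_if_no_node(int pool_nid, int page_nid)
{
	return pool_nid == NUMA_NO_NODE || page_nid == pool_nid;
}

/* Hypothetical walk-through: pool created with NUMA_NO_NODE, CPU on node 0. */
void demo_no_node(void)
{
	assert(!recycle_current(NUMA_NO_NODE, 0));          /* never recycles */
	assert(recycle_proposed(NUMA_NO_NODE, 0, 0));       /* local page: reuse */
	assert(!recycle_proposed(NUMA_NO_NODE, 1, 0));      /* remote page: free */
	assert(recycle_always_if_no_node(NUMA_NO_NODE, 1)); /* any node: reuse */
}
```

The only case where the last two variants differ is a NUMA_NO_NODE pool
holding a page from a non-local node, which is exactly the "does
NUMA_NO_NODE imply locality?" question.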
> The initial proposal was to check:
> pool->p.nid == NUMA_NO_NODE && page_to_nid(page) == numa_mem_id()));
> After that the thread spun out of control :)
> My question is do we *really* have to check for
> page_to_nid(page) == numa_mem_id()? If the architecture is not NUMA aware,
> wouldn't pool->p.nid == NUMA_NO_NODE be enough?
If the architecture is !NUMA then numa_mem_id() and page_to_nid() should
always be equal, both being zero.
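A small sketch of that point, under the assumption that a !NUMA build
behaves as if there were a single node 0. The model_* helpers are
hypothetical stand-ins, not the kernel definitions, and only the
NUMA_NO_NODE half of the recycle condition is modeled.

```c
#include <assert.h>
#include <stdbool.h>

#define NUMA_NO_NODE (-1)

/* Stand-ins for a !NUMA build: the only node is node 0. */
static inline int model_numa_mem_id(void)
{
	return 0;
}

static inline int model_page_to_nid(const void *page)
{
	(void)page; /* every page lives on node 0 in this model */
	return 0;
}

/* NUMA_NO_NODE branch with the extra locality comparison. */
bool recycle_with_locality_check(int pool_nid, const void *page)
{
	return pool_nid == NUMA_NO_NODE &&
	       model_page_to_nid(page) == model_numa_mem_id();
}

/* NUMA_NO_NODE branch without it. */
bool recycle_without_locality_check(int pool_nid)
{
	return pool_nid == NUMA_NO_NODE;
}

/* On the single-node model both predicates agree for any pool nid. */
void demo_non_numa(void)
{
	int dummy_page = 0;

	assert(recycle_with_locality_check(NUMA_NO_NODE, &dummy_page) ==
	       recycle_without_locality_check(NUMA_NO_NODE));
	assert(recycle_with_locality_check(0, &dummy_page) ==
	       recycle_without_locality_check(0));
}
```

In this model the locality comparison is always true, so it neither helps
nor hurts; it only carries meaning on a NUMA system.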
--
Michal Hocko
SUSE Labs