From: Oscar Salvador <osalvador@suse.de>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
Michal Hocko <mhocko@kernel.org>,
Pavel Tatashin <pasha.tatashin@soleen.com>,
David Hildenbrand <david@redhat.com>,
Joonsoo Kim <iamjoonsoo.kim@lge.com>,
Michal Hocko <mhocko@suse.com>
Subject: Re: [PATCH v2 7/7] mm, page_alloc: disable pcplists during memory offline
Date: Thu, 22 Oct 2020 14:52:00 +0200 [thread overview]
Message-ID: <20201022125159.GE26121@linux> (raw)
In-Reply-To: <20201008114201.18824-8-vbabka@suse.cz>
On Thu, Oct 08, 2020 at 01:42:01PM +0200, Vlastimil Babka wrote:
> Memory offline relies on page isolation can race with process freeing pages to
> pcplists in a way that a page from isolated pageblock can end up on pcplist.
> This can be worked around by repeated draining of pcplists, as done by commit
> 968318261221 ("mm/memory_hotplug: drain per-cpu pages again during memory
> offline").
>
> David and Michal would prefer that this race was closed in a way that callers
> of page isolation who need stronger guarantees don't need to repeatedly drain.
> David suggested disabling pcplists usage completely during page isolation,
> instead of repeatedly draining them.
>
> To achieve this without adding special cases in alloc/free fastpath, we can use
> the same approach as boot pagesets - when pcp->high is 0, any pcplist addition
> will be immediately flushed.
>
> The race can thus be closed by setting pcp->high to 0 and draining pcplists
> once, before calling start_isolate_page_range(). The draining will serialize
> after processes that already disabled interrupts and read the old value of
> pcp->high in free_unref_page_commit(), and processes that have not yet disabled
> interrupts, will observe pcp->high == 0 when they are rescheduled, and skip
> pcplists. This guarantees no stray pages on pcplists in zones where isolation
> happens.
>
> This patch thus adds zone_pcp_disable() and zone_pcp_enable() functions that
> page isolation users can call before start_isolate_page_range() and after
> unisolating (or offlining) the isolated pages.
>
> Also, drain_all_pages() is optimized to only execute on cpus where pcplists are
> not empty. The check can however race with a free to pcplist that has not yet
> increased the pcp->count from 0 to 1. Thus make the drain optionally skip the
> racy check and drain on all cpus, and use this option in zone_pcp_disable().
>
> As we have to avoid external updates to high and batch while pcplists are
> disabled, we take pcp_batch_high_lock in zone_pcp_disable() and release it in
> zone_pcp_enable(). This also synchronizes multiple users of
> zone_pcp_disable()/enable().
>
> Currently the only user of this functionality is offline_pages().
>
> Suggested-by: David Hildenbrand <david@redhat.com>
> Suggested-by: Michal Hocko <mhocko@suse.com>
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
I definitely like this one, and the implemantion looks smoth.
As Michal said in another thread, Hwposion code will also benefit from this,
since now we have a drain_all_pages dance that might be suboptimal and not
accurate.
I will get back to that once this gets merged.
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Thanks!
--
Oscar Salvador
SUSE L3
prev parent reply other threads:[~2020-10-22 12:52 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-10-08 11:41 [PATCH v2 0/7] disable pcplists during memory offline Vlastimil Babka
2020-10-08 11:41 ` [PATCH v2 1/7] mm, page_alloc: clean up pageset high and batch update Vlastimil Babka
2020-10-25 14:17 ` Mike Rapoport
2020-10-08 11:41 ` [PATCH v2 2/7] mm, page_alloc: calculate pageset high and batch once per zone Vlastimil Babka
2020-10-08 11:41 ` [PATCH v2 3/7] mm, page_alloc: remove setup_pageset() Vlastimil Babka
2020-10-08 12:23 ` Michal Hocko
2020-10-08 12:56 ` Vlastimil Babka
2020-10-08 13:03 ` Michal Hocko
2020-10-22 12:34 ` Oscar Salvador
2020-10-08 11:41 ` [PATCH v2 4/7] mm, page_alloc: simplify pageset_update() Vlastimil Babka
2020-10-22 12:39 ` Oscar Salvador
2020-10-08 11:41 ` [PATCH v2 5/7] mm, page_alloc: cache pageset high and batch in struct zone Vlastimil Babka
2020-10-08 12:31 ` Michal Hocko
2020-10-08 17:55 ` Vlastimil Babka
2020-10-22 12:42 ` Oscar Salvador
2020-10-08 11:42 ` [PATCH v2 6/7] mm, page_alloc: move draining pcplists to page isolation users Vlastimil Babka
2020-10-22 12:44 ` Oscar Salvador
2020-10-08 11:42 ` [PATCH v2 7/7] mm, page_alloc: disable pcplists during memory offline Vlastimil Babka
2020-10-08 12:45 ` Michal Hocko
2020-10-08 17:46 ` Vlastimil Babka
2020-10-22 12:52 ` Oscar Salvador [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20201022125159.GE26121@linux \
--to=osalvador@suse.de \
--cc=david@redhat.com \
--cc=iamjoonsoo.kim@lge.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mhocko@kernel.org \
--cc=mhocko@suse.com \
--cc=pasha.tatashin@soleen.com \
--cc=vbabka@suse.cz \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.