From: Mel Gorman <mgorman@techsingularity.net>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Oscar Salvador <osalvador@suse.de>,
"Michael S. Tsirkin" <mst@redhat.com>,
Alexander Duyck <alexanderduyck@fb.com>,
Minchan Kim <minchan@kernel.org>, Michal Hocko <mhocko@suse.com>,
David Hildenbrand <david@redhat.com>,
Linux-MM <linux-mm@kvack.org>,
LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v2 resend] mm/memory_hotplug: Make unpopulated zones PCP structures unreachable during hot remove
Date: Mon, 12 Apr 2021 15:08:52 +0100 [thread overview]
Message-ID: <20210412140852.GZ3697@techsingularity.net> (raw)
In-Reply-To: <d4e4c3e4-7d47-d634-4374-4cf1e55c7895@suse.cz>
On Mon, Apr 12, 2021 at 02:40:18PM +0200, Vlastimil Babka wrote:
> On 4/12/21 2:08 PM, Mel Gorman wrote:
> > zone_pcp_reset allegedly protects against a race with drain_pages
> > using local_irq_save but this is bogus. local_irq_save only operates
> > on the local CPU. If memory hotplug is running on CPU A and drain_pages
> > is running on CPU B, disabling IRQs on CPU A does not affect CPU B and
> > offers no protection.
> >
> > This patch deletes IRQ disable/enable on the grounds that IRQs protect
> > nothing and assumes the existing hotplug paths guarantees the PCP cannot be
> > used after zone_pcp_enable(). That should be the case already because all
> > the pages have been freed and there is no page to put on the PCP lists.
> >
> > Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
>
> Yeah the irq disabling here is clearly bogus, so:
>
> Acked-by: Vlastimil Babka <vbabka@suse.cz>
>
Thanks!
> But I think Michal has a point that we might best leave the pagesets around, by
> a future change. I'm have some doubts that even with your reordering of the
> reset/destroy after zonelist rebuild in v1 they cant't be reachable. We have no
> protection between zonelist rebuild and zonelist traversal, and that's why we
> just leave pgdats around.
>
> So I can imagine a task racing with memory hotremove might see watermarks as ok
> in get_page_from_freelist() for the zone and proceeds to try_this_zone:, then
> gets stalled/scheduled out while hotremove rebuilds the zonelist and destroys
> the pcplists, then the first task is resumed and proceeds with rmqueue_pcplist().
>
> So that's very rare thus not urgent, and this patch doesn't make it less rare so
> not a reason to block it.
>
After v1 of the patch, the race was reduced to the point between the
zone watermark check and the rmqueue_pcplist but yes, it still existed.
Closing it completely was either complex or expensive. Setting
zone->pageset = &boot_pageset before the free would shrink the race
further but that still leaves a potential memory ordering issue.
While fixable, it's either complex, expensive or both so yes, just leaving
the pageset structures in place would be much more straight-forward
assuming the structures were not allocated in the zone that is being
hot-removed. As things stand, I had trouble even testing zone hot-remove
as there was always a few pages left behind and I did not chase down
why. The focus was getting rid of the bogus local_irq_save() because
it was clearly wrong and offering a false sense of safety and the last
problematic local_irq_save() user in page_alloc.c when local_lock is used
to protect the PCP structures.
--
Mel Gorman
SUSE Labs
next prev parent reply other threads:[~2021-04-12 14:08 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-04-12 12:08 [PATCH v2 resend] mm/memory_hotplug: Make unpopulated zones PCP structures unreachable during hot remove Mel Gorman
2021-04-12 12:12 ` David Hildenbrand
2021-04-12 12:40 ` Vlastimil Babka
2021-04-12 14:08 ` Mel Gorman [this message]
2021-04-12 14:12 ` David Hildenbrand
2021-04-12 15:27 ` Mel Gorman
2021-04-12 16:02 ` David Hildenbrand
2021-04-13 9:36 ` Vlastimil Babka
2021-04-13 9:49 ` Mel Gorman
2021-04-13 10:16 ` Michal Hocko
2021-04-13 6:44 ` Michal Hocko
2021-04-13 6:40 ` Michal Hocko
2021-04-14 7:18 ` Oscar Salvador
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20210412140852.GZ3697@techsingularity.net \
--to=mgorman@techsingularity.net \
--cc=akpm@linux-foundation.org \
--cc=alexanderduyck@fb.com \
--cc=david@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mhocko@suse.com \
--cc=minchan@kernel.org \
--cc=mst@redhat.com \
--cc=osalvador@suse.de \
--cc=vbabka@suse.cz \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.