From: Oscar Salvador <osalvador@suse.de>
To: Michal Hocko <mhocko@suse.com>
Cc: david@redhat.com, linux-kernel@vger.kernel.org,
linux-mm@kvack.org, vbabka@suse.cz, pasha.tatashin@soleen.com
Subject: Re: [RFC PATCH v2 2/4] mm,memory_hotplug: Allocate memmap from the added memory range
Date: Mon, 30 Nov 2020 10:12:37 +0100
Message-ID: <20201130091236.GB3825@linux>
In-Reply-To: <20201127151536.GV31550@dhcp22.suse.cz>
On Fri, Nov 27, 2020 at 04:15:36PM +0100, Michal Hocko wrote:
> > Vmemmap page tables can map arbitrary memory.
> > That means that we can simply use the beginning of each memory section and
> > map struct pages there.
>
> Did you mean each memory block rather than section?
Yes, sorry, I did not update that part.
> > struct pages which back the allocated space then just need to be treated
> > carefully.
> >
> > Implementation wise we will reuse vmem_altmap infrastructure to override
> > the default allocator used by __populate_section_memmap. Once the memmap is
> > allocated, we are going to need a way to mark altmap pfns used for the allocation.
> > If MHP_MEMMAP_ON_MEMORY flag was passed, we will set up the layout of the
> > altmap structure in add_memory_resource(), and then we will call
> > mhp_mark_vmemmap_pages() to properly mark those pages.
> >
> > Online/Offline:
> >
> > In the memory_block structure, a new field is created in order to
> > store the number of vmemmap_pages.
>
> Is this really needed? We know how many pfns are required for a block of
> a specific size, right?
>
> I have only glanced through the patch so I might be missing something
> but I am really wondering why you haven't chosen to use altmap directly
> here.
Well, this is my bad, I did not update the changelog wrt. the previous version
so it might be confusing.
I will make sure to update it for the next submission, but let me explain it
here to shed some light.
We no longer need to use mhp_mark_vmemmap_pages() to mark vmemmap pages.
Prior to online_pages(), the whole range is offline, so no one should be
messing with any pages within that range.
The initialization of the pages takes place in online_pages():
We have:
start_pfn = first_pfn_of_the_range
buddy_start_pfn = first_pfn_of_the_range + nr_vmemmap_pages
We do have:
+ if (nr_vmemmap_pages)
+ move_pfn_range_to_zone(zone, pfn, nr_vmemmap_pages, NULL,
+ MIGRATE_UNMOVABLE);
+ move_pfn_range_to_zone(zone, buddy_start_pfn, buddy_nr_pages, NULL,
+ MIGRATE_ISOLATE);
Now, the whole range is initialized and marked PageReserved, but we only send
[buddy_start_pfn, end_pfn) to the buddy allocator.
And so, [start_pfn, buddy_start_pfn - 1] remains PageReserved.
And we know that pfn walkers skip Reserved pages.
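
To make the split concrete, here is a minimal userspace sketch of the arithmetic, not kernel code. The 4 KiB page size and 64-byte struct page are assumptions (typical for x86_64), and struct block_layout and compute_layout() are hypothetical names used only for illustration:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for illustration only (typical x86_64 configuration). */
#define PAGE_SIZE_BYTES   4096ULL
#define STRUCT_PAGE_BYTES 64ULL

/* Hypothetical helper type: how a hot-added block is split. */
struct block_layout {
	uint64_t start_pfn;        /* first pfn of the hot-added range */
	uint64_t end_pfn;          /* one past the last pfn of the range */
	uint64_t nr_vmemmap_pages; /* leading pages that hold the memmap */
	uint64_t buddy_start_pfn;  /* first pfn handed to the buddy allocator */
	uint64_t buddy_nr_pages;   /* pages handed to the buddy allocator */
};

static struct block_layout compute_layout(uint64_t start_pfn,
					  uint64_t block_bytes)
{
	struct block_layout l;
	uint64_t nr_pages, memmap_bytes;

	/* The block is expected to be a whole number of pages. */
	assert(block_bytes % PAGE_SIZE_BYTES == 0);

	nr_pages = block_bytes / PAGE_SIZE_BYTES;
	memmap_bytes = nr_pages * STRUCT_PAGE_BYTES;

	l.start_pfn = start_pfn;
	l.end_pfn = start_pfn + nr_pages;
	/* Round the memmap size up to whole pages. */
	l.nr_vmemmap_pages = (memmap_bytes + PAGE_SIZE_BYTES - 1) / PAGE_SIZE_BYTES;
	/* [start_pfn, buddy_start_pfn) stays Reserved; the rest goes to buddy. */
	l.buddy_start_pfn = start_pfn + l.nr_vmemmap_pages;
	l.buddy_nr_pages = nr_pages - l.nr_vmemmap_pages;
	return l;
}
```

Under those assumptions, a 128 MiB block has 32768 pages, its memmap takes 2 MiB, and so the first 512 pages of the block stay Reserved while the remaining 32256 are sent to the buddy allocator.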
About the altmap part.
Altmap is used in the hot-add phase in add_memory_resource.
The thing is, we could avoid adding the memory_block's nr_vmemmap_pages field,
but we would have to mark the vmemmap pages as we used to do in previous
implementations (see [1]). I find this way cleaner, and it adds much less
code (previous implementations can be seen in [2]), so as a starter I find it
much better.
> It would be also good to describe how does a pfn walker recognize such a
> page? Most of them will simply ignore it but e.g. hotplug walker will
> need to skip over those because they are not preventing offlining as
> they will go away with the memory block together.
Wrt. the hotplug walker, same as above: we only care to migrate
[buddy_start_pfn, end_pfn), so the first pfn to migrate and isolate is set to
buddy_start_pfn.
Other pfn walkers should merely skip vmemmap pages because they are Reserved.
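
As a rough illustration of the two kinds of walkers, here is a small userspace model, not the actual kernel walkers. struct mock_page, count_walkable() and count_to_migrate() are hypothetical names, and the array indices stand in for pfns relative to start_pfn:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page, holding only what the model needs. */
struct mock_page {
	bool reserved; /* models PageReserved */
	bool in_use;   /* models a page that would have to be migrated */
};

/* Generic walker: counts the pages it would act on, skipping Reserved ones. */
static size_t count_walkable(const struct mock_page *pages,
			     size_t start, size_t end)
{
	size_t n = 0;

	for (size_t i = start; i < end; i++) {
		if (pages[i].reserved)
			continue; /* vmemmap pages are skipped here */
		n++;
	}
	return n;
}

/*
 * Offline-side walker: it starts at buddy_start, so it never looks at the
 * vmemmap pages and they cannot block offlining.
 */
static size_t count_to_migrate(const struct mock_page *pages,
			       size_t buddy_start, size_t end)
{
	size_t n = 0;

	for (size_t i = buddy_start; i < end; i++) {
		if (pages[i].in_use)
			n++;
	}
	return n;
}
```

In this model the vmemmap pages are both Reserved and in use, yet they neither show up in the generic walk nor count as pages to migrate, because the offline scan begins at buddy_start.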
> Some basic description of testing done would be suitable as well.
Well, that is:
- Hot-add memory to a specific numa node
- Online memory
- numactl -H and /proc/zoneinfo reflect the truth and nr_vmemmap_pages are
  extracted where they have to be
- Start a memory stress program and bind it to the numa node we added memory
  to, so we make sure it gets exercised
- Wait for a while and, when the node's free pages have decreased considerably,
  offline memory
- Check that memory went offline and check /proc/zoneinfo and numactl -H
  again
- Hot-remove range
[1] https://patchwork.kernel.org/project/linux-mm/patch/20201022125835.26396-3-osalvador@suse.de/
[2] https://patchwork.kernel.org/project/linux-mm/cover/20201022125835.26396-1-osalvador@suse.de/
--
Oscar Salvador
SUSE L3