linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Oscar Salvador <osalvador@suse.de>
To: Michal Hocko <mhocko@suse.com>
Cc: david@redhat.com, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, vbabka@suse.cz, pasha.tatashin@soleen.com
Subject: Re: [RFC PATCH v2 2/4] mm,memory_hotplug: Allocate memmap from the added memory range
Date: Mon, 30 Nov 2020 10:12:37 +0100	[thread overview]
Message-ID: <20201130091236.GB3825@linux> (raw)
In-Reply-To: <20201127151536.GV31550@dhcp22.suse.cz>

On Fri, Nov 27, 2020 at 04:15:36PM +0100, Michal Hocko wrote:
> > Vmemap page tables can map arbitrary memory.
> > That means that we can simply use the beginning of each memory section and
> > map struct pages there.
> 
> Did you mean each memory block rather than section?

Yes, sorry, I did not update that part.

> > struct pages which back the allocated space then just need to be treated
> > carefully.
> > 
> > Implementation wise we will reuse vmem_altmap infrastructure to override
> > the default allocator used by __populate_section_memmap. Once the memmap is
> > allocated, we are going to need a way to mark altmap pfns used for the allocation.
> > If MHP_MEMMAP_ON_MEMORY flag was passed, we will set up the layout of the
> > altmap structure in add_memory_resouce(), and then we will call
> > mhp_mark_vmemmap_pages() to properly mark those pages.
> > 
> > Online/Offline:
> > 
> >  In the memory_block structure, a new field is created in order to
> >  store the number of vmemmap_pages.
> 
> Is this really needed? We know how many pfns are required for a block of
> a specific size, right?
> 
> I have only glanced through the patch so I might be missing something
> but I am really wondering why you haven't chosen to use altmap directly
> here.

Well, this is my bad, I did not update the changelog wrt. to the previous version
so it might be confusing.
I will make sure to update it for the next submission, but let me explain it
here to shed some light.

We no longer need to use mhp_mark_vmemap_pages to mark vmemmap pages pages.
Prior to online_pages(), the whole range is offline, so no one should be
messing with any pages within that range.
The initialization of the pages takes places in online_pages():

We have:

start_pfn = first_pfn_of_the_range
buddy_start_pfn = first_pfn_of_the_range + nr_vmemmap_pages


We do have:

+	if (nr_vmemmap_pages)
+		move_pfn_range_to_zone(zone, pfn, nr_vmemmap_pages, NULL,
+				       MIGRATE_UNMOVABLE);
+	move_pfn_range_to_zone(zone, buddy_start_pfn, buddy_nr_pages, NULL,
+			       MIGRATE_ISOLATE);

Now, all the range is initialized and marked PageReserved, but we only send
to buddy (buddy_start_pfn, end_pfn].

And so, (start_pfn, buddy_start_pfn - 1] reamins PageReserved.
And we know that pfn walkers to skip Reserved pages.

About the altmap part.
Altmap is used in the hot-add phase in add_memory_resource.

The thing is, we could avoid adding the memory_block's nr_vmemmap_pages field
, but we would have to mark the vmemmap pages as we used to do in previous
implementations (see [1]), but I find this way cleaner, and it adds much less
code (previous implementions can be see in [2]), and as a starter I find it
much better.

> It would be also good to describe how does a pfn walker recognize such a
> page? Most of them will simply ignore it but e.g. hotplug walker will
> need to skip over those because they are not preventing offlining as
> they will go away with the memory block together.

Wrt. hotplug walker, same as above, we only care to migrate the
(buddy_start_pfn, end_pfn], so the first pfn to migrate and isolate is set to
buddy_start_pfn.

Other pfns walkers should merely skip vmemmap pages because they are Reserved.

> Some basic description of testing done would be suitable as well.

Well, that is:

 - Hot-add memory to a specific numa node
 - Online memory
 - numactl -H and /proc/zoneinfo reflects the truth and nr_vmemmap_pages are
   extracted where they have to be
 - Start a memory stress program and bind it to the numa node we added memory
   so we make sure it gets exercised
 - Wait for a while and when node's free pages have decreased considerably,
   offline memory
 - Check that memory went offline and check /proc/zoneinfo and numactl -H
   again
 - Hot-remove range


[1] https://patchwork.kernel.org/project/linux-mm/patch/20201022125835.26396-3-osalvador@suse.de/
[2] https://patchwork.kernel.org/project/linux-mm/cover/20201022125835.26396-1-osalvador@suse.de/
 
-- 
Oscar Salvador
SUSE L3


  reply	other threads:[~2020-11-30  9:12 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-25 11:20 [RFC PATCH v2 0/4] Allocate memmap from hotadded memory (per device) Oscar Salvador
2020-11-25 11:20 ` [RFC PATCH v2 1/4] mm,memory_hotplug: Introduce MHP_MEMMAP_ON_MEMORY Oscar Salvador
2020-11-27 14:59   ` Michal Hocko
2020-11-25 11:20 ` [RFC PATCH v2 2/4] mm,memory_hotplug: Allocate memmap from the added memory range Oscar Salvador
2020-11-27 15:15   ` Michal Hocko
2020-11-30  9:12     ` Oscar Salvador [this message]
2020-11-25 11:20 ` [RFC PATCH v2 3/4] mm,memory_hotplug: Add mhp_supports_memmap_on_memory Oscar Salvador
2020-11-27 15:02   ` Michal Hocko
2020-11-30  8:50     ` Oscar Salvador
2020-11-25 11:20 ` [RFC PATCH v2 4/4] mm,memory_hotplug: Enable MHP_MEMMAP_ON_MEMORY when supported Oscar Salvador
2020-11-27 11:55   ` Oscar Salvador

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201130091236.GB3825@linux \
    --to=osalvador@suse.de \
    --cc=david@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.com \
    --cc=pasha.tatashin@soleen.com \
    --cc=vbabka@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).