Re: [RFC PATCH 0/5] mm, memory_hotplug: allocate memmap from hotadded memory

From: Jerome Glisse <jglisse@redhat.com>
To: Michal Hocko <mhocko@kernel.org>
Cc: linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
	Mel Gorman <mgorman@suse.de>, Vlastimil Babka <vbabka@suse.cz>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Reza Arbab <arbab@linux.vnet.ibm.com>,
	Yasuaki Ishimatsu <yasu.isimatu@gmail.com>,
	qiuxishi@huawei.com, Kani Toshimitsu <toshi.kani@hpe.com>,
	slaoub@gmail.com, Joonsoo Kim <js1304@gmail.com>,
	Andi Kleen <ak@linux.intel.com>,
	Daniel Kiper <daniel.kiper@oracle.com>,
	Igor Mammedov <imammedo@redhat.com>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Dan Williams <dan.j.williams@intel.com>,
	Fenghua Yu <fenghua.yu@intel.com>,
	Heiko Carstens <heiko.carstens@de.ibm.com>,
	"H. Peter Anvin" <hpa@zytor.com>, Ingo Molnar <mingo@redhat.com>,
	Martin Schwidefsky <schwidefsky@de.ibm.com>,
	Michael Ellerman <mpe@ellerman.id.au>,
	Michal Hocko <mhocko@suse.com>, Paul Mackerras <paulus@samba.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Tony Luck <tony.luck@intel.com>,
	Will Deacon <will.deacon@arm.com>
Subject: Re: [RFC PATCH 0/5] mm, memory_hotplug: allocate memmap from hotadded memory
Date: Wed, 26 Jul 2017 17:06:59 -0400
Message-ID: <20170726210657.GE21717@redhat.com>
In-Reply-To: <20170726083333.17754-1-mhocko@kernel.org>

On Wed, Jul 26, 2017 at 10:33:28AM +0200, Michal Hocko wrote:
> Hi,
> this is another step to make the memory hotplug more usable. The primary
> goal of this patchset is to reduce memory overhead of the hot added
> memory (at least for SPARSE_VMEMMAP memory model). Currently we use
> kmalloc to populate memmap (struct page array) which has two main
> drawbacks a) it consumes additional memory until the hotadded memory
> itself is onlined and b) memmap might end up on a different numa node
> which is especially true for movable_node configuration.
> 
> a) is a problem especially for memory hotplug based memory "ballooning"
> solutions when the delay between physical memory hotplug and the
> onlining can lead to OOM and that led to introduction of hacks like auto
> onlining (see 31bc3858ea3e ("memory-hotplug: add automatic onlining
> policy for the newly added memory")).
> b) can have performance drawbacks.
> 
> One way to mitigate both issues is to simply allocate memmap array
> (which is the largest memory footprint of the physical memory hotplug)
> from the hotadded memory itself. VMEMMAP memory model allows us to map
> any pfn range so the memory doesn't need to be online to be usable
> for the array. See patch 3 for more details. In short I am reusing an
> existing vmem_altmap which wants to achieve the same thing for nvdimm
> device memory.
> 
> I am sending this as an RFC because this has seen only a very limited
> testing and I am mostly interested about opinions on the chosen
> approach. I had to touch some arch code and I have no idea whether my
> changes make sense there (especially ppc). Therefore I would highly
> appreciate arch maintainers to check patch 2.
> 
> Patches 4 and 5 should be straightforward cleanups.
> 
> There is also one potential drawback, though. If somebody uses memory
> hotplug for 1G (gigantic) hugetlb pages then this scheme will not work
> for them obviously because each memory section will contain 2MB reserved
> area.  I am not really sure somebody does that and how reliable that
> can work actually. Nevertheless, I _believe_ that onlining more memory
> into virtual machines is much more common usecase. Anyway if there ever
> is a strong demand for such a usecase we have basically 3 options a)
> enlarge memory sections b) enhance altmap allocation strategy and reuse
> low memory sections to host memmaps of other sections on the same NUMA
> node c) have the memmap allocation strategy configurable to fallback to
> the current allocation.
> 
> Are there any other concerns, ideas, comments?
> 
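For context, the 2MB reserved area mentioned in the quoted text can be
reproduced with a little arithmetic. The sketch below is only an
illustration, not kernel code: the 128MB section size, 4KB page size and
64-byte struct page are assumed x86_64-like values, and the ex_* names
are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed x86_64-like values; other architectures and configs differ. */
#define EX_SECTION_SIZE   (128ULL << 20) /* bytes covered by one memory section */
#define EX_PAGE_SIZE      4096ULL        /* base page size in bytes */
#define EX_STRUCT_PAGE_SZ 64ULL          /* assumed sizeof(struct page) */

/* Bytes of memmap (struct page array) needed to describe one section. */
uint64_t ex_memmap_bytes(uint64_t section_size, uint64_t page_size,
                         uint64_t struct_page_size)
{
    assert(page_size != 0 && section_size % page_size == 0);
    return (section_size / page_size) * struct_page_size;
}

/* Bytes left for regular use if the memmap is carved out of the section. */
uint64_t ex_usable_bytes(uint64_t section_size, uint64_t page_size,
                         uint64_t struct_page_size)
{
    return section_size -
           ex_memmap_bytes(section_size, page_size, struct_page_size);
}
```

With these assumed values each section needs 32768 struct pages, so 2MB
of every 128MB section is taken by its own memmap. That is why a 1G
gigantic page cannot be assembled from such sections: no 1G range is
free of reserved areas.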

This does not seem to be an opt-in change, i.e. if I am reading patch 3
correctly, when no altmap is provided to __add_pages() you fall back to
allocating from the beginning of the zone. This will not work with HMM,
i.e. device private memory. So at the very least I would like to see
some way to opt out of this. Maybe a new argument like
bool forbid_altmap?
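
To make the request concrete, here is a rough userspace model of the
decision I have in mind. It is a sketch of my reading of patch 3, not
the actual kernel code: only the forbid_altmap name comes from the
suggestion above, while struct ex_altmap, the enum and the function are
hypothetical stand-ins.

```c
#include <stdbool.h>
#include <stddef.h>

/* Where the struct page array for a hot-added range would come from. */
enum ex_memmap_source {
    EX_MEMMAP_FROM_ALTMAP,         /* caller supplied an explicit altmap */
    EX_MEMMAP_FROM_HOTADDED_START, /* implicit: start of the added range */
    EX_MEMMAP_FROM_ALLOCATOR       /* regular allocation, as done before the series */
};

/* Hypothetical stand-in for the kernel's struct vmem_altmap. */
struct ex_altmap {
    unsigned long base_pfn; /* first pfn of the backing range */
    unsigned long free;     /* pages available for the memmap */
};

/*
 * Pick the memmap source. In this sketch forbid_altmap wins over
 * everything else, so a caller whose memory is not CPU accessible
 * (e.g. HMM device private memory) never gets a memmap placed in it.
 */
enum ex_memmap_source ex_pick_memmap_source(const struct ex_altmap *altmap,
                                            bool forbid_altmap)
{
    if (forbid_altmap)
        return EX_MEMMAP_FROM_ALLOCATOR;
    if (altmap != NULL)
        return EX_MEMMAP_FROM_ALTMAP;
    return EX_MEMMAP_FROM_HOTADDED_START;
}
```

Letting the flag override a supplied altmap is a choice made for this
sketch; rejecting that combination as an error would work as well.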

Cheers,
Jérôme

