From: Heiko Carstens <heiko.carstens@de.ibm.com>
To: Michal Hocko <mhocko@kernel.org>
Cc: linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
	Mel Gorman <mgorman@suse.de>, Vlastimil Babka <vbabka@suse.cz>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Jerome Glisse <jglisse@redhat.com>,
	Reza Arbab <arbab@linux.vnet.ibm.com>,
	Yasuaki Ishimatsu <yasu.isimatu@gmail.com>,
	qiuxishi@huawei.com, Kani Toshimitsu <toshi.kani@hpe.com>,
	slaoub@gmail.com, Joonsoo Kim <js1304@gmail.com>,
	Andi Kleen <ak@linux.intel.com>,
	Daniel Kiper <daniel.kiper@oracle.com>,
	Igor Mammedov <imammedo@redhat.com>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Michal Hocko <mhocko@suse.com>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Dan Williams <dan.j.williams@intel.com>,
	"H. Peter Anvin" <hpa@zytor.com>, Ingo Molnar <mingo@redhat.com>,
	Michael Ellerman <mpe@ellerman.id.au>,
	Paul Mackerras <paulus@samba.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Gerald Schaefer <gerald.schaefer@de.ibm.com>
Subject: Re: [RFC PATCH 3/5] mm, memory_hotplug: allocate memmap from the added memory range for sparse-vmemmap
Date: Wed, 26 Jul 2017 13:45:39 +0200
Message-ID: <20170726114539.GG3218@osiris>
In-Reply-To: <20170726083333.17754-4-mhocko@kernel.org>

On Wed, Jul 26, 2017 at 10:33:31AM +0200, Michal Hocko wrote:
> From: Michal Hocko <mhocko@suse.com>
> 
> Physical memory hotadd has to allocate a memmap (struct page array) for
> the newly added memory section. kmalloc is currently used for those
> allocations.
> 
> This has some disadvantages: a) existing memory is consumed for
> that purpose (~2MB per 128MB memory section) and b) if the whole node
> is movable then we have off-node struct pages, which has performance
> drawbacks.
> 
> a) has turned out to be a problem for memory hotplug based ballooning
> because userspace might not react in time to online memory, while the
> memory consumed during physical hotadd is enough to push the system to
> OOM. 31bc3858ea3e ("memory-hotplug: add automatic onlining policy for
> the newly added memory") has been added to work around that problem.
> 
> We can do much better when CONFIG_SPARSEMEM_VMEMMAP=y because vmemmap
> page tables can map arbitrary memory. That means that we can simply
> use the beginning of each memory section and map struct pages there.
> struct pages which back the allocated space then just need to be treated
> carefully so that we know they are not usable.
> 
> Add {_Set,_Clear}PageVmemmap helpers to distinguish those pages in pfn
> walkers. We do not have any spare page flag for this purpose so use the
> combination of PageReserved bit which already tells that the page should
> be ignored by the core mm code and store VMEMMAP_PAGE (which sets all
> bits but PAGE_MAPPING_FLAGS) into page->mapping.
> 
> On the memory hotplug front reuse vmem_altmap infrastructure to override
> the default allocator used by __vmemap_populate. Once the memmap is
> allocated we need a way to mark altmap pfns used for the allocation
> and this is done by a new vmem_altmap::flush_alloc_pfns callback.
> mark_vmemmap_pages implementation then simply __SetPageVmemmap all
> struct pages backing those pfns. The callback is called from
> sparse_add_one_section after the memmap has been initialized to 0.
> 
> We also have to be careful about those pages during online and offline
> operations. They are simply ignored.
> 
> Finally __ClearPageVmemmap is called when the vmemmap page tables are
> torn down.
> 
> Please note that only the memory hotplug is currently using this
> allocation scheme. The boot time memmap allocation could use the same
> trick as well but this is not done yet.

Which kernel are these patches based on? I tried linux-next and Linus'
vanilla tree, however the series does not apply.

In general I do like your idea, however if I understand your patches
correctly we might have an ordering problem on s390: it is not possible to
access hot-added memory on s390 before it is online (MEM_GOING_ONLINE
succeeded).

On MEM_GOING_ONLINE we ask the hypervisor to back the potentially available
hot-added memory region with physical pages. Accessing those ranges before
that will result in an exception.

However with your approach the memory is still allocated when add_memory()
is being called, correct? That wouldn't be a change to the current
behaviour, except for the ordering problem outlined above.
Just trying to make sure I get this right :)

