From: David Hildenbrand <david@redhat.com>
To: linux-mm@kvack.org
Cc: "Oscar Salvador" <osalvador@suse.com>,
"Rafael J. Wysocki" <rafael@kernel.org>,
"Michal Hocko" <mhocko@suse.com>,
linux-ia64@vger.kernel.org, linux-sh@vger.kernel.org,
"Peter Zijlstra" <peterz@infradead.org>,
"Dave Hansen" <dave.hansen@linux.intel.com>,
"David Hildenbrand" <david@redhat.com>,
"Michal Hocko" <mhocko@kernel.org>,
"Vitaly Kuznetsov" <vkuznets@redhat.com>,
"Pavel Tatashin" <pavel.tatashin@microsoft.com>,
"Rich Felker" <dalias@libc.org>,
"Arun KS" <arunks@codeaurora.org>,
"H. Peter Anvin" <hpa@zytor.com>,
"Stephen Rothwell" <sfr@canb.auug.org.au>,
"Rashmica Gupta" <rashmica.g@gmail.com>,
"K. Y. Srinivasan" <kys@microsoft.com>,
"Dan Williams" <dan.j.williams@intel.com>,
"Paul Mackerras" <paulus@samba.org>,
"Pavel Tatashin" <pasha.tatashin@soleen.com>,
linux-s390@vger.kernel.org, "Michael Neuling" <mikey@neuling.org>,
"Stefano Stabellini" <sstabellini@kernel.org>,
"Dave Jiang" <dave.jiang@intel.com>,
"Yoshinori Sato" <ysato@users.sourceforge.jp>,
"Logan Gunthorpe" <logang@deltatee.com>,
x86@kernel.org, YueHaibing <yuehaibing@huawei.com>,
"Pavel Tatashin" <pasha.tatashin@oracle.com>,
"Matthew Wilcox" <willy@infradead.org>,
"Ingo Molnar" <mingo@kernel.org>,
linux-acpi@vger.kernel.org, "Ingo Molnar" <mingo@redhat.com>,
xen-devel@lists.xenproject.org,
"Michal Suchánek" <msuchanek@suse.de>,
"Len Brown" <lenb@kernel.org>,
"Fenghua Yu" <fenghua.yu@intel.com>,
"Jan H. Schönherr" <jschoenh@amazon.de>,
"Juergen Gross" <jgross@suse.com>,
"Vasily Gorbik" <gor@linux.ibm.com>,
"Rob Herring" <robh@kernel.org>,
"mike.travis@hpe.com" <mike.travis@hpe.com>,
"Heiko Carstens" <heiko.carstens@de.ibm.com>,
"Haiyang Zhang" <haiyangz@microsoft.com>,
"Jonathan Neuschäfer" <j.neuschaefer@gmx.net>,
"Nicholas Piggin" <npiggin@gmail.com>,
"Jérôme Glisse" <jglisse@redhat.com>,
"Mike Rapoport" <rppt@linux.vnet.ibm.com>,
"Borislav Petkov" <bp@alien8.de>,
"Andy Lutomirski" <luto@kernel.org>,
"Nathan Fontenot" <nfont@linux.vnet.ibm.com>,
"Stephen Hemminger" <sthemmin@microsoft.com>,
"Boris Ostrovsky" <boris.ostrovsky@oracle.com>,
"Wei Yang" <richard.weiyang@gmail.com>,
"Joonsoo Kim" <iamjoonsoo.kim@lge.com>,
"Oscar Salvador" <osalvador@suse.de>,
"Tony Luck" <tony.luck@intel.com>,
"Andrew Banman" <andrew.banman@hpe.com>,
"Mathieu Malaterre" <malat@debian.org>,
"Greg Kroah-Hartman" <gregkh@linuxfoundation.org>,
"Rafael J. Wysocki" <rjw@rjwysocki.net>,
linux-kernel@vger.kernel.org,
"Mauricio Faria de Oliveira" <mauricfo@linux.vnet.ibm.com>,
"Thomas Gleixner" <tglx@linutronix.de>,
"Martin Schwidefsky" <schwidefsky@de.ibm.com>,
devel@linuxdriverproject.org,
"Andrew Morton" <akpm@linux-foundation.org>,
linuxppc-dev@lists.ozlabs.org,
"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Subject: [PATCH RFCv2 0/4] mm/memory_hotplug: Introduce memory block types
Date: Fri, 30 Nov 2018 18:59:18 +0100
Message-ID: <20181130175922.10425-1-david@redhat.com>
This is the second approach, introducing more meaningful memory block
types without changing the online behavior in the kernel. It is based on
the latest linux-next.
As we found out during discussion, user space should always handle onlining
of memory, in any case. However, in order to make smart decisions in user
space about whether and how to online memory, we have to export more
information about memory blocks. This way, we can formulate rules in user
space.
One such piece of information is the type of the memory block in question.
It helps to answer questions like:
- Does this memory block belong to a DIMM?
- Can this DIMM theoretically ever be unplugged again?
- Was this memory added by a balloon driver that will rely on balloon
  inflation to remove chunks of that memory again? Which zone is advised?
- Is this special standby memory on s390x that is usually not automatically
  onlined?
In short, it helps to answer, to some extent (excluding zone imbalances):
- Should I online this memory block?
- To which zone should I online this memory block?
... of course, special use cases will result in different answers. But that's
why user space has control over onlining memory.
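
To illustrate what such a user space rule could look like, here is a
minimal sketch in Python. It is not part of this series. It assumes the
proposed "type" attribute is present next to the existing "state" attribute,
and the rule table is purely illustrative: "boot" and "dimm" are the only
type strings shown in this cover letter, and mapping "dimm" to
"online_movable" is an assumed policy choice, not a recommendation. The
function names are hypothetical.

```python
from pathlib import Path

# Strings the memory block "state" file accepts for onlining.
VALID_ACTIONS = {"online", "online_kernel", "online_movable"}

# Illustrative rules only. Types not listed here are left untouched, so an
# administrator has to add an entry for every type that should be onlined.
DEFAULT_RULES = {
    "boot": None,              # boot memory: nothing to do
    "dimm": "online_movable",  # assumed policy: favor later unplug
}


def choose_action(block_type, state, rules=DEFAULT_RULES):
    """Return the string to write to "state", or None to do nothing."""
    if state != "offline":
        return None  # only offline blocks are candidates for onlining
    action = rules.get(block_type)
    if action is not None and action not in VALID_ACTIONS:
        raise ValueError(f"invalid action {action!r} for type {block_type!r}")
    return action


def apply_action(block_dir, rules=DEFAULT_RULES):
    """Read "type" and "state" from a memory block directory and online it
    according to the rules. Returns the action taken, or None."""
    block = Path(block_dir)
    block_type = (block / "type").read_text().strip()
    state = (block / "state").read_text().strip()
    action = choose_action(block_type, state, rules)
    if action is not None:
        (block / "state").write_text(action)
    return action
```

On a real system, block_dir would be something like
/sys/devices/system/memory/memory90, and writing to "state" requires root.
Unknown types are deliberately ignored so that special use cases can
override the table instead of fighting a built-in default.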
More details can be found in Patch 1 and Patch 3.
Tested on x86 with hotplugged DIMMs. Cross-compiled for PPC and s390x.
Example:
$ udevadm info -q all -a /sys/devices/system/memory/memory0
KERNEL=="memory0"
SUBSYSTEM=="memory"
DRIVER==""
ATTR{online}=="1"
ATTR{phys_device}=="0"
ATTR{phys_index}=="00000000"
ATTR{removable}=="0"
ATTR{state}=="online"
ATTR{type}=="boot"
ATTR{valid_zones}=="none"
$ udevadm info -q all -a /sys/devices/system/memory/memory90
KERNEL=="memory90"
SUBSYSTEM=="memory"
DRIVER==""
ATTR{online}=="1"
ATTR{phys_device}=="0"
ATTR{phys_index}=="0000005a"
ATTR{removable}=="1"
ATTR{state}=="online"
ATTR{type}=="dimm"
ATTR{valid_zones}=="Normal"
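
The output above is easy to consume from a script. The following Python
sketch (again not part of this series, with hypothetical helper names)
parses the ATTR{...} lines into a dictionary and reads "phys_index" as a
hexadecimal number. For the second block, 0000005a is 90, which matches the
"memory90" device name in the sample.

```python
import re

# Matches lines of the form: ATTR{name}=="value"
ATTR_RE = re.compile(r'^\s*ATTR\{(?P<name>[^}]+)\}=="(?P<value>[^"]*)"\s*$')


def parse_attrs(text):
    """Collect ATTR{name}=="value" pairs from udevadm output into a dict."""
    attrs = {}
    for line in text.splitlines():
        match = ATTR_RE.match(line)
        if match:
            attrs[match.group("name")] = match.group("value")
    return attrs


def block_number(attrs):
    """Interpret the phys_index attribute as a hexadecimal block number."""
    return int(attrs["phys_index"], 16)


# Second sample from this cover letter, copied verbatim.
SAMPLE = '''KERNEL=="memory90"
SUBSYSTEM=="memory"
DRIVER==""
ATTR{online}=="1"
ATTR{phys_device}=="0"
ATTR{phys_index}=="0000005a"
ATTR{removable}=="1"
ATTR{state}=="online"
ATTR{type}=="dimm"
ATTR{valid_zones}=="Normal"
'''

if __name__ == "__main__":
    attrs = parse_attrs(SAMPLE)
    print(attrs["type"], block_number(attrs))
```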
RFC -> RFCv2:
- Now also taking care of PPC (somehow missed it :/ )
- Split the series up to some degree (some ideas on how to split up patch 3
  would be very welcome)
- Introduce more memory block types. Turns out abstracting too much was
  rather confusing and not helpful. Properly document them.
Notes:
- I wanted to convert the enum of types into a named enum but this
  provoked all kinds of different errors. For now, I am doing it just like
  the other types (e.g. online_type) we are using in that context.
- The "removable" property should never have been named like that. It
  should have been "offlinable". Can we still rename that? E.g. boot memory
  is sometimes marked as removable ...
David Hildenbrand (4):
  mm/memory_hotplug: Introduce memory block types
  mm/memory_hotplug: Replace "bool want_memblock" by "int type"
  mm/memory_hotplug: Introduce and use more memory types
  mm/memory_hotplug: Drop MEMORY_TYPE_UNSPECIFIED
 arch/ia64/mm/init.c                       |  4 +-
 arch/powerpc/mm/mem.c                     |  4 +-
 arch/powerpc/platforms/powernv/memtrace.c |  9 +--
 .../platforms/pseries/hotplug-memory.c    |  7 +-
 arch/s390/mm/init.c                       |  4 +-
 arch/sh/mm/init.c                         |  4 +-
 arch/x86/mm/init_32.c                     |  4 +-
 arch/x86/mm/init_64.c                     |  8 +--
 drivers/acpi/acpi_memhotplug.c            | 16 ++++-
 drivers/base/memory.c                     | 60 ++++++++++++++--
 drivers/hv/hv_balloon.c                   |  3 +-
 drivers/s390/char/sclp_cmd.c              |  3 +-
 drivers/xen/balloon.c                     |  2 +-
 include/linux/memory.h                    | 69 ++++++++++++++++++-
 include/linux/memory_hotplug.h            | 18 ++---
 kernel/memremap.c                         |  6 +-
 mm/memory_hotplug.c                       | 29 ++++----
 17 files changed, 194 insertions(+), 56 deletions(-)
--
2.17.2