From: David Hildenbrand <david@redhat.com>
To: stable@vger.kernel.org
Cc: linux-mm@kvack.org, Michal Hocko <mhocko@suse.com>,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
Andrew Morton <akpm@linux-foundation.org>,
"Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com>,
Baoquan He <bhe@redhat.com>,
Dan Williams <dan.j.williams@intel.com>,
Oscar Salvador <osalvador@suse.de>,
Wei Yang <richard.weiyang@gmail.com>,
David Hildenbrand <david@redhat.com>
Subject: [PATCH for 4.19-stable v3 01/24] mm/memory_hotplug: make remove_memory() take the device_hotplug_lock
Date: Tue, 28 Jan 2020 10:49:58 +0100 [thread overview]
Message-ID: <20200128095021.8076-2-david@redhat.com> (raw)
In-Reply-To: <20200128095021.8076-1-david@redhat.com>
commit d15e59260f62bd5e0f625cf5f5240f6ffac78ab6 upstream.
Patch series "mm: online/offline_pages called w.o. mem_hotplug_lock", v3.
Reading through the code and studying how mem_hotplug_lock is to be used,
I noticed that there are two places where we can end up calling
device_online()/device_offline() - online_pages()/offline_pages() without
the mem_hotplug_lock. And there are other places where we call
device_online()/device_offline() without the device_hotplug_lock.
While e.g.
echo "online" > /sys/devices/system/memory/memory9/state
is fine, e.g.
echo 1 > /sys/devices/system/memory/memory9/online
Will not take the mem_hotplug_lock. However the device_lock() and
device_hotplug_lock.
E.g. via memory_probe_store(), we can end up calling
add_memory()->online_pages() without the device_hotplug_lock. So we can
have concurrent callers in online_pages(). We e.g. touch in
online_pages() basically unprotected zone->present_pages then.
Looks like there is a longer history to that (see Patch #2 for details),
and fixing it to work the way it was intended is not really possible. We
would e.g. have to take the mem_hotplug_lock in device/base/core.c, which
sounds wrong.
Summary: We had a lock inversion on mem_hotplug_lock and device_lock().
More details can be found in patch 3 and patch 6.
I propose the general rules (documentation added in patch 6):
1. add_memory/add_memory_resource() must only be called with
device_hotplug_lock.
2. remove_memory() must only be called with device_hotplug_lock. This is
already documented and holds for all callers.
3. device_online()/device_offline() must only be called with
device_hotplug_lock. This is already documented and true for now in core
code. Other callers (related to memory hotplug) have to be fixed up.
4. mem_hotplug_lock is taken inside of add_memory/remove_memory/
online_pages/offline_pages.
To me, this looks way cleaner than what we have right now (and easier to
verify). And looking at the documentation of remove_memory, using
lock_device_hotplug also for add_memory() feels natural.
This patch (of 6):
remove_memory() is exported right now but requires the
device_hotplug_lock, which is not exported. So let's provide a variant
that takes the lock and only export that one.
The lock is already held in
arch/powerpc/platforms/pseries/hotplug-memory.c
drivers/acpi/acpi_memhotplug.c
arch/powerpc/platforms/powernv/memtrace.c
Apart from that, there are not other users in the tree.
Link: http://lkml.kernel.org/r/20180925091457.28651-2-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Pavel Tatashin <pavel.tatashin@microsoft.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Rashmica Gupta <rashmica.g@gmail.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Len Brown <lenb@kernel.org>
Cc: Rashmica Gupta <rashmica.g@gmail.com>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Cc: John Allen <jallen@linux.vnet.ibm.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: YASUAKI ISHIMATSU <yasu.isimatu@gmail.com>
Cc: Mathieu Malaterre <malat@debian.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Juergen Gross <jgross@suse.com>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Philippe Ombredanne <pombredanne@nexb.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
arch/powerpc/platforms/powernv/memtrace.c | 2 +-
arch/powerpc/platforms/pseries/hotplug-memory.c | 6 +++---
drivers/acpi/acpi_memhotplug.c | 2 +-
include/linux/memory_hotplug.h | 3 ++-
mm/memory_hotplug.c | 9 ++++++++-
5 files changed, 15 insertions(+), 7 deletions(-)
diff --git a/arch/powerpc/platforms/powernv/memtrace.c b/arch/powerpc/platforms/powernv/memtrace.c
index dd3cc4632b9a..84d038ed3882 100644
--- a/arch/powerpc/platforms/powernv/memtrace.c
+++ b/arch/powerpc/platforms/powernv/memtrace.c
@@ -122,7 +122,7 @@ static u64 memtrace_alloc_node(u32 nid, u64 size)
*/
end_pfn = base_pfn + nr_pages;
for (pfn = base_pfn; pfn < end_pfn; pfn += bytes>> PAGE_SHIFT) {
- remove_memory(nid, pfn << PAGE_SHIFT, bytes);
+ __remove_memory(nid, pfn << PAGE_SHIFT, bytes);
}
unlock_device_hotplug();
return base_pfn << PAGE_SHIFT;
diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c
index 62d3c72cd931..c2c6f32848e1 100644
--- a/arch/powerpc/platforms/pseries/hotplug-memory.c
+++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
@@ -301,7 +301,7 @@ static int pseries_remove_memblock(unsigned long base, unsigned int memblock_siz
nid = memory_add_physaddr_to_nid(base);
for (i = 0; i < sections_per_block; i++) {
- remove_memory(nid, base, MIN_MEMORY_BLOCK_SIZE);
+ __remove_memory(nid, base, MIN_MEMORY_BLOCK_SIZE);
base += MIN_MEMORY_BLOCK_SIZE;
}
@@ -393,7 +393,7 @@ static int dlpar_remove_lmb(struct drmem_lmb *lmb)
block_sz = pseries_memory_block_size();
nid = memory_add_physaddr_to_nid(lmb->base_addr);
- remove_memory(nid, lmb->base_addr, block_sz);
+ __remove_memory(nid, lmb->base_addr, block_sz);
/* Update memory regions for memory remove */
memblock_remove(lmb->base_addr, block_sz);
@@ -680,7 +680,7 @@ static int dlpar_add_lmb(struct drmem_lmb *lmb)
rc = dlpar_online_lmb(lmb);
if (rc) {
- remove_memory(nid, lmb->base_addr, block_sz);
+ __remove_memory(nid, lmb->base_addr, block_sz);
invalidate_lmb_associativity_index(lmb);
} else {
lmb->flags |= DRCONF_MEM_ASSIGNED;
diff --git a/drivers/acpi/acpi_memhotplug.c b/drivers/acpi/acpi_memhotplug.c
index 2ccfbb61ca89..8fe0960ea572 100644
--- a/drivers/acpi/acpi_memhotplug.c
+++ b/drivers/acpi/acpi_memhotplug.c
@@ -282,7 +282,7 @@ static void acpi_memory_remove_memory(struct acpi_memory_device *mem_device)
nid = memory_add_physaddr_to_nid(info->start_addr);
acpi_unbind_memory_blocks(info);
- remove_memory(nid, info->start_addr, info->length);
+ __remove_memory(nid, info->start_addr, info->length);
list_del(&info->list);
kfree(info);
}
diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index 4915e6cd7fd5..6f13a5a33b51 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -303,6 +303,7 @@ extern bool is_mem_section_removable(unsigned long pfn, unsigned long nr_pages);
extern void try_offline_node(int nid);
extern int offline_pages(unsigned long start_pfn, unsigned long nr_pages);
extern void remove_memory(int nid, u64 start, u64 size);
+extern void __remove_memory(int nid, u64 start, u64 size);
#else
static inline bool is_mem_section_removable(unsigned long pfn,
@@ -319,6 +320,7 @@ static inline int offline_pages(unsigned long start_pfn, unsigned long nr_pages)
}
static inline void remove_memory(int nid, u64 start, u64 size) {}
+static inline void __remove_memory(int nid, u64 start, u64 size) {}
#endif /* CONFIG_MEMORY_HOTREMOVE */
extern void __ref free_area_init_core_hotplug(int nid);
@@ -333,7 +335,6 @@ extern void move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
unsigned long nr_pages, struct vmem_altmap *altmap);
extern int offline_pages(unsigned long start_pfn, unsigned long nr_pages);
extern bool is_memblock_offlined(struct memory_block *mem);
-extern void remove_memory(int nid, u64 start, u64 size);
extern int sparse_add_one_section(struct pglist_data *pgdat,
unsigned long start_pfn, struct vmem_altmap *altmap);
extern void sparse_remove_one_section(struct zone *zone, struct mem_section *ms,
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 413f6709039a..e2e2cf7014ee 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1893,7 +1893,7 @@ EXPORT_SYMBOL(try_offline_node);
* and online/offline operations before this call, as required by
* try_offline_node().
*/
-void __ref remove_memory(int nid, u64 start, u64 size)
+void __ref __remove_memory(int nid, u64 start, u64 size)
{
int ret;
@@ -1922,5 +1922,12 @@ void __ref remove_memory(int nid, u64 start, u64 size)
mem_hotplug_done();
}
+
+void remove_memory(int nid, u64 start, u64 size)
+{
+ lock_device_hotplug();
+ __remove_memory(nid, start, size);
+ unlock_device_hotplug();
+}
EXPORT_SYMBOL_GPL(remove_memory);
#endif /* CONFIG_MEMORY_HOTREMOVE */
--
2.24.1
next prev parent reply other threads:[~2020-01-28 9:50 UTC|newest]
Thread overview: 51+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-01-28 9:49 [PATCH for 4.19-stable v3 00/24] mm/memory_hotplug: backport of pending stable fixes David Hildenbrand
2020-01-28 9:49 ` David Hildenbrand [this message]
2020-01-28 13:49 ` Patch "mm/memory_hotplug: make remove_memory() take the device_hotplug_lock" has been added to the 4.19-stable tree gregkh
2020-01-28 9:49 ` [PATCH for 4.19-stable v3 02/24] mm, sparse: drop pgdat_resize_lock in sparse_add/remove_one_section() David Hildenbrand
2020-01-28 13:49 ` Patch "mm, sparse: drop pgdat_resize_lock in sparse_add/remove_one_section()" has been added to the 4.19-stable tree gregkh
2020-01-28 9:50 ` [PATCH for 4.19-stable v3 03/24] mm, sparse: pass nid instead of pgdat to sparse_add_one_section() David Hildenbrand
2020-01-28 13:49 ` Patch "mm, sparse: pass nid instead of pgdat to sparse_add_one_section()" has been added to the 4.19-stable tree gregkh
2020-01-28 9:50 ` [PATCH for 4.19-stable v3 04/24] drivers/base/memory.c: remove an unnecessary check on NR_MEM_SECTIONS David Hildenbrand
2020-01-28 13:49 ` Patch "drivers/base/memory.c: remove an unnecessary check on NR_MEM_SECTIONS" has been added to the 4.19-stable tree gregkh
2020-01-28 9:50 ` [PATCH for 4.19-stable v3 05/24] mm, memory_hotplug: add nid parameter to arch_remove_memory David Hildenbrand
2020-01-28 13:49 ` Patch "mm, memory_hotplug: add nid parameter to arch_remove_memory" has been added to the 4.19-stable tree gregkh
2020-01-28 9:50 ` [PATCH for 4.19-stable v3 06/24] mm/memory_hotplug: release memory resource after arch_remove_memory() David Hildenbrand
2020-01-28 13:49 ` Patch "mm/memory_hotplug: release memory resource after arch_remove_memory()" has been added to the 4.19-stable tree gregkh
2020-01-28 9:50 ` [PATCH for 4.19-stable v3 07/24] drivers/base/memory.c: clean up relics in function parameters David Hildenbrand
2020-01-28 13:49 ` Patch "drivers/base/memory.c: clean up relics in function parameters" has been added to the 4.19-stable tree gregkh
2020-01-28 9:50 ` [PATCH for 4.19-stable v3 08/24] mm, memory_hotplug: update a comment in unregister_memory() David Hildenbrand
2020-01-28 13:49 ` Patch "mm, memory_hotplug: update a comment in unregister_memory()" has been added to the 4.19-stable tree gregkh
2020-01-28 9:50 ` [PATCH for 4.19-stable v3 09/24] mm/memory_hotplug: make unregister_memory_section() never fail David Hildenbrand
2020-01-28 13:49 ` Patch "mm/memory_hotplug: make unregister_memory_section() never fail" has been added to the 4.19-stable tree gregkh
2020-01-28 9:50 ` [PATCH for 4.19-stable v3 10/24] mm/memory_hotplug: make __remove_section() never fail David Hildenbrand
2020-01-28 13:49 ` Patch "mm/memory_hotplug: make __remove_section() never fail" has been added to the 4.19-stable tree gregkh
2020-01-28 9:50 ` [PATCH for 4.19-stable v3 11/24] powerpc/mm: Fix section mismatch warning David Hildenbrand
2020-01-28 13:49 ` Patch "powerpc/mm: Fix section mismatch warning" has been added to the 4.19-stable tree gregkh
2020-01-28 9:50 ` [PATCH for 4.19-stable v3 12/24] mm/memory_hotplug: make __remove_pages() and arch_remove_memory() never fail David Hildenbrand
2020-01-28 13:49 ` Patch "mm/memory_hotplug: make __remove_pages() and arch_remove_memory() never fail" has been added to the 4.19-stable tree gregkh
2020-01-28 9:50 ` [PATCH for 4.19-stable v3 13/24] s390x/mm: implement arch_remove_memory() David Hildenbrand
2020-01-28 13:49 ` Patch "s390x/mm: implement arch_remove_memory()" has been added to the 4.19-stable tree gregkh
2020-01-28 9:50 ` [PATCH for 4.19-stable v3 14/24] mm/memory_hotplug: allow arch_remove_memory() without CONFIG_MEMORY_HOTREMOVE David Hildenbrand
2020-01-28 13:49 ` Patch "mm/memory_hotplug: allow arch_remove_memory() without CONFIG_MEMORY_HOTREMOVE" has been added to the 4.19-stable tree gregkh
2020-01-28 9:50 ` [PATCH for 4.19-stable v3 15/24] drivers/base/memory: pass a block_id to init_memory_block() David Hildenbrand
2020-01-28 13:49 ` Patch "drivers/base/memory: pass a block_id to init_memory_block()" has been added to the 4.19-stable tree gregkh
2020-01-28 9:50 ` [PATCH for 4.19-stable v3 16/24] mm/memory_hotplug: create memory block devices after arch_add_memory() David Hildenbrand
2020-01-28 13:49 ` Patch "mm/memory_hotplug: create memory block devices after arch_add_memory()" has been added to the 4.19-stable tree gregkh
2020-01-28 9:50 ` [PATCH for 4.19-stable v3 17/24] mm/memory_hotplug: remove memory block devices before arch_remove_memory() David Hildenbrand
2020-01-28 13:49 ` Patch "mm/memory_hotplug: remove memory block devices before arch_remove_memory()" has been added to the 4.19-stable tree gregkh
2020-01-28 9:50 ` [PATCH for 4.19-stable v3 18/24] mm/memory_hotplug: make unregister_memory_block_under_nodes() never fail David Hildenbrand
2020-01-28 13:49 ` Patch "mm/memory_hotplug: make unregister_memory_block_under_nodes() never fail" has been added to the 4.19-stable tree gregkh
2020-01-28 9:50 ` [PATCH for 4.19-stable v3 19/24] mm/memory_hotplug: remove "zone" parameter from sparse_remove_one_section David Hildenbrand
2020-01-28 13:49 ` Patch "mm/memory_hotplug: remove "zone" parameter from sparse_remove_one_section" has been added to the 4.19-stable tree gregkh
2020-01-28 9:50 ` [PATCH for 4.19-stable v3 20/24] mm/hotplug: kill is_dev_zone() usage in __remove_pages() David Hildenbrand
2020-01-28 13:49 ` Patch "mm/hotplug: kill is_dev_zone() usage in __remove_pages()" has been added to the 4.19-stable tree gregkh
2020-01-28 9:50 ` [PATCH for 4.19-stable v3 21/24] drivers/base/node.c: simplify unregister_memory_block_under_nodes() David Hildenbrand
2020-01-28 13:49 ` Patch "drivers/base/node.c: simplify unregister_memory_block_under_nodes()" has been added to the 4.19-stable tree gregkh
2020-01-28 9:50 ` [PATCH for 4.19-stable v3 22/24] mm/memunmap: don't access uninitialized memmap in memunmap_pages() David Hildenbrand
2020-01-28 13:49 ` Patch "mm/memunmap: don't access uninitialized memmap in memunmap_pages()" has been added to the 4.19-stable tree gregkh
2020-01-28 9:50 ` [PATCH for 4.19-stable v3 23/24] mm/memory_hotplug: fix try_offline_node() David Hildenbrand
2020-01-28 13:49 ` Patch "mm/memory_hotplug: fix try_offline_node()" has been added to the 4.19-stable tree gregkh
2020-01-28 9:50 ` [PATCH for 4.19-stable v3 24/24] mm/memory_hotplug: shrink zones when offlining memory David Hildenbrand
2020-01-28 13:49 ` Patch "mm/memory_hotplug: shrink zones when offlining memory" has been added to the 4.19-stable tree gregkh
2020-01-28 13:50 ` [PATCH for 4.19-stable v3 00/24] mm/memory_hotplug: backport of pending stable fixes Greg Kroah-Hartman
2020-01-29 9:37 ` David Hildenbrand
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200128095021.8076-2-david@redhat.com \
--to=david@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=aneesh.kumar@linux.ibm.com \
--cc=bhe@redhat.com \
--cc=dan.j.williams@intel.com \
--cc=gregkh@linuxfoundation.org \
--cc=linux-mm@kvack.org \
--cc=mhocko@suse.com \
--cc=osalvador@suse.de \
--cc=richard.weiyang@gmail.com \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.