From: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
To: David Hildenbrand <david@redhat.com>, linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, David Hildenbrand <david@redhat.com>,
Alexander Duyck <alexander.h.duyck@linux.intel.com>,
Andrew Morton <akpm@linux-foundation.org>,
Andy Lutomirski <luto@kernel.org>,
Anshuman Khandual <anshuman.khandual@arm.com>,
Arun KS <arunks@codeaurora.org>,
Benjamin Herrenschmidt <benh@kernel.crashing.org>,
Borislav Petkov <bp@alien8.de>,
Catalin Marinas <catalin.marinas@arm.com>,
Christian Borntraeger <borntraeger@de.ibm.com>,
Christophe Leroy <christophe.leroy@c-s.fr>,
Dan Williams <dan.j.williams@intel.com>,
Dave Hansen <dave.hansen@linux.intel.com>,
Fenghua Yu <fenghua.yu@intel.com>,
Gerald Schaefer <gerald.schaefer@de.ibm.com>,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
Halil Pasic <pasic@linux.ibm.com>,
Heiko Carstens <heiko.carstens@de.ibm.com>,
"H. Peter Anvin" <hpa@zytor.com>, Ingo Molnar <mingo@redhat.com>,
Ira Weiny <ira.weiny@intel.com>, Jason Gunthorpe <jgg@ziepe.ca>,
Johannes Weiner <hannes@cmpxchg.org>,
Jun Yao <yaojun8558363@gmail.com>,
Logan Gunthorpe <logang@deltatee.com>,
Mark Rutland <mark.rutland@arm.com>,
Masahiro Yamada <yamada.masahiro@socionext.com>,
"Matthew Wilcox (Oracle)" <willy@infradead.org>,
Mel Gorman <mgorman@techsingularity.net>,
Michael Ellerman <mpe@ellerman.id.au>,
Michal Hocko <mhocko@suse.com>,
Mike Rapoport <rppt@linux.ibm.com>,
Oscar Salvador <osalvador@suse.de>,
Paul Mackerras <paulus@samba.org>,
Pavel Tatashin <pasha.tatashin@soleen.com>,
Pavel Tatashin <pavel.tatashin@microsoft.com>,
Peter Zijlstra <peterz@infradead.org>, Qian Cai <cai@lca.pw>,
Rich Felker <dalias@libc.org>,
Robin Murphy <robin.murphy@arm.com>,
Steve Capper <steve.capper@arm.com>,
Thomas Gleixner <tglx@linutronix.de>,
Tom Lendacky <thomas.lendacky@amd.com>,
Tony Luck <tony.luck@intel.com>,
Vasily Gorbik <gor@linux.ibm.com>,
Vlastimil Babka <vbabka@suse.cz>,
Wei Yang <richard.weiyang@gmail.com>,
Wei Yang <richardw.yang@linux.intel.com>,
Will Deacon <will@kernel.org>,
Yoshinori Sato <ysato@users.sourceforge.jp>,
Yu Zhao <yuzhao@google.com>
Subject: Re: [PATCH v2 0/6] mm/memory_hotplug: Consider all zones when removing memory
Date: Mon, 26 Aug 2019 20:23:38 +0530
Message-ID: <87pnksm0zx.fsf@linux.ibm.com>
In-Reply-To: <20190826101012.10575-1-david@redhat.com>
David Hildenbrand <david@redhat.com> writes:
> Working on virtio-mem, I was able to trigger a kernel BUG (with debug
> options enabled) when removing memory that was never onlined. I was able
> to reproduce with DIMMs. As far as I can see the same can also happen
> without debug configs enabled, if we're unlucky and the uninitialized
> memmap contains selected garbage.
>
> The root problem is that we should not try to derive the zone of memory we
> are removing from the first PFN. The individual memory blocks of a DIMM
> could be spanned by different ZONEs, multiple ZONEs (after being offlined and
> re-onlined) or no ZONE at all (never onlined).
>
> Let's process all applicable zones when removing memory so we're on the
> safe side. In the long term, we want to resize the zones when offlining
> memory (and before removing ZONE_DEVICE memory), however, that will require
> more thought (and most probably a new SECTION_ACTIVE / pfn_active()
> thingy). More details about that in patch #3.
>
> Along with the fix, some related cleanups.
>
> v1 -> v2:
> - Include "mm: Introduce for_each_zone_nid()"
> - "mm/memory_hotplug: Pass nid instead of zone to __remove_pages()"
> -- Pass the nid instead of the zone and use it to reduce the number of
> zones to process
>
> --- snip ---
>
> I gave this a quick test with a DIMM on x86-64:
>
> Start with a NUMA-less node 1. Hotplug a DIMM (512MB) to Node 1.
> 1st memory block is not onlined. 2nd and 4th are onlined MOVABLE.
> 3rd is onlined NORMAL.
>
> :/# echo "online_movable" > /sys/devices/system/memory/memory41/state
> [...]
> :/# echo "online_movable" > /sys/devices/system/memory/memory43/state
> :/# echo "online_kernel" > /sys/devices/system/memory/memory42/state
> :/# cat /sys/devices/system/memory/memory40/state
> offline
>
> :/# cat /proc/zoneinfo
> Node 1, zone Normal
> [...]
> spanned 32768
> present 32768
> managed 32768
> [...]
> Node 1, zone Movable
> [...]
> spanned 98304
> present 65536
> managed 65536
> [...]
>
> Trigger hotunplug. If it succeeds (block 42 can be offlined):
>
> :/# cat /proc/zoneinfo
>
> Node 1, zone Normal
> pages free 0
> min 0
> low 0
> high 0
> spanned 0
> present 0
> managed 0
> protection: (0, 0, 0, 0, 0)
> Node 1, zone Movable
> pages free 0
> min 0
> low 0
> high 0
> spanned 0
> present 0
> managed 0
> protection: (0, 0, 0, 0, 0)
>
> So all zones were properly fixed up and we don't access the memmap of the
> first, never-onlined memory block (garbage). I am no longer able to trigger
> the BUG. I did a similar test with an already populated node.
>
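The zoneinfo numbers in the quoted test can be cross-checked with a small userspace sketch. It assumes 4 KiB pages and 128 MiB memory blocks (32768 pages per block), which matches a 512MB DIMM showing up as memory40..memory43. The type and function names are made up for illustration and are not kernel APIs.

```c
/* Assumption: 128 MiB memory block / 4 KiB page = 32768 pages per block. */
#define PAGES_PER_BLOCK 32768UL
#define NR_BLOCKS 4

/* Hypothetical zone tags, not the kernel's enum zone_type. */
enum model_zone { MODEL_ZONE_NONE, MODEL_ZONE_NORMAL, MODEL_ZONE_MOVABLE };

struct model_zone_stats {
	unsigned long spanned;	/* pages from first to last block of the zone */
	unsigned long present;	/* pages actually onlined to the zone */
};

/* Layout from the quoted test: memory40 never onlined, 41 and 43 MOVABLE, 42 NORMAL. */
static const enum model_zone dimm_layout[NR_BLOCKS] = {
	MODEL_ZONE_NONE, MODEL_ZONE_MOVABLE, MODEL_ZONE_NORMAL, MODEL_ZONE_MOVABLE,
};

static struct model_zone_stats model_zone_stats(const enum model_zone *layout,
						int nr_blocks,
						enum model_zone zone)
{
	struct model_zone_stats st = { 0, 0 };
	int first = -1, last = -1;
	int i;

	for (i = 0; i < nr_blocks; i++) {
		if (layout[i] != zone)
			continue;
		if (first < 0)
			first = i;
		last = i;
		st.present += PAGES_PER_BLOCK;
	}
	/* The span covers holes: other zones or never-onlined blocks in between. */
	if (first >= 0)
		st.spanned = (unsigned long)(last - first + 1) * PAGES_PER_BLOCK;
	return st;
}
```

With this layout the Movable zone spans blocks 41..43 (98304 pages) but only has 65536 present, and block 40 belongs to no zone at all, which is why deriving the zone from the first PFN of the DIMM cannot work.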
I reported a variant of this issue at
https://lore.kernel.org/linux-mm/20190514025354.9108-1-aneesh.kumar@linux.ibm.com/
This patch series still doesn't handle the fact that the struct page backing
start_pfn might not be initialized, i.e., it results in a crash like the one
below:
pc: c0000000004bc1ec: shrink_zone_span+0x1bc/0x290
lr: c0000000004bc1e8: shrink_zone_span+0x1b8/0x290
sp: c0000000dac7f910
msr: 800000000282b033
current = 0xc0000000da2fa000
paca = 0xc00000000fffb300 irqmask: 0x03 irq_happened: 0x01
pid = 1224, comm = ndctl
kernel BUG at /home/kvaneesh/src/linux/include/linux/mm.h:1088!
Linux version 5.3.0-rc6-17495-gc7727d815970-dirty (kvaneesh@ltc-boston123) (gcc version 7.4.0 (Ubuntu 7.4.0-1ubuntu1~18.04.1)) #183 SMP Mon Aug 26 09:37:32 CDT 2019
enter ? for help
[c0000000dac7f980] c0000000004bc574 __remove_zone+0x84/0xd0
[c0000000dac7f9d0] c0000000004bc920 __remove_section+0x100/0x170
[c0000000dac7fa30] c0000000004bec98 __remove_pages+0x168/0x220
[c0000000dac7fa90] c00000000007dff8 arch_remove_memory+0x38/0x110
[c0000000dac7fb00] c00000000050cb0c devm_memremap_pages_release+0x24c/0x2f0
[c0000000dac7fb90] c000000000cfec00 devm_action_release+0x30/0x50
[c0000000dac7fbb0] c000000000cffe7c release_nodes+0x24c/0x2c0
[c0000000dac7fc20] c000000000cf8988 device_release_driver_internal+0x168/0x230
[c0000000dac7fc60] c000000000cf5624 unbind_store+0x74/0x190
[c0000000dac7fcb0] c000000000cf42a4 drv_attr_store+0x44/0x60
[c0000000dac7fcd0] c000000000617d44 sysfs_kf_write+0x74/0xa0
I do have a few patches to handle the crashes earlier in
devm_memremap_pages_release():
--- a/mm/memremap.c
+++ b/mm/memremap.c
@@ -121,7 +121,7 @@ static void devm_memremap_pages_release(void *data)
 	dev_pagemap_cleanup(pgmap);
 	/* pages are dead and unused, undo the arch mapping */
-	nid = page_to_nid(pfn_to_page(PHYS_PFN(res->start)));
+	nid = page_to_nid(pfn_to_page(pfn_first(pgmap)));
and also one for pfn_first():
https://www.mail-archive.com/linux-nvdimm@lists.01.org/msg16205.html
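To illustrate why PHYS_PFN(res->start) and pfn_first(pgmap) can differ, here is a simplified userspace model. It assumes that pfn_first() skips the pfns set aside by the altmap at the start of the resource, so the struct pages for those pfns are the ones that may never be initialized. All names prefixed with model_ are hypothetical stand-ins, not the kernel definitions, and the 4 KiB page size is only for the example.

```c
#include <stdbool.h>

/* Assumption for the example only; the real page size is arch specific. */
#define MODEL_PAGE_SHIFT 12

/* Hypothetical stand-in for the altmap bookkeeping. */
struct model_altmap {
	unsigned long reserve;	/* pfns reserved at the start of the range */
	unsigned long free;	/* pfns used to back the memmap itself */
};

/* Hypothetical stand-in for the parts of the pagemap that matter here. */
struct model_pgmap {
	unsigned long long res_start;	/* physical start address of the resource */
	bool altmap_valid;
	struct model_altmap altmap;
};

/* Model of PHYS_PFN(): physical address to page frame number. */
static unsigned long model_phys_pfn(unsigned long long addr)
{
	return (unsigned long)(addr >> MODEL_PAGE_SHIFT);
}

/*
 * Model of the first pfn that has an initialized struct page: the start
 * of the resource, moved past the altmap area when an altmap is in use.
 */
static unsigned long model_pfn_first(const struct model_pgmap *pgmap)
{
	unsigned long pfn = model_phys_pfn(pgmap->res_start);

	if (pgmap->altmap_valid)
		pfn += pgmap->altmap.reserve + pgmap->altmap.free;
	return pfn;
}
```

Under these assumptions, a lookup based on the raw resource start lands on a pfn in front of model_pfn_first() whenever an altmap is present, which is the case the change above avoids.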
-aneesh