From: Ross Zwisler <zwisler@google.com>
To: Michal Hocko <mhocko@suse.com>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
Mike Rapoport <rppt@kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
Matthew Wilcox <willy@infradead.org>,
Mel Gorman <mgorman@suse.de>, Vlastimil Babka <vbabka@suse.cz>,
David Hildenbrand <david@redhat.com>
Subject: Re: collision between ZONE_MOVABLE and memblock allocations
Date: Mon, 24 Jul 2023 10:56:35 -0600
Message-ID: <20230724165635.GA20994@google.com>
In-Reply-To: <ZLkk5Z3jGT88is5g@dhcp22.suse.cz>

On Thu, Jul 20, 2023 at 02:13:25PM +0200, Michal Hocko wrote:
> On Wed 19-07-23 16:48:21, Ross Zwisler wrote:
> > On Wed, Jul 19, 2023 at 08:14:48AM +0200, Michal Hocko wrote:
> > > On Tue 18-07-23 16:01:06, Ross Zwisler wrote:
> > > [...]
> > > > I do think that we need to fix this collision between ZONE_MOVABLE and memmap
> > > > allocations, because this issue essentially makes the movablecore= kernel
> > > > command line parameter useless in many cases, as the ZONE_MOVABLE region it
> > > > creates will often actually be unmovable.
> > >
> > > movablecore is kinda hack and I would be more inclined to get rid of it
> > > rather than build more into it. Could you be more specific about your
> > > use case?
> >
> > The problem that I'm trying to solve is that I'd like to be able to get kernel
> > core dumps off machines (chromebooks) so that we can debug crashes. Because
> > the memory used by the crash kernel ("crashkernel=" kernel command line
> > option) is consumed the entire time the machine is booted, there is a strong
> > motivation to keep the crash kernel as small and as simple as possible. To
> > this end I'm trying to get away without SSD drivers, not having to worry about
> > encryption on the SSDs, etc.
> >
> > So, the rough plan right now is:
> >
> > 1) During boot set aside some memory that won't contain kernel allocations.
> > I'm trying to do this now with ZONE_MOVABLE, but I'm open to better ways.
> >
> > We set aside memory for a crash kernel & arm it so that the ZONE_MOVABLE
> > region (or whatever non-kernel region) will be set aside as PMEM in the crash
> > kernel. This is done with the memmap=nn[KMG]!ss[KMG] kernel command line
> > parameter passed to the crash kernel.
> >
> > So, in my sample 4G VM system, I see:
> >
> > # lsmem --split ZONES --output-all
> > RANGE                                  SIZE  STATE REMOVABLE BLOCK NODE   ZONES
> > 0x0000000000000000-0x0000000007ffffff  128M online       yes     0    0    None
> > 0x0000000008000000-0x00000000bfffffff  2.9G online       yes  1-23    0   DMA32
> > 0x0000000100000000-0x000000012fffffff  768M online       yes 32-37    0  Normal
> > 0x0000000130000000-0x000000013fffffff  256M online       yes 38-39    0 Movable
> >
> > Memory block size: 128M
> > Total online memory: 4G
> > Total offline memory: 0B
> >
> > so I'll pass "memmap=256M!0x130000000" to the crash kernel.
> >
> > 2) When we hit a kernel crash, we know (hope?) that the PMEM region we've set
> > aside only contains user data, which we don't want to store anyway. We make a
> > filesystem in there, and create a kernel crash dump using 'makedumpfile':
> >
> > mkfs.ext4 /dev/pmem0
> > mount /dev/pmem0 /mnt
> > makedumpfile -c -d 31 /proc/vmcore /mnt/kdump
> >
> > We then set up the next full kernel boot to also have this same PMEM region,
> > using the same memmap kernel parameter. We reboot back into a full kernel.
>
> Btw. How do you ensure that the address range doesn't get reinitialized
> by POST? Do you rely on kexec boot here?

I've been working under the assumption that I do need to do a full reboot (not
just another kexec boot) so that the devices in the system (NICs, disks, etc)
are all reinitialized and don't carry over bad state from the crash.

I do know about the 'reset_devices' kernel command line parameter, but wasn't
sure that would be enough. From looking around it seems like this is very
driver + device dependent, so maybe I just need to test more.

In any case, you're right: if we do a full reboot and go through POST, whether
BIOS/UEFI/Coreboot/etc will zero memory is system dependent, and if it does,
this feature won't work unless we kexec to the 3rd kernel.

I've also heard concerns that a full reboot will cause the memory controller
to reinitialize and potentially cause memory bit flips or similar, though I
haven't yet seen this myself. Has anyone seen such bit flips / memory
corruption due to system reboot, or is this a non-issue in your experience?

Lots to figure out, thanks for the help. :)
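
For reference, the memmap= value quoted above follows directly from the
Movable range in the lsmem output. Below is a minimal sketch of that
arithmetic. The function name range_to_memmap is hypothetical (it is not an
existing tool), and the sketch assumes the range is MiB-aligned and that the
end address is inclusive, as lsmem prints it.

```shell
#!/bin/sh
# Hypothetical helper, not part of any existing tool: build a
# memmap=nn[KMG]!ss[KMG] argument from an inclusive physical address
# range as printed by lsmem.
# Assumption: the range size is a whole number of MiB.
range_to_memmap() {
    start=$1    # first byte of the range, e.g. 0x0000000130000000
    end=$2      # last byte of the range (inclusive)

    # The end address is inclusive, so add 1 to get the size in bytes.
    size=$(( $end - $start + 1 ))
    size_mib=$(( size / 1048576 ))

    # %x drops the leading zeros of the start address.
    printf 'memmap=%dM!0x%x\n' "$size_mib" "$start"
}

# Movable range from the sample 4G VM quoted above.
range_to_memmap 0x0000000130000000 0x000000013fffffff
```

For the sample VM this gives the same 256M region at 0x130000000 that is
passed to the crash kernel.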