From: Mark Rutland <mark.rutland@arm.com>
To: AKASHI Takahiro <takahiro.akashi@linaro.org>,
catalin.marinas@arm.com, will.deacon@arm.com,
james.morse@arm.com, geoff@infradead.org,
bauerman@linux.vnet.ibm.com, dyoung@redhat.com,
kexec@lists.infradead.org, linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH v31 05/12] arm64: kdump: protect crash dump kernel memory
Date: Thu, 2 Feb 2017 11:16:37 +0000
Message-ID: <20170202111637.GA31394@leverpostej>
In-Reply-To: <20170202103129.GE13549@linaro.org>
Hi,
On Thu, Feb 02, 2017 at 07:31:30PM +0900, AKASHI Takahiro wrote:
> On Wed, Feb 01, 2017 at 06:00:08PM +0000, Mark Rutland wrote:
> > On Wed, Feb 01, 2017 at 09:46:24PM +0900, AKASHI Takahiro wrote:
> > > arch_kexec_protect_crashkres() and arch_kexec_unprotect_crashkres()
> > > are meant to be called around kexec_load() in order to protect
> > > the memory allocated for crash dump kernel once after it's loaded.
> > >
> > > The protection is implemented here by unmapping the region rather than
> > > making it read-only.
> > > To make the things work correctly, we also have to
> > > - put the region in an isolated, page-level mapping initially, and
> > > - move copying kexec's control_code_page to machine_kexec_prepare()
> > >
> > > Note that page-level mapping is also required to allow for shrinking
> > > the size of memory, through /sys/kernel/kexec_crash_size, by any number
> > > of multiple pages.
> >
> > Looking at kexec_crash_size_store(), I don't see where memory returned
> > to the OS is mapped. AFAICT, if the region is protected when the user
> > shrinks the region, the memory will not be mapped, yet handed over to
> > the kernel for general allocation.
>
> The region is protected only when the crash dump kernel is loaded,
> and after that, we are no longer able to shrink the region.
Ah, sorry. My misunderstanding strikes again. That should be fine; sorry
for the noise, and thanks for explaining.
> > > @@ -538,6 +540,24 @@ static void __init map_mem(pgd_t *pgd)
> > >  		if (memblock_is_nomap(reg))
> > >  			continue;
> > >
> > > +#ifdef CONFIG_KEXEC_CORE
> > > +		/*
> > > +		 * While crash dump kernel memory is contained in a single
> > > +		 * memblock for now, it should appear in an isolated mapping
> > > +		 * so that we can independently unmap the region later.
> > > +		 */
> > > +		if (crashk_res.end &&
> > > +		    (start <= crashk_res.start) &&
> > > +		    ((crashk_res.end + 1) < end)) {
> > > +			if (crashk_res.start != start)
> > > +				__map_memblock(pgd, start, crashk_res.start);
> > > +
> > > +			if ((crashk_res.end + 1) < end)
> > > +				__map_memblock(pgd, crashk_res.end + 1, end);
> > > +
> > > +			continue;
> > > +		}
> > > +#endif
> >
> > This wasn't quite what I had in mind. I had expected that here we would
> > isolate the ranges we wanted to avoid mapping (with a comment as to why
> > we couldn't move the memblock_isolate_range() calls earlier). In
> > map_memblock(), we'd skip those ranges entirely.
> >
> > I believe the above isn't correct if we have a single memblock.memory
> > region covering both the crashkernel and kernel regions. In that case,
> > we'd erroneously map the portion which overlaps the kernel.
> >
> > It seems there are a number of subtle problems here. :/
>
> I didn't see any problems, but I will go back with memblock_isolate_range()
> here in map_mem().
Imagine we have physical memory:

single RAM bank:    |---------------------------------------------------|
kernel image:               |---|
crashkernel:                                       |------|

... we reserve the image and crashkernel region, but these would still
remain part of the memory memblock, and we'd have a memblock layout
like:

memblock.memory:    |---------------------------------------------------|
memblock.reserved:          |---|                  |------|

... in map_mem() we iterate over memblock.memory, so we only have a
single entry to handle in this case. With the code above, we'd find that
it overlaps the crashk_res, and we'd map the parts which don't overlap,
e.g.

memblock.memory:    |---------------------------------------------------|
crashkernel:                                       |------|
mapped regions:     |-----------------------------|        |------------|

... however, this means we've mapped the portion which overlaps with the
kernel's linear alias (i.e. the case that we try to handle in
__map_memblock()). What we actually wanted was:

memblock.memory:    |---------------------------------------------------|
kernel image:               |---|
crashkernel:                                       |------|
mapped regions:     |------|     |----------------|        |------------|

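To make the failure concrete, here is a toy userspace model of the hunk above. It is only a sketch: struct range, map_around_crashk() and is_mapped() are made-up names, the ranges are half-open, and every mapped range is treated as an ordinary mapping, so none of this is the kernel's actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Half-open physical range [start, end); a stand-in for a memblock region. */
struct range {
	uint64_t start;
	uint64_t end;
};

#define MAX_MAPPED 4

/* Ranges that the model decided to map. */
struct mapped {
	struct range r[MAX_MAPPED];
	int nr;
};

static void map_range(struct mapped *m, uint64_t start, uint64_t end)
{
	assert(start < end && m->nr < MAX_MAPPED);
	m->r[m->nr++] = (struct range){ start, end };
}

/*
 * Simplified model of the hunk under review: split one memory region
 * around the crashkernel only. The kernel image is never considered.
 */
static void map_around_crashk(struct mapped *m, struct range mem,
			      struct range crashk)
{
	if (mem.start <= crashk.start && crashk.end <= mem.end) {
		if (crashk.start != mem.start)
			map_range(m, mem.start, crashk.start);
		if (crashk.end < mem.end)
			map_range(m, crashk.end, mem.end);
		return;
	}
	map_range(m, mem.start, mem.end);
}

/* True if any mapped range overlaps [r.start, r.end). */
static bool is_mapped(const struct mapped *m, struct range r)
{
	for (int i = 0; i < m->nr; i++)
		if (m->r[i].start < r.end && r.start < m->r[i].end)
			return true;
	return false;
}
```

With a bank of [0, 53), a kernel image at [8, 13) and a crashkernel at [31, 39) (the proportions of the diagrams above), the model maps [0, 31) and [39, 53): the crashkernel is left out, but the kernel image range sits inside the first mapped chunk.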
To handle all cases I think we have to isolate *both* the image and
crashkernel in map_mem(). That would leave us with:

memblock.memory:    |------||---||----------------||------||------------|
memblock.reserved:          |---|                  |------|

... so then we can check for overlap with either the kernel or
crashkernel in __map_memblock(), and return early, e.g.

__map_memblock(...)
{
	if (overlaps_with_kernel(...))
		return;
	if (overlaps_with_crashkernel(...))
		return;

	__create_pgd_mapping(...);
}

We can pull the kernel alias mapping out of __map_memblock() and put it
at the end of map_mem().
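As a rough sketch of that flow, again as a toy userspace model rather than kernel code: isolate_range() below only loosely mirrors what memblock_isolate_range() does, and struct region_list, overlaps() and map_mem_model() are hypothetical names. The kernel alias mapping that would go at the end of map_mem() is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Half-open physical range [start, end). */
struct range {
	uint64_t start;
	uint64_t end;
};

#define MAX_REGIONS 16

/* Sorted, non-overlapping regions; a stand-in for memblock.memory. */
struct region_list {
	struct range r[MAX_REGIONS];
	int nr;
};

/* Split regions at both edges of 'cut' so that no region straddles them. */
static void isolate_range(struct region_list *l, struct range cut)
{
	uint64_t edges[2] = { cut.start, cut.end };

	for (int e = 0; e < 2; e++) {
		for (int i = 0; i < l->nr; i++) {
			struct range *r = &l->r[i];

			/* Skip regions the edge does not fall strictly inside. */
			if (edges[e] <= r->start || edges[e] >= r->end)
				continue;

			assert(l->nr < MAX_REGIONS);
			/* Shift the tail up by one, then split region i. */
			for (int j = l->nr; j > i + 1; j--)
				l->r[j] = l->r[j - 1];
			l->r[i + 1] = (struct range){ edges[e], r->end };
			r->end = edges[e];
			l->nr++;
			break;
		}
	}
}

static bool overlaps(struct range a, struct range b)
{
	return a.start < b.end && b.start < a.end;
}

/*
 * Walk the (already isolated) regions and record the ones that would be
 * mapped, skipping anything that overlaps the kernel image or the
 * crashkernel. Returns the number of ranges written to 'out'.
 */
static int map_mem_model(const struct region_list *mem, struct range kernel,
			 struct range crashk, struct range *out)
{
	int n = 0;

	for (int i = 0; i < mem->nr; i++) {
		if (overlaps(mem->r[i], kernel))
			continue;
		if (overlaps(mem->r[i], crashk))
			continue;
		out[n++] = mem->r[i];
	}
	return n;
}
```

Calling isolate_range() for the kernel image and then for the crashkernel on a single [0, 53) bank leaves five regions, and map_mem_model() then keeps the three that touch neither range, which matches the "mapped regions" row we actually wanted.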
Does that make sense?
Thanks,
Mark.