From: James Morse <james.morse@arm.com>
To: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: mark.rutland@arm.com, geoff@infradead.org,
catalin.marinas@arm.com, will.deacon@arm.com,
bauerman@linux.vnet.ibm.com, dyoung@redhat.com,
kexec@lists.infradead.org, linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH v30 05/11] arm64: kdump: protect crash dump kernel memory
Date: Wed, 25 Jan 2017 17:37:38 +0000 [thread overview]
Message-ID: <5888E262.4050208@arm.com> (raw)
In-Reply-To: <20170124085004.3892-4-takahiro.akashi@linaro.org>
Hi Akashi,
On 24/01/17 08:49, AKASHI Takahiro wrote:
> To protect the memory reserved for crash dump kernel once after loaded,
> arch_kexec_protect_crashres/unprotect_crashres() are meant to deal with
> permissions of the corresponding kernel mappings.
>
> We also have to
> - put the region in an isolated mapping, and
> - move copying kexec's control_code_page to machine_kexec_prepare()
> so that the region will be completely read-only after loading.
> Note that the region must reside in linear mapping and have corresponding
> page structures in order to be potentially freed by shrinking it through
> /sys/kernel/kexec_crash_size.
Nasty! Presumably you have to build the crash region out of individual page
mappings so that they can be returned to the slab-allocator one page at a time,
and still be able to set/clear the valid bits on the remaining chunk.
(I don't see how that happens in this patch)
debug_pagealloc has to do this too so it can flip the valid bits one page at a
time. You could change the debug_pagealloc_enabled() value passed in at the top
__create_pgd_mapping() level to be a needs_per_page_mapping(addr, size) test
that happens as we build the linear map. (This would save the 3 extra calls to
__create_pgd_mapping() in __map_memblock())
I'm glad to see you can't resize the region if a crash kernel is loaded!
This secretly-unmapped is the sort of thing that breaks hibernate, it blindly
assumes pfn_valid() means it can access the page if it wants to. Setting
PG_Reserved is a quick way to trick it out of doing this, but that would leave
the crash kernel region un-initialised after resume, while kexec_crash_image
still has a value.
I think the best fix for this is to forbid hibernate if kexec_crash_loaded()
arguing these are mutually-exclusive features, and the protect crash-dump
feature exists to prevent things like hibernate corrupting the crash region.
> diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
> index bc96c8a7fc79..f7938fecf3ff 100644
> --- a/arch/arm64/kernel/machine_kexec.c
> +++ b/arch/arm64/kernel/machine_kexec.c
> @@ -159,32 +171,20 @@ void machine_kexec(struct kimage *kimage)
> kimage->control_code_page);
> pr_debug("%s:%d: reboot_code_buffer_phys: %pa\n", __func__, __LINE__,
> &reboot_code_buffer_phys);
> - pr_debug("%s:%d: reboot_code_buffer: %p\n", __func__, __LINE__,
> - reboot_code_buffer);
> pr_debug("%s:%d: relocate_new_kernel: %p\n", __func__, __LINE__,
> arm64_relocate_new_kernel);
> pr_debug("%s:%d: relocate_new_kernel_size: 0x%lx(%lu) bytes\n",
> __func__, __LINE__, arm64_relocate_new_kernel_size,
> arm64_relocate_new_kernel_size);
>
> - /*
> - * Copy arm64_relocate_new_kernel to the reboot_code_buffer for use
> - * after the kernel is shut down.
> - */
> - memcpy(reboot_code_buffer, arm64_relocate_new_kernel,
> - arm64_relocate_new_kernel_size);
> -
> - /* Flush the reboot_code_buffer in preparation for its execution. */
> - __flush_dcache_area(reboot_code_buffer, arm64_relocate_new_kernel_size);
> - flush_icache_range((uintptr_t)reboot_code_buffer,
> - arm64_relocate_new_kernel_size);
> - /* Flush the kimage list and its buffers. */
> - kexec_list_flush(kimage);
> + if (kimage != kexec_crash_image) {
> + /* Flush the kimage list and its buffers. */
> + kexec_list_flush(kimage);
>
> - /* Flush the new image if already in place. */
> - if (kimage->head & IND_DONE)
> - kexec_segment_flush(kimage);
> + /* Flush the new image if already in place. */
> + if (kimage->head & IND_DONE)
> + kexec_segment_flush(kimage);
> + }
So for kdump we cleaned the kimage->segment[i].mem regions in
arch_kexec_protect_crashkres(), so don't need to do it here.
What about the kimage->head[i] array of list entries that were cleaned by
kexec_list_flush()? Now we don't clean that for kdump either, but we do pass it
arm64_relocate_new_kernel() at the end of this function:
> cpu_soft_restart(1, reboot_code_buffer_phys, kimage->head, kimage_start, 0);
Can we test the IND_DONE_BIT of kimage->head, so that we know that
arm64_relocate_new_kernel() won't try to walk the unclean list?
Alternatively we could call kexec_list_flush() in arch_kexec_protect_crashkres()
too.
Thanks,
James
_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec
WARNING: multiple messages have this Message-ID (diff)
From: james.morse@arm.com (James Morse)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH v30 05/11] arm64: kdump: protect crash dump kernel memory
Date: Wed, 25 Jan 2017 17:37:38 +0000 [thread overview]
Message-ID: <5888E262.4050208@arm.com> (raw)
In-Reply-To: <20170124085004.3892-4-takahiro.akashi@linaro.org>
Hi Akashi,
On 24/01/17 08:49, AKASHI Takahiro wrote:
> To protect the memory reserved for crash dump kernel once after loaded,
> arch_kexec_protect_crashres/unprotect_crashres() are meant to deal with
> permissions of the corresponding kernel mappings.
>
> We also have to
> - put the region in an isolated mapping, and
> - move copying kexec's control_code_page to machine_kexec_prepare()
> so that the region will be completely read-only after loading.
> Note that the region must reside in linear mapping and have corresponding
> page structures in order to be potentially freed by shrinking it through
> /sys/kernel/kexec_crash_size.
Nasty! Presumably you have to build the crash region out of individual page
mappings so that they can be returned to the slab-allocator one page at a time,
and still be able to set/clear the valid bits on the remaining chunk.
(I don't see how that happens in this patch)
debug_pagealloc has to do this too so it can flip the valid bits one page at a
time. You could change the debug_pagealloc_enabled() value passed in at the top
__create_pgd_mapping() level to be a needs_per_page_mapping(addr, size) test
that happens as we build the linear map. (This would save the 3 extra calls to
__create_pgd_mapping() in __map_memblock())
I'm glad to see you can't resize the region if a crash kernel is loaded!
This secretly-unmapped is the sort of thing that breaks hibernate, it blindly
assumes pfn_valid() means it can access the page if it wants to. Setting
PG_Reserved is a quick way to trick it out of doing this, but that would leave
the crash kernel region un-initialised after resume, while kexec_crash_image
still has a value.
I think the best fix for this is to forbid hibernate if kexec_crash_loaded()
arguing these are mutually-exclusive features, and the protect crash-dump
feature exists to prevent things like hibernate corrupting the crash region.
> diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
> index bc96c8a7fc79..f7938fecf3ff 100644
> --- a/arch/arm64/kernel/machine_kexec.c
> +++ b/arch/arm64/kernel/machine_kexec.c
> @@ -159,32 +171,20 @@ void machine_kexec(struct kimage *kimage)
> kimage->control_code_page);
> pr_debug("%s:%d: reboot_code_buffer_phys: %pa\n", __func__, __LINE__,
> &reboot_code_buffer_phys);
> - pr_debug("%s:%d: reboot_code_buffer: %p\n", __func__, __LINE__,
> - reboot_code_buffer);
> pr_debug("%s:%d: relocate_new_kernel: %p\n", __func__, __LINE__,
> arm64_relocate_new_kernel);
> pr_debug("%s:%d: relocate_new_kernel_size: 0x%lx(%lu) bytes\n",
> __func__, __LINE__, arm64_relocate_new_kernel_size,
> arm64_relocate_new_kernel_size);
>
> - /*
> - * Copy arm64_relocate_new_kernel to the reboot_code_buffer for use
> - * after the kernel is shut down.
> - */
> - memcpy(reboot_code_buffer, arm64_relocate_new_kernel,
> - arm64_relocate_new_kernel_size);
> -
> - /* Flush the reboot_code_buffer in preparation for its execution. */
> - __flush_dcache_area(reboot_code_buffer, arm64_relocate_new_kernel_size);
> - flush_icache_range((uintptr_t)reboot_code_buffer,
> - arm64_relocate_new_kernel_size);
> - /* Flush the kimage list and its buffers. */
> - kexec_list_flush(kimage);
> + if (kimage != kexec_crash_image) {
> + /* Flush the kimage list and its buffers. */
> + kexec_list_flush(kimage);
>
> - /* Flush the new image if already in place. */
> - if (kimage->head & IND_DONE)
> - kexec_segment_flush(kimage);
> + /* Flush the new image if already in place. */
> + if (kimage->head & IND_DONE)
> + kexec_segment_flush(kimage);
> + }
So for kdump we cleaned the kimage->segment[i].mem regions in
arch_kexec_protect_crashkres(), so don't need to do it here.
What about the kimage->head[i] array of list entries that were cleaned by
kexec_list_flush()? Now we don't clean that for kdump either, but we do pass it
arm64_relocate_new_kernel() at the end of this function:
> cpu_soft_restart(1, reboot_code_buffer_phys, kimage->head, kimage_start, 0);
Can we test the IND_DONE_BIT of kimage->head, so that we know that
arm64_relocate_new_kernel() won't try to walk the unclean list?
Alternatively we could call kexec_list_flush() in arch_kexec_protect_crashkres()
too.
Thanks,
James
next prev parent reply other threads:[~2017-01-25 17:37 UTC|newest]
Thread overview: 54+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-01-24 8:46 [PATCH v30 00/11] arm64: add kdump support AKASHI Takahiro
2017-01-24 8:46 ` AKASHI Takahiro
2017-01-24 8:49 ` [PATCH v30 01/11] memblock: add memblock_cap_memory_range() AKASHI Takahiro
2017-01-24 8:49 ` AKASHI Takahiro
2017-01-24 8:49 ` AKASHI Takahiro
2017-01-24 8:49 ` [PATCH v30 02/11] arm64: limit memory regions based on DT property, usable-memory-range AKASHI Takahiro
2017-01-24 8:49 ` AKASHI Takahiro
2017-01-24 8:49 ` [PATCH v30 03/11] arm64: kdump: reserve memory for crash dump kernel AKASHI Takahiro
2017-01-24 8:49 ` AKASHI Takahiro
2017-01-24 8:49 ` [PATCH v30 04/11] arm64: mm: allow for unmapping memory region from kernel mapping AKASHI Takahiro
2017-01-24 8:49 ` AKASHI Takahiro
2017-01-24 11:32 ` Pratyush Anand
2017-01-24 11:32 ` Pratyush Anand
2017-01-25 6:37 ` AKASHI Takahiro
2017-01-25 6:37 ` AKASHI Takahiro
2017-01-25 15:49 ` James Morse
2017-01-25 15:49 ` James Morse
2017-01-26 8:08 ` AKASHI Takahiro
2017-01-26 8:08 ` AKASHI Takahiro
2017-01-24 8:49 ` [PATCH v30 05/11] arm64: kdump: protect crash dump kernel memory AKASHI Takahiro
2017-01-24 8:49 ` AKASHI Takahiro
2017-01-25 17:37 ` James Morse [this message]
2017-01-25 17:37 ` James Morse
2017-01-26 11:28 ` AKASHI Takahiro
2017-01-26 11:28 ` AKASHI Takahiro
2017-01-27 11:19 ` James Morse
2017-01-27 11:19 ` James Morse
2017-01-27 17:15 ` AKASHI Takahiro
2017-01-27 17:15 ` AKASHI Takahiro
2017-01-27 18:56 ` Mark Rutland
2017-01-27 18:56 ` Mark Rutland
2017-01-30 8:42 ` AKASHI Takahiro
2017-01-30 8:42 ` AKASHI Takahiro
2017-01-30 8:27 ` AKASHI Takahiro
2017-01-30 8:27 ` AKASHI Takahiro
2017-01-27 13:59 ` James Morse
2017-01-27 13:59 ` James Morse
2017-01-27 15:42 ` AKASHI Takahiro
2017-01-27 15:42 ` AKASHI Takahiro
2017-01-27 19:41 ` Mark Rutland
2017-01-27 19:41 ` Mark Rutland
2017-01-24 8:50 ` [PATCH v30 06/11] arm64: kdump: implement machine_crash_shutdown() AKASHI Takahiro
2017-01-24 8:50 ` AKASHI Takahiro
2017-01-24 8:50 ` [PATCH v30 07/11] arm64: kdump: add VMCOREINFO's for user-space tools AKASHI Takahiro
2017-01-24 8:50 ` AKASHI Takahiro
2017-01-24 8:50 ` [PATCH v30 08/11] arm64: kdump: provide /proc/vmcore file AKASHI Takahiro
2017-01-24 8:50 ` AKASHI Takahiro
2017-01-24 8:50 ` [PATCH v30 09/11] arm64: kdump: enable kdump in defconfig AKASHI Takahiro
2017-01-24 8:50 ` AKASHI Takahiro
2017-01-24 8:50 ` [PATCH v30 10/11] Documentation: kdump: describe arm64 port AKASHI Takahiro
2017-01-24 8:50 ` AKASHI Takahiro
2017-01-24 8:53 ` [PATCH v30 11/11] Documentation: dt: chosen properties for arm64 kdump AKASHI Takahiro
2017-01-24 8:53 ` AKASHI Takahiro
2017-01-24 8:53 ` AKASHI Takahiro
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5888E262.4050208@arm.com \
--to=james.morse@arm.com \
--cc=bauerman@linux.vnet.ibm.com \
--cc=catalin.marinas@arm.com \
--cc=dyoung@redhat.com \
--cc=geoff@infradead.org \
--cc=kexec@lists.infradead.org \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=mark.rutland@arm.com \
--cc=takahiro.akashi@linaro.org \
--cc=will.deacon@arm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.