linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
From: takahiro.akashi@linaro.org (AKASHI Takahiro)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH v29 3/9] arm64: kdump: reserve memory for crash dump kernel
Date: Fri, 13 Jan 2017 17:16:18 +0900	[thread overview]
Message-ID: <20170113081617.GI20972@linaro.org> (raw)
In-Reply-To: <20170112150926.GA12249@leverpostej>

Hi Mark,

On Thu, Jan 12, 2017 at 03:09:26PM +0000, Mark Rutland wrote:
> Hi,
> 
> As a general note, I must apologise for my minimial review of the series
> until this point. Judging by the way the DT parts are organised. I'm
> very concerned with the way the DT parts are organised, and clearly I
> did not communicate my concerns and suggestions effectively in prior
> rounds of review.
> 
> On Wed, Dec 28, 2016 at 01:36:00PM +0900, AKASHI Takahiro wrote:
> > "crashkernel=" kernel parameter specifies the size (and optionally
> > the start address) of the system ram used by crash dump kernel.
> > reserve_crashkernel() will allocate and reserve the memory at the startup
> > of primary kernel.
> > 
> > This memory range will be exported to userspace via:
> > 	- an entry named "Crash kernel" in /proc/iomem, and
> > 	- "linux,crashkernel-base" and "linux,crashkernel-size" under
> > 	  /sys/firmware/devicetree/base/chosen
> 
> > +#ifdef CONFIG_KEXEC_CORE
> > +static unsigned long long crash_size, crash_base;
> > +static struct property crash_base_prop = {
> > +	.name = "linux,crashkernel-base",
> > +	.length = sizeof(u64),
> > +	.value = &crash_base
> > +};
> > +static struct property crash_size_prop = {
> > +	.name = "linux,crashkernel-size",
> > +	.length = sizeof(u64),
> > +	.value = &crash_size,
> > +};
> > +
> > +static int __init export_crashkernel(void)
> > +{
> > +	struct device_node *node;
> > +	int ret;
> > +
> > +	if (!crash_size)
> > +		return 0;
> > +
> > +	/* Add /chosen/linux,crashkernel-* properties */
> > +	node = of_find_node_by_path("/chosen");
> > +	if (!node)
> > +		return -ENOENT;
> > +
> > +	/*
> > +	 * There might be existing crash kernel properties, but we can't
> > +	 * be sure what's in them, so remove them.
> > +	 */
> > +	of_remove_property(node, of_find_property(node,
> > +				"linux,crashkernel-base", NULL));
> > +	of_remove_property(node, of_find_property(node,
> > +				"linux,crashkernel-size", NULL));
> > +
> > +	ret = of_add_property(node, &crash_base_prop);
> > +	if (ret)
> > +		goto ret_err;
> > +
> > +	ret = of_add_property(node, &crash_size_prop);
> > +	if (ret)
> > +		goto ret_err;
> > +
> > +	return 0;
> > +
> > +ret_err:
> > +	pr_warn("Exporting crashkernel region to device tree failed\n");
> > +	return ret;
> > +}
> > +late_initcall(export_crashkernel);
> 
> I very much do not like this.
> 
> I don't think we should be modifying the DT exposed to userspace in this
> manner, in the usual boot path, especially given that the kernel itself
> does not appear to be a consumer of this property. I do not think that
> it is right to use the DT exposed to userspace as a communication
> channel solely between the kernel and userspace.

As you mentioned in your comments against my patch#9, this property
originates from PPC implementation.
I added it solely from the sympathy for dt-based architectures.

> So I think we should drop the above, and for arm64 have userspace
> consistently use /proc/iomem (or perhaps a new kexec-specific file) to
> determine the region reserved for the crash kernel, if it needs to know
> this.

As a matter of fact, my port of kexec-tools doesn't check this property
and dropping it won't cause any problem.

> I'll have further comments on this front in the binding patch.
> 
> > +/*
> > + * reserve_crashkernel() - reserves memory for crash kernel
> > + *
> > + * This function reserves memory area given in "crashkernel=" kernel command
> > + * line parameter. The memory reserved is used by dump capture kernel when
> > + * primary kernel is crashing.
> > + */
> > +static void __init reserve_crashkernel(void)
> > +{
> > +	int ret;
> > +
> > +	ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
> > +				&crash_size, &crash_base);
> > +	/* no crashkernel= or invalid value specified */
> > +	if (ret || !crash_size)
> > +		return;
> > +
> > +	if (crash_base == 0) {
> > +		/* Current arm64 boot protocol requires 2MB alignment */
> > +		crash_base = memblock_find_in_range(0, ARCH_LOW_ADDRESS_LIMIT,
> > +				crash_size, SZ_2M);
> > +		if (crash_base == 0) {
> > +			pr_warn("Unable to allocate crashkernel (size:%llx)\n",
> > +				crash_size);
> > +			return;
> > +		}
> > +	} else {
> > +		/* User specifies base address explicitly. */
> > +		if (!memblock_is_region_memory(crash_base, crash_size) ||
> > +			memblock_is_region_reserved(crash_base, crash_size)) {
> > +			pr_warn("crashkernel has wrong address or size\n");
> > +			return;
> > +		}
> > +
> > +		if (!IS_ALIGNED(crash_base, SZ_2M)) {
> > +			pr_warn("crashkernel base address is not 2MB aligned\n");
> > +			return;
> > +		}
> > +	}
> > +	memblock_reserve(crash_base, crash_size);
> 
> This will mean that the crash kernel will have a permanent alias in the linear
> map which is vulnerable to being clobbered. There could also be issues
> with mismatched attributes in future.

Good point, I've never thought of that except making the memblock
region "reserved."

> We're probably ok for now, but in future we'll likely want to fix this
> up to remove the region (or mark it nomap), and only map it temporarily
> when loading things into the region.

Well, I found that the following commit is already in:
        commit 9b492cf58077
        Author: Xunlei Pang <xlpang@redhat.com>
        Date:   Mon May 23 16:24:10 2016 -0700

            kexec: introduce a protection mechanism for the crashkernel
            reserved memory

To make best use of this framework, I'd like to re-use set_memory_ro/rx()
instead of removing the region from linear mapping. But to do so,
we need to
* make memblock_isolate_range() global,
* allow set_memory_ro/rx() to be applied to regions in linear mapping
since set_memory_ro/rx() works only on page-level mappings.

What do you think?
(See my tentative solution below.)

> > +
> > +	pr_info("Reserving %lldMB of memory at %lldMB for crashkernel\n",
> > +		crash_size >> 20, crash_base >> 20);
> > +
> > +	crashk_res.start = crash_base;
> > +	crashk_res.end = crash_base + crash_size - 1;
> > +}
> > +#else
> > +static void __init reserve_crashkernel(void)
> > +{
> > +	;
> 
> Nit: the ';' line can go.

OK

Thanks,
-Takahiro AKASHI

> > +}
> > +#endif /* CONFIG_KEXEC_CORE */
> > +
> >  /*
> >   * Return the maximum physical address for ZONE_DMA (DMA_BIT_MASK(32)). It
> >   * currently assumes that for memory starting above 4G, 32-bit devices will
> > @@ -331,6 +438,9 @@ void __init arm64_memblock_init(void)
> >  		arm64_dma_phys_limit = max_zone_dma_phys();
> >  	else
> >  		arm64_dma_phys_limit = PHYS_MASK + 1;
> > +
> > +	reserve_crashkernel();
> > +
> >  	dma_contiguous_reserve(arm64_dma_phys_limit);
> >  
> >  	memblock_allow_resize();
> > -- 
> > 2.11.0
> 
> Other than my comments regarding the DT usage above, this looks fine to
> me.
> 
> Thanks,
> Mark.

===8<===
diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
index c0fc3d458195..bb21c0473b8e 100644
--- a/arch/arm64/kernel/machine_kexec.c
+++ b/arch/arm64/kernel/machine_kexec.c
@@ -211,6 +211,44 @@ void machine_kexec(struct kimage *kimage)
 	BUG(); /* Should never get here. */
 }
 
+static int kexec_mark_range(unsigned long start, unsigned long end,
+							bool protect)
+{
+	unsigned int nr_pages;
+
+	if (!end || start >= end)
+		return 0;
+
+	nr_pages = (end >> PAGE_SHIFT) - (start >> PAGE_SHIFT) + 1;
+
+	if (protect)
+		return set_memory_ro(__phys_to_virt(start), nr_pages);
+	else
+		return set_memory_rw(__phys_to_virt(start), nr_pages);
+}
+
+static void kexec_mark_crashkres(bool protect)
+{
+	unsigned long control;
+
+	/* Don't touch the control code page used in crash_kexec().*/
+	control = page_to_phys(kexec_crash_image->control_code_page);
+	kexec_mark_range(crashk_res.start, control - 1, protect);
+
+	control += KEXEC_CONTROL_PAGE_SIZE;
+	kexec_mark_range(control, crashk_res.end, protect);
+}
+
+void arch_kexec_protect_crashkres(void)
+{
+	kexec_mark_crashkres(true);
+}
+
+void arch_kexec_unprotect_crashkres(void)
+{
+	kexec_mark_crashkres(false);
+}
+
 static void machine_kexec_mask_interrupts(void)
 {
 	unsigned int i;
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 569ec3325bc8..764ec89c4f76 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -90,6 +90,7 @@ early_param("initrd", early_initrd);
 static void __init reserve_crashkernel(void)
 {
 	unsigned long long crash_size, crash_base;
+	int start_rgn, end_rgn;
 	int ret;
 
 	ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
@@ -121,6 +122,9 @@ static void __init reserve_crashkernel(void)
 		}
 	}
 	memblock_reserve(crash_base, crash_size);
+	memblock_isolate_range(&memblock.memory, crash_base, crash_size,
+			&start_rgn, &end_rgn);
+
 
 	pr_info("Reserving %lldMB of memory at %lldMB for crashkernel\n",
 		crash_size >> 20, crash_base >> 20);
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 17243e43184e..0f60f19c287b 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -22,6 +22,7 @@
 #include <linux/kernel.h>
 #include <linux/errno.h>
 #include <linux/init.h>
+#include <linux/kexec.h>
 #include <linux/libfdt.h>
 #include <linux/mman.h>
 #include <linux/nodemask.h>
@@ -362,6 +363,17 @@ static void __init __map_memblock(pgd_t *pgd, phys_addr_t start, phys_addr_t end
 	unsigned long kernel_start = __pa(_text);
 	unsigned long kernel_end = __pa(__init_begin);
 
+#ifdef CONFIG_KEXEC_CORE
+	if (crashk_res.end && start >= crashk_res.start &&
+			end <= (crashk_res.end + 1)) {
+		__create_pgd_mapping(pgd, start, __phys_to_virt(start),
+				     end - start, PAGE_KERNEL,
+				     early_pgtable_alloc,
+				     true);
+		return;
+	}
+#endif
+
 	/*
 	 * Take care not to create a writable alias for the
 	 * read-only text and rodata sections of the kernel image.
===>8===

  reply	other threads:[~2017-01-13  8:16 UTC|newest]

Thread overview: 39+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-12-28  4:33 [PATCH v29 0/9] arm64: add kdump support AKASHI Takahiro
2016-12-28  4:35 ` [PATCH v29 1/9] memblock: add memblock_cap_memory_range() AKASHI Takahiro
2017-01-10 11:16   ` Will Deacon
2017-01-11  1:57     ` Dennis Chen
2016-12-28  4:35 ` [PATCH v29 2/9] arm64: limit memory regions based on DT property, usable-memory-range AKASHI Takahiro
2016-12-28  4:36 ` [PATCH v29 3/9] arm64: kdump: reserve memory for crash dump kernel AKASHI Takahiro
2017-01-12 15:09   ` Mark Rutland
2017-01-13  8:16     ` AKASHI Takahiro [this message]
2017-01-13 11:39       ` Mark Rutland
2017-01-17  8:20         ` AKASHI Takahiro
2017-01-17 11:54           ` Mark Rutland
2017-01-19  9:49             ` AKASHI Takahiro
2017-01-19 11:28               ` Mark Rutland
2017-01-23  9:51                 ` AKASHI Takahiro
2017-01-23 10:23                   ` Mark Rutland
2017-01-24  7:55                     ` AKASHI Takahiro
2016-12-28  4:36 ` [PATCH v29 4/9] arm64: kdump: implement machine_crash_shutdown() AKASHI Takahiro
2017-01-10 11:32   ` Will Deacon
2017-01-11  6:36     ` AKASHI Takahiro
2017-01-11 10:54       ` Will Deacon
2017-01-12  4:21         ` AKASHI Takahiro
2017-01-12 12:01           ` Will Deacon
2017-01-23 17:46             ` Will Deacon
2017-01-24  7:52               ` AKASHI Takahiro
2016-12-28  4:36 ` [PATCH v29 5/9] arm64: kdump: add VMCOREINFO's for user-space tools AKASHI Takahiro
2016-12-28  4:36 ` [PATCH v29 6/9] arm64: kdump: provide /proc/vmcore file AKASHI Takahiro
2016-12-28  4:36 ` [PATCH v29 7/9] arm64: kdump: enable kdump in defconfig AKASHI Takahiro
2016-12-28  4:36 ` [PATCH v29 8/9] Documentation: kdump: describe arm64 port AKASHI Takahiro
2016-12-28  4:37 ` [PATCH v29 9/9] Documentation: dt: chosen properties for arm64 kdump AKASHI Takahiro
2017-01-10 11:10   ` Will Deacon
2017-01-12 15:39   ` Mark Rutland
2017-01-13  9:13     ` AKASHI Takahiro
2017-01-13 11:17       ` Mark Rutland
2017-01-16  8:25         ` AKASHI Takahiro
2017-01-17  8:26           ` Dave Young
2017-01-19  9:01             ` AKASHI Takahiro
2017-01-17 11:13           ` Mark Rutland
2017-01-06  3:26 ` [PATCH v29 0/9] arm64: add kdump support Pratyush Anand
2017-01-06  4:34   ` AKASHI Takahiro

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170113081617.GI20972@linaro.org \
    --to=takahiro.akashi@linaro.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).