From: AKASHI Takahiro <takahiro.akashi@linaro.org>
To: James Morse <james.morse@arm.com>
Cc: herbert@gondor.apana.org.au, bhe@redhat.com,
ard.biesheuvel@linaro.org, catalin.marinas@arm.com,
bhsharma@redhat.com, will.deacon@arm.com,
linux-kernel@vger.kernel.org, dhowells@redhat.com, arnd@arndb.de,
linux-arm-kernel@lists.infradead.org, kexec@lists.infradead.org,
dyoung@redhat.com, davem@davemloft.net, vgoyal@redhat.com
Subject: Re: [PATCH v9 07/11] arm64: kexec_file: add crash dump support
Date: Fri, 18 May 2018 19:39:26 +0900 [thread overview]
Message-ID: <20180518103925.GP2737@linaro.org> (raw)
In-Reply-To: <f8d8305d-42f5-e1e2-20c4-733cb38cbb8d@arm.com>
On Tue, May 15, 2018 at 06:11:15PM +0100, James Morse wrote:
> Hi Akashi,
>
> On 25/04/18 07:26, AKASHI Takahiro wrote:
> > Enabling crash dump (kdump) includes
> > * prepare contents of ELF header of a core dump file, /proc/vmcore,
> > using crash_prepare_elf64_headers(), and
> > * add two device tree properties, "linux,usable-memory-range" and
> > "linux,elfcorehdr", which represent repsectively a memory range
>
> (Nit: respectively)
Will fix.
>
> > to be used by crash dump kernel and the header's location
>
> > arch/arm64/include/asm/kexec.h | 4 +
> > arch/arm64/kernel/kexec_image.c | 9 +-
> > arch/arm64/kernel/machine_kexec_file.c | 202 +++++++++++++++++++++++++
>
> In this patch, machine_kexec_file.c gains its own private fdt array encoder.
See below.
>
> > diff --git a/arch/arm64/kernel/machine_kexec_file.c b/arch/arm64/kernel/machine_kexec_file.c
> > index 37c0a9dc2e47..ec674f4d267c 100644
> > --- a/arch/arm64/kernel/machine_kexec_file.c
> > +++ b/arch/arm64/kernel/machine_kexec_file.c
> > @@ -76,6 +81,78 @@ int arch_kexec_walk_mem(struct kexec_buf *kbuf,
> > return ret;
> > }
> >
> > +static int __init arch_kexec_file_init(void)
> > +{
> > + /* Those values are used later on loading the kernel */
> > + __dt_root_addr_cells = dt_root_addr_cells;
> > + __dt_root_size_cells = dt_root_size_cells;
> > +
> > + return 0;
> > +}
> > +late_initcall(arch_kexec_file_init);
>
> If we need these is it worth taking them out of __initdata? I note they've been
> 'temporary' for quite a long time.
I think that I had some reason that I didn't do that, but don't remember now.
If there's no problem, I will take your suggestion.
>
> > +
> > +#define FDT_ALIGN(x, a) (((x) + (a) - 1) & ~((a) - 1))
> > +#define FDT_TAGALIGN(x) (FDT_ALIGN((x), FDT_TAGSIZE))
> > +
> > +static int fdt_prop_len(const char *prop_name, int len)
> > +{
> > + return (strlen(prop_name) + 1) +
> > + sizeof(struct fdt_property) +
> > + FDT_TAGALIGN(len);
> > +}
>
> This stuff should really be in libfdt.h Those macros come from
> libfdt_internal.h, so we're probably doing something wrong here.
>
>
> > +static bool cells_size_fitted(unsigned long base, unsigned long size)
> > +{
> > + /* if *_cells >= 2, cells can hold 64-bit values anyway */
> > + if ((__dt_root_addr_cells == 1) && (base >= (1ULL << 32)))
> > + return false;
> > +
> > + if ((__dt_root_size_cells == 1) && (size >= (1ULL << 32)))
> > + return false;
>
> Using '> U32_MAX' here may be more readable.
OK
>
> > + return true;
> > +}
> > +
> > +static void fill_property(void *buf, u64 val64, int cells)
> > +{
> > + u32 val32;
> > +
> > + if (cells == 1) {
> > + val32 = cpu_to_fdt32((u32)val64);
> > + memcpy(buf, &val32, sizeof(val32));
> > + } else {
>
> > + memset(buf, 0, cells * sizeof(u32) - sizeof(u64));
> > + buf += cells * sizeof(u32) - sizeof(u64);
>
> Is this trying to clear the 'top' cells and shuffle the pointer to point at the
> 'bottom' 2? I'm pretty sure this isn't endian safe.
>
> Do we really expect a system to have #address-cells > 2?
I don't know, but just for safety.
>
> > + val64 = cpu_to_fdt64(val64);
> > + memcpy(buf, &val64, sizeof(val64));
> > + }
> > +}
> > +
> > +static int fdt_setprop_range(void *fdt, int nodeoffset, const char *name,
> > + unsigned long addr, unsigned long size)
>
> (the device-tree spec describes a 'ranges' property, which had me confused. This
> is encoding a prop-encoded-array)
Should we rename it to, say, fdt_setprop_reg()?
> > +{
> > + void *buf, *prop;
> > + size_t buf_size;
> > + int result;
> > +
> > + buf_size = (__dt_root_addr_cells + __dt_root_size_cells) * sizeof(u32);
> > + prop = buf = vmalloc(buf_size);
>
> virtual memory allocation for something less than PAGE_SIZE?
I've never cared about that. Let me think again.
>
> > + if (!buf)
> > + return -ENOMEM;
> > +
> > + fill_property(prop, addr, __dt_root_addr_cells);
> > + prop += __dt_root_addr_cells * sizeof(u32);
> > +
> > + fill_property(prop, size, __dt_root_size_cells);
> > +
> > + result = fdt_setprop(fdt, nodeoffset, name, buf, buf_size);
> > +
> > + vfree(buf);
> > +
> > + return result;
> > +}
>
> Doesn't this stuff belong in libfdt? I guess there is no 'add array element' api
> because this the first time we've wanted to create a node with more than
> key=fixed-size-value.
>
> I don't think this belongs in arch C code. Do we have a plan for getting libfdt
> to support encoding prop-arrays? Can we put it somewhere anyone else duplicating
> this will find it, until we can (re)move it?
I will temporarily move all fdt-related stuff to a separate file, but
> I have no idea how that happens... it looks like the devicetree list is the
> place to ask.
should we always sync with the original dtc/libfdt repository?
>
> > static int setup_dtb(struct kimage *image,
> > unsigned long initrd_load_addr, unsigned long initrd_len,
> > char *cmdline, unsigned long cmdline_len,
> > @@ -88,10 +165,26 @@ static int setup_dtb(struct kimage *image,
> > int range_len;
> > int ret;
> >
> > + /* check ranges against root's #address-cells and #size-cells */
> > + if (image->type == KEXEC_TYPE_CRASH &&
> > + (!cells_size_fitted(image->arch.elf_load_addr,
> > + image->arch.elf_headers_sz) ||
> > + !cells_size_fitted(crashk_res.start,
> > + crashk_res.end - crashk_res.start + 1))) {
> > + pr_err("Crash memory region doesn't fit into DT's root cell sizes.\n");
> > + ret = -EINVAL;
> > + goto out_err;
> > + }
>
> To check I've understood this properly: This can happen if the firmware provided
> a DTB with 32bit address/size cells, but at least some of the memory requires 64
> bit address/size cells. This could only happen on a UEFI system where the
> firmware-DTB doesn't describe memory. ACPI-only systems would have the EFIstub DT.
Probably, yes. I assumed the case where #address-cells and #size-cells
were just missing in fdt.
>
> > /* duplicate dt blob */
> > buf_size = fdt_totalsize(initial_boot_params);
> > range_len = (__dt_root_addr_cells + __dt_root_size_cells) * sizeof(u32);
> >
> > + if (image->type == KEXEC_TYPE_CRASH)
> > + buf_size += fdt_prop_len("linux,elfcorehdr", range_len)
> > + + fdt_prop_len("linux,usable-memory-range",
> > + range_len);
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > +
> > if (initrd_load_addr)
> > buf_size += fdt_prop_len("linux,initrd-start", sizeof(u64))
> > + fdt_prop_len("linux,initrd-end", sizeof(u64));
> > @@ -113,6 +206,23 @@ static int setup_dtb(struct kimage *image,
> > if (nodeoffset < 0)
> > goto out_err;
> >
> > + if (image->type == KEXEC_TYPE_CRASH) {
> > + /* add linux,elfcorehdr */
> > + ret = fdt_setprop_range(buf, nodeoffset, "linux,elfcorehdr",
> > + image->arch.elf_load_addr,
> > + image->arch.elf_headers_sz);
> > + if (ret)
> > + goto out_err;
> > +
> > + /* add linux,usable-memory-range */
> > + ret = fdt_setprop_range(buf, nodeoffset,
> > + "linux,usable-memory-range",
> > + crashk_res.start,
> > + crashk_res.end - crashk_res.start + 1);
>
> Don't you need to add "linux,usable-memory-range" to the buf_size estimate?
I think the code exists. See above.
>
> > + if (ret)
> > + goto out_err;
> > + }
>
> > @@ -148,17 +258,109 @@ static int setup_dtb(struct kimage *image,
>
> > +static struct crash_mem *get_crash_memory_ranges(void)
> > +{
> > + unsigned int nr_ranges;
> > + struct crash_mem *cmem;
> > +
> > + nr_ranges = 1; /* for exclusion of crashkernel region */
> > + walk_system_ram_res(0, -1, &nr_ranges, get_nr_ranges_callback);
> > +
> > + cmem = vmalloc(sizeof(struct crash_mem) +
> > + sizeof(struct crash_mem_range) * nr_ranges);
> > + if (!cmem)
> > + return NULL;
> > +
> > + cmem->max_nr_ranges = nr_ranges;
> > + cmem->nr_ranges = 0;
> > + walk_system_ram_res(0, -1, cmem, add_mem_range_callback);
> > +
> > + /* Exclude crashkernel region */
> > + if (crash_exclude_mem_range(cmem, crashk_res.start, crashk_res.end)) {
> > + vfree(cmem);
> > + return NULL;
> > + }
> > +
> > + return cmem;
> > +}
>
> Could this function be included in prepare_elf_headers() so that the alloc() and
> free() occur together.
Or aiming that arm64 and x86 have similar-look code?
>
> > +static int prepare_elf_headers(void **addr, unsigned long *sz)
> > +{
> > + struct crash_mem *cmem;
> > + int ret = 0;
> > +
> > + cmem = get_crash_memory_ranges();
> > + if (!cmem)
> > + return -ENOMEM;
> > +
> > + ret = crash_prepare_elf64_headers(cmem, true, addr, sz);
> > +
> > + vfree(cmem);
>
> > + return ret;
> > +}
>
> All this is moving memory-range information from core-code's
> walk_system_ram_res() into core-code's struct crash_mem, and excluding
> crashk_res, which again is accessible to the core code.
>
> It looks like this is duplicated in arch/x86 and arch/arm64 because arm64
> doesn't have a second 'crashk_low_res' region, and always wants elf64, instead
> of when IS_ENABLED(CONFIG_X86_64).
> If we can abstract just those two, more of this could be moved to core code
> where powerpc can make use of it if they want to support kdump with
> kexec_file_load().
>
> But, its getting late for cross-architecture dependencies, lets put that on the
> for-later list. (assuming there isn't a powerpc-kdump series out there adding a
> third copy of this)
Sure. X86 code has so many exceptional lines in the code :)
Thanks,
-Takahiro AKASHI
>
> Thanks,
>
> James
_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec
WARNING: multiple messages have this Message-ID (diff)
From: takahiro.akashi@linaro.org (AKASHI Takahiro)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH v9 07/11] arm64: kexec_file: add crash dump support
Date: Fri, 18 May 2018 19:39:26 +0900 [thread overview]
Message-ID: <20180518103925.GP2737@linaro.org> (raw)
In-Reply-To: <f8d8305d-42f5-e1e2-20c4-733cb38cbb8d@arm.com>
On Tue, May 15, 2018 at 06:11:15PM +0100, James Morse wrote:
> Hi Akashi,
>
> On 25/04/18 07:26, AKASHI Takahiro wrote:
> > Enabling crash dump (kdump) includes
> > * prepare contents of ELF header of a core dump file, /proc/vmcore,
> > using crash_prepare_elf64_headers(), and
> > * add two device tree properties, "linux,usable-memory-range" and
> > "linux,elfcorehdr", which represent repsectively a memory range
>
> (Nit: respectively)
Will fix.
>
> > to be used by crash dump kernel and the header's location
>
> > arch/arm64/include/asm/kexec.h | 4 +
> > arch/arm64/kernel/kexec_image.c | 9 +-
> > arch/arm64/kernel/machine_kexec_file.c | 202 +++++++++++++++++++++++++
>
> In this patch, machine_kexec_file.c gains its own private fdt array encoder.
See below.
>
> > diff --git a/arch/arm64/kernel/machine_kexec_file.c b/arch/arm64/kernel/machine_kexec_file.c
> > index 37c0a9dc2e47..ec674f4d267c 100644
> > --- a/arch/arm64/kernel/machine_kexec_file.c
> > +++ b/arch/arm64/kernel/machine_kexec_file.c
> > @@ -76,6 +81,78 @@ int arch_kexec_walk_mem(struct kexec_buf *kbuf,
> > return ret;
> > }
> >
> > +static int __init arch_kexec_file_init(void)
> > +{
> > + /* Those values are used later on loading the kernel */
> > + __dt_root_addr_cells = dt_root_addr_cells;
> > + __dt_root_size_cells = dt_root_size_cells;
> > +
> > + return 0;
> > +}
> > +late_initcall(arch_kexec_file_init);
>
> If we need these is it worth taking them out of __initdata? I note they've been
> 'temporary' for quite a long time.
I think that I had some reason that I didn't do that, but don't remember now.
If there's no problem, I will take your suggestion.
>
> > +
> > +#define FDT_ALIGN(x, a) (((x) + (a) - 1) & ~((a) - 1))
> > +#define FDT_TAGALIGN(x) (FDT_ALIGN((x), FDT_TAGSIZE))
> > +
> > +static int fdt_prop_len(const char *prop_name, int len)
> > +{
> > + return (strlen(prop_name) + 1) +
> > + sizeof(struct fdt_property) +
> > + FDT_TAGALIGN(len);
> > +}
>
> This stuff should really be in libfdt.h Those macros come from
> libfdt_internal.h, so we're probably doing something wrong here.
>
>
> > +static bool cells_size_fitted(unsigned long base, unsigned long size)
> > +{
> > + /* if *_cells >= 2, cells can hold 64-bit values anyway */
> > + if ((__dt_root_addr_cells == 1) && (base >= (1ULL << 32)))
> > + return false;
> > +
> > + if ((__dt_root_size_cells == 1) && (size >= (1ULL << 32)))
> > + return false;
>
> Using '> U32_MAX' here may be more readable.
OK
>
> > + return true;
> > +}
> > +
> > +static void fill_property(void *buf, u64 val64, int cells)
> > +{
> > + u32 val32;
> > +
> > + if (cells == 1) {
> > + val32 = cpu_to_fdt32((u32)val64);
> > + memcpy(buf, &val32, sizeof(val32));
> > + } else {
>
> > + memset(buf, 0, cells * sizeof(u32) - sizeof(u64));
> > + buf += cells * sizeof(u32) - sizeof(u64);
>
> Is this trying to clear the 'top' cells and shuffle the pointer to point at the
> 'bottom' 2? I'm pretty sure this isn't endian safe.
>
> Do we really expect a system to have #address-cells > 2?
I don't know, but just for safety.
>
> > + val64 = cpu_to_fdt64(val64);
> > + memcpy(buf, &val64, sizeof(val64));
> > + }
> > +}
> > +
> > +static int fdt_setprop_range(void *fdt, int nodeoffset, const char *name,
> > + unsigned long addr, unsigned long size)
>
> (the device-tree spec describes a 'ranges' property, which had me confused. This
> is encoding a prop-encoded-array)
Should we rename it to, say, fdt_setprop_reg()?
> > +{
> > + void *buf, *prop;
> > + size_t buf_size;
> > + int result;
> > +
> > + buf_size = (__dt_root_addr_cells + __dt_root_size_cells) * sizeof(u32);
> > + prop = buf = vmalloc(buf_size);
>
> virtual memory allocation for something less than PAGE_SIZE?
I've never cared about that. Let me think again.
>
> > + if (!buf)
> > + return -ENOMEM;
> > +
> > + fill_property(prop, addr, __dt_root_addr_cells);
> > + prop += __dt_root_addr_cells * sizeof(u32);
> > +
> > + fill_property(prop, size, __dt_root_size_cells);
> > +
> > + result = fdt_setprop(fdt, nodeoffset, name, buf, buf_size);
> > +
> > + vfree(buf);
> > +
> > + return result;
> > +}
>
> Doesn't this stuff belong in libfdt? I guess there is no 'add array element' api
> because this the first time we've wanted to create a node with more than
> key=fixed-size-value.
>
> I don't think this belongs in arch C code. Do we have a plan for getting libfdt
> to support encoding prop-arrays? Can we put it somewhere anyone else duplicating
> this will find it, until we can (re)move it?
I will temporarily move all fdt-related stuff to a separate file, but
> I have no idea how that happens... it looks like the devicetree list is the
> place to ask.
should we always sync with the original dtc/libfdt repository?
>
> > static int setup_dtb(struct kimage *image,
> > unsigned long initrd_load_addr, unsigned long initrd_len,
> > char *cmdline, unsigned long cmdline_len,
> > @@ -88,10 +165,26 @@ static int setup_dtb(struct kimage *image,
> > int range_len;
> > int ret;
> >
> > + /* check ranges against root's #address-cells and #size-cells */
> > + if (image->type == KEXEC_TYPE_CRASH &&
> > + (!cells_size_fitted(image->arch.elf_load_addr,
> > + image->arch.elf_headers_sz) ||
> > + !cells_size_fitted(crashk_res.start,
> > + crashk_res.end - crashk_res.start + 1))) {
> > + pr_err("Crash memory region doesn't fit into DT's root cell sizes.\n");
> > + ret = -EINVAL;
> > + goto out_err;
> > + }
>
> To check I've understood this properly: This can happen if the firmware provided
> a DTB with 32bit address/size cells, but at least some of the memory requires 64
> bit address/size cells. This could only happen on a UEFI system where the
> firmware-DTB doesn't describe memory. ACPI-only systems would have the EFIstub DT.
Probably, yes. I assumed the case where #address-cells and #size-cells
were just missing in fdt.
>
> > /* duplicate dt blob */
> > buf_size = fdt_totalsize(initial_boot_params);
> > range_len = (__dt_root_addr_cells + __dt_root_size_cells) * sizeof(u32);
> >
> > + if (image->type == KEXEC_TYPE_CRASH)
> > + buf_size += fdt_prop_len("linux,elfcorehdr", range_len)
> > + + fdt_prop_len("linux,usable-memory-range",
> > + range_len);
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > +
> > if (initrd_load_addr)
> > buf_size += fdt_prop_len("linux,initrd-start", sizeof(u64))
> > + fdt_prop_len("linux,initrd-end", sizeof(u64));
> > @@ -113,6 +206,23 @@ static int setup_dtb(struct kimage *image,
> > if (nodeoffset < 0)
> > goto out_err;
> >
> > + if (image->type == KEXEC_TYPE_CRASH) {
> > + /* add linux,elfcorehdr */
> > + ret = fdt_setprop_range(buf, nodeoffset, "linux,elfcorehdr",
> > + image->arch.elf_load_addr,
> > + image->arch.elf_headers_sz);
> > + if (ret)
> > + goto out_err;
> > +
> > + /* add linux,usable-memory-range */
> > + ret = fdt_setprop_range(buf, nodeoffset,
> > + "linux,usable-memory-range",
> > + crashk_res.start,
> > + crashk_res.end - crashk_res.start + 1);
>
> Don't you need to add "linux,usable-memory-range" to the buf_size estimate?
I think the code exists. See above.
>
> > + if (ret)
> > + goto out_err;
> > + }
>
> > @@ -148,17 +258,109 @@ static int setup_dtb(struct kimage *image,
>
> > +static struct crash_mem *get_crash_memory_ranges(void)
> > +{
> > + unsigned int nr_ranges;
> > + struct crash_mem *cmem;
> > +
> > + nr_ranges = 1; /* for exclusion of crashkernel region */
> > + walk_system_ram_res(0, -1, &nr_ranges, get_nr_ranges_callback);
> > +
> > + cmem = vmalloc(sizeof(struct crash_mem) +
> > + sizeof(struct crash_mem_range) * nr_ranges);
> > + if (!cmem)
> > + return NULL;
> > +
> > + cmem->max_nr_ranges = nr_ranges;
> > + cmem->nr_ranges = 0;
> > + walk_system_ram_res(0, -1, cmem, add_mem_range_callback);
> > +
> > + /* Exclude crashkernel region */
> > + if (crash_exclude_mem_range(cmem, crashk_res.start, crashk_res.end)) {
> > + vfree(cmem);
> > + return NULL;
> > + }
> > +
> > + return cmem;
> > +}
>
> Could this function be included in prepare_elf_headers() so that the alloc() and
> free() occur together.
Or aiming that arm64 and x86 have similar-look code?
>
> > +static int prepare_elf_headers(void **addr, unsigned long *sz)
> > +{
> > + struct crash_mem *cmem;
> > + int ret = 0;
> > +
> > + cmem = get_crash_memory_ranges();
> > + if (!cmem)
> > + return -ENOMEM;
> > +
> > + ret = crash_prepare_elf64_headers(cmem, true, addr, sz);
> > +
> > + vfree(cmem);
>
> > + return ret;
> > +}
>
> All this is moving memory-range information from core-code's
> walk_system_ram_res() into core-code's struct crash_mem, and excluding
> crashk_res, which again is accessible to the core code.
>
> It looks like this is duplicated in arch/x86 and arch/arm64 because arm64
> doesn't have a second 'crashk_low_res' region, and always wants elf64, instead
> of when IS_ENABLED(CONFIG_X86_64).
> If we can abstract just those two, more of this could be moved to core code
> where powerpc can make use of it if they want to support kdump with
> kexec_file_load().
>
> But, its getting late for cross-architecture dependencies, lets put that on the
> for-later list. (assuming there isn't a powerpc-kdump series out there adding a
> third copy of this)
Sure. X86 code has so many exceptional lines in the code :)
Thanks,
-Takahiro AKASHI
>
> Thanks,
>
> James
WARNING: multiple messages have this Message-ID (diff)
From: AKASHI Takahiro <takahiro.akashi@linaro.org>
To: James Morse <james.morse@arm.com>
Cc: catalin.marinas@arm.com, will.deacon@arm.com,
dhowells@redhat.com, vgoyal@redhat.com,
herbert@gondor.apana.org.au, davem@davemloft.net,
dyoung@redhat.com, bhe@redhat.com, arnd@arndb.de,
ard.biesheuvel@linaro.org, bhsharma@redhat.com,
kexec@lists.infradead.org, linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH v9 07/11] arm64: kexec_file: add crash dump support
Date: Fri, 18 May 2018 19:39:26 +0900 [thread overview]
Message-ID: <20180518103925.GP2737@linaro.org> (raw)
In-Reply-To: <f8d8305d-42f5-e1e2-20c4-733cb38cbb8d@arm.com>
On Tue, May 15, 2018 at 06:11:15PM +0100, James Morse wrote:
> Hi Akashi,
>
> On 25/04/18 07:26, AKASHI Takahiro wrote:
> > Enabling crash dump (kdump) includes
> > * prepare contents of ELF header of a core dump file, /proc/vmcore,
> > using crash_prepare_elf64_headers(), and
> > * add two device tree properties, "linux,usable-memory-range" and
> > "linux,elfcorehdr", which represent repsectively a memory range
>
> (Nit: respectively)
Will fix.
>
> > to be used by crash dump kernel and the header's location
>
> > arch/arm64/include/asm/kexec.h | 4 +
> > arch/arm64/kernel/kexec_image.c | 9 +-
> > arch/arm64/kernel/machine_kexec_file.c | 202 +++++++++++++++++++++++++
>
> In this patch, machine_kexec_file.c gains its own private fdt array encoder.
See below.
>
> > diff --git a/arch/arm64/kernel/machine_kexec_file.c b/arch/arm64/kernel/machine_kexec_file.c
> > index 37c0a9dc2e47..ec674f4d267c 100644
> > --- a/arch/arm64/kernel/machine_kexec_file.c
> > +++ b/arch/arm64/kernel/machine_kexec_file.c
> > @@ -76,6 +81,78 @@ int arch_kexec_walk_mem(struct kexec_buf *kbuf,
> > return ret;
> > }
> >
> > +static int __init arch_kexec_file_init(void)
> > +{
> > + /* Those values are used later on loading the kernel */
> > + __dt_root_addr_cells = dt_root_addr_cells;
> > + __dt_root_size_cells = dt_root_size_cells;
> > +
> > + return 0;
> > +}
> > +late_initcall(arch_kexec_file_init);
>
> If we need these is it worth taking them out of __initdata? I note they've been
> 'temporary' for quite a long time.
I think that I had some reason that I didn't do that, but don't remember now.
If there's no problem, I will take your suggestion.
>
> > +
> > +#define FDT_ALIGN(x, a) (((x) + (a) - 1) & ~((a) - 1))
> > +#define FDT_TAGALIGN(x) (FDT_ALIGN((x), FDT_TAGSIZE))
> > +
> > +static int fdt_prop_len(const char *prop_name, int len)
> > +{
> > + return (strlen(prop_name) + 1) +
> > + sizeof(struct fdt_property) +
> > + FDT_TAGALIGN(len);
> > +}
>
> This stuff should really be in libfdt.h Those macros come from
> libfdt_internal.h, so we're probably doing something wrong here.
>
>
> > +static bool cells_size_fitted(unsigned long base, unsigned long size)
> > +{
> > + /* if *_cells >= 2, cells can hold 64-bit values anyway */
> > + if ((__dt_root_addr_cells == 1) && (base >= (1ULL << 32)))
> > + return false;
> > +
> > + if ((__dt_root_size_cells == 1) && (size >= (1ULL << 32)))
> > + return false;
>
> Using '> U32_MAX' here may be more readable.
OK
>
> > + return true;
> > +}
> > +
> > +static void fill_property(void *buf, u64 val64, int cells)
> > +{
> > + u32 val32;
> > +
> > + if (cells == 1) {
> > + val32 = cpu_to_fdt32((u32)val64);
> > + memcpy(buf, &val32, sizeof(val32));
> > + } else {
>
> > + memset(buf, 0, cells * sizeof(u32) - sizeof(u64));
> > + buf += cells * sizeof(u32) - sizeof(u64);
>
> Is this trying to clear the 'top' cells and shuffle the pointer to point at the
> 'bottom' 2? I'm pretty sure this isn't endian safe.
>
> Do we really expect a system to have #address-cells > 2?
I don't know, but just for safety.
>
> > + val64 = cpu_to_fdt64(val64);
> > + memcpy(buf, &val64, sizeof(val64));
> > + }
> > +}
> > +
> > +static int fdt_setprop_range(void *fdt, int nodeoffset, const char *name,
> > + unsigned long addr, unsigned long size)
>
> (the device-tree spec describes a 'ranges' property, which had me confused. This
> is encoding a prop-encoded-array)
Should we rename it to, say, fdt_setprop_reg()?
> > +{
> > + void *buf, *prop;
> > + size_t buf_size;
> > + int result;
> > +
> > + buf_size = (__dt_root_addr_cells + __dt_root_size_cells) * sizeof(u32);
> > + prop = buf = vmalloc(buf_size);
>
> virtual memory allocation for something less than PAGE_SIZE?
I've never cared about that. Let me think again.
>
> > + if (!buf)
> > + return -ENOMEM;
> > +
> > + fill_property(prop, addr, __dt_root_addr_cells);
> > + prop += __dt_root_addr_cells * sizeof(u32);
> > +
> > + fill_property(prop, size, __dt_root_size_cells);
> > +
> > + result = fdt_setprop(fdt, nodeoffset, name, buf, buf_size);
> > +
> > + vfree(buf);
> > +
> > + return result;
> > +}
>
> Doesn't this stuff belong in libfdt? I guess there is no 'add array element' api
> because this the first time we've wanted to create a node with more than
> key=fixed-size-value.
>
> I don't think this belongs in arch C code. Do we have a plan for getting libfdt
> to support encoding prop-arrays? Can we put it somewhere anyone else duplicating
> this will find it, until we can (re)move it?
I will temporarily move all fdt-related stuff to a separate file, but
> I have no idea how that happens... it looks like the devicetree list is the
> place to ask.
should we always sync with the original dtc/libfdt repository?
>
> > static int setup_dtb(struct kimage *image,
> > unsigned long initrd_load_addr, unsigned long initrd_len,
> > char *cmdline, unsigned long cmdline_len,
> > @@ -88,10 +165,26 @@ static int setup_dtb(struct kimage *image,
> > int range_len;
> > int ret;
> >
> > + /* check ranges against root's #address-cells and #size-cells */
> > + if (image->type == KEXEC_TYPE_CRASH &&
> > + (!cells_size_fitted(image->arch.elf_load_addr,
> > + image->arch.elf_headers_sz) ||
> > + !cells_size_fitted(crashk_res.start,
> > + crashk_res.end - crashk_res.start + 1))) {
> > + pr_err("Crash memory region doesn't fit into DT's root cell sizes.\n");
> > + ret = -EINVAL;
> > + goto out_err;
> > + }
>
> To check I've understood this properly: This can happen if the firmware provided
> a DTB with 32bit address/size cells, but at least some of the memory requires 64
> bit address/size cells. This could only happen on a UEFI system where the
> firmware-DTB doesn't describe memory. ACPI-only systems would have the EFIstub DT.
Probably, yes. I assumed the case where #address-cells and #size-cells
were just missing in fdt.
>
> > /* duplicate dt blob */
> > buf_size = fdt_totalsize(initial_boot_params);
> > range_len = (__dt_root_addr_cells + __dt_root_size_cells) * sizeof(u32);
> >
> > + if (image->type == KEXEC_TYPE_CRASH)
> > + buf_size += fdt_prop_len("linux,elfcorehdr", range_len)
> > + + fdt_prop_len("linux,usable-memory-range",
> > + range_len);
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > +
> > if (initrd_load_addr)
> > buf_size += fdt_prop_len("linux,initrd-start", sizeof(u64))
> > + fdt_prop_len("linux,initrd-end", sizeof(u64));
> > @@ -113,6 +206,23 @@ static int setup_dtb(struct kimage *image,
> > if (nodeoffset < 0)
> > goto out_err;
> >
> > + if (image->type == KEXEC_TYPE_CRASH) {
> > + /* add linux,elfcorehdr */
> > + ret = fdt_setprop_range(buf, nodeoffset, "linux,elfcorehdr",
> > + image->arch.elf_load_addr,
> > + image->arch.elf_headers_sz);
> > + if (ret)
> > + goto out_err;
> > +
> > + /* add linux,usable-memory-range */
> > + ret = fdt_setprop_range(buf, nodeoffset,
> > + "linux,usable-memory-range",
> > + crashk_res.start,
> > + crashk_res.end - crashk_res.start + 1);
>
> Don't you need to add "linux,usable-memory-range" to the buf_size estimate?
I think the code exists. See above.
>
> > + if (ret)
> > + goto out_err;
> > + }
>
> > @@ -148,17 +258,109 @@ static int setup_dtb(struct kimage *image,
>
> > +static struct crash_mem *get_crash_memory_ranges(void)
> > +{
> > + unsigned int nr_ranges;
> > + struct crash_mem *cmem;
> > +
> > + nr_ranges = 1; /* for exclusion of crashkernel region */
> > + walk_system_ram_res(0, -1, &nr_ranges, get_nr_ranges_callback);
> > +
> > + cmem = vmalloc(sizeof(struct crash_mem) +
> > + sizeof(struct crash_mem_range) * nr_ranges);
> > + if (!cmem)
> > + return NULL;
> > +
> > + cmem->max_nr_ranges = nr_ranges;
> > + cmem->nr_ranges = 0;
> > + walk_system_ram_res(0, -1, cmem, add_mem_range_callback);
> > +
> > + /* Exclude crashkernel region */
> > + if (crash_exclude_mem_range(cmem, crashk_res.start, crashk_res.end)) {
> > + vfree(cmem);
> > + return NULL;
> > + }
> > +
> > + return cmem;
> > +}
>
> Could this function be included in prepare_elf_headers() so that the alloc() and
> free() occur together.
Or aiming that arm64 and x86 have similar-look code?
>
> > +static int prepare_elf_headers(void **addr, unsigned long *sz)
> > +{
> > + struct crash_mem *cmem;
> > + int ret = 0;
> > +
> > + cmem = get_crash_memory_ranges();
> > + if (!cmem)
> > + return -ENOMEM;
> > +
> > + ret = crash_prepare_elf64_headers(cmem, true, addr, sz);
> > +
> > + vfree(cmem);
>
> > + return ret;
> > +}
>
> All this is moving memory-range information from core-code's
> walk_system_ram_res() into core-code's struct crash_mem, and excluding
> crashk_res, which again is accessible to the core code.
>
> It looks like this is duplicated in arch/x86 and arch/arm64 because arm64
> doesn't have a second 'crashk_low_res' region, and always wants elf64, instead
> of when IS_ENABLED(CONFIG_X86_64).
> If we can abstract just those two, more of this could be moved to core code
> where powerpc can make use of it if they want to support kdump with
> kexec_file_load().
>
> But, its getting late for cross-architecture dependencies, lets put that on the
> for-later list. (assuming there isn't a powerpc-kdump series out there adding a
> third copy of this)
Sure. X86 code has so many exceptional lines in the code :)
Thanks,
-Takahiro AKASHI
>
> Thanks,
>
> James
next prev parent reply other threads:[~2018-05-18 10:39 UTC|newest]
Thread overview: 156+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-04-25 6:26 [PATCH v9 00/11] arm64: kexec: add kexec_file_load() support AKASHI Takahiro
2018-04-25 6:26 ` AKASHI Takahiro
2018-04-25 6:26 ` AKASHI Takahiro
2018-04-25 6:26 ` [PATCH v9 01/11] asm-generic: add kexec_file_load system call to unistd.h AKASHI Takahiro
2018-04-25 6:26 ` AKASHI Takahiro
2018-04-25 6:26 ` AKASHI Takahiro
2018-04-25 6:26 ` [PATCH v9 02/11] kexec_file: make kexec_image_post_load_cleanup_default() global AKASHI Takahiro
2018-04-25 6:26 ` AKASHI Takahiro
2018-04-25 6:26 ` AKASHI Takahiro
2018-04-28 9:45 ` Dave Young
2018-04-28 9:45 ` Dave Young
2018-04-28 9:45 ` Dave Young
2018-05-01 17:46 ` James Morse
2018-05-01 17:46 ` James Morse
2018-05-01 17:46 ` James Morse
2018-05-07 4:40 ` AKASHI Takahiro
2018-05-07 4:40 ` AKASHI Takahiro
2018-05-07 4:40 ` AKASHI Takahiro
2018-04-25 6:26 ` [PATCH v9 03/11] arm64: kexec_file: invoke the kernel without purgatory AKASHI Takahiro
2018-04-25 6:26 ` AKASHI Takahiro
2018-04-25 6:26 ` AKASHI Takahiro
2018-05-01 17:46 ` James Morse
2018-05-01 17:46 ` James Morse
2018-05-01 17:46 ` James Morse
2018-05-07 5:22 ` AKASHI Takahiro
2018-05-07 5:22 ` AKASHI Takahiro
2018-05-07 5:22 ` AKASHI Takahiro
2018-05-11 17:03 ` James Morse
2018-05-11 17:03 ` James Morse
2018-05-11 17:03 ` James Morse
2018-05-15 4:45 ` AKASHI Takahiro
2018-05-15 4:45 ` AKASHI Takahiro
2018-05-15 4:45 ` AKASHI Takahiro
2018-05-15 16:15 ` James Morse
2018-05-15 16:15 ` James Morse
2018-05-15 16:15 ` James Morse
2018-05-18 6:22 ` AKASHI Takahiro
2018-05-18 6:22 ` AKASHI Takahiro
2018-05-18 6:22 ` AKASHI Takahiro
2018-04-25 6:26 ` [PATCH v9 04/11] arm64: kexec_file: allocate memory walking through memblock list AKASHI Takahiro
2018-04-25 6:26 ` AKASHI Takahiro
2018-04-25 6:26 ` AKASHI Takahiro
2018-05-01 17:46 ` James Morse
2018-05-01 17:46 ` James Morse
2018-05-01 17:46 ` James Morse
2018-05-07 5:59 ` AKASHI Takahiro
2018-05-07 5:59 ` AKASHI Takahiro
2018-05-07 5:59 ` AKASHI Takahiro
2018-05-15 4:35 ` AKASHI Takahiro
2018-05-15 4:35 ` AKASHI Takahiro
2018-05-15 4:35 ` AKASHI Takahiro
2018-05-15 16:17 ` James Morse
2018-05-15 16:17 ` James Morse
2018-05-15 16:17 ` James Morse
2018-05-17 2:10 ` Baoquan He
2018-05-17 2:10 ` Baoquan He
2018-05-17 2:10 ` Baoquan He
2018-05-17 2:15 ` Baoquan He
2018-05-17 2:15 ` Baoquan He
2018-05-17 2:15 ` Baoquan He
2018-05-17 18:04 ` James Morse
2018-05-17 18:04 ` James Morse
2018-05-17 18:04 ` James Morse
2018-05-18 1:37 ` Baoquan He
2018-05-18 1:37 ` Baoquan He
2018-05-18 1:37 ` Baoquan He
2018-05-18 5:07 ` AKASHI Takahiro
2018-05-18 5:07 ` AKASHI Takahiro
2018-05-18 5:07 ` AKASHI Takahiro
2018-04-25 6:26 ` [PATCH v9 05/11] arm64: kexec_file: load initrd and device-tree AKASHI Takahiro
2018-04-25 6:26 ` AKASHI Takahiro
2018-04-25 6:26 ` AKASHI Takahiro
2018-05-15 16:20 ` James Morse
2018-05-15 16:20 ` James Morse
2018-05-15 16:20 ` James Morse
2018-05-18 7:11 ` AKASHI Takahiro
2018-05-18 7:11 ` AKASHI Takahiro
2018-05-18 7:11 ` AKASHI Takahiro
2018-05-18 7:42 ` AKASHI Takahiro
2018-05-18 7:42 ` AKASHI Takahiro
2018-05-18 7:42 ` AKASHI Takahiro
2018-05-18 15:59 ` James Morse
2018-05-18 15:59 ` James Morse
2018-05-18 15:59 ` James Morse
2018-04-25 6:26 ` [PATCH v9 06/11] arm64: kexec_file: allow for loading Image-format kernel AKASHI Takahiro
2018-04-25 6:26 ` AKASHI Takahiro
2018-04-25 6:26 ` AKASHI Takahiro
2018-05-01 17:46 ` James Morse
2018-05-01 17:46 ` James Morse
2018-05-01 17:46 ` James Morse
2018-05-07 7:21 ` AKASHI Takahiro
2018-05-07 7:21 ` AKASHI Takahiro
2018-05-07 7:21 ` AKASHI Takahiro
2018-05-11 17:07 ` James Morse
2018-05-11 17:07 ` James Morse
2018-05-11 17:07 ` James Morse
2018-05-15 5:13 ` AKASHI Takahiro
2018-05-15 5:13 ` AKASHI Takahiro
2018-05-15 5:13 ` AKASHI Takahiro
2018-05-15 17:14 ` James Morse
2018-05-15 17:14 ` James Morse
2018-05-15 17:14 ` James Morse
2018-05-21 9:32 ` AKASHI Takahiro
2018-05-21 9:32 ` AKASHI Takahiro
2018-05-21 9:32 ` AKASHI Takahiro
2018-04-25 6:26 ` [PATCH v9 07/11] arm64: kexec_file: add crash dump support AKASHI Takahiro
2018-04-25 6:26 ` AKASHI Takahiro
2018-04-25 6:26 ` AKASHI Takahiro
2018-05-15 17:11 ` James Morse
2018-05-15 17:11 ` James Morse
2018-05-15 17:11 ` James Morse
2018-05-16 8:34 ` James Morse
2018-05-16 8:34 ` James Morse
2018-05-16 8:34 ` James Morse
2018-05-18 9:58 ` AKASHI Takahiro
2018-05-18 9:58 ` AKASHI Takahiro
2018-05-18 9:58 ` AKASHI Takahiro
2018-05-16 10:06 ` James Morse
2018-05-16 10:06 ` James Morse
2018-05-16 10:06 ` James Morse
2018-05-18 9:50 ` AKASHI Takahiro
2018-05-18 9:50 ` AKASHI Takahiro
2018-05-18 9:50 ` AKASHI Takahiro
2018-05-18 10:39 ` AKASHI Takahiro [this message]
2018-05-18 10:39 ` AKASHI Takahiro
2018-05-18 10:39 ` AKASHI Takahiro
2018-05-18 16:00 ` James Morse
2018-05-18 16:00 ` James Morse
2018-05-18 16:00 ` James Morse
2018-05-21 9:46 ` AKASHI Takahiro
2018-05-21 9:46 ` AKASHI Takahiro
2018-05-21 9:46 ` AKASHI Takahiro
2018-05-15 17:12 ` James Morse
2018-05-15 17:12 ` James Morse
2018-05-15 17:12 ` James Morse
2018-05-18 15:35 ` Rob Herring
2018-05-18 15:35 ` Rob Herring
2018-05-18 15:35 ` Rob Herring
2018-05-21 10:14 ` AKASHI Takahiro
2018-05-21 10:14 ` AKASHI Takahiro
2018-05-21 10:14 ` AKASHI Takahiro
2018-05-24 14:25 ` Rob Herring
2018-05-24 14:25 ` Rob Herring
2018-05-24 14:25 ` Rob Herring
2018-04-25 6:26 ` [PATCH v9 08/11] arm64: enable KEXEC_FILE config AKASHI Takahiro
2018-04-25 6:26 ` AKASHI Takahiro
2018-04-25 6:26 ` AKASHI Takahiro
2018-04-25 6:26 ` [PATCH v9 09/11] include: pe.h: remove message[] from mz header definition AKASHI Takahiro
2018-04-25 6:26 ` AKASHI Takahiro
2018-04-25 6:26 ` AKASHI Takahiro
2018-04-25 6:26 ` [PATCH v9 10/11] arm64: kexec_file: add kernel signature verification support AKASHI Takahiro
2018-04-25 6:26 ` AKASHI Takahiro
2018-04-25 6:26 ` AKASHI Takahiro
2018-04-25 6:26 ` [PATCH v9 11/11] arm64: kexec_file: add kaslr support AKASHI Takahiro
2018-04-25 6:26 ` AKASHI Takahiro
2018-04-25 6:26 ` AKASHI Takahiro
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20180518103925.GP2737@linaro.org \
--to=takahiro.akashi@linaro.org \
--cc=ard.biesheuvel@linaro.org \
--cc=arnd@arndb.de \
--cc=bhe@redhat.com \
--cc=bhsharma@redhat.com \
--cc=catalin.marinas@arm.com \
--cc=davem@davemloft.net \
--cc=dhowells@redhat.com \
--cc=dyoung@redhat.com \
--cc=herbert@gondor.apana.org.au \
--cc=james.morse@arm.com \
--cc=kexec@lists.infradead.org \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-kernel@vger.kernel.org \
--cc=vgoyal@redhat.com \
--cc=will.deacon@arm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.