* [PATCH] arch:powerpc simple_write_to_buffer return check
From: Mayank Suman @ 2021-02-04 18:16 UTC (permalink / raw)
To: ruscur, oohall, mpe, benh, paulus, linuxppc-dev, linux-kernel
Cc: Mayank Suman
Signed-off-by: Mayank Suman <mayanksuman@live.com>
---
arch/powerpc/kernel/eeh.c | 8 ++++----
arch/powerpc/platforms/powernv/eeh-powernv.c | 4 ++--
2 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
index 813713c9120c..2dbe1558a71f 100644
--- a/arch/powerpc/kernel/eeh.c
+++ b/arch/powerpc/kernel/eeh.c
@@ -1628,8 +1628,8 @@ static ssize_t eeh_force_recover_write(struct file *filp,
char buf[20];
int ret;
- ret = simple_write_to_buffer(buf, sizeof(buf), ppos, user_buf, count);
- if (!ret)
+ ret = simple_write_to_buffer(buf, sizeof(buf)-1, ppos, user_buf, count);
+ if (ret <= 0)
return -EFAULT;
/*
@@ -1696,7 +1696,7 @@ static ssize_t eeh_dev_check_write(struct file *filp,
memset(buf, 0, sizeof(buf));
ret = simple_write_to_buffer(buf, sizeof(buf)-1, ppos, user_buf, count);
- if (!ret)
+ if (ret <= 0)
return -EFAULT;
ret = sscanf(buf, "%x:%x:%x.%x", &domain, &bus, &dev, &fn);
@@ -1836,7 +1836,7 @@ static ssize_t eeh_dev_break_write(struct file *filp,
memset(buf, 0, sizeof(buf));
ret = simple_write_to_buffer(buf, sizeof(buf)-1, ppos, user_buf, count);
- if (!ret)
+ if (ret <= 0)
return -EFAULT;
ret = sscanf(buf, "%x:%x:%x.%x", &domain, &bus, &dev, &fn);
diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
index 89e22c460ebf..36ed2b8f7375 100644
--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
@@ -76,8 +76,8 @@ static ssize_t pnv_eeh_ei_write(struct file *filp,
return -ENXIO;
/* Copy over argument buffer */
- ret = simple_write_to_buffer(buf, sizeof(buf), ppos, user_buf, count);
- if (!ret)
+ ret = simple_write_to_buffer(buf, sizeof(buf)-1, ppos, user_buf, count);
+ if (ret <= 0)
return -EFAULT;
/* Retrieve parameters */
--
2.30.0
^ permalink raw reply related
* Re: [PATCH RFC v1 2/6] swiotlb: convert variables to arrays
From: Konrad Rzeszutek Wilk @ 2021-02-04 19:31 UTC (permalink / raw)
To: Robin Murphy
Cc: ulf.hansson, airlied, joonas.lahtinen, dri-devel, linux-kernel,
paulus, hpa, Christoph Hellwig, m.szyprowski, sstabellini,
adrian.hunter, Dongli Zhang, x86, joe.jin, mingo, peterz, mingo,
bskeggs, linux-pci, xen-devel, matthew.auld, thomas.lendacky,
intel-gfx, jani.nikula, bp, rodrigo.vivi, bhelgaas, Claire Chang,
boris.ostrovsky, chris, jgross, tsbogend, nouveau, linux-mmc,
linux-mips, iommu, tglx, bauerman, daniel, akpm, linuxppc-dev,
rppt
In-Reply-To: <b46ddefe-d91a-fa6a-0e0d-cf1edc343c2e@arm.com>
On Thu, Feb 04, 2021 at 11:49:23AM +0000, Robin Murphy wrote:
> On 2021-02-04 07:29, Christoph Hellwig wrote:
> > On Wed, Feb 03, 2021 at 03:37:05PM -0800, Dongli Zhang wrote:
> > > This patch converts several swiotlb related variables to arrays, in
> > > order to maintain stat/status for different swiotlb buffers. Here are
> > > variables involved:
> > >
> > > - io_tlb_start and io_tlb_end
> > > - io_tlb_nslabs and io_tlb_used
> > > - io_tlb_list
> > > - io_tlb_index
> > > - max_segment
> > > - io_tlb_orig_addr
> > > - no_iotlb_memory
> > >
> > > There is no functional change and this is to prepare to enable 64-bit
> > > swiotlb.
> >
> > Claire Chang (on Cc) already posted a patch like this a month ago,
> > which looks much better because it actually uses a struct instead
> > of all the random variables.
>
> Indeed, I skimmed the cover letter and immediately thought that this whole
> thing is just the restricted DMA pool concept[1] again, only from a slightly
> different angle.
Kind of. Let me lay out how some of these pieces are right now:
+-----------------------+ +----------------------+
| | | |
| | | |
| a)Xen-SWIOTLB | | b)SWIOTLB (for !Xen) |
| | | |
+-----------XX----------+ +-------X--------------+
XXXX XXXXXXXXX
XXXX XX XXX
X XX
XXXX
+----------XX-----------+
| |
| |
| c) SWIOTLB generic |
| |
+-----------------------+
Dongli's patches modify the SWIOTLB generic c), and Xen-SWIOTLB a)
parts.
Also see the IOMMU_INIT logic which lays this a bit more deepth
(for example how to enable SWIOTLB on AMD boxes, or IBM with Calgary
IOMMU, etc - see iommu_table.h).
Furtheremore it lays the groundwork to allocate AMD SEV SWIOTLB buffers
later after boot (so that you can stich different pools together).
All the bits are kind of inside of the SWIOTLB code. And also it changes
the Xen-SWIOTLB to do something similar.
The mempool did it similarly by taking the internal parts (aka the
various io_tlb) of SWIOTLB and exposing them out and having
other code:
+-----------------------+ +----------------------+
| | | |
| | | |
| a)Xen-SWIOTLB | | b)SWIOTLB (for !Xen) |
| | | |
+-----------XX----------+ +-------X--------------+
XXXX XXXXXXXXX
XXXX XX XXX
X XX
XXXX
+----------XX-----------+ +------------------+
| | | Device tree |
| +<--------+ enabling SWIOTLB |
|c) SWIOTLB generic | | |
| | | mempool |
+-----------------------+ +------------------+
What I was suggesting to Clarie to follow Xen model, that is
do something like this:
+-----------------------+ +----------------------+ +--------------------+
| | | | | |
| | | | | |
| a)Xen-SWIOTLB | | b)SWIOTLB (for !Xen) | | e) DT-SWIOTLB |
| | | | | |
+-----------XX----------+ +-------X--------------+ +----XX-X------------+
XXXX XXXXXXXXX XXX X X XX X XX
XXXX XX XXX XXXXXXXX
X XX XXXXXXXXXXXXX
XXXXXXXX
+----------XXX----------+
| |
| |
|c) SWIOTLB generic |
| |
+-----------------------+
so using the SWIOTLB generic parts, and then bolt on top
of the device-tree logic, along with the mempool logic.
But Christopher has an interesting suggestion which is
to squash the all the existing code (a, b, c) all together
and pepper it with various jump-tables.
So:
-----------------------------+
| SWIOTLB: |
| |
| a) SWIOTLB (for non-Xen) |
| b) Xen-SWIOTLB |
| c) DT-SWIOTLB |
| |
| |
-----------------------------+
with all the various bits (M2P/P2M for Xen, mempool for ARM,
and normal allocation for BM) in one big file.
^ permalink raw reply
* Re: [PATCH v16 11/12] powerpc: Use OF alloc and free for FDT
From: Rob Herring @ 2021-02-04 19:26 UTC (permalink / raw)
To: Lakshmi Ramasubramanian
Cc: Mark Rutland, Bhupesh Sharma, tao.li, Mimi Zohar, Paul Mackerras,
vincenzo.frascino, Frank Rowand, Sasha Levin, Masahiro Yamada,
James Morris, AKASHI, Takahiro, linux-arm-kernel, Catalin Marinas,
Serge E. Hallyn, devicetree, Pavel Tatashin, Will Deacon,
Prakhar Srivastava, Hsin-Yi Wang, Allison Randal,
Christophe Leroy, Matthias Brugger, balajib, dmitry.kasatkin,
linux-kernel@vger.kernel.org, James Morse, Greg Kroah-Hartman,
Joe Perches, linux-integrity, linuxppc-dev, Thiago Jung Bauermann
In-Reply-To: <20210204164135.29856-12-nramas@linux.microsoft.com>
On Thu, Feb 4, 2021 at 10:42 AM Lakshmi Ramasubramanian
<nramas@linux.microsoft.com> wrote:
>
> of_alloc_and_init_fdt() and of_free_fdt() have been defined in
> drivers/of/kexec.c to allocate and free memory for FDT.
>
> Use of_alloc_and_init_fdt() and of_free_fdt() to allocate and
> initialize the FDT, and to free the FDT respectively.
>
> powerpc sets the FDT address in image_loader_data field in
> "struct kimage" and the memory is freed in
> kimage_file_post_load_cleanup(). This cleanup function uses kfree()
> to free the memory. But since of_alloc_and_init_fdt() uses kvmalloc()
> for allocation, the buffer needs to be freed using kvfree().
You could just change the kexec core to call kvfree() instead.
> Define "fdt" field in "struct kimage_arch" for powerpc to store
> the address of FDT, and free the memory in powerpc specific
> arch_kimage_file_post_load_cleanup().
However, given all the other buffers have an explicit field in kimage
or kimage_arch, changing powerpc is to match arm64 is better IMO.
> Signed-off-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
> Suggested-by: Rob Herring <robh@kernel.org>
> Suggested-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
> ---
> arch/powerpc/include/asm/kexec.h | 2 ++
> arch/powerpc/kexec/elf_64.c | 26 ++++++++++++++++----------
> arch/powerpc/kexec/file_load_64.c | 3 +++
> 3 files changed, 21 insertions(+), 10 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/kexec.h b/arch/powerpc/include/asm/kexec.h
> index 2c0be93d239a..d7d13cac4d31 100644
> --- a/arch/powerpc/include/asm/kexec.h
> +++ b/arch/powerpc/include/asm/kexec.h
> @@ -111,6 +111,8 @@ struct kimage_arch {
> unsigned long elf_headers_mem;
> unsigned long elf_headers_sz;
> void *elf_headers;
> +
> + void *fdt;
> };
>
> char *setup_kdump_cmdline(struct kimage *image, char *cmdline,
> diff --git a/arch/powerpc/kexec/elf_64.c b/arch/powerpc/kexec/elf_64.c
> index d0e459bb2f05..51d2d8eb6c1b 100644
> --- a/arch/powerpc/kexec/elf_64.c
> +++ b/arch/powerpc/kexec/elf_64.c
> @@ -19,6 +19,7 @@
> #include <linux/kexec.h>
> #include <linux/libfdt.h>
> #include <linux/module.h>
> +#include <linux/of.h>
> #include <linux/of_fdt.h>
> #include <linux/slab.h>
> #include <linux/types.h>
> @@ -32,7 +33,7 @@ static void *elf64_load(struct kimage *image, char *kernel_buf,
> unsigned int fdt_size;
> unsigned long kernel_load_addr;
> unsigned long initrd_load_addr = 0, fdt_load_addr;
> - void *fdt;
> + void *fdt = NULL;
> const void *slave_code;
> struct elfhdr ehdr;
> char *modified_cmdline = NULL;
> @@ -103,18 +104,12 @@ static void *elf64_load(struct kimage *image, char *kernel_buf,
> }
>
> fdt_size = fdt_totalsize(initial_boot_params) * 2;
> - fdt = kmalloc(fdt_size, GFP_KERNEL);
> + fdt = of_alloc_and_init_fdt(fdt_size);
> if (!fdt) {
> pr_err("Not enough memory for the device tree.\n");
> ret = -ENOMEM;
> goto out;
> }
> - ret = fdt_open_into(initial_boot_params, fdt, fdt_size);
> - if (ret < 0) {
> - pr_err("Error setting up the new device tree.\n");
> - ret = -EINVAL;
> - goto out;
> - }
>
> ret = setup_new_fdt_ppc64(image, fdt, initrd_load_addr,
The first thing this function does is call setup_new_fdt() which first
calls of_kexec_setup_new_fdt(). (Note, I really don't understand the
PPC code split. It looks like there's a 32-bit and 64-bit split, but
32-bit looks broken to me. Nothing ever calls setup_new_fdt() except
setup_new_fdt_ppc64()). The arm64 version is calling
of_alloc_and_init_fdt() and then of_kexec_setup_new_fdt() directly.
So we can just make of_alloc_and_init_fdt() also call
of_kexec_setup_new_fdt() (really, just tweak of_kexec_setup_new_fdt do
the alloc and copy). I don't think the architecture needs to pick the
size either. It's doubtful that either one is that sensitive to the
amount of extra space.
> initrd_len, cmdline);
> @@ -131,6 +126,10 @@ static void *elf64_load(struct kimage *image, char *kernel_buf,
> ret = kexec_add_buffer(&kbuf);
> if (ret)
> goto out;
> +
> + /* FDT will be freed in arch_kimage_file_post_load_cleanup */
> + image->arch.fdt = fdt;
> +
> fdt_load_addr = kbuf.mem;
>
> pr_debug("Loaded device tree at 0x%lx\n", fdt_load_addr);
> @@ -145,8 +144,15 @@ static void *elf64_load(struct kimage *image, char *kernel_buf,
> kfree(modified_cmdline);
> kexec_free_elf_info(&elf_info);
>
> - /* Make kimage_file_post_load_cleanup free the fdt buffer for us. */
> - return ret ? ERR_PTR(ret) : fdt;
> + /*
> + * Once FDT buffer has been successfully passed to kexec_add_buffer(),
> + * the FDT buffer address is saved in image->arch.fdt. In that case,
> + * the memory cannot be freed here in case of any other error.
> + */
> + if (ret && !image->arch.fdt)
> + of_free_fdt(fdt);
Just call kvfree() directly.
> +
> + return ret ? ERR_PTR(ret) : NULL;
> }
>
> const struct kexec_file_ops kexec_elf64_ops = {
> diff --git a/arch/powerpc/kexec/file_load_64.c b/arch/powerpc/kexec/file_load_64.c
> index 3cab318aa3b9..d9d5b5569a6d 100644
> --- a/arch/powerpc/kexec/file_load_64.c
> +++ b/arch/powerpc/kexec/file_load_64.c
> @@ -1113,5 +1113,8 @@ int arch_kimage_file_post_load_cleanup(struct kimage *image)
> image->arch.elf_headers = NULL;
> image->arch.elf_headers_sz = 0;
>
> + of_free_fdt(image->arch.fdt);
> + image->arch.fdt = NULL;
> +
> return kexec_image_post_load_cleanup_default(image);
> }
> --
> 2.30.0
>
^ permalink raw reply
* Re: [PATCH] cpufreq: Remove unused flag CPUFREQ_PM_NO_WARN
From: Rafael J. Wysocki @ 2021-02-04 18:27 UTC (permalink / raw)
To: Viresh Kumar
Cc: Vincent Guittot, Linux PM, Rafael Wysocki,
Linux Kernel Mailing List, Paul Mackerras, linuxppc-dev
In-Reply-To: <bed6bc7e15c3ed398dd61b8f3968049f1f16b1b6.1612244449.git.viresh.kumar@linaro.org>
On Tue, Feb 2, 2021 at 6:42 AM Viresh Kumar <viresh.kumar@linaro.org> wrote:
>
> This flag is set by one of the drivers but it isn't used in the code
> otherwise. Remove the unused flag and update the driver.
>
> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Applied as 5.12 material, thanks!
> ---
> Rebased over:
>
> https://lore.kernel.org/lkml/a59bb322b22c247d570b70a8e94067804287623b.1612241683.git.viresh.kumar@linaro.org/
>
> drivers/cpufreq/pmac32-cpufreq.c | 3 +--
> include/linux/cpufreq.h | 13 +++++--------
> 2 files changed, 6 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/cpufreq/pmac32-cpufreq.c b/drivers/cpufreq/pmac32-cpufreq.c
> index 73621bc11976..4f20c6a9108d 100644
> --- a/drivers/cpufreq/pmac32-cpufreq.c
> +++ b/drivers/cpufreq/pmac32-cpufreq.c
> @@ -439,8 +439,7 @@ static struct cpufreq_driver pmac_cpufreq_driver = {
> .init = pmac_cpufreq_cpu_init,
> .suspend = pmac_cpufreq_suspend,
> .resume = pmac_cpufreq_resume,
> - .flags = CPUFREQ_PM_NO_WARN |
> - CPUFREQ_NO_AUTO_DYNAMIC_SWITCHING,
> + .flags = CPUFREQ_NO_AUTO_DYNAMIC_SWITCHING,
> .attr = cpufreq_generic_attr,
> .name = "powermac",
> };
> diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
> index c8e40e91fe9b..353969c7acd3 100644
> --- a/include/linux/cpufreq.h
> +++ b/include/linux/cpufreq.h
> @@ -398,8 +398,11 @@ struct cpufreq_driver {
> /* loops_per_jiffy or other kernel "constants" aren't affected by frequency transitions */
> #define CPUFREQ_CONST_LOOPS BIT(1)
>
> -/* don't warn on suspend/resume speed mismatches */
> -#define CPUFREQ_PM_NO_WARN BIT(2)
> +/*
> + * Set by drivers that want the core to automatically register the cpufreq
> + * driver as a thermal cooling device.
> + */
> +#define CPUFREQ_IS_COOLING_DEV BIT(2)
>
> /*
> * This should be set by platforms having multiple clock-domains, i.e.
> @@ -431,12 +434,6 @@ struct cpufreq_driver {
> */
> #define CPUFREQ_NO_AUTO_DYNAMIC_SWITCHING BIT(6)
>
> -/*
> - * Set by drivers that want the core to automatically register the cpufreq
> - * driver as a thermal cooling device.
> - */
> -#define CPUFREQ_IS_COOLING_DEV BIT(7)
> -
> int cpufreq_register_driver(struct cpufreq_driver *driver_data);
> int cpufreq_unregister_driver(struct cpufreq_driver *driver_data);
>
> --
> 2.25.0.rc1.19.g042ed3e048af
>
^ permalink raw reply
* Re: [PATCH v16 10/12] arm64: Use OF alloc and free functions for FDT
From: Will Deacon @ 2021-02-04 18:00 UTC (permalink / raw)
To: Lakshmi Ramasubramanian
Cc: mark.rutland, bhsharma, tao.li, zohar, paulus, vincenzo.frascino,
frowand.list, sashal, robh, masahiroy, jmorris, takahiro.akashi,
linux-arm-kernel, catalin.marinas, serge, devicetree,
pasha.tatashin, prsriva, hsinyi, allison, christophe.leroy,
mbrugger, balajib, dmitry.kasatkin, linux-kernel, james.morse,
gregkh, joe, linux-integrity, linuxppc-dev, bauerman
In-Reply-To: <20210204164135.29856-11-nramas@linux.microsoft.com>
On Thu, Feb 04, 2021 at 08:41:33AM -0800, Lakshmi Ramasubramanian wrote:
> of_alloc_and_init_fdt() and of_free_fdt() have been defined in
> drivers/of/kexec.c to allocate and free memory for FDT.
>
> Use of_alloc_and_init_fdt() and of_free_fdt() to allocate and
> initialize the FDT, and to free the FDT respectively.
>
> Signed-off-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
> Suggested-by: Rob Herring <robh@kernel.org>
> ---
> arch/arm64/kernel/machine_kexec_file.c | 37 +++++++-------------------
> 1 file changed, 10 insertions(+), 27 deletions(-)
Acked-by: Will Deacon <will@kernel.org>
Will
^ permalink raw reply
* [PATCH v2 1/2] ima: Free IMA measurement buffer on error
From: Lakshmi Ramasubramanian @ 2021-02-04 17:49 UTC (permalink / raw)
To: zohar, bauerman, dmitry.kasatkin, ebiederm, gregkh, sashal,
tyhicks
Cc: linux-integrity, linuxppc-dev, linux-kernel
IMA allocates kernel virtual memory to carry forward the measurement
list, from the current kernel to the next kernel on kexec system call,
in ima_add_kexec_buffer() function. In error code paths this memory
is not freed resulting in memory leak.
Free the memory allocated for the IMA measurement list in
the error code paths in ima_add_kexec_buffer() function.
Signed-off-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
Suggested-by: Tyler Hicks <tyhicks@linux.microsoft.com>
Fixes: 7b8589cc29e7 ("ima: on soft reboot, save the measurement list")
---
security/integrity/ima/ima_kexec.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/security/integrity/ima/ima_kexec.c b/security/integrity/ima/ima_kexec.c
index 121de3e04af2..206ddcaa5c67 100644
--- a/security/integrity/ima/ima_kexec.c
+++ b/security/integrity/ima/ima_kexec.c
@@ -119,6 +119,7 @@ void ima_add_kexec_buffer(struct kimage *image)
ret = kexec_add_buffer(&kbuf);
if (ret) {
pr_err("Error passing over kexec measurement buffer.\n");
+ vfree(kexec_buffer);
return;
}
--
2.30.0
^ permalink raw reply related
* [PATCH v2 2/2] ima: Free IMA measurement buffer after kexec syscall
From: Lakshmi Ramasubramanian @ 2021-02-04 17:49 UTC (permalink / raw)
To: zohar, bauerman, dmitry.kasatkin, ebiederm, gregkh, sashal,
tyhicks
Cc: linux-integrity, linuxppc-dev, linux-kernel
In-Reply-To: <20210204174951.25771-1-nramas@linux.microsoft.com>
IMA allocates kernel virtual memory to carry forward the measurement
list, from the current kernel to the next kernel on kexec system call,
in ima_add_kexec_buffer() function. This buffer is not freed before
completing the kexec system call resulting in memory leak.
Add ima_buffer field in "struct kimage" to store the virtual address
of the buffer allocated for the IMA measurement list.
Free the memory allocated for the IMA measurement list in
kimage_file_post_load_cleanup() function.
Signed-off-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
Suggested-by: Tyler Hicks <tyhicks@linux.microsoft.com>
Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Reviewed-by: Tyler Hicks <tyhicks@linux.microsoft.com>
Fixes: 7b8589cc29e7 ("ima: on soft reboot, save the measurement list")
---
include/linux/kexec.h | 5 +++++
kernel/kexec_file.c | 5 +++++
security/integrity/ima/ima_kexec.c | 2 ++
3 files changed, 12 insertions(+)
diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index 9e93bef52968..5f61389f5f36 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -300,6 +300,11 @@ struct kimage {
/* Information for loading purgatory */
struct purgatory_info purgatory_info;
#endif
+
+#ifdef CONFIG_IMA_KEXEC
+ /* Virtual address of IMA measurement buffer for kexec syscall */
+ void *ima_buffer;
+#endif
};
/* kexec interface functions */
diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
index b02086d70492..5c3447cf7ad5 100644
--- a/kernel/kexec_file.c
+++ b/kernel/kexec_file.c
@@ -166,6 +166,11 @@ void kimage_file_post_load_cleanup(struct kimage *image)
vfree(pi->sechdrs);
pi->sechdrs = NULL;
+#ifdef CONFIG_IMA_KEXEC
+ vfree(image->ima_buffer);
+ image->ima_buffer = NULL;
+#endif /* CONFIG_IMA_KEXEC */
+
/* See if architecture has anything to cleanup post load */
arch_kimage_file_post_load_cleanup(image);
diff --git a/security/integrity/ima/ima_kexec.c b/security/integrity/ima/ima_kexec.c
index 206ddcaa5c67..e29bea3dd4cc 100644
--- a/security/integrity/ima/ima_kexec.c
+++ b/security/integrity/ima/ima_kexec.c
@@ -129,6 +129,8 @@ void ima_add_kexec_buffer(struct kimage *image)
return;
}
+ image->ima_buffer = kexec_buffer;
+
pr_debug("kexec measurement buffer for the loaded kernel at 0x%lx.\n",
kbuf.mem);
}
--
2.30.0
^ permalink raw reply related
* Re: [PATCH] powerpc/fault: fix wrong KUAP fault for IO_URING
From: Aneesh Kumar K.V @ 2021-02-04 17:42 UTC (permalink / raw)
To: Zorro Lang; +Cc: Jens Axboe, linuxppc-dev, Nicholas Piggin
In-Reply-To: <20210204175703.GH14354@localhost.localdomain>
On 2/4/21 11:27 PM, Zorro Lang wrote:
> On Thu, Feb 04, 2021 at 10:31:59PM +0530, Aneesh Kumar K.V wrote:
>> On 2/4/21 10:23 PM, Jens Axboe wrote:
>>> On 2/1/21 11:30 PM, Aneesh Kumar K.V wrote:
>>>> On 2/2/21 11:50 AM, Christophe Leroy wrote:
>>>>>
>>>>>
>>>>> Le 02/02/2021 à 07:16, Aneesh Kumar K.V a écrit :
>>>>>> On 2/2/21 11:32 AM, Christophe Leroy wrote:
>>>>>>>
>>>>>>>
>>>>>>> Le 02/02/2021 à 06:55, Aneesh Kumar K.V a écrit :
>>>>>>>> Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> writes:
>>>>>>>>
>>>>>>>>> Nicholas Piggin <npiggin@gmail.com> writes:
>>>>>>>>>
>>>>>>>>>> Excerpts from Michael Ellerman's message of January 30, 2021 9:22 pm:
>>>>>>>>>>> Christophe Leroy <christophe.leroy@csgroup.eu> writes:
>>>>>>>>>>>> +Aneesh
>>>>>>>>>>>>
>>>>>>>>>>>> Le 29/01/2021 à 07:52, Zorro Lang a écrit :
>>>>>>>>>>> ..
>>>>>>>>>>>>> [ 96.200296] ------------[ cut here ]------------
>>>>>>>>>>>>> [ 96.200304] Bug: Read fault blocked by KUAP!
>>>>>>>>>>>>> [ 96.200309] WARNING: CPU: 3 PID: 1876 at
>>>>>>>>>>>>> arch/powerpc/mm/fault.c:229 bad_kernel_fault+0x180/0x310
>>>>>>>>>>>>
>>>>>>>>>>>>> [ 96.200734] NIP [c000000000849424]
>>>>>>>>>>>>> fault_in_pages_readable+0x104/0x350
>>>>>>>>>>>>> [ 96.200741] LR [c00000000084952c]
>>>>>>>>>>>>> fault_in_pages_readable+0x20c/0x350
>>>>>>>>>>>>> [ 96.200747] --- interrupt: 300
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Problem happens in a section where userspace access is supposed
>>>>>>>>>>>> to be granted, so the patch you
>>>>>>>>>>>> proposed is definitely not the right fix.
>>>>>>>>>>>>
>>>>>>>>>>>> c000000000849408: 2c 01 00 4c isync
>>>>>>>>>>>> c00000000084940c: a6 03 3d 7d mtspr 29,r9 <== granting
>>>>>>>>>>>> userspace access permission
>>>>>>>>>>>> c000000000849410: 2c 01 00 4c isync
>>>>>>>>>>>> c000000000849414: 00 00 36 e9 ld r9,0(r22)
>>>>>>>>>>>> c000000000849418: 20 00 29 81 lwz r9,32(r9)
>>>>>>>>>>>> c00000000084941c: 00 02 29 71 andi. r9,r9,512
>>>>>>>>>>>> c000000000849420: 78 d3 5e 7f mr r30,r26
>>>>>>>>>>>> ==> c000000000849424: 00 00 bf 8b lbz r29,0(r31) <==
>>>>>>>>>>>> accessing userspace
>>>>>>>>>>>> c000000000849428: 10 00 82 41 beq c000000000849438
>>>>>>>>>>>> <fault_in_pages_readable+0x118>
>>>>>>>>>>>> c00000000084942c: 2c 01 00 4c isync
>>>>>>>>>>>> c000000000849430: a6 03 bd 7e mtspr 29,r21 <==
>>>>>>>>>>>> clearing userspace access permission
>>>>>>>>>>>> c000000000849434: 2c 01 00 4c isync
>>>>>>>>>>>>
>>>>>>>>>>>> My first guess is that the problem is linked to the following
>>>>>>>>>>>> function, see the comment
>>>>>>>>>>>>
>>>>>>>>>>>> /*
>>>>>>>>>>>> * For kernel thread that doesn't have thread.regs return
>>>>>>>>>>>> * default AMR/IAMR values.
>>>>>>>>>>>> */
>>>>>>>>>>>> static inline u64 current_thread_amr(void)
>>>>>>>>>>>> {
>>>>>>>>>>>> if (current->thread.regs)
>>>>>>>>>>>> return current->thread.regs->amr;
>>>>>>>>>>>> return AMR_KUAP_BLOCKED;
>>>>>>>>>>>> }
>>>>>>>>>>>>
>>>>>>>>>>>> Above function was introduced by commit 48a8ab4eeb82
>>>>>>>>>>>> ("powerpc/book3s64/pkeys: Don't update SPRN_AMR
>>>>>>>>>>>> when in kernel mode")
>>>>>>>>>>>
>>>>>>>>>>> Yeah that's a bit of a curly one.
>>>>>>>>>>>
>>>>>>>>>>> At some point io_uring did kthread_use_mm(), which is supposed to
>>>>>>>>>>> mean
>>>>>>>>>>> the kthread can operate on behalf of the original process that
>>>>>>>>>>> submitted
>>>>>>>>>>> the IO.
>>>>>>>>>>>
>>>>>>>>>>> But because KUAP is implemented using memory protection keys, it
>>>>>>>>>>> depends
>>>>>>>>>>> on the value of the AMR register, which is not part of the mm,
>>>>>>>>>>> it's in
>>>>>>>>>>> thread.regs->amr.
>>>>>>>>>>>
>>>>>>>>>>> And what's worse by the time we're in kthread_use_mm() we no
>>>>>>>>>>> longer have
>>>>>>>>>>> access to the thread.regs->amr of the original process that
>>>>>>>>>>> submitted
>>>>>>>>>>> the IO.
>>>>>>>>>>>
>>>>>>>>>>> We also can't simply move the AMR into the mm, precisely because
>>>>>>>>>>> it's
>>>>>>>>>>> per thread, not per mm.
>>>>>>>>>>>
>>>>>>>>>>> So TBH I don't know how we're going to fix this.
>>>>>>>>>>>
>>>>>>>>>>> I guess we could return AMR=unblocked for kernel threads, but that's
>>>>>>>>>>> arguably a bug because it allows a process to circumvent memory
>>>>>>>>>>> keys by
>>>>>>>>>>> asking the kernel to do the access.
>>>>>>>>>>
>>>>>>>>>> We shouldn't need to inherit AMR should we? We only need it to be
>>>>>>>>>> locked
>>>>>>>>>> for kernel threads until it's explicitly unlocked -- nothing mm
>>>>>>>>>> specific
>>>>>>>>>> there. I think current_thread_amr could return 0 for kernel
>>>>>>>>>> threads? Or
>>>>>>>>>> I would even avoid using that function for allow_user_access and open
>>>>>>>>>> code the kthread case and remove it from current_thread_amr().
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Nick
>>>>>>>>>
>>>>>>>>
>>>>>>>> updated one
>>>>>>>>
>>>>>>>> From 8fdb0680f983940d61f91da8252b13c8d3e8ebee Mon Sep 17 00:00:00 2001
>>>>>>>> From: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
>>>>>>>> Date: Tue, 2 Feb 2021 09:23:38 +0530
>>>>>>>> Subject: [PATCH v2] powerpc/kuap: Allow kernel thread to access
>>>>>>>> userspace
>>>>>>>> after kthread_use_mm
>>>>>>>>
>>>>>>>> This fix the bad fault reported by KUAP when io_wqe_worker access
>>>>>>>> userspace.
>>>>>>>>
>>>>>>>> Bug: Read fault blocked by KUAP!
>>>>>>>> WARNING: CPU: 1 PID: 101841 at arch/powerpc/mm/fault.c:229
>>>>>>>> __do_page_fault+0x6b4/0xcd0
>>>>>>>> NIP [c00000000009e7e4] __do_page_fault+0x6b4/0xcd0
>>>>>>>> LR [c00000000009e7e0] __do_page_fault+0x6b0/0xcd0
>>>>>>>> ..........
>>>>>>>> Call Trace:
>>>>>>>> [c000000016367330] [c00000000009e7e0] __do_page_fault+0x6b0/0xcd0
>>>>>>>> (unreliable)
>>>>>>>> [c0000000163673e0] [c00000000009ee3c] do_page_fault+0x3c/0x120
>>>>>>>> [c000000016367430] [c00000000000c848] handle_page_fault+0x10/0x2c
>>>>>>>> --- interrupt: 300 at iov_iter_fault_in_readable+0x148/0x6f0
>>>>>>>> ..........
>>>>>>>> NIP [c0000000008e8228] iov_iter_fault_in_readable+0x148/0x6f0
>>>>>>>> LR [c0000000008e834c] iov_iter_fault_in_readable+0x26c/0x6f0
>>>>>>>> interrupt: 300
>>>>>>>> [c0000000163677e0] [c0000000007154a0] iomap_write_actor+0xc0/0x280
>>>>>>>> [c000000016367880] [c00000000070fc94] iomap_apply+0x1c4/0x780
>>>>>>>> [c000000016367990] [c000000000710330]
>>>>>>>> iomap_file_buffered_write+0xa0/0x120
>>>>>>>> [c0000000163679e0] [c00800000040791c]
>>>>>>>> xfs_file_buffered_aio_write+0x314/0x5e0 [xfs]
>>>>>>>> [c000000016367a90] [c0000000006d74bc] io_write+0x10c/0x460
>>>>>>>> [c000000016367bb0] [c0000000006d80e4] io_issue_sqe+0x8d4/0x1200
>>>>>>>> [c000000016367c70] [c0000000006d8ad0] io_wq_submit_work+0xc0/0x250
>>>>>>>> [c000000016367cb0] [c0000000006e2578]
>>>>>>>> io_worker_handle_work+0x498/0x800
>>>>>>>> [c000000016367d40] [c0000000006e2cdc] io_wqe_worker+0x3fc/0x4f0
>>>>>>>> [c000000016367da0] [c0000000001cb0a4] kthread+0x1c4/0x1d0
>>>>>>>> [c000000016367e10] [c00000000000dbf0]
>>>>>>>> ret_from_kernel_thread+0x5c/0x6c
>>>>>>>>
>>>>>>>> The kernel consider thread AMR value for kernel thread to be
>>>>>>>> AMR_KUAP_BLOCKED. Hence access to userspace is denied. This
>>>>>>>> of course not correct and we should allow userspace access after
>>>>>>>> kthread_use_mm(). To be precise, kthread_use_mm() should inherit the
>>>>>>>> AMR value of the operating address space. But, the AMR value is
>>>>>>>> thread-specific and we inherit the address space and not thread
>>>>>>>> access restrictions. Because of this ignore AMR value when accessing
>>>>>>>> userspace via kernel thread.
>>>>>>>>
>>>>>>>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
>>>>>>>> ---
>>>>>>>> Changes from v1:
>>>>>>>> * Address review feedback from Nick
>>>>>>>>
>>>>>>>> arch/powerpc/include/asm/book3s/64/kup.h | 8 +++++++-
>>>>>>>> 1 file changed, 7 insertions(+), 1 deletion(-)
>>>>>>>>
>>>>>>>> diff --git a/arch/powerpc/include/asm/book3s/64/kup.h
>>>>>>>> b/arch/powerpc/include/asm/book3s/64/kup.h
>>>>>>>> index f50f72e535aa..95f4df99249e 100644
>>>>>>>> --- a/arch/powerpc/include/asm/book3s/64/kup.h
>>>>>>>> +++ b/arch/powerpc/include/asm/book3s/64/kup.h
>>>>>>>> @@ -384,7 +384,13 @@ static __always_inline void
>>>>>>>> allow_user_access(void __user *to, const void __user
>>>>>>>> // This is written so we can resolve to a single case at build
>>>>>>>> time
>>>>>>>> BUILD_BUG_ON(!__builtin_constant_p(dir));
>>>>>>>> - if (mmu_has_feature(MMU_FTR_PKEY))
>>>>>>>> + /*
>>>>>>>> + * if it is a kthread that did kthread_use_mm() don't
>>>>>>>> + * use current_thread_amr().
>>>>>>>
>>>>>>> According to include/linux/sched.h, PF_KTHREAD means /* I am a kernel
>>>>>>> thread */
>>>>>>> It doesn't seem to be related to kthread_use_mm()
>>>>>>
>>>>>> That should be a sufficient check here. if we did reach here without
>>>>>> calling kthread_user_mm, we will crash on access because we don't have
>>>>>> a mm attached to the current process. a kernel thread with
>>>>>> kthread_use_mm has
>>>>>
>>>>> Ok but then the comment doesn't match the check.
>>>>
>>>>
>>>> I was trying to be explict in the comment that we expect the thread to
>>>> have done kthread_use_mm().
>>>>
>>>>>
>>>>> And also the comment in current_thread_amr() is then misleading.
>>>>>
>>>>> Why not do the current->flags & PF_KTHREAD check in current_thread_amr()
>>>>> and return 0 in that case instead of BLOCKED ?
>>>>
>>>> In my view currrent_thread_amr() is more generic and we want to be
>>>> explicit there that a kernel thread AMR is KUAP_BLOCKED. Only when we
>>>> call allow user access, we relax the AMR value.
>>>
>>> Just following up on this, as I'd hate to have 5.11 released with this
>>> bug in it for powerpc. It'll obviously also affect other cases of a
>>> kernel thread faulting after having done kthread_use_mm(), though I'm
>>> not sure how widespread that is. In any case, it'll leave io_uring
>>> mostly broken on powerpc if this isn't patched for release.
>>>
>>
>> I am waiting for test feedback on the change I posted earlier. I am also
>> running a regression run myself. Once that is complete i will post the patch
>> as a separate email.
>
> Are you waiting a test from me? Or someone else who test PPC? Although I'm
> the "Reported-by" of this bug, I just can help to verify this bug itself,
> I don't have enough test cases to do regression test from PPC side. Do you need
> me to verify this bug itself.
>
>
If you can verify this bug, it would be really helpful.
-aneesh
^ permalink raw reply
* Re: [PATCH] powerpc/fault: fix wrong KUAP fault for IO_URING
From: Zorro Lang @ 2021-02-04 17:57 UTC (permalink / raw)
To: Aneesh Kumar K.V; +Cc: Jens Axboe, linuxppc-dev, Nicholas Piggin
In-Reply-To: <490cea1a-8161-2f76-03e6-7e2b16efa648@linux.ibm.com>
On Thu, Feb 04, 2021 at 10:31:59PM +0530, Aneesh Kumar K.V wrote:
> On 2/4/21 10:23 PM, Jens Axboe wrote:
> > On 2/1/21 11:30 PM, Aneesh Kumar K.V wrote:
> > > On 2/2/21 11:50 AM, Christophe Leroy wrote:
> > > >
> > > >
> > > > Le 02/02/2021 à 07:16, Aneesh Kumar K.V a écrit :
> > > > > On 2/2/21 11:32 AM, Christophe Leroy wrote:
> > > > > >
> > > > > >
> > > > > > Le 02/02/2021 à 06:55, Aneesh Kumar K.V a écrit :
> > > > > > > Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> writes:
> > > > > > >
> > > > > > > > Nicholas Piggin <npiggin@gmail.com> writes:
> > > > > > > >
> > > > > > > > > Excerpts from Michael Ellerman's message of January 30, 2021 9:22 pm:
> > > > > > > > > > Christophe Leroy <christophe.leroy@csgroup.eu> writes:
> > > > > > > > > > > +Aneesh
> > > > > > > > > > >
> > > > > > > > > > > Le 29/01/2021 à 07:52, Zorro Lang a écrit :
> > > > > > > > > > ..
> > > > > > > > > > > > [ 96.200296] ------------[ cut here ]------------
> > > > > > > > > > > > [ 96.200304] Bug: Read fault blocked by KUAP!
> > > > > > > > > > > > [ 96.200309] WARNING: CPU: 3 PID: 1876 at
> > > > > > > > > > > > arch/powerpc/mm/fault.c:229 bad_kernel_fault+0x180/0x310
> > > > > > > > > > >
> > > > > > > > > > > > [ 96.200734] NIP [c000000000849424]
> > > > > > > > > > > > fault_in_pages_readable+0x104/0x350
> > > > > > > > > > > > [ 96.200741] LR [c00000000084952c]
> > > > > > > > > > > > fault_in_pages_readable+0x20c/0x350
> > > > > > > > > > > > [ 96.200747] --- interrupt: 300
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Problem happens in a section where userspace access is supposed
> > > > > > > > > > > to be granted, so the patch you
> > > > > > > > > > > proposed is definitely not the right fix.
> > > > > > > > > > >
> > > > > > > > > > > c000000000849408: 2c 01 00 4c isync
> > > > > > > > > > > c00000000084940c: a6 03 3d 7d mtspr 29,r9 <== granting
> > > > > > > > > > > userspace access permission
> > > > > > > > > > > c000000000849410: 2c 01 00 4c isync
> > > > > > > > > > > c000000000849414: 00 00 36 e9 ld r9,0(r22)
> > > > > > > > > > > c000000000849418: 20 00 29 81 lwz r9,32(r9)
> > > > > > > > > > > c00000000084941c: 00 02 29 71 andi. r9,r9,512
> > > > > > > > > > > c000000000849420: 78 d3 5e 7f mr r30,r26
> > > > > > > > > > > ==> c000000000849424: 00 00 bf 8b lbz r29,0(r31) <==
> > > > > > > > > > > accessing userspace
> > > > > > > > > > > c000000000849428: 10 00 82 41 beq c000000000849438
> > > > > > > > > > > <fault_in_pages_readable+0x118>
> > > > > > > > > > > c00000000084942c: 2c 01 00 4c isync
> > > > > > > > > > > c000000000849430: a6 03 bd 7e mtspr 29,r21 <==
> > > > > > > > > > > clearing userspace access permission
> > > > > > > > > > > c000000000849434: 2c 01 00 4c isync
> > > > > > > > > > >
> > > > > > > > > > > My first guess is that the problem is linked to the following
> > > > > > > > > > > function, see the comment
> > > > > > > > > > >
> > > > > > > > > > > /*
> > > > > > > > > > > * For kernel thread that doesn't have thread.regs return
> > > > > > > > > > > * default AMR/IAMR values.
> > > > > > > > > > > */
> > > > > > > > > > > static inline u64 current_thread_amr(void)
> > > > > > > > > > > {
> > > > > > > > > > > if (current->thread.regs)
> > > > > > > > > > > return current->thread.regs->amr;
> > > > > > > > > > > return AMR_KUAP_BLOCKED;
> > > > > > > > > > > }
> > > > > > > > > > >
> > > > > > > > > > > Above function was introduced by commit 48a8ab4eeb82
> > > > > > > > > > > ("powerpc/book3s64/pkeys: Don't update SPRN_AMR
> > > > > > > > > > > when in kernel mode")
> > > > > > > > > >
> > > > > > > > > > Yeah that's a bit of a curly one.
> > > > > > > > > >
> > > > > > > > > > At some point io_uring did kthread_use_mm(), which is supposed to
> > > > > > > > > > mean
> > > > > > > > > > the kthread can operate on behalf of the original process that
> > > > > > > > > > submitted
> > > > > > > > > > the IO.
> > > > > > > > > >
> > > > > > > > > > But because KUAP is implemented using memory protection keys, it
> > > > > > > > > > depends
> > > > > > > > > > on the value of the AMR register, which is not part of the mm,
> > > > > > > > > > it's in
> > > > > > > > > > thread.regs->amr.
> > > > > > > > > >
> > > > > > > > > > And what's worse by the time we're in kthread_use_mm() we no
> > > > > > > > > > longer have
> > > > > > > > > > access to the thread.regs->amr of the original process that
> > > > > > > > > > submitted
> > > > > > > > > > the IO.
> > > > > > > > > >
> > > > > > > > > > We also can't simply move the AMR into the mm, precisely because
> > > > > > > > > > it's
> > > > > > > > > > per thread, not per mm.
> > > > > > > > > >
> > > > > > > > > > So TBH I don't know how we're going to fix this.
> > > > > > > > > >
> > > > > > > > > > I guess we could return AMR=unblocked for kernel threads, but that's
> > > > > > > > > > arguably a bug because it allows a process to circumvent memory
> > > > > > > > > > keys by
> > > > > > > > > > asking the kernel to do the access.
> > > > > > > > >
> > > > > > > > > We shouldn't need to inherit AMR should we? We only need it to be
> > > > > > > > > locked
> > > > > > > > > for kernel threads until it's explicitly unlocked -- nothing mm
> > > > > > > > > specific
> > > > > > > > > there. I think current_thread_amr could return 0 for kernel
> > > > > > > > > threads? Or
> > > > > > > > > I would even avoid using that function for allow_user_access and open
> > > > > > > > > code the kthread case and remove it from current_thread_amr().
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > > Nick
> > > > > > > >
> > > > > > >
> > > > > > > updated one
> > > > > > >
> > > > > > > From 8fdb0680f983940d61f91da8252b13c8d3e8ebee Mon Sep 17 00:00:00 2001
> > > > > > > From: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
> > > > > > > Date: Tue, 2 Feb 2021 09:23:38 +0530
> > > > > > > Subject: [PATCH v2] powerpc/kuap: Allow kernel thread to access
> > > > > > > userspace
> > > > > > > after kthread_use_mm
> > > > > > >
> > > > > > > This fix the bad fault reported by KUAP when io_wqe_worker access
> > > > > > > userspace.
> > > > > > >
> > > > > > > Bug: Read fault blocked by KUAP!
> > > > > > > WARNING: CPU: 1 PID: 101841 at arch/powerpc/mm/fault.c:229
> > > > > > > __do_page_fault+0x6b4/0xcd0
> > > > > > > NIP [c00000000009e7e4] __do_page_fault+0x6b4/0xcd0
> > > > > > > LR [c00000000009e7e0] __do_page_fault+0x6b0/0xcd0
> > > > > > > ..........
> > > > > > > Call Trace:
> > > > > > > [c000000016367330] [c00000000009e7e0] __do_page_fault+0x6b0/0xcd0
> > > > > > > (unreliable)
> > > > > > > [c0000000163673e0] [c00000000009ee3c] do_page_fault+0x3c/0x120
> > > > > > > [c000000016367430] [c00000000000c848] handle_page_fault+0x10/0x2c
> > > > > > > --- interrupt: 300 at iov_iter_fault_in_readable+0x148/0x6f0
> > > > > > > ..........
> > > > > > > NIP [c0000000008e8228] iov_iter_fault_in_readable+0x148/0x6f0
> > > > > > > LR [c0000000008e834c] iov_iter_fault_in_readable+0x26c/0x6f0
> > > > > > > interrupt: 300
> > > > > > > [c0000000163677e0] [c0000000007154a0] iomap_write_actor+0xc0/0x280
> > > > > > > [c000000016367880] [c00000000070fc94] iomap_apply+0x1c4/0x780
> > > > > > > [c000000016367990] [c000000000710330]
> > > > > > > iomap_file_buffered_write+0xa0/0x120
> > > > > > > [c0000000163679e0] [c00800000040791c]
> > > > > > > xfs_file_buffered_aio_write+0x314/0x5e0 [xfs]
> > > > > > > [c000000016367a90] [c0000000006d74bc] io_write+0x10c/0x460
> > > > > > > [c000000016367bb0] [c0000000006d80e4] io_issue_sqe+0x8d4/0x1200
> > > > > > > [c000000016367c70] [c0000000006d8ad0] io_wq_submit_work+0xc0/0x250
> > > > > > > [c000000016367cb0] [c0000000006e2578]
> > > > > > > io_worker_handle_work+0x498/0x800
> > > > > > > [c000000016367d40] [c0000000006e2cdc] io_wqe_worker+0x3fc/0x4f0
> > > > > > > [c000000016367da0] [c0000000001cb0a4] kthread+0x1c4/0x1d0
> > > > > > > [c000000016367e10] [c00000000000dbf0]
> > > > > > > ret_from_kernel_thread+0x5c/0x6c
> > > > > > >
> > > > > > > The kernel consider thread AMR value for kernel thread to be
> > > > > > > AMR_KUAP_BLOCKED. Hence access to userspace is denied. This
> > > > > > > of course not correct and we should allow userspace access after
> > > > > > > kthread_use_mm(). To be precise, kthread_use_mm() should inherit the
> > > > > > > AMR value of the operating address space. But, the AMR value is
> > > > > > > thread-specific and we inherit the address space and not thread
> > > > > > > access restrictions. Because of this ignore AMR value when accessing
> > > > > > > userspace via kernel thread.
> > > > > > >
> > > > > > > Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> > > > > > > ---
> > > > > > > Changes from v1:
> > > > > > > * Address review feedback from Nick
> > > > > > >
> > > > > > > arch/powerpc/include/asm/book3s/64/kup.h | 8 +++++++-
> > > > > > > 1 file changed, 7 insertions(+), 1 deletion(-)
> > > > > > >
> > > > > > > diff --git a/arch/powerpc/include/asm/book3s/64/kup.h
> > > > > > > b/arch/powerpc/include/asm/book3s/64/kup.h
> > > > > > > index f50f72e535aa..95f4df99249e 100644
> > > > > > > --- a/arch/powerpc/include/asm/book3s/64/kup.h
> > > > > > > +++ b/arch/powerpc/include/asm/book3s/64/kup.h
> > > > > > > @@ -384,7 +384,13 @@ static __always_inline void
> > > > > > > allow_user_access(void __user *to, const void __user
> > > > > > > // This is written so we can resolve to a single case at build
> > > > > > > time
> > > > > > > BUILD_BUG_ON(!__builtin_constant_p(dir));
> > > > > > > - if (mmu_has_feature(MMU_FTR_PKEY))
> > > > > > > + /*
> > > > > > > + * if it is a kthread that did kthread_use_mm() don't
> > > > > > > + * use current_thread_amr().
> > > > > >
> > > > > > According to include/linux/sched.h, PF_KTHREAD means /* I am a kernel
> > > > > > thread */
> > > > > > It doesn't seem to be related to kthread_use_mm()
> > > > >
> > > > > That should be a sufficient check here. if we did reach here without
> > > > > calling kthread_user_mm, we will crash on access because we don't have
> > > > > a mm attached to the current process. a kernel thread with
> > > > > kthread_use_mm has
> > > >
> > > > Ok but then the comment doesn't match the check.
> > >
> > >
> > > I was trying to be explict in the comment that we expect the thread to
> > > have done kthread_use_mm().
> > >
> > > >
> > > > And also the comment in current_thread_amr() is then misleading.
> > > >
> > > > Why not do the current->flags & PF_KTHREAD check in current_thread_amr()
> > > > and return 0 in that case instead of BLOCKED ?
> > >
> > > In my view currrent_thread_amr() is more generic and we want to be
> > > explicit there that a kernel thread AMR is KUAP_BLOCKED. Only when we
> > > call allow user access, we relax the AMR value.
> >
> > Just following up on this, as I'd hate to have 5.11 released with this
> > bug in it for powerpc. It'll obviously also affect other cases of a
> > kernel thread faulting after having done kthread_use_mm(), though I'm
> > not sure how widespread that is. In any case, it'll leave io_uring
> > mostly broken on powerpc if this isn't patched for release.
> >
>
> I am waiting for test feedback on the change I posted earlier. I am also
> running a regression run myself. Once that is complete i will post the patch
> as a separate email.
Are you waiting a test from me? Or someone else who test PPC? Although I'm
the "Reported-by" of this bug, I just can help to verify this bug itself,
I don't have enough test cases to do regression test from PPC side. Do you need
me to verify this bug itself.
Thanks,
Zorro
>
> -aneesh
>
^ permalink raw reply
* Re: [PATCH] powerpc/fault: fix wrong KUAP fault for IO_URING
From: Jens Axboe @ 2021-02-04 17:03 UTC (permalink / raw)
To: Aneesh Kumar K.V, Christophe Leroy, Nicholas Piggin,
Michael Ellerman, Zorro Lang
Cc: linuxppc-dev
In-Reply-To: <490cea1a-8161-2f76-03e6-7e2b16efa648@linux.ibm.com>
On 2/4/21 10:01 AM, Aneesh Kumar K.V wrote:
> On 2/4/21 10:23 PM, Jens Axboe wrote:
>> On 2/1/21 11:30 PM, Aneesh Kumar K.V wrote:
>>> On 2/2/21 11:50 AM, Christophe Leroy wrote:
>>>>
>>>>
>>>> Le 02/02/2021 à 07:16, Aneesh Kumar K.V a écrit :
>>>>> On 2/2/21 11:32 AM, Christophe Leroy wrote:
>>>>>>
>>>>>>
>>>>>> Le 02/02/2021 à 06:55, Aneesh Kumar K.V a écrit :
>>>>>>> Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> writes:
>>>>>>>
>>>>>>>> Nicholas Piggin <npiggin@gmail.com> writes:
>>>>>>>>
>>>>>>>>> Excerpts from Michael Ellerman's message of January 30, 2021 9:22 pm:
>>>>>>>>>> Christophe Leroy <christophe.leroy@csgroup.eu> writes:
>>>>>>>>>>> +Aneesh
>>>>>>>>>>>
>>>>>>>>>>> Le 29/01/2021 à 07:52, Zorro Lang a écrit :
>>>>>>>>>> ..
>>>>>>>>>>>> [ 96.200296] ------------[ cut here ]------------
>>>>>>>>>>>> [ 96.200304] Bug: Read fault blocked by KUAP!
>>>>>>>>>>>> [ 96.200309] WARNING: CPU: 3 PID: 1876 at
>>>>>>>>>>>> arch/powerpc/mm/fault.c:229 bad_kernel_fault+0x180/0x310
>>>>>>>>>>>
>>>>>>>>>>>> [ 96.200734] NIP [c000000000849424]
>>>>>>>>>>>> fault_in_pages_readable+0x104/0x350
>>>>>>>>>>>> [ 96.200741] LR [c00000000084952c]
>>>>>>>>>>>> fault_in_pages_readable+0x20c/0x350
>>>>>>>>>>>> [ 96.200747] --- interrupt: 300
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Problem happens in a section where userspace access is supposed
>>>>>>>>>>> to be granted, so the patch you
>>>>>>>>>>> proposed is definitely not the right fix.
>>>>>>>>>>>
>>>>>>>>>>> c000000000849408: 2c 01 00 4c isync
>>>>>>>>>>> c00000000084940c: a6 03 3d 7d mtspr 29,r9 <== granting
>>>>>>>>>>> userspace access permission
>>>>>>>>>>> c000000000849410: 2c 01 00 4c isync
>>>>>>>>>>> c000000000849414: 00 00 36 e9 ld r9,0(r22)
>>>>>>>>>>> c000000000849418: 20 00 29 81 lwz r9,32(r9)
>>>>>>>>>>> c00000000084941c: 00 02 29 71 andi. r9,r9,512
>>>>>>>>>>> c000000000849420: 78 d3 5e 7f mr r30,r26
>>>>>>>>>>> ==> c000000000849424: 00 00 bf 8b lbz r29,0(r31) <==
>>>>>>>>>>> accessing userspace
>>>>>>>>>>> c000000000849428: 10 00 82 41 beq c000000000849438
>>>>>>>>>>> <fault_in_pages_readable+0x118>
>>>>>>>>>>> c00000000084942c: 2c 01 00 4c isync
>>>>>>>>>>> c000000000849430: a6 03 bd 7e mtspr 29,r21 <==
>>>>>>>>>>> clearing userspace access permission
>>>>>>>>>>> c000000000849434: 2c 01 00 4c isync
>>>>>>>>>>>
>>>>>>>>>>> My first guess is that the problem is linked to the following
>>>>>>>>>>> function, see the comment
>>>>>>>>>>>
>>>>>>>>>>> /*
>>>>>>>>>>> * For kernel thread that doesn't have thread.regs return
>>>>>>>>>>> * default AMR/IAMR values.
>>>>>>>>>>> */
>>>>>>>>>>> static inline u64 current_thread_amr(void)
>>>>>>>>>>> {
>>>>>>>>>>> if (current->thread.regs)
>>>>>>>>>>> return current->thread.regs->amr;
>>>>>>>>>>> return AMR_KUAP_BLOCKED;
>>>>>>>>>>> }
>>>>>>>>>>>
>>>>>>>>>>> Above function was introduced by commit 48a8ab4eeb82
>>>>>>>>>>> ("powerpc/book3s64/pkeys: Don't update SPRN_AMR
>>>>>>>>>>> when in kernel mode")
>>>>>>>>>>
>>>>>>>>>> Yeah that's a bit of a curly one.
>>>>>>>>>>
>>>>>>>>>> At some point io_uring did kthread_use_mm(), which is supposed to
>>>>>>>>>> mean
>>>>>>>>>> the kthread can operate on behalf of the original process that
>>>>>>>>>> submitted
>>>>>>>>>> the IO.
>>>>>>>>>>
>>>>>>>>>> But because KUAP is implemented using memory protection keys, it
>>>>>>>>>> depends
>>>>>>>>>> on the value of the AMR register, which is not part of the mm,
>>>>>>>>>> it's in
>>>>>>>>>> thread.regs->amr.
>>>>>>>>>>
>>>>>>>>>> And what's worse by the time we're in kthread_use_mm() we no
>>>>>>>>>> longer have
>>>>>>>>>> access to the thread.regs->amr of the original process that
>>>>>>>>>> submitted
>>>>>>>>>> the IO.
>>>>>>>>>>
>>>>>>>>>> We also can't simply move the AMR into the mm, precisely because
>>>>>>>>>> it's
>>>>>>>>>> per thread, not per mm.
>>>>>>>>>>
>>>>>>>>>> So TBH I don't know how we're going to fix this.
>>>>>>>>>>
>>>>>>>>>> I guess we could return AMR=unblocked for kernel threads, but that's
>>>>>>>>>> arguably a bug because it allows a process to circumvent memory
>>>>>>>>>> keys by
>>>>>>>>>> asking the kernel to do the access.
>>>>>>>>>
>>>>>>>>> We shouldn't need to inherit AMR should we? We only need it to be
>>>>>>>>> locked
>>>>>>>>> for kernel threads until it's explicitly unlocked -- nothing mm
>>>>>>>>> specific
>>>>>>>>> there. I think current_thread_amr could return 0 for kernel
>>>>>>>>> threads? Or
>>>>>>>>> I would even avoid using that function for allow_user_access and open
>>>>>>>>> code the kthread case and remove it from current_thread_amr().
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Nick
>>>>>>>>
>>>>>>>
>>>>>>> updated one
>>>>>>>
>>>>>>> From 8fdb0680f983940d61f91da8252b13c8d3e8ebee Mon Sep 17 00:00:00 2001
>>>>>>> From: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
>>>>>>> Date: Tue, 2 Feb 2021 09:23:38 +0530
>>>>>>> Subject: [PATCH v2] powerpc/kuap: Allow kernel thread to access
>>>>>>> userspace
>>>>>>> after kthread_use_mm
>>>>>>>
>>>>>>> This fix the bad fault reported by KUAP when io_wqe_worker access
>>>>>>> userspace.
>>>>>>>
>>>>>>> Bug: Read fault blocked by KUAP!
>>>>>>> WARNING: CPU: 1 PID: 101841 at arch/powerpc/mm/fault.c:229
>>>>>>> __do_page_fault+0x6b4/0xcd0
>>>>>>> NIP [c00000000009e7e4] __do_page_fault+0x6b4/0xcd0
>>>>>>> LR [c00000000009e7e0] __do_page_fault+0x6b0/0xcd0
>>>>>>> ..........
>>>>>>> Call Trace:
>>>>>>> [c000000016367330] [c00000000009e7e0] __do_page_fault+0x6b0/0xcd0
>>>>>>> (unreliable)
>>>>>>> [c0000000163673e0] [c00000000009ee3c] do_page_fault+0x3c/0x120
>>>>>>> [c000000016367430] [c00000000000c848] handle_page_fault+0x10/0x2c
>>>>>>> --- interrupt: 300 at iov_iter_fault_in_readable+0x148/0x6f0
>>>>>>> ..........
>>>>>>> NIP [c0000000008e8228] iov_iter_fault_in_readable+0x148/0x6f0
>>>>>>> LR [c0000000008e834c] iov_iter_fault_in_readable+0x26c/0x6f0
>>>>>>> interrupt: 300
>>>>>>> [c0000000163677e0] [c0000000007154a0] iomap_write_actor+0xc0/0x280
>>>>>>> [c000000016367880] [c00000000070fc94] iomap_apply+0x1c4/0x780
>>>>>>> [c000000016367990] [c000000000710330]
>>>>>>> iomap_file_buffered_write+0xa0/0x120
>>>>>>> [c0000000163679e0] [c00800000040791c]
>>>>>>> xfs_file_buffered_aio_write+0x314/0x5e0 [xfs]
>>>>>>> [c000000016367a90] [c0000000006d74bc] io_write+0x10c/0x460
>>>>>>> [c000000016367bb0] [c0000000006d80e4] io_issue_sqe+0x8d4/0x1200
>>>>>>> [c000000016367c70] [c0000000006d8ad0] io_wq_submit_work+0xc0/0x250
>>>>>>> [c000000016367cb0] [c0000000006e2578]
>>>>>>> io_worker_handle_work+0x498/0x800
>>>>>>> [c000000016367d40] [c0000000006e2cdc] io_wqe_worker+0x3fc/0x4f0
>>>>>>> [c000000016367da0] [c0000000001cb0a4] kthread+0x1c4/0x1d0
>>>>>>> [c000000016367e10] [c00000000000dbf0]
>>>>>>> ret_from_kernel_thread+0x5c/0x6c
>>>>>>>
>>>>>>> The kernel consider thread AMR value for kernel thread to be
>>>>>>> AMR_KUAP_BLOCKED. Hence access to userspace is denied. This
>>>>>>> of course not correct and we should allow userspace access after
>>>>>>> kthread_use_mm(). To be precise, kthread_use_mm() should inherit the
>>>>>>> AMR value of the operating address space. But, the AMR value is
>>>>>>> thread-specific and we inherit the address space and not thread
>>>>>>> access restrictions. Because of this ignore AMR value when accessing
>>>>>>> userspace via kernel thread.
>>>>>>>
>>>>>>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
>>>>>>> ---
>>>>>>> Changes from v1:
>>>>>>> * Address review feedback from Nick
>>>>>>>
>>>>>>> arch/powerpc/include/asm/book3s/64/kup.h | 8 +++++++-
>>>>>>> 1 file changed, 7 insertions(+), 1 deletion(-)
>>>>>>>
>>>>>>> diff --git a/arch/powerpc/include/asm/book3s/64/kup.h
>>>>>>> b/arch/powerpc/include/asm/book3s/64/kup.h
>>>>>>> index f50f72e535aa..95f4df99249e 100644
>>>>>>> --- a/arch/powerpc/include/asm/book3s/64/kup.h
>>>>>>> +++ b/arch/powerpc/include/asm/book3s/64/kup.h
>>>>>>> @@ -384,7 +384,13 @@ static __always_inline void
>>>>>>> allow_user_access(void __user *to, const void __user
>>>>>>> // This is written so we can resolve to a single case at build
>>>>>>> time
>>>>>>> BUILD_BUG_ON(!__builtin_constant_p(dir));
>>>>>>> - if (mmu_has_feature(MMU_FTR_PKEY))
>>>>>>> + /*
>>>>>>> + * if it is a kthread that did kthread_use_mm() don't
>>>>>>> + * use current_thread_amr().
>>>>>>
>>>>>> According to include/linux/sched.h, PF_KTHREAD means /* I am a kernel
>>>>>> thread */
>>>>>> It doesn't seem to be related to kthread_use_mm()
>>>>>
>>>>> That should be a sufficient check here. if we did reach here without
>>>>> calling kthread_user_mm, we will crash on access because we don't have
>>>>> a mm attached to the current process. a kernel thread with
>>>>> kthread_use_mm has
>>>>
>>>> Ok but then the comment doesn't match the check.
>>>
>>>
>>> I was trying to be explict in the comment that we expect the thread to
>>> have done kthread_use_mm().
>>>
>>>>
>>>> And also the comment in current_thread_amr() is then misleading.
>>>>
>>>> Why not do the current->flags & PF_KTHREAD check in current_thread_amr()
>>>> and return 0 in that case instead of BLOCKED ?
>>>
>>> In my view currrent_thread_amr() is more generic and we want to be
>>> explicit there that a kernel thread AMR is KUAP_BLOCKED. Only when we
>>> call allow user access, we relax the AMR value.
>>
>> Just following up on this, as I'd hate to have 5.11 released with this
>> bug in it for powerpc. It'll obviously also affect other cases of a
>> kernel thread faulting after having done kthread_use_mm(), though I'm
>> not sure how widespread that is. In any case, it'll leave io_uring
>> mostly broken on powerpc if this isn't patched for release.
>>
>
> I am waiting for test feedback on the change I posted earlier. I am also
> running a regression run myself. Once that is complete i will post the
> patch as a separate email.
Perfect, sounds good!
--
Jens Axboe
^ permalink raw reply
* Re: [PATCH] powerpc/fault: fix wrong KUAP fault for IO_URING
From: Aneesh Kumar K.V @ 2021-02-04 17:01 UTC (permalink / raw)
To: Jens Axboe, Christophe Leroy, Nicholas Piggin, Michael Ellerman,
Zorro Lang
Cc: linuxppc-dev
In-Reply-To: <086f6525-b851-c4c0-d611-42f76a54a2d9@kernel.dk>
On 2/4/21 10:23 PM, Jens Axboe wrote:
> On 2/1/21 11:30 PM, Aneesh Kumar K.V wrote:
>> On 2/2/21 11:50 AM, Christophe Leroy wrote:
>>>
>>>
>>> Le 02/02/2021 à 07:16, Aneesh Kumar K.V a écrit :
>>>> On 2/2/21 11:32 AM, Christophe Leroy wrote:
>>>>>
>>>>>
>>>>> Le 02/02/2021 à 06:55, Aneesh Kumar K.V a écrit :
>>>>>> Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> writes:
>>>>>>
>>>>>>> Nicholas Piggin <npiggin@gmail.com> writes:
>>>>>>>
>>>>>>>> Excerpts from Michael Ellerman's message of January 30, 2021 9:22 pm:
>>>>>>>>> Christophe Leroy <christophe.leroy@csgroup.eu> writes:
>>>>>>>>>> +Aneesh
>>>>>>>>>>
>>>>>>>>>> Le 29/01/2021 à 07:52, Zorro Lang a écrit :
>>>>>>>>> ..
>>>>>>>>>>> [ 96.200296] ------------[ cut here ]------------
>>>>>>>>>>> [ 96.200304] Bug: Read fault blocked by KUAP!
>>>>>>>>>>> [ 96.200309] WARNING: CPU: 3 PID: 1876 at
>>>>>>>>>>> arch/powerpc/mm/fault.c:229 bad_kernel_fault+0x180/0x310
>>>>>>>>>>
>>>>>>>>>>> [ 96.200734] NIP [c000000000849424]
>>>>>>>>>>> fault_in_pages_readable+0x104/0x350
>>>>>>>>>>> [ 96.200741] LR [c00000000084952c]
>>>>>>>>>>> fault_in_pages_readable+0x20c/0x350
>>>>>>>>>>> [ 96.200747] --- interrupt: 300
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Problem happens in a section where userspace access is supposed
>>>>>>>>>> to be granted, so the patch you
>>>>>>>>>> proposed is definitely not the right fix.
>>>>>>>>>>
>>>>>>>>>> c000000000849408: 2c 01 00 4c isync
>>>>>>>>>> c00000000084940c: a6 03 3d 7d mtspr 29,r9 <== granting
>>>>>>>>>> userspace access permission
>>>>>>>>>> c000000000849410: 2c 01 00 4c isync
>>>>>>>>>> c000000000849414: 00 00 36 e9 ld r9,0(r22)
>>>>>>>>>> c000000000849418: 20 00 29 81 lwz r9,32(r9)
>>>>>>>>>> c00000000084941c: 00 02 29 71 andi. r9,r9,512
>>>>>>>>>> c000000000849420: 78 d3 5e 7f mr r30,r26
>>>>>>>>>> ==> c000000000849424: 00 00 bf 8b lbz r29,0(r31) <==
>>>>>>>>>> accessing userspace
>>>>>>>>>> c000000000849428: 10 00 82 41 beq c000000000849438
>>>>>>>>>> <fault_in_pages_readable+0x118>
>>>>>>>>>> c00000000084942c: 2c 01 00 4c isync
>>>>>>>>>> c000000000849430: a6 03 bd 7e mtspr 29,r21 <==
>>>>>>>>>> clearing userspace access permission
>>>>>>>>>> c000000000849434: 2c 01 00 4c isync
>>>>>>>>>>
>>>>>>>>>> My first guess is that the problem is linked to the following
>>>>>>>>>> function, see the comment
>>>>>>>>>>
>>>>>>>>>> /*
>>>>>>>>>> * For kernel thread that doesn't have thread.regs return
>>>>>>>>>> * default AMR/IAMR values.
>>>>>>>>>> */
>>>>>>>>>> static inline u64 current_thread_amr(void)
>>>>>>>>>> {
>>>>>>>>>> if (current->thread.regs)
>>>>>>>>>> return current->thread.regs->amr;
>>>>>>>>>> return AMR_KUAP_BLOCKED;
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>> Above function was introduced by commit 48a8ab4eeb82
>>>>>>>>>> ("powerpc/book3s64/pkeys: Don't update SPRN_AMR
>>>>>>>>>> when in kernel mode")
>>>>>>>>>
>>>>>>>>> Yeah that's a bit of a curly one.
>>>>>>>>>
>>>>>>>>> At some point io_uring did kthread_use_mm(), which is supposed to
>>>>>>>>> mean
>>>>>>>>> the kthread can operate on behalf of the original process that
>>>>>>>>> submitted
>>>>>>>>> the IO.
>>>>>>>>>
>>>>>>>>> But because KUAP is implemented using memory protection keys, it
>>>>>>>>> depends
>>>>>>>>> on the value of the AMR register, which is not part of the mm,
>>>>>>>>> it's in
>>>>>>>>> thread.regs->amr.
>>>>>>>>>
>>>>>>>>> And what's worse by the time we're in kthread_use_mm() we no
>>>>>>>>> longer have
>>>>>>>>> access to the thread.regs->amr of the original process that
>>>>>>>>> submitted
>>>>>>>>> the IO.
>>>>>>>>>
>>>>>>>>> We also can't simply move the AMR into the mm, precisely because
>>>>>>>>> it's
>>>>>>>>> per thread, not per mm.
>>>>>>>>>
>>>>>>>>> So TBH I don't know how we're going to fix this.
>>>>>>>>>
>>>>>>>>> I guess we could return AMR=unblocked for kernel threads, but that's
>>>>>>>>> arguably a bug because it allows a process to circumvent memory
>>>>>>>>> keys by
>>>>>>>>> asking the kernel to do the access.
>>>>>>>>
>>>>>>>> We shouldn't need to inherit AMR should we? We only need it to be
>>>>>>>> locked
>>>>>>>> for kernel threads until it's explicitly unlocked -- nothing mm
>>>>>>>> specific
>>>>>>>> there. I think current_thread_amr could return 0 for kernel
>>>>>>>> threads? Or
>>>>>>>> I would even avoid using that function for allow_user_access and open
>>>>>>>> code the kthread case and remove it from current_thread_amr().
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Nick
>>>>>>>
>>>>>>
>>>>>> updated one
>>>>>>
>>>>>> From 8fdb0680f983940d61f91da8252b13c8d3e8ebee Mon Sep 17 00:00:00 2001
>>>>>> From: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
>>>>>> Date: Tue, 2 Feb 2021 09:23:38 +0530
>>>>>> Subject: [PATCH v2] powerpc/kuap: Allow kernel thread to access
>>>>>> userspace
>>>>>> after kthread_use_mm
>>>>>>
>>>>>> This fix the bad fault reported by KUAP when io_wqe_worker access
>>>>>> userspace.
>>>>>>
>>>>>> Bug: Read fault blocked by KUAP!
>>>>>> WARNING: CPU: 1 PID: 101841 at arch/powerpc/mm/fault.c:229
>>>>>> __do_page_fault+0x6b4/0xcd0
>>>>>> NIP [c00000000009e7e4] __do_page_fault+0x6b4/0xcd0
>>>>>> LR [c00000000009e7e0] __do_page_fault+0x6b0/0xcd0
>>>>>> ..........
>>>>>> Call Trace:
>>>>>> [c000000016367330] [c00000000009e7e0] __do_page_fault+0x6b0/0xcd0
>>>>>> (unreliable)
>>>>>> [c0000000163673e0] [c00000000009ee3c] do_page_fault+0x3c/0x120
>>>>>> [c000000016367430] [c00000000000c848] handle_page_fault+0x10/0x2c
>>>>>> --- interrupt: 300 at iov_iter_fault_in_readable+0x148/0x6f0
>>>>>> ..........
>>>>>> NIP [c0000000008e8228] iov_iter_fault_in_readable+0x148/0x6f0
>>>>>> LR [c0000000008e834c] iov_iter_fault_in_readable+0x26c/0x6f0
>>>>>> interrupt: 300
>>>>>> [c0000000163677e0] [c0000000007154a0] iomap_write_actor+0xc0/0x280
>>>>>> [c000000016367880] [c00000000070fc94] iomap_apply+0x1c4/0x780
>>>>>> [c000000016367990] [c000000000710330]
>>>>>> iomap_file_buffered_write+0xa0/0x120
>>>>>> [c0000000163679e0] [c00800000040791c]
>>>>>> xfs_file_buffered_aio_write+0x314/0x5e0 [xfs]
>>>>>> [c000000016367a90] [c0000000006d74bc] io_write+0x10c/0x460
>>>>>> [c000000016367bb0] [c0000000006d80e4] io_issue_sqe+0x8d4/0x1200
>>>>>> [c000000016367c70] [c0000000006d8ad0] io_wq_submit_work+0xc0/0x250
>>>>>> [c000000016367cb0] [c0000000006e2578]
>>>>>> io_worker_handle_work+0x498/0x800
>>>>>> [c000000016367d40] [c0000000006e2cdc] io_wqe_worker+0x3fc/0x4f0
>>>>>> [c000000016367da0] [c0000000001cb0a4] kthread+0x1c4/0x1d0
>>>>>> [c000000016367e10] [c00000000000dbf0]
>>>>>> ret_from_kernel_thread+0x5c/0x6c
>>>>>>
>>>>>> The kernel consider thread AMR value for kernel thread to be
>>>>>> AMR_KUAP_BLOCKED. Hence access to userspace is denied. This
>>>>>> of course not correct and we should allow userspace access after
>>>>>> kthread_use_mm(). To be precise, kthread_use_mm() should inherit the
>>>>>> AMR value of the operating address space. But, the AMR value is
>>>>>> thread-specific and we inherit the address space and not thread
>>>>>> access restrictions. Because of this ignore AMR value when accessing
>>>>>> userspace via kernel thread.
>>>>>>
>>>>>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
>>>>>> ---
>>>>>> Changes from v1:
>>>>>> * Address review feedback from Nick
>>>>>>
>>>>>> arch/powerpc/include/asm/book3s/64/kup.h | 8 +++++++-
>>>>>> 1 file changed, 7 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/arch/powerpc/include/asm/book3s/64/kup.h
>>>>>> b/arch/powerpc/include/asm/book3s/64/kup.h
>>>>>> index f50f72e535aa..95f4df99249e 100644
>>>>>> --- a/arch/powerpc/include/asm/book3s/64/kup.h
>>>>>> +++ b/arch/powerpc/include/asm/book3s/64/kup.h
>>>>>> @@ -384,7 +384,13 @@ static __always_inline void
>>>>>> allow_user_access(void __user *to, const void __user
>>>>>> // This is written so we can resolve to a single case at build
>>>>>> time
>>>>>> BUILD_BUG_ON(!__builtin_constant_p(dir));
>>>>>> - if (mmu_has_feature(MMU_FTR_PKEY))
>>>>>> + /*
>>>>>> + * if it is a kthread that did kthread_use_mm() don't
>>>>>> + * use current_thread_amr().
>>>>>
>>>>> According to include/linux/sched.h, PF_KTHREAD means /* I am a kernel
>>>>> thread */
>>>>> It doesn't seem to be related to kthread_use_mm()
>>>>
>>>> That should be a sufficient check here. if we did reach here without
>>>> calling kthread_user_mm, we will crash on access because we don't have
>>>> a mm attached to the current process. a kernel thread with
>>>> kthread_use_mm has
>>>
>>> Ok but then the comment doesn't match the check.
>>
>>
>> I was trying to be explict in the comment that we expect the thread to
>> have done kthread_use_mm().
>>
>>>
>>> And also the comment in current_thread_amr() is then misleading.
>>>
>>> Why not do the current->flags & PF_KTHREAD check in current_thread_amr()
>>> and return 0 in that case instead of BLOCKED ?
>>
>> In my view currrent_thread_amr() is more generic and we want to be
>> explicit there that a kernel thread AMR is KUAP_BLOCKED. Only when we
>> call allow user access, we relax the AMR value.
>
> Just following up on this, as I'd hate to have 5.11 released with this
> bug in it for powerpc. It'll obviously also affect other cases of a
> kernel thread faulting after having done kthread_use_mm(), though I'm
> not sure how widespread that is. In any case, it'll leave io_uring
> mostly broken on powerpc if this isn't patched for release.
>
I am waiting for test feedback on the change I posted earlier. I am also
running a regression run myself. Once that is complete i will post the
patch as a separate email.
-aneesh
^ permalink raw reply
* Re: [PATCH] powerpc/fault: fix wrong KUAP fault for IO_URING
From: Jens Axboe @ 2021-02-04 16:53 UTC (permalink / raw)
To: Aneesh Kumar K.V, Christophe Leroy, Nicholas Piggin,
Michael Ellerman, Zorro Lang
Cc: linuxppc-dev
In-Reply-To: <e2dfb7ec-a4d4-579d-a79b-6709df6e9e50@linux.ibm.com>
On 2/1/21 11:30 PM, Aneesh Kumar K.V wrote:
> On 2/2/21 11:50 AM, Christophe Leroy wrote:
>>
>>
>> Le 02/02/2021 à 07:16, Aneesh Kumar K.V a écrit :
>>> On 2/2/21 11:32 AM, Christophe Leroy wrote:
>>>>
>>>>
>>>> Le 02/02/2021 à 06:55, Aneesh Kumar K.V a écrit :
>>>>> Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> writes:
>>>>>
>>>>>> Nicholas Piggin <npiggin@gmail.com> writes:
>>>>>>
>>>>>>> Excerpts from Michael Ellerman's message of January 30, 2021 9:22 pm:
>>>>>>>> Christophe Leroy <christophe.leroy@csgroup.eu> writes:
>>>>>>>>> +Aneesh
>>>>>>>>>
>>>>>>>>> Le 29/01/2021 à 07:52, Zorro Lang a écrit :
>>>>>>>> ..
>>>>>>>>>> [ 96.200296] ------------[ cut here ]------------
>>>>>>>>>> [ 96.200304] Bug: Read fault blocked by KUAP!
>>>>>>>>>> [ 96.200309] WARNING: CPU: 3 PID: 1876 at
>>>>>>>>>> arch/powerpc/mm/fault.c:229 bad_kernel_fault+0x180/0x310
>>>>>>>>>
>>>>>>>>>> [ 96.200734] NIP [c000000000849424]
>>>>>>>>>> fault_in_pages_readable+0x104/0x350
>>>>>>>>>> [ 96.200741] LR [c00000000084952c]
>>>>>>>>>> fault_in_pages_readable+0x20c/0x350
>>>>>>>>>> [ 96.200747] --- interrupt: 300
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Problem happens in a section where userspace access is supposed
>>>>>>>>> to be granted, so the patch you
>>>>>>>>> proposed is definitely not the right fix.
>>>>>>>>>
>>>>>>>>> c000000000849408: 2c 01 00 4c isync
>>>>>>>>> c00000000084940c: a6 03 3d 7d mtspr 29,r9 <== granting
>>>>>>>>> userspace access permission
>>>>>>>>> c000000000849410: 2c 01 00 4c isync
>>>>>>>>> c000000000849414: 00 00 36 e9 ld r9,0(r22)
>>>>>>>>> c000000000849418: 20 00 29 81 lwz r9,32(r9)
>>>>>>>>> c00000000084941c: 00 02 29 71 andi. r9,r9,512
>>>>>>>>> c000000000849420: 78 d3 5e 7f mr r30,r26
>>>>>>>>> ==> c000000000849424: 00 00 bf 8b lbz r29,0(r31) <==
>>>>>>>>> accessing userspace
>>>>>>>>> c000000000849428: 10 00 82 41 beq c000000000849438
>>>>>>>>> <fault_in_pages_readable+0x118>
>>>>>>>>> c00000000084942c: 2c 01 00 4c isync
>>>>>>>>> c000000000849430: a6 03 bd 7e mtspr 29,r21 <==
>>>>>>>>> clearing userspace access permission
>>>>>>>>> c000000000849434: 2c 01 00 4c isync
>>>>>>>>>
>>>>>>>>> My first guess is that the problem is linked to the following
>>>>>>>>> function, see the comment
>>>>>>>>>
>>>>>>>>> /*
>>>>>>>>> * For kernel thread that doesn't have thread.regs return
>>>>>>>>> * default AMR/IAMR values.
>>>>>>>>> */
>>>>>>>>> static inline u64 current_thread_amr(void)
>>>>>>>>> {
>>>>>>>>> if (current->thread.regs)
>>>>>>>>> return current->thread.regs->amr;
>>>>>>>>> return AMR_KUAP_BLOCKED;
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> Above function was introduced by commit 48a8ab4eeb82
>>>>>>>>> ("powerpc/book3s64/pkeys: Don't update SPRN_AMR
>>>>>>>>> when in kernel mode")
>>>>>>>>
>>>>>>>> Yeah that's a bit of a curly one.
>>>>>>>>
>>>>>>>> At some point io_uring did kthread_use_mm(), which is supposed to
>>>>>>>> mean
>>>>>>>> the kthread can operate on behalf of the original process that
>>>>>>>> submitted
>>>>>>>> the IO.
>>>>>>>>
>>>>>>>> But because KUAP is implemented using memory protection keys, it
>>>>>>>> depends
>>>>>>>> on the value of the AMR register, which is not part of the mm,
>>>>>>>> it's in
>>>>>>>> thread.regs->amr.
>>>>>>>>
>>>>>>>> And what's worse by the time we're in kthread_use_mm() we no
>>>>>>>> longer have
>>>>>>>> access to the thread.regs->amr of the original process that
>>>>>>>> submitted
>>>>>>>> the IO.
>>>>>>>>
>>>>>>>> We also can't simply move the AMR into the mm, precisely because
>>>>>>>> it's
>>>>>>>> per thread, not per mm.
>>>>>>>>
>>>>>>>> So TBH I don't know how we're going to fix this.
>>>>>>>>
>>>>>>>> I guess we could return AMR=unblocked for kernel threads, but that's
>>>>>>>> arguably a bug because it allows a process to circumvent memory
>>>>>>>> keys by
>>>>>>>> asking the kernel to do the access.
>>>>>>>
>>>>>>> We shouldn't need to inherit AMR should we? We only need it to be
>>>>>>> locked
>>>>>>> for kernel threads until it's explicitly unlocked -- nothing mm
>>>>>>> specific
>>>>>>> there. I think current_thread_amr could return 0 for kernel
>>>>>>> threads? Or
>>>>>>> I would even avoid using that function for allow_user_access and open
>>>>>>> code the kthread case and remove it from current_thread_amr().
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Nick
>>>>>>
>>>>>
>>>>> updated one
>>>>>
>>>>> From 8fdb0680f983940d61f91da8252b13c8d3e8ebee Mon Sep 17 00:00:00 2001
>>>>> From: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
>>>>> Date: Tue, 2 Feb 2021 09:23:38 +0530
>>>>> Subject: [PATCH v2] powerpc/kuap: Allow kernel thread to access
>>>>> userspace
>>>>> after kthread_use_mm
>>>>>
>>>>> This fix the bad fault reported by KUAP when io_wqe_worker access
>>>>> userspace.
>>>>>
>>>>> Bug: Read fault blocked by KUAP!
>>>>> WARNING: CPU: 1 PID: 101841 at arch/powerpc/mm/fault.c:229
>>>>> __do_page_fault+0x6b4/0xcd0
>>>>> NIP [c00000000009e7e4] __do_page_fault+0x6b4/0xcd0
>>>>> LR [c00000000009e7e0] __do_page_fault+0x6b0/0xcd0
>>>>> ..........
>>>>> Call Trace:
>>>>> [c000000016367330] [c00000000009e7e0] __do_page_fault+0x6b0/0xcd0
>>>>> (unreliable)
>>>>> [c0000000163673e0] [c00000000009ee3c] do_page_fault+0x3c/0x120
>>>>> [c000000016367430] [c00000000000c848] handle_page_fault+0x10/0x2c
>>>>> --- interrupt: 300 at iov_iter_fault_in_readable+0x148/0x6f0
>>>>> ..........
>>>>> NIP [c0000000008e8228] iov_iter_fault_in_readable+0x148/0x6f0
>>>>> LR [c0000000008e834c] iov_iter_fault_in_readable+0x26c/0x6f0
>>>>> interrupt: 300
>>>>> [c0000000163677e0] [c0000000007154a0] iomap_write_actor+0xc0/0x280
>>>>> [c000000016367880] [c00000000070fc94] iomap_apply+0x1c4/0x780
>>>>> [c000000016367990] [c000000000710330]
>>>>> iomap_file_buffered_write+0xa0/0x120
>>>>> [c0000000163679e0] [c00800000040791c]
>>>>> xfs_file_buffered_aio_write+0x314/0x5e0 [xfs]
>>>>> [c000000016367a90] [c0000000006d74bc] io_write+0x10c/0x460
>>>>> [c000000016367bb0] [c0000000006d80e4] io_issue_sqe+0x8d4/0x1200
>>>>> [c000000016367c70] [c0000000006d8ad0] io_wq_submit_work+0xc0/0x250
>>>>> [c000000016367cb0] [c0000000006e2578]
>>>>> io_worker_handle_work+0x498/0x800
>>>>> [c000000016367d40] [c0000000006e2cdc] io_wqe_worker+0x3fc/0x4f0
>>>>> [c000000016367da0] [c0000000001cb0a4] kthread+0x1c4/0x1d0
>>>>> [c000000016367e10] [c00000000000dbf0]
>>>>> ret_from_kernel_thread+0x5c/0x6c
>>>>>
>>>>> The kernel consider thread AMR value for kernel thread to be
>>>>> AMR_KUAP_BLOCKED. Hence access to userspace is denied. This
>>>>> of course not correct and we should allow userspace access after
>>>>> kthread_use_mm(). To be precise, kthread_use_mm() should inherit the
>>>>> AMR value of the operating address space. But, the AMR value is
>>>>> thread-specific and we inherit the address space and not thread
>>>>> access restrictions. Because of this ignore AMR value when accessing
>>>>> userspace via kernel thread.
>>>>>
>>>>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
>>>>> ---
>>>>> Changes from v1:
>>>>> * Address review feedback from Nick
>>>>>
>>>>> arch/powerpc/include/asm/book3s/64/kup.h | 8 +++++++-
>>>>> 1 file changed, 7 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/arch/powerpc/include/asm/book3s/64/kup.h
>>>>> b/arch/powerpc/include/asm/book3s/64/kup.h
>>>>> index f50f72e535aa..95f4df99249e 100644
>>>>> --- a/arch/powerpc/include/asm/book3s/64/kup.h
>>>>> +++ b/arch/powerpc/include/asm/book3s/64/kup.h
>>>>> @@ -384,7 +384,13 @@ static __always_inline void
>>>>> allow_user_access(void __user *to, const void __user
>>>>> // This is written so we can resolve to a single case at build
>>>>> time
>>>>> BUILD_BUG_ON(!__builtin_constant_p(dir));
>>>>> - if (mmu_has_feature(MMU_FTR_PKEY))
>>>>> + /*
>>>>> + * if it is a kthread that did kthread_use_mm() don't
>>>>> + * use current_thread_amr().
>>>>
>>>> According to include/linux/sched.h, PF_KTHREAD means /* I am a kernel
>>>> thread */
>>>> It doesn't seem to be related to kthread_use_mm()
>>>
>>> That should be a sufficient check here. if we did reach here without
>>> calling kthread_user_mm, we will crash on access because we don't have
>>> a mm attached to the current process. a kernel thread with
>>> kthread_use_mm has
>>
>> Ok but then the comment doesn't match the check.
>
>
> I was trying to be explict in the comment that we expect the thread to
> have done kthread_use_mm().
>
>>
>> And also the comment in current_thread_amr() is then misleading.
>>
>> Why not do the current->flags & PF_KTHREAD check in current_thread_amr()
>> and return 0 in that case instead of BLOCKED ?
>
> In my view currrent_thread_amr() is more generic and we want to be
> explicit there that a kernel thread AMR is KUAP_BLOCKED. Only when we
> call allow user access, we relax the AMR value.
Just following up on this, as I'd hate to have 5.11 released with this
bug in it for powerpc. It'll obviously also affect other cases of a
kernel thread faulting after having done kthread_use_mm(), though I'm
not sure how widespread that is. In any case, it'll leave io_uring
mostly broken on powerpc if this isn't patched for release.
--
Jens Axboe
^ permalink raw reply
* [PATCH v16 11/12] powerpc: Use OF alloc and free for FDT
From: Lakshmi Ramasubramanian @ 2021-02-04 16:41 UTC (permalink / raw)
To: zohar, bauerman, robh, takahiro.akashi, gregkh, will, joe,
catalin.marinas, mpe
Cc: mark.rutland, bhsharma, tao.li, paulus, vincenzo.frascino,
frowand.list, sashal, masahiroy, jmorris, linux-arm-kernel, serge,
devicetree, pasha.tatashin, prsriva, hsinyi, allison,
christophe.leroy, mbrugger, balajib, dmitry.kasatkin,
linux-kernel, james.morse, linux-integrity, linuxppc-dev
In-Reply-To: <20210204164135.29856-1-nramas@linux.microsoft.com>
of_alloc_and_init_fdt() and of_free_fdt() have been defined in
drivers/of/kexec.c to allocate and free memory for FDT.
Use of_alloc_and_init_fdt() and of_free_fdt() to allocate and
initialize the FDT, and to free the FDT respectively.
powerpc sets the FDT address in image_loader_data field in
"struct kimage" and the memory is freed in
kimage_file_post_load_cleanup(). This cleanup function uses kfree()
to free the memory. But since of_alloc_and_init_fdt() uses kvmalloc()
for allocation, the buffer needs to be freed using kvfree().
Define "fdt" field in "struct kimage_arch" for powerpc to store
the address of FDT, and free the memory in powerpc specific
arch_kimage_file_post_load_cleanup().
Signed-off-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
Suggested-by: Rob Herring <robh@kernel.org>
Suggested-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
---
arch/powerpc/include/asm/kexec.h | 2 ++
arch/powerpc/kexec/elf_64.c | 26 ++++++++++++++++----------
arch/powerpc/kexec/file_load_64.c | 3 +++
3 files changed, 21 insertions(+), 10 deletions(-)
diff --git a/arch/powerpc/include/asm/kexec.h b/arch/powerpc/include/asm/kexec.h
index 2c0be93d239a..d7d13cac4d31 100644
--- a/arch/powerpc/include/asm/kexec.h
+++ b/arch/powerpc/include/asm/kexec.h
@@ -111,6 +111,8 @@ struct kimage_arch {
unsigned long elf_headers_mem;
unsigned long elf_headers_sz;
void *elf_headers;
+
+ void *fdt;
};
char *setup_kdump_cmdline(struct kimage *image, char *cmdline,
diff --git a/arch/powerpc/kexec/elf_64.c b/arch/powerpc/kexec/elf_64.c
index d0e459bb2f05..51d2d8eb6c1b 100644
--- a/arch/powerpc/kexec/elf_64.c
+++ b/arch/powerpc/kexec/elf_64.c
@@ -19,6 +19,7 @@
#include <linux/kexec.h>
#include <linux/libfdt.h>
#include <linux/module.h>
+#include <linux/of.h>
#include <linux/of_fdt.h>
#include <linux/slab.h>
#include <linux/types.h>
@@ -32,7 +33,7 @@ static void *elf64_load(struct kimage *image, char *kernel_buf,
unsigned int fdt_size;
unsigned long kernel_load_addr;
unsigned long initrd_load_addr = 0, fdt_load_addr;
- void *fdt;
+ void *fdt = NULL;
const void *slave_code;
struct elfhdr ehdr;
char *modified_cmdline = NULL;
@@ -103,18 +104,12 @@ static void *elf64_load(struct kimage *image, char *kernel_buf,
}
fdt_size = fdt_totalsize(initial_boot_params) * 2;
- fdt = kmalloc(fdt_size, GFP_KERNEL);
+ fdt = of_alloc_and_init_fdt(fdt_size);
if (!fdt) {
pr_err("Not enough memory for the device tree.\n");
ret = -ENOMEM;
goto out;
}
- ret = fdt_open_into(initial_boot_params, fdt, fdt_size);
- if (ret < 0) {
- pr_err("Error setting up the new device tree.\n");
- ret = -EINVAL;
- goto out;
- }
ret = setup_new_fdt_ppc64(image, fdt, initrd_load_addr,
initrd_len, cmdline);
@@ -131,6 +126,10 @@ static void *elf64_load(struct kimage *image, char *kernel_buf,
ret = kexec_add_buffer(&kbuf);
if (ret)
goto out;
+
+ /* FDT will be freed in arch_kimage_file_post_load_cleanup */
+ image->arch.fdt = fdt;
+
fdt_load_addr = kbuf.mem;
pr_debug("Loaded device tree at 0x%lx\n", fdt_load_addr);
@@ -145,8 +144,15 @@ static void *elf64_load(struct kimage *image, char *kernel_buf,
kfree(modified_cmdline);
kexec_free_elf_info(&elf_info);
- /* Make kimage_file_post_load_cleanup free the fdt buffer for us. */
- return ret ? ERR_PTR(ret) : fdt;
+ /*
+ * Once FDT buffer has been successfully passed to kexec_add_buffer(),
+ * the FDT buffer address is saved in image->arch.fdt. In that case,
+ * the memory cannot be freed here in case of any other error.
+ */
+ if (ret && !image->arch.fdt)
+ of_free_fdt(fdt);
+
+ return ret ? ERR_PTR(ret) : NULL;
}
const struct kexec_file_ops kexec_elf64_ops = {
diff --git a/arch/powerpc/kexec/file_load_64.c b/arch/powerpc/kexec/file_load_64.c
index 3cab318aa3b9..d9d5b5569a6d 100644
--- a/arch/powerpc/kexec/file_load_64.c
+++ b/arch/powerpc/kexec/file_load_64.c
@@ -1113,5 +1113,8 @@ int arch_kimage_file_post_load_cleanup(struct kimage *image)
image->arch.elf_headers = NULL;
image->arch.elf_headers_sz = 0;
+ of_free_fdt(image->arch.fdt);
+ image->arch.fdt = NULL;
+
return kexec_image_post_load_cleanup_default(image);
}
--
2.30.0
^ permalink raw reply related
* [PATCH v16 12/12] arm64: Enable passing IMA log to next kernel on kexec
From: Lakshmi Ramasubramanian @ 2021-02-04 16:41 UTC (permalink / raw)
To: zohar, bauerman, robh, takahiro.akashi, gregkh, will, joe,
catalin.marinas, mpe
Cc: mark.rutland, bhsharma, tao.li, paulus, vincenzo.frascino,
frowand.list, sashal, masahiroy, jmorris, linux-arm-kernel, serge,
devicetree, pasha.tatashin, prsriva, hsinyi, allison,
christophe.leroy, mbrugger, balajib, dmitry.kasatkin,
linux-kernel, james.morse, linux-integrity, linuxppc-dev
In-Reply-To: <20210204164135.29856-1-nramas@linux.microsoft.com>
Update CONFIG_KEXEC_FILE to select CONFIG_HAVE_IMA_KEXEC, if CONFIG_IMA
is enabled, to indicate that the IMA measurement log information is
present in the device tree for ARM64.
Co-developed-by: Prakhar Srivastava <prsriva@linux.microsoft.com>
Signed-off-by: Prakhar Srivastava <prsriva@linux.microsoft.com>
Signed-off-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
---
arch/arm64/Kconfig | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 05e17351e4f3..8a93573cebb6 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -1093,6 +1093,7 @@ config KEXEC
config KEXEC_FILE
bool "kexec file based system call"
select KEXEC_CORE
+ select HAVE_IMA_KEXEC if IMA
help
This is new version of kexec system call. This system call is
file based and takes file descriptors as system call argument
--
2.30.0
^ permalink raw reply related
* [PATCH v16 10/12] arm64: Use OF alloc and free functions for FDT
From: Lakshmi Ramasubramanian @ 2021-02-04 16:41 UTC (permalink / raw)
To: zohar, bauerman, robh, takahiro.akashi, gregkh, will, joe,
catalin.marinas, mpe
Cc: mark.rutland, bhsharma, tao.li, paulus, vincenzo.frascino,
frowand.list, sashal, masahiroy, jmorris, linux-arm-kernel, serge,
devicetree, pasha.tatashin, prsriva, hsinyi, allison,
christophe.leroy, mbrugger, balajib, dmitry.kasatkin,
linux-kernel, james.morse, linux-integrity, linuxppc-dev
In-Reply-To: <20210204164135.29856-1-nramas@linux.microsoft.com>
of_alloc_and_init_fdt() and of_free_fdt() have been defined in
drivers/of/kexec.c to allocate and free memory for FDT.
Use of_alloc_and_init_fdt() and of_free_fdt() to allocate and
initialize the FDT, and to free the FDT respectively.
Signed-off-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
Suggested-by: Rob Herring <robh@kernel.org>
---
arch/arm64/kernel/machine_kexec_file.c | 37 +++++++-------------------
1 file changed, 10 insertions(+), 27 deletions(-)
diff --git a/arch/arm64/kernel/machine_kexec_file.c b/arch/arm64/kernel/machine_kexec_file.c
index 7da22bb7b9d5..7d6cc478f73c 100644
--- a/arch/arm64/kernel/machine_kexec_file.c
+++ b/arch/arm64/kernel/machine_kexec_file.c
@@ -29,7 +29,7 @@ const struct kexec_file_ops * const kexec_file_loaders[] = {
int arch_kimage_file_post_load_cleanup(struct kimage *image)
{
- vfree(image->arch.dtb);
+ of_free_fdt(image->arch.dtb);
image->arch.dtb = NULL;
vfree(image->arch.elf_headers);
@@ -57,36 +57,19 @@ static int create_dtb(struct kimage *image,
cmdline_len = cmdline ? strlen(cmdline) : 0;
buf_size = fdt_totalsize(initial_boot_params)
+ cmdline_len + DTB_EXTRA_SPACE;
-
- for (;;) {
- buf = vmalloc(buf_size);
- if (!buf)
- return -ENOMEM;
-
- /* duplicate a device tree blob */
- ret = fdt_open_into(initial_boot_params, buf, buf_size);
- if (ret)
- return -EINVAL;
-
- ret = of_kexec_setup_new_fdt(image, buf, initrd_load_addr,
+ buf = of_alloc_and_init_fdt(buf_size);
+ if (!buf)
+ return -ENOMEM;
+ ret = of_kexec_setup_new_fdt(image, buf, initrd_load_addr,
initrd_len, cmdline);
- if (ret) {
- vfree(buf);
- if (ret == -ENOMEM) {
- /* unlikely, but just in case */
- buf_size += DTB_EXTRA_SPACE;
- continue;
- } else {
- return ret;
- }
- }
-
+ if (!ret) {
/* trim it */
fdt_pack(buf);
*dtb = buf;
+ } else
+ of_free_fdt(buf);
- return 0;
- }
+ return ret;
}
static int prepare_elf_headers(void **addr, unsigned long *sz)
@@ -224,6 +207,6 @@ int load_other_segments(struct kimage *image,
out_err:
image->nr_segments = orig_segments;
- vfree(dtb);
+ of_free_fdt(dtb);
return ret;
}
--
2.30.0
^ permalink raw reply related
* [PATCH v16 08/12] powerpc: Delete unused function delete_fdt_mem_rsv()
From: Lakshmi Ramasubramanian @ 2021-02-04 16:41 UTC (permalink / raw)
To: zohar, bauerman, robh, takahiro.akashi, gregkh, will, joe,
catalin.marinas, mpe
Cc: mark.rutland, bhsharma, tao.li, paulus, vincenzo.frascino,
frowand.list, sashal, masahiroy, jmorris, linux-arm-kernel, serge,
devicetree, pasha.tatashin, prsriva, hsinyi, allison,
christophe.leroy, mbrugger, balajib, dmitry.kasatkin,
linux-kernel, james.morse, linux-integrity, linuxppc-dev
In-Reply-To: <20210204164135.29856-1-nramas@linux.microsoft.com>
delete_fdt_mem_rsv() defined in "arch/powerpc/kexec/file_load.c"
has been renamed to fdt_find_and_del_mem_rsv(), and moved to
"drivers/of/kexec.c".
Remove delete_fdt_mem_rsv() in "arch/powerpc/kexec/file_load.c".
Co-developed-by: Prakhar Srivastava <prsriva@linux.microsoft.com>
Signed-off-by: Prakhar Srivastava <prsriva@linux.microsoft.com>
Signed-off-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
---
arch/powerpc/include/asm/kexec.h | 1 -
arch/powerpc/kexec/file_load.c | 32 --------------------------------
2 files changed, 33 deletions(-)
diff --git a/arch/powerpc/include/asm/kexec.h b/arch/powerpc/include/asm/kexec.h
index 939bc40dfa62..2c0be93d239a 100644
--- a/arch/powerpc/include/asm/kexec.h
+++ b/arch/powerpc/include/asm/kexec.h
@@ -118,7 +118,6 @@ char *setup_kdump_cmdline(struct kimage *image, char *cmdline,
int setup_purgatory(struct kimage *image, const void *slave_code,
const void *fdt, unsigned long kernel_load_addr,
unsigned long fdt_load_addr);
-int delete_fdt_mem_rsv(void *fdt, unsigned long start, unsigned long size);
#ifdef CONFIG_PPC64
struct kexec_buf;
diff --git a/arch/powerpc/kexec/file_load.c b/arch/powerpc/kexec/file_load.c
index 5dd3a9c45a2d..036c8fb48fc3 100644
--- a/arch/powerpc/kexec/file_load.c
+++ b/arch/powerpc/kexec/file_load.c
@@ -108,35 +108,3 @@ int setup_purgatory(struct kimage *image, const void *slave_code,
return 0;
}
-
-/**
- * delete_fdt_mem_rsv - delete memory reservation with given address and size
- *
- * Return: 0 on success, or negative errno on error.
- */
-int delete_fdt_mem_rsv(void *fdt, unsigned long start, unsigned long size)
-{
- int i, ret, num_rsvs = fdt_num_mem_rsv(fdt);
-
- for (i = 0; i < num_rsvs; i++) {
- uint64_t rsv_start, rsv_size;
-
- ret = fdt_get_mem_rsv(fdt, i, &rsv_start, &rsv_size);
- if (ret) {
- pr_err("Malformed device tree.\n");
- return -EINVAL;
- }
-
- if (rsv_start == start && rsv_size == size) {
- ret = fdt_del_mem_rsv(fdt, i);
- if (ret) {
- pr_err("Error deleting device tree reservation.\n");
- return -EINVAL;
- }
-
- return 0;
- }
- }
-
- return -ENOENT;
-}
--
2.30.0
^ permalink raw reply related
* [PATCH v16 09/12] of: Define functions to allocate and free FDT
From: Lakshmi Ramasubramanian @ 2021-02-04 16:41 UTC (permalink / raw)
To: zohar, bauerman, robh, takahiro.akashi, gregkh, will, joe,
catalin.marinas, mpe
Cc: mark.rutland, bhsharma, tao.li, paulus, vincenzo.frascino,
frowand.list, sashal, masahiroy, jmorris, linux-arm-kernel, serge,
devicetree, pasha.tatashin, prsriva, hsinyi, allison,
christophe.leroy, mbrugger, balajib, dmitry.kasatkin,
linux-kernel, james.morse, linux-integrity, linuxppc-dev
In-Reply-To: <20210204164135.29856-1-nramas@linux.microsoft.com>
Kernel components that use Flattened Device Tree (FDT) allocate kernel
memory and call fdt_open_into() to reorganize the tree into a form
suitable for the read-write operations. These operations can be
combined into a single function to allocate and initialize the FDT
so the different architecures do not have to duplicate the code.
Define of_alloc_and_init_fdt() and of_free_fdt() in drivers/of/kexec.c
to allocate and initialize FDT, and to free the FDT buffer respectively.
Signed-off-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
Suggested-by: Rob Herring <robh@kernel.org>
Suggested-by: Joe Perches <joe@perches.com>
---
drivers/of/kexec.c | 37 +++++++++++++++++++++++++++++++++++++
include/linux/of.h | 2 ++
2 files changed, 39 insertions(+)
diff --git a/drivers/of/kexec.c b/drivers/of/kexec.c
index 5ae0e5d90f55..197e71104f47 100644
--- a/drivers/of/kexec.c
+++ b/drivers/of/kexec.c
@@ -11,6 +11,7 @@
#include <linux/kernel.h>
#include <linux/kexec.h>
+#include <linux/mm.h>
#include <linux/memblock.h>
#include <linux/libfdt.h>
#include <linux/of.h>
@@ -28,6 +29,42 @@
#define FDT_PROP_RNG_SEED "rng-seed"
#define RNG_SEED_SIZE 128
+/**
+ * of_alloc_and_init_fdt - Allocate and initialize a Flattened device tree
+ *
+ * @fdt_size: Flattened device tree size
+ *
+ * Return the allocated FDT buffer on success, or NULL on error.
+ */
+void *of_alloc_and_init_fdt(unsigned int fdt_size)
+{
+ void *fdt;
+ int ret;
+
+ fdt = kvmalloc(fdt_size, GFP_KERNEL);
+ if (!fdt)
+ return NULL;
+
+ ret = fdt_open_into(initial_boot_params, fdt, fdt_size);
+ if (ret < 0) {
+ pr_err("Error setting up the new device tree.\n");
+ kvfree(fdt);
+ fdt = NULL;
+ }
+
+ return fdt;
+}
+
+/**
+ * of_free_fdt - Free the buffer for Flattened device tree
+ *
+ * @fdt: Flattened device tree buffer to free
+ */
+void of_free_fdt(void *fdt)
+{
+ kvfree(fdt);
+}
+
/**
* fdt_find_and_del_mem_rsv - delete memory reservation with given address and size
*
diff --git a/include/linux/of.h b/include/linux/of.h
index 19f77dd12507..9f0261761e28 100644
--- a/include/linux/of.h
+++ b/include/linux/of.h
@@ -563,6 +563,8 @@ struct kimage;
int of_kexec_setup_new_fdt(const struct kimage *image, void *fdt,
unsigned long initrd_load_addr, unsigned long initrd_len,
const char *cmdline);
+void *of_alloc_and_init_fdt(unsigned int fdt_size);
+void of_free_fdt(void *fdt);
#ifdef CONFIG_IMA_KEXEC
int of_ima_add_kexec_buffer(struct kimage *image,
--
2.30.0
^ permalink raw reply related
* [PATCH v16 07/12] kexec: Use fdt_appendprop_addrrange() to add ima buffer to FDT
From: Lakshmi Ramasubramanian @ 2021-02-04 16:41 UTC (permalink / raw)
To: zohar, bauerman, robh, takahiro.akashi, gregkh, will, joe,
catalin.marinas, mpe
Cc: mark.rutland, bhsharma, tao.li, paulus, vincenzo.frascino,
frowand.list, sashal, masahiroy, jmorris, linux-arm-kernel, serge,
devicetree, pasha.tatashin, prsriva, hsinyi, allison,
christophe.leroy, mbrugger, balajib, dmitry.kasatkin,
linux-kernel, james.morse, linux-integrity, linuxppc-dev
In-Reply-To: <20210204164135.29856-1-nramas@linux.microsoft.com>
fdt_appendprop_addrrange() function adds a property, with the given name,
to the device tree at the given node offset, and also sets the address
and size of the property. This function should be used to add
"linux,ima-kexec-buffer" property to the device tree and set the address
and size of the IMA measurement buffer, instead of using custom function.
Use fdt_appendprop_addrrange() to add "linux,ima-kexec-buffer" property
to the device tree. This property holds the address and size of
the IMA measurement buffer that needs to be passed from the current
kernel to the next kernel across kexec system call.
Remove custom code that is used in setup_ima_buffer() to add
"linux,ima-kexec-buffer" property to the device tree.
Co-developed-by: Prakhar Srivastava <prsriva@linux.microsoft.com>
Signed-off-by: Prakhar Srivastava <prsriva@linux.microsoft.com>
Signed-off-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
---
drivers/of/kexec.c | 57 ++++------------------------------------------
1 file changed, 5 insertions(+), 52 deletions(-)
diff --git a/drivers/of/kexec.c b/drivers/of/kexec.c
index 7ee4f498ca19..5ae0e5d90f55 100644
--- a/drivers/of/kexec.c
+++ b/drivers/of/kexec.c
@@ -232,36 +232,6 @@ int of_ima_add_kexec_buffer(struct kimage *image,
return 0;
}
-/**
- * write_number - Convert number to big-endian format
- *
- * @p: Buffer to write the number to
- * @value: Number to convert
- * @cells: Number of cells
- *
- * Return: 0 on success, or negative errno on error.
- */
-static int write_number(void *p, u64 value, int cells)
-{
- if (cells == 1) {
- u32 tmp;
-
- if (value > U32_MAX)
- return -EINVAL;
-
- tmp = cpu_to_be32(value);
- memcpy(p, &tmp, sizeof(tmp));
- } else if (cells == 2) {
- u64 tmp;
-
- tmp = cpu_to_be64(value);
- memcpy(p, &tmp, sizeof(tmp));
- } else
- return -EINVAL;
-
- return 0;
-}
-
/**
* setup_ima_buffer - add IMA buffer information to the fdt
* @image: kexec image being loaded.
@@ -273,32 +243,15 @@ static int write_number(void *p, u64 value, int cells)
static int setup_ima_buffer(const struct kimage *image, void *fdt,
int chosen_node)
{
- int ret, addr_cells, size_cells, entry_size;
- u8 value[16];
+ int ret;
if (!image->ima_buffer_size)
return 0;
- ret = get_addr_size_cells(&addr_cells, &size_cells);
- if (ret)
- return ret;
-
- entry_size = 4 * (addr_cells + size_cells);
-
- if (entry_size > sizeof(value))
- return -EINVAL;
-
- ret = write_number(value, image->ima_buffer_addr, addr_cells);
- if (ret)
- return ret;
-
- ret = write_number(value + 4 * addr_cells, image->ima_buffer_size,
- size_cells);
- if (ret)
- return ret;
-
- ret = fdt_setprop(fdt, chosen_node, "linux,ima-kexec-buffer", value,
- entry_size);
+ ret = fdt_appendprop_addrrange(fdt, 0, chosen_node,
+ "linux,ima-kexec-buffer",
+ image->ima_buffer_addr,
+ image->ima_buffer_size);
if (ret < 0)
return -EINVAL;
--
2.30.0
^ permalink raw reply related
* [PATCH v16 06/12] powerpc: Move arch independent ima kexec functions to drivers/of/kexec.c
From: Lakshmi Ramasubramanian @ 2021-02-04 16:41 UTC (permalink / raw)
To: zohar, bauerman, robh, takahiro.akashi, gregkh, will, joe,
catalin.marinas, mpe
Cc: mark.rutland, bhsharma, tao.li, paulus, vincenzo.frascino,
frowand.list, sashal, masahiroy, jmorris, linux-arm-kernel, serge,
devicetree, pasha.tatashin, prsriva, hsinyi, allison,
christophe.leroy, mbrugger, balajib, dmitry.kasatkin,
linux-kernel, james.morse, linux-integrity, linuxppc-dev
In-Reply-To: <20210204164135.29856-1-nramas@linux.microsoft.com>
The functions defined in "arch/powerpc/kexec/ima.c" handle setting up
and freeing the resources required to carry over the IMA measurement
list from the current kernel to the next kernel across kexec system call.
These functions do not have architecture specific code, but are
currently limited to powerpc.
Move setup_ima_buffer() call into of_kexec_setup_new_fdt() defined in
"drivers/of/kexec.c". Call of_kexec_setup_new_fdt() from
setup_new_fdt_ppc64() and remove setup_new_fdt() in
"arch/powerpc/kexec/file_load.c".
Move the remaining architecture independent functions from
"arch/powerpc/kexec/ima.c" to "drivers/of/kexec.c".
Delete "arch/powerpc/kexec/ima.c" and "arch/powerpc/include/asm/ima.h".
Remove references to the deleted files in powerpc and in ima.
Co-developed-by: Prakhar Srivastava <prsriva@linux.microsoft.com>
Signed-off-by: Prakhar Srivastava <prsriva@linux.microsoft.com>
Signed-off-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
---
arch/powerpc/Kconfig | 2 +-
arch/powerpc/include/asm/ima.h | 27 ----
arch/powerpc/include/asm/kexec.h | 3 -
arch/powerpc/kexec/Makefile | 7 -
arch/powerpc/kexec/file_load.c | 35 -----
arch/powerpc/kexec/file_load_64.c | 4 +-
arch/powerpc/kexec/ima.c | 202 -------------------------
drivers/of/kexec.c | 239 ++++++++++++++++++++++++++++++
include/linux/of.h | 2 +
security/integrity/ima/ima.h | 4 -
10 files changed, 245 insertions(+), 280 deletions(-)
delete mode 100644 arch/powerpc/include/asm/ima.h
delete mode 100644 arch/powerpc/kexec/ima.c
diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 107bb4319e0e..d6e593ad270e 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -554,7 +554,7 @@ config KEXEC
config KEXEC_FILE
bool "kexec file based system call"
select KEXEC_CORE
- select HAVE_IMA_KEXEC
+ select HAVE_IMA_KEXEC if IMA
select BUILD_BIN2C
select KEXEC_ELF
depends on PPC64
diff --git a/arch/powerpc/include/asm/ima.h b/arch/powerpc/include/asm/ima.h
deleted file mode 100644
index 51f64fd06c19..000000000000
--- a/arch/powerpc/include/asm/ima.h
+++ /dev/null
@@ -1,27 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#ifndef _ASM_POWERPC_IMA_H
-#define _ASM_POWERPC_IMA_H
-
-struct kimage;
-
-int ima_get_kexec_buffer(void **addr, size_t *size);
-int ima_free_kexec_buffer(void);
-
-#ifdef CONFIG_IMA
-void remove_ima_buffer(void *fdt, int chosen_node);
-#else
-static inline void remove_ima_buffer(void *fdt, int chosen_node) {}
-#endif
-
-#ifdef CONFIG_IMA_KEXEC
-int setup_ima_buffer(const struct kimage *image, void *fdt, int chosen_node);
-#else
-static inline int setup_ima_buffer(const struct kimage *image, void *fdt,
- int chosen_node)
-{
- remove_ima_buffer(fdt, chosen_node);
- return 0;
-}
-#endif /* CONFIG_IMA_KEXEC */
-
-#endif /* _ASM_POWERPC_IMA_H */
diff --git a/arch/powerpc/include/asm/kexec.h b/arch/powerpc/include/asm/kexec.h
index 2248dc27080b..939bc40dfa62 100644
--- a/arch/powerpc/include/asm/kexec.h
+++ b/arch/powerpc/include/asm/kexec.h
@@ -118,9 +118,6 @@ char *setup_kdump_cmdline(struct kimage *image, char *cmdline,
int setup_purgatory(struct kimage *image, const void *slave_code,
const void *fdt, unsigned long kernel_load_addr,
unsigned long fdt_load_addr);
-int setup_new_fdt(const struct kimage *image, void *fdt,
- unsigned long initrd_load_addr, unsigned long initrd_len,
- const char *cmdline);
int delete_fdt_mem_rsv(void *fdt, unsigned long start, unsigned long size);
#ifdef CONFIG_PPC64
diff --git a/arch/powerpc/kexec/Makefile b/arch/powerpc/kexec/Makefile
index 4aff6846c772..b6c52608cb49 100644
--- a/arch/powerpc/kexec/Makefile
+++ b/arch/powerpc/kexec/Makefile
@@ -9,13 +9,6 @@ obj-$(CONFIG_PPC32) += relocate_32.o
obj-$(CONFIG_KEXEC_FILE) += file_load.o ranges.o file_load_$(BITS).o elf_$(BITS).o
-ifdef CONFIG_HAVE_IMA_KEXEC
-ifdef CONFIG_IMA
-obj-y += ima.o
-endif
-endif
-
-
# Disable GCOV, KCOV & sanitizers in odd or sensitive code
GCOV_PROFILE_core_$(BITS).o := n
KCOV_INSTRUMENT_core_$(BITS).o := n
diff --git a/arch/powerpc/kexec/file_load.c b/arch/powerpc/kexec/file_load.c
index 956bcb2d1ec2..5dd3a9c45a2d 100644
--- a/arch/powerpc/kexec/file_load.c
+++ b/arch/powerpc/kexec/file_load.c
@@ -20,7 +20,6 @@
#include <linux/of_fdt.h>
#include <linux/libfdt.h>
#include <asm/setup.h>
-#include <asm/ima.h>
#define SLAVE_CODE_SIZE 256 /* First 0x100 bytes */
@@ -141,37 +140,3 @@ int delete_fdt_mem_rsv(void *fdt, unsigned long start, unsigned long size)
return -ENOENT;
}
-
-/*
- * setup_new_fdt - modify /chosen and memory reservation for the next kernel
- * @image: kexec image being loaded.
- * @fdt: Flattened device tree for the next kernel.
- * @initrd_load_addr: Address where the next initrd will be loaded.
- * @initrd_len: Size of the next initrd, or 0 if there will be none.
- * @cmdline: Command line for the next kernel, or NULL if there will
- * be none.
- *
- * Return: 0 on success, or negative errno on error.
- */
-int setup_new_fdt(const struct kimage *image, void *fdt,
- unsigned long initrd_load_addr, unsigned long initrd_len,
- const char *cmdline)
-{
- int ret;
-
- ret = of_kexec_setup_new_fdt(image, fdt, initrd_load_addr, initrd_len, cmdline);
- if (ret)
- goto err;
-
- ret = setup_ima_buffer(image, fdt, fdt_path_offset(fdt, "/chosen"));
- if (ret) {
- pr_err("Error setting up the new device tree.\n");
- return ret;
- }
-
- return 0;
-
-err:
- pr_err("Error setting up the new device tree.\n");
- return -EINVAL;
-}
diff --git a/arch/powerpc/kexec/file_load_64.c b/arch/powerpc/kexec/file_load_64.c
index a05c19b3cc60..3cab318aa3b9 100644
--- a/arch/powerpc/kexec/file_load_64.c
+++ b/arch/powerpc/kexec/file_load_64.c
@@ -17,6 +17,7 @@
#include <linux/kexec.h>
#include <linux/of_fdt.h>
#include <linux/libfdt.h>
+#include <linux/of.h>
#include <linux/of_device.h>
#include <linux/memblock.h>
#include <linux/slab.h>
@@ -944,7 +945,8 @@ int setup_new_fdt_ppc64(const struct kimage *image, void *fdt,
struct crash_mem *umem = NULL, *rmem = NULL;
int i, nr_ranges, ret;
- ret = setup_new_fdt(image, fdt, initrd_load_addr, initrd_len, cmdline);
+ ret = of_kexec_setup_new_fdt(image, fdt, initrd_load_addr, initrd_len,
+ cmdline);
if (ret)
goto out;
diff --git a/arch/powerpc/kexec/ima.c b/arch/powerpc/kexec/ima.c
deleted file mode 100644
index ed38125e2f87..000000000000
--- a/arch/powerpc/kexec/ima.c
+++ /dev/null
@@ -1,202 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0-or-later
-/*
- * Copyright (C) 2016 IBM Corporation
- *
- * Authors:
- * Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
- */
-
-#include <linux/slab.h>
-#include <linux/kexec.h>
-#include <linux/of.h>
-#include <linux/memblock.h>
-#include <linux/libfdt.h>
-
-static int get_addr_size_cells(int *addr_cells, int *size_cells)
-{
- struct device_node *root;
-
- root = of_find_node_by_path("/");
- if (!root)
- return -EINVAL;
-
- *addr_cells = of_n_addr_cells(root);
- *size_cells = of_n_size_cells(root);
-
- of_node_put(root);
-
- return 0;
-}
-
-static int do_get_kexec_buffer(const void *prop, int len, unsigned long *addr,
- size_t *size)
-{
- int ret, addr_cells, size_cells;
-
- ret = get_addr_size_cells(&addr_cells, &size_cells);
- if (ret)
- return ret;
-
- if (len < 4 * (addr_cells + size_cells))
- return -ENOENT;
-
- *addr = of_read_number(prop, addr_cells);
- *size = of_read_number(prop + 4 * addr_cells, size_cells);
-
- return 0;
-}
-
-/**
- * ima_get_kexec_buffer - get IMA buffer from the previous kernel
- * @addr: On successful return, set to point to the buffer contents.
- * @size: On successful return, set to the buffer size.
- *
- * Return: 0 on success, negative errno on error.
- */
-int ima_get_kexec_buffer(void **addr, size_t *size)
-{
- int ret, len;
- unsigned long tmp_addr;
- size_t tmp_size;
- const void *prop;
-
- prop = of_get_property(of_chosen, "linux,ima-kexec-buffer", &len);
- if (!prop)
- return -ENOENT;
-
- ret = do_get_kexec_buffer(prop, len, &tmp_addr, &tmp_size);
- if (ret)
- return ret;
-
- *addr = __va(tmp_addr);
- *size = tmp_size;
-
- return 0;
-}
-
-/**
- * ima_free_kexec_buffer - free memory used by the IMA buffer
- */
-int ima_free_kexec_buffer(void)
-{
- int ret;
- unsigned long addr;
- size_t size;
- struct property *prop;
-
- prop = of_find_property(of_chosen, "linux,ima-kexec-buffer", NULL);
- if (!prop)
- return -ENOENT;
-
- ret = do_get_kexec_buffer(prop->value, prop->length, &addr, &size);
- if (ret)
- return ret;
-
- ret = of_remove_property(of_chosen, prop);
- if (ret)
- return ret;
-
- return memblock_free(addr, size);
-
-}
-
-/**
- * remove_ima_buffer - remove the IMA buffer property and reservation from @fdt
- *
- * The IMA measurement buffer is of no use to a subsequent kernel, so we always
- * remove it from the device tree.
- */
-void remove_ima_buffer(void *fdt, int chosen_node)
-{
- int ret, len;
- unsigned long addr;
- size_t size;
- const void *prop;
-
- prop = fdt_getprop(fdt, chosen_node, "linux,ima-kexec-buffer", &len);
- if (!prop)
- return;
-
- ret = do_get_kexec_buffer(prop, len, &addr, &size);
- fdt_delprop(fdt, chosen_node, "linux,ima-kexec-buffer");
- if (ret)
- return;
-
- ret = delete_fdt_mem_rsv(fdt, addr, size);
- if (!ret)
- pr_debug("Removed old IMA buffer reservation.\n");
-}
-
-#ifdef CONFIG_IMA_KEXEC
-static int write_number(void *p, u64 value, int cells)
-{
- if (cells == 1) {
- u32 tmp;
-
- if (value > U32_MAX)
- return -EINVAL;
-
- tmp = cpu_to_be32(value);
- memcpy(p, &tmp, sizeof(tmp));
- } else if (cells == 2) {
- u64 tmp;
-
- tmp = cpu_to_be64(value);
- memcpy(p, &tmp, sizeof(tmp));
- } else
- return -EINVAL;
-
- return 0;
-}
-
-/**
- * setup_ima_buffer - add IMA buffer information to the fdt
- * @image: kexec image being loaded.
- * @fdt: Flattened device tree for the next kernel.
- * @chosen_node: Offset to the chosen node.
- *
- * Return: 0 on success, or negative errno on error.
- */
-int setup_ima_buffer(const struct kimage *image, void *fdt, int chosen_node)
-{
- int ret, addr_cells, size_cells, entry_size;
- u8 value[16];
-
- remove_ima_buffer(fdt, chosen_node);
- if (!image->ima_buffer_size)
- return 0;
-
- ret = get_addr_size_cells(&addr_cells, &size_cells);
- if (ret)
- return ret;
-
- entry_size = 4 * (addr_cells + size_cells);
-
- if (entry_size > sizeof(value))
- return -EINVAL;
-
- ret = write_number(value, image->ima_buffer_addr, addr_cells);
- if (ret)
- return ret;
-
- ret = write_number(value + 4 * addr_cells, image->ima_buffer_size,
- size_cells);
- if (ret)
- return ret;
-
- ret = fdt_setprop(fdt, chosen_node, "linux,ima-kexec-buffer", value,
- entry_size);
- if (ret < 0)
- return -EINVAL;
-
- ret = fdt_add_mem_rsv(fdt, image->ima_buffer_addr,
- image->ima_buffer_size);
- if (ret)
- return -EINVAL;
-
- pr_debug("IMA buffer at 0x%llx, size = 0x%zx\n",
- image->ima_buffer_addr, image->ima_buffer_size);
-
- return 0;
-}
-#endif /* CONFIG_IMA_KEXEC */
diff --git a/drivers/of/kexec.c b/drivers/of/kexec.c
index efbf54f164fd..7ee4f498ca19 100644
--- a/drivers/of/kexec.c
+++ b/drivers/of/kexec.c
@@ -11,6 +11,7 @@
#include <linux/kernel.h>
#include <linux/kexec.h>
+#include <linux/memblock.h>
#include <linux/libfdt.h>
#include <linux/of.h>
#include <linux/of_fdt.h>
@@ -63,6 +64,152 @@ static int fdt_find_and_del_mem_rsv(void *fdt, unsigned long start, unsigned lon
return -ENOENT;
}
+
+/**
+ * get_addr_size_cells - Get address and size of root node
+ *
+ * @addr_cells: Return address of the root node
+ * @size_cells: Return size of the root node
+ *
+ * Return: 0 on success, or negative errno on error.
+ */
+static int get_addr_size_cells(int *addr_cells, int *size_cells)
+{
+ struct device_node *root;
+
+ root = of_find_node_by_path("/");
+ if (!root)
+ return -EINVAL;
+
+ *addr_cells = of_n_addr_cells(root);
+ *size_cells = of_n_size_cells(root);
+
+ of_node_put(root);
+
+ return 0;
+}
+
+/**
+ * do_get_kexec_buffer - Get address and size of device tree property
+ *
+ * @prop: Device tree property
+ * @len: Size of @prop
+ * @addr: Return address of the node
+ * @size: Return size of the node
+ *
+ * Return: 0 on success, or negative errno on error.
+ */
+static int do_get_kexec_buffer(const void *prop, int len, unsigned long *addr,
+ size_t *size)
+{
+ int ret, addr_cells, size_cells;
+
+ ret = get_addr_size_cells(&addr_cells, &size_cells);
+ if (ret)
+ return ret;
+
+ if (len < 4 * (addr_cells + size_cells))
+ return -ENOENT;
+
+ *addr = of_read_number(prop, addr_cells);
+ *size = of_read_number(prop + 4 * addr_cells, size_cells);
+
+ return 0;
+}
+
+/**
+ * remove_ima_buffer - remove the IMA buffer property and reservation from @fdt
+ *
+ * @fdt: Flattened Device Tree to update
+ * @chosen_node: Offset to the chosen node in the device tree
+ *
+ * The IMA measurement buffer is of no use to a subsequent kernel, so we always
+ * remove it from the device tree.
+ */
+static void remove_ima_buffer(void *fdt, int chosen_node)
+{
+ int ret, len;
+ unsigned long addr;
+ size_t size;
+ const void *prop;
+
+ if (!IS_ENABLED(CONFIG_HAVE_IMA_KEXEC))
+ return;
+
+ prop = fdt_getprop(fdt, chosen_node, "linux,ima-kexec-buffer", &len);
+ if (!prop)
+ return;
+
+ ret = do_get_kexec_buffer(prop, len, &addr, &size);
+ fdt_delprop(fdt, chosen_node, "linux,ima-kexec-buffer");
+ if (ret)
+ return;
+
+ ret = fdt_find_and_del_mem_rsv(fdt, addr, size);
+ if (!ret)
+ pr_debug("Removed old IMA buffer reservation.\n");
+}
+
+/**
+ * ima_get_kexec_buffer - get IMA buffer from the previous kernel
+ * @addr: On successful return, set to point to the buffer contents.
+ * @size: On successful return, set to the buffer size.
+ *
+ * Return: 0 on success, negative errno on error.
+ */
+int ima_get_kexec_buffer(void **addr, size_t *size)
+{
+ int ret, len;
+ unsigned long tmp_addr;
+ size_t tmp_size;
+ const void *prop;
+
+ if (!IS_ENABLED(CONFIG_HAVE_IMA_KEXEC))
+ return -ENOTSUPP;
+
+ prop = of_get_property(of_chosen, "linux,ima-kexec-buffer", &len);
+ if (!prop)
+ return -ENOENT;
+
+ ret = do_get_kexec_buffer(prop, len, &tmp_addr, &tmp_size);
+ if (ret)
+ return ret;
+
+ *addr = __va(tmp_addr);
+ *size = tmp_size;
+
+ return 0;
+}
+
+/**
+ * ima_free_kexec_buffer - free memory used by the IMA buffer
+ */
+int ima_free_kexec_buffer(void)
+{
+ int ret;
+ unsigned long addr;
+ size_t size;
+ struct property *prop;
+
+ if (!IS_ENABLED(CONFIG_HAVE_IMA_KEXEC))
+ return -ENOTSUPP;
+
+ prop = of_find_property(of_chosen, "linux,ima-kexec-buffer", NULL);
+ if (!prop)
+ return -ENOENT;
+
+ ret = do_get_kexec_buffer(prop->value, prop->length, &addr, &size);
+ if (ret)
+ return ret;
+
+ ret = of_remove_property(of_chosen, prop);
+ if (ret)
+ return ret;
+
+ return memblock_free(addr, size);
+
+}
+
#ifdef CONFIG_IMA_KEXEC
/**
* of_ima_add_kexec_buffer - Add IMA buffer for next kernel
@@ -84,6 +231,93 @@ int of_ima_add_kexec_buffer(struct kimage *image,
return 0;
}
+
+/**
+ * write_number - Convert number to big-endian format
+ *
+ * @p: Buffer to write the number to
+ * @value: Number to convert
+ * @cells: Number of cells
+ *
+ * Return: 0 on success, or negative errno on error.
+ */
+static int write_number(void *p, u64 value, int cells)
+{
+ if (cells == 1) {
+ u32 tmp;
+
+ if (value > U32_MAX)
+ return -EINVAL;
+
+ tmp = cpu_to_be32(value);
+ memcpy(p, &tmp, sizeof(tmp));
+ } else if (cells == 2) {
+ u64 tmp;
+
+ tmp = cpu_to_be64(value);
+ memcpy(p, &tmp, sizeof(tmp));
+ } else
+ return -EINVAL;
+
+ return 0;
+}
+
+/**
+ * setup_ima_buffer - add IMA buffer information to the fdt
+ * @image: kexec image being loaded.
+ * @fdt: Flattened device tree for the next kernel.
+ * @chosen_node: Offset to the chosen node.
+ *
+ * Return: 0 on success, or negative errno on error.
+ */
+static int setup_ima_buffer(const struct kimage *image, void *fdt,
+ int chosen_node)
+{
+ int ret, addr_cells, size_cells, entry_size;
+ u8 value[16];
+
+ if (!image->ima_buffer_size)
+ return 0;
+
+ ret = get_addr_size_cells(&addr_cells, &size_cells);
+ if (ret)
+ return ret;
+
+ entry_size = 4 * (addr_cells + size_cells);
+
+ if (entry_size > sizeof(value))
+ return -EINVAL;
+
+ ret = write_number(value, image->ima_buffer_addr, addr_cells);
+ if (ret)
+ return ret;
+
+ ret = write_number(value + 4 * addr_cells, image->ima_buffer_size,
+ size_cells);
+ if (ret)
+ return ret;
+
+ ret = fdt_setprop(fdt, chosen_node, "linux,ima-kexec-buffer", value,
+ entry_size);
+ if (ret < 0)
+ return -EINVAL;
+
+ ret = fdt_add_mem_rsv(fdt, image->ima_buffer_addr,
+ image->ima_buffer_size);
+ if (ret)
+ return -EINVAL;
+
+ pr_debug("IMA buffer at 0x%llx, size = 0x%zx\n",
+ image->ima_buffer_addr, image->ima_buffer_size);
+
+ return 0;
+}
+#else /* CONFIG_IMA_KEXEC */
+static inline int setup_ima_buffer(const struct kimage *image, void *fdt,
+ int chosen_node)
+{
+ return 0;
+}
#endif /* CONFIG_IMA_KEXEC */
/*
@@ -250,6 +484,11 @@ int of_kexec_setup_new_fdt(const struct kimage *image, void *fdt,
}
ret = fdt_setprop(fdt, chosen_node, "linux,booted-from-kexec", NULL, 0);
+ if (ret)
+ goto out;
+
+ remove_ima_buffer(fdt, chosen_node);
+ ret = setup_ima_buffer(image, fdt, fdt_path_offset(fdt, "/chosen"));
out:
if (ret)
diff --git a/include/linux/of.h b/include/linux/of.h
index 551117c32773..19f77dd12507 100644
--- a/include/linux/of.h
+++ b/include/linux/of.h
@@ -569,6 +569,8 @@ int of_ima_add_kexec_buffer(struct kimage *image,
unsigned long load_addr, size_t size);
#endif /* CONFIG_IMA_KEXEC */
+int ima_get_kexec_buffer(void **addr, size_t *size);
+int ima_free_kexec_buffer(void);
#else /* CONFIG_OF */
static inline void of_core_init(void)
diff --git a/security/integrity/ima/ima.h b/security/integrity/ima/ima.h
index aa312472c7c5..fdae37fa7051 100644
--- a/security/integrity/ima/ima.h
+++ b/security/integrity/ima/ima.h
@@ -24,10 +24,6 @@
#include "../integrity.h"
-#ifdef CONFIG_HAVE_IMA_KEXEC
-#include <asm/ima.h>
-#endif
-
enum ima_show_type { IMA_SHOW_BINARY, IMA_SHOW_BINARY_NO_FIELD_LEN,
IMA_SHOW_BINARY_OLD_STRING_FMT, IMA_SHOW_ASCII };
enum tpm_pcrs { TPM_PCR0 = 0, TPM_PCR8 = 8, TPM_PCR10 = 10 };
--
2.30.0
^ permalink raw reply related
* [PATCH v16 05/12] powerpc: Move ima buffer fields to struct kimage
From: Lakshmi Ramasubramanian @ 2021-02-04 16:41 UTC (permalink / raw)
To: zohar, bauerman, robh, takahiro.akashi, gregkh, will, joe,
catalin.marinas, mpe
Cc: mark.rutland, bhsharma, tao.li, paulus, vincenzo.frascino,
frowand.list, sashal, masahiroy, jmorris, linux-arm-kernel, serge,
devicetree, pasha.tatashin, prsriva, hsinyi, allison,
christophe.leroy, mbrugger, balajib, dmitry.kasatkin,
linux-kernel, james.morse, linux-integrity, linuxppc-dev
In-Reply-To: <20210204164135.29856-1-nramas@linux.microsoft.com>
The fields ima_buffer_addr and ima_buffer_size in "struct kimage_arch"
for powerpc are used to carry forward the IMA measurement list across
kexec system call. These fields are not architecture specific, but are
currently limited to powerpc.
arch_ima_add_kexec_buffer() defined in "arch/powerpc/kexec/ima.c"
sets ima_buffer_addr and ima_buffer_size for the kexec system call.
This function does not have architecture specific code, but is
currently limited to powerpc.
Move ima_buffer_addr and ima_buffer_size to "struct kimage".
Rename arch_ima_add_kexec_buffer() to of_ima_add_kexec_buffer()
and move it in drivers/of/kexec.c.
Co-developed-by: Prakhar Srivastava <prsriva@linux.microsoft.com>
Signed-off-by: Prakhar Srivastava <prsriva@linux.microsoft.com>
Signed-off-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
Suggested-by: Will Deacon <will@kernel.org>
---
arch/powerpc/include/asm/ima.h | 3 ---
arch/powerpc/include/asm/kexec.h | 5 -----
arch/powerpc/kexec/ima.c | 29 ++++++-----------------------
drivers/of/kexec.c | 23 +++++++++++++++++++++++
include/linux/kexec.h | 5 +++++
include/linux/of.h | 5 +++++
security/integrity/ima/ima_kexec.c | 3 ++-
7 files changed, 41 insertions(+), 32 deletions(-)
diff --git a/arch/powerpc/include/asm/ima.h b/arch/powerpc/include/asm/ima.h
index ead488cf3981..51f64fd06c19 100644
--- a/arch/powerpc/include/asm/ima.h
+++ b/arch/powerpc/include/asm/ima.h
@@ -14,9 +14,6 @@ static inline void remove_ima_buffer(void *fdt, int chosen_node) {}
#endif
#ifdef CONFIG_IMA_KEXEC
-int arch_ima_add_kexec_buffer(struct kimage *image, unsigned long load_addr,
- size_t size);
-
int setup_ima_buffer(const struct kimage *image, void *fdt, int chosen_node);
#else
static inline int setup_ima_buffer(const struct kimage *image, void *fdt,
diff --git a/arch/powerpc/include/asm/kexec.h b/arch/powerpc/include/asm/kexec.h
index dbf09d2f36d0..2248dc27080b 100644
--- a/arch/powerpc/include/asm/kexec.h
+++ b/arch/powerpc/include/asm/kexec.h
@@ -111,11 +111,6 @@ struct kimage_arch {
unsigned long elf_headers_mem;
unsigned long elf_headers_sz;
void *elf_headers;
-
-#ifdef CONFIG_IMA_KEXEC
- phys_addr_t ima_buffer_addr;
- size_t ima_buffer_size;
-#endif
};
char *setup_kdump_cmdline(struct kimage *image, char *cmdline,
diff --git a/arch/powerpc/kexec/ima.c b/arch/powerpc/kexec/ima.c
index 720e50e490b6..ed38125e2f87 100644
--- a/arch/powerpc/kexec/ima.c
+++ b/arch/powerpc/kexec/ima.c
@@ -128,23 +128,6 @@ void remove_ima_buffer(void *fdt, int chosen_node)
}
#ifdef CONFIG_IMA_KEXEC
-/**
- * arch_ima_add_kexec_buffer - do arch-specific steps to add the IMA buffer
- *
- * Architectures should use this function to pass on the IMA buffer
- * information to the next kernel.
- *
- * Return: 0 on success, negative errno on error.
- */
-int arch_ima_add_kexec_buffer(struct kimage *image, unsigned long load_addr,
- size_t size)
-{
- image->arch.ima_buffer_addr = load_addr;
- image->arch.ima_buffer_size = size;
-
- return 0;
-}
-
static int write_number(void *p, u64 value, int cells)
{
if (cells == 1) {
@@ -180,7 +163,7 @@ int setup_ima_buffer(const struct kimage *image, void *fdt, int chosen_node)
u8 value[16];
remove_ima_buffer(fdt, chosen_node);
- if (!image->arch.ima_buffer_size)
+ if (!image->ima_buffer_size)
return 0;
ret = get_addr_size_cells(&addr_cells, &size_cells);
@@ -192,11 +175,11 @@ int setup_ima_buffer(const struct kimage *image, void *fdt, int chosen_node)
if (entry_size > sizeof(value))
return -EINVAL;
- ret = write_number(value, image->arch.ima_buffer_addr, addr_cells);
+ ret = write_number(value, image->ima_buffer_addr, addr_cells);
if (ret)
return ret;
- ret = write_number(value + 4 * addr_cells, image->arch.ima_buffer_size,
+ ret = write_number(value + 4 * addr_cells, image->ima_buffer_size,
size_cells);
if (ret)
return ret;
@@ -206,13 +189,13 @@ int setup_ima_buffer(const struct kimage *image, void *fdt, int chosen_node)
if (ret < 0)
return -EINVAL;
- ret = fdt_add_mem_rsv(fdt, image->arch.ima_buffer_addr,
- image->arch.ima_buffer_size);
+ ret = fdt_add_mem_rsv(fdt, image->ima_buffer_addr,
+ image->ima_buffer_size);
if (ret)
return -EINVAL;
pr_debug("IMA buffer at 0x%llx, size = 0x%zx\n",
- image->arch.ima_buffer_addr, image->arch.ima_buffer_size);
+ image->ima_buffer_addr, image->ima_buffer_size);
return 0;
}
diff --git a/drivers/of/kexec.c b/drivers/of/kexec.c
index 4afd3cc1c04a..efbf54f164fd 100644
--- a/drivers/of/kexec.c
+++ b/drivers/of/kexec.c
@@ -63,6 +63,29 @@ static int fdt_find_and_del_mem_rsv(void *fdt, unsigned long start, unsigned lon
return -ENOENT;
}
+#ifdef CONFIG_IMA_KEXEC
+/**
+ * of_ima_add_kexec_buffer - Add IMA buffer for next kernel
+ *
+ * @image: kimage struct to set IMA buffer data
+ * @load_addr: Starting address where IMA buffer is loaded at
+ * @size: Number of bytes in the IMA buffer
+ *
+ * Use this function to pass on the IMA buffer information to
+ * the next kernel across kexec system call.
+ *
+ * Return: 0 on success, negative errno on error.
+ */
+int of_ima_add_kexec_buffer(struct kimage *image,
+ unsigned long load_addr, size_t size)
+{
+ image->ima_buffer_addr = load_addr;
+ image->ima_buffer_size = size;
+
+ return 0;
+}
+#endif /* CONFIG_IMA_KEXEC */
+
/*
* of_kexec_setup_new_fdt - modify /chosen and memory reservation for the next kernel
*
diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index 9e93bef52968..c142a1e5568d 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -300,6 +300,11 @@ struct kimage {
/* Information for loading purgatory */
struct purgatory_info purgatory_info;
#endif
+
+#ifdef CONFIG_IMA_KEXEC
+ phys_addr_t ima_buffer_addr;
+ size_t ima_buffer_size;
+#endif
};
/* kexec interface functions */
diff --git a/include/linux/of.h b/include/linux/of.h
index 41e256adf758..551117c32773 100644
--- a/include/linux/of.h
+++ b/include/linux/of.h
@@ -564,6 +564,11 @@ int of_kexec_setup_new_fdt(const struct kimage *image, void *fdt,
unsigned long initrd_load_addr, unsigned long initrd_len,
const char *cmdline);
+#ifdef CONFIG_IMA_KEXEC
+int of_ima_add_kexec_buffer(struct kimage *image,
+ unsigned long load_addr, size_t size);
+#endif /* CONFIG_IMA_KEXEC */
+
#else /* CONFIG_OF */
static inline void of_core_init(void)
diff --git a/security/integrity/ima/ima_kexec.c b/security/integrity/ima/ima_kexec.c
index 121de3e04af2..345b78515774 100644
--- a/security/integrity/ima/ima_kexec.c
+++ b/security/integrity/ima/ima_kexec.c
@@ -10,6 +10,7 @@
#include <linux/seq_file.h>
#include <linux/vmalloc.h>
#include <linux/kexec.h>
+#include <linux/of.h>
#include "ima.h"
#ifdef CONFIG_IMA_KEXEC
@@ -122,7 +123,7 @@ void ima_add_kexec_buffer(struct kimage *image)
return;
}
- ret = arch_ima_add_kexec_buffer(image, kbuf.mem, kexec_segment_size);
+ ret = of_ima_add_kexec_buffer(image, kbuf.mem, kexec_segment_size);
if (ret) {
pr_err("Error passing over kexec measurement buffer.\n");
return;
--
2.30.0
^ permalink raw reply related
* [PATCH v16 04/12] powerpc: Use common of_kexec_setup_new_fdt()
From: Lakshmi Ramasubramanian @ 2021-02-04 16:41 UTC (permalink / raw)
To: zohar, bauerman, robh, takahiro.akashi, gregkh, will, joe,
catalin.marinas, mpe
Cc: mark.rutland, bhsharma, tao.li, paulus, vincenzo.frascino,
frowand.list, sashal, masahiroy, jmorris, linux-arm-kernel, serge,
devicetree, pasha.tatashin, prsriva, hsinyi, allison,
christophe.leroy, mbrugger, balajib, dmitry.kasatkin,
linux-kernel, james.morse, linux-integrity, linuxppc-dev
In-Reply-To: <20210204164135.29856-1-nramas@linux.microsoft.com>
From: Rob Herring <robh@kernel.org>
The code for setting up the /chosen node in the device tree
and updating the memory reservation for the next kernel has been
moved to of_kexec_setup_new_fdt() defined in "drivers/of/kexec.c".
Use the common of_kexec_setup_new_fdt() to setup the device tree
and update the memory reservation for kexec for powerpc.
Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Reviewed-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
---
arch/powerpc/kexec/file_load.c | 125 ++-------------------------------
1 file changed, 6 insertions(+), 119 deletions(-)
diff --git a/arch/powerpc/kexec/file_load.c b/arch/powerpc/kexec/file_load.c
index e452b11df631..956bcb2d1ec2 100644
--- a/arch/powerpc/kexec/file_load.c
+++ b/arch/powerpc/kexec/file_load.c
@@ -16,6 +16,7 @@
#include <linux/slab.h>
#include <linux/kexec.h>
+#include <linux/of.h>
#include <linux/of_fdt.h>
#include <linux/libfdt.h>
#include <asm/setup.h>
@@ -156,132 +157,18 @@ int setup_new_fdt(const struct kimage *image, void *fdt,
unsigned long initrd_load_addr, unsigned long initrd_len,
const char *cmdline)
{
- int ret, chosen_node;
- const void *prop;
-
- /* Remove memory reservation for the current device tree. */
- ret = delete_fdt_mem_rsv(fdt, __pa(initial_boot_params),
- fdt_totalsize(initial_boot_params));
- if (ret == 0)
- pr_debug("Removed old device tree reservation.\n");
- else if (ret != -ENOENT)
- return ret;
-
- chosen_node = fdt_path_offset(fdt, "/chosen");
- if (chosen_node == -FDT_ERR_NOTFOUND) {
- chosen_node = fdt_add_subnode(fdt, fdt_path_offset(fdt, "/"),
- "chosen");
- if (chosen_node < 0) {
- pr_err("Error creating /chosen.\n");
- return -EINVAL;
- }
- } else if (chosen_node < 0) {
- pr_err("Malformed device tree: error reading /chosen.\n");
- return -EINVAL;
- }
-
- /* Did we boot using an initrd? */
- prop = fdt_getprop(fdt, chosen_node, "linux,initrd-start", NULL);
- if (prop) {
- uint64_t tmp_start, tmp_end, tmp_size;
-
- tmp_start = fdt64_to_cpu(*((const fdt64_t *) prop));
-
- prop = fdt_getprop(fdt, chosen_node, "linux,initrd-end", NULL);
- if (!prop) {
- pr_err("Malformed device tree.\n");
- return -EINVAL;
- }
- tmp_end = fdt64_to_cpu(*((const fdt64_t *) prop));
-
- /*
- * kexec reserves exact initrd size, while firmware may
- * reserve a multiple of PAGE_SIZE, so check for both.
- */
- tmp_size = tmp_end - tmp_start;
- ret = delete_fdt_mem_rsv(fdt, tmp_start, tmp_size);
- if (ret == -ENOENT)
- ret = delete_fdt_mem_rsv(fdt, tmp_start,
- round_up(tmp_size, PAGE_SIZE));
- if (ret == 0)
- pr_debug("Removed old initrd reservation.\n");
- else if (ret != -ENOENT)
- return ret;
-
- /* If there's no new initrd, delete the old initrd's info. */
- if (initrd_len == 0) {
- ret = fdt_delprop(fdt, chosen_node,
- "linux,initrd-start");
- if (ret) {
- pr_err("Error deleting linux,initrd-start.\n");
- return -EINVAL;
- }
-
- ret = fdt_delprop(fdt, chosen_node, "linux,initrd-end");
- if (ret) {
- pr_err("Error deleting linux,initrd-end.\n");
- return -EINVAL;
- }
- }
- }
-
- if (initrd_len) {
- ret = fdt_setprop_u64(fdt, chosen_node,
- "linux,initrd-start",
- initrd_load_addr);
- if (ret < 0)
- goto err;
-
- /* initrd-end is the first address after the initrd image. */
- ret = fdt_setprop_u64(fdt, chosen_node, "linux,initrd-end",
- initrd_load_addr + initrd_len);
- if (ret < 0)
- goto err;
-
- ret = fdt_add_mem_rsv(fdt, initrd_load_addr, initrd_len);
- if (ret) {
- pr_err("Error reserving initrd memory: %s\n",
- fdt_strerror(ret));
- return -EINVAL;
- }
- }
-
- if (cmdline != NULL) {
- ret = fdt_setprop_string(fdt, chosen_node, "bootargs", cmdline);
- if (ret < 0)
- goto err;
- } else {
- ret = fdt_delprop(fdt, chosen_node, "bootargs");
- if (ret && ret != -FDT_ERR_NOTFOUND) {
- pr_err("Error deleting bootargs.\n");
- return -EINVAL;
- }
- }
+ int ret;
- if (image->type == KEXEC_TYPE_CRASH) {
- /*
- * Avoid elfcorehdr from being stomped on in kdump kernel by
- * setting up memory reserve map.
- */
- ret = fdt_add_mem_rsv(fdt, image->arch.elf_headers_mem,
- image->arch.elf_headers_sz);
- if (ret) {
- pr_err("Error reserving elfcorehdr memory: %s\n",
- fdt_strerror(ret));
- goto err;
- }
- }
+ ret = of_kexec_setup_new_fdt(image, fdt, initrd_load_addr, initrd_len, cmdline);
+ if (ret)
+ goto err;
- ret = setup_ima_buffer(image, fdt, chosen_node);
+ ret = setup_ima_buffer(image, fdt, fdt_path_offset(fdt, "/chosen"));
if (ret) {
pr_err("Error setting up the new device tree.\n");
return ret;
}
- ret = fdt_setprop(fdt, chosen_node, "linux,booted-from-kexec", NULL, 0);
- if (ret)
- goto err;
-
return 0;
err:
--
2.30.0
^ permalink raw reply related
* [PATCH v16 03/12] arm64: Use common of_kexec_setup_new_fdt()
From: Lakshmi Ramasubramanian @ 2021-02-04 16:41 UTC (permalink / raw)
To: zohar, bauerman, robh, takahiro.akashi, gregkh, will, joe,
catalin.marinas, mpe
Cc: mark.rutland, bhsharma, tao.li, paulus, vincenzo.frascino,
frowand.list, sashal, masahiroy, jmorris, linux-arm-kernel, serge,
devicetree, pasha.tatashin, prsriva, hsinyi, allison,
christophe.leroy, mbrugger, balajib, dmitry.kasatkin,
linux-kernel, james.morse, linux-integrity, linuxppc-dev
In-Reply-To: <20210204164135.29856-1-nramas@linux.microsoft.com>
From: Rob Herring <robh@kernel.org>
The code for setting up the /chosen node in the device tree
and updating the memory reservation for the next kernel has been
moved to of_kexec_setup_new_fdt() defined in "drivers/of/kexec.c".
Use the common of_kexec_setup_new_fdt() to setup the device tree
and update the memory reservation for kexec for arm64.
Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Reviewed-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
Acked-by: Will Deacon <will@kernel.org>
---
arch/arm64/kernel/machine_kexec_file.c | 123 +------------------------
1 file changed, 3 insertions(+), 120 deletions(-)
diff --git a/arch/arm64/kernel/machine_kexec_file.c b/arch/arm64/kernel/machine_kexec_file.c
index 03210f644790..7da22bb7b9d5 100644
--- a/arch/arm64/kernel/machine_kexec_file.c
+++ b/arch/arm64/kernel/machine_kexec_file.c
@@ -15,23 +15,12 @@
#include <linux/kexec.h>
#include <linux/libfdt.h>
#include <linux/memblock.h>
+#include <linux/of.h>
#include <linux/of_fdt.h>
-#include <linux/random.h>
#include <linux/slab.h>
#include <linux/string.h>
#include <linux/types.h>
#include <linux/vmalloc.h>
-#include <asm/byteorder.h>
-
-/* relevant device tree properties */
-#define FDT_PROP_KEXEC_ELFHDR "linux,elfcorehdr"
-#define FDT_PROP_MEM_RANGE "linux,usable-memory-range"
-#define FDT_PROP_INITRD_START "linux,initrd-start"
-#define FDT_PROP_INITRD_END "linux,initrd-end"
-#define FDT_PROP_BOOTARGS "bootargs"
-#define FDT_PROP_KASLR_SEED "kaslr-seed"
-#define FDT_PROP_RNG_SEED "rng-seed"
-#define RNG_SEED_SIZE 128
const struct kexec_file_ops * const kexec_file_loaders[] = {
&kexec_image_ops,
@@ -50,112 +39,6 @@ int arch_kimage_file_post_load_cleanup(struct kimage *image)
return kexec_image_post_load_cleanup_default(image);
}
-static int setup_dtb(struct kimage *image,
- unsigned long initrd_load_addr, unsigned long initrd_len,
- char *cmdline, void *dtb)
-{
- int off, ret;
-
- ret = fdt_path_offset(dtb, "/chosen");
- if (ret < 0)
- goto out;
-
- off = ret;
-
- ret = fdt_delprop(dtb, off, FDT_PROP_KEXEC_ELFHDR);
- if (ret && ret != -FDT_ERR_NOTFOUND)
- goto out;
- ret = fdt_delprop(dtb, off, FDT_PROP_MEM_RANGE);
- if (ret && ret != -FDT_ERR_NOTFOUND)
- goto out;
-
- if (image->type == KEXEC_TYPE_CRASH) {
- /* add linux,elfcorehdr */
- ret = fdt_appendprop_addrrange(dtb, 0, off,
- FDT_PROP_KEXEC_ELFHDR,
- image->arch.elf_headers_mem,
- image->arch.elf_headers_sz);
- if (ret)
- return (ret == -FDT_ERR_NOSPACE ? -ENOMEM : -EINVAL);
-
- /* add linux,usable-memory-range */
- ret = fdt_appendprop_addrrange(dtb, 0, off,
- FDT_PROP_MEM_RANGE,
- crashk_res.start,
- crashk_res.end - crashk_res.start + 1);
- if (ret)
- return (ret == -FDT_ERR_NOSPACE ? -ENOMEM : -EINVAL);
- }
-
- /* add bootargs */
- if (cmdline) {
- ret = fdt_setprop_string(dtb, off, FDT_PROP_BOOTARGS, cmdline);
- if (ret)
- goto out;
- } else {
- ret = fdt_delprop(dtb, off, FDT_PROP_BOOTARGS);
- if (ret && (ret != -FDT_ERR_NOTFOUND))
- goto out;
- }
-
- /* add initrd-* */
- if (initrd_load_addr) {
- ret = fdt_setprop_u64(dtb, off, FDT_PROP_INITRD_START,
- initrd_load_addr);
- if (ret)
- goto out;
-
- ret = fdt_setprop_u64(dtb, off, FDT_PROP_INITRD_END,
- initrd_load_addr + initrd_len);
- if (ret)
- goto out;
- } else {
- ret = fdt_delprop(dtb, off, FDT_PROP_INITRD_START);
- if (ret && (ret != -FDT_ERR_NOTFOUND))
- goto out;
-
- ret = fdt_delprop(dtb, off, FDT_PROP_INITRD_END);
- if (ret && (ret != -FDT_ERR_NOTFOUND))
- goto out;
- }
-
- /* add kaslr-seed */
- ret = fdt_delprop(dtb, off, FDT_PROP_KASLR_SEED);
- if (ret == -FDT_ERR_NOTFOUND)
- ret = 0;
- else if (ret)
- goto out;
-
- if (rng_is_initialized()) {
- u64 seed = get_random_u64();
- ret = fdt_setprop_u64(dtb, off, FDT_PROP_KASLR_SEED, seed);
- if (ret)
- goto out;
- } else {
- pr_notice("RNG is not initialised: omitting \"%s\" property\n",
- FDT_PROP_KASLR_SEED);
- }
-
- /* add rng-seed */
- if (rng_is_initialized()) {
- void *rng_seed;
- ret = fdt_setprop_placeholder(dtb, off, FDT_PROP_RNG_SEED,
- RNG_SEED_SIZE, &rng_seed);
- if (ret)
- goto out;
- get_random_bytes(rng_seed, RNG_SEED_SIZE);
- } else {
- pr_notice("RNG is not initialised: omitting \"%s\" property\n",
- FDT_PROP_RNG_SEED);
- }
-
-out:
- if (ret)
- return (ret == -FDT_ERR_NOSPACE) ? -ENOMEM : -EINVAL;
-
- return 0;
-}
-
/*
* More space needed so that we can add initrd, bootargs, kaslr-seed,
* rng-seed, userable-memory-range and elfcorehdr.
@@ -185,8 +68,8 @@ static int create_dtb(struct kimage *image,
if (ret)
return -EINVAL;
- ret = setup_dtb(image, initrd_load_addr, initrd_len,
- cmdline, buf);
+ ret = of_kexec_setup_new_fdt(image, buf, initrd_load_addr,
+ initrd_len, cmdline);
if (ret) {
vfree(buf);
if (ret == -ENOMEM) {
--
2.30.0
^ permalink raw reply related
* [PATCH v16 00/12] Carry forward IMA measurement log on kexec on ARM64
From: Lakshmi Ramasubramanian @ 2021-02-04 16:41 UTC (permalink / raw)
To: zohar, bauerman, robh, takahiro.akashi, gregkh, will, joe,
catalin.marinas, mpe
Cc: mark.rutland, bhsharma, tao.li, paulus, vincenzo.frascino,
frowand.list, sashal, masahiroy, jmorris, linux-arm-kernel, serge,
devicetree, pasha.tatashin, prsriva, hsinyi, allison,
christophe.leroy, mbrugger, balajib, dmitry.kasatkin,
linux-kernel, james.morse, linux-integrity, linuxppc-dev
On kexec file load Integrity Measurement Architecture (IMA) subsystem
may verify the IMA signature of the kernel and initramfs, and measure
it. The command line parameters passed to the kernel in the kexec call
may also be measured by IMA. A remote attestation service can verify
a TPM quote based on the TPM event log, the IMA measurement list, and
the TPM PCR data. This can be achieved only if the IMA measurement log
is carried over from the current kernel to the next kernel across
the kexec call.
powerpc already supports carrying forward the IMA measurement log on
kexec. This patch set adds support for carrying forward the IMA
measurement log on kexec on ARM64.
This patch set moves the platform independent code defined for powerpc
such that it can be reused for other platforms as well. A chosen node
"linux,ima-kexec-buffer" is added to the DTB for ARM64 to hold
the address and the size of the memory reserved to carry
the IMA measurement log.
This patch set has been tested for ARM64 platform using QEMU.
I would like help from the community for testing this change on powerpc.
Thanks.
This patch set is based on
commit b3f82afc1041 ("IMA: Measure kernel version in early boot")
in https://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity.git
"next-integrity" branch.
Changelog:
v16
- Defined functions to allocate and free buffer for FDT for powerpc
and arm64.
- Moved ima_buffer_addr and ima_buffer_size fields from
"struct kimage_arch" in powerpc to "struct kimage"
v15
- Included Rob's patches in the patch set, and rebased
the changes to "next-integrity" branch.
- Allocate memory for DTB, on arm64, using kmalloc() instead of
vmalloc() to keep it consistent with powerpc implementation.
- Call of_kexec_setup_new_fdt() from setup_new_fdt_ppc64() and
remove setup_new_fdt() in the same patch to keep it bisect safe.
v14
- Select CONFIG_HAVE_IMA_KEXEC for CONFIG_KEXEC_FILE, for powerpc
and arm64, if CONFIG_IMA is enabled.
- Use IS_ENABLED() macro instead of "#ifdef" in remove_ima_buffer(),
ima_get_kexec_buffer(), and ima_free_kexec_buffer().
- Call of_kexec_setup_new_fdt() from setup_new_fdt_ppc64() and
remove setup_new_fdt() in "arch/powerpc/kexec/file_load.c".
v13
- Moved the arch independent functions to drivers/of/kexec.c
and then refactored the code.
- Moved arch_ima_add_kexec_buffer() to
security/integrity/ima/ima_kexec.c
v12
- Use fdt_appendprop_addrrange() in setup_ima_buffer()
to setup the IMA measurement list property in
the device tree.
- Moved architecture independent functions from
"arch/powerpc/kexec/ima.c" to "drivers/of/kexec."
- Deleted "arch/powerpc/kexec/ima.c" and
"arch/powerpc/include/asm/ima.h".
v11
- Rebased the changes on the kexec code refactoring done by
Rob Herring in his "dt/kexec" branch
- Removed "extern" keyword in function declarations
- Removed unnecessary header files included in C files
- Updated patch descriptions per Thiago's comments
v10
- Moved delete_fdt_mem_rsv(), remove_ima_buffer(),
get_ima_kexec_buffer, and get_root_addr_size_cells()
to drivers/of/kexec.c
- Moved arch_ima_add_kexec_buffer() to
security/integrity/ima/ima_kexec.c
- Conditionally define IMA buffer fields in struct kimage_arch
v9
- Moved delete_fdt_mem_rsv() to drivers/of/kexec_fdt.c
- Defined a new function get_ima_kexec_buffer() in
drivers/of/ima_kexec.c to replace do_get_kexec_buffer()
- Changed remove_ima_kexec_buffer() to the original function name
remove_ima_buffer()
- Moved remove_ima_buffer() to drivers/of/ima_kexec.c
- Moved ima_get_kexec_buffer() and ima_free_kexec_buffer()
to security/integrity/ima/ima_kexec.c
v8:
- Moved remove_ima_kexec_buffer(), do_get_kexec_buffer(), and
delete_fdt_mem_rsv() to drivers/of/fdt.c
- Moved ima_dump_measurement_list() and ima_add_kexec_buffer()
back to security/integrity/ima/ima_kexec.c
v7:
- Renamed remove_ima_buffer() to remove_ima_kexec_buffer() and moved
this function definition to kernel.
- Moved delete_fdt_mem_rsv() definition to kernel
- Moved ima_dump_measurement_list() and ima_add_kexec_buffer() to
a new file namely ima_kexec_fdt.c in IMA
v6:
- Remove any existing FDT_PROP_IMA_KEXEC_BUFFER property in the device
tree and also its corresponding memory reservation in the currently
running kernel.
- Moved the function remove_ima_buffer() defined for powerpc to IMA
and renamed the function to ima_remove_kexec_buffer(). Also, moved
delete_fdt_mem_rsv() from powerpc to IMA.
v5:
- Merged get_addr_size_cells() and do_get_kexec_buffer() into a single
function when moving the arch independent code from powerpc to IMA
- Reverted the change to use FDT functions in powerpc code and added
back the original code in get_addr_size_cells() and
do_get_kexec_buffer() for powerpc.
- Added fdt_add_mem_rsv() for ARM64 to reserve the memory for
the IMA log buffer during kexec.
- Fixed the warning reported by kernel test bot for ARM64
arch_ima_add_kexec_buffer() - moved this function to a new file
namely arch/arm64/kernel/ima_kexec.c
v4:
- Submitting the patch series on behalf of the original author
Prakhar Srivastava <prsriva@linux.microsoft.com>
- Moved FDT_PROP_IMA_KEXEC_BUFFER ("linux,ima-kexec-buffer") to
libfdt.h so that it can be shared by multiple platforms.
v3:
Breakup patches further into separate patches.
- Refactoring non architecture specific code out of powerpc
- Update powerpc related code to use fdt functions
- Update IMA buffer read related code to use of functions
- Add support to store the memory information of the IMA
measurement logs to be carried forward.
- Update the property strings to align with documented nodes
https://github.com/devicetree-org/dt-schema/pull/46
v2:
Break patches into separate patches.
- Powerpc related Refactoring
- Updating the docuemntation for chosen node
- Updating arm64 to support IMA buffer pass
v1:
Refactoring carrying over IMA measuremnet logs over Kexec. This patch
moves the non-architecture specific code out of powerpc and adds to
security/ima.(Suggested by Thiago)
Add Documentation regarding the ima-kexec-buffer node in the chosen
node documentation
v0:
Add a layer of abstraction to use the memory reserved by device tree
for ima buffer pass.
Add support for ima buffer pass using reserved memory for arm64 kexec.
Update the arch sepcific code path in kexec file load to store the
ima buffer in the reserved memory. The same reserved memory is read
on kexec or cold boot.
Lakshmi Ramasubramanian (8):
powerpc: Move ima buffer fields to struct kimage
powerpc: Move arch independent ima kexec functions to
drivers/of/kexec.c
kexec: Use fdt_appendprop_addrrange() to add ima buffer to FDT
powerpc: Delete unused function delete_fdt_mem_rsv()
of: Define functions to allocate and free FDT
arm64: Use OF alloc and free functions for FDT
powerpc: Use OF alloc and free for FDT
arm64: Enable passing IMA log to next kernel on kexec
Rob Herring (4):
powerpc: Rename kexec elfcorehdr_addr to elf_headers_mem
of: Add a common kexec FDT setup function
arm64: Use common of_kexec_setup_new_fdt()
powerpc: Use common of_kexec_setup_new_fdt()
arch/arm64/Kconfig | 1 +
arch/arm64/kernel/machine_kexec_file.c | 158 +-------
arch/powerpc/Kconfig | 2 +-
arch/powerpc/include/asm/ima.h | 30 --
arch/powerpc/include/asm/kexec.h | 11 +-
arch/powerpc/kexec/Makefile | 7 -
arch/powerpc/kexec/elf_64.c | 26 +-
arch/powerpc/kexec/file_load.c | 184 +---------
arch/powerpc/kexec/file_load_64.c | 11 +-
arch/powerpc/kexec/ima.c | 219 -----------
drivers/of/Makefile | 1 +
drivers/of/kexec.c | 488 +++++++++++++++++++++++++
include/linux/kexec.h | 5 +
include/linux/of.h | 15 +-
security/integrity/ima/ima.h | 4 -
security/integrity/ima/ima_kexec.c | 3 +-
16 files changed, 552 insertions(+), 613 deletions(-)
delete mode 100644 arch/powerpc/include/asm/ima.h
delete mode 100644 arch/powerpc/kexec/ima.c
create mode 100644 drivers/of/kexec.c
--
2.30.0
^ permalink raw reply
* [PATCH v16 01/12] powerpc: Rename kexec elfcorehdr_addr to elf_headers_mem
From: Lakshmi Ramasubramanian @ 2021-02-04 16:41 UTC (permalink / raw)
To: zohar, bauerman, robh, takahiro.akashi, gregkh, will, joe,
catalin.marinas, mpe
Cc: mark.rutland, bhsharma, tao.li, paulus, vincenzo.frascino,
frowand.list, sashal, masahiroy, jmorris, linux-arm-kernel, serge,
devicetree, pasha.tatashin, prsriva, hsinyi, allison,
christophe.leroy, mbrugger, balajib, dmitry.kasatkin,
linux-kernel, james.morse, linux-integrity, linuxppc-dev
In-Reply-To: <20210204164135.29856-1-nramas@linux.microsoft.com>
From: Rob Herring <robh@kernel.org>
The architecture specific field, elfcorehdr_addr in struct kimage_arch,
that holds the address of the buffer in memory for ELF core header for
powerpc has a different name than the one used for arm64. This makes
it hard to have a common code for setting up the device tree for
kexec system call.
Rename elfcorehdr_addr to elf_headers_mem to align with arm64 name so
common code can use it.
Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Reviewed-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
---
arch/powerpc/include/asm/kexec.h | 2 +-
arch/powerpc/kexec/file_load.c | 4 ++--
arch/powerpc/kexec/file_load_64.c | 4 ++--
3 files changed, 5 insertions(+), 5 deletions(-)
diff --git a/arch/powerpc/include/asm/kexec.h b/arch/powerpc/include/asm/kexec.h
index 55d6ede30c19..dbf09d2f36d0 100644
--- a/arch/powerpc/include/asm/kexec.h
+++ b/arch/powerpc/include/asm/kexec.h
@@ -108,7 +108,7 @@ struct kimage_arch {
unsigned long backup_start;
void *backup_buf;
- unsigned long elfcorehdr_addr;
+ unsigned long elf_headers_mem;
unsigned long elf_headers_sz;
void *elf_headers;
diff --git a/arch/powerpc/kexec/file_load.c b/arch/powerpc/kexec/file_load.c
index 9a232bc36c8f..e452b11df631 100644
--- a/arch/powerpc/kexec/file_load.c
+++ b/arch/powerpc/kexec/file_load.c
@@ -45,7 +45,7 @@ char *setup_kdump_cmdline(struct kimage *image, char *cmdline,
return NULL;
elfcorehdr_strlen = sprintf(cmdline_ptr, "elfcorehdr=0x%lx ",
- image->arch.elfcorehdr_addr);
+ image->arch.elf_headers_mem);
if (elfcorehdr_strlen + cmdline_len > COMMAND_LINE_SIZE) {
pr_err("Appending elfcorehdr=<addr> exceeds cmdline size\n");
@@ -263,7 +263,7 @@ int setup_new_fdt(const struct kimage *image, void *fdt,
* Avoid elfcorehdr from being stomped on in kdump kernel by
* setting up memory reserve map.
*/
- ret = fdt_add_mem_rsv(fdt, image->arch.elfcorehdr_addr,
+ ret = fdt_add_mem_rsv(fdt, image->arch.elf_headers_mem,
image->arch.elf_headers_sz);
if (ret) {
pr_err("Error reserving elfcorehdr memory: %s\n",
diff --git a/arch/powerpc/kexec/file_load_64.c b/arch/powerpc/kexec/file_load_64.c
index c69bcf9b547a..a05c19b3cc60 100644
--- a/arch/powerpc/kexec/file_load_64.c
+++ b/arch/powerpc/kexec/file_load_64.c
@@ -815,7 +815,7 @@ static int load_elfcorehdr_segment(struct kimage *image, struct kexec_buf *kbuf)
goto out;
}
- image->arch.elfcorehdr_addr = kbuf->mem;
+ image->arch.elf_headers_mem = kbuf->mem;
image->arch.elf_headers_sz = headers_sz;
image->arch.elf_headers = headers;
out:
@@ -851,7 +851,7 @@ int load_crashdump_segments_ppc64(struct kimage *image,
return ret;
}
pr_debug("Loaded elf core header at 0x%lx, bufsz=0x%lx memsz=0x%lx\n",
- image->arch.elfcorehdr_addr, kbuf->bufsz, kbuf->memsz);
+ image->arch.elf_headers_mem, kbuf->bufsz, kbuf->memsz);
return 0;
}
--
2.30.0
^ permalink raw reply related
* [PATCH v16 02/12] of: Add a common kexec FDT setup function
From: Lakshmi Ramasubramanian @ 2021-02-04 16:41 UTC (permalink / raw)
To: zohar, bauerman, robh, takahiro.akashi, gregkh, will, joe,
catalin.marinas, mpe
Cc: mark.rutland, bhsharma, tao.li, paulus, vincenzo.frascino,
frowand.list, sashal, masahiroy, jmorris, linux-arm-kernel, serge,
devicetree, pasha.tatashin, prsriva, hsinyi, allison,
christophe.leroy, mbrugger, balajib, dmitry.kasatkin,
linux-kernel, james.morse, linux-integrity, linuxppc-dev
In-Reply-To: <20210204164135.29856-1-nramas@linux.microsoft.com>
From: Rob Herring <robh@kernel.org>
Both arm64 and powerpc do essentially the same FDT /chosen setup for
kexec. The differences are either omissions that arm64 should have
or additional properties that will be ignored. The setup code can be
combined and shared by both powerpc and arm64.
The differences relative to the arm64 version:
- If /chosen doesn't exist, it will be created (should never happen).
- Any old dtb and initrd reserved memory will be released.
- The new initrd and elfcorehdr are marked reserved.
- "linux,booted-from-kexec" is set.
The differences relative to the powerpc version:
- "kaslr-seed" and "rng-seed" may be set.
- "linux,elfcorehdr" is set.
- Any existing "linux,usable-memory-range" is removed.
Combine the code for setting up the /chosen node in the FDT and updating
the memory reservation for kexec, for powerpc and arm64, in
of_kexec_setup_new_fdt() and move it to "drivers/of/kexec.c".
Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Reviewed-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
---
drivers/of/Makefile | 1 +
drivers/of/kexec.c | 236 ++++++++++++++++++++++++++++++++++++++++++++
include/linux/of.h | 4 +
3 files changed, 241 insertions(+)
create mode 100644 drivers/of/kexec.c
diff --git a/drivers/of/Makefile b/drivers/of/Makefile
index 6e1e5212f058..8ce11955afde 100644
--- a/drivers/of/Makefile
+++ b/drivers/of/Makefile
@@ -13,5 +13,6 @@ obj-$(CONFIG_OF_RESERVED_MEM) += of_reserved_mem.o
obj-$(CONFIG_OF_RESOLVE) += resolver.o
obj-$(CONFIG_OF_OVERLAY) += overlay.o
obj-$(CONFIG_OF_NUMA) += of_numa.o
+obj-$(CONFIG_KEXEC_FILE) += kexec.o
obj-$(CONFIG_OF_UNITTEST) += unittest-data/
diff --git a/drivers/of/kexec.c b/drivers/of/kexec.c
new file mode 100644
index 000000000000..4afd3cc1c04a
--- /dev/null
+++ b/drivers/of/kexec.c
@@ -0,0 +1,236 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2020 Arm Limited
+ *
+ * Based on arch/arm64/kernel/machine_kexec_file.c:
+ * Copyright (C) 2018 Linaro Limited
+ *
+ * And arch/powerpc/kexec/file_load.c:
+ * Copyright (C) 2016 IBM Corporation
+ */
+
+#include <linux/kernel.h>
+#include <linux/kexec.h>
+#include <linux/libfdt.h>
+#include <linux/of.h>
+#include <linux/of_fdt.h>
+#include <linux/random.h>
+#include <linux/types.h>
+
+/* relevant device tree properties */
+#define FDT_PROP_KEXEC_ELFHDR "linux,elfcorehdr"
+#define FDT_PROP_MEM_RANGE "linux,usable-memory-range"
+#define FDT_PROP_INITRD_START "linux,initrd-start"
+#define FDT_PROP_INITRD_END "linux,initrd-end"
+#define FDT_PROP_BOOTARGS "bootargs"
+#define FDT_PROP_KASLR_SEED "kaslr-seed"
+#define FDT_PROP_RNG_SEED "rng-seed"
+#define RNG_SEED_SIZE 128
+
+/**
+ * fdt_find_and_del_mem_rsv - delete memory reservation with given address and size
+ *
+ * @fdt: Flattened device tree for the current kernel.
+ * @start: Starting address of the reserved memory.
+ * @size: Size of the reserved memory.
+ *
+ * Return: 0 on success, or negative errno on error.
+ */
+static int fdt_find_and_del_mem_rsv(void *fdt, unsigned long start, unsigned long size)
+{
+ int i, ret, num_rsvs = fdt_num_mem_rsv(fdt);
+
+ for (i = 0; i < num_rsvs; i++) {
+ u64 rsv_start, rsv_size;
+
+ ret = fdt_get_mem_rsv(fdt, i, &rsv_start, &rsv_size);
+ if (ret) {
+ pr_err("Malformed device tree.\n");
+ return -EINVAL;
+ }
+
+ if (rsv_start == start && rsv_size == size) {
+ ret = fdt_del_mem_rsv(fdt, i);
+ if (ret) {
+ pr_err("Error deleting device tree reservation.\n");
+ return -EINVAL;
+ }
+
+ return 0;
+ }
+ }
+
+ return -ENOENT;
+}
+
+/*
+ * of_kexec_setup_new_fdt - modify /chosen and memory reservation for the next kernel
+ *
+ * @image: kexec image being loaded.
+ * @fdt: Flattened device tree for the next kernel.
+ * @initrd_load_addr: Address where the next initrd will be loaded.
+ * @initrd_len: Size of the next initrd, or 0 if there will be none.
+ * @cmdline: Command line for the next kernel, or NULL if there will
+ * be none.
+ *
+ * Return: 0 on success, or negative errno on error.
+ */
+int of_kexec_setup_new_fdt(const struct kimage *image, void *fdt,
+ unsigned long initrd_load_addr, unsigned long initrd_len,
+ const char *cmdline)
+{
+ int ret, chosen_node;
+ const void *prop;
+
+ /* Remove memory reservation for the current device tree. */
+ ret = fdt_find_and_del_mem_rsv(fdt, __pa(initial_boot_params),
+ fdt_totalsize(initial_boot_params));
+ if (ret == -EINVAL)
+ return ret;
+
+ chosen_node = fdt_path_offset(fdt, "/chosen");
+ if (chosen_node == -FDT_ERR_NOTFOUND)
+ chosen_node = fdt_add_subnode(fdt, fdt_path_offset(fdt, "/"),
+ "chosen");
+ if (chosen_node < 0) {
+ ret = chosen_node;
+ goto out;
+ }
+
+ ret = fdt_delprop(fdt, chosen_node, FDT_PROP_KEXEC_ELFHDR);
+ if (ret && ret != -FDT_ERR_NOTFOUND)
+ goto out;
+ ret = fdt_delprop(fdt, chosen_node, FDT_PROP_MEM_RANGE);
+ if (ret && ret != -FDT_ERR_NOTFOUND)
+ goto out;
+
+ /* Did we boot using an initrd? */
+ prop = fdt_getprop(fdt, chosen_node, "linux,initrd-start", NULL);
+ if (prop) {
+ u64 tmp_start, tmp_end, tmp_size;
+
+ tmp_start = fdt64_to_cpu(*((const fdt64_t *) prop));
+
+ prop = fdt_getprop(fdt, chosen_node, "linux,initrd-end", NULL);
+ if (!prop)
+ return -EINVAL;
+
+ tmp_end = fdt64_to_cpu(*((const fdt64_t *) prop));
+
+ /*
+ * kexec reserves exact initrd size, while firmware may
+ * reserve a multiple of PAGE_SIZE, so check for both.
+ */
+ tmp_size = tmp_end - tmp_start;
+ ret = fdt_find_and_del_mem_rsv(fdt, tmp_start, tmp_size);
+ if (ret == -ENOENT)
+ ret = fdt_find_and_del_mem_rsv(fdt, tmp_start,
+ round_up(tmp_size, PAGE_SIZE));
+ if (ret == -EINVAL)
+ return ret;
+ }
+
+ /* add initrd-* */
+ if (initrd_load_addr) {
+ ret = fdt_setprop_u64(fdt, chosen_node, FDT_PROP_INITRD_START,
+ initrd_load_addr);
+ if (ret)
+ goto out;
+
+ ret = fdt_setprop_u64(fdt, chosen_node, FDT_PROP_INITRD_END,
+ initrd_load_addr + initrd_len);
+ if (ret)
+ goto out;
+
+ ret = fdt_add_mem_rsv(fdt, initrd_load_addr, initrd_len);
+ if (ret)
+ goto out;
+
+ } else {
+ ret = fdt_delprop(fdt, chosen_node, FDT_PROP_INITRD_START);
+ if (ret && (ret != -FDT_ERR_NOTFOUND))
+ goto out;
+
+ ret = fdt_delprop(fdt, chosen_node, FDT_PROP_INITRD_END);
+ if (ret && (ret != -FDT_ERR_NOTFOUND))
+ goto out;
+ }
+
+ if (image->type == KEXEC_TYPE_CRASH) {
+ /* add linux,elfcorehdr */
+ ret = fdt_appendprop_addrrange(fdt, 0, chosen_node,
+ FDT_PROP_KEXEC_ELFHDR,
+ image->arch.elf_headers_mem,
+ image->arch.elf_headers_sz);
+ if (ret)
+ goto out;
+
+ /*
+ * Avoid elfcorehdr from being stomped on in kdump kernel by
+ * setting up memory reserve map.
+ */
+ ret = fdt_add_mem_rsv(fdt, image->arch.elf_headers_mem,
+ image->arch.elf_headers_sz);
+ if (ret)
+ goto out;
+
+ /* add linux,usable-memory-range */
+ ret = fdt_appendprop_addrrange(fdt, 0, chosen_node,
+ FDT_PROP_MEM_RANGE,
+ crashk_res.start,
+ crashk_res.end - crashk_res.start + 1);
+ if (ret)
+ goto out;
+ }
+
+ /* add bootargs */
+ if (cmdline) {
+ ret = fdt_setprop_string(fdt, chosen_node, FDT_PROP_BOOTARGS, cmdline);
+ if (ret)
+ goto out;
+ } else {
+ ret = fdt_delprop(fdt, chosen_node, FDT_PROP_BOOTARGS);
+ if (ret && (ret != -FDT_ERR_NOTFOUND))
+ goto out;
+ }
+
+ /* add kaslr-seed */
+ ret = fdt_delprop(fdt, chosen_node, FDT_PROP_KASLR_SEED);
+ if (ret == -FDT_ERR_NOTFOUND)
+ ret = 0;
+ else if (ret)
+ goto out;
+
+ if (rng_is_initialized()) {
+ u64 seed = get_random_u64();
+
+ ret = fdt_setprop_u64(fdt, chosen_node, FDT_PROP_KASLR_SEED, seed);
+ if (ret)
+ goto out;
+ } else {
+ pr_notice("RNG is not initialised: omitting \"%s\" property\n",
+ FDT_PROP_KASLR_SEED);
+ }
+
+ /* add rng-seed */
+ if (rng_is_initialized()) {
+ void *rng_seed;
+
+ ret = fdt_setprop_placeholder(fdt, chosen_node, FDT_PROP_RNG_SEED,
+ RNG_SEED_SIZE, &rng_seed);
+ if (ret)
+ goto out;
+ get_random_bytes(rng_seed, RNG_SEED_SIZE);
+ } else {
+ pr_notice("RNG is not initialised: omitting \"%s\" property\n",
+ FDT_PROP_RNG_SEED);
+ }
+
+ ret = fdt_setprop(fdt, chosen_node, "linux,booted-from-kexec", NULL, 0);
+
+out:
+ if (ret)
+ return (ret == -FDT_ERR_NOSPACE) ? -ENOMEM : -EINVAL;
+
+ return 0;
+}
diff --git a/include/linux/of.h b/include/linux/of.h
index 4b27c9a27df3..41e256adf758 100644
--- a/include/linux/of.h
+++ b/include/linux/of.h
@@ -559,6 +559,10 @@ int of_map_id(struct device_node *np, u32 id,
struct device_node **target, u32 *id_out);
phys_addr_t of_dma_get_max_cpu_address(struct device_node *np);
+struct kimage;
+int of_kexec_setup_new_fdt(const struct kimage *image, void *fdt,
+ unsigned long initrd_load_addr, unsigned long initrd_len,
+ const char *cmdline);
#else /* CONFIG_OF */
--
2.30.0
^ permalink raw reply related
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox