* [PATCH 10/12] powerpc/pseries/iommu: Don't use dma_iommu_ops on secure guests
From: Thiago Jung Bauermann @ 2019-05-21 4:49 UTC (permalink / raw)
To: linuxppc-dev
Cc: Anshuman Khandual, Alexey Kardashevskiy, Mike Anderson, Ram Pai,
linux-kernel, Claudio Carvalho, Paul Mackerras, Christoph Hellwig,
Thiago Jung Bauermann
In-Reply-To: <20190521044912.1375-1-bauerman@linux.ibm.com>
Secure guest memory is inacessible to devices so regular DMA isn't
possible.
In that case set devices' dma_map_ops to NULL so that the generic
DMA code path will use SWIOTLB and DMA to bounce buffers.
Signed-off-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
---
arch/powerpc/platforms/pseries/iommu.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c
index 03bbb299320e..7d9550edb700 100644
--- a/arch/powerpc/platforms/pseries/iommu.c
+++ b/arch/powerpc/platforms/pseries/iommu.c
@@ -50,6 +50,7 @@
#include <asm/udbg.h>
#include <asm/mmzone.h>
#include <asm/plpar_wrappers.h>
+#include <asm/svm.h>
#include "pseries.h"
@@ -1332,7 +1333,10 @@ void iommu_init_early_pSeries(void)
of_reconfig_notifier_register(&iommu_reconfig_nb);
register_memory_notifier(&iommu_mem_nb);
- set_pci_dma_ops(&dma_iommu_ops);
+ if (is_secure_guest())
+ set_pci_dma_ops(NULL);
+ else
+ set_pci_dma_ops(&dma_iommu_ops);
}
static int __init disable_multitce(char *str)
^ permalink raw reply related
* [PATCH 09/12] powerpc/pseries/svm: Disable doorbells in SVM guests
From: Thiago Jung Bauermann @ 2019-05-21 4:49 UTC (permalink / raw)
To: linuxppc-dev
Cc: Anshuman Khandual, Alexey Kardashevskiy, Mike Anderson, Ram Pai,
linux-kernel, Claudio Carvalho, Paul Mackerras,
Sukadev Bhattiprolu, Christoph Hellwig, Thiago Jung Bauermann
In-Reply-To: <20190521044912.1375-1-bauerman@linux.ibm.com>
From: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
Normally, the HV emulates some instructions like MSGSNDP, MSGCLRP
from a KVM guest. To emulate the instructions, it must first read
the instruction from the guest's memory and decode its parameters.
However for a secure guest (aka SVM), the page containing the
instruction is in secure memory and the HV cannot access directly.
It would need the Ultravisor (UV) to facilitate accessing the
instruction and parameters but the UV currently does not have
the support for such accesses.
Until the UV has such support, disable doorbells in SVMs. This might
incur a performance hit but that is yet to be quantified.
With this patch applied (needed only in SVMs not needed for HV) we
are able to launch SVM guests with multi-core support. Eg:
qemu -smp sockets=2,cores=2,threads=2.
Fix suggested by Benjamin Herrenschmidt. Thanks to input from
Paul Mackerras, Ram Pai and Michael Anderson.
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
Signed-off-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
---
arch/powerpc/platforms/pseries/smp.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/powerpc/platforms/pseries/smp.c b/arch/powerpc/platforms/pseries/smp.c
index 3df46123cce3..95a5c24a1544 100644
--- a/arch/powerpc/platforms/pseries/smp.c
+++ b/arch/powerpc/platforms/pseries/smp.c
@@ -45,6 +45,7 @@
#include <asm/dbell.h>
#include <asm/plpar_wrappers.h>
#include <asm/code-patching.h>
+#include <asm/svm.h>
#include "pseries.h"
#include "offline_states.h"
@@ -225,7 +226,7 @@ static __init void pSeries_smp_probe_xics(void)
{
xics_smp_probe();
- if (cpu_has_feature(CPU_FTR_DBELL))
+ if (cpu_has_feature(CPU_FTR_DBELL) && !is_secure_guest())
smp_ops->cause_ipi = smp_pseries_cause_ipi;
else
smp_ops->cause_ipi = icp_ops->cause_ipi;
^ permalink raw reply related
* [PATCH 08/12] powerpc/pseries/svm: Export guest SVM status to user space via sysfs
From: Thiago Jung Bauermann @ 2019-05-21 4:49 UTC (permalink / raw)
To: linuxppc-dev
Cc: Anshuman Khandual, Alexey Kardashevskiy, Mike Anderson, Ram Pai,
linux-kernel, Claudio Carvalho, Ryan Grimm, Paul Mackerras,
Christoph Hellwig, Thiago Jung Bauermann
In-Reply-To: <20190521044912.1375-1-bauerman@linux.ibm.com>
From: Ryan Grimm <grimm@linux.vnet.ibm.com>
User space might want to know it's running in a secure VM. It can't do
a mfmsr because mfmsr is a privileged instruction.
The solution here is to create a cpu attribute:
/sys/devices/system/cpu/svm
which will read 0 or 1 based on the S bit of the guest's CPU 0.
Signed-off-by: Ryan Grimm <grimm@linux.vnet.ibm.com>
Reviewed-by: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
---
arch/powerpc/kernel/sysfs.c | 29 +++++++++++++++++++++++++++++
1 file changed, 29 insertions(+)
diff --git a/arch/powerpc/kernel/sysfs.c b/arch/powerpc/kernel/sysfs.c
index e8e93c2c7d03..8fdab134e9ae 100644
--- a/arch/powerpc/kernel/sysfs.c
+++ b/arch/powerpc/kernel/sysfs.c
@@ -18,6 +18,7 @@
#include <asm/smp.h>
#include <asm/pmc.h>
#include <asm/firmware.h>
+#include <asm/svm.h>
#include "cacheinfo.h"
#include "setup.h"
@@ -714,6 +715,32 @@ static struct device_attribute pa6t_attrs[] = {
#endif /* HAS_PPC_PMC_PA6T */
#endif /* HAS_PPC_PMC_CLASSIC */
+#ifdef CONFIG_PPC_SVM
+static void get_svm(void *val)
+{
+ u32 *value = val;
+
+ *value = is_secure_guest();
+}
+
+static ssize_t show_svm(struct device *dev, struct device_attribute *attr, char *buf)
+{
+ u32 val;
+ smp_call_function_single(0, get_svm, &val, 1);
+ return sprintf(buf, "%u\n", val);
+}
+static DEVICE_ATTR(svm, 0444, show_svm, NULL);
+
+static void create_svm_file(void)
+{
+ device_create_file(cpu_subsys.dev_root, &dev_attr_svm);
+}
+#else
+static void create_svm_file(void)
+{
+}
+#endif /* CONFIG_PPC_SVM */
+
static int register_cpu_online(unsigned int cpu)
{
struct cpu *c = &per_cpu(cpu_devices, cpu);
@@ -1057,6 +1084,8 @@ static int __init topology_init(void)
sysfs_create_dscr_default();
#endif /* CONFIG_PPC64 */
+ create_svm_file();
+
return 0;
}
subsys_initcall(topology_init);
^ permalink raw reply related
* [PATCH 06/12] powerpc/pseries/svm: Use shared memory for LPPACA structures
From: Thiago Jung Bauermann @ 2019-05-21 4:49 UTC (permalink / raw)
To: linuxppc-dev
Cc: Anshuman Khandual, Alexey Kardashevskiy, Mike Anderson, Ram Pai,
linux-kernel, Claudio Carvalho, Paul Mackerras, Christoph Hellwig,
Thiago Jung Bauermann, Anshuman Khandual
In-Reply-To: <20190521044912.1375-1-bauerman@linux.ibm.com>
From: Anshuman Khandual <khandual@linux.vnet.ibm.com>
LPPACA structures need to be shared with the host. Hence they need to be in
shared memory. Instead of allocating individual chunks of memory for a
given structure from memblock, a contiguous chunk of memory is allocated
and then converted into shared memory. Subsequent allocation requests will
come from the contiguous chunk which will be always shared memory for all
structures.
While we are able to use a kmem_cache constructor for the Debug Trace Log,
LPPACAs are allocated very early in the boot process (before SLUB is
available) so we need to use a simpler scheme here.
Introduce helper is_svm_platform() which uses the S bit of the MSR to tell
whether we're running as a secure guest.
Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Signed-off-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
---
arch/powerpc/include/asm/svm.h | 26 ++++++++++++++++++++
arch/powerpc/kernel/paca.c | 43 +++++++++++++++++++++++++++++++++-
2 files changed, 68 insertions(+), 1 deletion(-)
diff --git a/arch/powerpc/include/asm/svm.h b/arch/powerpc/include/asm/svm.h
new file mode 100644
index 000000000000..fef3740f46a6
--- /dev/null
+++ b/arch/powerpc/include/asm/svm.h
@@ -0,0 +1,26 @@
+/* SPDX-License-Identifier: GPL-2.0+ */
+/*
+ * SVM helper functions
+ *
+ * Copyright 2019 Anshuman Khandual, IBM Corporation.
+ */
+
+#ifndef _ASM_POWERPC_SVM_H
+#define _ASM_POWERPC_SVM_H
+
+#ifdef CONFIG_PPC_SVM
+
+static inline bool is_secure_guest(void)
+{
+ return mfmsr() & MSR_S;
+}
+
+#else /* CONFIG_PPC_SVM */
+
+static inline bool is_secure_guest(void)
+{
+ return false;
+}
+
+#endif /* CONFIG_PPC_SVM */
+#endif /* _ASM_POWERPC_SVM_H */
diff --git a/arch/powerpc/kernel/paca.c b/arch/powerpc/kernel/paca.c
index 854105db5cff..a9622f4b45bb 100644
--- a/arch/powerpc/kernel/paca.c
+++ b/arch/powerpc/kernel/paca.c
@@ -18,6 +18,8 @@
#include <asm/sections.h>
#include <asm/pgtable.h>
#include <asm/kexec.h>
+#include <asm/svm.h>
+#include <asm/ultravisor.h>
#include "setup.h"
@@ -58,6 +60,41 @@ static void *__init alloc_paca_data(unsigned long size, unsigned long align,
#define LPPACA_SIZE 0x400
+static void *__init alloc_shared_lppaca(unsigned long size, unsigned long align,
+ unsigned long limit, int cpu)
+{
+ size_t shared_lppaca_total_size = PAGE_ALIGN(nr_cpu_ids * LPPACA_SIZE);
+ static unsigned long shared_lppaca_size;
+ static void *shared_lppaca;
+ void *ptr;
+
+ if (!shared_lppaca) {
+ memblock_set_bottom_up(true);
+
+ shared_lppaca =
+ memblock_alloc_try_nid(shared_lppaca_total_size,
+ PAGE_SIZE, MEMBLOCK_LOW_LIMIT,
+ limit, NUMA_NO_NODE);
+ if (!shared_lppaca)
+ panic("cannot allocate shared data");
+
+ memblock_set_bottom_up(false);
+ uv_share_page(PHYS_PFN(__pa(shared_lppaca)),
+ shared_lppaca_total_size >> PAGE_SHIFT);
+ }
+
+ ptr = shared_lppaca + shared_lppaca_size;
+ shared_lppaca_size += size;
+
+ /*
+ * This is very early in boot, so no harm done if the kernel crashes at
+ * this point.
+ */
+ BUG_ON(shared_lppaca_size >= shared_lppaca_total_size);
+
+ return ptr;
+}
+
/*
* See asm/lppaca.h for more detail.
*
@@ -87,7 +124,11 @@ static struct lppaca * __init new_lppaca(int cpu, unsigned long limit)
if (early_cpu_has_feature(CPU_FTR_HVMODE))
return NULL;
- lp = alloc_paca_data(LPPACA_SIZE, 0x400, limit, cpu);
+ if (is_secure_guest())
+ lp = alloc_shared_lppaca(LPPACA_SIZE, 0x400, limit, cpu);
+ else
+ lp = alloc_paca_data(LPPACA_SIZE, 0x400, limit, cpu);
+
init_lppaca(lp);
return lp;
^ permalink raw reply related
* [PATCH 05/12] powerpc/pseries: Add and use LPPACA_SIZE constant
From: Thiago Jung Bauermann @ 2019-05-21 4:49 UTC (permalink / raw)
To: linuxppc-dev
Cc: Alexey Kardashevskiy, Anshuman Khandual, Alexey Kardashevskiy,
Mike Anderson, Ram Pai, linux-kernel, Claudio Carvalho,
Paul Mackerras, Christoph Hellwig, Thiago Jung Bauermann
In-Reply-To: <20190521044912.1375-1-bauerman@linux.ibm.com>
Helps document what the hard-coded number means.
Also take the opportunity to fix an #endif comment.
Suggested-by: Alexey Kardashevskiy <aik@linux.ibm.com>
Signed-off-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
---
arch/powerpc/kernel/paca.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/arch/powerpc/kernel/paca.c b/arch/powerpc/kernel/paca.c
index 9cc91d03ab62..854105db5cff 100644
--- a/arch/powerpc/kernel/paca.c
+++ b/arch/powerpc/kernel/paca.c
@@ -56,6 +56,8 @@ static void *__init alloc_paca_data(unsigned long size, unsigned long align,
#ifdef CONFIG_PPC_PSERIES
+#define LPPACA_SIZE 0x400
+
/*
* See asm/lppaca.h for more detail.
*
@@ -69,7 +71,7 @@ static inline void init_lppaca(struct lppaca *lppaca)
*lppaca = (struct lppaca) {
.desc = cpu_to_be32(0xd397d781), /* "LpPa" */
- .size = cpu_to_be16(0x400),
+ .size = cpu_to_be16(LPPACA_SIZE),
.fpregs_in_use = 1,
.slb_count = cpu_to_be16(64),
.vmxregs_in_use = 0,
@@ -79,19 +81,18 @@ static inline void init_lppaca(struct lppaca *lppaca)
static struct lppaca * __init new_lppaca(int cpu, unsigned long limit)
{
struct lppaca *lp;
- size_t size = 0x400;
- BUILD_BUG_ON(size < sizeof(struct lppaca));
+ BUILD_BUG_ON(sizeof(struct lppaca) > LPPACA_SIZE);
if (early_cpu_has_feature(CPU_FTR_HVMODE))
return NULL;
- lp = alloc_paca_data(size, 0x400, limit, cpu);
+ lp = alloc_paca_data(LPPACA_SIZE, 0x400, limit, cpu);
init_lppaca(lp);
return lp;
}
-#endif /* CONFIG_PPC_BOOK3S */
+#endif /* CONFIG_PPC_PSERIES */
#ifdef CONFIG_PPC_BOOK3S_64
^ permalink raw reply related
* [PATCH 04/12] powerpc/pseries/svm: Add helpers for UV_SHARE_PAGE and UV_UNSHARE_PAGE
From: Thiago Jung Bauermann @ 2019-05-21 4:49 UTC (permalink / raw)
To: linuxppc-dev
Cc: Anshuman Khandual, Alexey Kardashevskiy, Mike Anderson, Ram Pai,
linux-kernel, Claudio Carvalho, Paul Mackerras, Christoph Hellwig,
Thiago Jung Bauermann
In-Reply-To: <20190521044912.1375-1-bauerman@linux.ibm.com>
From: Ram Pai <linuxram@us.ibm.com>
These functions are used when the guest wants to grant the hypervisor
access to certain pages.
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
---
arch/powerpc/include/asm/ultravisor-api.h | 2 ++
arch/powerpc/include/asm/ultravisor.h | 14 ++++++++++++++
2 files changed, 16 insertions(+)
diff --git a/arch/powerpc/include/asm/ultravisor-api.h b/arch/powerpc/include/asm/ultravisor-api.h
index 0e8b72081718..ed68b02869fd 100644
--- a/arch/powerpc/include/asm/ultravisor-api.h
+++ b/arch/powerpc/include/asm/ultravisor-api.h
@@ -20,6 +20,8 @@
/* opcodes */
#define UV_WRITE_PATE 0xF104
#define UV_ESM 0xF110
+#define UV_SHARE_PAGE 0xF130
+#define UV_UNSHARE_PAGE 0xF134
#define UV_RETURN 0xF11C
#endif /* _ASM_POWERPC_ULTRAVISOR_API_H */
diff --git a/arch/powerpc/include/asm/ultravisor.h b/arch/powerpc/include/asm/ultravisor.h
index 09e0a615d96f..537f7717d21a 100644
--- a/arch/powerpc/include/asm/ultravisor.h
+++ b/arch/powerpc/include/asm/ultravisor.h
@@ -44,6 +44,20 @@ static inline int uv_register_pate(u64 lpid, u64 dw0, u64 dw1)
return ucall(UV_WRITE_PATE, retbuf, lpid, dw0, dw1);
}
+static inline int uv_share_page(u64 pfn, u64 npages)
+{
+ unsigned long retbuf[UCALL_BUFSIZE];
+
+ return ucall(UV_SHARE_PAGE, retbuf, pfn, npages);
+}
+
+static inline int uv_unshare_page(u64 pfn, u64 npages)
+{
+ unsigned long retbuf[UCALL_BUFSIZE];
+
+ return ucall(UV_UNSHARE_PAGE, retbuf, pfn, npages);
+}
+
#endif /* !__ASSEMBLY__ */
#endif /* _ASM_POWERPC_ULTRAVISOR_H */
^ permalink raw reply related
* [RFC PATCH 03/12] powerpc/prom_init: Add the ESM call to prom_init
From: Thiago Jung Bauermann @ 2019-05-21 4:49 UTC (permalink / raw)
To: linuxppc-dev
Cc: Anshuman Khandual, Alexey Kardashevskiy, Mike Anderson, Ram Pai,
linux-kernel, Claudio Carvalho, Paul Mackerras, Christoph Hellwig,
Thiago Jung Bauermann
In-Reply-To: <20190521044912.1375-1-bauerman@linux.ibm.com>
From: Ram Pai <linuxram@us.ibm.com>
Make the Enter-Secure-Mode (ESM) ultravisor call to switch the VM to secure
mode. Add "svm=" command line option to turn off switching to secure mode.
Introduce CONFIG_PPC_SVM to control support for secure guests.
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
[ Generate an RTAS os-term hcall when the ESM ucall fails. ]
Signed-off-by: Michael Anderson <andmike@linux.ibm.com>
[ Cleaned up the code a bit. ]
Signed-off-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
---
.../admin-guide/kernel-parameters.txt | 5 +
arch/powerpc/include/asm/ultravisor-api.h | 1 +
arch/powerpc/kernel/prom_init.c | 124 ++++++++++++++++++
3 files changed, 130 insertions(+)
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index c45a19d654f3..7237d86b25c6 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -4501,6 +4501,11 @@
/sys/power/pm_test). Only available when CONFIG_PM_DEBUG
is set. Default value is 5.
+ svm= [PPC]
+ Format: { on | off | y | n | 1 | 0 }
+ This parameter controls use of the Protected
+ Execution Facility on pSeries.
+
swapaccount=[0|1]
[KNL] Enable accounting of swap in memory resource
controller if no parameter or 1 is given or disable
diff --git a/arch/powerpc/include/asm/ultravisor-api.h b/arch/powerpc/include/asm/ultravisor-api.h
index 15e6ce77a131..0e8b72081718 100644
--- a/arch/powerpc/include/asm/ultravisor-api.h
+++ b/arch/powerpc/include/asm/ultravisor-api.h
@@ -19,6 +19,7 @@
/* opcodes */
#define UV_WRITE_PATE 0xF104
+#define UV_ESM 0xF110
#define UV_RETURN 0xF11C
#endif /* _ASM_POWERPC_ULTRAVISOR_API_H */
diff --git a/arch/powerpc/kernel/prom_init.c b/arch/powerpc/kernel/prom_init.c
index 523bb99d7676..5d8a3efb54f2 100644
--- a/arch/powerpc/kernel/prom_init.c
+++ b/arch/powerpc/kernel/prom_init.c
@@ -44,6 +44,7 @@
#include <asm/sections.h>
#include <asm/machdep.h>
#include <asm/asm-prototypes.h>
+#include <asm/ultravisor-api.h>
#include <linux/linux_logo.h>
@@ -174,6 +175,10 @@ static unsigned long __prombss prom_tce_alloc_end;
static bool __prombss prom_radix_disable;
#endif
+#ifdef CONFIG_PPC_SVM
+static bool __prombss prom_svm_disable;
+#endif
+
struct platform_support {
bool hash_mmu;
bool radix_mmu;
@@ -809,6 +814,17 @@ static void __init early_cmdline_parse(void)
if (prom_radix_disable)
prom_debug("Radix disabled from cmdline\n");
#endif /* CONFIG_PPC_PSERIES */
+
+#ifdef CONFIG_PPC_SVM
+ opt = prom_strstr(prom_cmd_line, "svm=");
+ if (opt) {
+ bool val;
+
+ opt += sizeof("svm=") - 1;
+ if (!prom_strtobool(opt, &val))
+ prom_svm_disable = !val;
+ }
+#endif /* CONFIG_PPC_SVM */
}
#ifdef CONFIG_PPC_PSERIES
@@ -1707,6 +1723,43 @@ static void __init prom_close_stdin(void)
}
}
+#ifdef CONFIG_PPC_SVM
+static int prom_rtas_os_term_hcall(uint64_t args)
+{
+ register uint64_t arg1 asm("r3") = 0xf000;
+ register uint64_t arg2 asm("r4") = args;
+
+ asm volatile("sc 1\n" : "=r" (arg1) :
+ "r" (arg1),
+ "r" (arg2) :);
+ return arg1;
+}
+
+static struct rtas_args __prombss os_term_args;
+
+static void __init prom_rtas_os_term(char *str)
+{
+ phandle rtas_node;
+ __be32 val;
+ u32 token;
+
+ prom_printf("%s: start...\n", __func__);
+ rtas_node = call_prom("finddevice", 1, 1, ADDR("/rtas"));
+ prom_printf("rtas_node: %x\n", rtas_node);
+ if (!PHANDLE_VALID(rtas_node))
+ return;
+
+ val = 0;
+ prom_getprop(rtas_node, "ibm,os-term", &val, sizeof(val));
+ token = be32_to_cpu(val);
+ prom_printf("ibm,os-term: %x\n", token);
+ if (token == 0)
+ prom_panic("Could not get token for ibm,os-term\n");
+ os_term_args.token = cpu_to_be32(token);
+ prom_rtas_os_term_hcall((uint64_t)&os_term_args);
+}
+#endif /* CONFIG_PPC_SVM */
+
/*
* Allocate room for and instantiate RTAS
*/
@@ -3162,6 +3215,74 @@ static void unreloc_toc(void)
#endif
#endif
+#ifdef CONFIG_PPC_SVM
+/*
+ * The ESM blob is a data structure with information needed by the Ultravisor to
+ * validate the integrity of the secure guest.
+ */
+static void *get_esm_blob(void)
+{
+ /*
+ * FIXME: We are still finalizing the details on how prom_init will grab
+ * the ESM blob. When that is done, this function will be updated.
+ */
+ return (void *)0xdeadbeef;
+}
+
+/*
+ * Perform the Enter Secure Mode ultracall.
+ */
+static int enter_secure_mode(void *esm_blob, void *retaddr, void *fdt)
+{
+ register uint64_t func asm("r0") = UV_ESM;
+ register uint64_t arg1 asm("r3") = (uint64_t)esm_blob;
+ register uint64_t arg2 asm("r4") = (uint64_t)retaddr;
+ register uint64_t arg3 asm("r5") = (uint64_t)fdt;
+
+ asm volatile("sc 2\n"
+ : "=r"(arg1)
+ : "r"(func), "0"(arg1), "r"(arg2), "r"(arg3)
+ :);
+
+ return (int)arg1;
+}
+
+/*
+ * Call the Ultravisor to transfer us to secure memory if we have an ESM blob.
+ */
+static void setup_secure_guest(void *fdt)
+{
+ void *esm_blob;
+ int ret;
+
+ if (prom_svm_disable) {
+ prom_printf("Secure mode is OFF\n");
+ return;
+ }
+
+ esm_blob = get_esm_blob();
+ if (esm_blob == NULL)
+ /*
+ * Absence of an ESM blob isn't an error, it just means we
+ * shouldn't switch to secure mode.
+ */
+ return;
+
+ /* Switch to secure mode. */
+ prom_printf("Switching to secure mode.\n");
+
+ ret = enter_secure_mode(esm_blob, NULL, fdt);
+ if (ret != U_SUCCESS) {
+ prom_printf("Returned %d from switching to secure mode.\n", ret);
+ prom_rtas_os_term("Switch to secure mode failed.\n");
+ }
+}
+#else
+static void setup_secure_guest(void *fdt)
+{
+}
+#endif /* CONFIG_PPC_SVM */
+
/*
* We enter here early on, when the Open Firmware prom is still
* handling exceptions and the MMU hash table for us.
@@ -3360,6 +3481,9 @@ unsigned long __init prom_init(unsigned long r3, unsigned long r4,
unreloc_toc();
#endif
+ /* Move to secure memory if we're supposed to be secure guests. */
+ setup_secure_guest((void *)hdr);
+
__start(hdr, kbase, 0, 0, 0, 0, 0);
return 0;
^ permalink raw reply related
* [RFC PATCH 02/12] powerpc: Add support for adding an ESM blob to the zImage wrapper
From: Thiago Jung Bauermann @ 2019-05-21 4:49 UTC (permalink / raw)
To: linuxppc-dev
Cc: Anshuman Khandual, Alexey Kardashevskiy, Mike Anderson, Ram Pai,
linux-kernel, Claudio Carvalho, Paul Mackerras, Christoph Hellwig,
Thiago Jung Bauermann
In-Reply-To: <20190521044912.1375-1-bauerman@linux.ibm.com>
From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
For secure VMs, the signing tool will create a ticket called the "ESM blob"
for the Enter Secure Mode ultravisor call with the signatures of the kernel
and initrd among other things.
This adds support to the wrapper script for adding that blob via the "-e"
option to the zImage.pseries.
It also adds code to the zImage wrapper itself to retrieve and if necessary
relocate the blob, and pass its address to Linux via the device-tree, to be
later consumed by prom_init.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[ Minor adjustments to some comments. ]
Signed-off-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
---
arch/powerpc/boot/main.c | 41 ++++++++++++++++++++++++++++++++++
arch/powerpc/boot/ops.h | 2 ++
arch/powerpc/boot/wrapper | 24 +++++++++++++++++---
arch/powerpc/boot/zImage.lds.S | 8 +++++++
4 files changed, 72 insertions(+), 3 deletions(-)
diff --git a/arch/powerpc/boot/main.c b/arch/powerpc/boot/main.c
index 78aaf4ffd7ab..ca612efd3e81 100644
--- a/arch/powerpc/boot/main.c
+++ b/arch/powerpc/boot/main.c
@@ -150,6 +150,46 @@ static struct addr_range prep_initrd(struct addr_range vmlinux, void *chosen,
return (struct addr_range){(void *)initrd_addr, initrd_size};
}
+#ifdef __powerpc64__
+static void prep_esm_blob(struct addr_range vmlinux, void *chosen)
+{
+ unsigned long esm_blob_addr, esm_blob_size;
+
+ /* Do we have an ESM (Enter Secure Mode) blob? */
+ if (_esm_blob_end <= _esm_blob_start)
+ return;
+
+ printf("Attached ESM blob at 0x%p-0x%p\n\r",
+ _esm_blob_start, _esm_blob_end);
+ esm_blob_addr = (unsigned long)_esm_blob_start;
+ esm_blob_size = _esm_blob_end - _esm_blob_start;
+
+ /*
+ * If the ESM blob is too low it will be clobbered when the
+ * kernel relocates to its final location. In this case,
+ * allocate a safer place and move it.
+ */
+ if (esm_blob_addr < vmlinux.size) {
+ void *old_addr = (void *)esm_blob_addr;
+
+ printf("Allocating 0x%lx bytes for esm_blob ...\n\r",
+ esm_blob_size);
+ esm_blob_addr = (unsigned long)malloc(esm_blob_size);
+ if (!esm_blob_addr)
+ fatal("Can't allocate memory for ESM blob !\n\r");
+ printf("Relocating ESM blob 0x%lx <- 0x%p (0x%lx bytes)\n\r",
+ esm_blob_addr, old_addr, esm_blob_size);
+ memmove((void *)esm_blob_addr, old_addr, esm_blob_size);
+ }
+
+ /* Tell the kernel ESM blob address via device tree. */
+ setprop_val(chosen, "linux,esm-blob-start", (u32)(esm_blob_addr));
+ setprop_val(chosen, "linux,esm-blob-end", (u32)(esm_blob_addr + esm_blob_size));
+}
+#else
+static inline void prep_esm_blob(struct addr_range vmlinux, void *chosen) { }
+#endif
+
/* A buffer that may be edited by tools operating on a zImage binary so as to
* edit the command line passed to vmlinux (by setting /chosen/bootargs).
* The buffer is put in it's own section so that tools may locate it easier.
@@ -218,6 +258,7 @@ void start(void)
vmlinux = prep_kernel();
initrd = prep_initrd(vmlinux, chosen,
loader_info.initrd_addr, loader_info.initrd_size);
+ prep_esm_blob(vmlinux, chosen);
prep_cmdline(chosen);
printf("Finalizing device tree...");
diff --git a/arch/powerpc/boot/ops.h b/arch/powerpc/boot/ops.h
index cd043726ed88..e0606766480f 100644
--- a/arch/powerpc/boot/ops.h
+++ b/arch/powerpc/boot/ops.h
@@ -251,6 +251,8 @@ extern char _initrd_start[];
extern char _initrd_end[];
extern char _dtb_start[];
extern char _dtb_end[];
+extern char _esm_blob_start[];
+extern char _esm_blob_end[];
static inline __attribute__((const))
int __ilog2_u32(u32 n)
diff --git a/arch/powerpc/boot/wrapper b/arch/powerpc/boot/wrapper
index f9141eaec6ff..36b2ad6cd5b7 100755
--- a/arch/powerpc/boot/wrapper
+++ b/arch/powerpc/boot/wrapper
@@ -14,6 +14,7 @@
# -i initrd specify initrd file
# -d devtree specify device-tree blob
# -s tree.dts specify device-tree source file (needs dtc installed)
+# -e esm_blob specify ESM blob for secure images
# -c cache $kernel.strip.gz (use if present & newer, else make)
# -C prefix specify command prefix for cross-building tools
# (strip, objcopy, ld)
@@ -38,6 +39,7 @@ platform=of
initrd=
dtb=
dts=
+esm_blob=
cacheit=
binary=
compression=.gz
@@ -60,9 +62,9 @@ tmpdir=.
usage() {
echo 'Usage: wrapper [-o output] [-p platform] [-i initrd]' >&2
- echo ' [-d devtree] [-s tree.dts] [-c] [-C cross-prefix]' >&2
- echo ' [-D datadir] [-W workingdir] [-Z (gz|xz|none)]' >&2
- echo ' [--no-compression] [vmlinux]' >&2
+ echo ' [-d devtree] [-s tree.dts] [-e esm_blob]' >&2
+ echo ' [-c] [-C cross-prefix] [-D datadir] [-W workingdir]' >&2
+ echo ' [-Z (gz|xz|none)] [--no-compression] [vmlinux]' >&2
exit 1
}
@@ -105,6 +107,11 @@ while [ "$#" -gt 0 ]; do
[ "$#" -gt 0 ] || usage
dtb="$1"
;;
+ -e)
+ shift
+ [ "$#" -gt 0 ] || usage
+ esm_blob="$1"
+ ;;
-s)
shift
[ "$#" -gt 0 ] || usage
@@ -211,9 +218,16 @@ objflags=-S
tmp=$tmpdir/zImage.$$.o
ksection=.kernel:vmlinux.strip
isection=.kernel:initrd
+esection=.kernel:esm_blob
link_address='0x400000'
make_space=y
+
+if [ -n "$esm_blob" -a "$platform" != "pseries" ]; then
+ echo "ESM blob not support on non-pseries platforms" >&2
+ exit 1
+fi
+
case "$platform" in
of)
platformo="$object/of.o $object/epapr.o"
@@ -463,6 +477,10 @@ if [ -n "$dtb" ]; then
fi
fi
+if [ -n "$esm_blob" ]; then
+ addsec $tmp "$esm_blob" $esection
+fi
+
if [ "$platform" != "miboot" ]; then
if [ -n "$link_address" ] ; then
text_start="-Ttext $link_address"
diff --git a/arch/powerpc/boot/zImage.lds.S b/arch/powerpc/boot/zImage.lds.S
index 4ac1e36edfe7..a21f3a76e06f 100644
--- a/arch/powerpc/boot/zImage.lds.S
+++ b/arch/powerpc/boot/zImage.lds.S
@@ -68,6 +68,14 @@ SECTIONS
_initrd_end = .;
}
+ . = ALIGN(4096);
+ .kernel:esm_blob :
+ {
+ _esm_blob_start = .;
+ *(.kernel:esm_blob)
+ _esm_blob_end = .;
+ }
+
#ifdef CONFIG_PPC64_BOOT_WRAPPER
. = ALIGN(256);
.got :
^ permalink raw reply related
* [PATCH 01/12] powerpc/pseries: Introduce option to build secure virtual machines
From: Thiago Jung Bauermann @ 2019-05-21 4:49 UTC (permalink / raw)
To: linuxppc-dev
Cc: Anshuman Khandual, Alexey Kardashevskiy, Mike Anderson, Ram Pai,
linux-kernel, Claudio Carvalho, Paul Mackerras, Christoph Hellwig,
Thiago Jung Bauermann
In-Reply-To: <20190521044912.1375-1-bauerman@linux.ibm.com>
Introduce CONFIG_PPC_SVM to control support for secure guests and include
Ultravisor-related helpers when it is selected
Signed-off-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
---
arch/powerpc/include/asm/ultravisor.h | 2 +-
arch/powerpc/kernel/Makefile | 4 +++-
arch/powerpc/platforms/pseries/Kconfig | 12 ++++++++++++
3 files changed, 16 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/include/asm/ultravisor.h b/arch/powerpc/include/asm/ultravisor.h
index 4ffec7a36acd..09e0a615d96f 100644
--- a/arch/powerpc/include/asm/ultravisor.h
+++ b/arch/powerpc/include/asm/ultravisor.h
@@ -28,7 +28,7 @@ extern int early_init_dt_scan_ultravisor(unsigned long node, const char *uname,
* This call supports up to 6 arguments and 4 return arguments. Use
* UCALL_BUFSIZE to size the return argument buffer.
*/
-#if defined(CONFIG_PPC_UV)
+#if defined(CONFIG_PPC_UV) || defined(CONFIG_PPC_SVM)
long ucall(unsigned long opcode, unsigned long *retbuf, ...);
#else
static long ucall(unsigned long opcode, unsigned long *retbuf, ...)
diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile
index 43ff4546e469..1e9b721634c8 100644
--- a/arch/powerpc/kernel/Makefile
+++ b/arch/powerpc/kernel/Makefile
@@ -154,7 +154,9 @@ endif
obj-$(CONFIG_EPAPR_PARAVIRT) += epapr_paravirt.o epapr_hcalls.o
obj-$(CONFIG_KVM_GUEST) += kvm.o kvm_emul.o
-obj-$(CONFIG_PPC_UV) += ultravisor.o ucall.o
+ifneq ($(CONFIG_PPC_UV)$(CONFIG_PPC_SVM),)
+obj-y += ultravisor.o ucall.o
+endif
# Disable GCOV, KCOV & sanitizers in odd or sensitive code
GCOV_PROFILE_prom_init.o := n
diff --git a/arch/powerpc/platforms/pseries/Kconfig b/arch/powerpc/platforms/pseries/Kconfig
index 9c6b3d860518..82c16aa4f1ce 100644
--- a/arch/powerpc/platforms/pseries/Kconfig
+++ b/arch/powerpc/platforms/pseries/Kconfig
@@ -144,3 +144,15 @@ config PAPR_SCM
tristate "Support for the PAPR Storage Class Memory interface"
help
Enable access to hypervisor provided storage class memory.
+
+config PPC_SVM
+ bool "Secure virtual machine (SVM) support for POWER"
+ depends on PPC_PSERIES
+ default n
+ help
+ Support secure guests on POWER. There are certain POWER platforms which
+ support secure guests using the Protected Execution Facility, with the
+ help of an Ultravisor executing below the hypervisor layer. This
+ enables the support for those guests.
+
+ If unsure, say "N".
^ permalink raw reply related
* [PATCH 00/12] Secure Virtual Machine Enablement
From: Thiago Jung Bauermann @ 2019-05-21 4:49 UTC (permalink / raw)
To: linuxppc-dev
Cc: Anshuman Khandual, Alexey Kardashevskiy, Mike Anderson, Ram Pai,
linux-kernel, Claudio Carvalho, Paul Mackerras, Christoph Hellwig,
Thiago Jung Bauermann
This series enables Secure Virtual Machines (SVMs) on powerpc. SVMs use the
Protected Execution Facility (PEF) and request to be migrated to secure
memory during prom_init() so by default all of their memory is inaccessible
to the hypervisor. There is an Ultravisor call that the VM can use to
request certain pages to be made accessible to (or shared with) the
hypervisor.
The objective of these patches is to have the guest perform this request
for buffers that need to be accessed by the hypervisor such as the LPPACAs,
the SWIOTLB memory and the Debug Trace Log.
The patch set applies on top of Claudio Carvalho's "kvmppc: Paravirtualize KVM
to support ultravisor" series:
https://lore.kernel.org/linuxppc-dev/20190518142524.28528-1-cclaudio@linux.ibm.com/
I only need the following two patches from his series:
[RFC PATCH v2 02/10] KVM: PPC: Ultravisor: Introduce the MSR_S bit
[RFC PATCH v2 04/10] KVM: PPC: Ultravisor: Add generic ultravisor call handler
Patches 2 and 3 are posted as RFC because we are still finalizing the
details on how the ESM blob will be passed to the kernel. All other patches
are (hopefully) in upstreamable shape.
Unfortunately this series still doesn't enable the use of virtio devices in
the secure guest. This support depends on a discussion that is currently
ongoing with the virtio community:
https://lore.kernel.org/linuxppc-dev/87womn8inf.fsf@morokweng.localdomain/
This was the last time I posted this patch set:
https://lore.kernel.org/linuxppc-dev/20180824162535.22798-1-bauerman@linux.ibm.com/
At that time, it wasn't possible to launch a real secure guest because the
Ultravisor was still in very early development. Now there is a relatively
mature Ultravisor and I was able to test it using Claudio's patches in the
host kernel, booting normally using an initramfs for the root filesystem.
This is the command used to start up the guest with QEMU 4.0:
qemu-system-ppc64 \
-nodefaults \
-cpu host \
-machine pseries,accel=kvm,kvm-type=HV,cap-htm=off,cap-cfpc=broken,cap-sbbc=broken,cap-ibs=broken \
-display none \
-serial mon:stdio \
-smp 1 \
-m 4G \
-kernel /home/bauermann/vmlinux \
-initrd /home/bauermann/fs_small.cpio \
-append 'debug'
Changelog since the RFC from August:
- Patch "powerpc/pseries: Introduce option to build secure virtual machines"
- New patch.
- Patch "powerpc: Add support for adding an ESM blob to the zImage wrapper"
- Patch from Benjamin Herrenschmidt, first posted here:
https://lore.kernel.org/linuxppc-dev/20180531043417.25073-1-benh@kernel.crashing.org/
- Made minor adjustments to some comments. Code is unchanged.
- Patch "powerpc/prom_init: Add the ESM call to prom_init"
- New patch from Ram Pai and Michael Anderson.
- Patch "powerpc/pseries/svm: Add helpers for UV_SHARE_PAGE and UV_UNSHARE_PAGE"
- New patch from Ram Pai.
- Patch "powerpc/pseries: Add and use LPPACA_SIZE constant"
- Moved LPPACA_SIZE macro inside the CONFIG_PPC_PSERIES #ifdef.
- Put sizeof() operand left of comparison operator in BUILD_BUG_ON()
macro to appease a checkpatch warning.
- Patch "powerpc/pseries/svm: Use shared memory for LPPACA structures"
- Moved definition of is_secure_guest() helper to this patch.
- Changed shared_lppaca and shared_lppaca_size from globals to static
variables inside alloc_shared_lppaca().
- Changed shared_lppaca to hold virtual address instead of physical
address.
- Patch "powerpc/pseries/svm: Use shared memory for Debug Trace Log (DTL)"
- Add get_dtl_cache_ctor() macro. Suggested by Ram Pai.
- Patch "powerpc/pseries/svm: Export guest SVM status to user space via sysfs"
- New patch from Ryan Grimm.
- Patch "powerpc/pseries/svm: Disable doorbells in SVM guests"
- New patch from Sukadev Bhattiprolu.
- Patch "powerpc/pseries/iommu: Don't use dma_iommu_ops on secure guests"
- New patch.
- Patch "powerpc/pseries/svm: Force SWIOTLB for secure guests"
- New patch with code that was previously in other patches.
- Patch "powerpc/configs: Enable secure guest support in pseries and ppc64 defconfigs"
- New patch from Ryan Grimm.
- Patch "powerpc/pseries/svm: Detect Secure Virtual Machine (SVM) platform"
- Dropped this patch by moving its code to other patches.
- Patch "powerpc/svm: Select CONFIG_DMA_DIRECT_OPS and CONFIG_SWIOTLB"
- No need to select CONFIG_DMA_DIRECT_OPS anymore. The CONFIG_SWIOTLB
change was moved to another patch and this patch was dropped.
- Patch "powerpc/pseries/svm: Add memory conversion (shared/secure) helper functions"
- Dropped patch since the helper functions were unnecessary wrappers
around uv_share_page() and uv_unshare_page().
- Patch "powerpc/svm: Convert SWIOTLB buffers to shared memory"
- Squashed into patch "powerpc/pseries/svm: Force SWIOTLB for secure
guests"
- Patch "powerpc/svm: Don't release SWIOTLB buffers on secure guests"
- Squashed into patch "powerpc/pseries/svm: Force SWIOTLB for secure
guests"
- Patch "powerpc/svm: Use SWIOTLB DMA API for all virtio devices"
- Dropped patch. Enablement of virtio will use a difference approach.
- Patch "powerpc/svm: Force the use of bounce buffers"
- Squashed into patch "powerpc/pseries/svm: Force SWIOTLB for secure
guests"
- Added comment explaining why it's necessary.to force use of SWIOTLB.
Suggested by Christoph Hellwig.
- Patch "powerpc/svm: Increase SWIOTLB buffer size"
- Dropped patch.
Anshuman Khandual (3):
powerpc/pseries/svm: Use shared memory for LPPACA structures
powerpc/pseries/svm: Use shared memory for Debug Trace Log (DTL)
powerpc/pseries/svm: Force SWIOTLB for secure guests
Benjamin Herrenschmidt (1):
powerpc: Add support for adding an ESM blob to the zImage wrapper
Ram Pai (2):
powerpc/prom_init: Add the ESM call to prom_init
powerpc/pseries/svm: Add helpers for UV_SHARE_PAGE and UV_UNSHARE_PAGE
Ryan Grimm (2):
powerpc/pseries/svm: Export guest SVM status to user space via sysfs
powerpc/configs: Enable secure guest support in pseries and ppc64
defconfigs
Sukadev Bhattiprolu (1):
powerpc/pseries/svm: Disable doorbells in SVM guests
Thiago Jung Bauermann (3):
powerpc/pseries: Introduce option to build secure virtual machines
powerpc/pseries: Add and use LPPACA_SIZE constant
powerpc/pseries/iommu: Don't use dma_iommu_ops on secure guests
.../admin-guide/kernel-parameters.txt | 5 +
arch/powerpc/boot/main.c | 41 ++++++
arch/powerpc/boot/ops.h | 2 +
arch/powerpc/boot/wrapper | 24 +++-
arch/powerpc/boot/zImage.lds.S | 8 ++
arch/powerpc/configs/ppc64_defconfig | 1 +
arch/powerpc/configs/pseries_defconfig | 1 +
arch/powerpc/include/asm/mem_encrypt.h | 19 +++
arch/powerpc/include/asm/svm.h | 31 +++++
arch/powerpc/include/asm/ultravisor-api.h | 3 +
arch/powerpc/include/asm/ultravisor.h | 16 ++-
arch/powerpc/kernel/Makefile | 4 +-
arch/powerpc/kernel/paca.c | 52 +++++++-
arch/powerpc/kernel/prom_init.c | 124 ++++++++++++++++++
arch/powerpc/kernel/sysfs.c | 29 ++++
arch/powerpc/platforms/pseries/Kconfig | 17 +++
arch/powerpc/platforms/pseries/Makefile | 1 +
arch/powerpc/platforms/pseries/iommu.c | 6 +-
arch/powerpc/platforms/pseries/setup.c | 5 +-
arch/powerpc/platforms/pseries/smp.c | 3 +-
arch/powerpc/platforms/pseries/svm.c | 85 ++++++++++++
21 files changed, 464 insertions(+), 13 deletions(-)
create mode 100644 arch/powerpc/include/asm/mem_encrypt.h
create mode 100644 arch/powerpc/include/asm/svm.h
create mode 100644 arch/powerpc/platforms/pseries/svm.c
^ permalink raw reply
* Re: [RFC PATCH] powerpc/mm: Implement STRICT_MODULE_RWX
From: Russell Currey @ 2019-05-21 4:33 UTC (permalink / raw)
To: Christophe Leroy, Benjamin Herrenschmidt, Paul Mackerras,
Michael Ellerman
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <df502ffe07caa38c46b0144fc824fff447f4105b.1557901092.git.christophe.leroy@c-s.fr>
On Wed, 2019-05-15 at 06:20 +0000, Christophe Leroy wrote:
Confirming this works on hash and radix book3s64.
> +
> + // only operate on VM areas for now
> + area = find_vm_area((void *)addr);
> + if (!area || end > (unsigned long)area->addr + area->size ||
> + !(area->flags & VM_ALLOC))
> + return -EINVAL;
https://lore.kernel.org/patchwork/project/lkml/list/?series=391470
With this patch, the above series causes crashes on (at least) Hash,
since it adds another user of change_page_rw() and change_page_nx()
that for reasons I don't understand yet, we can't handle. I can work
around this with:
if (area->flags & VM_FLUSH_RESET_PERMS)
return 0;
so this is broken on at least one platform as of 5.2-rc1. We're going
to look into this more to see if there's anything else we have to do as
a result of this series before the next merge window, or if just
working around it like this is good enough.
- Russell
^ permalink raw reply
* Re: [PATCH] crypto: vmx - convert to SPDX license identifiers
From: Michael Ellerman @ 2019-05-21 3:58 UTC (permalink / raw)
To: Eric Biggers, linux-crypto, Herbert Xu
Cc: Breno Leitão, linuxppc-dev, Nayna Jain,
Paulo Flabiano Smorigo, Daniel Axtens
In-Reply-To: <20190520164232.159053-1-ebiggers@kernel.org>
Eric Biggers <ebiggers@kernel.org> writes:
> From: Eric Biggers <ebiggers@google.com>
>
> Remove the boilerplate license text and replace it with the equivalent
> SPDX license identifier.
>
> Signed-off-by: Eric Biggers <ebiggers@google.com>
> ---
> drivers/crypto/vmx/aes.c | 14 +-------------
> drivers/crypto/vmx/aes_cbc.c | 14 +-------------
> drivers/crypto/vmx/aes_ctr.c | 14 +-------------
> drivers/crypto/vmx/aes_xts.c | 14 +-------------
> drivers/crypto/vmx/vmx.c | 14 +-------------
> 5 files changed, 5 insertions(+), 65 deletions(-)
Looks good to me.
Reviewed-by: Michael Ellerman <mpe@ellerman.id.au>
cheers
^ permalink raw reply
* Re: [PATCH 10/10] docs: fix broken documentation links
From: Wolfram Sang @ 2019-05-20 15:05 UTC (permalink / raw)
To: Mauro Carvalho Chehab
Cc: kvm, Linux Doc Mailing List, linux-pci, platform-driver-x86,
linux-mm, linux-i2c, linux-kselftest, devel, Jonathan Corbet, x86,
linux-acpi, xen-devel, linux-edac, devicetree, linux-arm-msm,
Mauro Carvalho Chehab, linux-gpio, linux-amlogic, virtualization,
linux-arm-kernel, devel, netdev, linux-kernel,
linux-security-module, linuxppc-dev
In-Reply-To: <4fd1182b4a41feb2447c7ccde4d7f0a6b3c92686.1558362030.git.mchehab+samsung@kernel.org>
[-- Attachment #1: Type: text/plain, Size: 1091 bytes --]
On Mon, May 20, 2019 at 11:47:39AM -0300, Mauro Carvalho Chehab wrote:
> Mostly due to x86 and acpi conversion, several documentation
> links are still pointing to the old file. Fix them.
>
> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Thanks, didn't notice that.
> Documentation/i2c/instantiating-devices | 2 +-
...
> diff --git a/Documentation/i2c/instantiating-devices b/Documentation/i2c/instantiating-devices
> index 0d85ac1935b7..5a3e2f331e8c 100644
> --- a/Documentation/i2c/instantiating-devices
> +++ b/Documentation/i2c/instantiating-devices
> @@ -85,7 +85,7 @@ Method 1c: Declare the I2C devices via ACPI
> -------------------------------------------
>
> ACPI can also describe I2C devices. There is special documentation for this
> -which is currently located at Documentation/acpi/enumeration.txt.
> +which is currently located at Documentation/firmware-guide/acpi/enumeration.rst.
>
>
> Method 2: Instantiate the devices explicitly
For this I2C part:
Reviewed-by: Wolfram Sang <wsa@the-dreams.de>
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply
* Re: [PATCH] misc: remove redundant 'default n' from Kconfig-s
From: Michał Mirosław @ 2019-05-20 17:55 UTC (permalink / raw)
To: Bartlomiej Zolnierkiewicz
Cc: Eric Piel, Andrew Donnellan, Arnd Bergmann, Greg Kroah-Hartman,
Frank Haverkamp, linux-kernel, Frederic Barrat, linuxppc-dev
In-Reply-To: <1ab818ae-4d9f-d17a-f11f-7caaa5bf98bc@samsung.com>
On Mon, May 20, 2019 at 04:10:46PM +0200, Bartlomiej Zolnierkiewicz wrote:
> 'default n' is the default value for any bool or tristate Kconfig
> setting so there is no need to write it explicitly.
>
> Also since commit f467c5640c29 ("kconfig: only write '# CONFIG_FOO
> is not set' for visible symbols") the Kconfig behavior is the same
> regardless of 'default n' being present or not:
>
> ...
> One side effect of (and the main motivation for) this change is making
> the following two definitions behave exactly the same:
>
> config FOO
> bool
>
> config FOO
> bool
> default n
>
> With this change, neither of these will generate a
> '# CONFIG_FOO is not set' line (assuming FOO isn't selected/implied).
> That might make it clearer to people that a bare 'default n' is
> redundant.
> ...
>
> Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
[...]
> drivers/misc/cb710/Kconfig | 1 -
Acked-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
^ permalink raw reply
* [PATCH] crypto: testmgr - fix length truncation with large page size
From: Eric Biggers @ 2019-05-20 16:47 UTC (permalink / raw)
To: linux-crypto, Herbert Xu; +Cc: linuxppc-dev
From: Eric Biggers <ebiggers@google.com>
On PowerPC with CONFIG_CRYPTO_MANAGER_EXTRA_TESTS=y, there is sometimes
a crash in generate_random_aead_testvec(). The problem is that the
generated test vectors use data lengths of up to about 2 * PAGE_SIZE,
which is 128 KiB on PowerPC; however, the data length fields in the test
vectors are 'unsigned short', so the lengths get truncated. Fix this by
changing the relevant fields to 'unsigned int'.
Fixes: 40153b10d91c ("crypto: testmgr - fuzz AEADs against their generic implementation")
Signed-off-by: Eric Biggers <ebiggers@google.com>
---
crypto/testmgr.h | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/crypto/testmgr.h b/crypto/testmgr.h
index 9a13c634b2077..fb2afdd544e00 100644
--- a/crypto/testmgr.h
+++ b/crypto/testmgr.h
@@ -43,7 +43,7 @@ struct hash_testvec {
const char *key;
const char *plaintext;
const char *digest;
- unsigned short psize;
+ unsigned int psize;
unsigned short ksize;
int setkey_error;
int digest_error;
@@ -74,7 +74,7 @@ struct cipher_testvec {
const char *ctext;
unsigned char wk; /* weak key flag */
unsigned short klen;
- unsigned short len;
+ unsigned int len;
bool fips_skip;
bool generates_iv;
int setkey_error;
@@ -110,9 +110,9 @@ struct aead_testvec {
unsigned char novrfy;
unsigned char wk;
unsigned char klen;
- unsigned short plen;
- unsigned short clen;
- unsigned short alen;
+ unsigned int plen;
+ unsigned int clen;
+ unsigned int alen;
int setkey_error;
int setauthsize_error;
int crypt_error;
--
2.21.0.1020.gf2820cf01a-goog
^ permalink raw reply related
* [PATCH] crypto: vmx - convert to skcipher API
From: Eric Biggers @ 2019-05-20 16:44 UTC (permalink / raw)
To: linux-crypto, Herbert Xu
Cc: Breno Leitão, Paulo Flabiano Smorigo, Nayna Jain,
linuxppc-dev, Daniel Axtens
From: Eric Biggers <ebiggers@google.com>
Convert the VMX implementations of AES-CBC, AES-CTR, and AES-XTS from
the deprecated "blkcipher" API to the "skcipher" API.
As part of this, I moved the skcipher_request for the fallback algorithm
off the stack and into the request context of the parent algorithm.
I tested this in a PowerPC VM with CONFIG_CRYPTO_MANAGER_EXTRA_TESTS=y.
Signed-off-by: Eric Biggers <ebiggers@google.com>
---
drivers/crypto/vmx/aes_cbc.c | 183 ++++++++++++---------------------
drivers/crypto/vmx/aes_ctr.c | 165 +++++++++++++----------------
drivers/crypto/vmx/aes_xts.c | 175 ++++++++++++++-----------------
drivers/crypto/vmx/aesp8-ppc.h | 2 -
drivers/crypto/vmx/vmx.c | 72 +++++++------
5 files changed, 252 insertions(+), 345 deletions(-)
diff --git a/drivers/crypto/vmx/aes_cbc.c b/drivers/crypto/vmx/aes_cbc.c
index dae8af3c46dce..92e75a05d6a9e 100644
--- a/drivers/crypto/vmx/aes_cbc.c
+++ b/drivers/crypto/vmx/aes_cbc.c
@@ -7,64 +7,52 @@
* Author: Marcelo Henrique Cerri <mhcerri@br.ibm.com>
*/
-#include <linux/types.h>
-#include <linux/err.h>
-#include <linux/crypto.h>
-#include <linux/delay.h>
#include <asm/simd.h>
#include <asm/switch_to.h>
#include <crypto/aes.h>
#include <crypto/internal/simd.h>
-#include <crypto/scatterwalk.h>
-#include <crypto/skcipher.h>
+#include <crypto/internal/skcipher.h>
#include "aesp8-ppc.h"
struct p8_aes_cbc_ctx {
- struct crypto_sync_skcipher *fallback;
+ struct crypto_skcipher *fallback;
struct aes_key enc_key;
struct aes_key dec_key;
};
-static int p8_aes_cbc_init(struct crypto_tfm *tfm)
+static int p8_aes_cbc_init(struct crypto_skcipher *tfm)
{
- const char *alg = crypto_tfm_alg_name(tfm);
- struct crypto_sync_skcipher *fallback;
- struct p8_aes_cbc_ctx *ctx = crypto_tfm_ctx(tfm);
-
- fallback = crypto_alloc_sync_skcipher(alg, 0,
- CRYPTO_ALG_NEED_FALLBACK);
+ struct p8_aes_cbc_ctx *ctx = crypto_skcipher_ctx(tfm);
+ struct crypto_skcipher *fallback;
+ fallback = crypto_alloc_skcipher("cbc(aes)", 0,
+ CRYPTO_ALG_NEED_FALLBACK |
+ CRYPTO_ALG_ASYNC);
if (IS_ERR(fallback)) {
- printk(KERN_ERR
- "Failed to allocate transformation for '%s': %ld\n",
- alg, PTR_ERR(fallback));
+ pr_err("Failed to allocate cbc(aes) fallback: %ld\n",
+ PTR_ERR(fallback));
return PTR_ERR(fallback);
}
- crypto_sync_skcipher_set_flags(
- fallback,
- crypto_skcipher_get_flags((struct crypto_skcipher *)tfm));
+ crypto_skcipher_set_reqsize(tfm, sizeof(struct skcipher_request) +
+ crypto_skcipher_reqsize(fallback));
ctx->fallback = fallback;
-
return 0;
}
-static void p8_aes_cbc_exit(struct crypto_tfm *tfm)
+static void p8_aes_cbc_exit(struct crypto_skcipher *tfm)
{
- struct p8_aes_cbc_ctx *ctx = crypto_tfm_ctx(tfm);
+ struct p8_aes_cbc_ctx *ctx = crypto_skcipher_ctx(tfm);
- if (ctx->fallback) {
- crypto_free_sync_skcipher(ctx->fallback);
- ctx->fallback = NULL;
- }
+ crypto_free_skcipher(ctx->fallback);
}
-static int p8_aes_cbc_setkey(struct crypto_tfm *tfm, const u8 *key,
+static int p8_aes_cbc_setkey(struct crypto_skcipher *tfm, const u8 *key,
unsigned int keylen)
{
+ struct p8_aes_cbc_ctx *ctx = crypto_skcipher_ctx(tfm);
int ret;
- struct p8_aes_cbc_ctx *ctx = crypto_tfm_ctx(tfm);
preempt_disable();
pagefault_disable();
@@ -75,108 +63,71 @@ static int p8_aes_cbc_setkey(struct crypto_tfm *tfm, const u8 *key,
pagefault_enable();
preempt_enable();
- ret |= crypto_sync_skcipher_setkey(ctx->fallback, key, keylen);
+ ret |= crypto_skcipher_setkey(ctx->fallback, key, keylen);
return ret ? -EINVAL : 0;
}
-static int p8_aes_cbc_encrypt(struct blkcipher_desc *desc,
- struct scatterlist *dst,
- struct scatterlist *src, unsigned int nbytes)
+static int p8_aes_cbc_crypt(struct skcipher_request *req, int enc)
{
+ struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
+ const struct p8_aes_cbc_ctx *ctx = crypto_skcipher_ctx(tfm);
+ struct skcipher_walk walk;
+ unsigned int nbytes;
int ret;
- struct blkcipher_walk walk;
- struct p8_aes_cbc_ctx *ctx =
- crypto_tfm_ctx(crypto_blkcipher_tfm(desc->tfm));
if (!crypto_simd_usable()) {
- SYNC_SKCIPHER_REQUEST_ON_STACK(req, ctx->fallback);
- skcipher_request_set_sync_tfm(req, ctx->fallback);
- skcipher_request_set_callback(req, desc->flags, NULL, NULL);
- skcipher_request_set_crypt(req, src, dst, nbytes, desc->info);
- ret = crypto_skcipher_encrypt(req);
- skcipher_request_zero(req);
- } else {
- blkcipher_walk_init(&walk, dst, src, nbytes);
- ret = blkcipher_walk_virt(desc, &walk);
- while ((nbytes = walk.nbytes)) {
- preempt_disable();
- pagefault_disable();
- enable_kernel_vsx();
- aes_p8_cbc_encrypt(walk.src.virt.addr,
- walk.dst.virt.addr,
- nbytes & AES_BLOCK_MASK,
- &ctx->enc_key, walk.iv, 1);
- disable_kernel_vsx();
- pagefault_enable();
- preempt_enable();
-
- nbytes &= AES_BLOCK_SIZE - 1;
- ret = blkcipher_walk_done(desc, &walk, nbytes);
- }
+ struct skcipher_request *subreq = skcipher_request_ctx(req);
+
+ *subreq = *req;
+ skcipher_request_set_tfm(subreq, ctx->fallback);
+ return enc ? crypto_skcipher_encrypt(subreq) :
+ crypto_skcipher_decrypt(subreq);
}
+ ret = skcipher_walk_virt(&walk, req, false);
+ while ((nbytes = walk.nbytes) != 0) {
+ preempt_disable();
+ pagefault_disable();
+ enable_kernel_vsx();
+ aes_p8_cbc_encrypt(walk.src.virt.addr,
+ walk.dst.virt.addr,
+ round_down(nbytes, AES_BLOCK_SIZE),
+ enc ? &ctx->enc_key : &ctx->dec_key,
+ walk.iv, enc);
+ disable_kernel_vsx();
+ pagefault_enable();
+ preempt_enable();
+
+ ret = skcipher_walk_done(&walk, nbytes % AES_BLOCK_SIZE);
+ }
return ret;
}
-static int p8_aes_cbc_decrypt(struct blkcipher_desc *desc,
- struct scatterlist *dst,
- struct scatterlist *src, unsigned int nbytes)
+static int p8_aes_cbc_encrypt(struct skcipher_request *req)
{
- int ret;
- struct blkcipher_walk walk;
- struct p8_aes_cbc_ctx *ctx =
- crypto_tfm_ctx(crypto_blkcipher_tfm(desc->tfm));
-
- if (!crypto_simd_usable()) {
- SYNC_SKCIPHER_REQUEST_ON_STACK(req, ctx->fallback);
- skcipher_request_set_sync_tfm(req, ctx->fallback);
- skcipher_request_set_callback(req, desc->flags, NULL, NULL);
- skcipher_request_set_crypt(req, src, dst, nbytes, desc->info);
- ret = crypto_skcipher_decrypt(req);
- skcipher_request_zero(req);
- } else {
- blkcipher_walk_init(&walk, dst, src, nbytes);
- ret = blkcipher_walk_virt(desc, &walk);
- while ((nbytes = walk.nbytes)) {
- preempt_disable();
- pagefault_disable();
- enable_kernel_vsx();
- aes_p8_cbc_encrypt(walk.src.virt.addr,
- walk.dst.virt.addr,
- nbytes & AES_BLOCK_MASK,
- &ctx->dec_key, walk.iv, 0);
- disable_kernel_vsx();
- pagefault_enable();
- preempt_enable();
-
- nbytes &= AES_BLOCK_SIZE - 1;
- ret = blkcipher_walk_done(desc, &walk, nbytes);
- }
- }
-
- return ret;
+ return p8_aes_cbc_crypt(req, 1);
}
+static int p8_aes_cbc_decrypt(struct skcipher_request *req)
+{
+ return p8_aes_cbc_crypt(req, 0);
+}
-struct crypto_alg p8_aes_cbc_alg = {
- .cra_name = "cbc(aes)",
- .cra_driver_name = "p8_aes_cbc",
- .cra_module = THIS_MODULE,
- .cra_priority = 2000,
- .cra_type = &crypto_blkcipher_type,
- .cra_flags = CRYPTO_ALG_TYPE_BLKCIPHER | CRYPTO_ALG_NEED_FALLBACK,
- .cra_alignmask = 0,
- .cra_blocksize = AES_BLOCK_SIZE,
- .cra_ctxsize = sizeof(struct p8_aes_cbc_ctx),
- .cra_init = p8_aes_cbc_init,
- .cra_exit = p8_aes_cbc_exit,
- .cra_blkcipher = {
- .ivsize = AES_BLOCK_SIZE,
- .min_keysize = AES_MIN_KEY_SIZE,
- .max_keysize = AES_MAX_KEY_SIZE,
- .setkey = p8_aes_cbc_setkey,
- .encrypt = p8_aes_cbc_encrypt,
- .decrypt = p8_aes_cbc_decrypt,
- },
+struct skcipher_alg p8_aes_cbc_alg = {
+ .base.cra_name = "cbc(aes)",
+ .base.cra_driver_name = "p8_aes_cbc",
+ .base.cra_module = THIS_MODULE,
+ .base.cra_priority = 2000,
+ .base.cra_flags = CRYPTO_ALG_NEED_FALLBACK,
+ .base.cra_blocksize = AES_BLOCK_SIZE,
+ .base.cra_ctxsize = sizeof(struct p8_aes_cbc_ctx),
+ .setkey = p8_aes_cbc_setkey,
+ .encrypt = p8_aes_cbc_encrypt,
+ .decrypt = p8_aes_cbc_decrypt,
+ .init = p8_aes_cbc_init,
+ .exit = p8_aes_cbc_exit,
+ .min_keysize = AES_MIN_KEY_SIZE,
+ .max_keysize = AES_MAX_KEY_SIZE,
+ .ivsize = AES_BLOCK_SIZE,
};
diff --git a/drivers/crypto/vmx/aes_ctr.c b/drivers/crypto/vmx/aes_ctr.c
index dc31101178446..c4d2809a5d9ee 100644
--- a/drivers/crypto/vmx/aes_ctr.c
+++ b/drivers/crypto/vmx/aes_ctr.c
@@ -7,62 +7,51 @@
* Author: Marcelo Henrique Cerri <mhcerri@br.ibm.com>
*/
-#include <linux/types.h>
-#include <linux/err.h>
-#include <linux/crypto.h>
-#include <linux/delay.h>
#include <asm/simd.h>
#include <asm/switch_to.h>
#include <crypto/aes.h>
#include <crypto/internal/simd.h>
-#include <crypto/scatterwalk.h>
-#include <crypto/skcipher.h>
+#include <crypto/internal/skcipher.h>
#include "aesp8-ppc.h"
struct p8_aes_ctr_ctx {
- struct crypto_sync_skcipher *fallback;
+ struct crypto_skcipher *fallback;
struct aes_key enc_key;
};
-static int p8_aes_ctr_init(struct crypto_tfm *tfm)
+static int p8_aes_ctr_init(struct crypto_skcipher *tfm)
{
- const char *alg = crypto_tfm_alg_name(tfm);
- struct crypto_sync_skcipher *fallback;
- struct p8_aes_ctr_ctx *ctx = crypto_tfm_ctx(tfm);
+ struct p8_aes_ctr_ctx *ctx = crypto_skcipher_ctx(tfm);
+ struct crypto_skcipher *fallback;
- fallback = crypto_alloc_sync_skcipher(alg, 0,
- CRYPTO_ALG_NEED_FALLBACK);
+ fallback = crypto_alloc_skcipher("ctr(aes)", 0,
+ CRYPTO_ALG_NEED_FALLBACK |
+ CRYPTO_ALG_ASYNC);
if (IS_ERR(fallback)) {
- printk(KERN_ERR
- "Failed to allocate transformation for '%s': %ld\n",
- alg, PTR_ERR(fallback));
+ pr_err("Failed to allocate ctr(aes) fallback: %ld\n",
+ PTR_ERR(fallback));
return PTR_ERR(fallback);
}
- crypto_sync_skcipher_set_flags(
- fallback,
- crypto_skcipher_get_flags((struct crypto_skcipher *)tfm));
+ crypto_skcipher_set_reqsize(tfm, sizeof(struct skcipher_request) +
+ crypto_skcipher_reqsize(fallback));
ctx->fallback = fallback;
-
return 0;
}
-static void p8_aes_ctr_exit(struct crypto_tfm *tfm)
+static void p8_aes_ctr_exit(struct crypto_skcipher *tfm)
{
- struct p8_aes_ctr_ctx *ctx = crypto_tfm_ctx(tfm);
+ struct p8_aes_ctr_ctx *ctx = crypto_skcipher_ctx(tfm);
- if (ctx->fallback) {
- crypto_free_sync_skcipher(ctx->fallback);
- ctx->fallback = NULL;
- }
+ crypto_free_skcipher(ctx->fallback);
}
-static int p8_aes_ctr_setkey(struct crypto_tfm *tfm, const u8 *key,
+static int p8_aes_ctr_setkey(struct crypto_skcipher *tfm, const u8 *key,
unsigned int keylen)
{
+ struct p8_aes_ctr_ctx *ctx = crypto_skcipher_ctx(tfm);
int ret;
- struct p8_aes_ctr_ctx *ctx = crypto_tfm_ctx(tfm);
preempt_disable();
pagefault_disable();
@@ -72,13 +61,13 @@ static int p8_aes_ctr_setkey(struct crypto_tfm *tfm, const u8 *key,
pagefault_enable();
preempt_enable();
- ret |= crypto_sync_skcipher_setkey(ctx->fallback, key, keylen);
+ ret |= crypto_skcipher_setkey(ctx->fallback, key, keylen);
return ret ? -EINVAL : 0;
}
-static void p8_aes_ctr_final(struct p8_aes_ctr_ctx *ctx,
- struct blkcipher_walk *walk)
+static void p8_aes_ctr_final(const struct p8_aes_ctr_ctx *ctx,
+ struct skcipher_walk *walk)
{
u8 *ctrblk = walk->iv;
u8 keystream[AES_BLOCK_SIZE];
@@ -98,77 +87,63 @@ static void p8_aes_ctr_final(struct p8_aes_ctr_ctx *ctx,
crypto_inc(ctrblk, AES_BLOCK_SIZE);
}
-static int p8_aes_ctr_crypt(struct blkcipher_desc *desc,
- struct scatterlist *dst,
- struct scatterlist *src, unsigned int nbytes)
+static int p8_aes_ctr_crypt(struct skcipher_request *req)
{
+ struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
+ const struct p8_aes_ctr_ctx *ctx = crypto_skcipher_ctx(tfm);
+ struct skcipher_walk walk;
+ unsigned int nbytes;
int ret;
- u64 inc;
- struct blkcipher_walk walk;
- struct p8_aes_ctr_ctx *ctx =
- crypto_tfm_ctx(crypto_blkcipher_tfm(desc->tfm));
if (!crypto_simd_usable()) {
- SYNC_SKCIPHER_REQUEST_ON_STACK(req, ctx->fallback);
- skcipher_request_set_sync_tfm(req, ctx->fallback);
- skcipher_request_set_callback(req, desc->flags, NULL, NULL);
- skcipher_request_set_crypt(req, src, dst, nbytes, desc->info);
- ret = crypto_skcipher_encrypt(req);
- skcipher_request_zero(req);
- } else {
- blkcipher_walk_init(&walk, dst, src, nbytes);
- ret = blkcipher_walk_virt_block(desc, &walk, AES_BLOCK_SIZE);
- while ((nbytes = walk.nbytes) >= AES_BLOCK_SIZE) {
- preempt_disable();
- pagefault_disable();
- enable_kernel_vsx();
- aes_p8_ctr32_encrypt_blocks(walk.src.virt.addr,
- walk.dst.virt.addr,
- (nbytes &
- AES_BLOCK_MASK) /
- AES_BLOCK_SIZE,
- &ctx->enc_key,
- walk.iv);
- disable_kernel_vsx();
- pagefault_enable();
- preempt_enable();
-
- /* We need to update IV mostly for last bytes/round */
- inc = (nbytes & AES_BLOCK_MASK) / AES_BLOCK_SIZE;
- if (inc > 0)
- while (inc--)
- crypto_inc(walk.iv, AES_BLOCK_SIZE);
-
- nbytes &= AES_BLOCK_SIZE - 1;
- ret = blkcipher_walk_done(desc, &walk, nbytes);
- }
- if (walk.nbytes) {
- p8_aes_ctr_final(ctx, &walk);
- ret = blkcipher_walk_done(desc, &walk, 0);
- }
+ struct skcipher_request *subreq = skcipher_request_ctx(req);
+
+ *subreq = *req;
+ skcipher_request_set_tfm(subreq, ctx->fallback);
+ return crypto_skcipher_encrypt(subreq);
}
+ ret = skcipher_walk_virt(&walk, req, false);
+ while ((nbytes = walk.nbytes) >= AES_BLOCK_SIZE) {
+ preempt_disable();
+ pagefault_disable();
+ enable_kernel_vsx();
+ aes_p8_ctr32_encrypt_blocks(walk.src.virt.addr,
+ walk.dst.virt.addr,
+ nbytes / AES_BLOCK_SIZE,
+ &ctx->enc_key, walk.iv);
+ disable_kernel_vsx();
+ pagefault_enable();
+ preempt_enable();
+
+ do {
+ crypto_inc(walk.iv, AES_BLOCK_SIZE);
+ } while ((nbytes -= AES_BLOCK_SIZE) >= AES_BLOCK_SIZE);
+
+ ret = skcipher_walk_done(&walk, nbytes);
+ }
+ if (nbytes) {
+ p8_aes_ctr_final(ctx, &walk);
+ ret = skcipher_walk_done(&walk, 0);
+ }
return ret;
}
-struct crypto_alg p8_aes_ctr_alg = {
- .cra_name = "ctr(aes)",
- .cra_driver_name = "p8_aes_ctr",
- .cra_module = THIS_MODULE,
- .cra_priority = 2000,
- .cra_type = &crypto_blkcipher_type,
- .cra_flags = CRYPTO_ALG_TYPE_BLKCIPHER | CRYPTO_ALG_NEED_FALLBACK,
- .cra_alignmask = 0,
- .cra_blocksize = 1,
- .cra_ctxsize = sizeof(struct p8_aes_ctr_ctx),
- .cra_init = p8_aes_ctr_init,
- .cra_exit = p8_aes_ctr_exit,
- .cra_blkcipher = {
- .ivsize = AES_BLOCK_SIZE,
- .min_keysize = AES_MIN_KEY_SIZE,
- .max_keysize = AES_MAX_KEY_SIZE,
- .setkey = p8_aes_ctr_setkey,
- .encrypt = p8_aes_ctr_crypt,
- .decrypt = p8_aes_ctr_crypt,
- },
+struct skcipher_alg p8_aes_ctr_alg = {
+ .base.cra_name = "ctr(aes)",
+ .base.cra_driver_name = "p8_aes_ctr",
+ .base.cra_module = THIS_MODULE,
+ .base.cra_priority = 2000,
+ .base.cra_flags = CRYPTO_ALG_NEED_FALLBACK,
+ .base.cra_blocksize = 1,
+ .base.cra_ctxsize = sizeof(struct p8_aes_ctr_ctx),
+ .setkey = p8_aes_ctr_setkey,
+ .encrypt = p8_aes_ctr_crypt,
+ .decrypt = p8_aes_ctr_crypt,
+ .init = p8_aes_ctr_init,
+ .exit = p8_aes_ctr_exit,
+ .min_keysize = AES_MIN_KEY_SIZE,
+ .max_keysize = AES_MAX_KEY_SIZE,
+ .ivsize = AES_BLOCK_SIZE,
+ .chunksize = AES_BLOCK_SIZE,
};
diff --git a/drivers/crypto/vmx/aes_xts.c b/drivers/crypto/vmx/aes_xts.c
index aee1339f134ec..965d8e03321cd 100644
--- a/drivers/crypto/vmx/aes_xts.c
+++ b/drivers/crypto/vmx/aes_xts.c
@@ -7,67 +7,56 @@
* Author: Leonidas S. Barbosa <leosilva@linux.vnet.ibm.com>
*/
-#include <linux/types.h>
-#include <linux/err.h>
-#include <linux/crypto.h>
-#include <linux/delay.h>
#include <asm/simd.h>
#include <asm/switch_to.h>
#include <crypto/aes.h>
#include <crypto/internal/simd.h>
-#include <crypto/scatterwalk.h>
+#include <crypto/internal/skcipher.h>
#include <crypto/xts.h>
-#include <crypto/skcipher.h>
#include "aesp8-ppc.h"
struct p8_aes_xts_ctx {
- struct crypto_sync_skcipher *fallback;
+ struct crypto_skcipher *fallback;
struct aes_key enc_key;
struct aes_key dec_key;
struct aes_key tweak_key;
};
-static int p8_aes_xts_init(struct crypto_tfm *tfm)
+static int p8_aes_xts_init(struct crypto_skcipher *tfm)
{
- const char *alg = crypto_tfm_alg_name(tfm);
- struct crypto_sync_skcipher *fallback;
- struct p8_aes_xts_ctx *ctx = crypto_tfm_ctx(tfm);
+ struct p8_aes_xts_ctx *ctx = crypto_skcipher_ctx(tfm);
+ struct crypto_skcipher *fallback;
- fallback = crypto_alloc_sync_skcipher(alg, 0,
- CRYPTO_ALG_NEED_FALLBACK);
+ fallback = crypto_alloc_skcipher("xts(aes)", 0,
+ CRYPTO_ALG_NEED_FALLBACK |
+ CRYPTO_ALG_ASYNC);
if (IS_ERR(fallback)) {
- printk(KERN_ERR
- "Failed to allocate transformation for '%s': %ld\n",
- alg, PTR_ERR(fallback));
+ pr_err("Failed to allocate xts(aes) fallback: %ld\n",
+ PTR_ERR(fallback));
return PTR_ERR(fallback);
}
- crypto_sync_skcipher_set_flags(
- fallback,
- crypto_skcipher_get_flags((struct crypto_skcipher *)tfm));
+ crypto_skcipher_set_reqsize(tfm, sizeof(struct skcipher_request) +
+ crypto_skcipher_reqsize(fallback));
ctx->fallback = fallback;
-
return 0;
}
-static void p8_aes_xts_exit(struct crypto_tfm *tfm)
+static void p8_aes_xts_exit(struct crypto_skcipher *tfm)
{
- struct p8_aes_xts_ctx *ctx = crypto_tfm_ctx(tfm);
+ struct p8_aes_xts_ctx *ctx = crypto_skcipher_ctx(tfm);
- if (ctx->fallback) {
- crypto_free_sync_skcipher(ctx->fallback);
- ctx->fallback = NULL;
- }
+ crypto_free_skcipher(ctx->fallback);
}
-static int p8_aes_xts_setkey(struct crypto_tfm *tfm, const u8 *key,
+static int p8_aes_xts_setkey(struct crypto_skcipher *tfm, const u8 *key,
unsigned int keylen)
{
+ struct p8_aes_xts_ctx *ctx = crypto_skcipher_ctx(tfm);
int ret;
- struct p8_aes_xts_ctx *ctx = crypto_tfm_ctx(tfm);
- ret = xts_check_key(tfm, key, keylen);
+ ret = xts_verify_key(tfm, key, keylen);
if (ret)
return ret;
@@ -81,100 +70,90 @@ static int p8_aes_xts_setkey(struct crypto_tfm *tfm, const u8 *key,
pagefault_enable();
preempt_enable();
- ret |= crypto_sync_skcipher_setkey(ctx->fallback, key, keylen);
+ ret |= crypto_skcipher_setkey(ctx->fallback, key, keylen);
return ret ? -EINVAL : 0;
}
-static int p8_aes_xts_crypt(struct blkcipher_desc *desc,
- struct scatterlist *dst,
- struct scatterlist *src,
- unsigned int nbytes, int enc)
+static int p8_aes_xts_crypt(struct skcipher_request *req, int enc)
{
- int ret;
+ struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
+ const struct p8_aes_xts_ctx *ctx = crypto_skcipher_ctx(tfm);
+ struct skcipher_walk walk;
+ unsigned int nbytes;
u8 tweak[AES_BLOCK_SIZE];
- u8 *iv;
- struct blkcipher_walk walk;
- struct p8_aes_xts_ctx *ctx =
- crypto_tfm_ctx(crypto_blkcipher_tfm(desc->tfm));
+ int ret;
if (!crypto_simd_usable()) {
- SYNC_SKCIPHER_REQUEST_ON_STACK(req, ctx->fallback);
- skcipher_request_set_sync_tfm(req, ctx->fallback);
- skcipher_request_set_callback(req, desc->flags, NULL, NULL);
- skcipher_request_set_crypt(req, src, dst, nbytes, desc->info);
- ret = enc? crypto_skcipher_encrypt(req) : crypto_skcipher_decrypt(req);
- skcipher_request_zero(req);
- } else {
- blkcipher_walk_init(&walk, dst, src, nbytes);
+ struct skcipher_request *subreq = skcipher_request_ctx(req);
+
+ *subreq = *req;
+ skcipher_request_set_tfm(subreq, ctx->fallback);
+ return enc ? crypto_skcipher_encrypt(subreq) :
+ crypto_skcipher_decrypt(subreq);
+ }
+
+ ret = skcipher_walk_virt(&walk, req, false);
+ if (ret)
+ return ret;
+
+ preempt_disable();
+ pagefault_disable();
+ enable_kernel_vsx();
- ret = blkcipher_walk_virt(desc, &walk);
+ aes_p8_encrypt(walk.iv, tweak, &ctx->tweak_key);
+
+ disable_kernel_vsx();
+ pagefault_enable();
+ preempt_enable();
+ while ((nbytes = walk.nbytes) != 0) {
preempt_disable();
pagefault_disable();
enable_kernel_vsx();
-
- iv = walk.iv;
- memset(tweak, 0, AES_BLOCK_SIZE);
- aes_p8_encrypt(iv, tweak, &ctx->tweak_key);
-
+ if (enc)
+ aes_p8_xts_encrypt(walk.src.virt.addr,
+ walk.dst.virt.addr,
+ round_down(nbytes, AES_BLOCK_SIZE),
+ &ctx->enc_key, NULL, tweak);
+ else
+ aes_p8_xts_decrypt(walk.src.virt.addr,
+ walk.dst.virt.addr,
+ round_down(nbytes, AES_BLOCK_SIZE),
+ &ctx->dec_key, NULL, tweak);
disable_kernel_vsx();
pagefault_enable();
preempt_enable();
- while ((nbytes = walk.nbytes)) {
- preempt_disable();
- pagefault_disable();
- enable_kernel_vsx();
- if (enc)
- aes_p8_xts_encrypt(walk.src.virt.addr, walk.dst.virt.addr,
- nbytes & AES_BLOCK_MASK, &ctx->enc_key, NULL, tweak);
- else
- aes_p8_xts_decrypt(walk.src.virt.addr, walk.dst.virt.addr,
- nbytes & AES_BLOCK_MASK, &ctx->dec_key, NULL, tweak);
- disable_kernel_vsx();
- pagefault_enable();
- preempt_enable();
-
- nbytes &= AES_BLOCK_SIZE - 1;
- ret = blkcipher_walk_done(desc, &walk, nbytes);
- }
+ ret = skcipher_walk_done(&walk, nbytes % AES_BLOCK_SIZE);
}
return ret;
}
-static int p8_aes_xts_encrypt(struct blkcipher_desc *desc,
- struct scatterlist *dst,
- struct scatterlist *src, unsigned int nbytes)
+static int p8_aes_xts_encrypt(struct skcipher_request *req)
{
- return p8_aes_xts_crypt(desc, dst, src, nbytes, 1);
+ return p8_aes_xts_crypt(req, 1);
}
-static int p8_aes_xts_decrypt(struct blkcipher_desc *desc,
- struct scatterlist *dst,
- struct scatterlist *src, unsigned int nbytes)
+static int p8_aes_xts_decrypt(struct skcipher_request *req)
{
- return p8_aes_xts_crypt(desc, dst, src, nbytes, 0);
+ return p8_aes_xts_crypt(req, 0);
}
-struct crypto_alg p8_aes_xts_alg = {
- .cra_name = "xts(aes)",
- .cra_driver_name = "p8_aes_xts",
- .cra_module = THIS_MODULE,
- .cra_priority = 2000,
- .cra_type = &crypto_blkcipher_type,
- .cra_flags = CRYPTO_ALG_TYPE_BLKCIPHER | CRYPTO_ALG_NEED_FALLBACK,
- .cra_alignmask = 0,
- .cra_blocksize = AES_BLOCK_SIZE,
- .cra_ctxsize = sizeof(struct p8_aes_xts_ctx),
- .cra_init = p8_aes_xts_init,
- .cra_exit = p8_aes_xts_exit,
- .cra_blkcipher = {
- .ivsize = AES_BLOCK_SIZE,
- .min_keysize = 2 * AES_MIN_KEY_SIZE,
- .max_keysize = 2 * AES_MAX_KEY_SIZE,
- .setkey = p8_aes_xts_setkey,
- .encrypt = p8_aes_xts_encrypt,
- .decrypt = p8_aes_xts_decrypt,
- }
+struct skcipher_alg p8_aes_xts_alg = {
+ .base.cra_name = "xts(aes)",
+ .base.cra_driver_name = "p8_aes_xts",
+ .base.cra_module = THIS_MODULE,
+ .base.cra_priority = 2000,
+ .base.cra_flags = CRYPTO_ALG_NEED_FALLBACK,
+ .base.cra_blocksize = AES_BLOCK_SIZE,
+ .base.cra_ctxsize = sizeof(struct p8_aes_xts_ctx),
+ .setkey = p8_aes_xts_setkey,
+ .encrypt = p8_aes_xts_encrypt,
+ .decrypt = p8_aes_xts_decrypt,
+ .init = p8_aes_xts_init,
+ .exit = p8_aes_xts_exit,
+ .min_keysize = 2 * AES_MIN_KEY_SIZE,
+ .max_keysize = 2 * AES_MAX_KEY_SIZE,
+ .ivsize = AES_BLOCK_SIZE,
};
diff --git a/drivers/crypto/vmx/aesp8-ppc.h b/drivers/crypto/vmx/aesp8-ppc.h
index 349646b73754f..01774a4d26a25 100644
--- a/drivers/crypto/vmx/aesp8-ppc.h
+++ b/drivers/crypto/vmx/aesp8-ppc.h
@@ -2,8 +2,6 @@
#include <linux/types.h>
#include <crypto/aes.h>
-#define AES_BLOCK_MASK (~(AES_BLOCK_SIZE-1))
-
struct aes_key {
u8 key[AES_MAX_KEYLENGTH];
int rounds;
diff --git a/drivers/crypto/vmx/vmx.c b/drivers/crypto/vmx/vmx.c
index abd89c2bcec4d..eff03fdf964f2 100644
--- a/drivers/crypto/vmx/vmx.c
+++ b/drivers/crypto/vmx/vmx.c
@@ -15,54 +15,58 @@
#include <linux/crypto.h>
#include <asm/cputable.h>
#include <crypto/internal/hash.h>
+#include <crypto/internal/skcipher.h>
extern struct shash_alg p8_ghash_alg;
extern struct crypto_alg p8_aes_alg;
-extern struct crypto_alg p8_aes_cbc_alg;
-extern struct crypto_alg p8_aes_ctr_alg;
-extern struct crypto_alg p8_aes_xts_alg;
-static struct crypto_alg *algs[] = {
- &p8_aes_alg,
- &p8_aes_cbc_alg,
- &p8_aes_ctr_alg,
- &p8_aes_xts_alg,
- NULL,
-};
+extern struct skcipher_alg p8_aes_cbc_alg;
+extern struct skcipher_alg p8_aes_ctr_alg;
+extern struct skcipher_alg p8_aes_xts_alg;
static int __init p8_init(void)
{
- int ret = 0;
- struct crypto_alg **alg_it;
+ int ret;
- for (alg_it = algs; *alg_it; alg_it++) {
- ret = crypto_register_alg(*alg_it);
- printk(KERN_INFO "crypto_register_alg '%s' = %d\n",
- (*alg_it)->cra_name, ret);
- if (ret) {
- for (alg_it--; alg_it >= algs; alg_it--)
- crypto_unregister_alg(*alg_it);
- break;
- }
- }
+ ret = crypto_register_shash(&p8_ghash_alg);
if (ret)
- return ret;
+ goto err;
- ret = crypto_register_shash(&p8_ghash_alg);
- if (ret) {
- for (alg_it = algs; *alg_it; alg_it++)
- crypto_unregister_alg(*alg_it);
- }
+ ret = crypto_register_alg(&p8_aes_alg);
+ if (ret)
+ goto err_unregister_ghash;
+
+ ret = crypto_register_skcipher(&p8_aes_cbc_alg);
+ if (ret)
+ goto err_unregister_aes;
+
+ ret = crypto_register_skcipher(&p8_aes_ctr_alg);
+ if (ret)
+ goto err_unregister_aes_cbc;
+
+ ret = crypto_register_skcipher(&p8_aes_xts_alg);
+ if (ret)
+ goto err_unregister_aes_ctr;
+
+ return 0;
+
+err_unregister_aes_ctr:
+ crypto_unregister_skcipher(&p8_aes_ctr_alg);
+err_unregister_aes_cbc:
+ crypto_unregister_skcipher(&p8_aes_cbc_alg);
+err_unregister_aes:
+ crypto_unregister_alg(&p8_aes_alg);
+err_unregister_ghash:
+ crypto_unregister_shash(&p8_ghash_alg);
+err:
return ret;
}
static void __exit p8_exit(void)
{
- struct crypto_alg **alg_it;
-
- for (alg_it = algs; *alg_it; alg_it++) {
- printk(KERN_INFO "Removing '%s'\n", (*alg_it)->cra_name);
- crypto_unregister_alg(*alg_it);
- }
+ crypto_unregister_skcipher(&p8_aes_xts_alg);
+ crypto_unregister_skcipher(&p8_aes_ctr_alg);
+ crypto_unregister_skcipher(&p8_aes_cbc_alg);
+ crypto_unregister_alg(&p8_aes_alg);
crypto_unregister_shash(&p8_ghash_alg);
}
--
2.21.0.1020.gf2820cf01a-goog
^ permalink raw reply related
* [PATCH] crypto: vmx - convert to SPDX license identifiers
From: Eric Biggers @ 2019-05-20 16:42 UTC (permalink / raw)
To: linux-crypto, Herbert Xu
Cc: Breno Leitão, Paulo Flabiano Smorigo, Nayna Jain,
linuxppc-dev, Daniel Axtens
From: Eric Biggers <ebiggers@google.com>
Remove the boilerplate license text and replace it with the equivalent
SPDX license identifier.
Signed-off-by: Eric Biggers <ebiggers@google.com>
---
drivers/crypto/vmx/aes.c | 14 +-------------
drivers/crypto/vmx/aes_cbc.c | 14 +-------------
drivers/crypto/vmx/aes_ctr.c | 14 +-------------
drivers/crypto/vmx/aes_xts.c | 14 +-------------
drivers/crypto/vmx/vmx.c | 14 +-------------
5 files changed, 5 insertions(+), 65 deletions(-)
diff --git a/drivers/crypto/vmx/aes.c b/drivers/crypto/vmx/aes.c
index 603a620819941..2e9476158df49 100644
--- a/drivers/crypto/vmx/aes.c
+++ b/drivers/crypto/vmx/aes.c
@@ -1,21 +1,9 @@
+// SPDX-License-Identifier: GPL-2.0
/**
* AES routines supporting VMX instructions on the Power 8
*
* Copyright (C) 2015 International Business Machines Inc.
*
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation; version 2 only.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with this program; if not, write to the Free Software
- * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
- *
* Author: Marcelo Henrique Cerri <mhcerri@br.ibm.com>
*/
diff --git a/drivers/crypto/vmx/aes_cbc.c b/drivers/crypto/vmx/aes_cbc.c
index a1a9a6f0d42cf..dae8af3c46dce 100644
--- a/drivers/crypto/vmx/aes_cbc.c
+++ b/drivers/crypto/vmx/aes_cbc.c
@@ -1,21 +1,9 @@
+// SPDX-License-Identifier: GPL-2.0
/**
* AES CBC routines supporting VMX instructions on the Power 8
*
* Copyright (C) 2015 International Business Machines Inc.
*
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation; version 2 only.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with this program; if not, write to the Free Software
- * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
- *
* Author: Marcelo Henrique Cerri <mhcerri@br.ibm.com>
*/
diff --git a/drivers/crypto/vmx/aes_ctr.c b/drivers/crypto/vmx/aes_ctr.c
index 192a53512f5e8..dc31101178446 100644
--- a/drivers/crypto/vmx/aes_ctr.c
+++ b/drivers/crypto/vmx/aes_ctr.c
@@ -1,21 +1,9 @@
+// SPDX-License-Identifier: GPL-2.0
/**
* AES CTR routines supporting VMX instructions on the Power 8
*
* Copyright (C) 2015 International Business Machines Inc.
*
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation; version 2 only.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with this program; if not, write to the Free Software
- * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
- *
* Author: Marcelo Henrique Cerri <mhcerri@br.ibm.com>
*/
diff --git a/drivers/crypto/vmx/aes_xts.c b/drivers/crypto/vmx/aes_xts.c
index 00d412d811ae6..aee1339f134ec 100644
--- a/drivers/crypto/vmx/aes_xts.c
+++ b/drivers/crypto/vmx/aes_xts.c
@@ -1,21 +1,9 @@
+// SPDX-License-Identifier: GPL-2.0
/**
* AES XTS routines supporting VMX In-core instructions on Power 8
*
* Copyright (C) 2015 International Business Machines Inc.
*
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundations; version 2 only.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY of FITNESS FOR A PARTICUPAR PURPOSE. See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with this program; if not, write to the Free Software
- * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
- *
* Author: Leonidas S. Barbosa <leosilva@linux.vnet.ibm.com>
*/
diff --git a/drivers/crypto/vmx/vmx.c b/drivers/crypto/vmx/vmx.c
index a9f5198306155..abd89c2bcec4d 100644
--- a/drivers/crypto/vmx/vmx.c
+++ b/drivers/crypto/vmx/vmx.c
@@ -1,21 +1,9 @@
+// SPDX-License-Identifier: GPL-2.0
/**
* Routines supporting VMX instructions on the Power 8
*
* Copyright (C) 2015 International Business Machines Inc.
*
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation; version 2 only.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with this program; if not, write to the Free Software
- * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
- *
* Author: Marcelo Henrique Cerri <mhcerri@br.ibm.com>
*/
--
2.21.0.1020.gf2820cf01a-goog
^ permalink raw reply related
* Re: [PATCH] crypto: vmx - CTR: always increment IV as quadword
From: Eric Biggers @ 2019-05-20 16:39 UTC (permalink / raw)
To: Daniel Axtens
Cc: leo.barbosa, Herbert Xu, Stephan Mueller, nayna, omosnacek,
leitao, pfsmorigo, linux-crypto, marcelo.cerri, gcwilson,
linuxppc-dev
In-Reply-To: <87r28tzy1i.fsf@dja-thinkpad.axtens.net>
On Mon, May 20, 2019 at 11:59:05AM +1000, Daniel Axtens wrote:
> Daniel Axtens <dja@axtens.net> writes:
>
> > The kernel self-tests picked up an issue with CTR mode:
> > alg: skcipher: p8_aes_ctr encryption test failed (wrong result) on test vector 3, cfg="uneven misaligned splits, may sleep"
> >
> > Test vector 3 has an IV of FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFD, so
> > after 3 increments it should wrap around to 0.
> >
> > In the aesp8-ppc code from OpenSSL, there are two paths that
> > increment IVs: the bulk (8 at a time) path, and the individual
> > path which is used when there are fewer than 8 AES blocks to
> > process.
> >
> > In the bulk path, the IV is incremented with vadduqm: "Vector
> > Add Unsigned Quadword Modulo", which does 128-bit addition.
> >
> > In the individual path, however, the IV is incremented with
> > vadduwm: "Vector Add Unsigned Word Modulo", which instead
> > does 4 32-bit additions. Thus the IV would instead become
> > FFFFFFFFFFFFFFFFFFFFFFFF00000000, throwing off the result.
> >
> > Use vadduqm.
> >
> > This was probably a typo originally, what with q and w being
> > adjacent. It is a pretty narrow edge case: I am really
> > impressed by the quality of the kernel self-tests!
> >
> > Fixes: 5c380d623ed3 ("crypto: vmx - Add support for VMS instructions by ASM")
> > Cc: stable@vger.kernel.org
> > Signed-off-by: Daniel Axtens <dja@axtens.net>
> >
> > ---
> >
> > I'll pass this along internally to get it into OpenSSL as well.
>
> I passed this along to OpenSSL and got pretty comprehensively schooled:
> https://github.com/openssl/openssl/pull/8942
>
> It seems we tweak the openssl code to use a 128-bit counter, whereas
> the original code was in fact designed for a 32-bit counter. We must
> have changed the vaddu instruction in the bulk path but not in the
> individual path, as they're both vadduwm (4x32-bit) upstream.
>
> I think this change is still correct with regards to the kernel,
> but I guess it's probably something where I should have done a more
> thorough read of the documentation before diving in to the code, and
> perhaps we should note it in the code somewhere too. Ah well.
>
> Regards,
> Daniel
>
Ah, I didn't realize there are multiple conventions for CTR. Yes, all CTR
implementations in the kernel follow the 128-bit convention, and evidently the
test vectors test for that. Apparently the VMX OpenSSL code got incompletely
changed from the 32-bit convention by this commit, so that's what you're fixing:
commit 1d4aa0b4c1816e8ca92a6aadb0d8f6b43c56c0d0
Author: Leonidas Da Silva Barbosa <leosilva@linux.vnet.ibm.com>
Date: Fri Aug 14 10:12:22 2015 -0300
crypto: vmx - Fixing AES-CTR counter bug
AES-CTR is using a counter 8bytes-8bytes what miss match with
kernel specs.
In the previous code a vadduwm was done to increment counter.
Replacing this for a vadduqm now considering both cases counter
8-8 bytes and full 16bytes.
A comment in the code would indeed be helpful.
Note that the kernel doesn't currently need a 32-bit CTR implementation for GCM
like OpenSSL does, because the kernel currently only supports 12-byte IVs with
GCM. So the low 32 bits of the counter start at 1 and don't overflow,
regardless of whether the counter is incremented mod 2^32 or mod 2^128.
- Eric
^ permalink raw reply
* Re: [PATCH 5/5] asm-generic: remove ptrace.h
From: Oleg Nesterov @ 2019-05-20 16:03 UTC (permalink / raw)
To: Christoph Hellwig
Cc: linux-arch, Arnd Bergmann, linux-sh, x86, linux-mips,
linux-kernel, linuxppc-dev, linux-arm-kernel
In-Reply-To: <20190520060018.25569-6-hch@lst.de>
On 05/20, Christoph Hellwig wrote:
>
> No one is using this header anymore.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> Acked-by: Arnd Bergmann <arnd@arndb.de>
> ---
> MAINTAINERS | 1 -
> arch/mips/include/asm/ptrace.h | 5 ---
> include/asm-generic/ptrace.h | 74 ----------------------------------
> 3 files changed, 80 deletions(-)
> delete mode 100644 include/asm-generic/ptrace.h
Acked-by: Oleg Nesterov <oleg@redhat.com>
^ permalink raw reply
* Re: [PATCH 4/5] x86: don't use asm-generic/ptrace.h
From: Oleg Nesterov @ 2019-05-20 16:02 UTC (permalink / raw)
To: Christoph Hellwig
Cc: linux-arch, Arnd Bergmann, linux-sh, x86, linux-mips,
linux-kernel, linuxppc-dev, Ingo Molnar, linux-arm-kernel
In-Reply-To: <20190520060018.25569-5-hch@lst.de>
On 05/20, Christoph Hellwig wrote:
>
> Doing the indirection through macros for the regs accessors just
> makes them harder to read, so implement the helpers directly.
Acked-by: Oleg Nesterov <oleg@redhat.com>
^ permalink raw reply
* [PATCH v3 1/2] pid: add pidfd_open()
From: Christian Brauner @ 2019-05-20 15:56 UTC (permalink / raw)
To: jannh, oleg, viro, torvalds, linux-kernel, arnd
Cc: linux-ia64, linux-sh, linux-mips, dhowells, joel, linux-kselftest,
sparclinux, elena.reshetova, linux-arch, linux-s390, dancol,
Christian Brauner, kernel-team, serge, linux-xtensa, keescook,
linux-m68k, luto, tglx, surenb, linux-arm-kernel, linux-parisc,
linux-api, cyphar, luto, ebiederm, linux-alpha, akpm,
linuxppc-dev
This adds the pidfd_open() syscall. It allows a caller to retrieve pollable
pidfds for a process which did not get created via CLONE_PIDFD, i.e. for a
process that is created via traditional fork()/clone() calls that is only
referenced by a PID:
int pidfd = pidfd_open(1234, 0);
ret = pidfd_send_signal(pidfd, SIGSTOP, NULL, 0);
With the introduction of pidfds through CLONE_PIDFD it is possible to
created pidfds at process creation time.
However, a lot of processes get created with traditional PID-based calls
such as fork() or clone() (without CLONE_PIDFD). For these processes a
caller can currently not create a pollable pidfd. This is a problem for
Android's low memory killer (LMK) and service managers such as systemd.
Both are examples of tools that want to make use of pidfds to get reliable
notification of process exit for non-parents (pidfd polling) and race-free
signal sending (pidfd_send_signal()). They intend to switch to this API for
process supervision/management as soon as possible. Having no way to get
pollable pidfds from PID-only processes is one of the biggest blockers for
them in adopting this api. With pidfd_open() making it possible to retrieve
pidfds for PID-based processes we enable them to adopt this api.
In line with Arnd's recent changes to consolidate syscall numbers across
architectures, I have added the pidfd_open() syscall to all architectures
at the same time.
Signed-off-by: Christian Brauner <christian@brauner.io>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Joel Fernandes (Google) <joel@joelfernandes.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jann Horn <jannh@google.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Andy Lutomirsky <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Aleksa Sarai <cyphar@cyphar.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: linux-api@vger.kernel.org
---
v1:
- kbuild test robot <lkp@intel.com>:
- add missing entry for pidfd_open to arch/arm/tools/syscall.tbl
- Oleg Nesterov <oleg@redhat.com>:
- use simpler thread-group leader check
v2:
- Oleg Nesterov <oleg@redhat.com>:
- avoid using additional variable
- remove unneeded comment
- Arnd Bergmann <arnd@arndb.de>:
- switch from 428 to 434 since the new mount api has taken it
- bump syscall numbers in arch/arm64/include/asm/unistd.h
- Joel Fernandes (Google) <joel@joelfernandes.org>:
- switch from ESRCH to EINVAL when the passed-in pid does not refer to a
thread-group leader
- Christian Brauner <christian@brauner.io>:
- rebase on v5.2-rc1
- adapt syscall number to account for new mount api syscalls
v3:
- Arnd Bergmann <arnd@arndb.de>:
- add missing syscall entries for mips-o32 and mips-n64
---
arch/alpha/kernel/syscalls/syscall.tbl | 1 +
arch/arm/tools/syscall.tbl | 1 +
arch/arm64/include/asm/unistd.h | 2 +-
arch/arm64/include/asm/unistd32.h | 2 +
arch/ia64/kernel/syscalls/syscall.tbl | 1 +
arch/m68k/kernel/syscalls/syscall.tbl | 1 +
arch/microblaze/kernel/syscalls/syscall.tbl | 1 +
arch/mips/kernel/syscalls/syscall_n32.tbl | 1 +
arch/mips/kernel/syscalls/syscall_n64.tbl | 1 +
arch/mips/kernel/syscalls/syscall_o32.tbl | 1 +
arch/parisc/kernel/syscalls/syscall.tbl | 1 +
arch/powerpc/kernel/syscalls/syscall.tbl | 1 +
arch/s390/kernel/syscalls/syscall.tbl | 1 +
arch/sh/kernel/syscalls/syscall.tbl | 1 +
arch/sparc/kernel/syscalls/syscall.tbl | 1 +
arch/x86/entry/syscalls/syscall_32.tbl | 1 +
arch/x86/entry/syscalls/syscall_64.tbl | 1 +
arch/xtensa/kernel/syscalls/syscall.tbl | 1 +
include/linux/pid.h | 1 +
include/linux/syscalls.h | 1 +
include/uapi/asm-generic/unistd.h | 4 +-
kernel/fork.c | 2 +-
kernel/pid.c | 43 +++++++++++++++++++++
23 files changed, 68 insertions(+), 3 deletions(-)
diff --git a/arch/alpha/kernel/syscalls/syscall.tbl b/arch/alpha/kernel/syscalls/syscall.tbl
index 9e7704e44f6d..1db9bbcfb84e 100644
--- a/arch/alpha/kernel/syscalls/syscall.tbl
+++ b/arch/alpha/kernel/syscalls/syscall.tbl
@@ -473,3 +473,4 @@
541 common fsconfig sys_fsconfig
542 common fsmount sys_fsmount
543 common fspick sys_fspick
+544 common pidfd_open sys_pidfd_open
diff --git a/arch/arm/tools/syscall.tbl b/arch/arm/tools/syscall.tbl
index aaf479a9e92d..81e6e1817c45 100644
--- a/arch/arm/tools/syscall.tbl
+++ b/arch/arm/tools/syscall.tbl
@@ -447,3 +447,4 @@
431 common fsconfig sys_fsconfig
432 common fsmount sys_fsmount
433 common fspick sys_fspick
+434 common pidfd_open sys_pidfd_open
diff --git a/arch/arm64/include/asm/unistd.h b/arch/arm64/include/asm/unistd.h
index 70e6882853c0..e8f7d95a1481 100644
--- a/arch/arm64/include/asm/unistd.h
+++ b/arch/arm64/include/asm/unistd.h
@@ -44,7 +44,7 @@
#define __ARM_NR_compat_set_tls (__ARM_NR_COMPAT_BASE + 5)
#define __ARM_NR_COMPAT_END (__ARM_NR_COMPAT_BASE + 0x800)
-#define __NR_compat_syscalls 434
+#define __NR_compat_syscalls 435
#endif
#define __ARCH_WANT_SYS_CLONE
diff --git a/arch/arm64/include/asm/unistd32.h b/arch/arm64/include/asm/unistd32.h
index c39e90600bb3..7a3158ccd68e 100644
--- a/arch/arm64/include/asm/unistd32.h
+++ b/arch/arm64/include/asm/unistd32.h
@@ -886,6 +886,8 @@ __SYSCALL(__NR_fsconfig, sys_fsconfig)
__SYSCALL(__NR_fsmount, sys_fsmount)
#define __NR_fspick 433
__SYSCALL(__NR_fspick, sys_fspick)
+#define __NR_pidfd_open 434
+__SYSCALL(__NR_pidfd_open, sys_pidfd_open)
/*
* Please add new compat syscalls above this comment and update
diff --git a/arch/ia64/kernel/syscalls/syscall.tbl b/arch/ia64/kernel/syscalls/syscall.tbl
index e01df3f2f80d..ecc44926737b 100644
--- a/arch/ia64/kernel/syscalls/syscall.tbl
+++ b/arch/ia64/kernel/syscalls/syscall.tbl
@@ -354,3 +354,4 @@
431 common fsconfig sys_fsconfig
432 common fsmount sys_fsmount
433 common fspick sys_fspick
+434 common pidfd_open sys_pidfd_open
diff --git a/arch/m68k/kernel/syscalls/syscall.tbl b/arch/m68k/kernel/syscalls/syscall.tbl
index 7e3d0734b2f3..9a3eb2558568 100644
--- a/arch/m68k/kernel/syscalls/syscall.tbl
+++ b/arch/m68k/kernel/syscalls/syscall.tbl
@@ -433,3 +433,4 @@
431 common fsconfig sys_fsconfig
432 common fsmount sys_fsmount
433 common fspick sys_fspick
+434 common pidfd_open sys_pidfd_open
diff --git a/arch/microblaze/kernel/syscalls/syscall.tbl b/arch/microblaze/kernel/syscalls/syscall.tbl
index 26339e417695..ad706f83c755 100644
--- a/arch/microblaze/kernel/syscalls/syscall.tbl
+++ b/arch/microblaze/kernel/syscalls/syscall.tbl
@@ -439,3 +439,4 @@
431 common fsconfig sys_fsconfig
432 common fsmount sys_fsmount
433 common fspick sys_fspick
+434 common pidfd_open sys_pidfd_open
diff --git a/arch/mips/kernel/syscalls/syscall_n32.tbl b/arch/mips/kernel/syscalls/syscall_n32.tbl
index 0e2dd68ade57..97035e19ad03 100644
--- a/arch/mips/kernel/syscalls/syscall_n32.tbl
+++ b/arch/mips/kernel/syscalls/syscall_n32.tbl
@@ -372,3 +372,4 @@
431 n32 fsconfig sys_fsconfig
432 n32 fsmount sys_fsmount
433 n32 fspick sys_fspick
+434 n32 pidfd_open sys_pidfd_open
diff --git a/arch/mips/kernel/syscalls/syscall_n64.tbl b/arch/mips/kernel/syscalls/syscall_n64.tbl
index 5eebfa0d155c..d7292722d3b0 100644
--- a/arch/mips/kernel/syscalls/syscall_n64.tbl
+++ b/arch/mips/kernel/syscalls/syscall_n64.tbl
@@ -348,3 +348,4 @@
431 n64 fsconfig sys_fsconfig
432 n64 fsmount sys_fsmount
433 n64 fspick sys_fspick
+434 n64 pidfd_open sys_pidfd_open
diff --git a/arch/mips/kernel/syscalls/syscall_o32.tbl b/arch/mips/kernel/syscalls/syscall_o32.tbl
index 3cc1374e02d0..dba084c92f14 100644
--- a/arch/mips/kernel/syscalls/syscall_o32.tbl
+++ b/arch/mips/kernel/syscalls/syscall_o32.tbl
@@ -421,3 +421,4 @@
431 o32 fsconfig sys_fsconfig
432 o32 fsmount sys_fsmount
433 o32 fspick sys_fspick
+434 o32 pidfd_open sys_pidfd_open
diff --git a/arch/parisc/kernel/syscalls/syscall.tbl b/arch/parisc/kernel/syscalls/syscall.tbl
index c9e377d59232..5022b9e179c2 100644
--- a/arch/parisc/kernel/syscalls/syscall.tbl
+++ b/arch/parisc/kernel/syscalls/syscall.tbl
@@ -430,3 +430,4 @@
431 common fsconfig sys_fsconfig
432 common fsmount sys_fsmount
433 common fspick sys_fspick
+434 common pidfd_open sys_pidfd_open
diff --git a/arch/powerpc/kernel/syscalls/syscall.tbl b/arch/powerpc/kernel/syscalls/syscall.tbl
index 103655d84b4b..f2c3bda2d39f 100644
--- a/arch/powerpc/kernel/syscalls/syscall.tbl
+++ b/arch/powerpc/kernel/syscalls/syscall.tbl
@@ -515,3 +515,4 @@
431 common fsconfig sys_fsconfig
432 common fsmount sys_fsmount
433 common fspick sys_fspick
+434 common pidfd_open sys_pidfd_open
diff --git a/arch/s390/kernel/syscalls/syscall.tbl b/arch/s390/kernel/syscalls/syscall.tbl
index e822b2964a83..6ebacfeaf853 100644
--- a/arch/s390/kernel/syscalls/syscall.tbl
+++ b/arch/s390/kernel/syscalls/syscall.tbl
@@ -436,3 +436,4 @@
431 common fsconfig sys_fsconfig sys_fsconfig
432 common fsmount sys_fsmount sys_fsmount
433 common fspick sys_fspick sys_fspick
+434 common pidfd_open sys_pidfd_open sys_pidfd_open
diff --git a/arch/sh/kernel/syscalls/syscall.tbl b/arch/sh/kernel/syscalls/syscall.tbl
index 016a727d4357..834c9c7d79fa 100644
--- a/arch/sh/kernel/syscalls/syscall.tbl
+++ b/arch/sh/kernel/syscalls/syscall.tbl
@@ -436,3 +436,4 @@
431 common fsconfig sys_fsconfig
432 common fsmount sys_fsmount
433 common fspick sys_fspick
+434 common pidfd_open sys_pidfd_open
diff --git a/arch/sparc/kernel/syscalls/syscall.tbl b/arch/sparc/kernel/syscalls/syscall.tbl
index e047480b1605..c58e71f21129 100644
--- a/arch/sparc/kernel/syscalls/syscall.tbl
+++ b/arch/sparc/kernel/syscalls/syscall.tbl
@@ -479,3 +479,4 @@
431 common fsconfig sys_fsconfig
432 common fsmount sys_fsmount
433 common fspick sys_fspick
+434 common pidfd_open sys_pidfd_open
diff --git a/arch/x86/entry/syscalls/syscall_32.tbl b/arch/x86/entry/syscalls/syscall_32.tbl
index ad968b7bac72..43e4429a5272 100644
--- a/arch/x86/entry/syscalls/syscall_32.tbl
+++ b/arch/x86/entry/syscalls/syscall_32.tbl
@@ -438,3 +438,4 @@
431 i386 fsconfig sys_fsconfig __ia32_sys_fsconfig
432 i386 fsmount sys_fsmount __ia32_sys_fsmount
433 i386 fspick sys_fspick __ia32_sys_fspick
+434 i386 pidfd_open sys_pidfd_open __ia32_sys_pidfd_open
diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl
index b4e6f9e6204a..1bee0a77fdd3 100644
--- a/arch/x86/entry/syscalls/syscall_64.tbl
+++ b/arch/x86/entry/syscalls/syscall_64.tbl
@@ -355,6 +355,7 @@
431 common fsconfig __x64_sys_fsconfig
432 common fsmount __x64_sys_fsmount
433 common fspick __x64_sys_fspick
+434 common pidfd_open __x64_sys_pidfd_open
#
# x32-specific system call numbers start at 512 to avoid cache impact
diff --git a/arch/xtensa/kernel/syscalls/syscall.tbl b/arch/xtensa/kernel/syscalls/syscall.tbl
index 5fa0ee1c8e00..782b81945ccc 100644
--- a/arch/xtensa/kernel/syscalls/syscall.tbl
+++ b/arch/xtensa/kernel/syscalls/syscall.tbl
@@ -404,3 +404,4 @@
431 common fsconfig sys_fsconfig
432 common fsmount sys_fsmount
433 common fspick sys_fspick
+434 common pidfd_open sys_pidfd_open
diff --git a/include/linux/pid.h b/include/linux/pid.h
index 3c8ef5a199ca..c938a92eab99 100644
--- a/include/linux/pid.h
+++ b/include/linux/pid.h
@@ -67,6 +67,7 @@ struct pid
extern struct pid init_struct_pid;
extern const struct file_operations pidfd_fops;
+extern int pidfd_create(struct pid *pid);
static inline struct pid *get_pid(struct pid *pid)
{
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index e2870fe1be5b..989055e0b501 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -929,6 +929,7 @@ asmlinkage long sys_clock_adjtime32(clockid_t which_clock,
struct old_timex32 __user *tx);
asmlinkage long sys_syncfs(int fd);
asmlinkage long sys_setns(int fd, int nstype);
+asmlinkage long sys_pidfd_open(pid_t pid, unsigned int flags);
asmlinkage long sys_sendmmsg(int fd, struct mmsghdr __user *msg,
unsigned int vlen, unsigned flags);
asmlinkage long sys_process_vm_readv(pid_t pid,
diff --git a/include/uapi/asm-generic/unistd.h b/include/uapi/asm-generic/unistd.h
index a87904daf103..e5684a4512c0 100644
--- a/include/uapi/asm-generic/unistd.h
+++ b/include/uapi/asm-generic/unistd.h
@@ -844,9 +844,11 @@ __SYSCALL(__NR_fsconfig, sys_fsconfig)
__SYSCALL(__NR_fsmount, sys_fsmount)
#define __NR_fspick 433
__SYSCALL(__NR_fspick, sys_fspick)
+#define __NR_pidfd_open 434
+__SYSCALL(__NR_pidfd_open, sys_pidfd_open)
#undef __NR_syscalls
-#define __NR_syscalls 434
+#define __NR_syscalls 435
/*
* 32 bit systems traditionally used different
diff --git a/kernel/fork.c b/kernel/fork.c
index b4cba953040a..c3df226f47a1 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1724,7 +1724,7 @@ const struct file_operations pidfd_fops = {
* Return: On success, a cloexec pidfd is returned.
* On error, a negative errno number will be returned.
*/
-static int pidfd_create(struct pid *pid)
+int pidfd_create(struct pid *pid)
{
int fd;
diff --git a/kernel/pid.c b/kernel/pid.c
index 89548d35eefb..8fc9d94f6ac1 100644
--- a/kernel/pid.c
+++ b/kernel/pid.c
@@ -37,6 +37,7 @@
#include <linux/syscalls.h>
#include <linux/proc_ns.h>
#include <linux/proc_fs.h>
+#include <linux/sched/signal.h>
#include <linux/sched/task.h>
#include <linux/idr.h>
@@ -450,6 +451,48 @@ struct pid *find_ge_pid(int nr, struct pid_namespace *ns)
return idr_get_next(&ns->idr, &nr);
}
+/**
+ * pidfd_open() - Open new pid file descriptor.
+ *
+ * @pid: pid for which to retrieve a pidfd
+ * @flags: flags to pass
+ *
+ * This creates a new pid file descriptor with the O_CLOEXEC flag set for
+ * the process identified by @pid. Currently, the process identified by
+ * @pid must be a thread-group leader. This restriction currently exists
+ * for all aspects of pidfds including pidfd creation (CLONE_PIDFD cannot
+ * be used with CLONE_THREAD) and pidfd polling (only supports thread group
+ * leaders).
+ *
+ * Return: On success, a cloexec pidfd is returned.
+ * On error, a negative errno number will be returned.
+ */
+SYSCALL_DEFINE2(pidfd_open, pid_t, pid, unsigned int, flags)
+{
+ int fd, ret;
+ struct pid *p;
+
+ if (flags)
+ return -EINVAL;
+
+ if (pid <= 0)
+ return -EINVAL;
+
+ p = find_get_pid(pid);
+ if (!p)
+ return -ESRCH;
+
+ ret = 0;
+ rcu_read_lock();
+ if (!pid_task(p, PIDTYPE_TGID))
+ ret = -EINVAL;
+ rcu_read_unlock();
+
+ fd = ret ?: pidfd_create(p);
+ put_pid(p);
+ return fd;
+}
+
void __init pid_idr_init(void)
{
/* Verify no one has done anything silly: */
--
2.21.0
^ permalink raw reply related
* [PATCH v3 2/2] tests: add pidfd_open() tests
From: Christian Brauner @ 2019-05-20 15:56 UTC (permalink / raw)
To: jannh, oleg, viro, torvalds, linux-kernel, arnd
Cc: linux-ia64, linux-sh, linux-mips, dhowells, joel, linux-kselftest,
sparclinux, elena.reshetova, linux-arch, linux-s390, dancol,
Christian Brauner, kernel-team, serge, linux-xtensa, keescook,
linux-m68k, luto, tglx, surenb, linux-arm-kernel, linux-parisc,
linux-api, cyphar, luto, ebiederm, linux-alpha,
Michael Kerrisk (man-pages), akpm, linuxppc-dev
In-Reply-To: <20190520155630.21684-1-christian@brauner.io>
This adds testing for the new pidfd_open() syscalls. Specifically, we test:
- that no invalid flags can be passed to pidfd_open()
- that no invalid pid can be passed to pidfd_open()
- that a pidfd can be retrieved with pidfd_open()
- that the retrieved pidfd references the correct pid
Signed-off-by: Christian Brauner <christian@brauner.io>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Joel Fernandes (Google) <joel@joelfernandes.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jann Horn <jannh@google.com>
Cc: David Howells <dhowells@redhat.com>
Cc: "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com>
Cc: Andy Lutomirsky <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Aleksa Sarai <cyphar@cyphar.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: linux-api@vger.kernel.org
---
v1: unchanged
v2:
- Christian Brauner <christian@brauner.io>:
- set number of planned tests via ksft_set_plan()
v3: unchanged
---
tools/testing/selftests/pidfd/.gitignore | 1 +
tools/testing/selftests/pidfd/Makefile | 2 +-
tools/testing/selftests/pidfd/pidfd.h | 57 ++++++
.../testing/selftests/pidfd/pidfd_open_test.c | 169 ++++++++++++++++++
tools/testing/selftests/pidfd/pidfd_test.c | 41 +----
5 files changed, 229 insertions(+), 41 deletions(-)
create mode 100644 tools/testing/selftests/pidfd/pidfd.h
create mode 100644 tools/testing/selftests/pidfd/pidfd_open_test.c
diff --git a/tools/testing/selftests/pidfd/.gitignore b/tools/testing/selftests/pidfd/.gitignore
index 822a1e63d045..16d84d117bc0 100644
--- a/tools/testing/selftests/pidfd/.gitignore
+++ b/tools/testing/selftests/pidfd/.gitignore
@@ -1 +1,2 @@
+pidfd_open_test
pidfd_test
diff --git a/tools/testing/selftests/pidfd/Makefile b/tools/testing/selftests/pidfd/Makefile
index deaf8073bc06..b36c0be70848 100644
--- a/tools/testing/selftests/pidfd/Makefile
+++ b/tools/testing/selftests/pidfd/Makefile
@@ -1,6 +1,6 @@
CFLAGS += -g -I../../../../usr/include/
-TEST_GEN_PROGS := pidfd_test
+TEST_GEN_PROGS := pidfd_test pidfd_open_test
include ../lib.mk
diff --git a/tools/testing/selftests/pidfd/pidfd.h b/tools/testing/selftests/pidfd/pidfd.h
new file mode 100644
index 000000000000..8452e910463f
--- /dev/null
+++ b/tools/testing/selftests/pidfd/pidfd.h
@@ -0,0 +1,57 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef __PIDFD_H
+#define __PIDFD_H
+
+#define _GNU_SOURCE
+#include <errno.h>
+#include <fcntl.h>
+#include <sched.h>
+#include <signal.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <syscall.h>
+#include <sys/mount.h>
+
+#include "../kselftest.h"
+
+/*
+ * The kernel reserves 300 pids via RESERVED_PIDS in kernel/pid.c
+ * That means, when it wraps around any pid < 300 will be skipped.
+ * So we need to use a pid > 300 in order to test recycling.
+ */
+#define PID_RECYCLE 1000
+
+/*
+ * Define a few custom error codes for the child process to clearly indicate
+ * what is happening. This way we can tell the difference between a system
+ * error, a test error, etc.
+ */
+#define PIDFD_PASS 0
+#define PIDFD_FAIL 1
+#define PIDFD_ERROR 2
+#define PIDFD_SKIP 3
+#define PIDFD_XFAIL 4
+
+int wait_for_pid(pid_t pid)
+{
+ int status, ret;
+
+again:
+ ret = waitpid(pid, &status, 0);
+ if (ret == -1) {
+ if (errno == EINTR)
+ goto again;
+
+ return -1;
+ }
+
+ if (!WIFEXITED(status))
+ return -1;
+
+ return WEXITSTATUS(status);
+}
+
+
+#endif /* __PIDFD_H */
diff --git a/tools/testing/selftests/pidfd/pidfd_open_test.c b/tools/testing/selftests/pidfd/pidfd_open_test.c
new file mode 100644
index 000000000000..0377133dd6dc
--- /dev/null
+++ b/tools/testing/selftests/pidfd/pidfd_open_test.c
@@ -0,0 +1,169 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#define _GNU_SOURCE
+#include <errno.h>
+#include <fcntl.h>
+#include <inttypes.h>
+#include <limits.h>
+#include <linux/types.h>
+#include <linux/wait.h>
+#include <sched.h>
+#include <signal.h>
+#include <stdbool.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <syscall.h>
+#include <sys/mount.h>
+#include <sys/prctl.h>
+#include <sys/wait.h>
+#include <unistd.h>
+
+#include "pidfd.h"
+#include "../kselftest.h"
+
+static inline int sys_pidfd_open(pid_t pid, unsigned int flags)
+{
+ return syscall(__NR_pidfd_open, pid, flags);
+}
+
+static int safe_int(const char *numstr, int *converted)
+{
+ char *err = NULL;
+ long sli;
+
+ errno = 0;
+ sli = strtol(numstr, &err, 0);
+ if (errno == ERANGE && (sli == LONG_MAX || sli == LONG_MIN))
+ return -ERANGE;
+
+ if (errno != 0 && sli == 0)
+ return -EINVAL;
+
+ if (err == numstr || *err != '\0')
+ return -EINVAL;
+
+ if (sli > INT_MAX || sli < INT_MIN)
+ return -ERANGE;
+
+ *converted = (int)sli;
+ return 0;
+}
+
+static int char_left_gc(const char *buffer, size_t len)
+{
+ size_t i;
+
+ for (i = 0; i < len; i++) {
+ if (buffer[i] == ' ' ||
+ buffer[i] == '\t')
+ continue;
+
+ return i;
+ }
+
+ return 0;
+}
+
+static int char_right_gc(const char *buffer, size_t len)
+{
+ int i;
+
+ for (i = len - 1; i >= 0; i--) {
+ if (buffer[i] == ' ' ||
+ buffer[i] == '\t' ||
+ buffer[i] == '\n' ||
+ buffer[i] == '\0')
+ continue;
+
+ return i + 1;
+ }
+
+ return 0;
+}
+
+static char *trim_whitespace_in_place(char *buffer)
+{
+ buffer += char_left_gc(buffer, strlen(buffer));
+ buffer[char_right_gc(buffer, strlen(buffer))] = '\0';
+ return buffer;
+}
+
+static pid_t get_pid_from_fdinfo_file(int pidfd, const char *key, size_t keylen)
+{
+ int ret;
+ char path[512];
+ FILE *f;
+ size_t n = 0;
+ pid_t result = -1;
+ char *line = NULL;
+
+ snprintf(path, sizeof(path), "/proc/self/fdinfo/%d", pidfd);
+
+ f = fopen(path, "re");
+ if (!f)
+ return -1;
+
+ while (getline(&line, &n, f) != -1) {
+ char *numstr;
+
+ if (strncmp(line, key, keylen))
+ continue;
+
+ numstr = trim_whitespace_in_place(line + 4);
+ ret = safe_int(numstr, &result);
+ if (ret < 0)
+ goto out;
+
+ break;
+ }
+
+out:
+ free(line);
+ fclose(f);
+ return result;
+}
+
+int main(int argc, char **argv)
+{
+ int pidfd = -1, ret = 1;
+ pid_t pid;
+
+ ksft_set_plan(3);
+
+ pidfd = sys_pidfd_open(-1, 0);
+ if (pidfd >= 0) {
+ ksft_print_msg(
+ "%s - succeeded to open pidfd for invalid pid -1\n",
+ strerror(errno));
+ goto on_error;
+ }
+ ksft_test_result_pass("do not allow invalid pid test: passed\n");
+
+ pidfd = sys_pidfd_open(getpid(), 1);
+ if (pidfd >= 0) {
+ ksft_print_msg(
+ "%s - succeeded to open pidfd with invalid flag value specified\n",
+ strerror(errno));
+ goto on_error;
+ }
+ ksft_test_result_pass("do not allow invalid flag test: passed\n");
+
+ pidfd = sys_pidfd_open(getpid(), 0);
+ if (pidfd < 0) {
+ ksft_print_msg("%s - failed to open pidfd\n", strerror(errno));
+ goto on_error;
+ }
+ ksft_test_result_pass("open a new pidfd test: passed\n");
+
+ pid = get_pid_from_fdinfo_file(pidfd, "Pid:", sizeof("Pid:") - 1);
+ ksft_print_msg("pidfd %d refers to process with pid %d\n", pidfd, pid);
+
+ ret = 0;
+
+on_error:
+ if (pidfd >= 0)
+ close(pidfd);
+
+ return !ret ? ksft_exit_pass() : ksft_exit_fail();
+}
diff --git a/tools/testing/selftests/pidfd/pidfd_test.c b/tools/testing/selftests/pidfd/pidfd_test.c
index 5bae1792e3d6..a03387052aa7 100644
--- a/tools/testing/selftests/pidfd/pidfd_test.c
+++ b/tools/testing/selftests/pidfd/pidfd_test.c
@@ -14,6 +14,7 @@
#include <sys/wait.h>
#include <unistd.h>
+#include "pidfd.h"
#include "../kselftest.h"
static inline int sys_pidfd_send_signal(int pidfd, int sig, siginfo_t *info,
@@ -62,28 +63,6 @@ static int test_pidfd_send_signal_simple_success(void)
return 0;
}
-static int wait_for_pid(pid_t pid)
-{
- int status, ret;
-
-again:
- ret = waitpid(pid, &status, 0);
- if (ret == -1) {
- if (errno == EINTR)
- goto again;
-
- return -1;
- }
-
- if (ret != pid)
- goto again;
-
- if (!WIFEXITED(status))
- return -1;
-
- return WEXITSTATUS(status);
-}
-
static int test_pidfd_send_signal_exited_fail(void)
{
int pidfd, ret, saved_errno;
@@ -128,13 +107,6 @@ static int test_pidfd_send_signal_exited_fail(void)
return 0;
}
-/*
- * The kernel reserves 300 pids via RESERVED_PIDS in kernel/pid.c
- * That means, when it wraps around any pid < 300 will be skipped.
- * So we need to use a pid > 300 in order to test recycling.
- */
-#define PID_RECYCLE 1000
-
/*
* Maximum number of cycles we allow. This is equivalent to PID_MAX_DEFAULT.
* If users set a higher limit or we have cycled PIDFD_MAX_DEFAULT number of
@@ -143,17 +115,6 @@ static int test_pidfd_send_signal_exited_fail(void)
*/
#define PIDFD_MAX_DEFAULT 0x8000
-/*
- * Define a few custom error codes for the child process to clearly indicate
- * what is happening. This way we can tell the difference between a system
- * error, a test error, etc.
- */
-#define PIDFD_PASS 0
-#define PIDFD_FAIL 1
-#define PIDFD_ERROR 2
-#define PIDFD_SKIP 3
-#define PIDFD_XFAIL 4
-
static int test_pidfd_send_signal_recycled_pid_fail(void)
{
int i, ret;
--
2.21.0
^ permalink raw reply related
* Re: [PATCH] mm: add account_locked_vm utility function
From: Daniel Jordan @ 2019-05-20 15:30 UTC (permalink / raw)
To: Alexey Kardashevskiy
Cc: Mark Rutland, Davidlohr Bueso, kvm, Alan Tull, linux-fpga,
linux-kernel, kvm-ppc, Daniel Jordan, linux-mm, Alex Williamson,
Jason Gunthorpe, Moritz Fischer, Steve Sistare, akpm,
linuxppc-dev, Christoph Lameter, Wu Hao
In-Reply-To: <4b42057f-b998-f87c-4e0f-a91abcb366f9@ozlabs.ru>
On Mon, May 20, 2019 at 04:19:34PM +1000, Alexey Kardashevskiy wrote:
> On 04/05/2019 06:16, Daniel Jordan wrote:
> > locked_vm accounting is done roughly the same way in five places, so
> > unify them in a helper. Standardize the debug prints, which vary
> > slightly.
>
> And I rather liked that prints were different and tell precisely which
> one of three each printk is.
I'm not following. One of three...callsites? But there were five callsites.
Anyway, I added a _RET_IP_ to the debug print so you can differentiate.
> I commented below but in general this seems working.
>
> Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Thanks! And for the review as well.
> > diff --git a/drivers/vfio/vfio_iommu_spapr_tce.c b/drivers/vfio/vfio_iommu_spapr_tce.c
> > index 6b64e45a5269..d39a1b830d82 100644
> > --- a/drivers/vfio/vfio_iommu_spapr_tce.c
> > +++ b/drivers/vfio/vfio_iommu_spapr_tce.c
> > @@ -34,49 +35,13 @@
> > static void tce_iommu_detach_group(void *iommu_data,
> > struct iommu_group *iommu_group);
> >
> > -static long try_increment_locked_vm(struct mm_struct *mm, long npages)
> > +static int tce_account_locked_vm(struct mm_struct *mm, unsigned long npages,
> > + bool inc)
> > {
> > - long ret = 0, locked, lock_limit;
> > -
> > if (WARN_ON_ONCE(!mm))
> > return -EPERM;
>
>
> If this WARN_ON is the only reason for having tce_account_locked_vm()
> instead of calling account_locked_vm() directly, you can then ditch the
> check as I have never ever seen this triggered.
Great, will do.
> > diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> > index d0f731c9920a..15ac76171ccd 100644
> > --- a/drivers/vfio/vfio_iommu_type1.c
> > +++ b/drivers/vfio/vfio_iommu_type1.c
> > @@ -273,25 +273,14 @@ static int vfio_lock_acct(struct vfio_dma *dma, long npage, bool async)
> > return -ESRCH; /* process exited */
> >
> > ret = down_write_killable(&mm->mmap_sem);
> > - if (!ret) {
> > - if (npage > 0) {
> > - if (!dma->lock_cap) {
> > - unsigned long limit;
> > -
> > - limit = task_rlimit(dma->task,
> > - RLIMIT_MEMLOCK) >> PAGE_SHIFT;
> > -
> > - if (mm->locked_vm + npage > limit)
> > - ret = -ENOMEM;
> > - }
> > - }
> > + if (ret)
> > + goto out;
>
>
> A single "goto" to jump just 3 lines below seems unnecessary.
No strong preference here, I'll take out the goto.
> > +int __account_locked_vm(struct mm_struct *mm, unsigned long pages, bool inc,
> > + struct task_struct *task, bool bypass_rlim)
> > +{
> > + unsigned long locked_vm, limit;
> > + int ret = 0;
> > +
> > + locked_vm = mm->locked_vm;
> > + if (inc) {
> > + if (!bypass_rlim) {
> > + limit = task_rlimit(task, RLIMIT_MEMLOCK) >> PAGE_SHIFT;
> > + if (locked_vm + pages > limit) {
> > + ret = -ENOMEM;
> > + goto out;
> > + }
> > + }
>
> Nit:
>
> if (!ret)
>
> and then you don't need "goto out".
Ok, sure.
> > + mm->locked_vm = locked_vm + pages;
> > + } else {
> > + WARN_ON_ONCE(pages > locked_vm);
> > + mm->locked_vm = locked_vm - pages;
>
>
> Can go negative here. Not a huge deal but inaccurate imo.
I hear you, but setting a negative value to zero, as we had done previously,
doesn't make much sense to me.
^ permalink raw reply
* Re: PROBLEM: Power9: kernel oops on memory hotunplug from ppc64le guest
From: Aneesh Kumar K.V @ 2019-05-20 15:20 UTC (permalink / raw)
To: Nicholas Piggin, bharata
Cc: linux-kernel, srikanth, linux-next, bharata, linuxppc-dev
In-Reply-To: <1558363500.jsgl4a2lfa.astroid@bobo.none>
On 5/20/19 8:25 PM, Nicholas Piggin wrote:
> Bharata B Rao's on May 21, 2019 12:29 am:
>> On Mon, May 20, 2019 at 01:50:35PM +0530, Bharata B Rao wrote:
>>> On Mon, May 20, 2019 at 05:00:21PM +1000, Nicholas Piggin wrote:
>>>> Bharata B Rao's on May 20, 2019 3:56 pm:
>>>>> On Mon, May 20, 2019 at 02:48:35PM +1000, Nicholas Piggin wrote:
>>>>>>>>> git bisect points to
>>>>>>>>>
>>>>>>>>> commit 4231aba000f5a4583dd9f67057aadb68c3eca99d
>>>>>>>>> Author: Nicholas Piggin <npiggin@gmail.com>
>>>>>>>>> Date: Fri Jul 27 21:48:17 2018 +1000
>>>>>>>>>
>>>>>>>>> powerpc/64s: Fix page table fragment refcount race vs speculative references
>>>>>>>>>
>>>>>>>>> The page table fragment allocator uses the main page refcount racily
>>>>>>>>> with respect to speculative references. A customer observed a BUG due
>>>>>>>>> to page table page refcount underflow in the fragment allocator. This
>>>>>>>>> can be caused by the fragment allocator set_page_count stomping on a
>>>>>>>>> speculative reference, and then the speculative failure handler
>>>>>>>>> decrements the new reference, and the underflow eventually pops when
>>>>>>>>> the page tables are freed.
>>>>>>>>>
>>>>>>>>> Fix this by using a dedicated field in the struct page for the page
>>>>>>>>> table fragment allocator.
>>>>>>>>>
>>>>>>>>> Fixes: 5c1f6ee9a31c ("powerpc: Reduce PTE table memory wastage")
>>>>>>>>> Cc: stable@vger.kernel.org # v3.10+
>>>>>>>>
>>>>>>>> That's the commit that added the BUG_ON(), so prior to that you won't
>>>>>>>> see the crash.
>>>>>>>
>>>>>>> Right, but the commit says it fixes page table page refcount underflow by
>>>>>>> introducing a new field &page->pt_frag_refcount. Now we are hitting the underflow
>>>>>>> for this pt_frag_refcount.
>>>>>>
>>>>>> The fixed underflow is caused by a bug (race on page count) that got
>>>>>> fixed by that patch. You are hitting a different underflow here. It's
>>>>>> not certain my patch caused it, I'm just trying to reproduce now.
>>>>>
>>>>> Ok.
>>>>
>>>> Can't reproduce I'm afraid, tried adding and removing 8GB memory from a
>>>> 4GB guest (via host adding / removing memory device), and it just works.
>>>
>>> Boot, add 8G, reboot, remove 8G is the sequence to reproduce.
>>>
>>>>
>>>> It's likely to be an edge case like an off by one or rounding error
>>>> that just happens to trigger in your config. Might be easiest if you
>>>> could test with a debug patch.
>>>
>>> Sure, I will continue debugging.
>>
>> When the guest is rebooted after hotplug, the entire memory (which includes
>> the hotplugged memory) gets remapped again freshly. However at this time
>> since no slab is available yet, pt_frag_refcount never gets initialized as we
>> never do pte_fragment_alloc() for these mappings. So we right away hit the
>> underflow during the first unplug itself, it looks like.
>
> Nice catch, good debugging work.
>
>> I will check how this can be fixed.
>
> Tricky problem. What do you think? You might be able to make the early
> page table allocations in the same pattern as the frag allocations, and
> then fill in the struct page metadata when you have those.
I guess we need to do something similar to what x86 does. We need to
walk the init_mm page table again and re-init struct page and other data
structures backing the tables?
-aneesh
^ permalink raw reply
* Re: PROBLEM: Power9: kernel oops on memory hotunplug from ppc64le guest
From: Bharata B Rao @ 2019-05-20 15:12 UTC (permalink / raw)
To: Nicholas Piggin
Cc: bharata, linux-kernel, linux-next, aneesh.kumar, srikanth,
linuxppc-dev
In-Reply-To: <1558363500.jsgl4a2lfa.astroid@bobo.none>
On Tue, May 21, 2019 at 12:55:49AM +1000, Nicholas Piggin wrote:
> Bharata B Rao's on May 21, 2019 12:29 am:
> > On Mon, May 20, 2019 at 01:50:35PM +0530, Bharata B Rao wrote:
> >> On Mon, May 20, 2019 at 05:00:21PM +1000, Nicholas Piggin wrote:
> >> > Bharata B Rao's on May 20, 2019 3:56 pm:
> >> > > On Mon, May 20, 2019 at 02:48:35PM +1000, Nicholas Piggin wrote:
> >> > >> >> > git bisect points to
> >> > >> >> >
> >> > >> >> > commit 4231aba000f5a4583dd9f67057aadb68c3eca99d
> >> > >> >> > Author: Nicholas Piggin <npiggin@gmail.com>
> >> > >> >> > Date: Fri Jul 27 21:48:17 2018 +1000
> >> > >> >> >
> >> > >> >> > powerpc/64s: Fix page table fragment refcount race vs speculative references
> >> > >> >> >
> >> > >> >> > The page table fragment allocator uses the main page refcount racily
> >> > >> >> > with respect to speculative references. A customer observed a BUG due
> >> > >> >> > to page table page refcount underflow in the fragment allocator. This
> >> > >> >> > can be caused by the fragment allocator set_page_count stomping on a
> >> > >> >> > speculative reference, and then the speculative failure handler
> >> > >> >> > decrements the new reference, and the underflow eventually pops when
> >> > >> >> > the page tables are freed.
> >> > >> >> >
> >> > >> >> > Fix this by using a dedicated field in the struct page for the page
> >> > >> >> > table fragment allocator.
> >> > >> >> >
> >> > >> >> > Fixes: 5c1f6ee9a31c ("powerpc: Reduce PTE table memory wastage")
> >> > >> >> > Cc: stable@vger.kernel.org # v3.10+
> >> > >> >>
> >> > >> >> That's the commit that added the BUG_ON(), so prior to that you won't
> >> > >> >> see the crash.
> >> > >> >
> >> > >> > Right, but the commit says it fixes page table page refcount underflow by
> >> > >> > introducing a new field &page->pt_frag_refcount. Now we are hitting the underflow
> >> > >> > for this pt_frag_refcount.
> >> > >>
> >> > >> The fixed underflow is caused by a bug (race on page count) that got
> >> > >> fixed by that patch. You are hitting a different underflow here. It's
> >> > >> not certain my patch caused it, I'm just trying to reproduce now.
> >> > >
> >> > > Ok.
> >> >
> >> > Can't reproduce I'm afraid, tried adding and removing 8GB memory from a
> >> > 4GB guest (via host adding / removing memory device), and it just works.
> >>
> >> Boot, add 8G, reboot, remove 8G is the sequence to reproduce.
> >>
> >> >
> >> > It's likely to be an edge case like an off by one or rounding error
> >> > that just happens to trigger in your config. Might be easiest if you
> >> > could test with a debug patch.
> >>
> >> Sure, I will continue debugging.
> >
> > When the guest is rebooted after hotplug, the entire memory (which includes
> > the hotplugged memory) gets remapped again freshly. However at this time
> > since no slab is available yet, pt_frag_refcount never gets initialized as we
> > never do pte_fragment_alloc() for these mappings. So we right away hit the
> > underflow during the first unplug itself, it looks like.
>
> Nice catch, good debugging work.
Thanks, with help from Aneesh.
>
> > I will check how this can be fixed.
>
> Tricky problem. What do you think? You might be able to make the early
> page table allocations in the same pattern as the frag allocations, and
> then fill in the struct page metadata when you have those.
Will explore.
>
> Other option may be create a new set of page tables after mm comes up
> to replace the early page tables with. That's a bigger hammer though.
Will also check if similar scenario exists on x86 and if so, how and when
pte frag data is fixed there.
Regards,
Bharata.
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox