* [PATCH v2 08/12] mm: Define pasid in mm
From: Fenghua Yu @ 2020-06-13 0:41 UTC (permalink / raw)
To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
David Woodhouse, Lu Baolu, Frederic Barrat, Andrew Donnellan,
Felix Kuehling, Joerg Roedel, Dave Hansen, Tony Luck, Ashok Raj,
Jacob Jun Pan, Dave Jiang, Yu-cheng Yu, Sohil Mehta,
Ravi V Shankar
Cc: Fenghua Yu, x86, linux-kernel, amd-gfx, iommu, linuxppc-dev
In-Reply-To: <1592008893-9388-1-git-send-email-fenghua.yu@intel.com>
PASID is shared by all threads in a process. So the logical place to keep
track of it is in the "mm". Both ARM and X86 need to use the PASID in the
"mm".
Suggested-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
v2:
- This new patch moves "pasid" from x86 specific mm_context_t to generic
struct mm_struct per Christoph's comment: https://lore.kernel.org/linux-iommu/20200414170252.714402-1-jean-philippe@linaro.org/T/#mb57110ffe1aaa24750eeea4f93b611f0d1913911
- Jean-Philippe Brucker posted a virtually identical patch. I still include
this patch in the series to make review easier. Upstream only needs one
of the two patches eventually.
https://lore.kernel.org/linux-iommu/20200519175502.2504091-2-jean-philippe@linaro.org/
- Change CONFIG_IOASID to CONFIG_PCI_PASID (Ashok)
include/linux/mm_types.h | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 64ede5f150dc..5778db3aa42d 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -538,6 +538,10 @@ struct mm_struct {
atomic_long_t hugetlb_usage;
#endif
struct work_struct async_put_work;
+
+#ifdef CONFIG_PCI_PASID
+ unsigned int pasid;
+#endif
} __randomize_layout;
/*
--
2.19.1
* [PATCH v2 06/12] x86/fpu/xstate: Add supervisor PASID state for ENQCMD feature
From: Fenghua Yu @ 2020-06-13 0:41 UTC (permalink / raw)
To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
David Woodhouse, Lu Baolu, Frederic Barrat, Andrew Donnellan,
Felix Kuehling, Joerg Roedel, Dave Hansen, Tony Luck, Ashok Raj,
Jacob Jun Pan, Dave Jiang, Yu-cheng Yu, Sohil Mehta,
Ravi V Shankar
Cc: Fenghua Yu, x86, linux-kernel, amd-gfx, iommu, linuxppc-dev
In-Reply-To: <1592008893-9388-1-git-send-email-fenghua.yu@intel.com>
From: Yu-cheng Yu <yu-cheng.yu@intel.com>
ENQCMD instruction reads PASID from IA32_PASID MSR. The MSR is stored
in the task's supervisor FPU PASID state and is context switched by
XSAVES/XRSTORS.
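As an illustration only (not part of the patch), here is a minimal userspace sketch of how the new component's mask bit follows from its position in the feature enum. The MODEL_ names are hypothetical stand-ins for the kernel's XFEATURE_ identifiers, and the ordering is an assumption taken from the "State component 10" comment in the diff below.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical mirror of enum xfeature; positions are assumed, not copied. */
enum model_xfeature {
	MODEL_XFEATURE_FP,
	MODEL_XFEATURE_SSE,
	MODEL_XFEATURE_YMM,
	MODEL_XFEATURE_BNDREGS,
	MODEL_XFEATURE_BNDCSR,
	MODEL_XFEATURE_OPMASK,
	MODEL_XFEATURE_ZMM_Hi256,
	MODEL_XFEATURE_Hi16_ZMM,
	MODEL_XFEATURE_PT,
	MODEL_XFEATURE_PKRU,
	MODEL_XFEATURE_PASID,
	MODEL_XFEATURE_MAX,
};

/* Each state component owns one bit, at the index given by the enum. */
static uint64_t model_xfeature_mask(enum model_xfeature f)
{
	assert(f < MODEL_XFEATURE_MAX);
	return UINT64_C(1) << f;
}

/* Models a supervisor mask that contains only the PASID component. */
static bool model_is_supported_supervisor(enum model_xfeature f)
{
	uint64_t supported = model_xfeature_mask(MODEL_XFEATURE_PASID);

	return (model_xfeature_mask(f) & supported) != 0;
}
```

Under these assumptions the PASID component is index 10 and its mask is 0x400; a user-state component such as PKRU is not part of the supervisor mask.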
Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com>
Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
v2:
- Modify the commit message (Thomas)
arch/x86/include/asm/fpu/types.h | 10 ++++++++++
arch/x86/include/asm/fpu/xstate.h | 2 +-
arch/x86/kernel/fpu/xstate.c | 4 ++++
3 files changed, 15 insertions(+), 1 deletion(-)
diff --git a/arch/x86/include/asm/fpu/types.h b/arch/x86/include/asm/fpu/types.h
index f098f6cab94b..00f8efd4c07d 100644
--- a/arch/x86/include/asm/fpu/types.h
+++ b/arch/x86/include/asm/fpu/types.h
@@ -114,6 +114,7 @@ enum xfeature {
XFEATURE_Hi16_ZMM,
XFEATURE_PT_UNIMPLEMENTED_SO_FAR,
XFEATURE_PKRU,
+ XFEATURE_PASID,
XFEATURE_MAX,
};
@@ -128,6 +129,7 @@ enum xfeature {
#define XFEATURE_MASK_Hi16_ZMM (1 << XFEATURE_Hi16_ZMM)
#define XFEATURE_MASK_PT (1 << XFEATURE_PT_UNIMPLEMENTED_SO_FAR)
#define XFEATURE_MASK_PKRU (1 << XFEATURE_PKRU)
+#define XFEATURE_MASK_PASID (1 << XFEATURE_PASID)
#define XFEATURE_MASK_FPSSE (XFEATURE_MASK_FP | XFEATURE_MASK_SSE)
#define XFEATURE_MASK_AVX512 (XFEATURE_MASK_OPMASK \
@@ -229,6 +231,14 @@ struct pkru_state {
u32 pad;
} __packed;
+/*
+ * State component 10 is supervisor state used for context-switching the
+ * PASID state.
+ */
+struct ia32_pasid_state {
+ u64 pasid;
+} __packed;
+
struct xstate_header {
u64 xfeatures;
u64 xcomp_bv;
diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
index 422d8369012a..ab9833c57aaa 100644
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -33,7 +33,7 @@
XFEATURE_MASK_BNDCSR)
/* All currently supported supervisor features */
-#define XFEATURE_MASK_SUPERVISOR_SUPPORTED (0)
+#define XFEATURE_MASK_SUPERVISOR_SUPPORTED (XFEATURE_MASK_PASID)
/*
* Unsupported supervisor features. When a supervisor feature in this mask is
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index bda2e5eaca0e..31629e43383c 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -37,6 +37,7 @@ static const char *xfeature_names[] =
"AVX-512 ZMM_Hi256" ,
"Processor Trace (unused)" ,
"Protection Keys User registers",
+ "PASID state",
"unknown xstate feature" ,
};
@@ -51,6 +52,7 @@ static short xsave_cpuid_features[] __initdata = {
X86_FEATURE_AVX512F,
X86_FEATURE_INTEL_PT,
X86_FEATURE_PKU,
+ X86_FEATURE_ENQCMD,
};
/*
@@ -316,6 +318,7 @@ static void __init print_xstate_features(void)
print_xstate_feature(XFEATURE_MASK_ZMM_Hi256);
print_xstate_feature(XFEATURE_MASK_Hi16_ZMM);
print_xstate_feature(XFEATURE_MASK_PKRU);
+ print_xstate_feature(XFEATURE_MASK_PASID);
}
/*
@@ -590,6 +593,7 @@ static void check_xstate_against_struct(int nr)
XCHECK_SZ(sz, nr, XFEATURE_ZMM_Hi256, struct avx_512_zmm_uppers_state);
XCHECK_SZ(sz, nr, XFEATURE_Hi16_ZMM, struct avx_512_hi16_state);
XCHECK_SZ(sz, nr, XFEATURE_PKRU, struct pkru_state);
+ XCHECK_SZ(sz, nr, XFEATURE_PASID, struct ia32_pasid_state);
/*
* Make *SURE* to add any feature numbers in below if
--
2.19.1
* [PATCH v2 10/12] x86/process: Clear PASID state for a newly forked/cloned thread
From: Fenghua Yu @ 2020-06-13 0:41 UTC (permalink / raw)
To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
David Woodhouse, Lu Baolu, Frederic Barrat, Andrew Donnellan,
Felix Kuehling, Joerg Roedel, Dave Hansen, Tony Luck, Ashok Raj,
Jacob Jun Pan, Dave Jiang, Yu-cheng Yu, Sohil Mehta,
Ravi V Shankar
Cc: Fenghua Yu, x86, linux-kernel, amd-gfx, iommu, linuxppc-dev
In-Reply-To: <1592008893-9388-1-git-send-email-fenghua.yu@intel.com>
The PASID state has to be cleared on forks, since the child has a
different address space. The PASID is also cleared for thread clone. While
it would be correct to inherit the PASID in this case, it is unknown
whether the new task will use ENQCMD. Giving it the PASID "just in case"
would have the downside of increased context switch overhead from setting
the PASID MSR.
Since #GP faults have to be handled on any threads that were created before
the PASID was assigned to the mm of the process, newly created threads
might as well be treated in a consistent way.
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
v2:
- Modify init_task_pasid().
arch/x86/kernel/process.c | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index f362ce0d5ac0..1b1492e337a6 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -121,6 +121,21 @@ static int set_new_tls(struct task_struct *p, unsigned long tls)
return do_set_thread_area_64(p, ARCH_SET_FS, tls);
}
+/* Initialize the PASID state for the forked/cloned thread. */
+static void init_task_pasid(struct task_struct *task)
+{
+ struct ia32_pasid_state *ppasid;
+
+ /*
+ * Initialize the PASID state so that the PASID MSR will be
+ * initialized to its initial state (0) by XRSTORS when the task is
+ * scheduled for the first time.
+ */
+ ppasid = get_xsave_addr(&task->thread.fpu.state.xsave, XFEATURE_PASID);
+ if (ppasid)
+ ppasid->pasid = INIT_PASID;
+}
+
int copy_thread_tls(unsigned long clone_flags, unsigned long sp,
unsigned long arg, struct task_struct *p, unsigned long tls)
{
@@ -174,6 +189,9 @@ int copy_thread_tls(unsigned long clone_flags, unsigned long sp,
task_user_gs(p) = get_user_gs(current_pt_regs());
#endif
+ if (static_cpu_has(X86_FEATURE_ENQCMD))
+ init_task_pasid(p);
+
/* Set a new TLS for the child thread? */
if (clone_flags & CLONE_SETTLS)
ret = set_new_tls(p, tls);
--
2.19.1
* [PATCH v2 09/12] fork: Clear PASID for new mm
From: Fenghua Yu @ 2020-06-13 0:41 UTC (permalink / raw)
To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
David Woodhouse, Lu Baolu, Frederic Barrat, Andrew Donnellan,
Felix Kuehling, Joerg Roedel, Dave Hansen, Tony Luck, Ashok Raj,
Jacob Jun Pan, Dave Jiang, Yu-cheng Yu, Sohil Mehta,
Ravi V Shankar
Cc: Fenghua Yu, x86, linux-kernel, amd-gfx, iommu, linuxppc-dev
In-Reply-To: <1592008893-9388-1-git-send-email-fenghua.yu@intel.com>
When a new mm is created, its PASID should be cleared, i.e. the PASID is
initialized to its init state 0 on both ARM and X86.
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
v2:
- Add this patch to initialize PASID value for a new mm.
include/linux/mm_types.h | 2 ++
kernel/fork.c | 8 ++++++++
2 files changed, 10 insertions(+)
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 5778db3aa42d..904bc07411a9 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -22,6 +22,8 @@
#endif
#define AT_VECTOR_SIZE (2*(AT_VECTOR_SIZE_ARCH + AT_VECTOR_SIZE_BASE + 1))
+/* Initial PASID value is 0. */
+#define INIT_PASID 0
struct address_space;
struct mem_cgroup;
diff --git a/kernel/fork.c b/kernel/fork.c
index 142b23645d82..085e72d3e9eb 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1007,6 +1007,13 @@ static void mm_init_owner(struct mm_struct *mm, struct task_struct *p)
#endif
}
+static void mm_init_pasid(struct mm_struct *mm)
+{
+#ifdef CONFIG_PCI_PASID
+ mm->pasid = INIT_PASID;
+#endif
+}
+
static void mm_init_uprobes_state(struct mm_struct *mm)
{
#ifdef CONFIG_UPROBES
@@ -1035,6 +1042,7 @@ static struct mm_struct *mm_init(struct mm_struct *mm, struct task_struct *p,
mm_init_cpumask(mm);
mm_init_aio(mm);
mm_init_owner(mm, p);
+ mm_init_pasid(mm);
RCU_INIT_POINTER(mm->exe_file, NULL);
mmu_notifier_subscriptions_init(mm);
init_tlb_flush_pending(mm);
--
2.19.1
* [PATCH v2 11/12] x86/mmu: Allocate/free PASID
From: Fenghua Yu @ 2020-06-13 0:41 UTC (permalink / raw)
To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
David Woodhouse, Lu Baolu, Frederic Barrat, Andrew Donnellan,
Felix Kuehling, Joerg Roedel, Dave Hansen, Tony Luck, Ashok Raj,
Jacob Jun Pan, Dave Jiang, Yu-cheng Yu, Sohil Mehta,
Ravi V Shankar
Cc: Fenghua Yu, x86, linux-kernel, amd-gfx, iommu, linuxppc-dev
In-Reply-To: <1592008893-9388-1-git-send-email-fenghua.yu@intel.com>
A PASID is allocated for an "mm" the first time any thread attaches
to an SVM capable device. Later device attachments (whether to the same
device or another SVM device) will re-use the same PASID.
The PASID is freed when the process exits (so no need to keep
reference counts on how many SVM devices are sharing the PASID).
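The lifetime described above can be sketched as a small userspace model. This is only an illustration under stated assumptions: every model_ name is hypothetical, the fixed-size table stands in for the real IOASID allocator, and 0 plays the role of the "no PASID" initial value.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_INIT_PASID 0u	/* "no PASID allocated" */
#define MODEL_PASID_MAX  8u	/* tiny ID space, for illustration only */

static bool model_in_use[MODEL_PASID_MAX];

struct model_mm {
	unsigned int pasid;	/* MODEL_INIT_PASID until the first bind */
};

/* Hand out the lowest free non-zero ID, or MODEL_INIT_PASID if exhausted. */
static unsigned int model_alloc(void)
{
	for (unsigned int id = 1; id < MODEL_PASID_MAX; id++) {
		if (!model_in_use[id]) {
			model_in_use[id] = true;
			return id;
		}
	}
	return MODEL_INIT_PASID;
}

/* First bind allocates; every later bind reuses the PASID kept in the mm. */
static unsigned int model_bind(struct model_mm *mm)
{
	if (mm->pasid == MODEL_INIT_PASID)
		mm->pasid = model_alloc();
	return mm->pasid;
}

/* Process exit frees the PASID once; no per-device reference counting. */
static void model_exit(struct model_mm *mm)
{
	assert(mm->pasid < MODEL_PASID_MAX);
	if (mm->pasid != MODEL_INIT_PASID) {
		model_in_use[mm->pasid] = false;
		mm->pasid = MODEL_INIT_PASID;
	}
}
```

In this model, unbinding a device does nothing to the ID; only model_exit() returns it to the pool, which is why no reference count is needed.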
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
v2:
- Define a helper free_bind() to simplify error exit code in bind_mm()
(Thomas)
- Fix a ret error code in bind_mm() (Thomas)
- Change pasid's type from "int" to "unsigned int" to have consistent
pasid type in iommu (Thomas)
- Simplify alloc_pasid() a bit.
arch/x86/include/asm/iommu.h | 2 +
arch/x86/include/asm/mmu_context.h | 14 ++++
drivers/iommu/intel/svm.c | 101 +++++++++++++++++++++++++----
3 files changed, 105 insertions(+), 12 deletions(-)
diff --git a/arch/x86/include/asm/iommu.h b/arch/x86/include/asm/iommu.h
index bf1ed2ddc74b..ed41259fe7ac 100644
--- a/arch/x86/include/asm/iommu.h
+++ b/arch/x86/include/asm/iommu.h
@@ -26,4 +26,6 @@ arch_rmrr_sanity_check(struct acpi_dmar_reserved_memory *rmrr)
return -EINVAL;
}
+void __free_pasid(struct mm_struct *mm);
+
#endif /* _ASM_X86_IOMMU_H */
diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h
index 47562147e70b..f8c91ce8c451 100644
--- a/arch/x86/include/asm/mmu_context.h
+++ b/arch/x86/include/asm/mmu_context.h
@@ -13,6 +13,7 @@
#include <asm/tlbflush.h>
#include <asm/paravirt.h>
#include <asm/debugreg.h>
+#include <asm/iommu.h>
extern atomic64_t last_mm_ctx_id;
@@ -117,9 +118,22 @@ static inline int init_new_context(struct task_struct *tsk,
init_new_context_ldt(mm);
return 0;
}
+
+static inline void free_pasid(struct mm_struct *mm)
+{
+ if (!IS_ENABLED(CONFIG_INTEL_IOMMU_SVM))
+ return;
+
+ if (!cpu_feature_enabled(X86_FEATURE_ENQCMD))
+ return;
+
+ __free_pasid(mm);
+}
+
static inline void destroy_context(struct mm_struct *mm)
{
destroy_context_ldt(mm);
+ free_pasid(mm);
}
extern void switch_mm(struct mm_struct *prev, struct mm_struct *next,
diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index 4e775e12ae52..27dc866b8461 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -425,6 +425,53 @@ int intel_svm_unbind_gpasid(struct device *dev, unsigned int pasid)
return ret;
}
+static void free_bind(struct intel_svm *svm, struct intel_svm_dev *sdev,
+ bool new_pasid)
+{
+ if (new_pasid)
+ ioasid_free(svm->pasid);
+ kfree(svm);
+ kfree(sdev);
+}
+
+/*
+ * If this mm already has a PASID, use it. Otherwise allocate a new one.
+ * Let the caller know if a new PASID is allocated via 'new_pasid'.
+ */
+static int alloc_pasid(struct intel_svm *svm, struct mm_struct *mm,
+ unsigned int pasid_max, bool *new_pasid,
+ unsigned int flags)
+{
+ unsigned int pasid;
+
+ *new_pasid = false;
+
+ /*
+ * Reuse the PASID if the mm already has a PASID and not a private
+ * PASID is requested.
+ */
+ if (mm && mm->pasid && !(flags & SVM_FLAG_PRIVATE_PASID)) {
+ /*
+ * Once a PASID is allocated for this mm, the PASID
+ * stays with the mm until the mm is dropped. Reuse
+ * the PASID which has been already allocated for the
+ * mm instead of allocating a new one.
+ */
+ ioasid_set_data(mm->pasid, svm);
+
+ return mm->pasid;
+ }
+
+ /* Allocate a new pasid. Do not use PASID 0, reserved for init PASID. */
+ pasid = ioasid_alloc(NULL, PASID_MIN, pasid_max - 1, svm);
+ if (pasid != INVALID_IOASID) {
+ /* A new pasid is allocated. */
+ *new_pasid = true;
+ }
+
+ return pasid;
+}
+
/* Caller must hold pasid_mutex, mm reference */
static int
intel_svm_bind_mm(struct device *dev, unsigned int flags,
@@ -518,6 +565,8 @@ intel_svm_bind_mm(struct device *dev, unsigned int flags,
init_rcu_head(&sdev->rcu);
if (!svm) {
+ bool new_pasid;
+
svm = kzalloc(sizeof(*svm), GFP_KERNEL);
if (!svm) {
ret = -ENOMEM;
@@ -529,12 +578,9 @@ intel_svm_bind_mm(struct device *dev, unsigned int flags,
if (pasid_max > intel_pasid_max_id)
pasid_max = intel_pasid_max_id;
- /* Do not use PASID 0, reserved for RID to PASID */
- svm->pasid = ioasid_alloc(NULL, PASID_MIN,
- pasid_max - 1, svm);
+ svm->pasid = alloc_pasid(svm, mm, pasid_max, &new_pasid, flags);
if (svm->pasid == INVALID_IOASID) {
- kfree(svm);
- kfree(sdev);
+ free_bind(svm, sdev, new_pasid);
ret = -ENOSPC;
goto out;
}
@@ -547,9 +593,7 @@ intel_svm_bind_mm(struct device *dev, unsigned int flags,
if (mm) {
ret = mmu_notifier_register(&svm->notifier, mm);
if (ret) {
- ioasid_free(svm->pasid);
- kfree(svm);
- kfree(sdev);
+ free_bind(svm, sdev, new_pasid);
goto out;
}
}
@@ -565,12 +609,18 @@ intel_svm_bind_mm(struct device *dev, unsigned int flags,
if (ret) {
if (mm)
mmu_notifier_unregister(&svm->notifier, mm);
- ioasid_free(svm->pasid);
- kfree(svm);
- kfree(sdev);
+ free_bind(svm, sdev, new_pasid);
goto out;
}
+ if (mm && new_pasid && !(flags & SVM_FLAG_PRIVATE_PASID)) {
+ /*
+ * Track the new pasid in the mm. The pasid will be
+ * freed at process exit. Don't track requested
+ * private PASID in the mm.
+ */
+ mm->pasid = svm->pasid;
+ }
list_add_tail(&svm->list, &global_svm_list);
} else {
/*
@@ -640,7 +690,8 @@ static int intel_svm_unbind_mm(struct device *dev, unsigned int pasid)
kfree_rcu(sdev, rcu);
if (list_empty(&svm->devs)) {
- ioasid_free(svm->pasid);
+ /* Clear data in the pasid. */
+ ioasid_set_data(pasid, NULL);
if (svm->mm)
mmu_notifier_unregister(&svm->notifier, svm->mm);
list_del(&svm->list);
@@ -1001,3 +1052,29 @@ unsigned int intel_svm_get_pasid(struct iommu_sva *sva)
return pasid;
}
+
+/*
+ * An invalid pasid is either 0 (init PASID value) or bigger than max PASID
+ * (PASID_MAX - 1).
+ */
+static bool invalid_pasid(unsigned int pasid)
+{
+ return (pasid == INIT_PASID) || (pasid >= PASID_MAX);
+}
+
+/* On process exit free the PASID (if one was allocated). */
+void __free_pasid(struct mm_struct *mm)
+{
+ unsigned int pasid = mm->pasid;
+
+ /* No need to free invalid pasid. */
+ if (invalid_pasid(pasid))
+ return;
+
+ /*
+ * Since the pasid is not bound to any svm by now, there is no race
+ * here with binding/unbinding and no need to protect the free
+ * operation by pasid_mutex.
+ */
+ ioasid_free(pasid);
+}
--
2.19.1
* [PATCH v2 12/12] x86/traps: Fix up invalid PASID
From: Fenghua Yu @ 2020-06-13 0:41 UTC (permalink / raw)
To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
David Woodhouse, Lu Baolu, Frederic Barrat, Andrew Donnellan,
Felix Kuehling, Joerg Roedel, Dave Hansen, Tony Luck, Ashok Raj,
Jacob Jun Pan, Dave Jiang, Yu-cheng Yu, Sohil Mehta,
Ravi V Shankar
Cc: Fenghua Yu, x86, linux-kernel, amd-gfx, iommu, linuxppc-dev
In-Reply-To: <1592008893-9388-1-git-send-email-fenghua.yu@intel.com>
A #GP fault is generated when ENQCMD instruction is executed without
a valid PASID value programmed in the current thread's PASID MSR. The
#GP fault handler will initialize the MSR if a PASID has been allocated
for this process.
Decoding the user instruction is ugly and sets a bad architecture
precedent. It may not function if the faulting instruction is modified
after #GP.
Thomas suggested providing a reason for the #GP caused by executing ENQCMD
without a valid PASID value programmed. #GP error codes are 16 bits and all
16 bits are taken. Refer to SDM Vol 3, Chapter 16.13 for details. The other
choice was to reflect the error code in an MSR. ENQCMD can also cause #GP
when loading from the source operand, so such a reason code would not cover
all the causes. Rather than special-casing ENQCMD, Intel may in future
choose a different fault mechanism for such cases if recovery is needed on
#GP.
The following heuristic is used to avoid decoding the user instructions
to determine the precise reason for the #GP fault:
1) If the mm for the process has not been allocated a PASID, this #GP
cannot be fixed.
2) If the PASID MSR is already initialized, then the #GP was for some
other reason.
3) Try initializing the PASID MSR and returning. If the #GP was from
an ENQCMD this will fix it. If not, the #GP fault will be repeated
and will hit case "2".
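The three cases above can be sketched as a userspace model. This is an illustration, not the kernel code: the model_ names are hypothetical, the per-thread MSR is just a struct field, and placing the valid flag at bit 31 is an assumption about the MSR layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_INIT_PASID  0u
#define MODEL_PASID_VALID (UINT64_C(1) << 31)	/* assumed flag position */

struct model_task {
	unsigned int mm_pasid;	/* PASID recorded in the process's mm, 0 if none */
	uint64_t pasid_msr;	/* this thread's image of the PASID MSR */
};

/* Returns true if the MSR was fixed up and the instruction should be retried. */
static bool model_fixup_pasid(struct model_task *t)
{
	assert(t);

	/* Case 1: the mm has no PASID, so this #GP cannot be fixed. */
	if (t->mm_pasid == MODEL_INIT_PASID)
		return false;

	/* Case 2: the MSR is already valid, so the #GP had another cause. */
	if (t->pasid_msr & MODEL_PASID_VALID)
		return false;

	/* Case 3: initialize the MSR and let the faulting instruction rerun. */
	t->pasid_msr = t->mm_pasid | MODEL_PASID_VALID;
	return true;
}
```

If the guess in case 3 is wrong, the retried instruction faults again and the second call returns false through case 2, so a wrong guess costs at most one extra fault.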
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
v2:
- Update the first paragraph of the commit message (Thomas)
- Add reasons why the user instruction is not decoded and why the
#GP error code is not used (Thomas)
- Change get_task_mm() to current->mm (Thomas)
- Add comments on why IRQ is disabled during PASID fixup (Thomas)
- Add comment in fixup() that the function is called when #GP is from
user (so mm is not NULL) (Dave Hansen)
arch/x86/include/asm/iommu.h | 1 +
arch/x86/kernel/traps.c | 23 +++++++++++++++++++++
drivers/iommu/intel/svm.c | 39 ++++++++++++++++++++++++++++++++++++
3 files changed, 63 insertions(+)
diff --git a/arch/x86/include/asm/iommu.h b/arch/x86/include/asm/iommu.h
index ed41259fe7ac..e9365a5d6f7d 100644
--- a/arch/x86/include/asm/iommu.h
+++ b/arch/x86/include/asm/iommu.h
@@ -27,5 +27,6 @@ arch_rmrr_sanity_check(struct acpi_dmar_reserved_memory *rmrr)
}
void __free_pasid(struct mm_struct *mm);
+bool __fixup_pasid_exception(void);
#endif /* _ASM_X86_IOMMU_H */
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 4cc541051994..0f78d5cdddfe 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -59,6 +59,7 @@
#include <asm/umip.h>
#include <asm/insn.h>
#include <asm/insn-eval.h>
+#include <asm/iommu.h>
#ifdef CONFIG_X86_64
#include <asm/x86_init.h>
@@ -436,6 +437,16 @@ static enum kernel_gp_hint get_kernel_gp_address(struct pt_regs *regs,
return GP_CANONICAL;
}
+static bool fixup_pasid_exception(void)
+{
+ if (!IS_ENABLED(CONFIG_INTEL_IOMMU_SVM))
+ return false;
+ if (!static_cpu_has(X86_FEATURE_ENQCMD))
+ return false;
+
+ return __fixup_pasid_exception();
+}
+
#define GPFSTR "general protection fault"
dotraplinkage void do_general_protection(struct pt_regs *regs, long error_code)
@@ -447,6 +458,18 @@ dotraplinkage void do_general_protection(struct pt_regs *regs, long error_code)
int ret;
RCU_LOCKDEP_WARN(!rcu_is_watching(), "entry code didn't wake RCU");
+
+ /*
+ * Perform the check for a user mode PASID exception before enable
+ * interrupts. Doing this here ensures that the PASID MSR can be simply
+ * accessed because the contents are known to be still associated
+ * with the current process.
+ */
+ if (user_mode(regs) && fixup_pasid_exception()) {
+ cond_local_irq_enable(regs);
+ return;
+ }
+
cond_local_irq_enable(regs);
if (static_cpu_has(X86_FEATURE_UMIP)) {
diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index 27dc866b8461..81fd2380c0f9 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -1078,3 +1078,42 @@ void __free_pasid(struct mm_struct *mm)
*/
ioasid_free(pasid);
}
+
+/*
+ * Apply some heuristics to see if the #GP fault was caused by a thread
+ * that hasn't had the IA32_PASID MSR initialized. If it looks like that
+ * is the problem, try initializing the IA32_PASID MSR. If the heuristic
+ * guesses incorrectly, take one more #GP fault.
+ */
+bool __fixup_pasid_exception(void)
+{
+ u64 pasid_msr;
+ unsigned int pasid;
+
+ /*
+ * This function is called only when this #GP was triggered from user
+ * space. So the mm cannot be NULL.
+ */
+ pasid = current->mm->pasid;
+ /* If the mm doesn't have a valid PASID, then can't help. */
+ if (invalid_pasid(pasid))
+ return false;
+
+ /*
+ * Since IRQ is disabled now, the current task still owns the FPU on
+ * this CPU and the PASID MSR can be directly accessed.
+ *
+ * If the MSR has a valid PASID, the #GP must be for some other reason.
+ *
+ * If rdmsr() is really a performance issue, a TIF_ flag may be
+ * added to check if the thread has a valid PASID instead of rdmsr().
+ */
+ rdmsrl(MSR_IA32_PASID, pasid_msr);
+ if (pasid_msr & MSR_IA32_PASID_VALID)
+ return false;
+
+ /* Fix up the MSR if the MSR doesn't have a valid PASID. */
+ wrmsrl(MSR_IA32_PASID, pasid | MSR_IA32_PASID_VALID);
+
+ return true;
+}
--
2.19.1
* Re: [PATCH v4 1/2] powerpc/uaccess: Implement unsafe_put_user() using 'asm goto'
From: Segher Boessenkool @ 2020-06-13 1:08 UTC (permalink / raw)
To: Nick Desaulniers
Cc: Christophe Leroy, Michael Ellerman, LKML, Nicholas Piggin,
clang-built-linux, Paul Mackerras, linuxppc-dev
In-Reply-To: <CAKwvOdkKywb1KZ-SDwwuvQEmbsaAzJj9mEPqVG=qw1F5Ogv8rw@mail.gmail.com>
Hi!
On Fri, Jun 12, 2020 at 02:33:09PM -0700, Nick Desaulniers wrote:
> On Thu, Jun 11, 2020 at 4:53 PM Segher Boessenkool
> <segher@kernel.crashing.org> wrote:
> > The PowerPC part of
> > https://gcc.gnu.org/onlinedocs/gcc/Machine-Constraints.html#Machine-Constraints
> > (sorry, no anchor) documents %U.
>
> I thought those were constraints, not output templates? Oh,
> The asm statement must also use %U<opno> as a placeholder for the
> “update” flag in the corresponding load or store instruction.
> got it.
Traditionally, *all* constraints were documented here, including the
ones that are only meant for GCC's internal use. And the output
modifiers were largely not documented at all.
For GCC 10, for Power, I changed it to only document the constraints
that should be public in gcc.info (and everything in gccint.info). The
output modifiers can neatly be documented here as well, since it is such a
short section now. We're not quite there yet, but getting there.
> > Traditionally the source code is the documentation for this. The code
> > here starts with the comment
> > /* Write second word of DImode or DFmode reference. Works on register
> > or non-indexed memory only. */
> > (which is very out-of-date itself, it works fine for e.g. TImode as well,
> > but alas).
> >
> > Unit tests are completely unsuitable for most compiler things like this.
>
> What? No, surely one may write tests for output operands. Grepping
> for `%L` in gcc/ was less fun than I was hoping.
You should look for 'L' instead (incl. those quotes) ;-)
Unit tests are 100x as much work, and get <5% of the problems, compared
to regression tests. Unit tests only test the stuff you should have
written *anyway*. It is much more useful to test that much higher level
things work, IMNSHO.
> > HtH,
>
> Yes, perfect, thank you so much! So it looks like LLVM does not yet
> handle %L properly for memory operands.
> https://bugs.llvm.org/show_bug.cgi?id=46186#c4
> It's neat to see how this is implemented in GCC (and how many aren't
> implemented in LLVM, yikes :( ). For reference, this is implemented
> in PPCAsmPrinter::PrintAsmOperand() and
> PPCAsmPrinter::PrintAsmMemoryOperand() in
> llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp. GCC switches first on the
> modifier characters, then the operand type.
That is what the rs6000 backend currently does, yeah. The print_operand
function just gets passed the modifier character (as "int code", or 0 if
there is no modifier). Since there are so many modifiers there aren't
really any better options than just doing a "switch (code)" around
everything else (well, things can be factored, some helper functions,
etc., but this is mostly very old code, and it has grown organically).
> LLVM dispatches on operand type, then modifier.
That is neater, certainly for REG operands.
> When I was looking into LLVM's AsmPrinter class,
> I was surprised to see it's basically an assembler that just has
> complex logic to just do a bunch of prints, so it makes sense to see
> that pattern in GCC literally calling printf.
GCC always outputs assembler code. This is usually a big advantage, for
things like output_operand.
> Some things I don't understand from PPC parlance is the "mode"
> (preinc, predec, premodify) and small data operands?
"mode" is "machine mode" -- SImode and the like. PRE_DEC etc. are
*codes* (rtx codes), like, (mem:DF (pre_dec:SI (reg:SI 39))) (straight
from the manual).
> IIUC the bug report correctly, it looks like LLVM is failing for the
> __put_user_asm2_goto case for -m32. A simple reproducer:
> https://godbolt.org/z/jBBF9b
>
> void foo(long long in, long long* out) {
> asm volatile(
> "stw%X1 %0, %1\n\t"
> "stw%X1 %L0, %L1"
> ::"r"(in), "m"(*out));
> }
This is wrong if operands[0] is a register, btw. So it should use 'o'
as constraint (not 'm'), and then the 'X' output modifier has become
useless.
> prints (in GCC):
> foo:
> stw 3, 0(5)
> stw 4, 4(5)
> blr
> (first time looking at ppc assembler, seems constants and registers
> are not as easy to distinguish,
The instruction mnemonic always tells you what types all arguments are.
Traditionally we don't write spaces after commas, either. That is
actually easier to read -- well, if you are used to it, anyway! :-)
> https://developer.ibm.com/technologies/linux/articles/l-ppc/ say "Get
> used to it." LOL, ok).
Since quite a while you can write your assembler using register names as
well. Not using the dangerous macros the Linux kernel had/has (with
which you can write "rN" in place of any "N", and it doesn't force you
to use the register name either, so you could write "li r3,r4" and
"mr r3,0" and even "addi r3,r0,1234", all very misleading).
> so that's "store word from register 3 into dereference of register 5
> plus 0, then store word from register 4 into dereference of register 5
> plus 4?"
Yup.
> Guessing the ppc32 abi is ILP32 putting long long's into two
> separate registers?
Yes, and the order is the same as it would be in memory (on BE, high
half goes into the lower-numbered register; on LE, the wr^Wother way
around).
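To make that ordering concrete, here is a small portable sketch (plain C, nothing PowerPC-specific; the helper names are hypothetical). It splits a 64-bit value into its two 32-bit words in memory order. Under the convention just described, "first" corresponds to the lower-numbered register of the pair and "second" to the next one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct word_pair {
	uint32_t first;		/* word at the lower address */
	uint32_t second;	/* word at the higher address */
};

/* Split v the way it would be laid out in memory for the given endianness. */
static struct word_pair split_in_memory_order(uint64_t v, bool big_endian)
{
	uint32_t high = (uint32_t)(v >> 32);
	uint32_t low = (uint32_t)v;
	struct word_pair p;

	p.first = big_endian ? high : low;
	p.second = big_endian ? low : high;
	return p;
}

/* Small self-check showing both layouts for one value. */
static void split_demo(void)
{
	struct word_pair be = split_in_memory_order(UINT64_C(0x1122334455667788), true);
	struct word_pair le = split_in_memory_order(UINT64_C(0x1122334455667788), false);

	assert(be.first == 0x11223344u && be.second == 0x55667788u);
	assert(le.first == 0x55667788u && le.second == 0x11223344u);
}
```

The point of the model is that "second word" and "high half" only coincide on little-endian, which is why the two should not be confused.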
> Seems easy to implement in LLVM (short of those modes/small data operands).
I don't know what SDATA variants LLVM does support?
> https://reviews.llvm.org/D81767
Output modifiers are not just for use by the calling convention (as your
examples already show :-) )
%Ln is the second word of a multi-word reference, not the "upper word"
(%Yn is third, %Zn is fourth, and for BE it isn't the high half even
for 2-word things).
The code looks like it will work (I don't know most LLVM specifics of
course).
Cheers,
Segher
* Re: [PATCH v4 1/2] powerpc/uaccess: Implement unsafe_put_user() using 'asm goto'
From: Christophe Leroy @ 2020-06-13 6:46 UTC (permalink / raw)
To: Nick Desaulniers, Segher Boessenkool
Cc: Christophe Leroy, Michael Ellerman, LKML, Nicholas Piggin,
clang-built-linux, Paul Mackerras, linuxppc-dev
In-Reply-To: <CAKwvOdkKywb1KZ-SDwwuvQEmbsaAzJj9mEPqVG=qw1F5Ogv8rw@mail.gmail.com>
On 06/12/2020 09:33 PM, Nick Desaulniers wrote:
>
> IIUC the bug report correctly, it looks like LLVM is failing for the
> __put_user_asm2_goto case for -m32. A simple reproducer:
> https://godbolt.org/z/jBBF9b
>
> void foo(long long in, long long* out) {
> asm volatile(
> "stw%X1 %0, %1\n\t"
> "stw%X1 %L0, %L1"
> ::"r"(in), "m"(*out));
> }
> prints (in GCC):
> foo:
> stw 3, 0(5)
> stw 4, 4(5)
> blr
> (first time looking at ppc assembler, seems constants and registers
> are not as easy to distinguish,
> https://developer.ibm.com/technologies/linux/articles/l-ppc/ say "Get
> used to it." LOL, ok).
When I do ppc-linux-objdump -d vmlinux, registers and constants are
easily distinguished, see below.
c0002284 <start_here>:
c0002284: 3c 40 c0 3c lis r2,-16324
c0002288: 60 42 45 00 ori r2,r2,17664
c000228c: 3c 82 40 00 addis r4,r2,16384
c0002290: 38 84 04 30 addi r4,r4,1072
c0002294: 7c 93 43 a6 mtsprg 3,r4
c0002298: 3c 20 c0 3e lis r1,-16322
c000229c: 38 21 e0 00 addi r1,r1,-8192
c00022a0: 38 00 00 00 li r0,0
c00022a4: 94 01 1f f0 stwu r0,8176(r1)
c00022a8: 48 35 e7 41 bl c03609e8 <early_init>
c00022ac: 38 60 00 00 li r3,0
c00022b0: 7f e4 fb 78 mr r4,r31
c00022b4: 48 35 e7 8d bl c0360a40 <machine_init>
c00022b8: 48 35 eb e1 bl c0360e98 <MMU_init>
c00022bc: 3c c0 c0 3c lis r6,-16324
c00022c0: 3c c6 40 00 addis r6,r6,16384
c00022c4: 7c df c3 a6 mtspr 799,r6
c00022c8: 3c 80 c0 00 lis r4,-16384
c00022cc: 60 84 22 e4 ori r4,r4,8932
c00022d0: 3c 84 40 00 addis r4,r4,16384
c00022d4: 38 60 10 02 li r3,4098
c00022d8: 7c 9a 03 a6 mtsrr0 r4
c00022dc: 7c 7b 03 a6 mtsrr1 r3
c00022e0: 4c 00 00 64 rfi
c00022e4: 7c 00 02 e4 tlbia
c00022e8: 7c 00 04 ac hwsync
c00022ec: 3c c6 c0 00 addis r6,r6,-16384
c00022f0: 3c a0 c0 3c lis r5,-16324
c00022f4: 60 a5 40 00 ori r5,r5,16384
c00022f8: 90 a0 00 f0 stw r5,240(0)
c00022fc: 3c a5 40 00 addis r5,r5,16384
c0002300: 90 c5 00 00 stw r6,0(r5)
c0002304: 38 80 10 32 li r4,4146
c0002308: 3c 60 c0 35 lis r3,-16331
c000230c: 60 63 d6 a8 ori r3,r3,54952
c0002310: 7c 7a 03 a6 mtsrr0 r3
c0002314: 7c 9b 03 a6 mtsrr1 r4
c0002318: 4c 00 00 64 rfi
For GCC, I think you can tell it you want register names with -mregnames
Christophe
* [powerpc:next-test] BUILD SUCCESS 3371673d42d314f9ac721dc5042135df8bec49f9
From: kernel test robot @ 2020-06-13 6:49 UTC (permalink / raw)
To: Michael Ellerman; +Cc: linuxppc-dev
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next-test
branch HEAD: 3371673d42d314f9ac721dc5042135df8bec49f9 powerpc/xive: Ignore kmemleak false positives
elapsed time: 510m
configs tested: 113
configs skipped: 111
The following configs have been built successfully.
More configs may be tested in the coming days.
arm defconfig
arm allyesconfig
arm allmodconfig
arm allnoconfig
arm64 allyesconfig
arm64 defconfig
arm64 allmodconfig
arm64 allnoconfig
s390 zfcpdump_defconfig
arc vdk_hs38_defconfig
arm moxart_defconfig
arc tb10x_defconfig
openrisc allyesconfig
arm efm32_defconfig
powerpc pq2fads_defconfig
arm tango4_defconfig
c6x evmc6472_defconfig
arm ixp4xx_defconfig
i386 allnoconfig
i386 allyesconfig
i386 defconfig
i386 debian-10.3
ia64 allmodconfig
ia64 defconfig
ia64 allnoconfig
ia64 allyesconfig
m68k allmodconfig
m68k allnoconfig
m68k sun3_defconfig
m68k defconfig
m68k allyesconfig
nios2 defconfig
nios2 allyesconfig
openrisc defconfig
c6x allyesconfig
c6x allnoconfig
nds32 defconfig
nds32 allnoconfig
csky allyesconfig
csky defconfig
alpha defconfig
alpha allyesconfig
xtensa allyesconfig
h8300 allyesconfig
h8300 allmodconfig
xtensa defconfig
arc defconfig
arc allyesconfig
sh allmodconfig
sh allnoconfig
microblaze allnoconfig
mips allyesconfig
mips allnoconfig
mips allmodconfig
parisc allnoconfig
parisc defconfig
parisc allyesconfig
parisc allmodconfig
powerpc allyesconfig
powerpc rhel-kconfig
powerpc allmodconfig
powerpc allnoconfig
powerpc defconfig
i386 randconfig-a006-20200612
i386 randconfig-a002-20200612
i386 randconfig-a001-20200612
i386 randconfig-a004-20200612
i386 randconfig-a005-20200612
i386 randconfig-a003-20200612
x86_64 randconfig-a001-20200612
x86_64 randconfig-a003-20200612
x86_64 randconfig-a002-20200612
x86_64 randconfig-a006-20200612
x86_64 randconfig-a005-20200612
x86_64 randconfig-a004-20200612
x86_64 randconfig-a015-20200613
x86_64 randconfig-a011-20200613
x86_64 randconfig-a016-20200613
x86_64 randconfig-a014-20200613
x86_64 randconfig-a012-20200613
x86_64 randconfig-a013-20200613
i386 randconfig-a015-20200612
i386 randconfig-a011-20200612
i386 randconfig-a014-20200612
i386 randconfig-a016-20200612
i386 randconfig-a013-20200612
i386 randconfig-a012-20200612
riscv allyesconfig
riscv allnoconfig
riscv defconfig
riscv allmodconfig
s390 allyesconfig
s390 allnoconfig
s390 allmodconfig
s390 defconfig
sparc allyesconfig
sparc defconfig
sparc64 defconfig
sparc64 allnoconfig
sparc64 allyesconfig
sparc64 allmodconfig
um allmodconfig
um allnoconfig
um defconfig
um allyesconfig
x86_64 rhel-7.6
x86_64 rhel-7.6-kselftests
x86_64 rhel-8.3
x86_64 kexec
x86_64 rhel
x86_64 rhel-7.2-clear
x86_64 lkp
x86_64 fedora-25
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
* [powerpc:merge] BUILD SUCCESS 062ce06f9dcd140b6cd97102fec593a57c5fb397
From: kernel test robot @ 2020-06-13 6:49 UTC (permalink / raw)
To: Michael Ellerman; +Cc: linuxppc-dev
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git merge
branch HEAD: 062ce06f9dcd140b6cd97102fec593a57c5fb397 Automatic merge of 'master', 'next' and 'fixes' (2020-06-13 08:10)
elapsed time: 511m
configs tested: 142
configs skipped: 6
The following configs have been built successfully.
More configs may be tested in the coming days.
arm defconfig
arm allyesconfig
arm allmodconfig
arm allnoconfig
arm64 allyesconfig
arm64 defconfig
arm64 allmodconfig
arm64 allnoconfig
arm zeus_defconfig
arm socfpga_defconfig
parisc generic-64bit_defconfig
mips qi_lb60_defconfig
arm ezx_defconfig
arm pxa168_defconfig
xtensa virt_defconfig
arm moxart_defconfig
arm zx_defconfig
sh sh7770_generic_defconfig
arm imote2_defconfig
arm clps711x_defconfig
sh kfr2r09-romimage_defconfig
arc nsimosci_hs_smp_defconfig
xtensa iss_defconfig
riscv rv32_defconfig
c6x evmc6474_defconfig
sh urquell_defconfig
powerpc amigaone_defconfig
microblaze defconfig
arm lpd270_defconfig
um x86_64_defconfig
arm s3c6400_defconfig
sh ecovec24-romimage_defconfig
c6x dsk6455_defconfig
arm tct_hammer_defconfig
arm aspeed_g5_defconfig
sh microdev_defconfig
mips bmips_stb_defconfig
mips ip22_defconfig
s390 zfcpdump_defconfig
arc vdk_hs38_defconfig
arc tb10x_defconfig
openrisc allyesconfig
sparc allyesconfig
arm pxa910_defconfig
mips cobalt_defconfig
microblaze nommu_defconfig
arm colibri_pxa300_defconfig
m68k atari_defconfig
arc nsim_700_defconfig
arm efm32_defconfig
powerpc pq2fads_defconfig
arm tango4_defconfig
c6x evmc6472_defconfig
arm ixp4xx_defconfig
i386 allyesconfig
i386 defconfig
i386 debian-10.3
i386 allnoconfig
ia64 allmodconfig
ia64 defconfig
ia64 allnoconfig
ia64 allyesconfig
m68k allmodconfig
m68k allnoconfig
m68k sun3_defconfig
m68k defconfig
m68k allyesconfig
nios2 defconfig
nios2 allyesconfig
openrisc defconfig
c6x allyesconfig
c6x allnoconfig
nds32 defconfig
nds32 allnoconfig
csky allyesconfig
csky defconfig
alpha defconfig
alpha allyesconfig
xtensa allyesconfig
h8300 allyesconfig
h8300 allmodconfig
xtensa defconfig
arc defconfig
arc allyesconfig
sh allmodconfig
sh allnoconfig
microblaze allnoconfig
mips allyesconfig
mips allnoconfig
mips allmodconfig
parisc allnoconfig
parisc defconfig
parisc allyesconfig
parisc allmodconfig
powerpc allyesconfig
powerpc rhel-kconfig
powerpc allmodconfig
powerpc allnoconfig
powerpc defconfig
x86_64 randconfig-a001-20200612
x86_64 randconfig-a003-20200612
x86_64 randconfig-a002-20200612
x86_64 randconfig-a006-20200612
x86_64 randconfig-a005-20200612
x86_64 randconfig-a004-20200612
i386 randconfig-a006-20200612
i386 randconfig-a002-20200612
i386 randconfig-a001-20200612
i386 randconfig-a004-20200612
i386 randconfig-a005-20200612
i386 randconfig-a003-20200612
i386 randconfig-a015-20200612
i386 randconfig-a011-20200612
i386 randconfig-a014-20200612
i386 randconfig-a016-20200612
i386 randconfig-a013-20200612
i386 randconfig-a012-20200612
riscv allyesconfig
riscv allnoconfig
riscv defconfig
riscv allmodconfig
s390 allyesconfig
s390 allnoconfig
s390 allmodconfig
s390 defconfig
sparc defconfig
sparc64 defconfig
sparc64 allnoconfig
sparc64 allyesconfig
sparc64 allmodconfig
um allmodconfig
um allnoconfig
um defconfig
um allyesconfig
x86_64 rhel-7.6
x86_64 rhel-7.6-kselftests
x86_64 rhel-8.3
x86_64 kexec
x86_64 rhel
x86_64 rhel-7.2-clear
x86_64 lkp
x86_64 fedora-25
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
* [powerpc:fixes] BUILD SUCCESS e881bfaf5a5f409390973e076333281465f2b0d9
From: kernel test robot @ 2020-06-13 6:49 UTC (permalink / raw)
To: Michael Ellerman; +Cc: linuxppc-dev
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git fixes
branch HEAD: e881bfaf5a5f409390973e076333281465f2b0d9 KVM: PPC: Fix nested guest RC bits update
elapsed time: 512m
configs tested: 108
configs skipped: 111
The following configs have been built successfully.
More configs may be tested in the coming days.
arm64 allyesconfig
arm64 defconfig
arm64 allmodconfig
arm64 allnoconfig
arm defconfig
arm allyesconfig
arm allmodconfig
arm allnoconfig
s390 zfcpdump_defconfig
arc vdk_hs38_defconfig
arm moxart_defconfig
arc tb10x_defconfig
openrisc allyesconfig
arm efm32_defconfig
powerpc pq2fads_defconfig
arm tango4_defconfig
c6x evmc6472_defconfig
arm ixp4xx_defconfig
i386 allnoconfig
i386 allyesconfig
i386 defconfig
i386 debian-10.3
ia64 allmodconfig
ia64 defconfig
ia64 allnoconfig
ia64 allyesconfig
m68k allmodconfig
m68k allnoconfig
m68k sun3_defconfig
m68k defconfig
m68k allyesconfig
nios2 defconfig
nios2 allyesconfig
openrisc defconfig
c6x allyesconfig
c6x allnoconfig
nds32 defconfig
nds32 allnoconfig
csky allyesconfig
csky defconfig
alpha defconfig
alpha allyesconfig
xtensa allyesconfig
h8300 allyesconfig
h8300 allmodconfig
xtensa defconfig
mips allyesconfig
mips allnoconfig
mips allmodconfig
parisc allnoconfig
parisc defconfig
parisc allyesconfig
parisc allmodconfig
powerpc allyesconfig
powerpc rhel-kconfig
powerpc allmodconfig
powerpc allnoconfig
powerpc defconfig
i386 randconfig-a006-20200612
i386 randconfig-a002-20200612
i386 randconfig-a001-20200612
i386 randconfig-a004-20200612
i386 randconfig-a005-20200612
i386 randconfig-a003-20200612
x86_64 randconfig-a001-20200612
x86_64 randconfig-a003-20200612
x86_64 randconfig-a002-20200612
x86_64 randconfig-a006-20200612
x86_64 randconfig-a005-20200612
x86_64 randconfig-a004-20200612
x86_64 randconfig-a015-20200613
x86_64 randconfig-a011-20200613
x86_64 randconfig-a016-20200613
x86_64 randconfig-a014-20200613
x86_64 randconfig-a012-20200613
x86_64 randconfig-a013-20200613
i386 randconfig-a015-20200612
i386 randconfig-a011-20200612
i386 randconfig-a014-20200612
i386 randconfig-a016-20200612
i386 randconfig-a013-20200612
i386 randconfig-a012-20200612
riscv allyesconfig
riscv allnoconfig
riscv defconfig
riscv allmodconfig
s390 allyesconfig
s390 allnoconfig
s390 allmodconfig
s390 defconfig
sparc allyesconfig
sparc defconfig
sparc64 defconfig
sparc64 allnoconfig
sparc64 allyesconfig
sparc64 allmodconfig
um allmodconfig
um allnoconfig
um allyesconfig
um defconfig
x86_64 rhel-7.6
x86_64 rhel-7.6-kselftests
x86_64 rhel-8.3
x86_64 kexec
x86_64 rhel
x86_64 rhel-7.2-clear
x86_64 lkp
x86_64 fedora-25
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
* [powerpc:fixes-test] BUILD SUCCESS 54457f89d18ccbfa28805ca9457f0a95c65820fb
From: kernel test robot @ 2020-06-13 6:49 UTC (permalink / raw)
To: Michael Ellerman; +Cc: linuxppc-dev
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git fixes-test
branch HEAD: 54457f89d18ccbfa28805ca9457f0a95c65820fb powerpc/mm: Fix typo in IS_ENABLED()
elapsed time: 512m
configs tested: 108
configs skipped: 111
The following configs have been built successfully.
More configs may be tested in the coming days.
arm defconfig
arm allyesconfig
arm allmodconfig
arm allnoconfig
arm64 allyesconfig
arm64 defconfig
arm64 allmodconfig
arm64 allnoconfig
s390 zfcpdump_defconfig
arc vdk_hs38_defconfig
arm moxart_defconfig
arc tb10x_defconfig
openrisc allyesconfig
arm efm32_defconfig
powerpc pq2fads_defconfig
arm tango4_defconfig
c6x evmc6472_defconfig
arm ixp4xx_defconfig
i386 allnoconfig
i386 allyesconfig
i386 defconfig
i386 debian-10.3
ia64 allmodconfig
ia64 defconfig
ia64 allnoconfig
ia64 allyesconfig
m68k allmodconfig
m68k allnoconfig
m68k sun3_defconfig
m68k defconfig
m68k allyesconfig
nios2 defconfig
nios2 allyesconfig
openrisc defconfig
c6x allyesconfig
c6x allnoconfig
nds32 defconfig
nds32 allnoconfig
csky allyesconfig
csky defconfig
alpha defconfig
alpha allyesconfig
xtensa allyesconfig
h8300 allyesconfig
h8300 allmodconfig
xtensa defconfig
mips allyesconfig
mips allnoconfig
mips allmodconfig
parisc allnoconfig
parisc defconfig
parisc allyesconfig
parisc allmodconfig
powerpc allyesconfig
powerpc rhel-kconfig
powerpc allmodconfig
powerpc allnoconfig
powerpc defconfig
i386 randconfig-a006-20200612
i386 randconfig-a002-20200612
i386 randconfig-a001-20200612
i386 randconfig-a004-20200612
i386 randconfig-a005-20200612
i386 randconfig-a003-20200612
x86_64 randconfig-a001-20200612
x86_64 randconfig-a003-20200612
x86_64 randconfig-a002-20200612
x86_64 randconfig-a006-20200612
x86_64 randconfig-a005-20200612
x86_64 randconfig-a004-20200612
x86_64 randconfig-a015-20200613
x86_64 randconfig-a011-20200613
x86_64 randconfig-a016-20200613
x86_64 randconfig-a014-20200613
x86_64 randconfig-a012-20200613
x86_64 randconfig-a013-20200613
i386 randconfig-a015-20200612
i386 randconfig-a011-20200612
i386 randconfig-a014-20200612
i386 randconfig-a016-20200612
i386 randconfig-a013-20200612
i386 randconfig-a012-20200612
riscv allyesconfig
riscv allnoconfig
riscv defconfig
riscv allmodconfig
s390 allyesconfig
s390 allnoconfig
s390 allmodconfig
s390 defconfig
sparc allyesconfig
sparc defconfig
sparc64 defconfig
sparc64 allnoconfig
sparc64 allyesconfig
sparc64 allmodconfig
um allmodconfig
um allnoconfig
um defconfig
um allyesconfig
x86_64 rhel-7.6
x86_64 rhel-7.6-kselftests
x86_64 rhel-8.3
x86_64 kexec
x86_64 rhel
x86_64 rhel-7.2-clear
x86_64 lkp
x86_64 fedora-25
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
* Re: [PATCH v4 1/2] powerpc/uaccess: Implement unsafe_put_user() using 'asm goto'
From: Michael Ellerman @ 2020-06-13 10:47 UTC (permalink / raw)
To: Nick Desaulniers, Segher Boessenkool
Cc: Christophe Leroy, Michael Ellerman, LKML, Nicholas Piggin,
clang-built-linux, Paul Mackerras, linuxppc-dev
In-Reply-To: <CAKwvOdkKywb1KZ-SDwwuvQEmbsaAzJj9mEPqVG=qw1F5Ogv8rw@mail.gmail.com>
Nick Desaulniers <ndesaulniers@google.com> writes:
> On Thu, Jun 11, 2020 at 4:53 PM Segher Boessenkool
> <segher@kernel.crashing.org> wrote:
>>
>> On Thu, Jun 11, 2020 at 03:43:55PM -0700, Nick Desaulniers wrote:
>> > Segher, Christophe, I suspect Clang is missing support for the %L and %U
>> > output templates [1].
...
>
> IIUC the bug report correctly, it looks like LLVM is failing for the
> __put_user_asm2_goto case for -m32. A simple reproducer:
> https://godbolt.org/z/jBBF9b
If you add `-mregnames` you get register names:
https://godbolt.org/z/MxLjhF
foo:
stw %r3, 0(%r5)
stw %r4, 4(%r5)
blr
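The source behind the godbolt links is not quoted in this thread. As a rough illustration only, a plain C function of the following shape (an assumed stand-in, not the actual reproducer, which exercises the asm goto path) would be expected to produce the same two-store pattern on 32-bit big-endian PowerPC, where the 64-bit argument arrives in the r3:r4 pair and the pointer in r5:

```c
#include <assert.h>   /* only needed for the check in the usage note below */
#include <stdint.h>

/*
 * Hypothetical stand-in for the reproducer: store a 64-bit value through
 * a pointer. On a 32-bit target the compiler has to split this into two
 * 32-bit stores, high word at offset 0 and low word at offset 4 on a
 * big-endian machine.
 */
void foo(uint64_t val, uint64_t *ptr)
{
	*ptr = val;
}
```

Calling foo(0x1122334455667788ULL, &out) and then checking assert(out == 0x1122334455667788ULL) passes on any target; only the instruction selection differs between architectures.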
cheers
* Re: [PATCH v2 04/12] docs: x86: Add documentation for SVA (Shared Virtual Addressing)
From: Lu Baolu @ 2020-06-13 12:17 UTC (permalink / raw)
To: Fenghua Yu, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, David Woodhouse, Frederic Barrat, Andrew Donnellan,
Felix Kuehling, Joerg Roedel, Dave Hansen, Tony Luck, Ashok Raj,
Jacob Jun Pan, Dave Jiang, Yu-cheng Yu, Sohil Mehta,
Ravi V Shankar
Cc: x86, linux-kernel, amd-gfx, iommu, linuxppc-dev, baolu.lu
In-Reply-To: <1592008893-9388-5-git-send-email-fenghua.yu@intel.com>
Hi Fenghua,
On 2020/6/13 8:41, Fenghua Yu wrote:
> From: Ashok Raj <ashok.raj@intel.com>
>
> ENQCMD and Data Streaming Accelerator (DSA) and all of their associated
> features are a complicated stack with lots of interconnected pieces.
> This documentation provides a big picture overview for all of the
> features.
>
> Signed-off-by: Ashok Raj <ashok.raj@intel.com>
> Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
> Reviewed-by: Tony Luck <tony.luck@intel.com>
> ---
> v2:
> - Fix the doc format and add the doc in toctree (Thomas)
> - Modify the doc for better description (Thomas, Tony, Dave)
>
> Documentation/x86/index.rst | 1 +
> Documentation/x86/sva.rst | 287 ++++++++++++++++++++++++++++++++++++
> 2 files changed, 288 insertions(+)
> create mode 100644 Documentation/x86/sva.rst
>
> diff --git a/Documentation/x86/index.rst b/Documentation/x86/index.rst
> index 265d9e9a093b..e5d5ff096685 100644
> --- a/Documentation/x86/index.rst
> +++ b/Documentation/x86/index.rst
> @@ -30,3 +30,4 @@ x86-specific Documentation
> usb-legacy-support
> i386/index
> x86_64/index
> + sva
> diff --git a/Documentation/x86/sva.rst b/Documentation/x86/sva.rst
> new file mode 100644
> index 000000000000..1e52208c7dda
> --- /dev/null
> +++ b/Documentation/x86/sva.rst
> @@ -0,0 +1,287 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +===========================================
> +Shared Virtual Addressing (SVA) with ENQCMD
> +===========================================
> +
> +Background
> +==========
> +
> +Shared Virtual Addressing (SVA) allows the processor and device to use the
> +same virtual addresses avoiding the need for software to translate virtual
> +addresses to physical addresses. SVA is what PCIe calls Shared Virtual
> +Memory (SVM)
> +
> +In addition to the convenience of using application virtual addresses
> +by the device, it also doesn't require pinning pages for DMA.
> +PCIe Address Translation Services (ATS) along with Page Request Interface
> +(PRI) allow devices to function much the same way as the CPU handling
> +application page-faults. For more information please refer to PCIe
> +specification Chapter 10: ATS Specification.
> +
> +Use of SVA requires IOMMU support in the platform. IOMMU also is required
> +to support PCIe features ATS and PRI. ATS allows devices to cache
> +translations for the virtual address. IOMMU driver uses the mmu_notifier()
> +support to keep the device tlb cache and the CPU cache in sync. PRI allows
> +the device to request paging the virtual address before using if they are
> +not paged in the CPU page tables.
> +
> +
> +Shared Hardware Workqueues
> +==========================
> +
> +Unlike Single Root I/O Virtualization (SRIOV), Scalable IOV (SIOV) permits
> +the use of Shared Work Queues (SWQ) by both applications and Virtual
> +Machines (VM's). This allows better hardware utilization vs. hard
> +partitioning resources that could result in under utilization. In order to
> +allow the hardware to distinguish the context for which work is being
> +executed in the hardware by SWQ interface, SIOV uses Process Address Space
> +ID (PASID), which is a 20bit number defined by the PCIe SIG.
> +
> +PASID value is encoded in all transactions from the device. This allows the
> +IOMMU to track I/O on a per-PASID granularity in addition to using the PCIe
> +Resource Identifier (RID) which is the Bus/Device/Function.
> +
> +
> +ENQCMD
> +======
> +
> +ENQCMD is a new instruction on Intel platforms that atomically submits a
> +work descriptor to a device. The descriptor includes the operation to be
> +performed, virtual addresses of all parameters, virtual address of a completion
> +record, and the PASID (process address space ID) of the current process.
> +
> +ENQCMD works with non-posted semantics and carries a status back if the
> +command was accepted by hardware. This allows the submitter to know if the
> +submission needs to be retried or other device specific mechanisms to
> +implement implement fairness or ensure forward progress can be made.
Repeated "implement".
> +
> +ENQCMD is the glue that ensures applications can directly submit commands
> +to the hardware and also permit hardware to be aware of application context
> +to perform I/O operations via use of PASID.
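The accept-or-retry behaviour described in the quoted ENQCMD section can be sketched in userspace C. This is a toy model under stated assumptions: struct model_swq, model_submit(), model_complete() and submit_with_retry() are hypothetical names, the queue is a simple counter, and no real ENQCMD instruction is issued.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a shared work queue with a fixed depth. */
struct model_swq {
	size_t depth;      /* descriptors the device can hold */
	size_t in_flight;  /* descriptors currently queued */
};

/* Stand-in for ENQCMD: true means accepted, false means "retry". */
static bool model_submit(struct model_swq *q)
{
	assert(q != NULL);
	if (q->in_flight >= q->depth)
		return false;          /* queue full: device reports retry */
	q->in_flight++;
	return true;
}

/* Stand-in for the device finishing one descriptor. */
static void model_complete(struct model_swq *q)
{
	if (q->in_flight > 0)
		q->in_flight--;
}

/*
 * Bounded retry loop. Returns the number of attempts used on success,
 * or 0 if the submission was never accepted.
 */
static unsigned submit_with_retry(struct model_swq *q, unsigned max_attempts)
{
	for (unsigned attempt = 1; attempt <= max_attempts; attempt++) {
		if (model_submit(q))
			return attempt;
		/* In this toy model, backing off lets the device drain one entry. */
		model_complete(q);
	}
	return 0;
}
```

The point of the model is that the submitter never tracks queue depth itself; it only reacts to the status returned by each submission attempt.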
> +
> +Process Address Space Tagging
> +=============================
> +
> +A new thread scoped MSR (IA32_PASID) provides the connection between
> +user processes and the rest of the hardware. When an application first
> +accesses an SVA capable device this MSR is initialized with a newly
> +allocated PASID. The driver for the device calls an IOMMU specific api
> +that sets up the routing for DMA and page-requests.
> +
> +For example, the Intel Data Streaming Accelerator (DSA) uses
> +intel_svm_bind_mm(), which will do the following.
The Intel SVM APIs have been deprecated. Drivers should use
iommu_sva_bind_device() instead. Please also update other places in
this document.
> +
> +- Allocate the PASID, and program the process page-table (cr3) in the PASID
> + context entries.
> +- Register for mmu_notifier() to track any page-table invalidations to keep
> + the device tlb in sync. For example, when a page-table entry is invalidated,
> + IOMMU propagates the invalidation to device tlb. This will force any
> + future access by the device to this virtual address to participate in
> + ATS. If the IOMMU responds with proper response that a page is not
> + present, the device would request the page to be paged in via the PCIe PRI
> + protocol before performing I/O.
> +
> +This MSR is managed with the XSAVE feature set as "supervisor state" to
> +ensure the MSR is updated during context switch.
> +
> +PASID Management
> +================
> +
> +The kernel must allocate a PASID on behalf of each process and program it
> +into the new MSR to communicate the process identity to platform hardware.
> +ENQCMD uses the PASID stored in this MSR to tag requests from this process.
> +When a user submits a work descriptor to a device using the ENQCMD
> +instruction, the PASID field in the descriptor is auto-filled with the
> +value from MSR_IA32_PASID. Requests for DMA from the device are also tagged
> +with the same PASID. The platform IOMMU uses the PASID in the transaction to
> +perform address translation. The IOMMU api's setup the corresponding PASID
> +entry in IOMMU with the process address used by the CPU (for e.g cr3 in x86).
> +
> +The MSR must be configured on each logical CPU before any application
> +thread can interact with a device. Threads that belong to the same
> +process share the same page tables, thus the same MSR value.
> +
> +PASID is cleared when a process is created. The PASID allocation and MSR
> +programming may occur long after a process and its threads have been created.
> +One thread must call bind() to allocate the PASID for the process. If a
> +thread uses ENQCMD without the MSR first being populated, it will cause #GP.
> +The kernel will fix up the #GP by writing the process-wide PASID into the
> +thread that took the #GP. A single process PASID can be used simultaneously
> +with multiple devices since they all share the same address space.
> +
> +New threads could inherit the MSR value from the parent. But this would
> +involve additional state management for those threads which may never use
> +ENQCMD. Clearing the MSR at thread creation permits all threads to have a
> +consistent behavior; the PASID is only programmed when the thread calls
> +the bind() api (intel_svm_bind_mm()), or when a thread calls ENQCMD for
> +the first time.
> +
> +PASID Lifecycle Management
> +==========================
> +
> +Only processes that access SVA capable devices need to have a PASID
> +allocated. This allocation happens when a process first opens an SVA
> +capable device (subsequent opens of the same, or other devices will
> +share the same PASID).
> +
> +Although the PASID is allocated to the process by opening a device,
> +it is not active in any of the threads of that process. Activation is
> +done lazily when a thread tries to submit a work descriptor to a device
> +using the ENQCMD.
> +
> +That first access will trigger a #GP fault because the IA32_PASID MSR
> +has not been initialized with the PASID value assigned to the process
> +when the device was opened. The Linux #GP handler notes that a PASID has
> +been allocated for the process, and so initializes the IA32_PASID MSR
> +and returns so that the ENQCMD instruction is re-executed.
> +
> +On fork(2) or exec(2) the PASID is removed from the process as it no
> +longer has the same address space that it had when the device was opened.
> +
> +On clone(2) the new task shares the same address space, so will be
> +able to use the PASID allocated to the process. The IA32_PASID is not
> +preemptively initialized as the kernel does not know whether this thread
> +is going to access the device.
> +
> +On exit(2) the PASID is freed. The device driver ensures that any pending
> +operations queued to the device are either completed or aborted before
> +allowing the PASID to be re-allocated.
s/re-allocated/reallocated/
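The lazy activation described in the quoted "PASID Lifecycle Management" section can be modelled in userspace C. This is a sketch, not kernel code: the model_* names are hypothetical, and the MSR layout used here (PASID in the low 20 bits, a valid flag in bit 31) is an assumption taken from the public ENQCMD documentation rather than from this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_PASID_MSR_VALID (1u << 31)  /* assumed position of the valid bit */
#define MODEL_PASID_MASK      0xFFFFFu    /* PASID is a 20-bit number */

struct model_mm {
	uint32_t pasid;               /* 0 means "no PASID allocated" */
};

struct model_thread {
	struct model_mm *mm;          /* shared by all threads of a process */
	uint32_t pasid_msr;           /* per-thread IA32_PASID stand-in */
};

/* Models the #GP fixup: true if the MSR was populated and ENQCMD may be retried. */
static bool model_fixup_pasid(struct model_thread *t)
{
	if (t->mm->pasid == 0)
		return false;         /* nothing bound: a genuine fault */
	if (t->pasid_msr & MODEL_PASID_MSR_VALID)
		return false;         /* MSR already set: fault has another cause */
	t->pasid_msr = (t->mm->pasid & MODEL_PASID_MASK) | MODEL_PASID_MSR_VALID;
	return true;
}

/* Models ENQCMD: returns the PASID that tags the request, or -1 on an unhandled fault. */
static int64_t model_enqcmd(struct model_thread *t)
{
	assert(t && t->mm);
	if (!(t->pasid_msr & MODEL_PASID_MSR_VALID)) {
		if (!model_fixup_pasid(t))
			return -1;
		/* fault fixed up; the instruction is re-executed */
	}
	return t->pasid_msr & MODEL_PASID_MASK;
}

/* Models clone(2): the new task shares the mm but starts with a cleared MSR. */
static struct model_thread model_clone(const struct model_thread *parent)
{
	struct model_thread child = { parent->mm, 0 };
	return child;
}
```

In this model a cloned thread pays one simulated fault on its first submission and none afterwards, which is the trade-off the quoted text describes for threads that may never use ENQCMD.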
> +
> +Relationships
> +=============
> +
> + * Each process has many threads, but only one PASID
> + * Devices have a limited number (~10's to 1000's) of hardware
> + workqueues and each portal maps down to a single workqueue.
> + The device driver manages allocating hardware workqueues.
> + * A single mmap() maps a single hardware workqueue as a "portal"
> + * For each device with which a process interacts, there must be
> + one or more mmap()'d portals.
> + * Many threads within a process can share a single portal to access
> + a single device.
> + * Multiple processes can separately mmap() the same portal, in
> + which case they still share one device hardware workqueue.
> + * The single process-wide PASID is used by all threads to interact
> + with all devices. There is not, for instance, a PASID for each
> + thread or each thread<->device pair.
> +
> +FAQ
> +===
> +
> +* What is SVA/SVM?
> +
> +Shared Virtual Addressing (SVA) permits I/O hardware and the processor to
> +work in the same address space. In short, sharing the address space. Some
> +call it Shared Virtual Memory (SVM), but Linux community wanted to avoid
> +it with Posix Shared Memory and Secure Virtual Machines which were terms
> +already in circulation.
> +
> +* What is a PASID?
> +
> +A Process Address Space ID (PASID) is a PCIe-defined TLP Prefix. A PASID is
> +a 20 bit number allocated and managed by the OS. PASID is included in all
> +transactions between the platform and the device.
> +
> +* How are shared work queues different?
> +
> +Traditionally to allow user space applications interact with hardware,
> +there is a separate instance required per process. For example, consider
> +doorbells as a mechanism of informing hardware about work to process. Each
> +doorbell is required to be spaced 4k (or page-size) apart for process
> +isolation. This requires hardware to provision that space and reserve in
> +MMIO. This doesn't scale as the number of threads becomes quite large. The
> +hardware also manages the queue depth for Shared Work Queues (SWQ), and
> +consumers don't need to track queue depth. If there is no space to accept
> +a command, the device will return an error indicating retry. Also
> +submitting a command to an MMIO address that can't accept ENQCMD will
> +return retry in response. In the new DMWr PCIe terminology, devices need to
> +support DMWr completer capability. In addition it requires all switch ports
> +to support DMWr routing and must be enabled by the PCIe subsystem, much
> +like how PCIe Atomics() are managed for instance.
> +
> +SWQ allows hardware to provision just a single address in the device. When
> +used with ENQCMD to submit work, the device can distinguish the process
> +submitting the work since it will include the PASID assigned to that
> +process. This decreases the pressure of hardware requiring to support
> +hardware to scale to a large number of processes.
> +
> +* Is this the same as a user space device driver?
> +
> +Communicating with the device via the shared work queue is much simpler
> +than a full blown user space driver. The kernel driver does all the
> +initialization of the hardware. User space only needs to worry about
> +submitting work and processing completions.
> +
> +* Is this the same as SR-IOV?
> +
> +Single Root I/O Virtualization (SR-IOV) focuses on providing independent
> +hardware interfaces for virtualizing hardware. Hence its required to be
> +almost fully functional interface to software supporting the traditional
> +BAR's, space for interrupts via MSI-x, its own register layout.
> +Virtual Functions (VFs) are assisted by the Physical Function (PF)
> +driver.
> +
> +Scalable I/O Virtualization builds on the PASID concept to create device
> +instances for virtualization. SIOV requires host software to assist in
> +creating virtual devices, each virtual device is represented by a PASID
> +along with the BDF of the device. This allows device hardware to optimize
> +device resource creation and can grow dynamically on demand. SR-IOV creation
> +and management is very static in nature. Consult references below for more
> +details.
> +
> +* Why not just create a virtual function for each app?
> +
> +Creating PCIe SRIOV type virtual functions (VF) are expensive. They create
> +duplicated hardware for PCI config space requirements, Interrupts such as
> +MSIx for instance. Resources such as interrupts have to be hard partitioned
> +between VF's at creation time, and cannot scale dynamically on demand. The
> +VF's are not completely independent from the Physical function (PF). Most
> +VF's require some communication and assistance from the PF driver. SIOV
> +creates a software defined device. Where all the configuration and control
> +aspects are mediated via the slow path. The work submission and completion
> +happen without any mediation.
> +
> +* Does this support virtualization?
> +
> +ENQCMD can be used from within a guest VM. In these cases the VMM helps
> +with setting up a translation table to translate from Guest PASID to Host
> +PASID. Please consult the ENQCMD instruction set reference for more
> +details.
> +
> +* Does memory need to be pinned?
> +
> +When devices support SVA, along with platform hardware such as IOMMU
> +supporting such devices, there is no need to pin memory for DMA purposes.
> +Devices that support SVA also support other PCIe features that remove the
> +pinning requirement for memory.
> +
> +Device TLB support - Device requests the IOMMU to lookup an address before
> +use via Address Translation Service (ATS) requests. If the mapping exists
> +but there is no page allocated by the OS, IOMMU hardware returns that no
> +mapping exists.
> +
> +Device requests that virtual address to be mapped via Page Request
> +Interface (PRI). Once the OS has successfully completed the mapping, it
> +returns the response back to the device. The device continues again to
> +request for a translation and continues.
> +
> +IOMMU works with the OS in managing consistency of page-tables with the
> +device. When removing pages, it interacts with the device to remove any
> +device-tlb that might have been cached before removing the mappings from
> +the OS.
> +
> +References
> +==========
> +
> +VT-D:
> +https://01.org/blogs/ashokraj/2018/recent-enhancements-intel-virtualization-technology-directed-i/o-intel-vt-d
> +
> +SIOV:
> +https://01.org/blogs/2019/assignable-interfaces-intel-scalable-i/o-virtualization-linux
> +
> +ENQCMD in ISE:
> +https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf
> +
> +DSA spec:
> +https://software.intel.com/sites/default/files/341204-intel-data-streaming-accelerator-spec.pdf
>
Best regards,
baolu
* Re: [PATCH v2 11/12] x86/mmu: Allocate/free PASID
From: Lu Baolu @ 2020-06-13 13:07 UTC (permalink / raw)
To: Fenghua Yu, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
H Peter Anvin, David Woodhouse, Frederic Barrat, Andrew Donnellan,
Felix Kuehling, Joerg Roedel, Dave Hansen, Tony Luck, Ashok Raj,
Jacob Jun Pan, Dave Jiang, Yu-cheng Yu, Sohil Mehta,
Ravi V Shankar
Cc: x86, linux-kernel, amd-gfx, iommu, linuxppc-dev, baolu.lu
In-Reply-To: <1592008893-9388-12-git-send-email-fenghua.yu@intel.com>
Hi Fenghua,
On 2020/6/13 8:41, Fenghua Yu wrote:
> A PASID is allocated for an "mm" the first time any thread attaches
> to an SVM capable device. Later device attachments (whether to the same
> device or another SVM device) will re-use the same PASID.
>
> The PASID is freed when the process exits (so no need to keep
> reference counts on how many SVM devices are sharing the PASID).
FYI.
Jean-Philippe Brucker has a patch that manages mm->pasid in a
vendor-agnostic manner:
https://www.spinics.net/lists/iommu/msg44459.html
Best regards,
baolu
>
> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
> Reviewed-by: Tony Luck <tony.luck@intel.com>
> ---
> v2:
> - Define a helper free_bind() to simplify error exit code in bind_mm()
> (Thomas)
> - Fix a ret error code in bind_mm() (Thomas)
> - Change pasid's type from "int" to "unsigned int" to have consistent
> pasid type in iommu (Thomas)
> - Simplify alloc_pasid() a bit.
>
> arch/x86/include/asm/iommu.h | 2 +
> arch/x86/include/asm/mmu_context.h | 14 ++++
> drivers/iommu/intel/svm.c | 101 +++++++++++++++++++++++++----
> 3 files changed, 105 insertions(+), 12 deletions(-)
>
> diff --git a/arch/x86/include/asm/iommu.h b/arch/x86/include/asm/iommu.h
> index bf1ed2ddc74b..ed41259fe7ac 100644
> --- a/arch/x86/include/asm/iommu.h
> +++ b/arch/x86/include/asm/iommu.h
> @@ -26,4 +26,6 @@ arch_rmrr_sanity_check(struct acpi_dmar_reserved_memory *rmrr)
> return -EINVAL;
> }
>
> +void __free_pasid(struct mm_struct *mm);
> +
> #endif /* _ASM_X86_IOMMU_H */
> diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h
> index 47562147e70b..f8c91ce8c451 100644
> --- a/arch/x86/include/asm/mmu_context.h
> +++ b/arch/x86/include/asm/mmu_context.h
> @@ -13,6 +13,7 @@
> #include <asm/tlbflush.h>
> #include <asm/paravirt.h>
> #include <asm/debugreg.h>
> +#include <asm/iommu.h>
>
> extern atomic64_t last_mm_ctx_id;
>
> @@ -117,9 +118,22 @@ static inline int init_new_context(struct task_struct *tsk,
> init_new_context_ldt(mm);
> return 0;
> }
> +
> +static inline void free_pasid(struct mm_struct *mm)
> +{
> + if (!IS_ENABLED(CONFIG_INTEL_IOMMU_SVM))
> + return;
> +
> + if (!cpu_feature_enabled(X86_FEATURE_ENQCMD))
> + return;
> +
> + __free_pasid(mm);
> +}
> +
> static inline void destroy_context(struct mm_struct *mm)
> {
> destroy_context_ldt(mm);
> + free_pasid(mm);
> }
>
> extern void switch_mm(struct mm_struct *prev, struct mm_struct *next,
> diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
> index 4e775e12ae52..27dc866b8461 100644
> --- a/drivers/iommu/intel/svm.c
> +++ b/drivers/iommu/intel/svm.c
> @@ -425,6 +425,53 @@ int intel_svm_unbind_gpasid(struct device *dev, unsigned int pasid)
> return ret;
> }
>
> +static void free_bind(struct intel_svm *svm, struct intel_svm_dev *sdev,
> + bool new_pasid)
> +{
> + if (new_pasid)
> + ioasid_free(svm->pasid);
> + kfree(svm);
> + kfree(sdev);
> +}
> +
> +/*
> + * If this mm already has a PASID, use it. Otherwise allocate a new one.
> + * Let the caller know if a new PASID is allocated via 'new_pasid'.
> + */
> +static int alloc_pasid(struct intel_svm *svm, struct mm_struct *mm,
> + unsigned int pasid_max, bool *new_pasid,
> + unsigned int flags)
> +{
> + unsigned int pasid;
> +
> + *new_pasid = false;
> +
> + /*
> + * Reuse the PASID if the mm already has a PASID and not a private
> + * PASID is requested.
> + */
> + if (mm && mm->pasid && !(flags & SVM_FLAG_PRIVATE_PASID)) {
> + /*
> + * Once a PASID is allocated for this mm, the PASID
> + * stays with the mm until the mm is dropped. Reuse
> + * the PASID which has been already allocated for the
> + * mm instead of allocating a new one.
> + */
> + ioasid_set_data(mm->pasid, svm);
> +
> + return mm->pasid;
> + }
> +
> + /* Allocate a new pasid. Do not use PASID 0, reserved for init PASID. */
> + pasid = ioasid_alloc(NULL, PASID_MIN, pasid_max - 1, svm);
> + if (pasid != INVALID_IOASID) {
> + /* A new pasid is allocated. */
> + *new_pasid = true;
> + }
> +
> + return pasid;
> +}
> +
> /* Caller must hold pasid_mutex, mm reference */
> static int
> intel_svm_bind_mm(struct device *dev, unsigned int flags,
> @@ -518,6 +565,8 @@ intel_svm_bind_mm(struct device *dev, unsigned int flags,
> init_rcu_head(&sdev->rcu);
>
> if (!svm) {
> + bool new_pasid;
> +
> svm = kzalloc(sizeof(*svm), GFP_KERNEL);
> if (!svm) {
> ret = -ENOMEM;
> @@ -529,12 +578,9 @@ intel_svm_bind_mm(struct device *dev, unsigned int flags,
> if (pasid_max > intel_pasid_max_id)
> pasid_max = intel_pasid_max_id;
>
> - /* Do not use PASID 0, reserved for RID to PASID */
> - svm->pasid = ioasid_alloc(NULL, PASID_MIN,
> - pasid_max - 1, svm);
> + svm->pasid = alloc_pasid(svm, mm, pasid_max, &new_pasid, flags);
> if (svm->pasid == INVALID_IOASID) {
> - kfree(svm);
> - kfree(sdev);
> + free_bind(svm, sdev, new_pasid);
> ret = -ENOSPC;
> goto out;
> }
> @@ -547,9 +593,7 @@ intel_svm_bind_mm(struct device *dev, unsigned int flags,
> if (mm) {
> ret = mmu_notifier_register(&svm->notifier, mm);
> if (ret) {
> - ioasid_free(svm->pasid);
> - kfree(svm);
> - kfree(sdev);
> + free_bind(svm, sdev, new_pasid);
> goto out;
> }
> }
> @@ -565,12 +609,18 @@ intel_svm_bind_mm(struct device *dev, unsigned int flags,
> if (ret) {
> if (mm)
> mmu_notifier_unregister(&svm->notifier, mm);
> - ioasid_free(svm->pasid);
> - kfree(svm);
> - kfree(sdev);
> + free_bind(svm, sdev, new_pasid);
> goto out;
> }
>
> + if (mm && new_pasid && !(flags & SVM_FLAG_PRIVATE_PASID)) {
> + /*
> + * Track the new pasid in the mm. The pasid will be
> + * freed at process exit. Don't track requested
> + * private PASID in the mm.
> + */
> + mm->pasid = svm->pasid;
> + }
> list_add_tail(&svm->list, &global_svm_list);
> } else {
> /*
> @@ -640,7 +690,8 @@ static int intel_svm_unbind_mm(struct device *dev, unsigned int pasid)
> kfree_rcu(sdev, rcu);
>
> if (list_empty(&svm->devs)) {
> - ioasid_free(svm->pasid);
> + /* Clear data in the pasid. */
> + ioasid_set_data(pasid, NULL);
> if (svm->mm)
> mmu_notifier_unregister(&svm->notifier, svm->mm);
> list_del(&svm->list);
> @@ -1001,3 +1052,29 @@ unsigned int intel_svm_get_pasid(struct iommu_sva *sva)
>
> return pasid;
> }
> +
> +/*
> + * An invalid pasid is either 0 (init PASID value) or bigger than max PASID
> + * (PASID_MAX - 1).
> + */
> +static bool invalid_pasid(unsigned int pasid)
> +{
> + return (pasid == INIT_PASID) || (pasid >= PASID_MAX);
> +}
> +
> +/* On process exit free the PASID (if one was allocated). */
> +void __free_pasid(struct mm_struct *mm)
> +{
> + unsigned int pasid = mm->pasid;
> +
> + /* No need to free invalid pasid. */
> + if (invalid_pasid(pasid))
> + return;
> +
> + /*
> + * Since the pasid is not bound to any svm by now, there is no race
> + * here with binding/unbinding and no need to protect the free
> + * operation by pasid_mutex.
> + */
> + ioasid_free(pasid);
> +}
>
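The reuse policy in the quoted alloc_pasid() can be sketched in user space as follows. This is only an illustration under stated assumptions: struct toy_mm, toy_ioasid_alloc(), toy_alloc_pasid() and INVALID_ID are hypothetical stand-ins, not kernel APIs, and the toy allocator just hands out increasing IDs.

```c
#include <stdbool.h>

#define INIT_PASID 0u      /* 0 means "no PASID assigned yet" */
#define PASID_MIN  1u      /* first ID the toy allocator may return */
#define INVALID_ID (~0u)   /* hypothetical stand-in for INVALID_IOASID */

/* Toy stand-in for struct mm_struct: only the pasid field matters here. */
struct toy_mm {
    unsigned int pasid;
};

/* Toy allocator (assumption): returns increasing IDs in [PASID_MIN, max). */
static unsigned int toy_next_id = PASID_MIN;

static unsigned int toy_ioasid_alloc(unsigned int max)
{
    if (toy_next_id >= max)
        return INVALID_ID;
    return toy_next_id++;
}

/*
 * Reuse mm->pasid when the mm already has one and no private PASID is
 * requested; otherwise allocate a new ID. *new_pasid tells the caller
 * whether a fresh ID was allocated, so that only a fresh ID is freed on
 * an error path.
 */
static unsigned int toy_alloc_pasid(struct toy_mm *mm, unsigned int max,
                                    bool want_private, bool *new_pasid)
{
    unsigned int pasid;

    *new_pasid = false;

    if (mm && mm->pasid != INIT_PASID && !want_private)
        return mm->pasid;

    pasid = toy_ioasid_alloc(max);
    if (pasid != INVALID_ID)
        *new_pasid = true;

    return pasid;
}
```

In this sketch the caller, not the helper, stores a newly allocated ID in mm->pasid, mirroring the quoted patch, where the assignment happens only after the rest of the bind succeeds.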
^ permalink raw reply
* [GIT PULL] Please pull powerpc/linux.git powerpc-5.8-2 tag
From: Michael Ellerman @ 2020-06-13 13:53 UTC (permalink / raw)
To: Linus Torvalds; +Cc: aik, linuxppc-dev, linux-kernel
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256
Hi Linus,
Please pull a powerpc fix for 5.8:
The following changes since commit 7ae77150d94d3b535c7b85e6b3647113095e79bf:
Merge tag 'powerpc-5.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux (2020-06-05 12:39:30 -0700)
are available in the git repository at:
https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git tags/powerpc-5.8-2
for you to fetch changes up to e881bfaf5a5f409390973e076333281465f2b0d9:
KVM: PPC: Fix nested guest RC bits update (2020-06-12 16:19:53 +1000)
- ------------------------------------------------------------------
powerpc fixes for 5.8 #2
One fix for a recent change which broke nested KVM guests on Power9.
Thanks to:
Alexey Kardashevskiy.
- ------------------------------------------------------------------
Alexey Kardashevskiy (1):
KVM: PPC: Fix nested guest RC bits update
arch/powerpc/kvm/book3s_hv_nested.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
-----BEGIN PGP SIGNATURE-----
iQIzBAEBCAAdFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAl7k2b0ACgkQUevqPMjh
pYDmPQ/9Ffq4hgIdiJdzusd9tYEynumET8/cfRLYCmUVEkYQdpgpLgp/XnNFq/fg
CqqDN173ioTg5xN1QEZkcPKFwOqmlG2oYJI4s93nMAINmuCE7h4bsrGLOZNPIx1G
FD2v0piGkxmxRud1Qt7+cpIfbbw3wKnFqOQ1yzRop/weufp42PSUD00cY6DUa9Ip
LKWxBcje6yl56U+z31iWvDjNxP2cwZUz79ioKQG7YDigQh+aSVFaZ1NboA5fjde0
CSm0DrHfhPjlZTw2y3IvTbETCi7wU1dIrElf6e8RMsIOCg1UeUeiOsHP80fHnMFA
NP1JlsIEPjIxUb9cJ7Uc03wwUMioQcx7ZwISTWP6aQZx20nEdcFErqleCLr8e/KC
beWRz7TMfCG3v6GQv5yJKx+/AB8XWYcBe+X/8+7AAZS51DFHU4xvxIy9B43a3cCe
UozRzo/OTVvRvRPsM4TaIIZNPBV/WWm/+CwEZgGEOBqXK+ZQpwkDh5kP5P/nWv0g
HoK82XTGdMDokKuH+oStuo9kpMbj7ktJhOVFply8axXdQ9Jn2y/4t1mH4jsodza5
OWqpDDnRXzOlTasWBPIcdgmriYeJfEQ5rDmxRXfoqBTEoSVt2TC3e1ooN0IkCVcy
pPROKGvRHdfqMvXqwoDpLL3wF43u439bl8ROBC89nsxdpcBixoM=
=VRzv
-----END PGP SIGNATURE-----
^ permalink raw reply
* Re: [PATCH] powerpc/fsl_booke/32: fix build with CONFIG_RANDOMIZE_BASE
From: Christophe Leroy @ 2020-06-13 17:28 UTC (permalink / raw)
To: Arseny Solokha, Michael Ellerman, Jason Yan, linuxppc-dev
Cc: Scott Wood, Christophe Leroy, linux-kernel, stable
In-Reply-To: <20200613162801.1946619-1-asolokha@kb.kras.ru>
Le 13/06/2020 à 18:28, Arseny Solokha a écrit :
> Building the current 5.8 kernel for an e500 machine with
> CONFIG_RANDOMIZE_BASE set yields the following failure:
>
> arch/powerpc/mm/nohash/kaslr_booke.c: In function 'kaslr_early_init':
> arch/powerpc/mm/nohash/kaslr_booke.c:387:2: error: implicit declaration
> of function 'flush_icache_range'; did you mean 'flush_tlb_range'?
> [-Werror=implicit-function-declaration]
>
> Indeed, including asm/cacheflush.h into kaslr_booke.c fixes the build.
>
> The issue dates back to the introduction of that file and probably went
> unnoticed because there's no in-tree defconfig with CONFIG_RANDOMIZE_BASE
> set.
I don't get this problem with mpc85xx_defconfig + RELOCATABLE +
RANDOMIZE_BASE.
Christophe
>
> Fixes: 2b0e86cc5de6 ("powerpc/fsl_booke/32: implement KASLR infrastructure")
> Cc: stable@vger.kernel.org
> Signed-off-by: Arseny Solokha <asolokha@kb.kras.ru>
> ---
> arch/powerpc/mm/nohash/kaslr_booke.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/arch/powerpc/mm/nohash/kaslr_booke.c b/arch/powerpc/mm/nohash/kaslr_booke.c
> index 4a75f2d9bf0e..bce0e5349978 100644
> --- a/arch/powerpc/mm/nohash/kaslr_booke.c
> +++ b/arch/powerpc/mm/nohash/kaslr_booke.c
> @@ -14,6 +14,7 @@
> #include <linux/memblock.h>
> #include <linux/libfdt.h>
> #include <linux/crash_core.h>
> +#include <asm/cacheflush.h>
> #include <asm/pgalloc.h>
> #include <asm/prom.h>
> #include <asm/kdump.h>
>
^ permalink raw reply
* Re: [GIT PULL] Please pull powerpc/linux.git powerpc-5.8-2 tag
From: pr-tracker-bot @ 2020-06-13 18:00 UTC (permalink / raw)
To: Michael Ellerman; +Cc: aik, linuxppc-dev, Linus Torvalds, linux-kernel
In-Reply-To: <87y2ordqcm.fsf@mpe.ellerman.id.au>
The pull request you sent on Sat, 13 Jun 2020 23:53:29 +1000:
> https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git tags/powerpc-5.8-2
has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/08bf1a27c4c354b853fd81a79e953525bbcc8506
Thank you!
--
Deet-doot-dot, I am a bot.
https://korg.wiki.kernel.org/userdoc/prtracker
^ permalink raw reply
* [PATCH] powerpc/fsl_booke/32: fix build with CONFIG_RANDOMIZE_BASE
From: Arseny Solokha @ 2020-06-13 16:28 UTC (permalink / raw)
To: Michael Ellerman, Jason Yan, linuxppc-dev
Cc: Scott Wood, Christophe Leroy, Arseny Solokha, linux-kernel,
stable
Building the current 5.8 kernel for an e500 machine with
CONFIG_RANDOMIZE_BASE set yields the following failure:
arch/powerpc/mm/nohash/kaslr_booke.c: In function 'kaslr_early_init':
arch/powerpc/mm/nohash/kaslr_booke.c:387:2: error: implicit declaration
of function 'flush_icache_range'; did you mean 'flush_tlb_range'?
[-Werror=implicit-function-declaration]
Indeed, including asm/cacheflush.h into kaslr_booke.c fixes the build.
The issue dates back to the introduction of that file and probably went
unnoticed because there's no in-tree defconfig with CONFIG_RANDOMIZE_BASE
set.
Fixes: 2b0e86cc5de6 ("powerpc/fsl_booke/32: implement KASLR infrastructure")
Cc: stable@vger.kernel.org
Signed-off-by: Arseny Solokha <asolokha@kb.kras.ru>
---
arch/powerpc/mm/nohash/kaslr_booke.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/powerpc/mm/nohash/kaslr_booke.c b/arch/powerpc/mm/nohash/kaslr_booke.c
index 4a75f2d9bf0e..bce0e5349978 100644
--- a/arch/powerpc/mm/nohash/kaslr_booke.c
+++ b/arch/powerpc/mm/nohash/kaslr_booke.c
@@ -14,6 +14,7 @@
#include <linux/memblock.h>
#include <linux/libfdt.h>
#include <linux/crash_core.h>
+#include <asm/cacheflush.h>
#include <asm/pgalloc.h>
#include <asm/prom.h>
#include <asm/kdump.h>
--
2.27.0
^ permalink raw reply related
* Re: [PATCH] powerpc/fsl_booke/32: fix build with CONFIG_RANDOMIZE_BASE
From: Arseny Solokha @ 2020-06-13 21:20 UTC (permalink / raw)
To: Christophe Leroy
Cc: Christophe Leroy, Jason Yan, linux-kernel, stable, Scott Wood,
Arseny Solokha, linuxppc-dev
In-Reply-To: <754d31be-730b-8f18-4ead-ba2f303650d0@csgroup.eu>
> Le 13/06/2020 à 18:28, Arseny Solokha a écrit :
>> Building the current 5.8 kernel for an e500 machine with
>> CONFIG_RANDOMIZE_BASE set yields the following failure:
>>
>> arch/powerpc/mm/nohash/kaslr_booke.c: In function 'kaslr_early_init':
>> arch/powerpc/mm/nohash/kaslr_booke.c:387:2: error: implicit declaration
>> of function 'flush_icache_range'; did you mean 'flush_tlb_range'?
>> [-Werror=implicit-function-declaration]
>>
>> Indeed, including asm/cacheflush.h into kaslr_booke.c fixes the build.
>>
>> The issue dates back to the introduction of that file and probably went
>> unnoticed because there's no in-tree defconfig with CONFIG_RANDOMIZE_BASE
>> set.
>
> I don't get this problem with mpc85xx_defconfig + RELOCATABLE +
> RANDOMIZE_BASE.
Ah, OK. So the critical difference between mpc85xx_defconfig and our custom
config is that the former sets CONFIG_BLOCK while ours doesn't. Then we have the
following dependence chain:
arch/powerpc/mm/nohash/kaslr_booke.c
include/linux/swap.h
include/linux/memcontrol.h
include/linux/writeback.h
include/linux/blk-cgroup.h
include/linux/blkdev.h
#ifdef CONFIG_BLOCK
#include <linux/pagemap.h>
#endif
include/linux/pagemap.h
include/linux/highmem.h
arch/powerpc/include/asm/cacheflush.h
and that is why asm/cacheflush.h never gets included in
arch/powerpc/mm/nohash/kaslr_booke.c with our config: CONFIG_BLOCK is not
defined, so the chain breaks at include/linux/blkdev.h.
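A minimal single-file sketch of that effect, assuming hypothetical names throughout (TOY_CONFIG_BLOCK, TOY_HAVE_FLUSH_ICACHE_RANGE and the toy_* functions are illustrative, not kernel symbols). The guarded region stands in for the part of the header chain that is only reached when the config option is set:

```c
#include <stdbool.h>

/*
 * Stand-in for include/linux/blkdev.h: the nested "header" is only pulled
 * in when TOY_CONFIG_BLOCK is defined (build with -DTOY_CONFIG_BLOCK to
 * model a config that sets the option).
 */
#ifdef TOY_CONFIG_BLOCK
/* Stand-in for asm/cacheflush.h, reached through the guarded include. */
#define TOY_HAVE_FLUSH_ICACHE_RANGE 1
static inline void toy_flush_icache_range(unsigned long start,
                                          unsigned long end)
{
    (void)start;
    (void)end;
}
#endif

/*
 * Stand-in for the .c file that relies on the transitive include: it
 * reports whether the declaration happened to be visible.
 */
static inline bool toy_flush_declared(void)
{
#ifdef TOY_HAVE_FLUSH_ICACHE_RANGE
    return true;
#else
    return false;
#endif
}
```

Without the define, toy_flush_declared() returns false, which corresponds to the implicit-declaration error above. Including the needed header directly in the .c file removes the dependence on the config option.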
Arseny
> Christophe
>
>>
>> Fixes: 2b0e86cc5de6 ("powerpc/fsl_booke/32: implement KASLR infrastructure")
>> Cc: stable@vger.kernel.org
>> Signed-off-by: Arseny Solokha <asolokha@kb.kras.ru>
>> ---
>> arch/powerpc/mm/nohash/kaslr_booke.c | 1 +
>> 1 file changed, 1 insertion(+)
>>
>> diff --git a/arch/powerpc/mm/nohash/kaslr_booke.c b/arch/powerpc/mm/nohash/kaslr_booke.c
>> index 4a75f2d9bf0e..bce0e5349978 100644
>> --- a/arch/powerpc/mm/nohash/kaslr_booke.c
>> +++ b/arch/powerpc/mm/nohash/kaslr_booke.c
>> @@ -14,6 +14,7 @@
>> #include <linux/memblock.h>
>> #include <linux/libfdt.h>
>> #include <linux/crash_core.h>
>> +#include <asm/cacheflush.h>
>> #include <asm/pgalloc.h>
>> #include <asm/prom.h>
>> #include <asm/kdump.h>
>>
^ permalink raw reply
* Re: [PATCH] scsi: target/sbp: remove firewire SBP target driver
From: Finn Thain @ 2020-06-14 0:03 UTC (permalink / raw)
To: Chris Boot
Cc: Bart Van Assche, linux-scsi, Chuhong Yuan, linux-kernel,
Nicholas Bellinger, target-devel, Martin K . Petersen,
linux1394-devel, linuxppc-dev, Stefan Richter
In-Reply-To: <01020172acd3d10f-3964f076-a820-43fc-9494-3f3946e9b7b5-000000@eu-west-1.amazonses.com>
On Sat, 13 Jun 2020, Chris Boot wrote:
> I no longer have the time to maintain this subsystem nor the hardware to
> test patches with.
Then why not patch MAINTAINERS, and orphan it, as per usual practice?
$ git log --oneline MAINTAINERS | grep -i orphan
> It also doesn't appear to have any active users so I doubt anyone will
> miss it.
>
It's not unusual that any Linux driver written more than 5 years ago
"doesn't appear to have any active users".
If a driver has been orphaned and broken in the past, and no-one stepped
up to fix it within a reasonable period, removal would make sense. But
that's not the case here.
I haven't used this driver for a long time, but I still own PowerMacs with
firewire, and I know I'm not the only one.
> Signed-off-by: Chris Boot <bootc@bootc.net>
> ---
> MAINTAINERS | 9 -
> drivers/target/Kconfig | 1 -
> drivers/target/Makefile | 1 -
> drivers/target/sbp/Kconfig | 12 -
> drivers/target/sbp/Makefile | 2 -
> drivers/target/sbp/sbp_target.c | 2350 -------------------------------
> drivers/target/sbp/sbp_target.h | 243 ----
> 7 files changed, 2618 deletions(-)
> delete mode 100644 drivers/target/sbp/Kconfig
> delete mode 100644 drivers/target/sbp/Makefile
> delete mode 100644 drivers/target/sbp/sbp_target.c
> delete mode 100644 drivers/target/sbp/sbp_target.h
>
^ permalink raw reply
* [PATCH] powerpc/powernv/pci: add ifdef to avoid dead code
From: Greg Thelen @ 2020-06-14 5:54 UTC (permalink / raw)
To: Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
Oliver O'Halloran
Cc: Greg Thelen, linuxppc-dev, linux-kernel
Commit dc3d8f85bb57 ("powerpc/powernv/pci: Re-work bus PE
configuration") removed a couple pnv_ioda_setup_bus_dma() calls. The
only remaining calls are behind CONFIG_IOMMU_API. Thus builds without
CONFIG_IOMMU_API see:
arch/powerpc/platforms/powernv/pci-ioda.c:1888:13: error: 'pnv_ioda_setup_bus_dma' defined but not used
Add CONFIG_IOMMU_API ifdef guard to avoid dead code.
Fixes: dc3d8f85bb57 ("powerpc/powernv/pci: Re-work bus PE configuration")
Signed-off-by: Greg Thelen <gthelen@google.com>
---
arch/powerpc/platforms/powernv/pci-ioda.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 73a63efcf855..f7762052b7c4 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -1885,6 +1885,7 @@ static bool pnv_pci_ioda_iommu_bypass_supported(struct pci_dev *pdev,
return false;
}
+#ifdef CONFIG_IOMMU_API
static void pnv_ioda_setup_bus_dma(struct pnv_ioda_pe *pe, struct pci_bus *bus)
{
struct pci_dev *dev;
@@ -1897,6 +1898,7 @@ static void pnv_ioda_setup_bus_dma(struct pnv_ioda_pe *pe, struct pci_bus *bus)
pnv_ioda_setup_bus_dma(pe, dev->subordinate);
}
}
+#endif
static inline __be64 __iomem *pnv_ioda_get_inval_reg(struct pnv_phb *phb,
bool real_mode)
--
2.27.0.290.gba653c62da-goog
^ permalink raw reply related
* Re: [PATCH] powerpc/powernv/pci: add ifdef to avoid dead code
From: Christophe Leroy @ 2020-06-14 7:26 UTC (permalink / raw)
To: Greg Thelen, Michael Ellerman, Benjamin Herrenschmidt,
Paul Mackerras, Oliver O'Halloran
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <20200614055418.33497-1-gthelen@google.com>
Hi,
Le 14/06/2020 à 07:54, Greg Thelen a écrit :
> Commit dc3d8f85bb57 ("powerpc/powernv/pci: Re-work bus PE
> configuration") removed a couple pnv_ioda_setup_bus_dma() calls. The
> only remaining calls are behind CONFIG_IOMMU_API. Thus builds without
> CONFIG_IOMMU_API see:
> arch/powerpc/platforms/powernv/pci-ioda.c:1888:13: error: 'pnv_ioda_setup_bus_dma' defined but not used
>
> Add CONFIG_IOMMU_API ifdef guard to avoid dead code.
I think you should move the function down into the same ifdef as the
callers instead of adding a new ifdef.
Christophe
>
> Fixes: dc3d8f85bb57 ("powerpc/powernv/pci: Re-work bus PE configuration")
> Signed-off-by: Greg Thelen <gthelen@google.com>
> ---
> arch/powerpc/platforms/powernv/pci-ioda.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
> index 73a63efcf855..f7762052b7c4 100644
> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> @@ -1885,6 +1885,7 @@ static bool pnv_pci_ioda_iommu_bypass_supported(struct pci_dev *pdev,
> return false;
> }
>
> +#ifdef CONFIG_IOMMU_API
> static void pnv_ioda_setup_bus_dma(struct pnv_ioda_pe *pe, struct pci_bus *bus)
> {
> struct pci_dev *dev;
> @@ -1897,6 +1898,7 @@ static void pnv_ioda_setup_bus_dma(struct pnv_ioda_pe *pe, struct pci_bus *bus)
> pnv_ioda_setup_bus_dma(pe, dev->subordinate);
> }
> }
> +#endif
>
> static inline __be64 __iomem *pnv_ioda_get_inval_reg(struct pnv_phb *phb,
> bool real_mode)
>
^ permalink raw reply
* [PATCH] powerpc/perf: fix missing is_sier_available() during build
From: Madhavan Srinivasan @ 2020-06-14 8:36 UTC (permalink / raw)
To: mpe; +Cc: Aneesh Kumar K . V, Madhavan Srinivasan, linuxppc-dev
Compilation error:
arch/powerpc/perf/perf_regs.c:80:undefined reference to `.is_sier_available'
is_sier_available() is currently defined in core-book3s.c, which is
built only when CONFIG_PPC_PERF_CTRS is set. A config with
CONFIG_PERF_EVENTS but without CONFIG_PPC_PERF_CTRS therefore fails
to link because is_sier_available() is missing. Add a stub
is_sier_available() that returns false to asm/perf_event.h to fix
the build for configs without CONFIG_PPC_PERF_CTRS.
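The stub approach described above can be sketched outside the kernel like this. It is an illustration only: TOY_PPC_PERF_CTRS, toy_is_sier_available() and toy_read_sier() are hypothetical names standing in for the config option, the real helper and a caller such as perf_regs.c.

```c
#include <stdbool.h>

/*
 * Hypothetical switch standing in for CONFIG_PPC_PERF_CTRS. When it is
 * set, the real definition is expected to come from another object file;
 * when it is not, a static inline stub keeps callers compiling and linking.
 */
#ifdef TOY_PPC_PERF_CTRS
bool toy_is_sier_available(void);
#else
static inline bool toy_is_sier_available(void) { return false; }
#endif

/* Caller that does not need to know which variant it got. */
static unsigned long toy_read_sier(unsigned long raw_sier)
{
    return toy_is_sier_available() ? raw_sier : 0;
}
```

With the switch unset, toy_read_sier() always yields 0, and the compiler can drop the unused branch, so the stub costs nothing at run time.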
Fixes: 333804dc3b7a9 ("powerpc/perf: Update perf_regs structure to include SIER")
Reported-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
---
arch/powerpc/include/asm/perf_event.h | 2 ++
1 file changed, 2 insertions(+)
diff --git a/arch/powerpc/include/asm/perf_event.h b/arch/powerpc/include/asm/perf_event.h
index eed3954082fa..1e8b2e1ec1db 100644
--- a/arch/powerpc/include/asm/perf_event.h
+++ b/arch/powerpc/include/asm/perf_event.h
@@ -12,6 +12,8 @@
#ifdef CONFIG_PPC_PERF_CTRS
#include <asm/perf_event_server.h>
+#else
+static inline bool is_sier_available(void) { return false; }
#endif
#ifdef CONFIG_FSL_EMB_PERF_EVENT
--
2.26.2
^ permalink raw reply related
* Re: PowerPC KVM-PR issue
From: Nicholas Piggin @ 2020-06-14 8:50 UTC (permalink / raw)
To: Christian Zigotzky, kvm-ppc@vger.kernel.org, linuxppc-dev
Cc: Darren Stevens, R.T.Dickinson, Christian Zigotzky
In-Reply-To: <fffeb817-35e0-2562-b3cf-2fd476948c76@xenosoft.de>
Excerpts from Christian Zigotzky's message of June 12, 2020 11:01 pm:
> On 11 June 2020 at 04:47 pm, Christian Zigotzky wrote:
>> On 10 June 2020 at 01:23 pm, Christian Zigotzky wrote:
>>> On 10 June 2020 at 11:06 am, Christian Zigotzky wrote:
>>>> On 10 June 2020 at 00:18 am, Christian Zigotzky wrote:
>>>>> Hello,
>>>>>
>>>>> KVM-PR doesn't work anymore on my Nemo board [1]. I figured out
>>>>> that the Git kernels and the kernel 5.7 are affected.
>>>>>
>>>>> Error message: Fienix kernel: kvmppc_exit_pr_progint: emulation at
>>>>> 700 failed (00000000)
>>>>>
>>>>> I can boot virtual QEMU PowerPC machines with KVM-PR with the
>>>>> kernel 5.6 without any problems on my Nemo board.
>>>>>
>>>>> I tested it with QEMU 2.5.0 and QEMU 5.0.0 today.
>>>>>
>>>>> Could you please check KVM-PR on your PowerPC machine?
>>>>>
>>>>> Thanks,
>>>>> Christian
>>>>>
>>>>> [1] https://en.wikipedia.org/wiki/AmigaOne_X1000
>>>>
>>>> I figured out that the PowerPC updates 5.7-1 [1] are responsible for
>>>> the KVM-PR issue. Please test KVM-PR on your PowerPC machines and
>>>> check the PowerPC updates 5.7-1 [1].
>>>>
>>>> Thanks
>>>>
>>>> [1]
>>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d38c07afc356ddebaa3ed8ecb3f553340e05c969
>>>>
>>>>
>>> I tested the latest Git kernel with Mac-on-Linux/KVM-PR today.
>>> Unfortunately I can't use KVM-PR with MoL anymore because of this
>>> issue (see screenshots [1]). Please check the PowerPC updates 5.7-1.
>>>
>>> Thanks
>>>
>>> [1]
>>> -
>>> https://i.pinimg.com/originals/0c/b3/64/0cb364a40241fa2b7f297d4272bbb8b7.png
>>> -
>>> https://i.pinimg.com/originals/9a/61/d1/9a61d170b1c9f514f7a78a3014ffd18f.png
>>>
>> Hi All,
>>
>> I bisected today because of the KVM-PR issue.
>>
>> Result:
>>
>> 9600f261acaaabd476d7833cec2dd20f2919f1a0 is the first bad commit
>> commit 9600f261acaaabd476d7833cec2dd20f2919f1a0
>> Author: Nicholas Piggin <npiggin@gmail.com>
>> Date: Wed Feb 26 03:35:21 2020 +1000
>>
>> powerpc/64s/exception: Move KVM test to common code
>>
>> This allows more code to be moved out of unrelocated regions. The
>> system call KVMTEST is changed to be open-coded and remain in the
>> tramp area to avoid having to move it to entry_64.S. The custom
>> nature
>> of the system call entry code means the hcall case can be made more
>> streamlined than regular interrupt handlers.
>>
>> mpe: Incorporate fix from Nick:
>>
>> Moving KVM test to the common entry code missed the case of HMI and
>> MCE, which do not do __GEN_COMMON_ENTRY (because they don't want to
>> switch to virt mode).
>>
>> This means a MCE or HMI exception that is taken while KVM is
>> running a
>> guest context will not be switched out of that context, and KVM won't
>> be notified. Found by running sigfuz in guest with patched host on
>> POWER9 DD2.3, which causes some TM related HMI interrupts (which are
>> expected and supposed to be handled by KVM).
>>
>> This fix adds a __GEN_REALMODE_COMMON_ENTRY for those handlers to add
>> the KVM test. This makes them look a little more like other handlers
>> that all use __GEN_COMMON_ENTRY.
>>
>> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
>> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
>> Link:
>> https://lore.kernel.org/r/20200225173541.1549955-13-npiggin@gmail.com
>>
>> :040000 040000 ec21cec22d165f8696d69532734cb2985d532cb0
>> 87dd49a9cd7202ec79350e8ca26cea01f1dbd93d M arch
>>
>> -----
>>
>> The following commit is the problem: powerpc/64s/exception: Move KVM
>> test to common code [1]
>>
>> These changes were included in the PowerPC updates 5.7-1. [2]
>>
>> Another test:
>>
>> git checkout d38c07afc356ddebaa3ed8ecb3f553340e05c969 (PowerPC updates
>> 5.7-1 [2] ) -> KVM-PR doesn't work.
>>
>> After that: git revert d38c07afc356ddebaa3ed8ecb3f553340e05c969 -m 1
>> -> KVM-PR works.
>>
>> Could you please check the first bad commit? [1]
>>
>> Thanks,
>> Christian
>>
>>
>> [1]
>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=9600f261acaaabd476d7833cec2dd20f2919f1a0
>> [2]
>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d38c07afc356ddebaa3ed8ecb3f553340e05c969
>
> Hi All,
>
> I tried to revert the __GEN_REALMODE_COMMON_ENTRY fix for the latest Git
> kernel and for the stable kernel 5.7.2 but without any success. There
> was a lot of restructuring work during the kernel 5.7 development time in
> the PowerPC area so it isn't possible to reactivate the old code. That
> means we have lost the whole KVM-PR support. I also reported this issue
> to Alexander Graf two days ago. He wrote: "Howdy :). It looks pretty
> broken. Have you ever made a bisect to see where the problem comes from?"
>
> Please check the KVM-PR code.
Hey, thanks for debugging it and reporting. I'm looking into it, will
try to get a fix soon.
Thanks,
Nick
^ permalink raw reply