* [PATCH v4 00/11] s390/kvm: Handle guest-related program interrupts in KVM
@ 2024-10-22 12:05 Claudio Imbrenda
2024-10-22 12:05 ` [PATCH v4 01/11] s390/entry: Remove __GMAP_ASCE and use _PIF_GUEST_FAULT again Claudio Imbrenda
` (11 more replies)
0 siblings, 12 replies; 16+ messages in thread
From: Claudio Imbrenda @ 2024-10-22 12:05 UTC (permalink / raw)
To: linux-kernel
Cc: borntraeger, nsg, nrb, frankja, seiden, hca, agordeev, gor,
gerald.schaefer, kvm, linux-s390, david
This patch series moves the handling of host program interrupts that
happen while a KVM guest is running into KVM itself.
All program interrupts that happen in the host while a KVM guest is
running are due to DAT exceptions. It is cleaner and more maintainable
to have KVM handle those.
As a side effect, some additional cleanups are also possible.
Moreover, this series serves as a foundation for an upcoming series
that will further move as much s390 KVM memory management as possible
into KVM itself, and away from the rest of the kernel.
v3->v4:
* patch 5: move check for primary ASCE from the interrupt handler to
the handlers of the specific faults where we expect the ASCE
indication to be meaningful.
* patch 6: remove enabled_gmap from struct kvm_cpu_arch, since it is
now unused.
* picked up some R-Bs from Heiko
Claudio Imbrenda (8):
s390/entry: Remove __GMAP_ASCE and use _PIF_GUEST_FAULT again
s390/kvm: Remove kvm_arch_fault_in_page()
s390/mm/gmap: Refactor gmap_fault() and add support for pfault
s390/mm/gmap: Fix __gmap_fault() return code
s390/mm/fault: Handle guest-related program interrupts in KVM
s390/kvm: Stop using gmap_{en,dis}able()
s390/mm/gmap: Remove gmap_{en,dis}able()
s390: Remove gmap pointer from lowcore
Heiko Carstens (3):
s390/mm: Simplify get_fault_type()
s390/mm: Get rid of fault type switch statements
s390/mm: Convert to LOCK_MM_AND_FIND_VMA
arch/s390/Kconfig | 1 +
arch/s390/include/asm/gmap.h | 3 -
arch/s390/include/asm/kvm_host.h | 5 +-
arch/s390/include/asm/lowcore.h | 3 +-
arch/s390/include/asm/processor.h | 5 +-
arch/s390/include/asm/ptrace.h | 2 +
arch/s390/kernel/asm-offsets.c | 3 -
arch/s390/kernel/entry.S | 44 ++-----
arch/s390/kernel/traps.c | 23 +++-
arch/s390/kvm/intercept.c | 4 +-
arch/s390/kvm/kvm-s390.c | 142 +++++++++++++++-------
arch/s390/kvm/kvm-s390.h | 8 +-
arch/s390/kvm/vsie.c | 17 ++-
arch/s390/mm/fault.c | 195 +++++-------------------------
arch/s390/mm/gmap.c | 151 +++++++++++++++--------
15 files changed, 281 insertions(+), 325 deletions(-)
--
2.47.0
^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH v4 01/11] s390/entry: Remove __GMAP_ASCE and use _PIF_GUEST_FAULT again
2024-10-22 12:05 [PATCH v4 00/11] s390/kvm: Handle guest-related program interrupts in KVM Claudio Imbrenda
@ 2024-10-22 12:05 ` Claudio Imbrenda
2024-10-22 12:05 ` [PATCH v4 02/11] s390/kvm: Remove kvm_arch_fault_in_page() Claudio Imbrenda
` (10 subsequent siblings)
11 siblings, 0 replies; 16+ messages in thread
From: Claudio Imbrenda @ 2024-10-22 12:05 UTC (permalink / raw)
To: linux-kernel
Cc: borntraeger, nsg, nrb, frankja, seiden, hca, agordeev, gor,
gerald.schaefer, kvm, linux-s390, david
Now that the guest ASCE is passed as a parameter to __sie64a(),
_PIF_GUEST_FAULT can be used again to determine whether the fault was a
guest or host fault.
Since the guest ASCE will no longer be taken from the gmap pointer in
lowcore, __GMAP_ASCE can be removed. For the same reason, the guest
ASCE now needs to be saved into the cr1 save area unconditionally.
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Acked-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
---
arch/s390/include/asm/ptrace.h | 2 ++
arch/s390/kernel/asm-offsets.c | 2 --
arch/s390/kernel/entry.S | 26 ++++++++++++++------------
arch/s390/mm/fault.c | 6 ++----
4 files changed, 18 insertions(+), 18 deletions(-)
diff --git a/arch/s390/include/asm/ptrace.h b/arch/s390/include/asm/ptrace.h
index 2ad9324f6338..788bc4467445 100644
--- a/arch/s390/include/asm/ptrace.h
+++ b/arch/s390/include/asm/ptrace.h
@@ -14,11 +14,13 @@
#define PIF_SYSCALL 0 /* inside a system call */
#define PIF_EXECVE_PGSTE_RESTART 1 /* restart execve for PGSTE binaries */
#define PIF_SYSCALL_RET_SET 2 /* return value was set via ptrace */
+#define PIF_GUEST_FAULT 3 /* indicates program check in sie64a */
#define PIF_FTRACE_FULL_REGS 4 /* all register contents valid (ftrace) */
#define _PIF_SYSCALL BIT(PIF_SYSCALL)
#define _PIF_EXECVE_PGSTE_RESTART BIT(PIF_EXECVE_PGSTE_RESTART)
#define _PIF_SYSCALL_RET_SET BIT(PIF_SYSCALL_RET_SET)
+#define _PIF_GUEST_FAULT BIT(PIF_GUEST_FAULT)
#define _PIF_FTRACE_FULL_REGS BIT(PIF_FTRACE_FULL_REGS)
#define PSW32_MASK_PER _AC(0x40000000, UL)
diff --git a/arch/s390/kernel/asm-offsets.c b/arch/s390/kernel/asm-offsets.c
index 5529248d84fb..3a6ee5043761 100644
--- a/arch/s390/kernel/asm-offsets.c
+++ b/arch/s390/kernel/asm-offsets.c
@@ -13,7 +13,6 @@
#include <linux/purgatory.h>
#include <linux/pgtable.h>
#include <linux/ftrace.h>
-#include <asm/gmap.h>
#include <asm/stacktrace.h>
int main(void)
@@ -161,7 +160,6 @@ int main(void)
OFFSET(__LC_PGM_TDB, lowcore, pgm_tdb);
BLANK();
/* gmap/sie offsets */
- OFFSET(__GMAP_ASCE, gmap, asce);
OFFSET(__SIE_PROG0C, kvm_s390_sie_block, prog0c);
OFFSET(__SIE_PROG20, kvm_s390_sie_block, prog20);
/* kexec_sha_region */
diff --git a/arch/s390/kernel/entry.S b/arch/s390/kernel/entry.S
index d6d5317f768e..454841229ef4 100644
--- a/arch/s390/kernel/entry.S
+++ b/arch/s390/kernel/entry.S
@@ -327,13 +327,23 @@ SYM_CODE_START(pgm_check_handler)
GET_LC %r13
stpt __LC_SYS_ENTER_TIMER(%r13)
BPOFF
- lgr %r10,%r15
lmg %r8,%r9,__LC_PGM_OLD_PSW(%r13)
+ xgr %r10,%r10
+ xgr %r12,%r12
tmhh %r8,0x0001 # coming from user space?
jno .Lpgm_skip_asce
lctlg %c1,%c1,__LC_KERNEL_ASCE(%r13)
j 3f # -> fault in user space
.Lpgm_skip_asce:
+#if IS_ENABLED(CONFIG_KVM)
+ lg %r11,__LC_CURRENT(%r13)
+ tm __TI_sie(%r11),0xff
+ jz 1f
+ BPENTER __SF_SIE_FLAGS(%r15),_TIF_ISOLATE_BP_GUEST
+ SIEEXIT __SF_SIE_CONTROL(%r15),%r13
+ lg %r12,__SF_SIE_GUEST_ASCE(%r15)
+ lghi %r10,_PIF_GUEST_FAULT
+#endif
1: tmhh %r8,0x4000 # PER bit set in old PSW ?
jnz 2f # -> enabled, can't be a double fault
tm __LC_PGM_ILC+3(%r13),0x80 # check for per exception
@@ -344,21 +354,13 @@ SYM_CODE_START(pgm_check_handler)
CHECK_VMAP_STACK __LC_SAVE_AREA,%r13,4f
3: lg %r15,__LC_KERNEL_STACK(%r13)
4: la %r11,STACK_FRAME_OVERHEAD(%r15)
- xc __PT_FLAGS(8,%r11),__PT_FLAGS(%r11)
+ stg %r10,__PT_FLAGS(%r11)
+ stg %r12,__PT_CR1(%r11)
xc __SF_BACKCHAIN(8,%r15),__SF_BACKCHAIN(%r15)
stmg %r0,%r7,__PT_R0(%r11)
mvc __PT_R8(64,%r11),__LC_SAVE_AREA(%r13)
mvc __PT_LAST_BREAK(8,%r11),__LC_PGM_LAST_BREAK(%r13)
- stctg %c1,%c1,__PT_CR1(%r11)
-#if IS_ENABLED(CONFIG_KVM)
- ltg %r12,__LC_GMAP(%r13)
- jz 5f
- clc __GMAP_ASCE(8,%r12), __PT_CR1(%r11)
- jne 5f
- BPENTER __SF_SIE_FLAGS(%r10),_TIF_ISOLATE_BP_GUEST
- SIEEXIT __SF_SIE_CONTROL(%r10),%r13
-#endif
-5: stmg %r8,%r9,__PT_PSW(%r11)
+ stmg %r8,%r9,__PT_PSW(%r11)
# clear user controlled registers to prevent speculative use
xgr %r0,%r0
xgr %r1,%r1
diff --git a/arch/s390/mm/fault.c b/arch/s390/mm/fault.c
index ad8b0d6b77ea..a6cf33b0f339 100644
--- a/arch/s390/mm/fault.c
+++ b/arch/s390/mm/fault.c
@@ -68,15 +68,13 @@ early_initcall(fault_init);
static enum fault_type get_fault_type(struct pt_regs *regs)
{
union teid teid = { .val = regs->int_parm_long };
- struct gmap *gmap;
if (likely(teid.as == PSW_BITS_AS_PRIMARY)) {
if (user_mode(regs))
return USER_FAULT;
if (!IS_ENABLED(CONFIG_PGSTE))
return KERNEL_FAULT;
- gmap = (struct gmap *)get_lowcore()->gmap;
- if (gmap && gmap->asce == regs->cr1)
+ if (test_pt_regs_flag(regs, PIF_GUEST_FAULT))
return GMAP_FAULT;
return KERNEL_FAULT;
}
@@ -187,7 +185,7 @@ static void dump_fault_info(struct pt_regs *regs)
pr_cont("user ");
break;
case GMAP_FAULT:
- asce = ((struct gmap *)get_lowcore()->gmap)->asce;
+ asce = regs->cr1;
pr_cont("gmap ");
break;
case KERNEL_FAULT:
--
2.47.0
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH v4 02/11] s390/kvm: Remove kvm_arch_fault_in_page()
2024-10-22 12:05 [PATCH v4 00/11] s390/kvm: Handle guest-related program interrupts in KVM Claudio Imbrenda
2024-10-22 12:05 ` [PATCH v4 01/11] s390/entry: Remove __GMAP_ASCE and use _PIF_GUEST_FAULT again Claudio Imbrenda
@ 2024-10-22 12:05 ` Claudio Imbrenda
2024-10-22 12:05 ` [PATCH v4 03/11] s390/mm/gmap: Refactor gmap_fault() and add support for pfault Claudio Imbrenda
` (9 subsequent siblings)
11 siblings, 0 replies; 16+ messages in thread
From: Claudio Imbrenda @ 2024-10-22 12:05 UTC (permalink / raw)
To: linux-kernel
Cc: borntraeger, nsg, nrb, frankja, seiden, hca, agordeev, gor,
gerald.schaefer, kvm, linux-s390, david
kvm_arch_fault_in_page() is a useless wrapper around gmap_fault(); just
use gmap_fault() directly instead.
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
---
arch/s390/kvm/intercept.c | 4 ++--
arch/s390/kvm/kvm-s390.c | 18 +-----------------
arch/s390/kvm/kvm-s390.h | 1 -
3 files changed, 3 insertions(+), 20 deletions(-)
diff --git a/arch/s390/kvm/intercept.c b/arch/s390/kvm/intercept.c
index b16352083ff9..5bbaadf75dc6 100644
--- a/arch/s390/kvm/intercept.c
+++ b/arch/s390/kvm/intercept.c
@@ -367,7 +367,7 @@ static int handle_mvpg_pei(struct kvm_vcpu *vcpu)
reg2, &srcaddr, GACC_FETCH, 0);
if (rc)
return kvm_s390_inject_prog_cond(vcpu, rc);
- rc = kvm_arch_fault_in_page(vcpu, srcaddr, 0);
+ rc = gmap_fault(vcpu->arch.gmap, srcaddr, 0);
if (rc != 0)
return rc;
@@ -376,7 +376,7 @@ static int handle_mvpg_pei(struct kvm_vcpu *vcpu)
reg1, &dstaddr, GACC_STORE, 0);
if (rc)
return kvm_s390_inject_prog_cond(vcpu, rc);
- rc = kvm_arch_fault_in_page(vcpu, dstaddr, 1);
+ rc = gmap_fault(vcpu->arch.gmap, dstaddr, FAULT_FLAG_WRITE);
if (rc != 0)
return rc;
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index bb7134faaebf..08f0c80ef5e9 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -4579,22 +4579,6 @@ int kvm_s390_try_set_tod_clock(struct kvm *kvm, const struct kvm_s390_vm_tod_clo
return 1;
}
-/**
- * kvm_arch_fault_in_page - fault-in guest page if necessary
- * @vcpu: The corresponding virtual cpu
- * @gpa: Guest physical address
- * @writable: Whether the page should be writable or not
- *
- * Make sure that a guest page has been faulted-in on the host.
- *
- * Return: Zero on success, negative error code otherwise.
- */
-long kvm_arch_fault_in_page(struct kvm_vcpu *vcpu, gpa_t gpa, int writable)
-{
- return gmap_fault(vcpu->arch.gmap, gpa,
- writable ? FAULT_FLAG_WRITE : 0);
-}
-
static void __kvm_inject_pfault_token(struct kvm_vcpu *vcpu, bool start_token,
unsigned long token)
{
@@ -4797,7 +4781,7 @@ static int vcpu_post_run(struct kvm_vcpu *vcpu, int exit_reason)
if (kvm_arch_setup_async_pf(vcpu))
return 0;
vcpu->stat.pfault_sync++;
- return kvm_arch_fault_in_page(vcpu, current->thread.gmap_addr, 1);
+ return gmap_fault(vcpu->arch.gmap, current->thread.gmap_addr, FAULT_FLAG_WRITE);
}
return vcpu_post_run_fault_in_sie(vcpu);
}
diff --git a/arch/s390/kvm/kvm-s390.h b/arch/s390/kvm/kvm-s390.h
index e680c6bf0c9d..0765ad1031c4 100644
--- a/arch/s390/kvm/kvm-s390.h
+++ b/arch/s390/kvm/kvm-s390.h
@@ -394,7 +394,6 @@ int kvm_s390_handle_sigp_pei(struct kvm_vcpu *vcpu);
/* implemented in kvm-s390.c */
int kvm_s390_try_set_tod_clock(struct kvm *kvm, const struct kvm_s390_vm_tod_clock *gtod);
-long kvm_arch_fault_in_page(struct kvm_vcpu *vcpu, gpa_t gpa, int writable);
int kvm_s390_store_status_unloaded(struct kvm_vcpu *vcpu, unsigned long addr);
int kvm_s390_vcpu_store_status(struct kvm_vcpu *vcpu, unsigned long addr);
int kvm_s390_vcpu_start(struct kvm_vcpu *vcpu);
--
2.47.0
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH v4 03/11] s390/mm/gmap: Refactor gmap_fault() and add support for pfault
2024-10-22 12:05 [PATCH v4 00/11] s390/kvm: Handle guest-related program interrupts in KVM Claudio Imbrenda
2024-10-22 12:05 ` [PATCH v4 01/11] s390/entry: Remove __GMAP_ASCE and use _PIF_GUEST_FAULT again Claudio Imbrenda
2024-10-22 12:05 ` [PATCH v4 02/11] s390/kvm: Remove kvm_arch_fault_in_page() Claudio Imbrenda
@ 2024-10-22 12:05 ` Claudio Imbrenda
2024-10-22 12:05 ` [PATCH v4 04/11] s390/mm/gmap: Fix __gmap_fault() return code Claudio Imbrenda
` (8 subsequent siblings)
11 siblings, 0 replies; 16+ messages in thread
From: Claudio Imbrenda @ 2024-10-22 12:05 UTC (permalink / raw)
To: linux-kernel
Cc: borntraeger, nsg, nrb, frankja, seiden, hca, agordeev, gor,
gerald.schaefer, kvm, linux-s390, david
When specifying FAULT_FLAG_RETRY_NOWAIT as a flag for gmap_fault(), the
gmap fault will be processed only if it can be resolved quickly and
without sleeping. This will be needed for pfault.
Refactor gmap_fault() to improve readability.
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Acked-by: Alexander Gordeev <agordeev@linux.ibm.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
---
arch/s390/mm/gmap.c | 119 +++++++++++++++++++++++++++++++++++++-------
1 file changed, 100 insertions(+), 19 deletions(-)
diff --git a/arch/s390/mm/gmap.c b/arch/s390/mm/gmap.c
index eb0b51a36be0..f51ad948ba53 100644
--- a/arch/s390/mm/gmap.c
+++ b/arch/s390/mm/gmap.c
@@ -637,44 +637,125 @@ int __gmap_link(struct gmap *gmap, unsigned long gaddr, unsigned long vmaddr)
}
/**
- * gmap_fault - resolve a fault on a guest address
+ * fixup_user_fault_nowait - manually resolve a user page fault without waiting
+ * @mm: mm_struct of target mm
+ * @address: user address
+ * @fault_flags:flags to pass down to handle_mm_fault()
+ * @unlocked: did we unlock the mmap_lock while retrying
+ *
+ * This function behaves similarly to fixup_user_fault(), but it guarantees
+ * that the fault will be resolved without waiting. The function might drop
+ * and re-acquire the mm lock, in which case @unlocked will be set to true.
+ *
+ * The guarantee is that the fault is handled without waiting, but the
+ * function itself might sleep, due to the lock.
+ *
+ * Context: Needs to be called with mm->mmap_lock held in read mode, and will
+ * return with the lock held in read mode; @unlocked will indicate whether
+ * the lock has been dropped and re-acquired. This is the same behaviour as
+ * fixup_user_fault().
+ *
+ * Return: 0 on success, -EAGAIN if the fault cannot be resolved without
+ * waiting, -EFAULT if the fault cannot be resolved, -ENOMEM if out of
+ * memory.
+ */
+static int fixup_user_fault_nowait(struct mm_struct *mm, unsigned long address,
+ unsigned int fault_flags, bool *unlocked)
+{
+ struct vm_area_struct *vma;
+ unsigned int test_flags;
+ vm_fault_t fault;
+ int rc;
+
+ fault_flags |= FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_RETRY_NOWAIT;
+ test_flags = fault_flags & FAULT_FLAG_WRITE ? VM_WRITE : VM_READ;
+
+ vma = find_vma(mm, address);
+ if (unlikely(!vma || address < vma->vm_start))
+ return -EFAULT;
+ if (unlikely(!(vma->vm_flags & test_flags)))
+ return -EFAULT;
+
+ fault = handle_mm_fault(vma, address, fault_flags, NULL);
+ /* the mm lock has been dropped, take it again */
+ if (fault & VM_FAULT_COMPLETED) {
+ *unlocked = true;
+ mmap_read_lock(mm);
+ return 0;
+ }
+ /* the mm lock has not been dropped */
+ if (fault & VM_FAULT_ERROR) {
+ rc = vm_fault_to_errno(fault, 0);
+ BUG_ON(!rc);
+ return rc;
+ }
+ /* the mm lock has not been dropped because of FAULT_FLAG_RETRY_NOWAIT */
+ if (fault & VM_FAULT_RETRY)
+ return -EAGAIN;
+ /* nothing needed to be done and the mm lock has not been dropped */
+ return 0;
+}
+
+/**
+ * __gmap_fault - resolve a fault on a guest address
* @gmap: pointer to guest mapping meta data structure
* @gaddr: guest address
* @fault_flags: flags to pass down to handle_mm_fault()
*
- * Returns 0 on success, -ENOMEM for out of memory conditions, and -EFAULT
- * if the vm address is already mapped to a different guest segment.
+ * Context: Needs to be called with mm->mmap_lock held in read mode. Might
+ * drop and re-acquire the lock. Will always return with the lock held.
*/
-int gmap_fault(struct gmap *gmap, unsigned long gaddr,
- unsigned int fault_flags)
+static int __gmap_fault(struct gmap *gmap, unsigned long gaddr, unsigned int fault_flags)
{
unsigned long vmaddr;
- int rc;
bool unlocked;
-
- mmap_read_lock(gmap->mm);
+ int rc = 0;
retry:
unlocked = false;
+
vmaddr = __gmap_translate(gmap, gaddr);
- if (IS_ERR_VALUE(vmaddr)) {
- rc = vmaddr;
- goto out_up;
- }
- if (fixup_user_fault(gmap->mm, vmaddr, fault_flags,
- &unlocked)) {
- rc = -EFAULT;
- goto out_up;
+ if (IS_ERR_VALUE(vmaddr))
+ return vmaddr;
+
+ if (fault_flags & FAULT_FLAG_RETRY_NOWAIT) {
+ rc = fixup_user_fault_nowait(gmap->mm, vmaddr, fault_flags, &unlocked);
+ if (rc)
+ return rc;
+ } else if (fixup_user_fault(gmap->mm, vmaddr, fault_flags, &unlocked)) {
+ return -EFAULT;
}
/*
* In the case that fixup_user_fault unlocked the mmap_lock during
- * faultin redo __gmap_translate to not race with a map/unmap_segment.
+ * fault-in, redo __gmap_translate() to avoid racing with a
+ * map/unmap_segment.
+ * In particular, __gmap_translate(), fixup_user_fault{,_nowait}(),
+ * and __gmap_link() must all be called atomically in one go; if the
+ * lock had been dropped in between, a retry is needed.
*/
if (unlocked)
goto retry;
- rc = __gmap_link(gmap, gaddr, vmaddr);
-out_up:
+ return __gmap_link(gmap, gaddr, vmaddr);
+}
+
+/**
+ * gmap_fault - resolve a fault on a guest address
+ * @gmap: pointer to guest mapping meta data structure
+ * @gaddr: guest address
+ * @fault_flags: flags to pass down to handle_mm_fault()
+ *
+ * Returns 0 on success, -ENOMEM for out of memory conditions, -EFAULT if the
+ * vm address is already mapped to a different guest segment, and -EAGAIN if
+ * FAULT_FLAG_RETRY_NOWAIT was specified and the fault could not be processed
+ * immediately.
+ */
+int gmap_fault(struct gmap *gmap, unsigned long gaddr, unsigned int fault_flags)
+{
+ int rc;
+
+ mmap_read_lock(gmap->mm);
+ rc = __gmap_fault(gmap, gaddr, fault_flags);
mmap_read_unlock(gmap->mm);
return rc;
}
--
2.47.0
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH v4 04/11] s390/mm/gmap: Fix __gmap_fault() return code
2024-10-22 12:05 [PATCH v4 00/11] s390/kvm: Handle guest-related program interrupts in KVM Claudio Imbrenda
` (2 preceding siblings ...)
2024-10-22 12:05 ` [PATCH v4 03/11] s390/mm/gmap: Refactor gmap_fault() and add support for pfault Claudio Imbrenda
@ 2024-10-22 12:05 ` Claudio Imbrenda
2024-10-22 12:05 ` [PATCH v4 05/11] s390/mm/fault: Handle guest-related program interrupts in KVM Claudio Imbrenda
` (7 subsequent siblings)
11 siblings, 0 replies; 16+ messages in thread
From: Claudio Imbrenda @ 2024-10-22 12:05 UTC (permalink / raw)
To: linux-kernel
Cc: borntraeger, nsg, nrb, frankja, seiden, hca, agordeev, gor,
gerald.schaefer, kvm, linux-s390, david
Errors in fixup_user_fault() were masked and -EFAULT was returned for
any error, including out of memory.
Fix this by returning the correct error code. This means that in many
cases the error code will be propagated all the way to userspace.
Suggested-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Acked-by: Alexander Gordeev <agordeev@linux.ibm.com>
---
arch/s390/mm/gmap.c | 11 +++++------
1 file changed, 5 insertions(+), 6 deletions(-)
diff --git a/arch/s390/mm/gmap.c b/arch/s390/mm/gmap.c
index f51ad948ba53..a8746f71c679 100644
--- a/arch/s390/mm/gmap.c
+++ b/arch/s390/mm/gmap.c
@@ -718,13 +718,12 @@ static int __gmap_fault(struct gmap *gmap, unsigned long gaddr, unsigned int fau
if (IS_ERR_VALUE(vmaddr))
return vmaddr;
- if (fault_flags & FAULT_FLAG_RETRY_NOWAIT) {
+ if (fault_flags & FAULT_FLAG_RETRY_NOWAIT)
rc = fixup_user_fault_nowait(gmap->mm, vmaddr, fault_flags, &unlocked);
- if (rc)
- return rc;
- } else if (fixup_user_fault(gmap->mm, vmaddr, fault_flags, &unlocked)) {
- return -EFAULT;
- }
+ else
+ rc = fixup_user_fault(gmap->mm, vmaddr, fault_flags, &unlocked);
+ if (rc)
+ return rc;
/*
* In the case that fixup_user_fault unlocked the mmap_lock during
* fault-in, redo __gmap_translate() to avoid racing with a
--
2.47.0
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH v4 05/11] s390/mm/fault: Handle guest-related program interrupts in KVM
2024-10-22 12:05 [PATCH v4 00/11] s390/kvm: Handle guest-related program interrupts in KVM Claudio Imbrenda
` (3 preceding siblings ...)
2024-10-22 12:05 ` [PATCH v4 04/11] s390/mm/gmap: Fix __gmap_fault() return code Claudio Imbrenda
@ 2024-10-22 12:05 ` Claudio Imbrenda
2024-10-22 12:38 ` Alexander Gordeev
2024-10-22 12:05 ` [PATCH v4 06/11] s390/kvm: Stop using gmap_{en,dis}able() Claudio Imbrenda
` (6 subsequent siblings)
11 siblings, 1 reply; 16+ messages in thread
From: Claudio Imbrenda @ 2024-10-22 12:05 UTC (permalink / raw)
To: linux-kernel
Cc: borntraeger, nsg, nrb, frankja, seiden, hca, agordeev, gor,
gerald.schaefer, kvm, linux-s390, david
Any program interrupt that happens in the host during the execution of
a KVM guest will now short circuit the fault handler and return to KVM
immediately. Guest fault handling (including pfault) will happen
entirely inside KVM.
When sie64a() returns zero, current->thread.gmap_int_code will contain
the program interrupt number that caused the exit, or zero if the exit
was not caused by a host program interrupt.
KVM will now take care of handling all guest faults in vcpu_post_run().
Since gmap faults will no longer be visible to the rest of the kernel,
remove GMAP_FAULT, the Linux fault handlers for secure execution faults,
the exception table entries for the sie instruction, the nop padding
after the sie instruction, and all other references to guest faults from
the s390 code.
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Co-developed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
[agordeev@linux.ibm.com: remove spurious flags &= ~FAULT_FLAG_RETRY_NOWAIT]
---
arch/s390/include/asm/kvm_host.h | 3 +
arch/s390/include/asm/processor.h | 5 +-
arch/s390/kernel/entry.S | 22 ------
arch/s390/kernel/traps.c | 23 ++++--
arch/s390/kvm/kvm-s390.c | 119 ++++++++++++++++++++++++------
arch/s390/kvm/kvm-s390.h | 7 ++
arch/s390/kvm/vsie.c | 13 ++--
arch/s390/mm/fault.c | 99 +------------------------
8 files changed, 135 insertions(+), 156 deletions(-)
diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
index 8e77afbed58e..603b56bfccd3 100644
--- a/arch/s390/include/asm/kvm_host.h
+++ b/arch/s390/include/asm/kvm_host.h
@@ -527,6 +527,9 @@ struct kvm_vcpu_stat {
#define PGM_REGION_FIRST_TRANS 0x39
#define PGM_REGION_SECOND_TRANS 0x3a
#define PGM_REGION_THIRD_TRANS 0x3b
+#define PGM_SECURE_STORAGE_ACCESS 0x3d
+#define PGM_NON_SECURE_STORAGE_ACCESS 0x3e
+#define PGM_SECURE_STORAGE_VIOLATION 0x3f
#define PGM_MONITOR 0x40
#define PGM_PER 0x80
#define PGM_CRYPTO_OPERATION 0x119
diff --git a/arch/s390/include/asm/processor.h b/arch/s390/include/asm/processor.h
index 9a5236acc0a8..8761fd01a9f0 100644
--- a/arch/s390/include/asm/processor.h
+++ b/arch/s390/include/asm/processor.h
@@ -39,6 +39,7 @@
#include <asm/runtime_instr.h>
#include <asm/irqflags.h>
#include <asm/alternative.h>
+#include <asm/fault.h>
struct pcpu {
unsigned long ec_mask; /* bit mask for ec_xxx functions */
@@ -187,10 +188,8 @@ struct thread_struct {
unsigned long hardirq_timer; /* task cputime in hardirq context */
unsigned long softirq_timer; /* task cputime in softirq context */
const sys_call_ptr_t *sys_call_table; /* system call table address */
- unsigned long gmap_addr; /* address of last gmap fault. */
- unsigned int gmap_write_flag; /* gmap fault write indication */
+ union teid gmap_teid; /* address and flags of last gmap fault */
unsigned int gmap_int_code; /* int code of last gmap fault */
- unsigned int gmap_pfault; /* signal of a pending guest pfault */
int ufpu_flags; /* user fpu flags */
int kfpu_flags; /* kernel fpu flags */
diff --git a/arch/s390/kernel/entry.S b/arch/s390/kernel/entry.S
index 454841229ef4..924bcb71a33f 100644
--- a/arch/s390/kernel/entry.S
+++ b/arch/s390/kernel/entry.S
@@ -222,17 +222,6 @@ SYM_FUNC_START(__sie64a)
lctlg %c1,%c1,__LC_KERNEL_ASCE(%r14) # load primary asce
lg %r14,__LC_CURRENT(%r14)
mvi __TI_sie(%r14),0
-# some program checks are suppressing. C code (e.g. do_protection_exception)
-# will rewind the PSW by the ILC, which is often 4 bytes in case of SIE. There
-# are some corner cases (e.g. runtime instrumentation) where ILC is unpredictable.
-# Other instructions between __sie64a and .Lsie_done should not cause program
-# interrupts. So lets use 3 nops as a landing pad for all possible rewinds.
-.Lrewind_pad6:
- nopr 7
-.Lrewind_pad4:
- nopr 7
-.Lrewind_pad2:
- nopr 7
SYM_INNER_LABEL(sie_exit, SYM_L_GLOBAL)
lg %r14,__SF_SIE_SAVEAREA(%r15) # load guest register save area
stmg %r0,%r13,0(%r14) # save guest gprs 0-13
@@ -244,15 +233,6 @@ SYM_INNER_LABEL(sie_exit, SYM_L_GLOBAL)
lmg %r6,%r14,__SF_GPRS(%r15) # restore kernel registers
lg %r2,__SF_SIE_REASON(%r15) # return exit reason code
BR_EX %r14
-.Lsie_fault:
- lghi %r14,-EFAULT
- stg %r14,__SF_SIE_REASON(%r15) # set exit reason code
- j sie_exit
-
- EX_TABLE(.Lrewind_pad6,.Lsie_fault)
- EX_TABLE(.Lrewind_pad4,.Lsie_fault)
- EX_TABLE(.Lrewind_pad2,.Lsie_fault)
- EX_TABLE(sie_exit,.Lsie_fault)
SYM_FUNC_END(__sie64a)
EXPORT_SYMBOL(__sie64a)
EXPORT_SYMBOL(sie_exit)
@@ -341,7 +321,6 @@ SYM_CODE_START(pgm_check_handler)
jz 1f
BPENTER __SF_SIE_FLAGS(%r15),_TIF_ISOLATE_BP_GUEST
SIEEXIT __SF_SIE_CONTROL(%r15),%r13
- lg %r12,__SF_SIE_GUEST_ASCE(%r15)
lghi %r10,_PIF_GUEST_FAULT
#endif
1: tmhh %r8,0x4000 # PER bit set in old PSW ?
@@ -355,7 +334,6 @@ SYM_CODE_START(pgm_check_handler)
3: lg %r15,__LC_KERNEL_STACK(%r13)
4: la %r11,STACK_FRAME_OVERHEAD(%r15)
stg %r10,__PT_FLAGS(%r11)
- stg %r12,__PT_CR1(%r11)
xc __SF_BACKCHAIN(8,%r15),__SF_BACKCHAIN(%r15)
stmg %r0,%r7,__PT_R0(%r11)
mvc __PT_R8(64,%r11),__LC_SAVE_AREA(%r13)
diff --git a/arch/s390/kernel/traps.c b/arch/s390/kernel/traps.c
index 160b2acba8db..b7ce15107477 100644
--- a/arch/s390/kernel/traps.c
+++ b/arch/s390/kernel/traps.c
@@ -31,6 +31,7 @@
#include <asm/asm-extable.h>
#include <asm/vtime.h>
#include <asm/fpu.h>
+#include <asm/fault.h>
#include "entry.h"
static inline void __user *get_trap_ip(struct pt_regs *regs)
@@ -317,9 +318,23 @@ void noinstr __do_pgm_check(struct pt_regs *regs)
struct lowcore *lc = get_lowcore();
irqentry_state_t state;
unsigned int trapnr;
+ union teid teid = { .val = lc->trans_exc_code };
regs->int_code = lc->pgm_int_code;
- regs->int_parm_long = lc->trans_exc_code;
+ regs->int_parm_long = teid.val;
+
+ /*
+ * In case of a guest fault, short-circuit the fault handler and return.
+ * This way the sie64a() function will return 0; fault address and
+ * other relevant bits are saved in current->thread.gmap_teid, and
+ * the fault number in current->thread.gmap_int_code. KVM will be
+ * able to use this information to handle the fault.
+ */
+ if (test_pt_regs_flag(regs, PIF_GUEST_FAULT)) {
+ current->thread.gmap_teid.val = regs->int_parm_long;
+ current->thread.gmap_int_code = regs->int_code & 0xffff;
+ return;
+ }
state = irqentry_enter(regs);
@@ -408,8 +423,8 @@ static void (*pgm_check_table[128])(struct pt_regs *regs) = {
[0x3b] = do_dat_exception,
[0x3c] = default_trap_handler,
[0x3d] = do_secure_storage_access,
- [0x3e] = do_non_secure_storage_access,
- [0x3f] = do_secure_storage_violation,
+ [0x3e] = default_trap_handler,
+ [0x3f] = default_trap_handler,
[0x40] = monitor_event_exception,
[0x41 ... 0x7f] = default_trap_handler,
};
@@ -420,5 +435,3 @@ static void (*pgm_check_table[128])(struct pt_regs *regs) = {
__stringify(default_trap_handler))
COND_TRAP(do_secure_storage_access);
-COND_TRAP(do_non_secure_storage_access);
-COND_TRAP(do_secure_storage_violation);
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 08f0c80ef5e9..050710549a10 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -4646,12 +4646,11 @@ static bool kvm_arch_setup_async_pf(struct kvm_vcpu *vcpu)
if (!vcpu->arch.gmap->pfault_enabled)
return false;
- hva = gfn_to_hva(vcpu->kvm, gpa_to_gfn(current->thread.gmap_addr));
- hva += current->thread.gmap_addr & ~PAGE_MASK;
+ hva = gfn_to_hva(vcpu->kvm, current->thread.gmap_teid.addr);
if (read_guest_real(vcpu, vcpu->arch.pfault_token, &arch.pfault_token, 8))
return false;
- return kvm_setup_async_pf(vcpu, current->thread.gmap_addr, hva, &arch);
+ return kvm_setup_async_pf(vcpu, current->thread.gmap_teid.addr * PAGE_SIZE, hva, &arch);
}
static int vcpu_pre_run(struct kvm_vcpu *vcpu)
@@ -4689,6 +4688,7 @@ static int vcpu_pre_run(struct kvm_vcpu *vcpu)
clear_bit(vcpu->vcpu_idx, vcpu->kvm->arch.gisa_int.kicked_mask);
vcpu->arch.sie_block->icptcode = 0;
+ current->thread.gmap_int_code = 0;
cpuflags = atomic_read(&vcpu->arch.sie_block->cpuflags);
VCPU_EVENT(vcpu, 6, "entering sie flags %x", cpuflags);
trace_kvm_s390_sie_enter(vcpu, cpuflags);
@@ -4696,7 +4696,7 @@ static int vcpu_pre_run(struct kvm_vcpu *vcpu)
return 0;
}
-static int vcpu_post_run_fault_in_sie(struct kvm_vcpu *vcpu)
+static int vcpu_post_run_addressing_exception(struct kvm_vcpu *vcpu)
{
struct kvm_s390_pgm_info pgm_info = {
.code = PGM_ADDRESSING,
@@ -4732,10 +4732,100 @@ static int vcpu_post_run_fault_in_sie(struct kvm_vcpu *vcpu)
return kvm_s390_inject_prog_irq(vcpu, &pgm_info);
}
+static int vcpu_post_run_handle_fault(struct kvm_vcpu *vcpu)
+{
+ unsigned long gaddr;
+ unsigned int flags;
+ int rc = 0;
+
+ gaddr = current->thread.gmap_teid.addr * PAGE_SIZE;
+ if (kvm_s390_cur_gmap_fault_is_write())
+ flags = FAULT_FLAG_WRITE;
+
+ switch (current->thread.gmap_int_code) {
+ case 0:
+ vcpu->stat.exit_null++;
+ break;
+ case PGM_NON_SECURE_STORAGE_ACCESS:
+ KVM_BUG_ON(current->thread.gmap_teid.as != PSW_BITS_AS_PRIMARY, vcpu->kvm);
+ /*
+ * This is normal operation; a page belonging to a protected
+ * guest has not been imported yet. Try to import the page into
+ * the protected guest.
+ */
+ if (gmap_convert_to_secure(vcpu->arch.gmap, gaddr) == -EINVAL)
+ send_sig(SIGSEGV, current, 0);
+ break;
+ case PGM_SECURE_STORAGE_ACCESS:
+ case PGM_SECURE_STORAGE_VIOLATION:
+ KVM_BUG_ON(current->thread.gmap_teid.as != PSW_BITS_AS_PRIMARY, vcpu->kvm);
+ /*
+ * This can happen after a reboot with asynchronous teardown;
+ * the new guest (normal or protected) will run on top of the
+ * previous protected guest. The old pages need to be destroyed
+ * so the new guest can use them.
+ */
+ if (gmap_destroy_page(vcpu->arch.gmap, gaddr)) {
+ /*
+ * Either KVM messed up the secure guest mapping or the
+ * same page is mapped into multiple secure guests.
+ *
+ * This exception is only triggered when a guest 2 is
+ * running and can therefore never occur in kernel
+ * context.
+ */
+ pr_warn_ratelimited("Secure storage violation (%x) in task: %s, pid %d\n",
+ current->thread.gmap_int_code, current->comm,
+ current->pid);
+ send_sig(SIGSEGV, current, 0);
+ }
+ break;
+ case PGM_PROTECTION:
+ case PGM_SEGMENT_TRANSLATION:
+ case PGM_PAGE_TRANSLATION:
+ case PGM_ASCE_TYPE:
+ case PGM_REGION_FIRST_TRANS:
+ case PGM_REGION_SECOND_TRANS:
+ case PGM_REGION_THIRD_TRANS:
+ KVM_BUG_ON(current->thread.gmap_teid.as != PSW_BITS_AS_PRIMARY, vcpu->kvm);
+ if (vcpu->arch.gmap->pfault_enabled) {
+ rc = gmap_fault(vcpu->arch.gmap, gaddr, flags | FAULT_FLAG_RETRY_NOWAIT);
+ if (rc == -EFAULT)
+ return vcpu_post_run_addressing_exception(vcpu);
+ if (rc == -EAGAIN) {
+ trace_kvm_s390_major_guest_pfault(vcpu);
+ if (kvm_arch_setup_async_pf(vcpu))
+ return 0;
+ vcpu->stat.pfault_sync++;
+ } else {
+ return rc;
+ }
+ }
+ rc = gmap_fault(vcpu->arch.gmap, gaddr, flags);
+ if (rc == -EFAULT) {
+ if (kvm_is_ucontrol(vcpu->kvm)) {
+ vcpu->run->exit_reason = KVM_EXIT_S390_UCONTROL;
+ vcpu->run->s390_ucontrol.trans_exc_code = gaddr;
+ vcpu->run->s390_ucontrol.pgm_code = 0x10;
+ return -EREMOTE;
+ }
+ return vcpu_post_run_addressing_exception(vcpu);
+ }
+ break;
+ default:
+ KVM_BUG(1, vcpu->kvm, "Unexpected program interrupt 0x%x, TEID 0x%016lx",
+ current->thread.gmap_int_code, current->thread.gmap_teid.val);
+ send_sig(SIGSEGV, current, 0);
+ break;
+ }
+ return rc;
+}
+
static int vcpu_post_run(struct kvm_vcpu *vcpu, int exit_reason)
{
struct mcck_volatile_info *mcck_info;
struct sie_page *sie_page;
+ int rc;
VCPU_EVENT(vcpu, 6, "exit sie icptcode %d",
vcpu->arch.sie_block->icptcode);
@@ -4757,7 +4847,7 @@ static int vcpu_post_run(struct kvm_vcpu *vcpu, int exit_reason)
}
if (vcpu->arch.sie_block->icptcode > 0) {
- int rc = kvm_handle_sie_intercept(vcpu);
+ rc = kvm_handle_sie_intercept(vcpu);
if (rc != -EOPNOTSUPP)
return rc;
@@ -4766,24 +4856,9 @@ static int vcpu_post_run(struct kvm_vcpu *vcpu, int exit_reason)
vcpu->run->s390_sieic.ipa = vcpu->arch.sie_block->ipa;
vcpu->run->s390_sieic.ipb = vcpu->arch.sie_block->ipb;
return -EREMOTE;
- } else if (exit_reason != -EFAULT) {
- vcpu->stat.exit_null++;
- return 0;
- } else if (kvm_is_ucontrol(vcpu->kvm)) {
- vcpu->run->exit_reason = KVM_EXIT_S390_UCONTROL;
- vcpu->run->s390_ucontrol.trans_exc_code =
- current->thread.gmap_addr;
- vcpu->run->s390_ucontrol.pgm_code = 0x10;
- return -EREMOTE;
- } else if (current->thread.gmap_pfault) {
- trace_kvm_s390_major_guest_pfault(vcpu);
- current->thread.gmap_pfault = 0;
- if (kvm_arch_setup_async_pf(vcpu))
- return 0;
- vcpu->stat.pfault_sync++;
- return gmap_fault(vcpu->arch.gmap, current->thread.gmap_addr, FAULT_FLAG_WRITE);
}
- return vcpu_post_run_fault_in_sie(vcpu);
+
+ return vcpu_post_run_handle_fault(vcpu);
}
#define PSW_INT_MASK (PSW_MASK_EXT | PSW_MASK_IO | PSW_MASK_MCHECK)
diff --git a/arch/s390/kvm/kvm-s390.h b/arch/s390/kvm/kvm-s390.h
index 0765ad1031c4..597d7a71deeb 100644
--- a/arch/s390/kvm/kvm-s390.h
+++ b/arch/s390/kvm/kvm-s390.h
@@ -528,6 +528,13 @@ static inline int kvm_s390_use_sca_entries(void)
void kvm_s390_reinject_machine_check(struct kvm_vcpu *vcpu,
struct mcck_volatile_info *mcck_info);
+static inline bool kvm_s390_cur_gmap_fault_is_write(void)
+{
+ if (current->thread.gmap_int_code == PGM_PROTECTION)
+ return true;
+ return test_facility(75) && (current->thread.gmap_teid.fsi == TEID_FSI_STORE);
+}
+
/**
* kvm_s390_vcpu_crypto_reset_all
*
diff --git a/arch/s390/kvm/vsie.c b/arch/s390/kvm/vsie.c
index 89cafea4c41f..35e7dd882148 100644
--- a/arch/s390/kvm/vsie.c
+++ b/arch/s390/kvm/vsie.c
@@ -925,16 +925,16 @@ static int handle_fault(struct kvm_vcpu *vcpu, struct vsie_page *vsie_page)
if (current->thread.gmap_int_code == PGM_PROTECTION)
/* we can directly forward all protection exceptions */
return inject_fault(vcpu, PGM_PROTECTION,
- current->thread.gmap_addr, 1);
+ current->thread.gmap_teid.addr * PAGE_SIZE, 1);
rc = kvm_s390_shadow_fault(vcpu, vsie_page->gmap,
- current->thread.gmap_addr, NULL);
+ current->thread.gmap_teid.addr * PAGE_SIZE, NULL);
if (rc > 0) {
rc = inject_fault(vcpu, rc,
- current->thread.gmap_addr,
- current->thread.gmap_write_flag);
+ current->thread.gmap_teid.addr * PAGE_SIZE,
+ kvm_s390_cur_gmap_fault_is_write());
if (rc >= 0)
- vsie_page->fault_addr = current->thread.gmap_addr;
+ vsie_page->fault_addr = current->thread.gmap_teid.addr * PAGE_SIZE;
}
return rc;
}
@@ -1148,6 +1148,7 @@ static int do_vsie_run(struct kvm_vcpu *vcpu, struct vsie_page *vsie_page)
* also kick the vSIE.
*/
vcpu->arch.sie_block->prog0c |= PROG_IN_SIE;
+ current->thread.gmap_int_code = 0;
barrier();
if (!kvm_s390_vcpu_sie_inhibited(vcpu))
rc = sie64a(scb_s, vcpu->run->s.regs.gprs, gmap_get_enabled()->asce);
@@ -1172,7 +1173,7 @@ static int do_vsie_run(struct kvm_vcpu *vcpu, struct vsie_page *vsie_page)
if (rc > 0)
rc = 0; /* we could still have an icpt */
- else if (rc == -EFAULT)
+ else if (current->thread.gmap_int_code)
return handle_fault(vcpu, vsie_page);
switch (scb_s->icptcode) {
diff --git a/arch/s390/mm/fault.c b/arch/s390/mm/fault.c
index a6cf33b0f339..e48910b0b816 100644
--- a/arch/s390/mm/fault.c
+++ b/arch/s390/mm/fault.c
@@ -49,7 +49,6 @@
enum fault_type {
KERNEL_FAULT,
USER_FAULT,
- GMAP_FAULT,
};
static DEFINE_STATIC_KEY_FALSE(have_store_indication);
@@ -72,10 +71,6 @@ static enum fault_type get_fault_type(struct pt_regs *regs)
if (likely(teid.as == PSW_BITS_AS_PRIMARY)) {
if (user_mode(regs))
return USER_FAULT;
- if (!IS_ENABLED(CONFIG_PGSTE))
- return KERNEL_FAULT;
- if (test_pt_regs_flag(regs, PIF_GUEST_FAULT))
- return GMAP_FAULT;
return KERNEL_FAULT;
}
if (teid.as == PSW_BITS_AS_SECONDARY)
@@ -184,10 +179,6 @@ static void dump_fault_info(struct pt_regs *regs)
asce = get_lowcore()->user_asce.val;
pr_cont("user ");
break;
- case GMAP_FAULT:
- asce = regs->cr1;
- pr_cont("gmap ");
- break;
case KERNEL_FAULT:
asce = get_lowcore()->kernel_asce.val;
pr_cont("kernel ");
@@ -285,7 +276,6 @@ static void do_exception(struct pt_regs *regs, int access)
struct mm_struct *mm;
enum fault_type type;
unsigned int flags;
- struct gmap *gmap;
vm_fault_t fault;
bool is_write;
@@ -304,7 +294,6 @@ static void do_exception(struct pt_regs *regs, int access)
case KERNEL_FAULT:
return handle_fault_error_nolock(regs, 0);
case USER_FAULT:
- case GMAP_FAULT:
if (faulthandler_disabled() || !mm)
return handle_fault_error_nolock(regs, 0);
break;
@@ -348,18 +337,6 @@ static void do_exception(struct pt_regs *regs, int access)
}
lock_mmap:
mmap_read_lock(mm);
- gmap = NULL;
- if (IS_ENABLED(CONFIG_PGSTE) && type == GMAP_FAULT) {
- gmap = (struct gmap *)get_lowcore()->gmap;
- current->thread.gmap_addr = address;
- current->thread.gmap_write_flag = !!(flags & FAULT_FLAG_WRITE);
- current->thread.gmap_int_code = regs->int_code & 0xffff;
- address = __gmap_translate(gmap, address);
- if (address == -EFAULT)
- return handle_fault_error(regs, SEGV_MAPERR);
- if (gmap->pfault_enabled)
- flags |= FAULT_FLAG_RETRY_NOWAIT;
- }
retry:
vma = find_vma(mm, address);
if (!vma)
@@ -375,50 +352,22 @@ static void do_exception(struct pt_regs *regs, int access)
return handle_fault_error(regs, SEGV_ACCERR);
fault = handle_mm_fault(vma, address, flags, regs);
if (fault_signal_pending(fault, regs)) {
- if (flags & FAULT_FLAG_RETRY_NOWAIT)
- mmap_read_unlock(mm);
if (!user_mode(regs))
handle_fault_error_nolock(regs, 0);
return;
}
/* The fault is fully completed (including releasing mmap lock) */
- if (fault & VM_FAULT_COMPLETED) {
- if (gmap) {
- mmap_read_lock(mm);
- goto gmap;
- }
+ if (fault & VM_FAULT_COMPLETED)
return;
- }
if (unlikely(fault & VM_FAULT_ERROR)) {
mmap_read_unlock(mm);
goto error;
}
if (fault & VM_FAULT_RETRY) {
- if (IS_ENABLED(CONFIG_PGSTE) && gmap && (flags & FAULT_FLAG_RETRY_NOWAIT)) {
- /*
- * FAULT_FLAG_RETRY_NOWAIT has been set,
- * mmap_lock has not been released
- */
- current->thread.gmap_pfault = 1;
- return handle_fault_error(regs, 0);
- }
- flags &= ~FAULT_FLAG_RETRY_NOWAIT;
flags |= FAULT_FLAG_TRIED;
mmap_read_lock(mm);
goto retry;
}
-gmap:
- if (IS_ENABLED(CONFIG_PGSTE) && gmap) {
- address = __gmap_link(gmap, current->thread.gmap_addr,
- address);
- if (address == -EFAULT)
- return handle_fault_error(regs, SEGV_MAPERR);
- if (address == -ENOMEM) {
- fault = VM_FAULT_OOM;
- mmap_read_unlock(mm);
- goto error;
- }
- }
mmap_read_unlock(mm);
return;
error:
@@ -494,7 +443,6 @@ void do_secure_storage_access(struct pt_regs *regs)
struct folio_walk fw;
struct mm_struct *mm;
struct folio *folio;
- struct gmap *gmap;
int rc;
/*
@@ -520,15 +468,6 @@ void do_secure_storage_access(struct pt_regs *regs)
panic("Unexpected PGM 0x3d with TEID bit 61=0");
}
switch (get_fault_type(regs)) {
- case GMAP_FAULT:
- mm = current->mm;
- gmap = (struct gmap *)get_lowcore()->gmap;
- mmap_read_lock(mm);
- addr = __gmap_translate(gmap, addr);
- mmap_read_unlock(mm);
- if (IS_ERR_VALUE(addr))
- return handle_fault_error_nolock(regs, SEGV_MAPERR);
- fallthrough;
case USER_FAULT:
mm = current->mm;
mmap_read_lock(mm);
@@ -564,40 +503,4 @@ void do_secure_storage_access(struct pt_regs *regs)
}
NOKPROBE_SYMBOL(do_secure_storage_access);
-void do_non_secure_storage_access(struct pt_regs *regs)
-{
- struct gmap *gmap = (struct gmap *)get_lowcore()->gmap;
- unsigned long gaddr = get_fault_address(regs);
-
- if (WARN_ON_ONCE(get_fault_type(regs) != GMAP_FAULT))
- return handle_fault_error_nolock(regs, SEGV_MAPERR);
- if (gmap_convert_to_secure(gmap, gaddr) == -EINVAL)
- send_sig(SIGSEGV, current, 0);
-}
-NOKPROBE_SYMBOL(do_non_secure_storage_access);
-
-void do_secure_storage_violation(struct pt_regs *regs)
-{
- struct gmap *gmap = (struct gmap *)get_lowcore()->gmap;
- unsigned long gaddr = get_fault_address(regs);
-
- /*
- * If the VM has been rebooted, its address space might still contain
- * secure pages from the previous boot.
- * Clear the page so it can be reused.
- */
- if (!gmap_destroy_page(gmap, gaddr))
- return;
- /*
- * Either KVM messed up the secure guest mapping or the same
- * page is mapped into multiple secure guests.
- *
- * This exception is only triggered when a guest 2 is running
- * and can therefore never occur in kernel context.
- */
- pr_warn_ratelimited("Secure storage violation in task: %s, pid %d\n",
- current->comm, current->pid);
- send_sig(SIGSEGV, current, 0);
-}
-
#endif /* CONFIG_PGSTE */
--
2.47.0
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH v4 06/11] s390/kvm: Stop using gmap_{en,dis}able()
2024-10-22 12:05 [PATCH v4 00/11] s390/kvm: Handle guest-related program interrupts in KVM Claudio Imbrenda
` (4 preceding siblings ...)
2024-10-22 12:05 ` [PATCH v4 05/11] s390/mm/fault: Handle guest-related program interrupts in KVM Claudio Imbrenda
@ 2024-10-22 12:05 ` Claudio Imbrenda
2024-10-22 12:05 ` [PATCH v4 07/11] s390/mm/gmap: Remove gmap_{en,dis}able() Claudio Imbrenda
` (5 subsequent siblings)
11 siblings, 0 replies; 16+ messages in thread
From: Claudio Imbrenda @ 2024-10-22 12:05 UTC (permalink / raw)
To: linux-kernel
Cc: borntraeger, nsg, nrb, frankja, seiden, hca, agordeev, gor,
gerald.schaefer, kvm, linux-s390, david
Stop using gmap_enable(), gmap_disable(), and gmap_get_enabled().
The correct guest ASCE is passed as a parameter of sie64a(), so there
is no need to save the current gmap in lowcore.
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Acked-by: Steffen Eiden <seiden@linux.ibm.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
---
arch/s390/include/asm/kvm_host.h | 2 --
arch/s390/kvm/kvm-s390.c | 7 +------
arch/s390/kvm/vsie.c | 4 +---
3 files changed, 2 insertions(+), 11 deletions(-)
diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
index 603b56bfccd3..51201b4ac93a 100644
--- a/arch/s390/include/asm/kvm_host.h
+++ b/arch/s390/include/asm/kvm_host.h
@@ -750,8 +750,6 @@ struct kvm_vcpu_arch {
struct hrtimer ckc_timer;
struct kvm_s390_pgm_info pgm;
struct gmap *gmap;
- /* backup location for the currently enabled gmap when scheduled out */
- struct gmap *enabled_gmap;
struct kvm_guestdbg_info_arch guestdbg;
unsigned long pfault_token;
unsigned long pfault_select;
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 050710549a10..a5750c14bb4d 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -3719,7 +3719,6 @@ __u64 kvm_s390_get_cpu_timer(struct kvm_vcpu *vcpu)
void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
{
- gmap_enable(vcpu->arch.enabled_gmap);
kvm_s390_set_cpuflags(vcpu, CPUSTAT_RUNNING);
if (vcpu->arch.cputm_enabled && !is_vcpu_idle(vcpu))
__start_cpu_timer_accounting(vcpu);
@@ -3732,8 +3731,6 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
if (vcpu->arch.cputm_enabled && !is_vcpu_idle(vcpu))
__stop_cpu_timer_accounting(vcpu);
kvm_s390_clear_cpuflags(vcpu, CPUSTAT_RUNNING);
- vcpu->arch.enabled_gmap = gmap_get_enabled();
- gmap_disable(vcpu->arch.enabled_gmap);
}
@@ -3751,8 +3748,6 @@ void kvm_arch_vcpu_postcreate(struct kvm_vcpu *vcpu)
}
if (test_kvm_facility(vcpu->kvm, 74) || vcpu->kvm->arch.user_instr0)
vcpu->arch.sie_block->ictl |= ICTL_OPEREXC;
- /* make vcpu_load load the right gmap on the first trigger */
- vcpu->arch.enabled_gmap = vcpu->arch.gmap;
}
static bool kvm_has_pckmo_subfunc(struct kvm *kvm, unsigned long nr)
@@ -4894,7 +4889,7 @@ static int __vcpu_run(struct kvm_vcpu *vcpu)
}
exit_reason = sie64a(vcpu->arch.sie_block,
vcpu->run->s.regs.gprs,
- gmap_get_enabled()->asce);
+ vcpu->arch.gmap->asce);
if (kvm_s390_pv_cpu_is_protected(vcpu)) {
memcpy(vcpu->run->s.regs.gprs,
sie_page->pv_grregs,
diff --git a/arch/s390/kvm/vsie.c b/arch/s390/kvm/vsie.c
index 35e7dd882148..d03f95e528fe 100644
--- a/arch/s390/kvm/vsie.c
+++ b/arch/s390/kvm/vsie.c
@@ -1151,7 +1151,7 @@ static int do_vsie_run(struct kvm_vcpu *vcpu, struct vsie_page *vsie_page)
current->thread.gmap_int_code = 0;
barrier();
if (!kvm_s390_vcpu_sie_inhibited(vcpu))
- rc = sie64a(scb_s, vcpu->run->s.regs.gprs, gmap_get_enabled()->asce);
+ rc = sie64a(scb_s, vcpu->run->s.regs.gprs, vsie_page->gmap->asce);
barrier();
vcpu->arch.sie_block->prog0c &= ~PROG_IN_SIE;
@@ -1296,10 +1296,8 @@ static int vsie_run(struct kvm_vcpu *vcpu, struct vsie_page *vsie_page)
if (!rc)
rc = map_prefix(vcpu, vsie_page);
if (!rc) {
- gmap_enable(vsie_page->gmap);
update_intervention_requests(vsie_page);
rc = do_vsie_run(vcpu, vsie_page);
- gmap_enable(vcpu->arch.gmap);
}
atomic_andnot(PROG_BLOCK_SIE, &scb_s->prog20);
--
2.47.0
* [PATCH v4 07/11] s390/mm/gmap: Remove gmap_{en,dis}able()
2024-10-22 12:05 [PATCH v4 00/11] s390/kvm: Handle guest-related program interrupts in KVM Claudio Imbrenda
` (5 preceding siblings ...)
2024-10-22 12:05 ` [PATCH v4 06/11] s390/kvm: Stop using gmap_{en,dis}able() Claudio Imbrenda
@ 2024-10-22 12:05 ` Claudio Imbrenda
2024-10-23 8:39 ` Steffen Eiden
2024-10-22 12:05 ` [PATCH v4 08/11] s390: Remove gmap pointer from lowcore Claudio Imbrenda
` (4 subsequent siblings)
11 siblings, 1 reply; 16+ messages in thread
From: Claudio Imbrenda @ 2024-10-22 12:05 UTC (permalink / raw)
To: linux-kernel
Cc: borntraeger, nsg, nrb, frankja, seiden, hca, agordeev, gor,
gerald.schaefer, kvm, linux-s390, david
Remove gmap_enable(), gmap_disable(), and gmap_get_enabled() since they do
not have any users anymore.
Suggested-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
---
arch/s390/include/asm/gmap.h | 3 ---
arch/s390/mm/gmap.c | 31 -------------------------------
2 files changed, 34 deletions(-)
diff --git a/arch/s390/include/asm/gmap.h b/arch/s390/include/asm/gmap.h
index 9725586f4259..64761c78f774 100644
--- a/arch/s390/include/asm/gmap.h
+++ b/arch/s390/include/asm/gmap.h
@@ -107,9 +107,6 @@ void gmap_remove(struct gmap *gmap);
struct gmap *gmap_get(struct gmap *gmap);
void gmap_put(struct gmap *gmap);
-void gmap_enable(struct gmap *gmap);
-void gmap_disable(struct gmap *gmap);
-struct gmap *gmap_get_enabled(void);
int gmap_map_segment(struct gmap *gmap, unsigned long from,
unsigned long to, unsigned long len);
int gmap_unmap_segment(struct gmap *gmap, unsigned long to, unsigned long len);
diff --git a/arch/s390/mm/gmap.c b/arch/s390/mm/gmap.c
index a8746f71c679..329682655af2 100644
--- a/arch/s390/mm/gmap.c
+++ b/arch/s390/mm/gmap.c
@@ -281,37 +281,6 @@ void gmap_remove(struct gmap *gmap)
}
EXPORT_SYMBOL_GPL(gmap_remove);
-/**
- * gmap_enable - switch primary space to the guest address space
- * @gmap: pointer to the guest address space structure
- */
-void gmap_enable(struct gmap *gmap)
-{
- get_lowcore()->gmap = (unsigned long)gmap;
-}
-EXPORT_SYMBOL_GPL(gmap_enable);
-
-/**
- * gmap_disable - switch back to the standard primary address space
- * @gmap: pointer to the guest address space structure
- */
-void gmap_disable(struct gmap *gmap)
-{
- get_lowcore()->gmap = 0UL;
-}
-EXPORT_SYMBOL_GPL(gmap_disable);
-
-/**
- * gmap_get_enabled - get a pointer to the currently enabled gmap
- *
- * Returns a pointer to the currently enabled gmap. 0 if none is enabled.
- */
-struct gmap *gmap_get_enabled(void)
-{
- return (struct gmap *)get_lowcore()->gmap;
-}
-EXPORT_SYMBOL_GPL(gmap_get_enabled);
-
/*
* gmap_alloc_table is assumed to be called with mmap_lock held
*/
--
2.47.0
* [PATCH v4 08/11] s390: Remove gmap pointer from lowcore
2024-10-22 12:05 [PATCH v4 00/11] s390/kvm: Handle guest-related program interrupts in KVM Claudio Imbrenda
` (6 preceding siblings ...)
2024-10-22 12:05 ` [PATCH v4 07/11] s390/mm/gmap: Remove gmap_{en,dis}able() Claudio Imbrenda
@ 2024-10-22 12:05 ` Claudio Imbrenda
2024-10-22 12:05 ` [PATCH v4 09/11] s390/mm: Simplify get_fault_type() Claudio Imbrenda
` (3 subsequent siblings)
11 siblings, 0 replies; 16+ messages in thread
From: Claudio Imbrenda @ 2024-10-22 12:05 UTC (permalink / raw)
To: linux-kernel
Cc: borntraeger, nsg, nrb, frankja, seiden, hca, agordeev, gor,
gerald.schaefer, kvm, linux-s390, david
Remove the gmap pointer from lowcore, since it is not used anymore.
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Steffen Eiden <seiden@linux.ibm.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
---
arch/s390/include/asm/lowcore.h | 3 +--
arch/s390/kernel/asm-offsets.c | 1 -
2 files changed, 1 insertion(+), 3 deletions(-)
diff --git a/arch/s390/include/asm/lowcore.h b/arch/s390/include/asm/lowcore.h
index 48c64716d1f2..42a092fa1029 100644
--- a/arch/s390/include/asm/lowcore.h
+++ b/arch/s390/include/asm/lowcore.h
@@ -165,8 +165,7 @@ struct lowcore {
__u64 percpu_offset; /* 0x03b8 */
__u8 pad_0x03c0[0x03c8-0x03c0]; /* 0x03c0 */
__u64 machine_flags; /* 0x03c8 */
- __u64 gmap; /* 0x03d0 */
- __u8 pad_0x03d8[0x0400-0x03d8]; /* 0x03d8 */
+ __u8 pad_0x03d0[0x0400-0x03d0]; /* 0x03d0 */
__u32 return_lpswe; /* 0x0400 */
__u32 return_mcck_lpswe; /* 0x0404 */
diff --git a/arch/s390/kernel/asm-offsets.c b/arch/s390/kernel/asm-offsets.c
index 3a6ee5043761..1d7ed0faff8b 100644
--- a/arch/s390/kernel/asm-offsets.c
+++ b/arch/s390/kernel/asm-offsets.c
@@ -137,7 +137,6 @@ int main(void)
OFFSET(__LC_USER_ASCE, lowcore, user_asce);
OFFSET(__LC_LPP, lowcore, lpp);
OFFSET(__LC_CURRENT_PID, lowcore, current_pid);
- OFFSET(__LC_GMAP, lowcore, gmap);
OFFSET(__LC_LAST_BREAK, lowcore, last_break);
/* software defined ABI-relevant lowcore locations 0xe00 - 0xe20 */
OFFSET(__LC_DUMP_REIPL, lowcore, ipib);
--
2.47.0
* [PATCH v4 09/11] s390/mm: Simplify get_fault_type()
2024-10-22 12:05 [PATCH v4 00/11] s390/kvm: Handle guest-related program interrupts in KVM Claudio Imbrenda
` (7 preceding siblings ...)
2024-10-22 12:05 ` [PATCH v4 08/11] s390: Remove gmap pointer from lowcore Claudio Imbrenda
@ 2024-10-22 12:05 ` Claudio Imbrenda
2024-10-22 12:06 ` [PATCH v4 10/11] s390/mm: Get rid of fault type switch statements Claudio Imbrenda
` (2 subsequent siblings)
11 siblings, 0 replies; 16+ messages in thread
From: Claudio Imbrenda @ 2024-10-22 12:05 UTC (permalink / raw)
To: linux-kernel
Cc: borntraeger, nsg, nrb, frankja, seiden, hca, agordeev, gor,
gerald.schaefer, kvm, linux-s390, david
From: Heiko Carstens <hca@linux.ibm.com>
With the gmap code gone get_fault_type() can be simplified:
- every fault with user_mode(regs) == true must be a fault in user address
space
- every fault with user_mode(regs) == false is only a fault in user
address space if the used address space is the secondary address space
- every other fault is within the kernel address space
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
---
arch/s390/mm/fault.c | 11 ++---------
1 file changed, 2 insertions(+), 9 deletions(-)
diff --git a/arch/s390/mm/fault.c b/arch/s390/mm/fault.c
index e48910b0b816..6e96fc7905fc 100644
--- a/arch/s390/mm/fault.c
+++ b/arch/s390/mm/fault.c
@@ -68,17 +68,10 @@ static enum fault_type get_fault_type(struct pt_regs *regs)
{
union teid teid = { .val = regs->int_parm_long };
- if (likely(teid.as == PSW_BITS_AS_PRIMARY)) {
- if (user_mode(regs))
- return USER_FAULT;
- return KERNEL_FAULT;
- }
- if (teid.as == PSW_BITS_AS_SECONDARY)
+ if (user_mode(regs))
return USER_FAULT;
- /* Access register mode, not used in the kernel */
- if (teid.as == PSW_BITS_AS_ACCREG)
+ if (teid.as == PSW_BITS_AS_SECONDARY)
return USER_FAULT;
- /* Home space -> access via kernel ASCE */
return KERNEL_FAULT;
}
--
2.47.0
* [PATCH v4 10/11] s390/mm: Get rid of fault type switch statements
2024-10-22 12:05 [PATCH v4 00/11] s390/kvm: Handle guest-related program interrupts in KVM Claudio Imbrenda
` (8 preceding siblings ...)
2024-10-22 12:05 ` [PATCH v4 09/11] s390/mm: Simplify get_fault_type() Claudio Imbrenda
@ 2024-10-22 12:06 ` Claudio Imbrenda
2024-10-22 12:06 ` [PATCH v4 11/11] s390/mm: Convert to LOCK_MM_AND_FIND_VMA Claudio Imbrenda
2024-10-22 14:45 ` [PATCH v4 00/11] s390/kvm: Handle guest-related program interrupts in KVM Heiko Carstens
11 siblings, 0 replies; 16+ messages in thread
From: Claudio Imbrenda @ 2024-10-22 12:06 UTC (permalink / raw)
To: linux-kernel
Cc: borntraeger, nsg, nrb, frankja, seiden, hca, agordeev, gor,
gerald.schaefer, kvm, linux-s390, david
From: Heiko Carstens <hca@linux.ibm.com>
With GMAP_FAULT fault type gone, there are only KERNEL_FAULT and
USER_FAULT fault types left. Therefore fault type switch statements
are no longer needed.
Rename get_fault_type() into is_kernel_fault() and let it return a
boolean value. Change all switch statements to if statements. This
removes quite a bit of code.
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
---
arch/s390/mm/fault.c | 70 ++++++++++++++------------------------------
1 file changed, 22 insertions(+), 48 deletions(-)
diff --git a/arch/s390/mm/fault.c b/arch/s390/mm/fault.c
index 6e96fc7905fc..93ae097ef0e0 100644
--- a/arch/s390/mm/fault.c
+++ b/arch/s390/mm/fault.c
@@ -46,11 +46,6 @@
#include <asm/uv.h>
#include "../kernel/entry.h"
-enum fault_type {
- KERNEL_FAULT,
- USER_FAULT,
-};
-
static DEFINE_STATIC_KEY_FALSE(have_store_indication);
static int __init fault_init(void)
@@ -64,15 +59,15 @@ early_initcall(fault_init);
/*
* Find out which address space caused the exception.
*/
-static enum fault_type get_fault_type(struct pt_regs *regs)
+static bool is_kernel_fault(struct pt_regs *regs)
{
union teid teid = { .val = regs->int_parm_long };
if (user_mode(regs))
- return USER_FAULT;
+ return false;
if (teid.as == PSW_BITS_AS_SECONDARY)
- return USER_FAULT;
- return KERNEL_FAULT;
+ return false;
+ return true;
}
static unsigned long get_fault_address(struct pt_regs *regs)
@@ -167,17 +162,12 @@ static void dump_fault_info(struct pt_regs *regs)
break;
}
pr_cont("mode while using ");
- switch (get_fault_type(regs)) {
- case USER_FAULT:
- asce = get_lowcore()->user_asce.val;
- pr_cont("user ");
- break;
- case KERNEL_FAULT:
+ if (is_kernel_fault(regs)) {
asce = get_lowcore()->kernel_asce.val;
pr_cont("kernel ");
- break;
- default:
- unreachable();
+ } else {
+ asce = get_lowcore()->user_asce.val;
+ pr_cont("user ");
}
pr_cont("ASCE.\n");
dump_pagetable(asce, get_fault_address(regs));
@@ -212,7 +202,6 @@ static void do_sigsegv(struct pt_regs *regs, int si_code)
static void handle_fault_error_nolock(struct pt_regs *regs, int si_code)
{
- enum fault_type fault_type;
unsigned long address;
bool is_write;
@@ -223,17 +212,15 @@ static void handle_fault_error_nolock(struct pt_regs *regs, int si_code)
}
if (fixup_exception(regs))
return;
- fault_type = get_fault_type(regs);
- if (fault_type == KERNEL_FAULT) {
+ if (is_kernel_fault(regs)) {
address = get_fault_address(regs);
is_write = fault_is_write(regs);
if (kfence_handle_page_fault(address, is_write, regs))
return;
- }
- if (fault_type == KERNEL_FAULT)
pr_alert("Unable to handle kernel pointer dereference in virtual kernel address space\n");
- else
+ } else {
pr_alert("Unable to handle kernel paging request in virtual user address space\n");
+ }
dump_fault_info(regs);
die(regs, "Oops");
}
@@ -267,7 +254,6 @@ static void do_exception(struct pt_regs *regs, int access)
struct vm_area_struct *vma;
unsigned long address;
struct mm_struct *mm;
- enum fault_type type;
unsigned int flags;
vm_fault_t fault;
bool is_write;
@@ -282,15 +268,8 @@ static void do_exception(struct pt_regs *regs, int access)
mm = current->mm;
address = get_fault_address(regs);
is_write = fault_is_write(regs);
- type = get_fault_type(regs);
- switch (type) {
- case KERNEL_FAULT:
+ if (is_kernel_fault(regs) || faulthandler_disabled() || !mm)
return handle_fault_error_nolock(regs, 0);
- case USER_FAULT:
- if (faulthandler_disabled() || !mm)
- return handle_fault_error_nolock(regs, 0);
- break;
- }
perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
flags = FAULT_FLAG_DEFAULT;
if (user_mode(regs))
@@ -460,8 +439,15 @@ void do_secure_storage_access(struct pt_regs *regs)
*/
panic("Unexpected PGM 0x3d with TEID bit 61=0");
}
- switch (get_fault_type(regs)) {
- case USER_FAULT:
+ if (is_kernel_fault(regs)) {
+ folio = phys_to_folio(addr);
+ if (unlikely(!folio_try_get(folio)))
+ return;
+ rc = arch_make_folio_accessible(folio);
+ folio_put(folio);
+ if (rc)
+ BUG();
+ } else {
mm = current->mm;
mmap_read_lock(mm);
vma = find_vma(mm, addr);
@@ -470,7 +456,7 @@ void do_secure_storage_access(struct pt_regs *regs)
folio = folio_walk_start(&fw, vma, addr, 0);
if (!folio) {
mmap_read_unlock(mm);
- break;
+ return;
}
/* arch_make_folio_accessible() needs a raised refcount. */
folio_get(folio);
@@ -480,18 +466,6 @@ void do_secure_storage_access(struct pt_regs *regs)
if (rc)
send_sig(SIGSEGV, current, 0);
mmap_read_unlock(mm);
- break;
- case KERNEL_FAULT:
- folio = phys_to_folio(addr);
- if (unlikely(!folio_try_get(folio)))
- break;
- rc = arch_make_folio_accessible(folio);
- folio_put(folio);
- if (rc)
- BUG();
- break;
- default:
- unreachable();
}
}
NOKPROBE_SYMBOL(do_secure_storage_access);
--
2.47.0
* [PATCH v4 11/11] s390/mm: Convert to LOCK_MM_AND_FIND_VMA
2024-10-22 12:05 [PATCH v4 00/11] s390/kvm: Handle guest-related program interrupts in KVM Claudio Imbrenda
` (9 preceding siblings ...)
2024-10-22 12:06 ` [PATCH v4 10/11] s390/mm: Get rid of fault type switch statements Claudio Imbrenda
@ 2024-10-22 12:06 ` Claudio Imbrenda
2024-10-22 12:29 ` Alexander Gordeev
2024-10-22 14:45 ` [PATCH v4 00/11] s390/kvm: Handle guest-related program interrupts in KVM Heiko Carstens
11 siblings, 1 reply; 16+ messages in thread
From: Claudio Imbrenda @ 2024-10-22 12:06 UTC (permalink / raw)
To: linux-kernel
Cc: borntraeger, nsg, nrb, frankja, seiden, hca, agordeev, gor,
gerald.schaefer, kvm, linux-s390, david
From: Heiko Carstens <hca@linux.ibm.com>
With the gmap code gone, s390 can easily be converted to
LOCK_MM_AND_FIND_VMA, as has been done for most other
architectures.
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
---
arch/s390/Kconfig | 1 +
arch/s390/mm/fault.c | 13 ++-----------
2 files changed, 3 insertions(+), 11 deletions(-)
diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index d339fe4fdedf..8109446f7b24 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -224,6 +224,7 @@ config S390
select HAVE_VIRT_CPU_ACCOUNTING_IDLE
select IOMMU_HELPER if PCI
select IOMMU_SUPPORT if PCI
+ select LOCK_MM_AND_FIND_VMA
select MMU_GATHER_MERGE_VMAS
select MMU_GATHER_NO_GATHER
select MMU_GATHER_RCU_TABLE_FREE
diff --git a/arch/s390/mm/fault.c b/arch/s390/mm/fault.c
index 93ae097ef0e0..8bd2b8d64273 100644
--- a/arch/s390/mm/fault.c
+++ b/arch/s390/mm/fault.c
@@ -308,18 +308,10 @@ static void do_exception(struct pt_regs *regs, int access)
return;
}
lock_mmap:
- mmap_read_lock(mm);
retry:
- vma = find_vma(mm, address);
+ vma = lock_mm_and_find_vma(mm, address, regs);
if (!vma)
- return handle_fault_error(regs, SEGV_MAPERR);
- if (unlikely(vma->vm_start > address)) {
- if (!(vma->vm_flags & VM_GROWSDOWN))
- return handle_fault_error(regs, SEGV_MAPERR);
- vma = expand_stack(mm, address);
- if (!vma)
- return handle_fault_error_nolock(regs, SEGV_MAPERR);
- }
+ return handle_fault_error_nolock(regs, SEGV_MAPERR);
if (unlikely(!(vma->vm_flags & access)))
return handle_fault_error(regs, SEGV_ACCERR);
fault = handle_mm_fault(vma, address, flags, regs);
@@ -337,7 +329,6 @@ static void do_exception(struct pt_regs *regs, int access)
}
if (fault & VM_FAULT_RETRY) {
flags |= FAULT_FLAG_TRIED;
- mmap_read_lock(mm);
goto retry;
}
mmap_read_unlock(mm);
--
2.47.0
* Re: [PATCH v4 11/11] s390/mm: Convert to LOCK_MM_AND_FIND_VMA
2024-10-22 12:06 ` [PATCH v4 11/11] s390/mm: Convert to LOCK_MM_AND_FIND_VMA Claudio Imbrenda
@ 2024-10-22 12:29 ` Alexander Gordeev
0 siblings, 0 replies; 16+ messages in thread
From: Alexander Gordeev @ 2024-10-22 12:29 UTC (permalink / raw)
To: Claudio Imbrenda
Cc: linux-kernel, borntraeger, nsg, nrb, frankja, seiden, hca, gor,
gerald.schaefer, kvm, linux-s390, david
On Tue, Oct 22, 2024 at 02:06:01PM +0200, Claudio Imbrenda wrote:
> From: Heiko Carstens <hca@linux.ibm.com>
>
> With the gmap code gone, s390 can easily be converted to
> LOCK_MM_AND_FIND_VMA, as has been done for most other
> architectures.
>
> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
> Reviewed-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
> ---
> arch/s390/Kconfig | 1 +
> arch/s390/mm/fault.c | 13 ++-----------
> 2 files changed, 3 insertions(+), 11 deletions(-)
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
* Re: [PATCH v4 05/11] s390/mm/fault: Handle guest-related program interrupts in KVM
2024-10-22 12:05 ` [PATCH v4 05/11] s390/mm/fault: Handle guest-related program interrupts in KVM Claudio Imbrenda
@ 2024-10-22 12:38 ` Alexander Gordeev
0 siblings, 0 replies; 16+ messages in thread
From: Alexander Gordeev @ 2024-10-22 12:38 UTC (permalink / raw)
To: Claudio Imbrenda
Cc: linux-kernel, borntraeger, nsg, nrb, frankja, seiden, hca, gor,
gerald.schaefer, kvm, linux-s390, david
On Tue, Oct 22, 2024 at 02:05:55PM +0200, Claudio Imbrenda wrote:
Hi Claudio!
> [agordeev@linux.ibm.com: remove spurious flags &= ~FAULT_FLAG_RETRY_NOWAIT]
A square-brackets line is for something else ;)
Simply drop it when/if you post a new version, please.
Otherwise I think Heiko could take care of that.
Thanks!
* Re: [PATCH v4 00/11] s390/kvm: Handle guest-related program interrupts in KVM
2024-10-22 12:05 [PATCH v4 00/11] s390/kvm: Handle guest-related program interrupts in KVM Claudio Imbrenda
` (10 preceding siblings ...)
2024-10-22 12:06 ` [PATCH v4 11/11] s390/mm: Convert to LOCK_MM_AND_FIND_VMA Claudio Imbrenda
@ 2024-10-22 14:45 ` Heiko Carstens
11 siblings, 0 replies; 16+ messages in thread
From: Heiko Carstens @ 2024-10-22 14:45 UTC (permalink / raw)
To: Claudio Imbrenda
Cc: linux-kernel, borntraeger, nsg, nrb, frankja, seiden, agordeev,
gor, gerald.schaefer, kvm, linux-s390, david
On Tue, Oct 22, 2024 at 02:05:50PM +0200, Claudio Imbrenda wrote:
> This patchseries moves the handling of host program interrupts that
> happen while a KVM guest is running into KVM itself.
>
> All program interrupts that happen in the host while a KVM guest is
> running are due to DAT exceptions. It is cleaner and more maintainable
> to have KVM handle those.
>
> As a side effect, some more cleanups are also possible.
>
> Moreover, this series serves as a foundation for an upcoming series
> that will further move as much s390 KVM memory management as possible
> into KVM itself, and away from the rest of the kernel.
...
> Claudio Imbrenda (8):
> s390/entry: Remove __GMAP_ASCE and use _PIF_GUEST_FAULT again
> s390/kvm: Remove kvm_arch_fault_in_page()
> s390/mm/gmap: Refactor gmap_fault() and add support for pfault
> s390/mm/gmap: Fix __gmap_fault() return code
> s390/mm/fault: Handle guest-related program interrupts in KVM
> s390/kvm: Stop using gmap_{en,dis}able()
> s390/mm/gmap: Remove gmap_{en,dis}able()
> s390: Remove gmap pointer from lowcore
>
> Heiko Carstens (3):
> s390/mm: Simplify get_fault_type()
> s390/mm: Get rid of fault type switch statements
> s390/mm: Convert to LOCK_MM_AND_FIND_VMA
>
> arch/s390/Kconfig | 1 +
> arch/s390/include/asm/gmap.h | 3 -
> arch/s390/include/asm/kvm_host.h | 5 +-
> arch/s390/include/asm/lowcore.h | 3 +-
> arch/s390/include/asm/processor.h | 5 +-
> arch/s390/include/asm/ptrace.h | 2 +
> arch/s390/kernel/asm-offsets.c | 3 -
> arch/s390/kernel/entry.S | 44 ++-----
> arch/s390/kernel/traps.c | 23 +++-
> arch/s390/kvm/intercept.c | 4 +-
> arch/s390/kvm/kvm-s390.c | 142 +++++++++++++++-------
> arch/s390/kvm/kvm-s390.h | 8 +-
> arch/s390/kvm/vsie.c | 17 ++-
> arch/s390/mm/fault.c | 195 +++++-------------------------
> arch/s390/mm/gmap.c | 151 +++++++++++++++--------
> 15 files changed, 281 insertions(+), 325 deletions(-)
Series applied, thanks!
* Re: [PATCH v4 07/11] s390/mm/gmap: Remove gmap_{en,dis}able()
2024-10-22 12:05 ` [PATCH v4 07/11] s390/mm/gmap: Remove gmap_{en,dis}able() Claudio Imbrenda
@ 2024-10-23 8:39 ` Steffen Eiden
0 siblings, 0 replies; 16+ messages in thread
From: Steffen Eiden @ 2024-10-23 8:39 UTC (permalink / raw)
To: Claudio Imbrenda, linux-kernel
Cc: borntraeger, nsg, nrb, frankja, hca, agordeev, gor,
gerald.schaefer, kvm, linux-s390, david
On 10/22/24 2:05 PM, Claudio Imbrenda wrote:
> Remove gmap_enable(), gmap_disable(), and gmap_get_enabled() since they do
> not have any users anymore.
>
> Suggested-by: Heiko Carstens <hca@linux.ibm.com>
> Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
> Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Reviewed-by: Steffen Eiden <seiden@linux.ibm.com>
> ---
> arch/s390/include/asm/gmap.h | 3 ---
> arch/s390/mm/gmap.c | 31 -------------------------------
> 2 files changed, 34 deletions(-)
...
end of thread, other threads:[~2024-10-23 8:39 UTC | newest]
Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-10-22 12:05 [PATCH v4 00/11] s390/kvm: Handle guest-related program interrupts in KVM Claudio Imbrenda
2024-10-22 12:05 ` [PATCH v4 01/11] s390/entry: Remove __GMAP_ASCE and use _PIF_GUEST_FAULT again Claudio Imbrenda
2024-10-22 12:05 ` [PATCH v4 02/11] s390/kvm: Remove kvm_arch_fault_in_page() Claudio Imbrenda
2024-10-22 12:05 ` [PATCH v4 03/11] s390/mm/gmap: Refactor gmap_fault() and add support for pfault Claudio Imbrenda
2024-10-22 12:05 ` [PATCH v4 04/11] s390/mm/gmap: Fix __gmap_fault() return code Claudio Imbrenda
2024-10-22 12:05 ` [PATCH v4 05/11] s390/mm/fault: Handle guest-related program interrupts in KVM Claudio Imbrenda
2024-10-22 12:38 ` Alexander Gordeev
2024-10-22 12:05 ` [PATCH v4 06/11] s390/kvm: Stop using gmap_{en,dis}able() Claudio Imbrenda
2024-10-22 12:05 ` [PATCH v4 07/11] s390/mm/gmap: Remove gmap_{en,dis}able() Claudio Imbrenda
2024-10-23 8:39 ` Steffen Eiden
2024-10-22 12:05 ` [PATCH v4 08/11] s390: Remove gmap pointer from lowcore Claudio Imbrenda
2024-10-22 12:05 ` [PATCH v4 09/11] s390/mm: Simplify get_fault_type() Claudio Imbrenda
2024-10-22 12:06 ` [PATCH v4 10/11] s390/mm: Get rid of fault type switch statements Claudio Imbrenda
2024-10-22 12:06 ` [PATCH v4 11/11] s390/mm: Convert to LOCK_MM_AND_FIND_VMA Claudio Imbrenda
2024-10-22 12:29 ` Alexander Gordeev
2024-10-22 14:45 ` [PATCH v4 00/11] s390/kvm: Handle guest-related program interrupts in KVM Heiko Carstens