* [PATCH v4 00/11] s390x/pci: zPCI interpretation support
From: Matthew Rosato @ 2022-03-14 19:49 UTC
To: qemu-s390x
Cc: farman, kvm, pmorel, schnelle, cohuck, richard.henderson, thuth,
qemu-devel, pasic, alex.williamson, mst, pbonzini, david,
borntraeger
For QEMU, the majority of the work in enabling instruction interpretation
is handled via new KVM ioctls to enable interpretation, interrupt
forwarding and registration of the guest IOAT tables. In order to make
use of the KVM-managed IOMMU domain operations on the host, we also add
some code to vfio to indicate that a given device wishes to register the
alternate domain type for its group.
This series also adds a new, optional 'interpret' parameter to zpci which
can be used to disable interpretation support (interpret=off) as well as
a 'forwarding_assist' parameter to determine whether or not the firmware
assist will be used for interrupt delivery (default when interpretation
is in use) or whether the host will be responsible for delivering all
interrupts (forwarding_assist=off).
The ZPCI_INTERP CPU feature is added beginning with the z14 model to
enable this support.
As a consequence of implementing zPCI interpretation, ISM devices now
become eligible for passthrough (but only when zPCI interpretation is
available).
From the perspective of guest configuration, you pass through zPCI devices
in the same manner as before, with interpretation support being used by
default if available in kernel+qemu.
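The default-with-fallback behaviour described above can be sketched roughly as
follows. This is a minimal illustration, not code from the series: the names
zpci_mode, zpci_host_caps and zpci_select_mode are hypothetical, and the exact
set of host checks is an assumption (the series only states that the necessary
ioctl and features must be available).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for illustration only; not QEMU identifiers. */
enum zpci_mode {
    ZPCI_MODE_INTERCEPT,   /* host intercepts zPCI instructions */
    ZPCI_MODE_INTERPRET,   /* zPCI instructions are interpreted */
};

/* Assumed host facilities that gate interpretation. */
struct zpci_host_caps {
    bool cpu_feat_zpci_interp;   /* ZPCI_INTERP CPU feature present */
    bool kvm_zpci_op;            /* KVM zPCI ioctl present */
};

/*
 * Pick the mode for one device: interpretation is requested by default
 * (interpret=on), emulated devices are always intercepted, and missing
 * host facilities cause a silent fallback to interception.
 */
static enum zpci_mode zpci_select_mode(bool interpret_prop, bool is_vfio,
                                       const struct zpci_host_caps *caps)
{
    assert(caps);

    if (!is_vfio || !interpret_prop) {
        return ZPCI_MODE_INTERCEPT;
    }
    if (!caps->cpu_feat_zpci_interp || !caps->kvm_zpci_op) {
        return ZPCI_MODE_INTERCEPT;   /* fallback, not an error */
    }
    return ZPCI_MODE_INTERPRET;
}
```

The point of the design is that an existing guest definition needs no changes:
the same configuration yields interpretation on a capable host and
interception elsewhere.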
Associated kernel series:
https://lore.kernel.org/kvm/20220314194451.58266-1-mjrosato@linux.ibm.com/
Changelog v3->v4
- Unfortunately I had to remove some Review tags because the userspace API
moved from vfio device feature ioctls to KVM ioctls in response to
feedback from the kernel series. The vast majority of the QEMU logic
remains intact however, with most changes being to the way we issue
ioctls.
- Additional logic was added to test for availability of the KVM ioctl;
this replaces the probe logic done for the vfio ioctls.
- Add code to indicate on VFIO_SET_IOMMU that a KVM-managed IOMMU
domain is to be allocated.
Matthew Rosato (11):
Update linux headers
vfio: handle KVM-owned IOMMU requests
target/s390x: add zpci-interp to cpu models
s390x/pci: add routine to get host function handle from CLP info
s390x/pci: enable for load/store intepretation
s390x/pci: don't fence interpreted devices without MSI-X
s390x/pci: enable adapter event notification for interpreted devices
s390x/pci: use KVM-managed IOMMU for interpretation
s390x/pci: use I/O Address Translation assist when interpreting
s390x/pci: use dtsm provided from vfio capabilities for interpreted
devices
s390x/pci: let intercept devices have separate PCI groups
hw/s390x/meson.build | 1 +
hw/s390x/s390-pci-bus.c | 125 ++++++++++++++++++++--
hw/s390x/s390-pci-inst.c | 136 +++++++++++++++++++++--
hw/s390x/s390-pci-kvm.c | 160 ++++++++++++++++++++++++++++
hw/s390x/s390-pci-vfio.c | 151 ++++++++++++++++++++++----
hw/s390x/s390-virtio-ccw.c | 1 +
hw/vfio/ap.c | 2 +-
hw/vfio/ccw.c | 2 +-
hw/vfio/common.c | 26 ++++-
hw/vfio/pci.c | 3 +-
hw/vfio/pci.h | 1 +
hw/vfio/platform.c | 2 +-
include/hw/s390x/s390-pci-bus.h | 8 +-
include/hw/s390x/s390-pci-inst.h | 2 +-
include/hw/s390x/s390-pci-kvm.h | 68 ++++++++++++
include/hw/s390x/s390-pci-vfio.h | 11 ++
include/hw/vfio/vfio-common.h | 4 +-
linux-headers/asm-s390/kvm.h | 1 +
linux-headers/asm-x86/kvm.h | 3 +
linux-headers/linux/kvm.h | 51 ++++++++-
linux-headers/linux/vfio.h | 6 ++
linux-headers/linux/vfio_zdev.h | 6 ++
target/s390x/cpu_features_def.h.inc | 1 +
target/s390x/gen-features.c | 2 +
target/s390x/kvm/kvm.c | 8 ++
target/s390x/kvm/kvm_s390x.h | 1 +
26 files changed, 731 insertions(+), 51 deletions(-)
create mode 100644 hw/s390x/s390-pci-kvm.c
create mode 100644 include/hw/s390x/s390-pci-kvm.h
--
2.27.0
* [PATCH v4 01/11] Update linux headers
From: Matthew Rosato @ 2022-03-14 19:49 UTC
This is a placeholder that pulls in 5.17-rc7 + unmerged kernel changes
required by this item. A proper header sync can be done once the
associated kernel code merges.
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
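linux-headers/asm-s390/kvm.h | 1 +
linux-headers/asm-x86/kvm.h | 3 ++
linux-headers/linux/kvm.h | 51 +++++++++++++++++++++++++++++++--
linux-headers/linux/vfio.h | 6 ++++
linux-headers/linux/vfio_zdev.h | 6 ++++
5 files changed, 64 insertions(+), 3 deletions(-)
diff --git a/linux-headers/asm-s390/kvm.h b/linux-headers/asm-s390/kvm.h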
index f053b8304a..d8259ff9a1 100644
--- a/linux-headers/asm-s390/kvm.h
+++ b/linux-headers/asm-s390/kvm.h
@@ -130,6 +130,7 @@ struct kvm_s390_vm_cpu_machine {
#define KVM_S390_VM_CPU_FEAT_PFMFI 11
#define KVM_S390_VM_CPU_FEAT_SIGPIF 12
#define KVM_S390_VM_CPU_FEAT_KSS 13
+#define KVM_S390_VM_CPU_FEAT_ZPCI_INTERP 14
struct kvm_s390_vm_cpu_feat {
__u64 feat[16];
};
diff --git a/linux-headers/asm-x86/kvm.h b/linux-headers/asm-x86/kvm.h
index 2da3316bb5..bf6e96011d 100644
--- a/linux-headers/asm-x86/kvm.h
+++ b/linux-headers/asm-x86/kvm.h
@@ -452,6 +452,9 @@ struct kvm_sync_regs {
#define KVM_STATE_VMX_PREEMPTION_TIMER_DEADLINE 0x00000001
+/* attributes for system fd (group 0) */
+#define KVM_X86_XCOMP_GUEST_SUPP 0
+
struct kvm_vmx_nested_state_data {
__u8 vmcs12[KVM_STATE_NESTED_VMX_VMCS_SIZE];
__u8 shadow_vmcs12[KVM_STATE_NESTED_VMX_VMCS_SIZE];
diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
index 00af3bc333..8f82e6ff20 100644
--- a/linux-headers/linux/kvm.h
+++ b/linux-headers/linux/kvm.h
@@ -1133,6 +1133,9 @@ struct kvm_ppc_resize_hpt {
#define KVM_CAP_VM_MOVE_ENC_CONTEXT_FROM 206
#define KVM_CAP_VM_GPA_BITS 207
#define KVM_CAP_XSAVE2 208
+#define KVM_CAP_SYS_ATTRIBUTES 209
+#define KVM_CAP_PPC_AIL_MODE_3 210
+#define KVM_CAP_S390_ZPCI_OP 211
#ifdef KVM_CAP_IRQ_ROUTING
@@ -1623,9 +1626,6 @@ struct kvm_enc_region {
#define KVM_S390_NORMAL_RESET _IO(KVMIO, 0xc3)
#define KVM_S390_CLEAR_RESET _IO(KVMIO, 0xc4)
-/* Available with KVM_CAP_XSAVE2 */
-#define KVM_GET_XSAVE2 _IOR(KVMIO, 0xcf, struct kvm_xsave)
-
struct kvm_s390_pv_sec_parm {
__u64 origin;
__u64 length;
@@ -2047,4 +2047,49 @@ struct kvm_stats_desc {
#define KVM_GET_STATS_FD _IO(KVMIO, 0xce)
+/* Available with KVM_CAP_XSAVE2 */
+#define KVM_GET_XSAVE2 _IOR(KVMIO, 0xcf, struct kvm_xsave)
+
+/* Available with KVM_CAP_S390_ZPCI_OP */
+#define KVM_S390_ZPCI_OP _IOW(KVMIO, 0xd0, struct kvm_s390_zpci_op)
+
+struct kvm_s390_zpci_op {
+ /* in */
+ __u32 fh; /* target device */
+ __u8 op; /* operation to perform */
+ __u8 pad[3];
+ union {
+ /* for KVM_S390_ZPCIOP_REG_INT */
+ struct {
+ __u64 ibv; /* Guest addr of interrupt bit vector */
+ __u64 sb; /* Guest addr of summary bit */
+ __u32 flags;
+ __u32 noi; /* Number of interrupts */
+ __u8 isc; /* Guest interrupt subclass */
+ __u8 sbo; /* Offset of guest summary bit vector */
+ __u16 pad;
+ } reg_int;
+ /* for KVM_S390_ZPCIOP_REG_IOAT */
+ struct {
+ __u64 iota; /* I/O Translation settings */
+ } reg_ioat;
+ __u8 reserved[64];
+ } u;
+ /* out */
+ __u32 newfh; /* updated device handle */
+};
+
+/* types for kvm_s390_zpci_op->op */
+#define KVM_S390_ZPCIOP_INIT 0
+#define KVM_S390_ZPCIOP_END 1
+#define KVM_S390_ZPCIOP_START_INTERP 2
+#define KVM_S390_ZPCIOP_STOP_INTERP 3
+#define KVM_S390_ZPCIOP_REG_INT 4
+#define KVM_S390_ZPCIOP_DEREG_INT 5
+#define KVM_S390_ZPCIOP_REG_IOAT 6
+#define KVM_S390_ZPCIOP_DEREG_IOAT 7
+
+/* flags for kvm_s390_zpci_op->u.reg_int.flags */
+#define KVM_S390_ZPCIOP_REGINT_HOST (1 << 0)
+
#endif /* __LINUX_KVM_H */
diff --git a/linux-headers/linux/vfio.h b/linux-headers/linux/vfio.h
index e680594f27..38d43e1205 100644
--- a/linux-headers/linux/vfio.h
+++ b/linux-headers/linux/vfio.h
@@ -52,6 +52,12 @@
/* Supports the vaddr flag for DMA map and unmap */
#define VFIO_UPDATE_VADDR 10
+/*
+ * The KVM_IOMMU type implies that the hypervisor will control the mappings
+ * rather than userspace
+ */
+#define VFIO_KVM_IOMMU 11
+
/*
* The IOCTL interface is designed for extensibility by embedding the
* structure length (argsz) and flags into structures passed between
diff --git a/linux-headers/linux/vfio_zdev.h b/linux-headers/linux/vfio_zdev.h
index b4309397b6..29351687e9 100644
--- a/linux-headers/linux/vfio_zdev.h
+++ b/linux-headers/linux/vfio_zdev.h
@@ -29,6 +29,9 @@ struct vfio_device_info_cap_zpci_base {
__u16 fmb_length; /* Measurement Block Length (in bytes) */
__u8 pft; /* PCI Function Type */
__u8 gid; /* PCI function group ID */
+ /* End of version 1 */
+ __u32 fh; /* PCI function handle */
+ /* End of version 2 */
};
/**
@@ -47,6 +50,9 @@ struct vfio_device_info_cap_zpci_group {
__u16 noi; /* Maximum number of MSIs */
__u16 maxstbl; /* Maximum Store Block Length */
__u8 version; /* Supported PCI Version */
+ /* End of version 1 */
+ __u8 dtsm; /* Supported IOAT Designations */
+ /* End of version 2 */
};
/**
--
2.27.0
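To illustrate how the proposed kvm_s390_zpci_op layout might be filled in for
an interrupt registration request, here is a small sketch. It uses a local
mirror of the structure (sketch_zpci_op) because the real definition is not
yet in any released header; the helper name sketch_prepare_reg_int is
hypothetical, and the flags value is simply passed through without assuming
its semantics.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of the proposed struct kvm_s390_zpci_op (illustration only). */
struct sketch_zpci_op {
    /* in */
    uint32_t fh;                /* target device */
    uint8_t op;                 /* operation to perform */
    uint8_t pad[3];
    union {
        struct {
            uint64_t ibv;       /* guest addr of interrupt bit vector */
            uint64_t sb;        /* guest addr of summary bit */
            uint32_t flags;
            uint32_t noi;       /* number of interrupts */
            uint8_t isc;        /* guest interrupt subclass */
            uint8_t sbo;        /* offset of guest summary bit vector */
            uint16_t pad;
        } reg_int;
        struct {
            uint64_t iota;      /* I/O translation settings */
        } reg_ioat;
        uint8_t reserved[64];
    } u;
    /* out */
    uint32_t newfh;             /* updated device handle */
};

#define SKETCH_ZPCIOP_REG_INT 4 /* mirrors KVM_S390_ZPCIOP_REG_INT */

/* Zero the whole argument block, then fill the REG_INT member. */
static void sketch_prepare_reg_int(struct sketch_zpci_op *args, uint32_t fh,
                                   uint64_t ibv, uint64_t sb, uint32_t noi,
                                   uint8_t isc, uint8_t sbo, uint32_t flags)
{
    assert(args);

    memset(args, 0, sizeof(*args));   /* clears padding and reserved bytes */
    args->fh = fh;
    args->op = SKETCH_ZPCIOP_REG_INT;
    args->u.reg_int.ibv = ibv;
    args->u.reg_int.sb = sb;
    args->u.reg_int.flags = flags;
    args->u.reg_int.noi = noi;
    args->u.reg_int.isc = isc;
    args->u.reg_int.sbo = sbo;
}
```

Zeroing the structure first keeps the pad and reserved bytes deterministic.
After a successful call the caller would presumably read newfh back, since the
structure marks it as the updated device handle.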
* [PATCH v4 02/11] vfio: handle KVM-owned IOMMU requests
From: Matthew Rosato @ 2022-03-14 19:49 UTC
s390x PCI devices need to use a special KVM-managed IOMMU domain as part
of zPCI interpretation. To facilitate this, let a vfio device indicate
that it wishes to use a KVM-managed IOMMU so that it can be reflected by
the group and, ultimately, trigger a KVM-managed argument for the
VFIO_SET_IOMMU ioctl.
This patch sets up the framework to allow a device to trigger the
VFIO_SET_IOMMU with the new KVM-owned type. A subsequent patch will add
exploitation by s390x PCI.
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
hw/vfio/ap.c | 2 +-
hw/vfio/ccw.c | 2 +-
hw/vfio/common.c | 26 +++++++++++++++++++++-----
hw/vfio/pci.c | 3 ++-
hw/vfio/pci.h | 1 +
hw/vfio/platform.c | 2 +-
include/hw/vfio/vfio-common.h | 4 +++-
7 files changed, 30 insertions(+), 10 deletions(-)
diff --git a/hw/vfio/ap.c b/hw/vfio/ap.c
index e0dd561e85..22c402771a 100644
--- a/hw/vfio/ap.c
+++ b/hw/vfio/ap.c
@@ -81,7 +81,7 @@ static VFIOGroup *vfio_ap_get_group(VFIOAPDevice *vapdev, Error **errp)
g_free(group_path);
- return vfio_get_group(groupid, &address_space_memory, errp);
+ return vfio_get_group(groupid, &address_space_memory, false, errp);
}
static void vfio_ap_realize(DeviceState *dev, Error **errp)
diff --git a/hw/vfio/ccw.c b/hw/vfio/ccw.c
index 0354737666..08b0af5897 100644
--- a/hw/vfio/ccw.c
+++ b/hw/vfio/ccw.c
@@ -650,7 +650,7 @@ static VFIOGroup *vfio_ccw_get_group(S390CCWDevice *cdev, Error **errp)
return NULL;
}
- return vfio_get_group(groupid, &address_space_memory, errp);
+ return vfio_get_group(groupid, &address_space_memory, false, errp);
}
static void vfio_ccw_realize(DeviceState *dev, Error **errp)
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 080046e3f5..227880bf84 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -1873,7 +1873,7 @@ static int vfio_get_iommu_type(VFIOContainer *container,
return -EINVAL;
}
-static int vfio_init_container(VFIOContainer *container, int group_fd,
+static int vfio_init_container(VFIOContainer *container, VFIOGroup *group,
Error **errp)
{
int iommu_type, ret;
@@ -1883,12 +1883,20 @@ static int vfio_init_container(VFIOContainer *container, int group_fd,
return iommu_type;
}
- ret = ioctl(group_fd, VFIO_GROUP_SET_CONTAINER, &container->fd);
+ ret = ioctl(group->fd, VFIO_GROUP_SET_CONTAINER, &container->fd);
if (ret) {
error_setg_errno(errp, errno, "Failed to set group container");
return -errno;
}
+ /*
+ * In the case where KVM will manage the IOMMU, we must instruct the host
+ * IOMMU to use the appropriate domain ops
+ */
+ if (group->kvm_managed_iommu) {
+ iommu_type = VFIO_KVM_IOMMU;
+ }
+
while (ioctl(container->fd, VFIO_SET_IOMMU, iommu_type)) {
if (iommu_type == VFIO_SPAPR_TCE_v2_IOMMU) {
/*
@@ -2062,7 +2070,7 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as,
QLIST_INIT(&container->hostwin_list);
QLIST_INIT(&container->vrdl_list);
- ret = vfio_init_container(container, group->fd, errp);
+ ret = vfio_init_container(container, group, errp);
if (ret) {
goto free_container_exit;
}
@@ -2265,7 +2273,8 @@ static void vfio_disconnect_container(VFIOGroup *group)
}
}
-VFIOGroup *vfio_get_group(int groupid, AddressSpace *as, Error **errp)
+VFIOGroup *vfio_get_group(int groupid, AddressSpace *as, bool kvm_managed_iommu,
+ Error **errp)
{
VFIOGroup *group;
char path[32];
@@ -2273,7 +2282,13 @@ VFIOGroup *vfio_get_group(int groupid, AddressSpace *as, Error **errp)
QLIST_FOREACH(group, &vfio_group_list, next) {
if (group->groupid == groupid) {
- /* Found it. Now is it already in the right context? */
+ /* Found it. Ensure using same IOMMU type */
+ if (group->kvm_managed_iommu != kvm_managed_iommu) {
+ error_setg(errp, "group %d using conflicting iommu ops",
+ group->groupid);
+ return NULL;
+ }
+ /* Is it already in the right context? */
if (group->container->space->as == as) {
return group;
} else {
@@ -2307,6 +2322,7 @@ VFIOGroup *vfio_get_group(int groupid, AddressSpace *as, Error **errp)
}
group->groupid = groupid;
+ group->kvm_managed_iommu = kvm_managed_iommu;
QLIST_INIT(&group->device_list);
if (vfio_connect_container(group, as, errp)) {
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 7b45353ce2..80f7e2880a 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -2855,7 +2855,8 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
trace_vfio_realize(vdev->vbasedev.name, groupid);
- group = vfio_get_group(groupid, pci_device_iommu_address_space(pdev), errp);
+ group = vfio_get_group(groupid, pci_device_iommu_address_space(pdev),
+ vdev->kvm_managed_iommu, errp);
if (!group) {
goto error;
}
diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h
index 64777516d1..f74524384c 100644
--- a/hw/vfio/pci.h
+++ b/hw/vfio/pci.h
@@ -171,6 +171,7 @@ struct VFIOPCIDevice {
bool no_kvm_ioeventfd;
bool no_vfio_ioeventfd;
bool enable_ramfb;
+ bool kvm_managed_iommu;
VFIODisplay *dpy;
Notifier irqchip_change_notifier;
};
diff --git a/hw/vfio/platform.c b/hw/vfio/platform.c
index f8f08a0f36..08793401dd 100644
--- a/hw/vfio/platform.c
+++ b/hw/vfio/platform.c
@@ -577,7 +577,7 @@ static int vfio_base_device_init(VFIODevice *vbasedev, Error **errp)
trace_vfio_platform_base_device_init(vbasedev->name, groupid);
- group = vfio_get_group(groupid, &address_space_memory, errp);
+ group = vfio_get_group(groupid, &address_space_memory, false, errp);
if (!group) {
return -ENOENT;
}
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 8af11b0a76..37aa6ca162 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -162,6 +162,7 @@ typedef struct VFIOGroup {
QLIST_ENTRY(VFIOGroup) next;
QLIST_ENTRY(VFIOGroup) container_next;
bool ram_block_discard_allowed;
+ bool kvm_managed_iommu;
} VFIOGroup;
typedef struct VFIODMABuf {
@@ -208,7 +209,8 @@ void vfio_region_unmap(VFIORegion *region);
void vfio_region_exit(VFIORegion *region);
void vfio_region_finalize(VFIORegion *region);
void vfio_reset_handler(void *opaque);
-VFIOGroup *vfio_get_group(int groupid, AddressSpace *as, Error **errp);
+VFIOGroup *vfio_get_group(int groupid, AddressSpace *as, bool kvm_managed_iommu,
+ Error **errp);
void vfio_put_group(VFIOGroup *group);
int vfio_get_device(VFIOGroup *group, const char *name,
VFIODevice *vbasedev, Error **errp);
--
2.27.0
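The group handling added by this patch can be modelled in isolation. The
sketch below is a simplified stand-in, not the QEMU implementation: it uses a
fixed-size table instead of QLIST, omits the address-space check and container
setup, and all sketch_ names are hypothetical. It shows the two rules the
patch introduces: a group cannot be shared by devices that disagree on
KVM-managed IOMMU use, and a KVM-managed group overrides the probed IOMMU
type.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_GROUPS 8
#define SKETCH_KVM_IOMMU 11   /* mirrors VFIO_KVM_IOMMU from patch 01 */

struct sketch_group {
    int groupid;
    bool kvm_managed_iommu;
    bool in_use;
};

static struct sketch_group sketch_groups[SKETCH_MAX_GROUPS];

/*
 * Look up or create a group. Returns NULL when an existing group was
 * created with a different kvm_managed_iommu setting (the conflict the
 * patch rejects) or when the table is full.
 */
static struct sketch_group *sketch_get_group(int groupid,
                                             bool kvm_managed_iommu)
{
    struct sketch_group *free_slot = NULL;

    for (size_t i = 0; i < SKETCH_MAX_GROUPS; i++) {
        struct sketch_group *g = &sketch_groups[i];

        if (!g->in_use) {
            if (!free_slot) {
                free_slot = g;
            }
            continue;
        }
        if (g->groupid == groupid) {
            return g->kvm_managed_iommu == kvm_managed_iommu ? g : NULL;
        }
    }
    if (!free_slot) {
        return NULL;
    }
    free_slot->groupid = groupid;
    free_slot->kvm_managed_iommu = kvm_managed_iommu;
    free_slot->in_use = true;
    return free_slot;
}

/* A KVM-managed group overrides whatever IOMMU type was probed. */
static int sketch_pick_iommu_type(const struct sketch_group *g,
                                  int probed_type)
{
    assert(g);
    return g->kvm_managed_iommu ? SKETCH_KVM_IOMMU : probed_type;
}
```

Rejecting the mismatch at lookup time means the conflict is reported when the
second device is realized, rather than surfacing later as a failed
VFIO_SET_IOMMU.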
* [PATCH v4 03/11] target/s390x: add zpci-interp to cpu models
From: Matthew Rosato @ 2022-03-14 19:49 UTC
The zpci-interp feature is used to specify whether zPCI interpretation is
to be used for this guest.
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
hw/s390x/s390-virtio-ccw.c | 1 +
target/s390x/cpu_features_def.h.inc | 1 +
target/s390x/gen-features.c | 2 ++
target/s390x/kvm/kvm.c | 1 +
4 files changed, 5 insertions(+)
diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
index 90480e7cf9..b190234308 100644
--- a/hw/s390x/s390-virtio-ccw.c
+++ b/hw/s390x/s390-virtio-ccw.c
@@ -805,6 +805,7 @@ static void ccw_machine_6_2_instance_options(MachineState *machine)
static const S390FeatInit qemu_cpu_feat = { S390_FEAT_LIST_QEMU_V6_2 };
ccw_machine_7_0_instance_options(machine);
+ s390_cpudef_featoff_greater(14, 1, S390_FEAT_ZPCI_INTERP);
s390_set_qemu_cpu_model(0x3906, 14, 2, qemu_cpu_feat);
}
diff --git a/target/s390x/cpu_features_def.h.inc b/target/s390x/cpu_features_def.h.inc
index e86662bb3b..4ade3182aa 100644
--- a/target/s390x/cpu_features_def.h.inc
+++ b/target/s390x/cpu_features_def.h.inc
@@ -146,6 +146,7 @@ DEF_FEAT(SIE_CEI, "cei", SCLP_CPU, 43, "SIE: Conditional-external-interception f
DEF_FEAT(DAT_ENH_2, "dateh2", MISC, 0, "DAT-enhancement facility 2")
DEF_FEAT(CMM, "cmm", MISC, 0, "Collaborative-memory-management facility")
DEF_FEAT(AP, "ap", MISC, 0, "AP instructions installed")
+DEF_FEAT(ZPCI_INTERP, "zpci-interp", MISC, 0, "zPCI interpretation")
/* Features exposed via the PLO instruction. */
DEF_FEAT(PLO_CL, "plo-cl", PLO, 0, "PLO Compare and load (32 bit in general registers)")
diff --git a/target/s390x/gen-features.c b/target/s390x/gen-features.c
index 22846121c4..9db6bd545e 100644
--- a/target/s390x/gen-features.c
+++ b/target/s390x/gen-features.c
@@ -554,6 +554,7 @@ static uint16_t full_GEN14_GA1[] = {
S390_FEAT_HPMA2,
S390_FEAT_SIE_KSS,
S390_FEAT_GROUP_MULTIPLE_EPOCH_PTFF,
+ S390_FEAT_ZPCI_INTERP,
};
#define full_GEN14_GA2 EmptyFeat
@@ -650,6 +651,7 @@ static uint16_t default_GEN14_GA1[] = {
S390_FEAT_GROUP_MSA_EXT_8,
S390_FEAT_MULTIPLE_EPOCH,
S390_FEAT_GROUP_MULTIPLE_EPOCH_PTFF,
+ S390_FEAT_ZPCI_INTERP,
};
#define default_GEN14_GA2 EmptyFeat
diff --git a/target/s390x/kvm/kvm.c b/target/s390x/kvm/kvm.c
index 6acf14d5ec..0357bfda89 100644
--- a/target/s390x/kvm/kvm.c
+++ b/target/s390x/kvm/kvm.c
@@ -2294,6 +2294,7 @@ static int kvm_to_feat[][2] = {
{ KVM_S390_VM_CPU_FEAT_PFMFI, S390_FEAT_SIE_PFMFI},
{ KVM_S390_VM_CPU_FEAT_SIGPIF, S390_FEAT_SIE_SIGPIF},
{ KVM_S390_VM_CPU_FEAT_KSS, S390_FEAT_SIE_KSS},
+ { KVM_S390_VM_CPU_FEAT_ZPCI_INTERP, S390_FEAT_ZPCI_INTERP },
};
static int query_cpu_feat(S390FeatBitmap features)
--
2.27.0
* [PATCH v4 04/11] s390x/pci: add routine to get host function handle from CLP info
From: Matthew Rosato @ 2022-03-14 19:49 UTC
In order to interface with the underlying host zPCI device, we need
to know its function handle. Add a routine to grab this from the
vfio CLP capabilities chain.
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
hw/s390x/s390-pci-vfio.c | 83 ++++++++++++++++++++++++++------
include/hw/s390x/s390-pci-vfio.h | 6 +++
2 files changed, 73 insertions(+), 16 deletions(-)
diff --git a/hw/s390x/s390-pci-vfio.c b/hw/s390x/s390-pci-vfio.c
index 6f80a47e29..4bf0a7e22d 100644
--- a/hw/s390x/s390-pci-vfio.c
+++ b/hw/s390x/s390-pci-vfio.c
@@ -124,6 +124,27 @@ static void s390_pci_read_base(S390PCIBusDevice *pbdev,
pbdev->zpci_fn.pft = 0;
}
+static bool get_host_fh(S390PCIBusDevice *pbdev, struct vfio_device_info *info,
+ uint32_t *fh)
+{
+ struct vfio_info_cap_header *hdr;
+ struct vfio_device_info_cap_zpci_base *cap;
+ VFIOPCIDevice *vpci = container_of(pbdev->pdev, VFIOPCIDevice, pdev);
+
+ hdr = vfio_get_device_info_cap(info, VFIO_DEVICE_INFO_CAP_ZPCI_BASE);
+
+ /* Can only get the host fh with version 2 or greater */
+ if (hdr == NULL || hdr->version < 2) {
+ trace_s390_pci_clp_cap(vpci->vbasedev.name,
+ VFIO_DEVICE_INFO_CAP_ZPCI_BASE);
+ return false;
+ }
+ cap = (void *) hdr;
+
+ *fh = cap->fh;
+ return true;
+}
+
static void s390_pci_read_group(S390PCIBusDevice *pbdev,
struct vfio_device_info *info)
{
@@ -217,25 +238,13 @@ static void s390_pci_read_pfip(S390PCIBusDevice *pbdev,
memcpy(pbdev->zpci_fn.pfip, cap->pfip, CLP_PFIP_NR_SEGMENTS);
}
-/*
- * This function will issue the VFIO_DEVICE_GET_INFO ioctl and look for
- * capabilities that contain information about CLP features provided by the
- * underlying host.
- * On entry, defaults have already been placed into the guest CLP response
- * buffers. On exit, defaults will have been overwritten for any CLP features
- * found in the capability chain; defaults will remain for any CLP features not
- * found in the chain.
- */
-void s390_pci_get_clp_info(S390PCIBusDevice *pbdev)
+static struct vfio_device_info *get_device_info(S390PCIBusDevice *pbdev,
+ uint32_t argsz)
{
- g_autofree struct vfio_device_info *info = NULL;
+ struct vfio_device_info *info = g_malloc0(argsz);
VFIOPCIDevice *vfio_pci;
- uint32_t argsz;
int fd;
- argsz = sizeof(*info);
- info = g_malloc0(argsz);
-
vfio_pci = container_of(pbdev->pdev, VFIOPCIDevice, pdev);
fd = vfio_pci->vbasedev.fd;
@@ -250,7 +259,8 @@ retry:
if (ioctl(fd, VFIO_DEVICE_GET_INFO, info)) {
trace_s390_pci_clp_dev_info(vfio_pci->vbasedev.name);
- return;
+ g_free(info);
+ return NULL;
}
if (info->argsz > argsz) {
@@ -259,6 +269,47 @@ retry:
goto retry;
}
+ return info;
+}
+
+/*
+ * Get the host function handle from the vfio CLP capabilities chain. Returns
+ * true if a fh value was placed into the provided buffer. Returns false
+ * if a fh could not be obtained (ioctl failed or capability version does
+ * not include the fh)
+ */
+bool s390_pci_get_host_fh(S390PCIBusDevice *pbdev, uint32_t *fh)
+{
+ g_autofree struct vfio_device_info *info = NULL;
+
+ assert(fh);
+
+ info = get_device_info(pbdev, sizeof(*info));
+ if (!info) {
+ return false;
+ }
+
+ return get_host_fh(pbdev, info, fh);
+}
+
+/*
+ * This function will issue the VFIO_DEVICE_GET_INFO ioctl and look for
+ * capabilities that contain information about CLP features provided by the
+ * underlying host.
+ * On entry, defaults have already been placed into the guest CLP response
+ * buffers. On exit, defaults will have been overwritten for any CLP features
+ * found in the capability chain; defaults will remain for any CLP features not
+ * found in the chain.
+ */
+void s390_pci_get_clp_info(S390PCIBusDevice *pbdev)
+{
+ g_autofree struct vfio_device_info *info = NULL;
+
+ info = get_device_info(pbdev, sizeof(*info));
+ if (!info) {
+ return;
+ }
+
/*
* Find the CLP features provided and fill in the guest CLP responses.
* Always call s390_pci_read_base first as information from this could
diff --git a/include/hw/s390x/s390-pci-vfio.h b/include/hw/s390x/s390-pci-vfio.h
index ff708aef50..0c2e4b5175 100644
--- a/include/hw/s390x/s390-pci-vfio.h
+++ b/include/hw/s390x/s390-pci-vfio.h
@@ -20,6 +20,7 @@ bool s390_pci_update_dma_avail(int fd, unsigned int *avail);
S390PCIDMACount *s390_pci_start_dma_count(S390pciState *s,
S390PCIBusDevice *pbdev);
void s390_pci_end_dma_count(S390pciState *s, S390PCIDMACount *cnt);
+bool s390_pci_get_host_fh(S390PCIBusDevice *pbdev, uint32_t *fh);
void s390_pci_get_clp_info(S390PCIBusDevice *pbdev);
#else
static inline bool s390_pci_update_dma_avail(int fd, unsigned int *avail)
@@ -33,6 +34,11 @@ static inline S390PCIDMACount *s390_pci_start_dma_count(S390pciState *s,
}
static inline void s390_pci_end_dma_count(S390pciState *s,
S390PCIDMACount *cnt) { }
+static inline bool s390_pci_get_host_fh(S390PCIBusDevice *pbdev,
+ uint32_t *fh)
+{
+ return false;
+}
static inline void s390_pci_get_clp_info(S390PCIBusDevice *pbdev) { }
#endif
--
2.27.0
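The version-gated lookup performed by get_host_fh() can be sketched without
the vfio headers. The model below is an assumption-laden simplification: the
sketch_ structures only imitate the header layout (id, version, next offset)
and drop every base-capability field except fh, and the capability id value is
a stand-in. It walks an offset-linked capability chain and refuses to read fh
from a capability older than version 2.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_CAP_ZPCI_BASE 1   /* stand-in capability id */

/* Simplified capability header: next is a buffer offset, 0 ends the chain. */
struct sketch_cap_hdr {
    uint16_t id;
    uint16_t version;
    uint32_t next;
};

/* Simplified base capability: only the version 2 fh field is modelled. */
struct sketch_cap_base {
    struct sketch_cap_hdr hdr;
    uint32_t fh;
};

/*
 * Walk the chain starting at first_off. Returns true and stores the
 * handle in *fh only if a base capability of version 2 or later exists.
 */
static bool sketch_get_host_fh(const uint8_t *buf, size_t len,
                               uint32_t first_off, uint32_t *fh)
{
    uint32_t off = first_off;

    assert(buf && fh);

    while (off != 0 && off <= len &&
           len - off >= sizeof(struct sketch_cap_hdr)) {
        struct sketch_cap_hdr hdr;

        memcpy(&hdr, buf + off, sizeof(hdr));
        if (hdr.id == SKETCH_CAP_ZPCI_BASE) {
            struct sketch_cap_base cap;

            /* Version 1 capabilities end before the fh field. */
            if (hdr.version < 2 || len - off < sizeof(cap)) {
                return false;
            }
            memcpy(&cap, buf + off, sizeof(cap));
            *fh = cap.fh;
            return true;
        }
        if (hdr.next <= off) {
            return false;   /* end of chain, or a malformed backward link */
        }
        off = hdr.next;
    }
    return false;
}
```

Checking the version before touching fh matters because an older kernel
returns a shorter capability, so the bytes where fh would live belong to
something else.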
* [PATCH v4 05/11] s390x/pci: enable for load/store intepretation
From: Matthew Rosato @ 2022-03-14 19:49 UTC
Use the associated kvm ioctl to enable interpretation for devices
when requested. As part of this process, we must use the host function
handle rather than a QEMU-generated one -- we use an initial value from
vfio CLP and maintain an updated fh value from kvm ioctl response info.
By default, unless interpret=off is specified, interpretation support will
always be assumed and exploited if the necessary ioctl and features are
available on the host kernel. When these are unavailable, we will silently
revert to the interception model; this allows existing guest configurations
to work unmodified on hosts with and without zPCI interpretation support,
allowing QEMU to choose the best support model available.
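As a rough illustration of the function-handle rule described above, the
sketch below models enable and disable for both modes. It is not the code from
this patch: the sketch_ names are hypothetical, the enable-bit value is a
stand-in for FH_MASK_ENABLE, and the kvm_newfh argument stands for whatever
handle the KVM ioctl response would report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_FH_MASK_ENABLE 0x80000000u   /* stand-in for FH_MASK_ENABLE */

struct sketch_zpci_dev {
    uint32_t fh;     /* current function handle */
    bool interp;     /* device is using interpretation */
};

/*
 * Enable a device. Interpreted devices adopt the host handle and require
 * its enable bit to be set; intercepted devices just set the bit on the
 * QEMU-generated handle. Returns 0 on success, -1 otherwise.
 */
static int sketch_enable(struct sketch_zpci_dev *dev, uint32_t kvm_newfh)
{
    assert(dev);

    if (dev->interp) {
        if (!(kvm_newfh & SKETCH_FH_MASK_ENABLE)) {
            return -1;   /* passthrough handle is not enabled */
        }
        dev->fh = kvm_newfh;
    } else {
        dev->fh |= SKETCH_FH_MASK_ENABLE;
    }
    return 0;
}

/* Disable a device; the enable bit is masked off in both modes. */
static void sketch_disable(struct sketch_zpci_dev *dev, uint32_t kvm_newfh)
{
    assert(dev);

    if (dev->interp) {
        dev->fh = kvm_newfh;
    }
    dev->fh &= ~SKETCH_FH_MASK_ENABLE;
}
```

Unlike this sketch, the patch updates the stored handle inside the KVM helper
before checking the enable bit, so the failure path there may leave a
different handle behind.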
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
hw/s390x/meson.build | 1 +
hw/s390x/s390-pci-bus.c | 67 ++++++++++++++++++-
hw/s390x/s390-pci-inst.c | 54 ++++++++++++++-
hw/s390x/s390-pci-kvm.c | 112 ++++++++++++++++++++++++++++++++
include/hw/s390x/s390-pci-bus.h | 1 +
include/hw/s390x/s390-pci-kvm.h | 46 +++++++++++++
target/s390x/kvm/kvm.c | 7 ++
target/s390x/kvm/kvm_s390x.h | 1 +
8 files changed, 287 insertions(+), 2 deletions(-)
create mode 100644 hw/s390x/s390-pci-kvm.c
create mode 100644 include/hw/s390x/s390-pci-kvm.h
diff --git a/hw/s390x/meson.build b/hw/s390x/meson.build
index 28484256ec..6e6e47fcda 100644
--- a/hw/s390x/meson.build
+++ b/hw/s390x/meson.build
@@ -23,6 +23,7 @@ s390x_ss.add(when: 'CONFIG_KVM', if_true: files(
's390-skeys-kvm.c',
's390-stattrib-kvm.c',
'pv.c',
+ 's390-pci-kvm.c',
))
s390x_ss.add(when: 'CONFIG_TCG', if_true: files(
'tod-tcg.c',
diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
index 4b2bdd94b3..7ce7bda26d 100644
--- a/hw/s390x/s390-pci-bus.c
+++ b/hw/s390x/s390-pci-bus.c
@@ -16,6 +16,7 @@
#include "qapi/visitor.h"
#include "hw/s390x/s390-pci-bus.h"
#include "hw/s390x/s390-pci-inst.h"
+#include "hw/s390x/s390-pci-kvm.h"
#include "hw/s390x/s390-pci-vfio.h"
#include "hw/pci/pci_bus.h"
#include "hw/qdev-properties.h"
@@ -971,12 +972,45 @@ static void s390_pci_update_subordinate(PCIDevice *dev, uint32_t nr)
}
}
+static int s390_pci_interp_plug(S390pciState *s, S390PCIBusDevice *pbdev)
+{
+ uint32_t idx;
+ int rc;
+
+ rc = s390_pci_kvm_plug(pbdev);
+ if (rc) {
+ return rc;
+ }
+
+ /* Next, see if the idx is already in-use */
+ idx = pbdev->fh & FH_MASK_INDEX;
+ if (pbdev->idx != idx) {
+ if (s390_pci_find_dev_by_idx(s, idx)) {
+ return -EINVAL;
+ }
+ /*
+ * Update the idx entry with the passed through idx
+ * If the relinquished idx is lower than next_idx, use it
+ * to replace next_idx
+ */
+ g_hash_table_remove(s->zpci_table, &pbdev->idx);
+ if (idx < s->next_idx) {
+ s->next_idx = idx;
+ }
+ pbdev->idx = idx;
+ g_hash_table_insert(s->zpci_table, &pbdev->idx, pbdev);
+ }
+
+ return 0;
+}
+
static void s390_pcihost_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
Error **errp)
{
S390pciState *s = S390_PCI_HOST_BRIDGE(hotplug_dev);
PCIDevice *pdev = NULL;
S390PCIBusDevice *pbdev = NULL;
+ int rc;
if (object_dynamic_cast(OBJECT(dev), TYPE_PCI_BRIDGE)) {
PCIBridge *pb = PCI_BRIDGE(dev);
@@ -1022,12 +1056,35 @@ static void s390_pcihost_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
set_pbdev_info(pbdev);
if (object_dynamic_cast(OBJECT(dev), "vfio-pci")) {
- pbdev->fh |= FH_SHM_VFIO;
+ /*
+ * By default, interpretation is always requested; if the available
+ * facilities indicate it is not available, fallback to the
+ * interception model.
+ */
+ if (pbdev->interp) {
+ if (s390_pci_kvm_zpciop_allowed()) {
+ rc = s390_pci_interp_plug(s, pbdev);
+ if (rc) {
+ error_setg(errp, "Plug failed for zPCI device in "
+ "interpretation mode: %d", rc);
+ return;
+ }
+ } else {
+ DPRINTF("zPCI interpretation facilities missing.\n");
+ pbdev->interp = false;
+ }
+ }
pbdev->iommu->dma_limit = s390_pci_start_dma_count(s, pbdev);
/* Fill in CLP information passed via the vfio region */
s390_pci_get_clp_info(pbdev);
+ if (!pbdev->interp) {
+ /* Do vfio passthrough but intercept for I/O */
+ pbdev->fh |= FH_SHM_VFIO;
+ }
} else {
pbdev->fh |= FH_SHM_EMUL;
+ /* Always intercept emulated devices */
+ pbdev->interp = false;
}
if (s390_pci_msix_init(pbdev)) {
@@ -1078,6 +1135,8 @@ static void s390_pcihost_unplug(HotplugHandler *hotplug_dev, DeviceState *dev,
pbdev->pdev = NULL;
pbdev->state = ZPCI_FS_RESERVED;
} else if (object_dynamic_cast(OBJECT(dev), TYPE_S390_PCI_DEVICE)) {
+ int rc;
+
pbdev = S390_PCI_DEVICE(dev);
pbdev->fid = 0;
QTAILQ_REMOVE(&s->zpci_devs, pbdev, link);
@@ -1085,6 +1144,11 @@ static void s390_pcihost_unplug(HotplugHandler *hotplug_dev, DeviceState *dev,
if (pbdev->iommu->dma_limit) {
s390_pci_end_dma_count(s, pbdev->iommu->dma_limit);
}
+ rc = s390_pci_kvm_unplug(pbdev);
+ if (rc) {
+ error_setg(errp, "Unplug failed for zPCI device in interpretation "
+ "mode rc=%d", rc);
+ }
qdev_unrealize(dev);
}
}
@@ -1360,6 +1424,7 @@ static Property s390_pci_device_properties[] = {
DEFINE_PROP_UINT16("uid", S390PCIBusDevice, uid, UID_UNDEFINED),
DEFINE_PROP_S390_PCI_FID("fid", S390PCIBusDevice, fid),
DEFINE_PROP_STRING("target", S390PCIBusDevice, target),
+ DEFINE_PROP_BOOL("interpret", S390PCIBusDevice, interp, true),
DEFINE_PROP_END_OF_LIST(),
};
diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
index 6d400d4147..92ea7b73e4 100644
--- a/hw/s390x/s390-pci-inst.c
+++ b/hw/s390x/s390-pci-inst.c
@@ -18,6 +18,8 @@
#include "sysemu/hw_accel.h"
#include "hw/s390x/s390-pci-inst.h"
#include "hw/s390x/s390-pci-bus.h"
+#include "hw/s390x/s390-pci-kvm.h"
+#include "hw/s390x/s390-pci-vfio.h"
#include "hw/s390x/tod.h"
#ifndef DEBUG_S390PCI_INST
@@ -156,6 +158,37 @@ out:
return rc;
}
+static int clp_enable_interp(S390PCIBusDevice *pbdev)
+{
+ int rc;
+
+ rc = s390_pci_kvm_interp_enable(pbdev);
+ if (rc) {
+ DPRINTF("Failed to enable interpretation\n");
+ return rc;
+ }
+
+ if (!(pbdev->fh & FH_MASK_ENABLE)) {
+ DPRINTF("Passthrough handle is not enabled\n");
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+static int clp_disable_interp(S390PCIBusDevice *pbdev)
+{
+ int rc;
+
+ rc = s390_pci_kvm_interp_disable(pbdev);
+ if (rc) {
+ DPRINTF("Failed to disable interpretation\n");
+ return rc;
+ }
+
+ return 0;
+}
+
int clp_service_call(S390CPU *cpu, uint8_t r2, uintptr_t ra)
{
ClpReqHdr *reqh;
@@ -246,7 +279,19 @@ int clp_service_call(S390CPU *cpu, uint8_t r2, uintptr_t ra)
goto out;
}
- pbdev->fh |= FH_MASK_ENABLE;
+ /*
+ * If interpretation is specified, attempt to enable this now and
+ * update with the host fh
+ */
+ if (pbdev->interp) {
+ if (clp_enable_interp(pbdev)) {
+ stw_p(&ressetpci->hdr.rsp, CLP_RC_SETPCIFN_ERR);
+ goto out;
+ }
+ } else {
+ pbdev->fh |= FH_MASK_ENABLE;
+ }
+
pbdev->state = ZPCI_FS_ENABLED;
stl_p(&ressetpci->fh, pbdev->fh);
stw_p(&ressetpci->hdr.rsp, CLP_RC_OK);
@@ -257,6 +302,13 @@ int clp_service_call(S390CPU *cpu, uint8_t r2, uintptr_t ra)
goto out;
}
device_legacy_reset(DEVICE(pbdev));
+ if (pbdev->interp) {
+ if (clp_disable_interp(pbdev)) {
+ stw_p(&ressetpci->hdr.rsp, CLP_RC_SETPCIFN_ERR);
+ goto out;
+ }
+ }
+ /* Mask off the enabled bit for interpreted devices too */
pbdev->fh &= ~FH_MASK_ENABLE;
pbdev->state = ZPCI_FS_DISABLED;
stl_p(&ressetpci->fh, pbdev->fh);
diff --git a/hw/s390x/s390-pci-kvm.c b/hw/s390x/s390-pci-kvm.c
new file mode 100644
index 0000000000..755ea0618a
--- /dev/null
+++ b/hw/s390x/s390-pci-kvm.c
@@ -0,0 +1,112 @@
+/*
+ * s390 zPCI KVM interfaces
+ *
+ * Copyright 2022 IBM Corp.
+ * Author(s): Matthew Rosato <mjrosato@linux.ibm.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or (at
+ * your option) any later version. See the COPYING file in the top-level
+ * directory.
+ */
+
+#include "qemu/osdep.h"
+
+#include <linux/kvm.h>
+
+#include "kvm/kvm_s390x.h"
+#include "hw/s390x/s390-pci-bus.h"
+#include "hw/s390x/s390-pci-kvm.h"
+#include "hw/s390x/s390-pci-vfio.h"
+
+bool s390_pci_kvm_zpciop_allowed(void)
+{
+ return s390_has_feat(S390_FEAT_ZPCI_INTERP) && kvm_s390_get_zpci_op();
+}
+
+int s390_pci_kvm_plug(S390PCIBusDevice *pbdev)
+{
+ int rc;
+
+ struct kvm_s390_zpci_op args = {
+ .op = KVM_S390_ZPCIOP_INIT
+ };
+
+ if (!s390_pci_get_host_fh(pbdev, &args.fh)) {
+ return -EINVAL;
+ }
+
+ rc = kvm_vm_ioctl(kvm_state, KVM_S390_ZPCI_OP, &args);
+ if (!rc) {
+ /*
+ * The host device is already in an enabled state, but we always present
+ * the initial device state to the guest as disabled (ZPCI_FS_DISABLED).
+ * Therefore, mask off the enable bit from the passthrough handle until
+ * the guest issues a CLP SET PCI FN later to enable the device.
+ */
+ pbdev->fh = (args.newfh & ~FH_MASK_ENABLE);
+ }
+
+ return rc;
+}
+
+int s390_pci_kvm_unplug(S390PCIBusDevice *pbdev)
+{
+ struct kvm_s390_zpci_op args = {
+ .fh = pbdev->fh | FH_MASK_ENABLE,
+ .op = KVM_S390_ZPCIOP_END
+ };
+
+ return kvm_vm_ioctl(kvm_state, KVM_S390_ZPCI_OP, &args);
+}
+
+int s390_pci_kvm_interp_enable(S390PCIBusDevice *pbdev)
+{
+ uint32_t fh;
+ int rc;
+
+ struct kvm_s390_zpci_op args = {
+ .fh = pbdev->fh | FH_MASK_ENABLE,
+ .op = KVM_S390_ZPCIOP_START_INTERP
+ };
+
+ retry:
+ rc = kvm_vm_ioctl(kvm_state, KVM_S390_ZPCI_OP, &args);
+
+ if (rc == -ENODEV) {
+ /*
+ * If the function wasn't found, re-sync the function handle with vfio
+ * and if a change is detected, retry the operation with the new fh.
+ * This can happen while the device is disabled to the guest due to
+ * vfio-triggered events (e.g. vfio hot reset for ISM during plug)
+ */
+ if (!s390_pci_get_host_fh(pbdev, &fh)) {
+ return -EINVAL;
+ }
+ if (fh != args.fh) {
+ args.fh = fh;
+ goto retry;
+ }
+ }
+ if (!rc) {
+ pbdev->fh = args.newfh;
+ }
+
+ return rc;
+}
+
+int s390_pci_kvm_interp_disable(S390PCIBusDevice *pbdev)
+{
+ int rc;
+
+ struct kvm_s390_zpci_op args = {
+ .fh = pbdev->fh,
+ .op = KVM_S390_ZPCIOP_STOP_INTERP
+ };
+
+ rc = kvm_vm_ioctl(kvm_state, KVM_S390_ZPCI_OP, &args);
+ if (!rc) {
+ pbdev->fh = args.newfh;
+ }
+
+ return rc;
+}
diff --git a/include/hw/s390x/s390-pci-bus.h b/include/hw/s390x/s390-pci-bus.h
index da3cde2bb4..a9843dfe97 100644
--- a/include/hw/s390x/s390-pci-bus.h
+++ b/include/hw/s390x/s390-pci-bus.h
@@ -350,6 +350,7 @@ struct S390PCIBusDevice {
IndAddr *indicator;
bool pci_unplug_request_processed;
bool unplug_requested;
+ bool interp;
QTAILQ_ENTRY(S390PCIBusDevice) link;
};
diff --git a/include/hw/s390x/s390-pci-kvm.h b/include/hw/s390x/s390-pci-kvm.h
new file mode 100644
index 0000000000..6b2528cf82
--- /dev/null
+++ b/include/hw/s390x/s390-pci-kvm.h
@@ -0,0 +1,46 @@
+/*
+ * s390 PCI KVM interfaces
+ *
+ * Copyright 2022 IBM Corp.
+ * Author(s): Matthew Rosato <mjrosato@linux.ibm.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or (at
+ * your option) any later version. See the COPYING file in the top-level
+ * directory.
+ */
+
+#ifndef HW_S390_PCI_KVM_H
+#define HW_S390_PCI_KVM_H
+
+#include "hw/s390x/s390-pci-bus.h"
+
+#ifdef CONFIG_KVM
+bool s390_pci_kvm_zpciop_allowed(void);
+int s390_pci_kvm_plug(S390PCIBusDevice *pbdev);
+int s390_pci_kvm_unplug(S390PCIBusDevice *pbdev);
+int s390_pci_kvm_interp_enable(S390PCIBusDevice *pbdev);
+int s390_pci_kvm_interp_disable(S390PCIBusDevice *pbdev);
+#else
+static inline bool s390_pci_kvm_zpciop_allowed(void)
+{
+ return false;
+}
+static inline int s390_pci_kvm_plug(S390PCIBusDevice *pbdev)
+{
+ return -EINVAL;
+}
+static inline int s390_pci_kvm_unplug(S390PCIBusDevice *pbdev)
+{
+ return -EINVAL;
+}
+static inline int s390_pci_kvm_interp_enable(S390PCIBusDevice *pbdev)
+{
+ return -EINVAL;
+}
+static inline int s390_pci_kvm_interp_disable(S390PCIBusDevice *pbdev)
+{
+ return -EINVAL;
+}
+#endif
+
+#endif
diff --git a/target/s390x/kvm/kvm.c b/target/s390x/kvm/kvm.c
index 0357bfda89..288fbd1d75 100644
--- a/target/s390x/kvm/kvm.c
+++ b/target/s390x/kvm/kvm.c
@@ -157,6 +157,7 @@ static int cap_ri;
static int cap_hpage_1m;
static int cap_vcpu_resets;
static int cap_protected;
+static int cap_zpci_op;
static int active_cmma;
@@ -358,6 +359,7 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
cap_s390_irq = kvm_check_extension(s, KVM_CAP_S390_INJECT_IRQ);
cap_vcpu_resets = kvm_check_extension(s, KVM_CAP_S390_VCPU_RESETS);
cap_protected = kvm_check_extension(s, KVM_CAP_S390_PROTECTED);
+ cap_zpci_op = kvm_check_extension(s, KVM_CAP_S390_ZPCI_OP);
kvm_vm_enable_cap(s, KVM_CAP_S390_USER_SIGP, 0);
kvm_vm_enable_cap(s, KVM_CAP_S390_VECTOR_REGISTERS, 0);
@@ -2567,3 +2569,8 @@ bool kvm_arch_cpu_check_are_resettable(void)
{
return true;
}
+
+int kvm_s390_get_zpci_op(void)
+{
+ return cap_zpci_op;
+}
diff --git a/target/s390x/kvm/kvm_s390x.h b/target/s390x/kvm/kvm_s390x.h
index 05a5e1e6f4..aaae8570de 100644
--- a/target/s390x/kvm/kvm_s390x.h
+++ b/target/s390x/kvm/kvm_s390x.h
@@ -27,6 +27,7 @@ void kvm_s390_vcpu_interrupt_pre_save(S390CPU *cpu);
int kvm_s390_vcpu_interrupt_post_load(S390CPU *cpu);
int kvm_s390_get_hpage_1m(void);
int kvm_s390_get_ri(void);
+int kvm_s390_get_zpci_op(void);
int kvm_s390_get_clock(uint8_t *tod_high, uint64_t *tod_clock);
int kvm_s390_get_clock_ext(uint8_t *tod_high, uint64_t *tod_clock);
int kvm_s390_set_clock(uint8_t tod_high, uint64_t tod_clock);
--
2.27.0
* [PATCH v4 06/11] s390x/pci: don't fence interpreted devices without MSI-X
2022-03-14 19:49 [PATCH v4 00/11] s390x/pci: zPCI interpretation support Matthew Rosato
` (4 preceding siblings ...)
2022-03-14 19:49 ` [PATCH v4 05/11] s390x/pci: enable for load/store interpretation Matthew Rosato
@ 2022-03-14 19:49 ` Matthew Rosato
2022-03-14 19:49 ` [PATCH v4 07/11] s390x/pci: enable adapter event notification for interpreted devices Matthew Rosato
` (4 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: Matthew Rosato @ 2022-03-14 19:49 UTC (permalink / raw)
To: qemu-s390x
Cc: farman, kvm, pmorel, schnelle, cohuck, richard.henderson, thuth,
qemu-devel, pasic, alex.williamson, mst, pbonzini, david,
borntraeger
Lack of MSI-X support is not an issue for interpreted passthrough
devices, so let's let these in. This will allow, for example, ISM
devices to be passed through -- but only when interpretation is
available and being used.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
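Note for readers (not part of the commit message): the changed condition
amounts to a two-input truth table. The sketch below is a standalone
illustration only. zpci_plug_fenced() is a hypothetical helper that does not
exist in QEMU, and msix_init_failed stands in for a nonzero return from
s390_pci_msix_init().

```c
#include <stdbool.h>

/*
 * Hypothetical helper, for illustration only (not a QEMU function).
 * Returns true when the plug must be rejected: MSI-X setup failed and the
 * device is not interpreted.
 */
static bool zpci_plug_fenced(bool msix_init_failed, bool interp)
{
    return msix_init_failed && !interp;
}
```

Only the (failed, not interpreted) combination is fenced, so an intercepted
device without MSI-X is still rejected as before.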
hw/s390x/s390-pci-bus.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
index 7ce7bda26d..acc91af64c 100644
--- a/hw/s390x/s390-pci-bus.c
+++ b/hw/s390x/s390-pci-bus.c
@@ -1087,7 +1087,7 @@ static void s390_pcihost_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
pbdev->interp = false;
}
- if (s390_pci_msix_init(pbdev)) {
+ if (s390_pci_msix_init(pbdev) && !pbdev->interp) {
error_setg(errp, "MSI-X support is mandatory "
"in the S390 architecture");
return;
--
2.27.0
* [PATCH v4 07/11] s390x/pci: enable adapter event notification for interpreted devices
2022-03-14 19:49 [PATCH v4 00/11] s390x/pci: zPCI interpretation support Matthew Rosato
` (5 preceding siblings ...)
2022-03-14 19:49 ` [PATCH v4 06/11] s390x/pci: don't fence interpreted devices without MSI-X Matthew Rosato
@ 2022-03-14 19:49 ` Matthew Rosato
2022-03-14 19:49 ` [PATCH v4 08/11] s390x/pci: use KVM-managed IOMMU for interpretation Matthew Rosato
` (3 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: Matthew Rosato @ 2022-03-14 19:49 UTC (permalink / raw)
To: qemu-s390x
Cc: farman, kvm, pmorel, schnelle, cohuck, richard.henderson, thuth,
qemu-devel, pasic, alex.williamson, mst, pbonzini, david,
borntraeger
Use the associated KVM ioctl operation to enable adapter event notification
and forwarding for devices when requested. This feature will be set up
with or without firmware assist based upon the 'forwarding_assist' setting.
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
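Note for readers (not part of the commit message): a minimal sketch of how
the two properties interact, as I read the diff below. All names prefixed
with sketch_ or SKETCH_ are hypothetical, and the flag value is an assumed
stand-in for KVM_S390_ZPCIOP_REGINT_HOST, not taken from the kernel headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for KVM_S390_ZPCIOP_REGINT_HOST; the value is an assumption. */
#define SKETCH_REGINT_HOST (1u << 0)

/* Hypothetical model of the two per-device properties. */
struct sketch_zpci_props {
    bool interp;             /* "interpret" property */
    bool forwarding_assist;  /* "forwarding_assist" property */
};

/* Plug-time normalisation: the assist only applies when interpreting. */
static void sketch_resolve_props(struct sketch_zpci_props *p,
                                 bool interp_usable)
{
    if (!interp_usable) {
        p->interp = false;
    }
    if (!p->interp) {
        p->forwarding_assist = false;
    }
}

/* Flags passed along with the REG_INT operation. */
static uint8_t sketch_regint_flags(const struct sketch_zpci_props *p)
{
    /* No flag means firmware assist; the HOST flag means host delivery. */
    return p->forwarding_assist ? 0 : SKETCH_REGINT_HOST;
}
```

With interpret=on and forwarding_assist=off the operation carries the host
delivery flag. With both on, no flag is set and the firmware assist is used.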
hw/s390x/s390-pci-bus.c | 20 ++++++++++++++---
hw/s390x/s390-pci-inst.c | 40 +++++++++++++++++++++++++++++++--
hw/s390x/s390-pci-kvm.c | 27 ++++++++++++++++++++++
include/hw/s390x/s390-pci-bus.h | 1 +
include/hw/s390x/s390-pci-kvm.h | 12 ++++++++++
5 files changed, 95 insertions(+), 5 deletions(-)
diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
index acc91af64c..5043b8c85c 100644
--- a/hw/s390x/s390-pci-bus.c
+++ b/hw/s390x/s390-pci-bus.c
@@ -190,7 +190,10 @@ void s390_pci_sclp_deconfigure(SCCB *sccb)
rc = SCLP_RC_NO_ACTION_REQUIRED;
break;
default:
- if (pbdev->summary_ind) {
+ if (pbdev->interp && (pbdev->fh & FH_MASK_ENABLE)) {
+ /* Interpreted devices were using interrupt forwarding */
+ s390_pci_kvm_aif_disable(pbdev);
+ } else if (pbdev->summary_ind) {
pci_dereg_irqs(pbdev);
}
if (pbdev->iommu->enabled) {
@@ -1072,6 +1075,7 @@ static void s390_pcihost_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
} else {
DPRINTF("zPCI interpretation facilities missing.\n");
pbdev->interp = false;
+ pbdev->forwarding_assist = false;
}
}
pbdev->iommu->dma_limit = s390_pci_start_dma_count(s, pbdev);
@@ -1080,11 +1084,13 @@ static void s390_pcihost_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
if (!pbdev->interp) {
/* Do vfio passthrough but intercept for I/O */
pbdev->fh |= FH_SHM_VFIO;
+ pbdev->forwarding_assist = false;
}
} else {
pbdev->fh |= FH_SHM_EMUL;
/* Always intercept emulated devices */
pbdev->interp = false;
+ pbdev->forwarding_assist = false;
}
if (s390_pci_msix_init(pbdev) && !pbdev->interp) {
@@ -1241,7 +1247,10 @@ static void s390_pcihost_reset(DeviceState *dev)
/* Process all pending unplug requests */
QTAILQ_FOREACH_SAFE(pbdev, &s->zpci_devs, link, next) {
if (pbdev->unplug_requested) {
- if (pbdev->summary_ind) {
+ if (pbdev->interp && (pbdev->fh & FH_MASK_ENABLE)) {
+ /* Interpreted devices were using interrupt forwarding */
+ s390_pci_kvm_aif_disable(pbdev);
+ } else if (pbdev->summary_ind) {
pci_dereg_irqs(pbdev);
}
if (pbdev->iommu->enabled) {
@@ -1379,7 +1388,10 @@ static void s390_pci_device_reset(DeviceState *dev)
break;
}
- if (pbdev->summary_ind) {
+ if (pbdev->interp && (pbdev->fh & FH_MASK_ENABLE)) {
+ /* Interpreted devices were using interrupt forwarding */
+ s390_pci_kvm_aif_disable(pbdev);
+ } else if (pbdev->summary_ind) {
pci_dereg_irqs(pbdev);
}
if (pbdev->iommu->enabled) {
@@ -1425,6 +1437,8 @@ static Property s390_pci_device_properties[] = {
DEFINE_PROP_S390_PCI_FID("fid", S390PCIBusDevice, fid),
DEFINE_PROP_STRING("target", S390PCIBusDevice, target),
DEFINE_PROP_BOOL("interpret", S390PCIBusDevice, interp, true),
+ DEFINE_PROP_BOOL("forwarding_assist", S390PCIBusDevice, forwarding_assist,
+ true),
DEFINE_PROP_END_OF_LIST(),
};
diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
index 92ea7b73e4..f7b01e2059 100644
--- a/hw/s390x/s390-pci-inst.c
+++ b/hw/s390x/s390-pci-inst.c
@@ -1102,6 +1102,32 @@ static void fmb_update(void *opaque)
timer_mod(pbdev->fmb_timer, t + pbdev->pci_group->zpci_group.mui);
}
+static int mpcifc_reg_int_interp(S390PCIBusDevice *pbdev, ZpciFib *fib)
+{
+ int rc;
+
+ rc = s390_pci_kvm_aif_enable(pbdev, fib, pbdev->forwarding_assist);
+ if (rc) {
+ DPRINTF("Failed to enable interrupt forwarding\n");
+ return rc;
+ }
+
+ return 0;
+}
+
+static int mpcifc_dereg_int_interp(S390PCIBusDevice *pbdev, ZpciFib *fib)
+{
+ int rc;
+
+ rc = s390_pci_kvm_aif_disable(pbdev);
+ if (rc) {
+ DPRINTF("Failed to disable interrupt forwarding\n");
+ return rc;
+ }
+
+ return 0;
+}
+
int mpcifc_service_call(S390CPU *cpu, uint8_t r1, uint64_t fiba, uint8_t ar,
uintptr_t ra)
{
@@ -1156,7 +1182,12 @@ int mpcifc_service_call(S390CPU *cpu, uint8_t r1, uint64_t fiba, uint8_t ar,
switch (oc) {
case ZPCI_MOD_FC_REG_INT:
- if (pbdev->summary_ind) {
+ if (pbdev->interp) {
+ if (mpcifc_reg_int_interp(pbdev, &fib)) {
+ cc = ZPCI_PCI_LS_ERR;
+ s390_set_status_code(env, r1, ZPCI_MOD_ST_SEQUENCE);
+ }
+ } else if (pbdev->summary_ind) {
cc = ZPCI_PCI_LS_ERR;
s390_set_status_code(env, r1, ZPCI_MOD_ST_SEQUENCE);
} else if (reg_irqs(env, pbdev, fib)) {
@@ -1165,7 +1196,12 @@ int mpcifc_service_call(S390CPU *cpu, uint8_t r1, uint64_t fiba, uint8_t ar,
}
break;
case ZPCI_MOD_FC_DEREG_INT:
- if (!pbdev->summary_ind) {
+ if (pbdev->interp) {
+ if (mpcifc_dereg_int_interp(pbdev, &fib)) {
+ cc = ZPCI_PCI_LS_ERR;
+ s390_set_status_code(env, r1, ZPCI_MOD_ST_SEQUENCE);
+ }
+ } else if (!pbdev->summary_ind) {
cc = ZPCI_PCI_LS_ERR;
s390_set_status_code(env, r1, ZPCI_MOD_ST_SEQUENCE);
} else {
diff --git a/hw/s390x/s390-pci-kvm.c b/hw/s390x/s390-pci-kvm.c
index 755ea0618a..ebb862abd0 100644
--- a/hw/s390x/s390-pci-kvm.c
+++ b/hw/s390x/s390-pci-kvm.c
@@ -16,6 +16,7 @@
#include "kvm/kvm_s390x.h"
#include "hw/s390x/s390-pci-bus.h"
#include "hw/s390x/s390-pci-kvm.h"
+#include "hw/s390x/s390-pci-inst.h"
#include "hw/s390x/s390-pci-vfio.h"
bool s390_pci_kvm_zpciop_allowed(void)
@@ -110,3 +111,29 @@ int s390_pci_kvm_interp_disable(S390PCIBusDevice *pbdev)
return rc;
}
+
+int s390_pci_kvm_aif_enable(S390PCIBusDevice *pbdev, ZpciFib *fib, bool assist)
+{
+ struct kvm_s390_zpci_op args = {
+ .fh = pbdev->fh,
+ .op = KVM_S390_ZPCIOP_REG_INT,
+ .u.reg_int.ibv = fib->aibv,
+ .u.reg_int.sb = fib->aisb,
+ .u.reg_int.noi = FIB_DATA_NOI(fib->data),
+ .u.reg_int.isc = FIB_DATA_ISC(fib->data),
+ .u.reg_int.sbo = FIB_DATA_AISBO(fib->data),
+ .u.reg_int.flags = (assist) ? 0 : KVM_S390_ZPCIOP_REGINT_HOST
+ };
+
+ return kvm_vm_ioctl(kvm_state, KVM_S390_ZPCI_OP, &args);
+}
+
+int s390_pci_kvm_aif_disable(S390PCIBusDevice *pbdev)
+{
+ struct kvm_s390_zpci_op args = {
+ .fh = pbdev->fh,
+ .op = KVM_S390_ZPCIOP_DEREG_INT
+ };
+
+ return kvm_vm_ioctl(kvm_state, KVM_S390_ZPCI_OP, &args);
+}
diff --git a/include/hw/s390x/s390-pci-bus.h b/include/hw/s390x/s390-pci-bus.h
index a9843dfe97..5b09f0cf2f 100644
--- a/include/hw/s390x/s390-pci-bus.h
+++ b/include/hw/s390x/s390-pci-bus.h
@@ -351,6 +351,7 @@ struct S390PCIBusDevice {
bool pci_unplug_request_processed;
bool unplug_requested;
bool interp;
+ bool forwarding_assist;
QTAILQ_ENTRY(S390PCIBusDevice) link;
};
diff --git a/include/hw/s390x/s390-pci-kvm.h b/include/hw/s390x/s390-pci-kvm.h
index 6b2528cf82..09004d3f6c 100644
--- a/include/hw/s390x/s390-pci-kvm.h
+++ b/include/hw/s390x/s390-pci-kvm.h
@@ -13,6 +13,7 @@
#define HW_S390_PCI_KVM_H
#include "hw/s390x/s390-pci-bus.h"
+#include "hw/s390x/s390-pci-inst.h"
#ifdef CONFIG_KVM
bool s390_pci_kvm_zpciop_allowed(void);
@@ -20,6 +21,8 @@ int s390_pci_kvm_plug(S390PCIBusDevice *pbdev);
int s390_pci_kvm_unplug(S390PCIBusDevice *pbdev);
int s390_pci_kvm_interp_enable(S390PCIBusDevice *pbdev);
int s390_pci_kvm_interp_disable(S390PCIBusDevice *pbdev);
+int s390_pci_kvm_aif_enable(S390PCIBusDevice *pbdev, ZpciFib *fib, bool assist);
+int s390_pci_kvm_aif_disable(S390PCIBusDevice *pbdev);
#else
static inline bool s390_pci_kvm_zpciop_allowed(void)
{
@@ -41,6 +44,15 @@ static inline int s390_pci_kvm_interp_enable(S390PCIBusDevice *pbdev)
{
return -EINVAL;
}
+static inline int s390_pci_kvm_aif_enable(S390PCIBusDevice *pbdev, ZpciFib *fib,
+ bool assist)
+{
+ return -EINVAL;
+}
+static inline int s390_pci_kvm_aif_disable(S390PCIBusDevice *pbdev)
+{
+ return -EINVAL;
+}
#endif
#endif
--
2.27.0
* [PATCH v4 08/11] s390x/pci: use KVM-managed IOMMU for interpretation
2022-03-14 19:49 [PATCH v4 00/11] s390x/pci: zPCI interpretation support Matthew Rosato
` (6 preceding siblings ...)
2022-03-14 19:49 ` [PATCH v4 07/11] s390x/pci: enable adapter event notification for interpreted devices Matthew Rosato
@ 2022-03-14 19:49 ` Matthew Rosato
2022-03-14 19:49 ` [PATCH v4 09/11] s390x/pci: use I/O Address Translation assist when interpreting Matthew Rosato
` (2 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: Matthew Rosato @ 2022-03-14 19:49 UTC (permalink / raw)
To: qemu-s390x
Cc: farman, kvm, pmorel, schnelle, cohuck, richard.henderson, thuth,
qemu-devel, pasic, alex.williamson, mst, pbonzini, david,
borntraeger
When interpreting zPCI instructions, KVM will control the IOMMU mappings
in response to RPCIT instructions rather than relying on mapping ioctls
from userspace. Mark the vfio device in pre_plug so that the appropriate
IOMMU domain will be allocated on the host during VFIO_SET_IOMMU.
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
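Note for readers (not part of the commit message): a simplified, standalone
sketch of the pre_plug decision. The structures and sketch_set_kvm_iommu()
are hypothetical stand-ins for S390PCIBusDevice, VFIOPCIDevice and
s390_pci_set_kvm_iommu(). The array scan replaces the lookup by target id.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the QEMU device structures. */
struct sketch_zpci { const char *target; bool interp; };
struct sketch_vfio { const char *id; bool kvm_managed_iommu; };

/*
 * Find the zpci device whose "target" names this vfio device and, if it
 * requests interpretation, flag the vfio device for the KVM-managed IOMMU.
 */
static int sketch_set_kvm_iommu(const struct sketch_zpci *zdevs, size_t n,
                                struct sketch_vfio *vdev)
{
    for (size_t i = 0; i < n; i++) {
        if (zdevs[i].target && vdev->id &&
            strcmp(zdevs[i].target, vdev->id) == 0) {
            if (zdevs[i].interp) {
                vdev->kvm_managed_iommu = true;
            }
            return 0;
        }
    }
    return -ENODEV; /* no zpci device is linked to this vfio device */
}
```

Note that a vfio device with no linked zpci device yields -ENODEV in this
sketch, mirroring the !pbdev check in the diff.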
hw/s390x/s390-pci-bus.c | 11 +++++++++++
hw/s390x/s390-pci-vfio.c | 22 ++++++++++++++++++++++
include/hw/s390x/s390-pci-vfio.h | 5 +++++
3 files changed, 38 insertions(+)
diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
index 5043b8c85c..513a276711 100644
--- a/hw/s390x/s390-pci-bus.c
+++ b/hw/s390x/s390-pci-bus.c
@@ -950,6 +950,17 @@ static void s390_pcihost_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
error_setg(errp, "multifunction not supported in s390");
return;
}
+ /*
+ * If we have a vfio-pci device that wishes to use interpretation
+ * we must update the host IOMMU domain ops.
+ */
+ if (s390_pci_kvm_zpciop_allowed() &&
+ object_dynamic_cast(OBJECT(dev), "vfio-pci")) {
+ if (s390_pci_set_kvm_iommu(s, dev)) {
+ error_setg(errp, "KVM IOMMU not available for interpretation");
+ return;
+ }
+ }
} else if (object_dynamic_cast(OBJECT(dev), TYPE_S390_PCI_DEVICE)) {
S390PCIBusDevice *pbdev = S390_PCI_DEVICE(dev);
diff --git a/hw/s390x/s390-pci-vfio.c b/hw/s390x/s390-pci-vfio.c
index 4bf0a7e22d..7808e8d939 100644
--- a/hw/s390x/s390-pci-vfio.c
+++ b/hw/s390x/s390-pci-vfio.c
@@ -324,3 +324,25 @@ void s390_pci_get_clp_info(S390PCIBusDevice *pbdev)
return;
}
+
+/*
+ * This function will determine if the specified VFIOPCIDevice is linked to a
+ * zPCI device that requests interpretation support. In this case, we must
+ * inform vfio that the KVM-managed IOMMU should be requested when the
+ * VFIO_SET_IOMMU ioctl is issued.
+ */
+int s390_pci_set_kvm_iommu(S390pciState *s, DeviceState *dev)
+{
+ VFIOPCIDevice *vdev = VFIO_PCI(dev);
+ S390PCIBusDevice *pbdev = s390_pci_find_dev_by_target(s, dev->id);
+
+ if (!pbdev) {
+ return -ENODEV;
+ }
+
+ if (pbdev->interp) {
+ vdev->kvm_managed_iommu = true;
+ }
+
+ return 0;
+}
diff --git a/include/hw/s390x/s390-pci-vfio.h b/include/hw/s390x/s390-pci-vfio.h
index 0c2e4b5175..5026f978c2 100644
--- a/include/hw/s390x/s390-pci-vfio.h
+++ b/include/hw/s390x/s390-pci-vfio.h
@@ -22,6 +22,7 @@ S390PCIDMACount *s390_pci_start_dma_count(S390pciState *s,
void s390_pci_end_dma_count(S390pciState *s, S390PCIDMACount *cnt);
bool s390_pci_get_host_fh(S390PCIBusDevice *pbdev, uint32_t *fh);
void s390_pci_get_clp_info(S390PCIBusDevice *pbdev);
+int s390_pci_set_kvm_iommu(S390pciState *s, DeviceState *dev);
#else
static inline bool s390_pci_update_dma_avail(int fd, unsigned int *avail)
{
@@ -40,6 +41,10 @@ static inline bool s390_pci_get_host_fh(S390PCIBusDevice *pbdev,
return false;
}
static inline void s390_pci_get_clp_info(S390PCIBusDevice *pbdev) { }
+static inline int s390_pci_set_kvm_iommu(S390pciState *s, DeviceState *dev)
+{
+ return -EINVAL;
+}
#endif
#endif
--
2.27.0
* [PATCH v4 09/11] s390x/pci: use I/O Address Translation assist when interpreting
2022-03-14 19:49 [PATCH v4 00/11] s390x/pci: zPCI interpretation support Matthew Rosato
` (7 preceding siblings ...)
2022-03-14 19:49 ` [PATCH v4 08/11] s390x/pci: use KVM-managed IOMMU for interpretation Matthew Rosato
@ 2022-03-14 19:49 ` Matthew Rosato
2022-03-14 19:49 ` [PATCH v4 10/11] s390x/pci: use dtsm provided from vfio capabilities for interpreted devices Matthew Rosato
2022-03-14 19:49 ` [PATCH v4 11/11] s390x/pci: let intercept devices have separate PCI groups Matthew Rosato
10 siblings, 0 replies; 12+ messages in thread
From: Matthew Rosato @ 2022-03-14 19:49 UTC (permalink / raw)
To: qemu-s390x
Cc: farman, kvm, pmorel, schnelle, cohuck, richard.henderson, thuth,
qemu-devel, pasic, alex.williamson, mst, pbonzini, david,
borntraeger
Allow the underlying KVM host to handle Refresh PCI Translation (RPCIT)
instruction intercepts.
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
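Note for readers (not part of the commit message): a standalone sketch of
why pci_dereg_ioat() now takes the device rather than the IOMMU: the
register/deregister paths need the interp flag to pick a branch. Everything
here is a hypothetical simplification. The two counters stand in for the
KVM ioctl and for QEMU's own IOMMU setup, and setting iommu_enabled on the
emulated path is my assumption about code not shown in this diff.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified device state (not the QEMU structures). */
struct sketch_dev {
    bool interp;         /* device uses interpretation */
    bool iommu_enabled;  /* mirrors pbdev->iommu->enabled */
    uint64_t host_iota;  /* what the host was last told (sketch only) */
    int emulated_regs;   /* registrations taken on the emulated path */
};

/* Register guest IOAT: interpreted devices hand the IOTA to the host. */
static int sketch_reg_ioat(struct sketch_dev *d, uint64_t iota)
{
    if (d->interp) {
        d->host_iota = iota;   /* stands in for the REG_IOAT operation */
    } else {
        d->emulated_regs++;    /* stands in for QEMU's emulated IOMMU */
    }
    d->iommu_enabled = true;
    return 0;
}

/* Deregister: the interpreted branch only has to notify the host. */
static void sketch_dereg_ioat(struct sketch_dev *d)
{
    if (d->interp) {
        d->host_iota = 0;      /* stands in for the DEREG_IOAT operation */
    }
    d->iommu_enabled = false;
}
```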
hw/s390x/s390-pci-bus.c | 6 ++---
hw/s390x/s390-pci-inst.c | 42 +++++++++++++++++++++++++++++---
hw/s390x/s390-pci-kvm.c | 21 ++++++++++++++++
include/hw/s390x/s390-pci-inst.h | 2 +-
include/hw/s390x/s390-pci-kvm.h | 10 ++++++++
5 files changed, 74 insertions(+), 7 deletions(-)
diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
index 513a276711..32894fbe9c 100644
--- a/hw/s390x/s390-pci-bus.c
+++ b/hw/s390x/s390-pci-bus.c
@@ -197,7 +197,7 @@ void s390_pci_sclp_deconfigure(SCCB *sccb)
pci_dereg_irqs(pbdev);
}
if (pbdev->iommu->enabled) {
- pci_dereg_ioat(pbdev->iommu);
+ pci_dereg_ioat(pbdev);
}
pbdev->state = ZPCI_FS_STANDBY;
rc = SCLP_RC_NORMAL_COMPLETION;
@@ -1265,7 +1265,7 @@ static void s390_pcihost_reset(DeviceState *dev)
pci_dereg_irqs(pbdev);
}
if (pbdev->iommu->enabled) {
- pci_dereg_ioat(pbdev->iommu);
+ pci_dereg_ioat(pbdev);
}
pbdev->state = ZPCI_FS_STANDBY;
s390_pci_perform_unplug(pbdev);
@@ -1406,7 +1406,7 @@ static void s390_pci_device_reset(DeviceState *dev)
pci_dereg_irqs(pbdev);
}
if (pbdev->iommu->enabled) {
- pci_dereg_ioat(pbdev->iommu);
+ pci_dereg_ioat(pbdev);
}
fmb_timer_free(pbdev);
diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
index f7b01e2059..86bb04e859 100644
--- a/hw/s390x/s390-pci-inst.c
+++ b/hw/s390x/s390-pci-inst.c
@@ -969,6 +969,19 @@ int pci_dereg_irqs(S390PCIBusDevice *pbdev)
return 0;
}
+static int reg_ioat_interp(S390PCIBusDevice *pbdev, uint64_t iota)
+{
+ int rc;
+
+ rc = s390_pci_kvm_ioat_enable(pbdev, iota);
+ if (rc) {
+ return rc;
+ }
+
+ pbdev->iommu->enabled = true;
+ return 0;
+}
+
static int reg_ioat(CPUS390XState *env, S390PCIBusDevice *pbdev, ZpciFib fib,
uintptr_t ra)
{
@@ -986,6 +999,16 @@ static int reg_ioat(CPUS390XState *env, S390PCIBusDevice *pbdev, ZpciFib fib,
return -EINVAL;
}
+ /* If this is an interpreted device, we must use the IOAT assist */
+ if (pbdev->interp) {
+ if (reg_ioat_interp(pbdev, g_iota)) {
+ error_report("failure starting ioat assist");
+ s390_program_interrupt(env, PGM_OPERAND, ra);
+ return -EINVAL;
+ }
+ return 0;
+ }
+
/* currently we only support designation type 1 with translation */
if (!(dt == ZPCI_IOTA_RTTO && t)) {
error_report("unsupported ioat dt %d t %d", dt, t);
@@ -1002,8 +1025,21 @@ static int reg_ioat(CPUS390XState *env, S390PCIBusDevice *pbdev, ZpciFib fib,
return 0;
}
-void pci_dereg_ioat(S390PCIIOMMU *iommu)
+static void dereg_ioat_interp(S390PCIBusDevice *pbdev)
+{
+ s390_pci_kvm_ioat_disable(pbdev);
+ pbdev->iommu->enabled = false;
+}
+
+void pci_dereg_ioat(S390PCIBusDevice *pbdev)
{
+ S390PCIIOMMU *iommu = pbdev->iommu;
+
+ if (pbdev->interp) {
+ dereg_ioat_interp(pbdev);
+ return;
+ }
+
s390_pci_iommu_disable(iommu);
iommu->pba = 0;
iommu->pal = 0;
@@ -1228,7 +1264,7 @@ int mpcifc_service_call(S390CPU *cpu, uint8_t r1, uint64_t fiba, uint8_t ar,
cc = ZPCI_PCI_LS_ERR;
s390_set_status_code(env, r1, ZPCI_MOD_ST_SEQUENCE);
} else {
- pci_dereg_ioat(pbdev->iommu);
+ pci_dereg_ioat(pbdev);
}
break;
case ZPCI_MOD_FC_REREG_IOAT:
@@ -1239,7 +1275,7 @@ int mpcifc_service_call(S390CPU *cpu, uint8_t r1, uint64_t fiba, uint8_t ar,
cc = ZPCI_PCI_LS_ERR;
s390_set_status_code(env, r1, ZPCI_MOD_ST_SEQUENCE);
} else {
- pci_dereg_ioat(pbdev->iommu);
+ pci_dereg_ioat(pbdev);
if (reg_ioat(env, pbdev, fib, ra)) {
cc = ZPCI_PCI_LS_ERR;
s390_set_status_code(env, r1, ZPCI_MOD_ST_INSUF_RES);
diff --git a/hw/s390x/s390-pci-kvm.c b/hw/s390x/s390-pci-kvm.c
index ebb862abd0..2332efd676 100644
--- a/hw/s390x/s390-pci-kvm.c
+++ b/hw/s390x/s390-pci-kvm.c
@@ -137,3 +137,24 @@ int s390_pci_kvm_aif_disable(S390PCIBusDevice *pbdev)
return kvm_vm_ioctl(kvm_state, KVM_S390_ZPCI_OP, &args);
}
+
+int s390_pci_kvm_ioat_enable(S390PCIBusDevice *pbdev, uint64_t iota)
+{
+ struct kvm_s390_zpci_op args = {
+ .fh = pbdev->fh,
+ .op = KVM_S390_ZPCIOP_REG_IOAT,
+ .u.reg_ioat.iota = iota
+ };
+
+ return kvm_vm_ioctl(kvm_state, KVM_S390_ZPCI_OP, &args);
+}
+
+int s390_pci_kvm_ioat_disable(S390PCIBusDevice *pbdev)
+{
+ struct kvm_s390_zpci_op args = {
+ .fh = pbdev->fh,
+ .op = KVM_S390_ZPCIOP_DEREG_IOAT
+ };
+
+ return kvm_vm_ioctl(kvm_state, KVM_S390_ZPCI_OP, &args);
+}
diff --git a/include/hw/s390x/s390-pci-inst.h b/include/hw/s390x/s390-pci-inst.h
index a55c448aad..13566fb7b4 100644
--- a/include/hw/s390x/s390-pci-inst.h
+++ b/include/hw/s390x/s390-pci-inst.h
@@ -99,7 +99,7 @@ typedef struct ZpciFib {
} QEMU_PACKED ZpciFib;
int pci_dereg_irqs(S390PCIBusDevice *pbdev);
-void pci_dereg_ioat(S390PCIIOMMU *iommu);
+void pci_dereg_ioat(S390PCIBusDevice *pbdev);
int clp_service_call(S390CPU *cpu, uint8_t r2, uintptr_t ra);
int pcilg_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2, uintptr_t ra);
int pcistg_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2, uintptr_t ra);
diff --git a/include/hw/s390x/s390-pci-kvm.h b/include/hw/s390x/s390-pci-kvm.h
index 09004d3f6c..e52e92d0f0 100644
--- a/include/hw/s390x/s390-pci-kvm.h
+++ b/include/hw/s390x/s390-pci-kvm.h
@@ -23,6 +23,8 @@ int s390_pci_kvm_interp_enable(S390PCIBusDevice *pbdev);
int s390_pci_kvm_interp_disable(S390PCIBusDevice *pbdev);
int s390_pci_kvm_aif_enable(S390PCIBusDevice *pbdev, ZpciFib *fib, bool assist);
int s390_pci_kvm_aif_disable(S390PCIBusDevice *pbdev);
+int s390_pci_kvm_ioat_enable(S390PCIBusDevice *pbdev, uint64_t iota);
+int s390_pci_kvm_ioat_disable(S390PCIBusDevice *pbdev);
#else
static inline bool s390_pci_kvm_zpciop_allowed(void)
{
@@ -53,6 +55,14 @@ static inline int s390_pci_kvm_aif_disable(S390PCIBusDevice *pbdev)
{
return -EINVAL;
}
+static inline int s390_pci_kvm_ioat_enable(S390PCIBusDevice *pbdev, uint64_t iota)
+{
+ return -EINVAL;
+}
+static inline int s390_pci_kvm_ioat_disable(S390PCIBusDevice *pbdev)
+{
+ return -EINVAL;
+}
#endif
#endif
--
2.27.0
* [PATCH v4 10/11] s390x/pci: use dtsm provided from vfio capabilities for interpreted devices
2022-03-14 19:49 [PATCH v4 00/11] s390x/pci: zPCI interpretation support Matthew Rosato
` (8 preceding siblings ...)
2022-03-14 19:49 ` [PATCH v4 09/11] s390x/pci: use I/O Address Translation assist when interpreting Matthew Rosato
@ 2022-03-14 19:49 ` Matthew Rosato
2022-03-14 19:49 ` [PATCH v4 11/11] s390x/pci: let intercept devices have separate PCI groups Matthew Rosato
10 siblings, 0 replies; 12+ messages in thread
From: Matthew Rosato @ 2022-03-14 19:49 UTC (permalink / raw)
To: qemu-s390x
Cc: farman, kvm, pmorel, schnelle, cohuck, richard.henderson, thuth,
qemu-devel, pasic, alex.williamson, mst, pbonzini, david,
borntraeger
When using the IOAT assist via interpretation, we should advertise what
the host driver supports rather than what QEMU supports.
Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
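Note for readers (not part of the commit message): the selection below can
be sketched as a small pure function. sketch_pick_dtsm() is hypothetical,
and SKETCH_QEMU_DTSM is an assumed stand-in for ZPCI_DTSM whose real value
lives in the QEMU headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for QEMU's ZPCI_DTSM; the value here is an assumption. */
#define SKETCH_QEMU_DTSM 0x40

/*
 * Pick the dtsm value reported to the guest. cap_version is the version
 * field of the vfio group capability header. The host-provided cap_dtsm
 * is used only for interpreted devices and only from version 2 on.
 */
static uint8_t sketch_pick_dtsm(uint16_t cap_version, bool interp,
                                uint8_t cap_dtsm)
{
    if (cap_version >= 2 && interp) {
        return cap_dtsm;
    }
    return SKETCH_QEMU_DTSM;
}
```

Intercepted devices keep the QEMU value regardless of what the host
reports, since QEMU still performs the translation for them.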
hw/s390x/s390-pci-vfio.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/hw/s390x/s390-pci-vfio.c b/hw/s390x/s390-pci-vfio.c
index 7808e8d939..def4e90a40 100644
--- a/hw/s390x/s390-pci-vfio.c
+++ b/hw/s390x/s390-pci-vfio.c
@@ -181,7 +181,11 @@ static void s390_pci_read_group(S390PCIBusDevice *pbdev,
resgrp->i = cap->noi;
resgrp->maxstbl = cap->maxstbl;
resgrp->version = cap->version;
- resgrp->dtsm = ZPCI_DTSM;
+ if (hdr->version >= 2 && pbdev->interp) {
+ resgrp->dtsm = cap->dtsm;
+ } else {
+ resgrp->dtsm = ZPCI_DTSM;
+ }
}
}
--
2.27.0
* [PATCH v4 11/11] s390x/pci: let intercept devices have separate PCI groups
2022-03-14 19:49 [PATCH v4 00/11] s390x/pci: zPCI interpretation support Matthew Rosato
` (9 preceding siblings ...)
2022-03-14 19:49 ` [PATCH v4 10/11] s390x/pci: use dtsm provided from vfio capabilities for interpreted devices Matthew Rosato
@ 2022-03-14 19:49 ` Matthew Rosato
10 siblings, 0 replies; 12+ messages in thread
From: Matthew Rosato @ 2022-03-14 19:49 UTC (permalink / raw)
To: qemu-s390x
Cc: farman, kvm, pmorel, schnelle, cohuck, richard.henderson, thuth,
qemu-devel, pasic, alex.williamson, mst, pbonzini, david,
borntraeger
Let's use the reserved pool of simulated PCI groups to allow intercept
devices to have separate groups from interpreted devices as some group
values may be different. If we run out of simulated PCI groups, subsequent
intercept devices just get the default group.
Furthermore, if we encounter any PCI groups from hostdevs that are marked
as simulated, let's just assign them to the default group to avoid
conflicts between host simulated groups and our own simulated groups.
Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
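Note for readers (not part of the commit message): a standalone sketch of
the group assignment policy for intercept devices described above. The
constants are assumed values standing in for ZPCI_SIM_GRP_START and
ZPCI_DEFAULT_FN_GRP, and the fixed-size array replaces the zpci_groups
list. None of the sketch_ names exist in QEMU.

```c
#include <stddef.h>

/* Assumed values; the real constants live in QEMU's s390-pci-bus.h. */
#define SKETCH_SIM_GRP_START   0xF0
#define SKETCH_DEFAULT_FN_GRP  0xFF
#define SKETCH_MAX_GROUPS      16

struct sketch_group { int id; int host_id; };

struct sketch_state {
    struct sketch_group groups[SKETCH_MAX_GROUPS];
    size_t ngroups;
    int next_sim_grp;   /* next free simulated group id */
};

static void sketch_state_init(struct sketch_state *s)
{
    s->ngroups = 0;
    s->next_sim_grp = SKETCH_SIM_GRP_START;
}

/* Return the guest-visible group id for an intercept device. */
static int sketch_intercept_group(struct sketch_state *s, int host_gid)
{
    /* Host groups that are themselves simulated use the default group. */
    if (host_gid >= SKETCH_SIM_GRP_START) {
        return SKETCH_DEFAULT_FN_GRP;
    }
    /* Reuse a simulated group already created for this host group. */
    for (size_t i = 0; i < s->ngroups; i++) {
        if (s->groups[i].host_id == host_gid) {
            return s->groups[i].id;
        }
    }
    /* Pool exhausted: fall back on the default group. */
    if (s->next_sim_grp == SKETCH_DEFAULT_FN_GRP) {
        return SKETCH_DEFAULT_FN_GRP;
    }
    /* Otherwise hand out the next simulated group id. */
    s->groups[s->ngroups].id = s->next_sim_grp++;
    s->groups[s->ngroups].host_id = host_gid;
    return s->groups[s->ngroups++].id;
}
```

Under these assumed values the pool holds 15 simulated groups, so the 16th
distinct host group seen by an intercept device lands in the default group.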
hw/s390x/s390-pci-bus.c | 19 ++++++++++++++--
hw/s390x/s390-pci-vfio.c | 40 ++++++++++++++++++++++++++++++---
include/hw/s390x/s390-pci-bus.h | 6 ++++-
3 files changed, 59 insertions(+), 6 deletions(-)
diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
index 32894fbe9c..e932f72e89 100644
--- a/hw/s390x/s390-pci-bus.c
+++ b/hw/s390x/s390-pci-bus.c
@@ -748,13 +748,14 @@ static void s390_pci_iommu_free(S390pciState *s, PCIBus *bus, int32_t devfn)
object_unref(OBJECT(iommu));
}
-S390PCIGroup *s390_group_create(int id)
+S390PCIGroup *s390_group_create(int id, int host_id)
{
S390PCIGroup *group;
S390pciState *s = s390_get_phb();
group = g_new0(S390PCIGroup, 1);
group->id = id;
+ group->host_id = host_id;
QTAILQ_INSERT_TAIL(&s->zpci_groups, group, link);
return group;
}
@@ -772,12 +773,25 @@ S390PCIGroup *s390_group_find(int id)
return NULL;
}
+S390PCIGroup *s390_group_find_host_sim(int host_id)
+{
+ S390PCIGroup *group;
+ S390pciState *s = s390_get_phb();
+
+ QTAILQ_FOREACH(group, &s->zpci_groups, link) {
+ if (group->id >= ZPCI_SIM_GRP_START && group->host_id == host_id) {
+ return group;
+ }
+ }
+ return NULL;
+}
+
static void s390_pci_init_default_group(void)
{
S390PCIGroup *group;
ClpRspQueryPciGrp *resgrp;
- group = s390_group_create(ZPCI_DEFAULT_FN_GRP);
+ group = s390_group_create(ZPCI_DEFAULT_FN_GRP, ZPCI_DEFAULT_FN_GRP);
resgrp = &group->zpci_group;
resgrp->fr = 1;
resgrp->dasm = 0;
@@ -825,6 +839,7 @@ static void s390_pcihost_realize(DeviceState *dev, Error **errp)
NULL, g_free);
s->zpci_table = g_hash_table_new_full(g_int_hash, g_int_equal, NULL, NULL);
s->bus_no = 0;
+ s->next_sim_grp = ZPCI_SIM_GRP_START;
QTAILQ_INIT(&s->pending_sei);
QTAILQ_INIT(&s->zpci_devs);
QTAILQ_INIT(&s->zpci_dma_limit);
diff --git a/hw/s390x/s390-pci-vfio.c b/hw/s390x/s390-pci-vfio.c
index def4e90a40..de24c4ddf0 100644
--- a/hw/s390x/s390-pci-vfio.c
+++ b/hw/s390x/s390-pci-vfio.c
@@ -150,13 +150,17 @@ static void s390_pci_read_group(S390PCIBusDevice *pbdev,
{
struct vfio_info_cap_header *hdr;
struct vfio_device_info_cap_zpci_group *cap;
+ S390pciState *s = s390_get_phb();
ClpRspQueryPciGrp *resgrp;
VFIOPCIDevice *vpci = container_of(pbdev->pdev, VFIOPCIDevice, pdev);
hdr = vfio_get_device_info_cap(info, VFIO_DEVICE_INFO_CAP_ZPCI_GROUP);
- /* If capability not provided, just use the default group */
- if (hdr == NULL) {
+ /*
+ * If capability not provided or the underlying hostdev is simulated, just
+ * use the default group.
+ */
+ if (hdr == NULL || pbdev->zpci_fn.pfgid >= ZPCI_SIM_GRP_START) {
trace_s390_pci_clp_cap(vpci->vbasedev.name,
VFIO_DEVICE_INFO_CAP_ZPCI_GROUP);
pbdev->zpci_fn.pfgid = ZPCI_DEFAULT_FN_GRP;
@@ -165,11 +169,41 @@ static void s390_pci_read_group(S390PCIBusDevice *pbdev,
}
cap = (void *) hdr;
+ /*
+ * For an intercept device, let's use an existing simulated group if one
+ * was already created for other intercept devices in this group.
+ * If not, create a new simulated group if any are still available.
+ * If all else fails, just fall back on the default group.
+ */
+ if (!pbdev->interp) {
+ pbdev->pci_group = s390_group_find_host_sim(pbdev->zpci_fn.pfgid);
+ if (pbdev->pci_group) {
+ /* Use existing simulated group */
+ pbdev->zpci_fn.pfgid = pbdev->pci_group->id;
+ return;
+ } else {
+ if (s->next_sim_grp == ZPCI_DEFAULT_FN_GRP) {
+ /* All out of simulated groups, use default */
+ trace_s390_pci_clp_cap(vpci->vbasedev.name,
+ VFIO_DEVICE_INFO_CAP_ZPCI_GROUP);
+ pbdev->zpci_fn.pfgid = ZPCI_DEFAULT_FN_GRP;
+ pbdev->pci_group = s390_group_find(ZPCI_DEFAULT_FN_GRP);
+ return;
+ } else {
+ /* We can assign a new simulated group */
+ pbdev->zpci_fn.pfgid = s->next_sim_grp;
+ s->next_sim_grp++;
+ /* Fall through to create the new sim group using CLP info */
+ }
+ }
+ }
+
/* See if the PCI group is already defined, create if not */
pbdev->pci_group = s390_group_find(pbdev->zpci_fn.pfgid);
if (!pbdev->pci_group) {
- pbdev->pci_group = s390_group_create(pbdev->zpci_fn.pfgid);
+ pbdev->pci_group = s390_group_create(pbdev->zpci_fn.pfgid,
+ pbdev->zpci_fn.pfgid);
resgrp = &pbdev->pci_group->zpci_group;
if (cap->flags & VFIO_DEVICE_INFO_ZPCI_FLAG_REFRESH) {
diff --git a/include/hw/s390x/s390-pci-bus.h b/include/hw/s390x/s390-pci-bus.h
index 5b09f0cf2f..0605fcea24 100644
--- a/include/hw/s390x/s390-pci-bus.h
+++ b/include/hw/s390x/s390-pci-bus.h
@@ -315,13 +315,16 @@ typedef struct ZpciFmb {
QEMU_BUILD_BUG_MSG(offsetof(ZpciFmb, fmt0) != 48, "padding in ZpciFmb");
#define ZPCI_DEFAULT_FN_GRP 0xFF
+#define ZPCI_SIM_GRP_START 0xF0
typedef struct S390PCIGroup {
ClpRspQueryPciGrp zpci_group;
int id;
+ int host_id;
QTAILQ_ENTRY(S390PCIGroup) link;
} S390PCIGroup;
-S390PCIGroup *s390_group_create(int id);
+S390PCIGroup *s390_group_create(int id, int host_id);
S390PCIGroup *s390_group_find(int id);
+S390PCIGroup *s390_group_find_host_sim(int host_id);
struct S390PCIBusDevice {
DeviceState qdev;
@@ -370,6 +373,7 @@ struct S390pciState {
QTAILQ_HEAD(, S390PCIBusDevice) zpci_devs;
QTAILQ_HEAD(, S390PCIDMACount) zpci_dma_limit;
QTAILQ_HEAD(, S390PCIGroup) zpci_groups;
+ uint8_t next_sim_grp;
};
S390pciState *s390_get_phb(void);
--
2.27.0
Thread overview: 12+ messages
2022-03-14 19:49 [PATCH v4 00/11] s390x/pci: zPCI interpretation support Matthew Rosato
2022-03-14 19:49 ` [PATCH v4 01/11] Update linux headers Matthew Rosato
2022-03-14 19:49 ` [PATCH v4 02/11] vfio: handle KVM-owned IOMMU requests Matthew Rosato
2022-03-14 19:49 ` [PATCH v4 03/11] target/s390x: add zpci-interp to cpu models Matthew Rosato
2022-03-14 19:49 ` [PATCH v4 04/11] s390x/pci: add routine to get host function handle from CLP info Matthew Rosato
2022-03-14 19:49 ` [PATCH v4 05/11] s390x/pci: enable for load/store intepretation Matthew Rosato
2022-03-14 19:49 ` [PATCH v4 06/11] s390x/pci: don't fence interpreted devices without MSI-X Matthew Rosato
2022-03-14 19:49 ` [PATCH v4 07/11] s390x/pci: enable adapter event notification for interpreted devices Matthew Rosato
2022-03-14 19:49 ` [PATCH v4 08/11] s390x/pci: use KVM-managed IOMMU for interpretation Matthew Rosato
2022-03-14 19:49 ` [PATCH v4 09/11] s390x/pci: use I/O Address Translation assist when interpreting Matthew Rosato
2022-03-14 19:49 ` [PATCH v4 10/11] s390x/pci: use dtsm provided from vfio capabilities for interpreted devices Matthew Rosato
2022-03-14 19:49 ` [PATCH v4 11/11] s390x/pci: let intercept devices have separate PCI groups Matthew Rosato