* [Qemu-devel] [PATCH v8 00/19] KVM platform device passthrough
@ 2014-11-30 18:35 Eric Auger
2014-11-30 18:35 ` [Qemu-devel] [PATCH v8 01/19] vfio: move hw/misc/vfio.c to hw/vfio/pci.c Move vfio.h into include/hw/vfio Eric Auger
` (19 more replies)
0 siblings, 20 replies; 22+ messages in thread
From: Eric Auger @ 2014-11-30 18:35 UTC (permalink / raw)
To: eric.auger, christoffer.dall, qemu-devel, agraf, pbonzini,
kim.phillips, a.rigo, manish.jaggi, joel.schopp, zhaoshenglong,
ard.biesheuvel
Cc: peter.maydell, patches, eric.auger, will.deacon, stuart.yoder,
Bharat.Bhushan, alex.williamson, a.motakis, kvmarm
This RFC series aims at enabling KVM platform device passthrough.
It implements a VFIO platform device, derived from VFIO PCI device.
The VFIO platform device uses the host VFIO platform driver which must
be bound to the assigned device prior to the QEMU system start.
- the guest can directly access the device register space
- assigned device IRQs are transparently routed to the guest by
QEMU/KVM (3 methods currently are supported: user-level eventfd
handling, irqfd, forwarded IRQs)
- iommu is transparently programmed to prevent the device from
accessing physical pages outside of the guest address space
This patch series is made of the following patch file groups:
1-11) PCI modifications to prepare for platform device introduction
12-15) VFIO & calxeda midway platform device without irqfd support
16) VFIO platform device with irqfd support
17-19) VFIO platform device with IRQ forwarding support
Each group is independent and should be separately upstreamable.
Dependency List:
QEMU dependencies:
[1] [PATCH v5] machvirt dynamic sysbus device instantiation
Eric Auger
[2] [PATCH v3 0/2] actual checks of KVM_CAP_IRQFD and KVM_CAP_IRQFD_RESAMPLE,
Eric Auger
http://lists.nongnu.org/archive/html/qemu-devel/2014-09/msg00589.html
[3] [PATCH v2] vfio: migration to trace points
Eric Auger
https://patchwork.ozlabs.org/patch/394785/
Kernel Dependencies:
[1] [PATCH v10 00/20] VFIO support for platform and AMBA devices on ARM
Antonios Motakis
http://comments.gmane.org/gmane.linux.kernel.iommu/7096
[2] [PATCH v3 0/6] vfio: type1: support for ARM SMMUS with VFIO_IOMMU_TYPE1
Antonios Motakis
http://www.spinics.net/lists/kvm-arm/msg11738.html
[3] [PATCH v4] ARM: KVM: add irqfd support
Eric Auger
https://lkml.org/lkml/2014/9/1/141
[4] [RFC PATCH 0/9] ARM: Forwarding physical interrupts to a guest VM,
Marc Zyngier
http://lwn.net/Articles/603514/
[5] [PATCH v3 0/9] KVM-VFIO IRQ forward control
Eric Auger
https://lkml.org/lkml/2014/9/1/344
- kernel pieces can be found at:
http://git.linaro.org/people/eric.auger/linux.git (branch 3.18-rc6-v10)
- QEMU pieces can be found at:
http://git.linaro.org/people/eric.auger/qemu.git (branch vfio_integ_v8)
The patch series was tested on Calxeda Midway (ARMv7) where one xgmac
is assigned to KVM host while the second one is assigned to the guest.
Reworked PCI device is not tested.
Wiki for Calxeda Midway setup:
https://wiki.linaro.org/LEG/Engineering/Virtualization/Platform_Device_Passthrough_on_Midway
History:
v7->v8:
- rebase on v2.2.0-rc3 and integrate
"Add skip_dump flag to ignore memory region during dump"
- KVM header evolution with subindex addition in kvm_arch_forwarded_irq
- split [PATCH v7 03/16] hw/vfio/pci: introduce VFIODevice into 4 patches
- vfio_compute_needs_reset does not return bool anymore
- add some comments about exposed MMIO region and IRQ in calxeda xgmac
device
- vfio_[un]mask_irqindex renamed into vfio_[un]mask_single_irqindex
- rework IRQ startup: former machine init done notifier is replaced by a
reset notifier. machine file passes the interrupt controller
DeviceState handle (not the platform bus first irq parameter).
- sysbus-fdt:
- move the add_fdt_node_functions array declaration between the device
specific code and the generic code to avoid forward declarations of
decice specific functions
- rename add_basic_vfio_fdt_node into add_calxeda_midway_xgmac_fdt_node
emphasizing the fact it is xgmac specific
v6->v7:
- fake injection test modality removed
- VFIO_DEVICE_TYPE_PLATFORM only introduced with VFIO platform
- new helper functions to start VFIO IRQ on machine init done notifier
(introduced in hw/vfio/platform: add vfio-platform support and notifier
registration invoked in hw/arm/virt: add support for VFIO devices).
vfio_start_irq_injection is replaced by vfio_register_irq_starter.
v5->v6:
- rebase on 2.1rc5 PCI code
- forwarded IRQ first integraton
- vfio_device property renamed into host property
- split IRQ setup in different functions that match the 3 supported
injection techniques (user handled eventfd, irqfd, forwarded IRQ):
removes dynamic switch between injection methods
- introduce fake interrupts as a test modality:
x makes possible to test multiple IRQ user-side handling.
x this is a test feature only: enable to trigger a fd as if the
real physical IRQ hit. No virtual IRQ is injected into the guest
but handling is simulated so that the state machine can be tested
- user handled eventfd:
x add mutex to protect IRQ state & list manipulation,
x correct misleading comment in vfio_intp_interrupt.
x Fix bugs using fake interrupt modality
- irqfd no more advertised in this patchset (handled in [3])
- VFIOPlatformDeviceClass becomes abstract and Calxeda xgmac device
and class is re-introduced (as per v4)
- all DPRINTF removed in platform and replaced by trace-points
- corrects compilation with configure --disable-kvm
- simplifies the split for vfio_get_device and introduce a unique
specialized function named vfio_populate_device
- group_list renamed into vfio_group_list
- hw/arm/dyn_sysbus_devtree.c currently only support vfio-calxeda-xgmac
instantiation. Needs to be specialized for other VFIO devices
- fix 2 bugs in dyn_sysbus_devtree(reg_attr index and compat)
v4->v5:
- rebase on v2.1.0 PCI code
- take into account Alex Williamson comments on PCI code rework
- trace updates in vfio_region_write/read
- remove fd from VFIORegion
- get/put ckeanup
- bug fix: bar region's vbasedev field duly initialization
- misc cleanups in platform device
- device tree node generation removed from device and handled in
hw/arm/dyn_sysbus_devtree.c
- remove "hw/vfio: add an example calxeda_xgmac": with removal of
device tree node generation we do not have so many things to
implement in that derived device yet. May be re-introduced later
on if needed typically for reset/migration.
- no GSI routing table anymore
v3->v4 changes (Eric Auger, Alvise Rigo)
- rebase on last VFIO PCI code (v2.1.0-rc0)
- full git history rework to ease PCI code change review
- mv include files in hw/vfio
- DPRINTF reformatting temporarily moved out
- support of VFIO virq (removal of resamplefd handler on user-side)
- integration with sysbus dynamic instantiation framwork
- removal of unrealize and cleanup routines until it is better
understood what is really needed
- Support of VFIO for Amba devices should be handled in an inherited
device to specialize the device tree generation (clock handle currently
missing in framework however)
- "Always use eventfd as notifying mechanism" temporarily moved out
- static instantiation is not mainstream (although it remains possible)
note if static instantiation is used, irqfd must be setup in machine file
when virtual IRQ is known
- create the GSI routing table on qemu side
v2->v3 changes (Alvise Rigo, Eric Auger):
- Following Alex W recommandations, further efforts to factorize the
code between PCI:introduction of VFIODevice and VFIORegion
as base classes
- unique reset handler for platform and PCI
- cleanup following Kim's comments
- multiple IRQ support mechanics should be in place although not
tested
- Better handling of MMIO multiple regions
- New features and fixes by Alvise (multiple compat string, exec
flag, force eventfd usage, amba device tree support)
- irqfd support
v1->v2 changes (Kim Phillips, Eric Auger):
- IRQ initial support (legacy mode where eventfds are handled on
user side)
- hacked dynamic instantiation
v1 (Kim Phillips):
- initial split between PCI and platform
- MMIO support only
- static instantiation
Best Regards
Eric
Eric Auger (18):
hw/vfio/pci: Rename VFIODevice into VFIOPCIDevice
hw/vfio/pci: generalize mask/unmask to any IRQ index
hw/vfio/pci: introduce minimalist VFIODevice with fd
hw/vfio/pci: add type, name and group fields in VFIODevice
hw/vfio/pci: handle reset at VFIODevice
hw/vfio/pci: Introduce VFIORegion
hw/vfio/pci: split vfio_get_device
hw/vfio/pci: rename group_list into vfio_group_list
hw/vfio/pci: use name field in format strings
hw/vfio: create common module
hw/vfio/platform: add vfio-platform support
hw/vfio: calxeda xgmac device
hw/arm/virt: add support for VFIO devices
hw/arm/sysbus-fdt: enable vfio-calxeda-xgmac dynamic instantiation
hw/vfio/platform: Add irqfd support
linux-headers: Update KVM headers from linux-next tag ToBeFilled
hw/vfio/common: vfio_kvm_device_fd moved in the common header
hw/vfio/platform: add forwarded irq support
Kim Phillips (1):
vfio: move hw/misc/vfio.c to hw/vfio/pci.c Move vfio.h into
include/hw/vfio
LICENSE | 2 +-
MAINTAINERS | 2 +-
hw/Makefile.objs | 1 +
hw/arm/sysbus-fdt.c | 88 ++
hw/arm/virt.c | 15 +-
hw/misc/Makefile.objs | 1 -
hw/ppc/spapr_pci_vfio.c | 2 +-
hw/vfio/Makefile.objs | 6 +
hw/vfio/calxeda_xgmac.c | 54 ++
hw/vfio/common.c | 960 +++++++++++++++++++
hw/{misc/vfio.c => vfio/pci.c} | 1671 +++++++---------------------------
hw/vfio/platform.c | 777 ++++++++++++++++
include/hw/vfio/vfio-calxeda-xgmac.h | 46 +
include/hw/vfio/vfio-common.h | 157 ++++
include/hw/vfio/vfio-platform.h | 88 ++
include/hw/{misc => vfio}/vfio.h | 0
linux-headers/linux/kvm.h | 10 +
trace-events | 137 +--
18 files changed, 2599 insertions(+), 1418 deletions(-)
create mode 100644 hw/vfio/Makefile.objs
create mode 100644 hw/vfio/calxeda_xgmac.c
create mode 100644 hw/vfio/common.c
rename hw/{misc/vfio.c => vfio/pci.c} (65%)
create mode 100644 hw/vfio/platform.c
create mode 100644 include/hw/vfio/vfio-calxeda-xgmac.h
create mode 100644 include/hw/vfio/vfio-common.h
create mode 100644 include/hw/vfio/vfio-platform.h
rename include/hw/{misc => vfio}/vfio.h (100%)
--
1.8.3.2
^ permalink raw reply [flat|nested] 22+ messages in thread
* [Qemu-devel] [PATCH v8 01/19] vfio: move hw/misc/vfio.c to hw/vfio/pci.c Move vfio.h into include/hw/vfio
2014-11-30 18:35 [Qemu-devel] [PATCH v8 00/19] KVM platform device passthrough Eric Auger
@ 2014-11-30 18:35 ` Eric Auger
2014-11-30 18:35 ` [Qemu-devel] [PATCH v8 02/19] hw/vfio/pci: Rename VFIODevice into VFIOPCIDevice Eric Auger
` (18 subsequent siblings)
19 siblings, 0 replies; 22+ messages in thread
From: Eric Auger @ 2014-11-30 18:35 UTC (permalink / raw)
To: eric.auger, christoffer.dall, qemu-devel, agraf, pbonzini,
kim.phillips, a.rigo, manish.jaggi, joel.schopp, zhaoshenglong,
ard.biesheuvel
Cc: peter.maydell, patches, Kim Phillips, eric.auger, will.deacon,
stuart.yoder, Bharat.Bhushan, alex.williamson, a.motakis, kvmarm
From: Kim Phillips <kim.phillips@linaro.org>
This is done in preparation for the addition of VFIO platform
device support.
Signed-off-by: Kim Phillips <kim.phillips@linaro.org>
---
LICENSE | 2 +-
MAINTAINERS | 2 +-
hw/Makefile.objs | 1 +
hw/misc/Makefile.objs | 1 -
hw/ppc/spapr_pci_vfio.c | 2 +-
hw/vfio/Makefile.objs | 3 +++
hw/{misc/vfio.c => vfio/pci.c} | 2 +-
include/hw/{misc => vfio}/vfio.h | 0
8 files changed, 8 insertions(+), 5 deletions(-)
create mode 100644 hw/vfio/Makefile.objs
rename hw/{misc/vfio.c => vfio/pci.c} (99%)
rename include/hw/{misc => vfio}/vfio.h (100%)
diff --git a/LICENSE b/LICENSE
index da70e94..0e0b4b9 100644
--- a/LICENSE
+++ b/LICENSE
@@ -11,7 +11,7 @@ option) any later version.
As of July 2013, contributions under version 2 of the GNU General Public
License (and no later version) are only accepted for the following files
-or directories: bsd-user/, linux-user/, hw/misc/vfio.c, hw/xen/xen_pt*.
+or directories: bsd-user/, linux-user/, hw/vfio/, hw/xen/xen_pt*.
3) The Tiny Code Generator (TCG) is released under the BSD license
(see license headers in files).
diff --git a/MAINTAINERS b/MAINTAINERS
index bcb69e8..255b512 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -657,7 +657,7 @@ F: hw/usb/dev-serial.c
VFIO
M: Alex Williamson <alex.williamson@redhat.com>
S: Supported
-F: hw/misc/vfio.c
+F: hw/vfio/*
vhost
M: Michael S. Tsirkin <mst@redhat.com>
diff --git a/hw/Makefile.objs b/hw/Makefile.objs
index 52a1464..73afa41 100644
--- a/hw/Makefile.objs
+++ b/hw/Makefile.objs
@@ -26,6 +26,7 @@ devices-dirs-$(CONFIG_SOFTMMU) += ssi/
devices-dirs-$(CONFIG_SOFTMMU) += timer/
devices-dirs-$(CONFIG_TPM) += tpm/
devices-dirs-$(CONFIG_SOFTMMU) += usb/
+devices-dirs-$(CONFIG_SOFTMMU) += vfio/
devices-dirs-$(CONFIG_VIRTIO) += virtio/
devices-dirs-$(CONFIG_SOFTMMU) += watchdog/
devices-dirs-$(CONFIG_SOFTMMU) += xen/
diff --git a/hw/misc/Makefile.objs b/hw/misc/Makefile.objs
index 979e532..e47fea8 100644
--- a/hw/misc/Makefile.objs
+++ b/hw/misc/Makefile.objs
@@ -21,7 +21,6 @@ common-obj-$(CONFIG_MACIO) += macio/
ifeq ($(CONFIG_PCI), y)
obj-$(CONFIG_KVM) += ivshmem.o
-obj-$(CONFIG_LINUX) += vfio.o
endif
obj-$(CONFIG_REALVIEW) += arm_sysctl.o
diff --git a/hw/ppc/spapr_pci_vfio.c b/hw/ppc/spapr_pci_vfio.c
index d3bddf2..144912b 100644
--- a/hw/ppc/spapr_pci_vfio.c
+++ b/hw/ppc/spapr_pci_vfio.c
@@ -20,7 +20,7 @@
#include "hw/ppc/spapr.h"
#include "hw/pci-host/spapr.h"
#include "linux/vfio.h"
-#include "hw/misc/vfio.h"
+#include "hw/vfio/vfio.h"
static Property spapr_phb_vfio_properties[] = {
DEFINE_PROP_INT32("iommu", sPAPRPHBVFIOState, iommugroupid, -1),
diff --git a/hw/vfio/Makefile.objs b/hw/vfio/Makefile.objs
new file mode 100644
index 0000000..31c7dab
--- /dev/null
+++ b/hw/vfio/Makefile.objs
@@ -0,0 +1,3 @@
+ifeq ($(CONFIG_LINUX), y)
+obj-$(CONFIG_PCI) += pci.o
+endif
diff --git a/hw/misc/vfio.c b/hw/vfio/pci.c
similarity index 99%
rename from hw/misc/vfio.c
rename to hw/vfio/pci.c
index 6c36c8b..7e69415 100644
--- a/hw/misc/vfio.c
+++ b/hw/vfio/pci.c
@@ -39,8 +39,8 @@
#include "qemu/range.h"
#include "sysemu/kvm.h"
#include "sysemu/sysemu.h"
-#include "hw/misc/vfio.h"
#include "trace.h"
+#include "hw/vfio/vfio.h"
/* Extra debugging, trap acceleration paths for more logging */
#define VFIO_ALLOW_MMAP 1
diff --git a/include/hw/misc/vfio.h b/include/hw/vfio/vfio.h
similarity index 100%
rename from include/hw/misc/vfio.h
rename to include/hw/vfio/vfio.h
--
1.8.3.2
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [Qemu-devel] [PATCH v8 02/19] hw/vfio/pci: Rename VFIODevice into VFIOPCIDevice
2014-11-30 18:35 [Qemu-devel] [PATCH v8 00/19] KVM platform device passthrough Eric Auger
2014-11-30 18:35 ` [Qemu-devel] [PATCH v8 01/19] vfio: move hw/misc/vfio.c to hw/vfio/pci.c Move vfio.h into include/hw/vfio Eric Auger
@ 2014-11-30 18:35 ` Eric Auger
2014-11-30 18:35 ` [Qemu-devel] [PATCH v8 03/19] hw/vfio/pci: generalize mask/unmask to any IRQ index Eric Auger
` (17 subsequent siblings)
19 siblings, 0 replies; 22+ messages in thread
From: Eric Auger @ 2014-11-30 18:35 UTC (permalink / raw)
To: eric.auger, christoffer.dall, qemu-devel, agraf, pbonzini,
kim.phillips, a.rigo, manish.jaggi, joel.schopp, zhaoshenglong,
ard.biesheuvel
Cc: peter.maydell, patches, eric.auger, will.deacon, stuart.yoder,
Bharat.Bhushan, alex.williamson, a.motakis, kvmarm
This prepares for the introduction of VFIOPlatformDevice
Signed-off-by: Eric Auger <eric.auger@linaro.org>
---
hw/vfio/pci.c | 210 +++++++++++++++++++++++++++++-----------------------------
1 file changed, 106 insertions(+), 104 deletions(-)
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 7e69415..0d7d4a0 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -48,11 +48,11 @@
#define VFIO_ALLOW_KVM_MSI 1
#define VFIO_ALLOW_KVM_MSIX 1
-struct VFIODevice;
+struct VFIOPCIDevice;
typedef struct VFIOQuirk {
MemoryRegion mem;
- struct VFIODevice *vdev;
+ struct VFIOPCIDevice *vdev;
QLIST_ENTRY(VFIOQuirk) next;
struct {
uint32_t base_offset:TARGET_PAGE_BITS;
@@ -123,7 +123,7 @@ typedef struct VFIOMSIVector {
*/
EventNotifier interrupt;
EventNotifier kvm_interrupt;
- struct VFIODevice *vdev; /* back pointer to device */
+ struct VFIOPCIDevice *vdev; /* back pointer to device */
int virq;
bool use;
} VFIOMSIVector;
@@ -185,7 +185,7 @@ typedef struct VFIOMSIXInfo {
void *mmap;
} VFIOMSIXInfo;
-typedef struct VFIODevice {
+typedef struct VFIOPCIDevice {
PCIDevice pdev;
int fd;
VFIOINTx intx;
@@ -203,7 +203,7 @@ typedef struct VFIODevice {
VFIOBAR bars[PCI_NUM_REGIONS - 1]; /* No ROM */
VFIOVGA vga; /* 0xa0000, 0x3b0, 0x3c0 */
PCIHostDeviceAddress host;
- QLIST_ENTRY(VFIODevice) next;
+ QLIST_ENTRY(VFIOPCIDevice) next;
struct VFIOGroup *group;
EventNotifier err_notifier;
uint32_t features;
@@ -218,13 +218,13 @@ typedef struct VFIODevice {
bool has_pm_reset;
bool needs_reset;
bool rom_read_failed;
-} VFIODevice;
+} VFIOPCIDevice;
typedef struct VFIOGroup {
int fd;
int groupid;
VFIOContainer *container;
- QLIST_HEAD(, VFIODevice) device_list;
+ QLIST_HEAD(, VFIOPCIDevice) device_list;
QLIST_ENTRY(VFIOGroup) next;
QLIST_ENTRY(VFIOGroup) container_next;
} VFIOGroup;
@@ -268,16 +268,16 @@ static QLIST_HEAD(, VFIOGroup)
static int vfio_kvm_device_fd = -1;
#endif
-static void vfio_disable_interrupts(VFIODevice *vdev);
+static void vfio_disable_interrupts(VFIOPCIDevice *vdev);
static uint32_t vfio_pci_read_config(PCIDevice *pdev, uint32_t addr, int len);
static void vfio_pci_write_config(PCIDevice *pdev, uint32_t addr,
uint32_t val, int len);
-static void vfio_mmap_set_enabled(VFIODevice *vdev, bool enabled);
+static void vfio_mmap_set_enabled(VFIOPCIDevice *vdev, bool enabled);
/*
* Common VFIO interrupt disable
*/
-static void vfio_disable_irqindex(VFIODevice *vdev, int index)
+static void vfio_disable_irqindex(VFIOPCIDevice *vdev, int index)
{
struct vfio_irq_set irq_set = {
.argsz = sizeof(irq_set),
@@ -293,7 +293,7 @@ static void vfio_disable_irqindex(VFIODevice *vdev, int index)
/*
* INTx
*/
-static void vfio_unmask_intx(VFIODevice *vdev)
+static void vfio_unmask_intx(VFIOPCIDevice *vdev)
{
struct vfio_irq_set irq_set = {
.argsz = sizeof(irq_set),
@@ -307,7 +307,7 @@ static void vfio_unmask_intx(VFIODevice *vdev)
}
#ifdef CONFIG_KVM /* Unused outside of CONFIG_KVM code */
-static void vfio_mask_intx(VFIODevice *vdev)
+static void vfio_mask_intx(VFIOPCIDevice *vdev)
{
struct vfio_irq_set irq_set = {
.argsz = sizeof(irq_set),
@@ -338,7 +338,7 @@ static void vfio_mask_intx(VFIODevice *vdev)
*/
static void vfio_intx_mmap_enable(void *opaque)
{
- VFIODevice *vdev = opaque;
+ VFIOPCIDevice *vdev = opaque;
if (vdev->intx.pending) {
timer_mod(vdev->intx.mmap_timer,
@@ -351,7 +351,7 @@ static void vfio_intx_mmap_enable(void *opaque)
static void vfio_intx_interrupt(void *opaque)
{
- VFIODevice *vdev = opaque;
+ VFIOPCIDevice *vdev = opaque;
if (!event_notifier_test_and_clear(&vdev->intx.interrupt)) {
return;
@@ -370,7 +370,7 @@ static void vfio_intx_interrupt(void *opaque)
}
}
-static void vfio_eoi(VFIODevice *vdev)
+static void vfio_eoi(VFIOPCIDevice *vdev)
{
if (!vdev->intx.pending) {
return;
@@ -384,7 +384,7 @@ static void vfio_eoi(VFIODevice *vdev)
vfio_unmask_intx(vdev);
}
-static void vfio_enable_intx_kvm(VFIODevice *vdev)
+static void vfio_enable_intx_kvm(VFIOPCIDevice *vdev)
{
#ifdef CONFIG_KVM
struct kvm_irqfd irqfd = {
@@ -462,7 +462,7 @@ fail:
#endif
}
-static void vfio_disable_intx_kvm(VFIODevice *vdev)
+static void vfio_disable_intx_kvm(VFIOPCIDevice *vdev)
{
#ifdef CONFIG_KVM
struct kvm_irqfd irqfd = {
@@ -506,7 +506,7 @@ static void vfio_disable_intx_kvm(VFIODevice *vdev)
static void vfio_update_irq(PCIDevice *pdev)
{
- VFIODevice *vdev = DO_UPCAST(VFIODevice, pdev, pdev);
+ VFIOPCIDevice *vdev = DO_UPCAST(VFIOPCIDevice, pdev, pdev);
PCIINTxRoute route;
if (vdev->interrupt != VFIO_INT_INTx) {
@@ -537,7 +537,7 @@ static void vfio_update_irq(PCIDevice *pdev)
vfio_eoi(vdev);
}
-static int vfio_enable_intx(VFIODevice *vdev)
+static int vfio_enable_intx(VFIOPCIDevice *vdev)
{
uint8_t pin = vfio_pci_read_config(&vdev->pdev, PCI_INTERRUPT_PIN, 1);
int ret, argsz;
@@ -602,7 +602,7 @@ static int vfio_enable_intx(VFIODevice *vdev)
return 0;
}
-static void vfio_disable_intx(VFIODevice *vdev)
+static void vfio_disable_intx(VFIOPCIDevice *vdev)
{
int fd;
@@ -629,7 +629,7 @@ static void vfio_disable_intx(VFIODevice *vdev)
static void vfio_msi_interrupt(void *opaque)
{
VFIOMSIVector *vector = opaque;
- VFIODevice *vdev = vector->vdev;
+ VFIOPCIDevice *vdev = vector->vdev;
int nr = vector - vdev->msi_vectors;
if (!event_notifier_test_and_clear(&vector->interrupt)) {
@@ -661,7 +661,7 @@ static void vfio_msi_interrupt(void *opaque)
}
}
-static int vfio_enable_vectors(VFIODevice *vdev, bool msix)
+static int vfio_enable_vectors(VFIOPCIDevice *vdev, bool msix)
{
struct vfio_irq_set *irq_set;
int ret = 0, i, argsz;
@@ -752,7 +752,7 @@ static void vfio_update_kvm_msi_virq(VFIOMSIVector *vector, MSIMessage msg)
static int vfio_msix_vector_do_use(PCIDevice *pdev, unsigned int nr,
MSIMessage *msg, IOHandler *handler)
{
- VFIODevice *vdev = DO_UPCAST(VFIODevice, pdev, pdev);
+ VFIOPCIDevice *vdev = DO_UPCAST(VFIOPCIDevice, pdev, pdev);
VFIOMSIVector *vector;
int ret;
@@ -841,7 +841,7 @@ static int vfio_msix_vector_use(PCIDevice *pdev,
static void vfio_msix_vector_release(PCIDevice *pdev, unsigned int nr)
{
- VFIODevice *vdev = DO_UPCAST(VFIODevice, pdev, pdev);
+ VFIOPCIDevice *vdev = DO_UPCAST(VFIOPCIDevice, pdev, pdev);
VFIOMSIVector *vector = &vdev->msi_vectors[nr];
trace_vfio_msix_vector_release(vdev->host.domain, vdev->host.bus,
@@ -880,7 +880,7 @@ static void vfio_msix_vector_release(PCIDevice *pdev, unsigned int nr)
}
}
-static void vfio_enable_msix(VFIODevice *vdev)
+static void vfio_enable_msix(VFIOPCIDevice *vdev)
{
vfio_disable_interrupts(vdev);
@@ -913,7 +913,7 @@ static void vfio_enable_msix(VFIODevice *vdev)
vdev->host.slot, vdev->host.function);
}
-static void vfio_enable_msi(VFIODevice *vdev)
+static void vfio_enable_msi(VFIOPCIDevice *vdev)
{
int ret, i;
@@ -991,7 +991,7 @@ retry:
vdev->nr_vectors);
}
-static void vfio_disable_msi_common(VFIODevice *vdev)
+static void vfio_disable_msi_common(VFIOPCIDevice *vdev)
{
int i;
@@ -1015,7 +1015,7 @@ static void vfio_disable_msi_common(VFIODevice *vdev)
vfio_enable_intx(vdev);
}
-static void vfio_disable_msix(VFIODevice *vdev)
+static void vfio_disable_msix(VFIOPCIDevice *vdev)
{
int i;
@@ -1042,7 +1042,7 @@ static void vfio_disable_msix(VFIODevice *vdev)
vdev->host.slot, vdev->host.function);
}
-static void vfio_disable_msi(VFIODevice *vdev)
+static void vfio_disable_msi(VFIOPCIDevice *vdev)
{
vfio_disable_irqindex(vdev, VFIO_PCI_MSI_IRQ_INDEX);
vfio_disable_msi_common(vdev);
@@ -1051,7 +1051,7 @@ static void vfio_disable_msi(VFIODevice *vdev)
vdev->host.slot, vdev->host.function);
}
-static void vfio_update_msi(VFIODevice *vdev)
+static void vfio_update_msi(VFIOPCIDevice *vdev)
{
int i;
@@ -1104,7 +1104,7 @@ static void vfio_bar_write(void *opaque, hwaddr addr,
#ifdef DEBUG_VFIO
{
- VFIODevice *vdev = container_of(bar, VFIODevice, bars[bar->nr]);
+ VFIOPCIDevice *vdev = container_of(bar, VFIOPCIDevice, bars[bar->nr]);
trace_vfio_bar_write(vdev->host.domain, vdev->host.bus,
vdev->host.slot, vdev->host.function,
@@ -1120,7 +1120,7 @@ static void vfio_bar_write(void *opaque, hwaddr addr,
* which access will service the interrupt, so we're potentially
* getting quite a few host interrupts per guest interrupt.
*/
- vfio_eoi(container_of(bar, VFIODevice, bars[bar->nr]));
+ vfio_eoi(container_of(bar, VFIOPCIDevice, bars[bar->nr]));
}
static uint64_t vfio_bar_read(void *opaque,
@@ -1158,7 +1158,7 @@ static uint64_t vfio_bar_read(void *opaque,
#ifdef DEBUG_VFIO
{
- VFIODevice *vdev = container_of(bar, VFIODevice, bars[bar->nr]);
+ VFIOPCIDevice *vdev = container_of(bar, VFIOPCIDevice, bars[bar->nr]);
trace_vfio_bar_read(vdev->host.domain, vdev->host.bus,
vdev->host.slot, vdev->host.function,
@@ -1167,7 +1167,7 @@ static uint64_t vfio_bar_read(void *opaque,
#endif
/* Same as write above */
- vfio_eoi(container_of(bar, VFIODevice, bars[bar->nr]));
+ vfio_eoi(container_of(bar, VFIOPCIDevice, bars[bar->nr]));
return data;
}
@@ -1178,7 +1178,7 @@ static const MemoryRegionOps vfio_bar_ops = {
.endianness = DEVICE_LITTLE_ENDIAN,
};
-static void vfio_pci_load_rom(VFIODevice *vdev)
+static void vfio_pci_load_rom(VFIOPCIDevice *vdev)
{
struct vfio_region_info reg_info = {
.argsz = sizeof(reg_info),
@@ -1236,7 +1236,7 @@ static void vfio_pci_load_rom(VFIODevice *vdev)
static uint64_t vfio_rom_read(void *opaque, hwaddr addr, unsigned size)
{
- VFIODevice *vdev = opaque;
+ VFIOPCIDevice *vdev = opaque;
union {
uint8_t byte;
uint16_t word;
@@ -1286,7 +1286,7 @@ static const MemoryRegionOps vfio_rom_ops = {
.endianness = DEVICE_LITTLE_ENDIAN,
};
-static bool vfio_blacklist_opt_rom(VFIODevice *vdev)
+static bool vfio_blacklist_opt_rom(VFIOPCIDevice *vdev)
{
PCIDevice *pdev = &vdev->pdev;
uint16_t vendor_id, device_id;
@@ -1306,7 +1306,7 @@ static bool vfio_blacklist_opt_rom(VFIODevice *vdev)
return false;
}
-static void vfio_pci_size_rom(VFIODevice *vdev)
+static void vfio_pci_size_rom(VFIOPCIDevice *vdev)
{
uint32_t orig, size = cpu_to_le32((uint32_t)PCI_ROM_ADDRESS_MASK);
off_t offset = vdev->config_offset + PCI_ROM_ADDRESS;
@@ -1484,7 +1484,7 @@ static uint64_t vfio_generic_window_quirk_read(void *opaque,
hwaddr addr, unsigned size)
{
VFIOQuirk *quirk = opaque;
- VFIODevice *vdev = quirk->vdev;
+ VFIOPCIDevice *vdev = quirk->vdev;
uint64_t data;
if (vfio_flags_enabled(quirk->data.flags, quirk->data.read_flags) &&
@@ -1520,7 +1520,7 @@ static void vfio_generic_window_quirk_write(void *opaque, hwaddr addr,
uint64_t data, unsigned size)
{
VFIOQuirk *quirk = opaque;
- VFIODevice *vdev = quirk->vdev;
+ VFIOPCIDevice *vdev = quirk->vdev;
if (ranges_overlap(addr, size,
quirk->data.address_offset, quirk->data.address_size)) {
@@ -1578,7 +1578,7 @@ static uint64_t vfio_generic_quirk_read(void *opaque,
hwaddr addr, unsigned size)
{
VFIOQuirk *quirk = opaque;
- VFIODevice *vdev = quirk->vdev;
+ VFIOPCIDevice *vdev = quirk->vdev;
hwaddr base = quirk->data.address_match & TARGET_PAGE_MASK;
hwaddr offset = quirk->data.address_match & ~TARGET_PAGE_MASK;
uint64_t data;
@@ -1611,7 +1611,7 @@ static void vfio_generic_quirk_write(void *opaque, hwaddr addr,
uint64_t data, unsigned size)
{
VFIOQuirk *quirk = opaque;
- VFIODevice *vdev = quirk->vdev;
+ VFIOPCIDevice *vdev = quirk->vdev;
hwaddr base = quirk->data.address_match & TARGET_PAGE_MASK;
hwaddr offset = quirk->data.address_match & ~TARGET_PAGE_MASK;
@@ -1659,7 +1659,7 @@ static uint64_t vfio_ati_3c3_quirk_read(void *opaque,
hwaddr addr, unsigned size)
{
VFIOQuirk *quirk = opaque;
- VFIODevice *vdev = quirk->vdev;
+ VFIOPCIDevice *vdev = quirk->vdev;
uint64_t data = vfio_pci_read_config(&vdev->pdev,
PCI_BASE_ADDRESS_0 + (4 * 4) + 1,
size);
@@ -1673,7 +1673,7 @@ static const MemoryRegionOps vfio_ati_3c3_quirk = {
.endianness = DEVICE_LITTLE_ENDIAN,
};
-static void vfio_vga_probe_ati_3c3_quirk(VFIODevice *vdev)
+static void vfio_vga_probe_ati_3c3_quirk(VFIOPCIDevice *vdev)
{
PCIDevice *pdev = &vdev->pdev;
VFIOQuirk *quirk;
@@ -1715,7 +1715,7 @@ static void vfio_vga_probe_ati_3c3_quirk(VFIODevice *vdev)
* that only read-only access is provided, but we drop writes when the window
* is enabled to config space nonetheless.
*/
-static void vfio_probe_ati_bar4_window_quirk(VFIODevice *vdev, int nr)
+static void vfio_probe_ati_bar4_window_quirk(VFIOPCIDevice *vdev, int nr)
{
PCIDevice *pdev = &vdev->pdev;
VFIOQuirk *quirk;
@@ -1778,7 +1778,7 @@ static uint64_t vfio_rtl8168_window_quirk_read(void *opaque,
hwaddr addr, unsigned size)
{
VFIOQuirk *quirk = opaque;
- VFIODevice *vdev = quirk->vdev;
+ VFIOPCIDevice *vdev = quirk->vdev;
switch (addr) {
case 4: /* address */
@@ -1824,7 +1824,7 @@ static void vfio_rtl8168_window_quirk_write(void *opaque, hwaddr addr,
uint64_t data, unsigned size)
{
VFIOQuirk *quirk = opaque;
- VFIODevice *vdev = quirk->vdev;
+ VFIOPCIDevice *vdev = quirk->vdev;
switch (addr) {
case 4: /* address */
@@ -1873,7 +1873,7 @@ static const MemoryRegionOps vfio_rtl8168_window_quirk = {
.endianness = DEVICE_LITTLE_ENDIAN,
};
-static void vfio_probe_rtl8168_bar2_window_quirk(VFIODevice *vdev, int nr)
+static void vfio_probe_rtl8168_bar2_window_quirk(VFIOPCIDevice *vdev, int nr)
{
PCIDevice *pdev = &vdev->pdev;
VFIOQuirk *quirk;
@@ -1902,7 +1902,7 @@ static void vfio_probe_rtl8168_bar2_window_quirk(VFIODevice *vdev, int nr)
/*
* Trap the BAR2 MMIO window to config space as well.
*/
-static void vfio_probe_ati_bar2_4000_quirk(VFIODevice *vdev, int nr)
+static void vfio_probe_ati_bar2_4000_quirk(VFIOPCIDevice *vdev, int nr)
{
PCIDevice *pdev = &vdev->pdev;
VFIOQuirk *quirk;
@@ -1971,7 +1971,7 @@ static uint64_t vfio_nvidia_3d0_quirk_read(void *opaque,
hwaddr addr, unsigned size)
{
VFIOQuirk *quirk = opaque;
- VFIODevice *vdev = quirk->vdev;
+ VFIOPCIDevice *vdev = quirk->vdev;
PCIDevice *pdev = &vdev->pdev;
uint64_t data = vfio_vga_read(&vdev->vga.region[QEMU_PCI_VGA_IO_HI],
addr + quirk->data.base_offset, size);
@@ -1990,7 +1990,7 @@ static void vfio_nvidia_3d0_quirk_write(void *opaque, hwaddr addr,
uint64_t data, unsigned size)
{
VFIOQuirk *quirk = opaque;
- VFIODevice *vdev = quirk->vdev;
+ VFIOPCIDevice *vdev = quirk->vdev;
PCIDevice *pdev = &vdev->pdev;
switch (quirk->data.flags) {
@@ -2037,7 +2037,7 @@ static const MemoryRegionOps vfio_nvidia_3d0_quirk = {
.endianness = DEVICE_LITTLE_ENDIAN,
};
-static void vfio_vga_probe_nvidia_3d0_quirk(VFIODevice *vdev)
+static void vfio_vga_probe_nvidia_3d0_quirk(VFIOPCIDevice *vdev)
{
PCIDevice *pdev = &vdev->pdev;
VFIOQuirk *quirk;
@@ -2130,7 +2130,7 @@ static const MemoryRegionOps vfio_nvidia_bar5_window_quirk = {
.endianness = DEVICE_LITTLE_ENDIAN,
};
-static void vfio_probe_nvidia_bar5_window_quirk(VFIODevice *vdev, int nr)
+static void vfio_probe_nvidia_bar5_window_quirk(VFIOPCIDevice *vdev, int nr)
{
PCIDevice *pdev = &vdev->pdev;
VFIOQuirk *quirk;
@@ -2166,7 +2166,7 @@ static void vfio_nvidia_88000_quirk_write(void *opaque, hwaddr addr,
uint64_t data, unsigned size)
{
VFIOQuirk *quirk = opaque;
- VFIODevice *vdev = quirk->vdev;
+ VFIOPCIDevice *vdev = quirk->vdev;
PCIDevice *pdev = &vdev->pdev;
hwaddr base = quirk->data.address_match & TARGET_PAGE_MASK;
@@ -2199,7 +2199,7 @@ static const MemoryRegionOps vfio_nvidia_88000_quirk = {
*
* Here's offset 0x88000...
*/
-static void vfio_probe_nvidia_bar0_88000_quirk(VFIODevice *vdev, int nr)
+static void vfio_probe_nvidia_bar0_88000_quirk(VFIOPCIDevice *vdev, int nr)
{
PCIDevice *pdev = &vdev->pdev;
VFIOQuirk *quirk;
@@ -2238,7 +2238,7 @@ static void vfio_probe_nvidia_bar0_88000_quirk(VFIODevice *vdev, int nr)
/*
* And here's the same for BAR0 offset 0x1800...
*/
-static void vfio_probe_nvidia_bar0_1800_quirk(VFIODevice *vdev, int nr)
+static void vfio_probe_nvidia_bar0_1800_quirk(VFIOPCIDevice *vdev, int nr)
{
PCIDevice *pdev = &vdev->pdev;
VFIOQuirk *quirk;
@@ -2283,13 +2283,13 @@ static void vfio_probe_nvidia_bar0_1800_quirk(VFIODevice *vdev, int nr)
/*
* Common quirk probe entry points.
*/
-static void vfio_vga_quirk_setup(VFIODevice *vdev)
+static void vfio_vga_quirk_setup(VFIOPCIDevice *vdev)
{
vfio_vga_probe_ati_3c3_quirk(vdev);
vfio_vga_probe_nvidia_3d0_quirk(vdev);
}
-static void vfio_vga_quirk_teardown(VFIODevice *vdev)
+static void vfio_vga_quirk_teardown(VFIOPCIDevice *vdev)
{
int i;
@@ -2304,7 +2304,7 @@ static void vfio_vga_quirk_teardown(VFIODevice *vdev)
}
}
-static void vfio_bar_quirk_setup(VFIODevice *vdev, int nr)
+static void vfio_bar_quirk_setup(VFIOPCIDevice *vdev, int nr)
{
vfio_probe_ati_bar4_window_quirk(vdev, nr);
vfio_probe_ati_bar2_4000_quirk(vdev, nr);
@@ -2314,7 +2314,7 @@ static void vfio_bar_quirk_setup(VFIODevice *vdev, int nr)
vfio_probe_rtl8168_bar2_window_quirk(vdev, nr);
}
-static void vfio_bar_quirk_teardown(VFIODevice *vdev, int nr)
+static void vfio_bar_quirk_teardown(VFIOPCIDevice *vdev, int nr)
{
VFIOBAR *bar = &vdev->bars[nr];
@@ -2332,7 +2332,7 @@ static void vfio_bar_quirk_teardown(VFIODevice *vdev, int nr)
*/
static uint32_t vfio_pci_read_config(PCIDevice *pdev, uint32_t addr, int len)
{
- VFIODevice *vdev = DO_UPCAST(VFIODevice, pdev, pdev);
+ VFIOPCIDevice *vdev = DO_UPCAST(VFIOPCIDevice, pdev, pdev);
uint32_t emu_bits = 0, emu_val = 0, phys_val = 0, val;
memcpy(&emu_bits, vdev->emulated_config_bits + addr, len);
@@ -2367,7 +2367,7 @@ static uint32_t vfio_pci_read_config(PCIDevice *pdev, uint32_t addr, int len)
static void vfio_pci_write_config(PCIDevice *pdev, uint32_t addr,
uint32_t val, int len)
{
- VFIODevice *vdev = DO_UPCAST(VFIODevice, pdev, pdev);
+ VFIOPCIDevice *vdev = DO_UPCAST(VFIOPCIDevice, pdev, pdev);
uint32_t val_le = cpu_to_le32(val);
trace_vfio_pci_write_config(vdev->host.domain, vdev->host.bus,
@@ -2722,7 +2722,7 @@ static void vfio_listener_release(VFIOContainer *container)
/*
* Interrupt setup
*/
-static void vfio_disable_interrupts(VFIODevice *vdev)
+static void vfio_disable_interrupts(VFIOPCIDevice *vdev)
{
switch (vdev->interrupt) {
case VFIO_INT_INTx:
@@ -2737,7 +2737,7 @@ static void vfio_disable_interrupts(VFIODevice *vdev)
}
}
-static int vfio_setup_msi(VFIODevice *vdev, int pos)
+static int vfio_setup_msi(VFIOPCIDevice *vdev, int pos)
{
uint16_t ctrl;
bool msi_64bit, msi_maskbit;
@@ -2777,7 +2777,7 @@ static int vfio_setup_msi(VFIODevice *vdev, int pos)
* need to first look for where the MSI-X table lives. So we
* unfortunately split MSI-X setup across two functions.
*/
-static int vfio_early_setup_msix(VFIODevice *vdev)
+static int vfio_early_setup_msix(VFIOPCIDevice *vdev)
{
uint8_t pos;
uint16_t ctrl;
@@ -2823,7 +2823,7 @@ static int vfio_early_setup_msix(VFIODevice *vdev)
return 0;
}
-static int vfio_setup_msix(VFIODevice *vdev, int pos)
+static int vfio_setup_msix(VFIOPCIDevice *vdev, int pos)
{
int ret;
@@ -2843,7 +2843,7 @@ static int vfio_setup_msix(VFIODevice *vdev, int pos)
return 0;
}
-static void vfio_teardown_msi(VFIODevice *vdev)
+static void vfio_teardown_msi(VFIOPCIDevice *vdev)
{
msi_uninit(&vdev->pdev);
@@ -2856,7 +2856,7 @@ static void vfio_teardown_msi(VFIODevice *vdev)
/*
* Resource setup
*/
-static void vfio_mmap_set_enabled(VFIODevice *vdev, bool enabled)
+static void vfio_mmap_set_enabled(VFIOPCIDevice *vdev, bool enabled)
{
int i;
@@ -2874,7 +2874,7 @@ static void vfio_mmap_set_enabled(VFIODevice *vdev, bool enabled)
}
}
-static void vfio_unmap_bar(VFIODevice *vdev, int nr)
+static void vfio_unmap_bar(VFIOPCIDevice *vdev, int nr)
{
VFIOBAR *bar = &vdev->bars[nr];
@@ -2893,7 +2893,7 @@ static void vfio_unmap_bar(VFIODevice *vdev, int nr)
}
}
-static int vfio_mmap_bar(VFIODevice *vdev, VFIOBAR *bar,
+static int vfio_mmap_bar(VFIOPCIDevice *vdev, VFIOBAR *bar,
MemoryRegion *mem, MemoryRegion *submem,
void **map, size_t size, off_t offset,
const char *name)
@@ -2932,7 +2932,7 @@ empty_region:
return ret;
}
-static void vfio_map_bar(VFIODevice *vdev, int nr)
+static void vfio_map_bar(VFIOPCIDevice *vdev, int nr)
{
VFIOBAR *bar = &vdev->bars[nr];
unsigned size = bar->size;
@@ -3001,7 +3001,7 @@ static void vfio_map_bar(VFIODevice *vdev, int nr)
vfio_bar_quirk_setup(vdev, nr);
}
-static void vfio_map_bars(VFIODevice *vdev)
+static void vfio_map_bars(VFIOPCIDevice *vdev)
{
int i;
@@ -3033,7 +3033,7 @@ static void vfio_map_bars(VFIODevice *vdev)
}
}
-static void vfio_unmap_bars(VFIODevice *vdev)
+static void vfio_unmap_bars(VFIOPCIDevice *vdev)
{
int i;
@@ -3069,7 +3069,7 @@ static void vfio_set_word_bits(uint8_t *buf, uint16_t val, uint16_t mask)
pci_set_word(buf, (pci_get_word(buf) & ~mask) | val);
}
-static void vfio_add_emulated_word(VFIODevice *vdev, int pos,
+static void vfio_add_emulated_word(VFIOPCIDevice *vdev, int pos,
uint16_t val, uint16_t mask)
{
vfio_set_word_bits(vdev->pdev.config + pos, val, mask);
@@ -3082,7 +3082,7 @@ static void vfio_set_long_bits(uint8_t *buf, uint32_t val, uint32_t mask)
pci_set_long(buf, (pci_get_long(buf) & ~mask) | val);
}
-static void vfio_add_emulated_long(VFIODevice *vdev, int pos,
+static void vfio_add_emulated_long(VFIOPCIDevice *vdev, int pos,
uint32_t val, uint32_t mask)
{
vfio_set_long_bits(vdev->pdev.config + pos, val, mask);
@@ -3090,7 +3090,7 @@ static void vfio_add_emulated_long(VFIODevice *vdev, int pos,
vfio_set_long_bits(vdev->emulated_config_bits + pos, mask, mask);
}
-static int vfio_setup_pcie_cap(VFIODevice *vdev, int pos, uint8_t size)
+static int vfio_setup_pcie_cap(VFIOPCIDevice *vdev, int pos, uint8_t size)
{
uint16_t flags;
uint8_t type;
@@ -3182,7 +3182,7 @@ static int vfio_setup_pcie_cap(VFIODevice *vdev, int pos, uint8_t size)
return pos;
}
-static void vfio_check_pcie_flr(VFIODevice *vdev, uint8_t pos)
+static void vfio_check_pcie_flr(VFIOPCIDevice *vdev, uint8_t pos)
{
uint32_t cap = pci_get_long(vdev->pdev.config + pos + PCI_EXP_DEVCAP);
@@ -3193,7 +3193,7 @@ static void vfio_check_pcie_flr(VFIODevice *vdev, uint8_t pos)
}
}
-static void vfio_check_pm_reset(VFIODevice *vdev, uint8_t pos)
+static void vfio_check_pm_reset(VFIOPCIDevice *vdev, uint8_t pos)
{
uint16_t csr = pci_get_word(vdev->pdev.config + pos + PCI_PM_CTRL);
@@ -3204,7 +3204,7 @@ static void vfio_check_pm_reset(VFIODevice *vdev, uint8_t pos)
}
}
-static void vfio_check_af_flr(VFIODevice *vdev, uint8_t pos)
+static void vfio_check_af_flr(VFIOPCIDevice *vdev, uint8_t pos)
{
uint8_t cap = pci_get_byte(vdev->pdev.config + pos + PCI_AF_CAP);
@@ -3215,7 +3215,7 @@ static void vfio_check_af_flr(VFIODevice *vdev, uint8_t pos)
}
}
-static int vfio_add_std_cap(VFIODevice *vdev, uint8_t pos)
+static int vfio_add_std_cap(VFIOPCIDevice *vdev, uint8_t pos)
{
PCIDevice *pdev = &vdev->pdev;
uint8_t cap_id, next, size;
@@ -3290,7 +3290,7 @@ static int vfio_add_std_cap(VFIODevice *vdev, uint8_t pos)
return 0;
}
-static int vfio_add_capabilities(VFIODevice *vdev)
+static int vfio_add_capabilities(VFIOPCIDevice *vdev)
{
PCIDevice *pdev = &vdev->pdev;
@@ -3302,7 +3302,7 @@ static int vfio_add_capabilities(VFIODevice *vdev)
return vfio_add_std_cap(vdev, pdev->config[PCI_CAPABILITY_LIST]);
}
-static void vfio_pci_pre_reset(VFIODevice *vdev)
+static void vfio_pci_pre_reset(VFIOPCIDevice *vdev)
{
PCIDevice *pdev = &vdev->pdev;
uint16_t cmd;
@@ -3339,7 +3339,7 @@ static void vfio_pci_pre_reset(VFIODevice *vdev)
vfio_pci_write_config(pdev, PCI_COMMAND, cmd, 2);
}
-static void vfio_pci_post_reset(VFIODevice *vdev)
+static void vfio_pci_post_reset(VFIOPCIDevice *vdev)
{
vfio_enable_intx(vdev);
}
@@ -3351,7 +3351,7 @@ static bool vfio_pci_host_match(PCIHostDeviceAddress *host1,
host1->slot == host2->slot && host1->function == host2->function);
}
-static int vfio_pci_hot_reset(VFIODevice *vdev, bool single)
+static int vfio_pci_hot_reset(VFIOPCIDevice *vdev, bool single)
{
VFIOGroup *group;
struct vfio_pci_hot_reset_info *info;
@@ -3402,7 +3402,7 @@ static int vfio_pci_hot_reset(VFIODevice *vdev, bool single)
/* Verify that we have all the groups required */
for (i = 0; i < info->count; i++) {
PCIHostDeviceAddress host;
- VFIODevice *tmp;
+ VFIOPCIDevice *tmp;
host.domain = devices[i].segment;
host.bus = devices[i].bus;
@@ -3496,7 +3496,7 @@ out:
/* Re-enable INTx on affected devices */
for (i = 0; i < info->count; i++) {
PCIHostDeviceAddress host;
- VFIODevice *tmp;
+ VFIOPCIDevice *tmp;
host.domain = devices[i].segment;
host.bus = devices[i].bus;
@@ -3546,12 +3546,12 @@ out_single:
* _one() will only do a hot reset for the one in-use devices case, calling
* _multi() will do nothing if a _one() would have been sufficient.
*/
-static int vfio_pci_hot_reset_one(VFIODevice *vdev)
+static int vfio_pci_hot_reset_one(VFIOPCIDevice *vdev)
{
return vfio_pci_hot_reset(vdev, true);
}
-static int vfio_pci_hot_reset_multi(VFIODevice *vdev)
+static int vfio_pci_hot_reset_multi(VFIOPCIDevice *vdev)
{
return vfio_pci_hot_reset(vdev, false);
}
@@ -3559,7 +3559,7 @@ static int vfio_pci_hot_reset_multi(VFIODevice *vdev)
static void vfio_pci_reset_handler(void *opaque)
{
VFIOGroup *group;
- VFIODevice *vdev;
+ VFIOPCIDevice *vdev;
QLIST_FOREACH(group, &group_list, next) {
QLIST_FOREACH(vdev, &group->device_list, next) {
@@ -3897,7 +3897,8 @@ static void vfio_put_group(VFIOGroup *group)
}
}
-static int vfio_get_device(VFIOGroup *group, const char *name, VFIODevice *vdev)
+static int vfio_get_device(VFIOGroup *group, const char *name,
+ VFIOPCIDevice *vdev)
{
struct vfio_device_info dev_info = { .argsz = sizeof(dev_info) };
struct vfio_region_info reg_info = { .argsz = sizeof(reg_info) };
@@ -4050,7 +4051,7 @@ error:
return ret;
}
-static void vfio_put_device(VFIODevice *vdev)
+static void vfio_put_device(VFIOPCIDevice *vdev)
{
QLIST_REMOVE(vdev, next);
vdev->group = NULL;
@@ -4064,7 +4065,7 @@ static void vfio_put_device(VFIODevice *vdev)
static void vfio_err_notifier_handler(void *opaque)
{
- VFIODevice *vdev = opaque;
+ VFIOPCIDevice *vdev = opaque;
if (!event_notifier_test_and_clear(&vdev->err_notifier)) {
return;
@@ -4093,7 +4094,7 @@ static void vfio_err_notifier_handler(void *opaque)
* and continue after disabling error recovery support for the
* device.
*/
-static void vfio_register_err_notifier(VFIODevice *vdev)
+static void vfio_register_err_notifier(VFIOPCIDevice *vdev)
{
int ret;
int argsz;
@@ -4134,7 +4135,7 @@ static void vfio_register_err_notifier(VFIODevice *vdev)
g_free(irq_set);
}
-static void vfio_unregister_err_notifier(VFIODevice *vdev)
+static void vfio_unregister_err_notifier(VFIOPCIDevice *vdev)
{
int argsz;
struct vfio_irq_set *irq_set;
@@ -4169,7 +4170,7 @@ static void vfio_unregister_err_notifier(VFIODevice *vdev)
static int vfio_initfn(PCIDevice *pdev)
{
- VFIODevice *pvdev, *vdev = DO_UPCAST(VFIODevice, pdev, pdev);
+ VFIOPCIDevice *pvdev, *vdev = DO_UPCAST(VFIOPCIDevice, pdev, pdev);
VFIOGroup *group;
char path[PATH_MAX], iommu_group_path[PATH_MAX], *group_name;
ssize_t len;
@@ -4322,7 +4323,7 @@ out_put:
static void vfio_exitfn(PCIDevice *pdev)
{
- VFIODevice *vdev = DO_UPCAST(VFIODevice, pdev, pdev);
+ VFIOPCIDevice *vdev = DO_UPCAST(VFIOPCIDevice, pdev, pdev);
VFIOGroup *group = vdev->group;
vfio_unregister_err_notifier(vdev);
@@ -4342,7 +4343,7 @@ static void vfio_exitfn(PCIDevice *pdev)
static void vfio_pci_reset(DeviceState *dev)
{
PCIDevice *pdev = DO_UPCAST(PCIDevice, qdev, dev);
- VFIODevice *vdev = DO_UPCAST(VFIODevice, pdev, pdev);
+ VFIOPCIDevice *vdev = DO_UPCAST(VFIOPCIDevice, pdev, pdev);
trace_vfio_pci_reset(vdev->host.domain, vdev->host.bus,
vdev->host.slot, vdev->host.function);
@@ -4376,7 +4377,7 @@ post_reset:
static void vfio_instance_init(Object *obj)
{
PCIDevice *pci_dev = PCI_DEVICE(obj);
- VFIODevice *vdev = DO_UPCAST(VFIODevice, pdev, PCI_DEVICE(obj));
+ VFIOPCIDevice *vdev = DO_UPCAST(VFIOPCIDevice, pdev, PCI_DEVICE(obj));
device_add_bootindex_property(obj, &vdev->bootindex,
"bootindex", NULL,
@@ -4384,15 +4385,16 @@ static void vfio_instance_init(Object *obj)
}
static Property vfio_pci_dev_properties[] = {
- DEFINE_PROP_PCI_HOST_DEVADDR("host", VFIODevice, host),
- DEFINE_PROP_UINT32("x-intx-mmap-timeout-ms", VFIODevice,
+ DEFINE_PROP_PCI_HOST_DEVADDR("host", VFIOPCIDevice, host),
+ DEFINE_PROP_UINT32("x-intx-mmap-timeout-ms", VFIOPCIDevice,
intx.mmap_timeout, 1100),
- DEFINE_PROP_BIT("x-vga", VFIODevice, features,
+ DEFINE_PROP_BIT("x-vga", VFIOPCIDevice, features,
VFIO_FEATURE_ENABLE_VGA_BIT, false),
+ DEFINE_PROP_INT32("bootindex", VFIOPCIDevice, bootindex, -1),
/*
* TODO - support passed fds... is this necessary?
- * DEFINE_PROP_STRING("vfiofd", VFIODevice, vfiofd_name),
- * DEFINE_PROP_STRING("vfiogroupfd, VFIODevice, vfiogroupfd_name),
+ * DEFINE_PROP_STRING("vfiofd", VFIOPCIDevice, vfiofd_name),
+ * DEFINE_PROP_STRING("vfiogroupfd, VFIOPCIDevice, vfiogroupfd_name),
*/
DEFINE_PROP_END_OF_LIST(),
};
@@ -4422,7 +4424,7 @@ static void vfio_pci_dev_class_init(ObjectClass *klass, void *data)
static const TypeInfo vfio_pci_dev_info = {
.name = "vfio-pci",
.parent = TYPE_PCI_DEVICE,
- .instance_size = sizeof(VFIODevice),
+ .instance_size = sizeof(VFIOPCIDevice),
.class_init = vfio_pci_dev_class_init,
.instance_init = vfio_instance_init,
};
--
1.8.3.2
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [Qemu-devel] [PATCH v8 03/19] hw/vfio/pci: generalize mask/unmask to any IRQ index
2014-11-30 18:35 [Qemu-devel] [PATCH v8 00/19] KVM platform device passthrough Eric Auger
2014-11-30 18:35 ` [Qemu-devel] [PATCH v8 01/19] vfio: move hw/misc/vfio.c to hw/vfio/pci.c Move vfio.h into include/hw/vfio Eric Auger
2014-11-30 18:35 ` [Qemu-devel] [PATCH v8 02/19] hw/vfio/pci: Rename VFIODevice into VFIOPCIDevice Eric Auger
@ 2014-11-30 18:35 ` Eric Auger
2014-11-30 18:35 ` [Qemu-devel] [PATCH v8 04/19] hw/vfio/pci: introduce minimalist VFIODevice with fd Eric Auger
` (16 subsequent siblings)
19 siblings, 0 replies; 22+ messages in thread
From: Eric Auger @ 2014-11-30 18:35 UTC (permalink / raw)
To: eric.auger, christoffer.dall, qemu-devel, agraf, pbonzini,
kim.phillips, a.rigo, manish.jaggi, joel.schopp, zhaoshenglong,
ard.biesheuvel
Cc: peter.maydell, patches, eric.auger, will.deacon, stuart.yoder,
Bharat.Bhushan, alex.williamson, a.motakis, kvmarm
To prepare for platform device introduction, rename vfio_mask_intx
and vfio_unmask_intx into vfio_mask_single_irqindex and respectively
unmask_single_irqindex. Also use a nex index parameter.
With that name and prototype the function will be usable for other
indexes than VFIO_PCI_INTX_IRQ_INDEX.
Signed-off-by: Eric Auger <eric.auger@linaro.org>
---
hw/vfio/pci.c | 20 ++++++++++----------
1 file changed, 10 insertions(+), 10 deletions(-)
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 0d7d4a0..387da1a 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -293,12 +293,12 @@ static void vfio_disable_irqindex(VFIOPCIDevice *vdev, int index)
/*
* INTx
*/
-static void vfio_unmask_intx(VFIOPCIDevice *vdev)
+static void vfio_unmask_single_irqindex(VFIOPCIDevice *vdev, int index)
{
struct vfio_irq_set irq_set = {
.argsz = sizeof(irq_set),
.flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_UNMASK,
- .index = VFIO_PCI_INTX_IRQ_INDEX,
+ .index = index,
.start = 0,
.count = 1,
};
@@ -307,12 +307,12 @@ static void vfio_unmask_intx(VFIOPCIDevice *vdev)
}
#ifdef CONFIG_KVM /* Unused outside of CONFIG_KVM code */
-static void vfio_mask_intx(VFIOPCIDevice *vdev)
+static void vfio_mask_single_irqindex(VFIOPCIDevice *vdev, int index)
{
struct vfio_irq_set irq_set = {
.argsz = sizeof(irq_set),
.flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_MASK,
- .index = VFIO_PCI_INTX_IRQ_INDEX,
+ .index = index,
.start = 0,
.count = 1,
};
@@ -381,7 +381,7 @@ static void vfio_eoi(VFIOPCIDevice *vdev)
vdev->intx.pending = false;
pci_irq_deassert(&vdev->pdev);
- vfio_unmask_intx(vdev);
+ vfio_unmask_single_irqindex(vdev, VFIO_PCI_INTX_IRQ_INDEX);
}
static void vfio_enable_intx_kvm(VFIOPCIDevice *vdev)
@@ -404,7 +404,7 @@ static void vfio_enable_intx_kvm(VFIOPCIDevice *vdev)
/* Get to a known interrupt state */
qemu_set_fd_handler(irqfd.fd, NULL, NULL, vdev);
- vfio_mask_intx(vdev);
+ vfio_mask_single_irqindex(vdev, VFIO_PCI_INTX_IRQ_INDEX);
vdev->intx.pending = false;
pci_irq_deassert(&vdev->pdev);
@@ -442,7 +442,7 @@ static void vfio_enable_intx_kvm(VFIOPCIDevice *vdev)
}
/* Let'em rip */
- vfio_unmask_intx(vdev);
+ vfio_unmask_single_irqindex(vdev, VFIO_PCI_INTX_IRQ_INDEX);
vdev->intx.kvm_accel = true;
@@ -458,7 +458,7 @@ fail_irqfd:
event_notifier_cleanup(&vdev->intx.unmask);
fail:
qemu_set_fd_handler(irqfd.fd, vfio_intx_interrupt, NULL, vdev);
- vfio_unmask_intx(vdev);
+ vfio_unmask_single_irqindex(vdev, VFIO_PCI_INTX_IRQ_INDEX);
#endif
}
@@ -479,7 +479,7 @@ static void vfio_disable_intx_kvm(VFIOPCIDevice *vdev)
* Get to a known state, hardware masked, QEMU ready to accept new
* interrupts, QEMU IRQ de-asserted.
*/
- vfio_mask_intx(vdev);
+ vfio_mask_single_irqindex(vdev, VFIO_PCI_INTX_IRQ_INDEX);
vdev->intx.pending = false;
pci_irq_deassert(&vdev->pdev);
@@ -497,7 +497,7 @@ static void vfio_disable_intx_kvm(VFIOPCIDevice *vdev)
vdev->intx.kvm_accel = false;
/* If we've missed an event, let it re-fire through QEMU */
- vfio_unmask_intx(vdev);
+ vfio_unmask_single_irqindex(vdev, VFIO_PCI_INTX_IRQ_INDEX);
trace_vfio_disable_intx_kvm(vdev->host.domain, vdev->host.bus,
vdev->host.slot, vdev->host.function);
--
1.8.3.2
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [Qemu-devel] [PATCH v8 04/19] hw/vfio/pci: introduce minimalist VFIODevice with fd
2014-11-30 18:35 [Qemu-devel] [PATCH v8 00/19] KVM platform device passthrough Eric Auger
` (2 preceding siblings ...)
2014-11-30 18:35 ` [Qemu-devel] [PATCH v8 03/19] hw/vfio/pci: generalize mask/unmask to any IRQ index Eric Auger
@ 2014-11-30 18:35 ` Eric Auger
2014-11-30 18:35 ` [Qemu-devel] [PATCH v8 05/19] hw/vfio/pci: add type, name and group fields in VFIODevice Eric Auger
` (15 subsequent siblings)
19 siblings, 0 replies; 22+ messages in thread
From: Eric Auger @ 2014-11-30 18:35 UTC (permalink / raw)
To: eric.auger, christoffer.dall, qemu-devel, agraf, pbonzini,
kim.phillips, a.rigo, manish.jaggi, joel.schopp, zhaoshenglong,
ard.biesheuvel
Cc: peter.maydell, patches, eric.auger, will.deacon, stuart.yoder,
Bharat.Bhushan, alex.williamson, a.motakis, kvmarm
Introduce a new base VFIODevice strcut that will be used by both PCI
and Platform VFIO device. Move VFIOPCIDevice fd field there. Obviously
other fields from VFIOPCIDevice will be moved there but this patch
file is introduced to ease the review.
Also vfio_mask_single_irqindex, vfio_unmask_single_irqindex,
vfio_disable_irqindex now take a VFIODevice handle as argument.
Signed-off-by: Eric Auger <eric.auger@linaro.org>
---
hw/vfio/pci.c | 117 +++++++++++++++++++++++++++++++---------------------------
1 file changed, 63 insertions(+), 54 deletions(-)
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 387da1a..cd9ce4e 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -185,9 +185,13 @@ typedef struct VFIOMSIXInfo {
void *mmap;
} VFIOMSIXInfo;
+typedef struct VFIODevice {
+ int fd;
+} VFIODevice;
+
typedef struct VFIOPCIDevice {
PCIDevice pdev;
- int fd;
+ VFIODevice vbasedev;
VFIOINTx intx;
unsigned int config_size;
uint8_t *emulated_config_bits; /* QEMU emulated bits, little-endian */
@@ -277,7 +281,7 @@ static void vfio_mmap_set_enabled(VFIOPCIDevice *vdev, bool enabled);
/*
* Common VFIO interrupt disable
*/
-static void vfio_disable_irqindex(VFIOPCIDevice *vdev, int index)
+static void vfio_disable_irqindex(VFIODevice *vbasedev, int index)
{
struct vfio_irq_set irq_set = {
.argsz = sizeof(irq_set),
@@ -287,13 +291,13 @@ static void vfio_disable_irqindex(VFIOPCIDevice *vdev, int index)
.count = 0,
};
- ioctl(vdev->fd, VFIO_DEVICE_SET_IRQS, &irq_set);
+ ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, &irq_set);
}
/*
* INTx
*/
-static void vfio_unmask_single_irqindex(VFIOPCIDevice *vdev, int index)
+static void vfio_unmask_single_irqindex(VFIODevice *vbasedev, int index)
{
struct vfio_irq_set irq_set = {
.argsz = sizeof(irq_set),
@@ -303,11 +307,11 @@ static void vfio_unmask_single_irqindex(VFIOPCIDevice *vdev, int index)
.count = 1,
};
- ioctl(vdev->fd, VFIO_DEVICE_SET_IRQS, &irq_set);
+ ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, &irq_set);
}
#ifdef CONFIG_KVM /* Unused outside of CONFIG_KVM code */
-static void vfio_mask_single_irqindex(VFIOPCIDevice *vdev, int index)
+static void vfio_mask_single_irqindex(VFIODevice *vbasedev, int index)
{
struct vfio_irq_set irq_set = {
.argsz = sizeof(irq_set),
@@ -317,7 +321,7 @@ static void vfio_mask_single_irqindex(VFIOPCIDevice *vdev, int index)
.count = 1,
};
- ioctl(vdev->fd, VFIO_DEVICE_SET_IRQS, &irq_set);
+ ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, &irq_set);
}
#endif
@@ -381,7 +385,7 @@ static void vfio_eoi(VFIOPCIDevice *vdev)
vdev->intx.pending = false;
pci_irq_deassert(&vdev->pdev);
- vfio_unmask_single_irqindex(vdev, VFIO_PCI_INTX_IRQ_INDEX);
+ vfio_unmask_single_irqindex(&vdev->vbasedev, VFIO_PCI_INTX_IRQ_INDEX);
}
static void vfio_enable_intx_kvm(VFIOPCIDevice *vdev)
@@ -404,7 +408,7 @@ static void vfio_enable_intx_kvm(VFIOPCIDevice *vdev)
/* Get to a known interrupt state */
qemu_set_fd_handler(irqfd.fd, NULL, NULL, vdev);
- vfio_mask_single_irqindex(vdev, VFIO_PCI_INTX_IRQ_INDEX);
+ vfio_mask_single_irqindex(&vdev->vbasedev, VFIO_PCI_INTX_IRQ_INDEX);
vdev->intx.pending = false;
pci_irq_deassert(&vdev->pdev);
@@ -434,7 +438,7 @@ static void vfio_enable_intx_kvm(VFIOPCIDevice *vdev)
*pfd = irqfd.resamplefd;
- ret = ioctl(vdev->fd, VFIO_DEVICE_SET_IRQS, irq_set);
+ ret = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_SET_IRQS, irq_set);
g_free(irq_set);
if (ret) {
error_report("vfio: Error: Failed to setup INTx unmask fd: %m");
@@ -442,7 +446,7 @@ static void vfio_enable_intx_kvm(VFIOPCIDevice *vdev)
}
/* Let'em rip */
- vfio_unmask_single_irqindex(vdev, VFIO_PCI_INTX_IRQ_INDEX);
+ vfio_unmask_single_irqindex(&vdev->vbasedev, VFIO_PCI_INTX_IRQ_INDEX);
vdev->intx.kvm_accel = true;
@@ -458,7 +462,7 @@ fail_irqfd:
event_notifier_cleanup(&vdev->intx.unmask);
fail:
qemu_set_fd_handler(irqfd.fd, vfio_intx_interrupt, NULL, vdev);
- vfio_unmask_single_irqindex(vdev, VFIO_PCI_INTX_IRQ_INDEX);
+ vfio_unmask_single_irqindex(&vdev->vbasedev, VFIO_PCI_INTX_IRQ_INDEX);
#endif
}
@@ -479,7 +483,7 @@ static void vfio_disable_intx_kvm(VFIOPCIDevice *vdev)
* Get to a known state, hardware masked, QEMU ready to accept new
* interrupts, QEMU IRQ de-asserted.
*/
- vfio_mask_single_irqindex(vdev, VFIO_PCI_INTX_IRQ_INDEX);
+ vfio_mask_single_irqindex(&vdev->vbasedev, VFIO_PCI_INTX_IRQ_INDEX);
vdev->intx.pending = false;
pci_irq_deassert(&vdev->pdev);
@@ -497,7 +501,7 @@ static void vfio_disable_intx_kvm(VFIOPCIDevice *vdev)
vdev->intx.kvm_accel = false;
/* If we've missed an event, let it re-fire through QEMU */
- vfio_unmask_single_irqindex(vdev, VFIO_PCI_INTX_IRQ_INDEX);
+ vfio_unmask_single_irqindex(&vdev->vbasedev, VFIO_PCI_INTX_IRQ_INDEX);
trace_vfio_disable_intx_kvm(vdev->host.domain, vdev->host.bus,
vdev->host.slot, vdev->host.function);
@@ -583,7 +587,7 @@ static int vfio_enable_intx(VFIOPCIDevice *vdev)
*pfd = event_notifier_get_fd(&vdev->intx.interrupt);
qemu_set_fd_handler(*pfd, vfio_intx_interrupt, NULL, vdev);
- ret = ioctl(vdev->fd, VFIO_DEVICE_SET_IRQS, irq_set);
+ ret = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_SET_IRQS, irq_set);
g_free(irq_set);
if (ret) {
error_report("vfio: Error: Failed to setup INTx fd: %m");
@@ -608,7 +612,7 @@ static void vfio_disable_intx(VFIOPCIDevice *vdev)
timer_del(vdev->intx.mmap_timer);
vfio_disable_intx_kvm(vdev);
- vfio_disable_irqindex(vdev, VFIO_PCI_INTX_IRQ_INDEX);
+ vfio_disable_irqindex(&vdev->vbasedev, VFIO_PCI_INTX_IRQ_INDEX);
vdev->intx.pending = false;
pci_irq_deassert(&vdev->pdev);
vfio_mmap_set_enabled(vdev, true);
@@ -698,7 +702,7 @@ static int vfio_enable_vectors(VFIOPCIDevice *vdev, bool msix)
fds[i] = fd;
}
- ret = ioctl(vdev->fd, VFIO_DEVICE_SET_IRQS, irq_set);
+ ret = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_SET_IRQS, irq_set);
g_free(irq_set);
@@ -795,7 +799,7 @@ static int vfio_msix_vector_do_use(PCIDevice *pdev, unsigned int nr,
* increase them as needed.
*/
if (vdev->nr_vectors < nr + 1) {
- vfio_disable_irqindex(vdev, VFIO_PCI_MSIX_IRQ_INDEX);
+ vfio_disable_irqindex(&vdev->vbasedev, VFIO_PCI_MSIX_IRQ_INDEX);
vdev->nr_vectors = nr + 1;
ret = vfio_enable_vectors(vdev, true);
if (ret) {
@@ -823,7 +827,7 @@ static int vfio_msix_vector_do_use(PCIDevice *pdev, unsigned int nr,
*pfd = event_notifier_get_fd(&vector->interrupt);
}
- ret = ioctl(vdev->fd, VFIO_DEVICE_SET_IRQS, irq_set);
+ ret = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_SET_IRQS, irq_set);
g_free(irq_set);
if (ret) {
error_report("vfio: failed to modify vector, %d", ret);
@@ -874,7 +878,7 @@ static void vfio_msix_vector_release(PCIDevice *pdev, unsigned int nr)
*pfd = event_notifier_get_fd(&vector->interrupt);
- ioctl(vdev->fd, VFIO_DEVICE_SET_IRQS, irq_set);
+ ioctl(vdev->vbasedev.fd, VFIO_DEVICE_SET_IRQS, irq_set);
g_free(irq_set);
}
@@ -1033,7 +1037,7 @@ static void vfio_disable_msix(VFIOPCIDevice *vdev)
}
if (vdev->nr_vectors) {
- vfio_disable_irqindex(vdev, VFIO_PCI_MSIX_IRQ_INDEX);
+ vfio_disable_irqindex(&vdev->vbasedev, VFIO_PCI_MSIX_IRQ_INDEX);
}
vfio_disable_msi_common(vdev);
@@ -1044,7 +1048,7 @@ static void vfio_disable_msix(VFIOPCIDevice *vdev)
static void vfio_disable_msi(VFIOPCIDevice *vdev)
{
- vfio_disable_irqindex(vdev, VFIO_PCI_MSI_IRQ_INDEX);
+ vfio_disable_irqindex(&vdev->vbasedev, VFIO_PCI_MSI_IRQ_INDEX);
vfio_disable_msi_common(vdev);
trace_vfio_disable_msi(vdev->host.domain, vdev->host.bus,
@@ -1188,7 +1192,7 @@ static void vfio_pci_load_rom(VFIOPCIDevice *vdev)
off_t off = 0;
size_t bytes;
- if (ioctl(vdev->fd, VFIO_DEVICE_GET_REGION_INFO, ®_info)) {
+ if (ioctl(vdev->vbasedev.fd, VFIO_DEVICE_GET_REGION_INFO, ®_info)) {
error_report("vfio: Error getting ROM info: %m");
return;
}
@@ -1218,7 +1222,8 @@ static void vfio_pci_load_rom(VFIOPCIDevice *vdev)
memset(vdev->rom, 0xff, size);
while (size) {
- bytes = pread(vdev->fd, vdev->rom + off, size, vdev->rom_offset + off);
+ bytes = pread(vdev->vbasedev.fd, vdev->rom + off,
+ size, vdev->rom_offset + off);
if (bytes == 0) {
break;
} else if (bytes > 0) {
@@ -1312,6 +1317,7 @@ static void vfio_pci_size_rom(VFIOPCIDevice *vdev)
off_t offset = vdev->config_offset + PCI_ROM_ADDRESS;
DeviceState *dev = DEVICE(vdev);
char name[32];
+ int fd = vdev->vbasedev.fd;
if (vdev->pdev.romfile || !vdev->pdev.rom_bar) {
/* Since pci handles romfile, just print a message and return */
@@ -1330,10 +1336,10 @@ static void vfio_pci_size_rom(VFIOPCIDevice *vdev)
* Use the same size ROM BAR as the physical device. The contents
* will get filled in later when the guest tries to read it.
*/
- if (pread(vdev->fd, &orig, 4, offset) != 4 ||
- pwrite(vdev->fd, &size, 4, offset) != 4 ||
- pread(vdev->fd, &size, 4, offset) != 4 ||
- pwrite(vdev->fd, &orig, 4, offset) != 4) {
+ if (pread(fd, &orig, 4, offset) != 4 ||
+ pwrite(fd, &size, 4, offset) != 4 ||
+ pread(fd, &size, 4, offset) != 4 ||
+ pwrite(fd, &orig, 4, offset) != 4) {
error_report("%s(%04x:%02x:%02x.%x) failed: %m",
__func__, vdev->host.domain, vdev->host.bus,
vdev->host.slot, vdev->host.function);
@@ -2345,7 +2351,8 @@ static uint32_t vfio_pci_read_config(PCIDevice *pdev, uint32_t addr, int len)
if (~emu_bits & (0xffffffffU >> (32 - len * 8))) {
ssize_t ret;
- ret = pread(vdev->fd, &phys_val, len, vdev->config_offset + addr);
+ ret = pread(vdev->vbasedev.fd, &phys_val, len,
+ vdev->config_offset + addr);
if (ret != len) {
error_report("%s(%04x:%02x:%02x.%x, 0x%x, 0x%x) failed: %m",
__func__, vdev->host.domain, vdev->host.bus,
@@ -2375,7 +2382,8 @@ static void vfio_pci_write_config(PCIDevice *pdev, uint32_t addr,
addr, val, len);
/* Write everything to VFIO, let it filter out what we can't write */
- if (pwrite(vdev->fd, &val_le, len, vdev->config_offset + addr) != len) {
+ if (pwrite(vdev->vbasedev.fd, &val_le, len, vdev->config_offset + addr)
+ != len) {
error_report("%s(%04x:%02x:%02x.%x, 0x%x, 0x%x, 0x%x) failed: %m",
__func__, vdev->host.domain, vdev->host.bus,
vdev->host.slot, vdev->host.function, addr, val, len);
@@ -2743,7 +2751,7 @@ static int vfio_setup_msi(VFIOPCIDevice *vdev, int pos)
bool msi_64bit, msi_maskbit;
int ret, entries;
- if (pread(vdev->fd, &ctrl, sizeof(ctrl),
+ if (pread(vdev->vbasedev.fd, &ctrl, sizeof(ctrl),
vdev->config_offset + pos + PCI_CAP_FLAGS) != sizeof(ctrl)) {
return -errno;
}
@@ -2782,23 +2790,24 @@ static int vfio_early_setup_msix(VFIOPCIDevice *vdev)
uint8_t pos;
uint16_t ctrl;
uint32_t table, pba;
+ int fd = vdev->vbasedev.fd;
pos = pci_find_capability(&vdev->pdev, PCI_CAP_ID_MSIX);
if (!pos) {
return 0;
}
- if (pread(vdev->fd, &ctrl, sizeof(ctrl),
+ if (pread(fd, &ctrl, sizeof(ctrl),
vdev->config_offset + pos + PCI_CAP_FLAGS) != sizeof(ctrl)) {
return -errno;
}
- if (pread(vdev->fd, &table, sizeof(table),
+ if (pread(fd, &table, sizeof(table),
vdev->config_offset + pos + PCI_MSIX_TABLE) != sizeof(table)) {
return -errno;
}
- if (pread(vdev->fd, &pba, sizeof(pba),
+ if (pread(fd, &pba, sizeof(pba),
vdev->config_offset + pos + PCI_MSIX_PBA) != sizeof(pba)) {
return -errno;
}
@@ -2951,7 +2960,7 @@ static void vfio_map_bar(VFIOPCIDevice *vdev, int nr)
vdev->host.function, nr);
/* Determine what type of BAR this is for registration */
- ret = pread(vdev->fd, &pci_bar, sizeof(pci_bar),
+ ret = pread(vdev->vbasedev.fd, &pci_bar, sizeof(pci_bar),
vdev->config_offset + PCI_BASE_ADDRESS_0 + (4 * nr));
if (ret != sizeof(pci_bar)) {
error_report("vfio: Failed to read BAR %d (%m)", nr);
@@ -3371,7 +3380,7 @@ static int vfio_pci_hot_reset(VFIOPCIDevice *vdev, bool single)
info = g_malloc0(sizeof(*info));
info->argsz = sizeof(*info);
- ret = ioctl(vdev->fd, VFIO_DEVICE_GET_PCI_HOT_RESET_INFO, info);
+ ret = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_GET_PCI_HOT_RESET_INFO, info);
if (ret && errno != ENOSPC) {
ret = -errno;
if (!vdev->has_pm_reset) {
@@ -3387,7 +3396,7 @@ static int vfio_pci_hot_reset(VFIOPCIDevice *vdev, bool single)
info->argsz = sizeof(*info) + (count * sizeof(*devices));
devices = &info->devices[0];
- ret = ioctl(vdev->fd, VFIO_DEVICE_GET_PCI_HOT_RESET_INFO, info);
+ ret = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_GET_PCI_HOT_RESET_INFO, info);
if (ret) {
ret = -errno;
error_report("vfio: hot reset info failed: %m");
@@ -3483,7 +3492,7 @@ static int vfio_pci_hot_reset(VFIOPCIDevice *vdev, bool single)
}
/* Bus reset! */
- ret = ioctl(vdev->fd, VFIO_DEVICE_PCI_HOT_RESET, reset);
+ ret = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_PCI_HOT_RESET, reset);
g_free(reset);
trace_vfio_pci_hot_reset_result(vdev->host.domain,
@@ -3914,12 +3923,12 @@ static int vfio_get_device(VFIOGroup *group, const char *name,
return ret;
}
- vdev->fd = ret;
+ vdev->vbasedev.fd = ret;
vdev->group = group;
QLIST_INSERT_HEAD(&group->device_list, vdev, next);
/* Sanity check device */
- ret = ioctl(vdev->fd, VFIO_DEVICE_GET_INFO, &dev_info);
+ ret = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_GET_INFO, &dev_info);
if (ret) {
error_report("vfio: error getting device info: %m");
goto error;
@@ -3949,7 +3958,7 @@ static int vfio_get_device(VFIOGroup *group, const char *name,
for (i = VFIO_PCI_BAR0_REGION_INDEX; i < VFIO_PCI_ROM_REGION_INDEX; i++) {
reg_info.index = i;
- ret = ioctl(vdev->fd, VFIO_DEVICE_GET_REGION_INFO, ®_info);
+ ret = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_GET_REGION_INFO, ®_info);
if (ret) {
error_report("vfio: Error getting region %d info: %m", i);
goto error;
@@ -3963,14 +3972,14 @@ static int vfio_get_device(VFIOGroup *group, const char *name,
vdev->bars[i].flags = reg_info.flags;
vdev->bars[i].size = reg_info.size;
vdev->bars[i].fd_offset = reg_info.offset;
- vdev->bars[i].fd = vdev->fd;
+ vdev->bars[i].fd = vdev->vbasedev.fd;
vdev->bars[i].nr = i;
QLIST_INIT(&vdev->bars[i].quirks);
}
reg_info.index = VFIO_PCI_CONFIG_REGION_INDEX;
- ret = ioctl(vdev->fd, VFIO_DEVICE_GET_REGION_INFO, ®_info);
+ ret = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_GET_REGION_INFO, ®_info);
if (ret) {
error_report("vfio: Error getting config info: %m");
goto error;
@@ -3993,7 +4002,7 @@ static int vfio_get_device(VFIOGroup *group, const char *name,
.index = VFIO_PCI_VGA_REGION_INDEX,
};
- ret = ioctl(vdev->fd, VFIO_DEVICE_GET_REGION_INFO, &vga_info);
+ ret = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_GET_REGION_INFO, &vga_info);
if (ret) {
error_report(
"vfio: Device does not support requested feature x-vga");
@@ -4010,7 +4019,7 @@ static int vfio_get_device(VFIOGroup *group, const char *name,
}
vdev->vga.fd_offset = vga_info.offset;
- vdev->vga.fd = vdev->fd;
+ vdev->vga.fd = vdev->vbasedev.fd;
vdev->vga.region[QEMU_PCI_VGA_MEM].offset = QEMU_PCI_VGA_MEM_BASE;
vdev->vga.region[QEMU_PCI_VGA_MEM].nr = QEMU_PCI_VGA_MEM;
@@ -4028,7 +4037,7 @@ static int vfio_get_device(VFIOGroup *group, const char *name,
}
irq_info.index = VFIO_PCI_ERR_IRQ_INDEX;
- ret = ioctl(vdev->fd, VFIO_DEVICE_GET_IRQ_INFO, &irq_info);
+ ret = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_GET_IRQ_INFO, &irq_info);
if (ret) {
/* This can fail for an old kernel or legacy PCI dev */
trace_vfio_get_device_get_irq_info_failure();
@@ -4046,7 +4055,7 @@ error:
if (ret) {
QLIST_REMOVE(vdev, next);
vdev->group = NULL;
- close(vdev->fd);
+ close(vdev->vbasedev.fd);
}
return ret;
}
@@ -4055,8 +4064,8 @@ static void vfio_put_device(VFIOPCIDevice *vdev)
{
QLIST_REMOVE(vdev, next);
vdev->group = NULL;
- trace_vfio_put_device(vdev->fd);
- close(vdev->fd);
+ trace_vfio_put_device(vdev->vbasedev.fd);
+ close(vdev->vbasedev.fd);
if (vdev->msix) {
g_free(vdev->msix);
vdev->msix = NULL;
@@ -4125,7 +4134,7 @@ static void vfio_register_err_notifier(VFIOPCIDevice *vdev)
*pfd = event_notifier_get_fd(&vdev->err_notifier);
qemu_set_fd_handler(*pfd, vfio_err_notifier_handler, NULL, vdev);
- ret = ioctl(vdev->fd, VFIO_DEVICE_SET_IRQS, irq_set);
+ ret = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_SET_IRQS, irq_set);
if (ret) {
error_report("vfio: Failed to set up error notification");
qemu_set_fd_handler(*pfd, NULL, NULL, vdev);
@@ -4158,7 +4167,7 @@ static void vfio_unregister_err_notifier(VFIOPCIDevice *vdev)
pfd = (int32_t *)&irq_set->data;
*pfd = -1;
- ret = ioctl(vdev->fd, VFIO_DEVICE_SET_IRQS, irq_set);
+ ret = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_SET_IRQS, irq_set);
if (ret) {
error_report("vfio: Failed to de-assign error fd: %m");
}
@@ -4237,7 +4246,7 @@ static int vfio_initfn(PCIDevice *pdev)
}
/* Get a copy of config space */
- ret = pread(vdev->fd, vdev->pdev.config,
+ ret = pread(vdev->vbasedev.fd, vdev->pdev.config,
MIN(pci_config_size(&vdev->pdev), vdev->config_size),
vdev->config_offset);
if (ret < (int)MIN(pci_config_size(&vdev->pdev), vdev->config_size)) {
@@ -4351,7 +4360,7 @@ static void vfio_pci_reset(DeviceState *dev)
vfio_pci_pre_reset(vdev);
if (vdev->reset_works && (vdev->has_flr || !vdev->has_pm_reset) &&
- !ioctl(vdev->fd, VFIO_DEVICE_RESET)) {
+ !ioctl(vdev->vbasedev.fd, VFIO_DEVICE_RESET)) {
trace_vfio_pci_reset_flr(vdev->host.domain, vdev->host.bus,
vdev->host.slot, vdev->host.function);
goto post_reset;
@@ -4364,7 +4373,7 @@ static void vfio_pci_reset(DeviceState *dev)
/* If nothing else works and the device supports PM reset, use it */
if (vdev->reset_works && vdev->has_pm_reset &&
- !ioctl(vdev->fd, VFIO_DEVICE_RESET)) {
+ !ioctl(vdev->vbasedev.fd, VFIO_DEVICE_RESET)) {
trace_vfio_pci_reset_pm(vdev->host.domain, vdev->host.bus,
vdev->host.slot, vdev->host.function);
goto post_reset;
--
1.8.3.2
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [Qemu-devel] [PATCH v8 05/19] hw/vfio/pci: add type, name and group fields in VFIODevice
2014-11-30 18:35 [Qemu-devel] [PATCH v8 00/19] KVM platform device passthrough Eric Auger
` (3 preceding siblings ...)
2014-11-30 18:35 ` [Qemu-devel] [PATCH v8 04/19] hw/vfio/pci: introduce minimalist VFIODevice with fd Eric Auger
@ 2014-11-30 18:35 ` Eric Auger
2014-11-30 18:35 ` [Qemu-devel] [PATCH v8 06/19] hw/vfio/pci: handle reset at VFIODevice Eric Auger
` (14 subsequent siblings)
19 siblings, 0 replies; 22+ messages in thread
From: Eric Auger @ 2014-11-30 18:35 UTC (permalink / raw)
To: eric.auger, christoffer.dall, qemu-devel, agraf, pbonzini,
kim.phillips, a.rigo, manish.jaggi, joel.schopp, zhaoshenglong,
ard.biesheuvel
Cc: peter.maydell, patches, eric.auger, will.deacon, stuart.yoder,
Bharat.Bhushan, alex.williamson, a.motakis, kvmarm
Add 3 new fields in the VFIODevice struct. Type is set to
VFIO_DEVICE_TYPE_PCI. The type enum value will later be used
to discriminate between VFIO PCI and platform devices. The name is
set to domain:bus:slot:function. Currently used to test whether
the device already is attached to the group. Later on, the name
will be used to simplify all traces. The group is simply moved
from VFIOPCIDevice to VFIODevice.
Signed-off-by: Eric Auger <eric.auger@linaro.org>
---
hw/vfio/pci.c | 27 ++++++++++++++++++---------
1 file changed, 18 insertions(+), 9 deletions(-)
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index cd9ce4e..157e1a5 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -48,6 +48,10 @@
#define VFIO_ALLOW_KVM_MSI 1
#define VFIO_ALLOW_KVM_MSIX 1
+enum {
+ VFIO_DEVICE_TYPE_PCI = 0,
+};
+
struct VFIOPCIDevice;
typedef struct VFIOQuirk {
@@ -186,7 +190,10 @@ typedef struct VFIOMSIXInfo {
} VFIOMSIXInfo;
typedef struct VFIODevice {
+ struct VFIOGroup *group;
+ char *name;
int fd;
+ int type;
} VFIODevice;
typedef struct VFIOPCIDevice {
@@ -208,7 +215,6 @@ typedef struct VFIOPCIDevice {
VFIOVGA vga; /* 0xa0000, 0x3b0, 0x3c0 */
PCIHostDeviceAddress host;
QLIST_ENTRY(VFIOPCIDevice) next;
- struct VFIOGroup *group;
EventNotifier err_notifier;
uint32_t features;
#define VFIO_FEATURE_ENABLE_VGA_BIT 0
@@ -3924,7 +3930,7 @@ static int vfio_get_device(VFIOGroup *group, const char *name,
}
vdev->vbasedev.fd = ret;
- vdev->group = group;
+ vdev->vbasedev.group = group;
QLIST_INSERT_HEAD(&group->device_list, vdev, next);
/* Sanity check device */
@@ -4054,7 +4060,7 @@ static int vfio_get_device(VFIOGroup *group, const char *name,
error:
if (ret) {
QLIST_REMOVE(vdev, next);
- vdev->group = NULL;
+ vdev->vbasedev.group = NULL;
close(vdev->vbasedev.fd);
}
return ret;
@@ -4063,9 +4069,10 @@ error:
static void vfio_put_device(VFIOPCIDevice *vdev)
{
QLIST_REMOVE(vdev, next);
- vdev->group = NULL;
+ vdev->vbasedev.group = NULL;
trace_vfio_put_device(vdev->vbasedev.fd);
close(vdev->vbasedev.fd);
+ g_free(vdev->vbasedev.name);
if (vdev->msix) {
g_free(vdev->msix);
vdev->msix = NULL;
@@ -4197,6 +4204,11 @@ static int vfio_initfn(PCIDevice *pdev)
return -errno;
}
+ vdev->vbasedev.type = VFIO_DEVICE_TYPE_PCI;
+ g_strdup_printf(vdev->vbasedev.name, "%04x:%02x:%02x.%01x",
+ vdev->host.domain, vdev->host.bus, vdev->host.slot,
+ vdev->host.function);
+
strncat(path, "iommu_group", sizeof(path) - strlen(path) - 1);
len = readlink(path, iommu_group_path, sizeof(path));
@@ -4227,10 +4239,7 @@ static int vfio_initfn(PCIDevice *pdev)
vdev->host.function);
QLIST_FOREACH(pvdev, &group->device_list, next) {
- if (pvdev->host.domain == vdev->host.domain &&
- pvdev->host.bus == vdev->host.bus &&
- pvdev->host.slot == vdev->host.slot &&
- pvdev->host.function == vdev->host.function) {
+ if (strcmp(pvdev->vbasedev.name, vdev->vbasedev.name) == 0) {
error_report("vfio: error: device %s is already attached", path);
vfio_put_group(group);
@@ -4333,7 +4342,7 @@ out_put:
static void vfio_exitfn(PCIDevice *pdev)
{
VFIOPCIDevice *vdev = DO_UPCAST(VFIOPCIDevice, pdev, pdev);
- VFIOGroup *group = vdev->group;
+ VFIOGroup *group = vdev->vbasedev.group;
vfio_unregister_err_notifier(vdev);
pci_device_set_intx_routing_notifier(&vdev->pdev, NULL);
--
1.8.3.2
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [Qemu-devel] [PATCH v8 06/19] hw/vfio/pci: handle reset at VFIODevice
2014-11-30 18:35 [Qemu-devel] [PATCH v8 00/19] KVM platform device passthrough Eric Auger
` (4 preceding siblings ...)
2014-11-30 18:35 ` [Qemu-devel] [PATCH v8 05/19] hw/vfio/pci: add type, name and group fields in VFIODevice Eric Auger
@ 2014-11-30 18:35 ` Eric Auger
2014-11-30 18:35 ` [Qemu-devel] [PATCH v8 07/19] hw/vfio/pci: Introduce VFIORegion Eric Auger
` (13 subsequent siblings)
19 siblings, 0 replies; 22+ messages in thread
From: Eric Auger @ 2014-11-30 18:35 UTC (permalink / raw)
To: eric.auger, christoffer.dall, qemu-devel, agraf, pbonzini,
kim.phillips, a.rigo, manish.jaggi, joel.schopp, zhaoshenglong,
ard.biesheuvel
Cc: peter.maydell, patches, eric.auger, will.deacon, stuart.yoder,
Bharat.Bhushan, alex.williamson, a.motakis, kvmarm
Since we can potentially have both PCI and platform devices in
the same VFIO group, this latter now owns a list of VFIODevices.
A unified reset handler, vfio_reset_handler, is registered, looping
through this VFIODevice list. 2 specialized operations are introduced
(vfio_compute_needs_reset and vfio_hot_reset_multi): they allow to
implement type specific behavior. also reset_works and needs_reset
VFIOPCIDevice fields are moved into VFIODevice.
Signed-off-by: Eric Auger <eric.auger@linaro.org>
---
v8: compared to [PATCH v7 03/16] hw/vfio/pci: introduce VFIODevice,
vfio_compute_needs_reset does not return a bool anymore.
---
hw/vfio/pci.c | 93 ++++++++++++++++++++++++++++++++++++++++-------------------
1 file changed, 63 insertions(+), 30 deletions(-)
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 157e1a5..e68865b 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -189,13 +189,24 @@ typedef struct VFIOMSIXInfo {
void *mmap;
} VFIOMSIXInfo;
+typedef struct VFIODeviceOps VFIODeviceOps;
+
typedef struct VFIODevice {
+ QLIST_ENTRY(VFIODevice) next;
struct VFIOGroup *group;
char *name;
int fd;
int type;
+ bool reset_works;
+ bool needs_reset;
+ VFIODeviceOps *ops;
} VFIODevice;
+struct VFIODeviceOps {
+ void (*vfio_compute_needs_reset)(VFIODevice *vdev);
+ int (*vfio_hot_reset_multi)(VFIODevice *vdev);
+};
+
typedef struct VFIOPCIDevice {
PCIDevice pdev;
VFIODevice vbasedev;
@@ -214,19 +225,16 @@ typedef struct VFIOPCIDevice {
VFIOBAR bars[PCI_NUM_REGIONS - 1]; /* No ROM */
VFIOVGA vga; /* 0xa0000, 0x3b0, 0x3c0 */
PCIHostDeviceAddress host;
- QLIST_ENTRY(VFIOPCIDevice) next;
EventNotifier err_notifier;
uint32_t features;
#define VFIO_FEATURE_ENABLE_VGA_BIT 0
#define VFIO_FEATURE_ENABLE_VGA (1 << VFIO_FEATURE_ENABLE_VGA_BIT)
int32_t bootindex;
uint8_t pm_cap;
- bool reset_works;
bool has_vga;
bool pci_aer;
bool has_flr;
bool has_pm_reset;
- bool needs_reset;
bool rom_read_failed;
} VFIOPCIDevice;
@@ -234,7 +242,7 @@ typedef struct VFIOGroup {
int fd;
int groupid;
VFIOContainer *container;
- QLIST_HEAD(, VFIOPCIDevice) device_list;
+ QLIST_HEAD(, VFIODevice) device_list;
QLIST_ENTRY(VFIOGroup) next;
QLIST_ENTRY(VFIOGroup) container_next;
} VFIOGroup;
@@ -3381,7 +3389,7 @@ static int vfio_pci_hot_reset(VFIOPCIDevice *vdev, bool single)
single ? "one" : "multi");
vfio_pci_pre_reset(vdev);
- vdev->needs_reset = false;
+ vdev->vbasedev.needs_reset = false;
info = g_malloc0(sizeof(*info));
info->argsz = sizeof(*info);
@@ -3418,6 +3426,7 @@ static int vfio_pci_hot_reset(VFIOPCIDevice *vdev, bool single)
for (i = 0; i < info->count; i++) {
PCIHostDeviceAddress host;
VFIOPCIDevice *tmp;
+ VFIODevice *vbasedev_iter;
host.domain = devices[i].segment;
host.bus = devices[i].bus;
@@ -3449,7 +3458,11 @@ static int vfio_pci_hot_reset(VFIOPCIDevice *vdev, bool single)
}
/* Prep dependent devices for reset and clear our marker. */
- QLIST_FOREACH(tmp, &group->device_list, next) {
+ QLIST_FOREACH(vbasedev_iter, &group->device_list, next) {
+ if (vbasedev_iter->type != VFIO_DEVICE_TYPE_PCI) {
+ continue;
+ }
+ tmp = container_of(vbasedev_iter, VFIOPCIDevice, vbasedev);
if (vfio_pci_host_match(&host, &tmp->host)) {
if (single) {
error_report("vfio: found another in-use device "
@@ -3459,7 +3472,7 @@ static int vfio_pci_hot_reset(VFIOPCIDevice *vdev, bool single)
goto out_single;
}
vfio_pci_pre_reset(tmp);
- tmp->needs_reset = false;
+ tmp->vbasedev.needs_reset = false;
multi = true;
break;
}
@@ -3512,6 +3525,7 @@ out:
for (i = 0; i < info->count; i++) {
PCIHostDeviceAddress host;
VFIOPCIDevice *tmp;
+ VFIODevice *vbasedev_iter;
host.domain = devices[i].segment;
host.bus = devices[i].bus;
@@ -3532,7 +3546,11 @@ out:
break;
}
- QLIST_FOREACH(tmp, &group->device_list, next) {
+ QLIST_FOREACH(vbasedev_iter, &group->device_list, next) {
+ if (vbasedev_iter->type != VFIO_DEVICE_TYPE_PCI) {
+ continue;
+ }
+ tmp = container_of(vbasedev_iter, VFIOPCIDevice, vbasedev);
if (vfio_pci_host_match(&host, &tmp->host)) {
vfio_pci_post_reset(tmp);
break;
@@ -3566,28 +3584,40 @@ static int vfio_pci_hot_reset_one(VFIOPCIDevice *vdev)
return vfio_pci_hot_reset(vdev, true);
}
-static int vfio_pci_hot_reset_multi(VFIOPCIDevice *vdev)
+static int vfio_pci_hot_reset_multi(VFIODevice *vbasedev)
{
+ VFIOPCIDevice *vdev = container_of(vbasedev, VFIOPCIDevice, vbasedev);
return vfio_pci_hot_reset(vdev, false);
}
-static void vfio_pci_reset_handler(void *opaque)
+static void vfio_pci_compute_needs_reset(VFIODevice *vbasedev)
+{
+ VFIOPCIDevice *vdev = container_of(vbasedev, VFIOPCIDevice, vbasedev);
+ if (!vbasedev->reset_works || (!vdev->has_flr && vdev->has_pm_reset)) {
+ vbasedev->needs_reset = true;
+ }
+}
+
+static VFIODeviceOps vfio_pci_ops = {
+ .vfio_compute_needs_reset = vfio_pci_compute_needs_reset,
+ .vfio_hot_reset_multi = vfio_pci_hot_reset_multi,
+};
+
+static void vfio_reset_handler(void *opaque)
{
VFIOGroup *group;
- VFIOPCIDevice *vdev;
+ VFIODevice *vbasedev;
QLIST_FOREACH(group, &group_list, next) {
- QLIST_FOREACH(vdev, &group->device_list, next) {
- if (!vdev->reset_works || (!vdev->has_flr && vdev->has_pm_reset)) {
- vdev->needs_reset = true;
- }
+ QLIST_FOREACH(vbasedev, &group->device_list, next) {
+ vbasedev->ops->vfio_compute_needs_reset(vbasedev);
}
}
QLIST_FOREACH(group, &group_list, next) {
- QLIST_FOREACH(vdev, &group->device_list, next) {
- if (vdev->needs_reset) {
- vfio_pci_hot_reset_multi(vdev);
+ QLIST_FOREACH(vbasedev, &group->device_list, next) {
+ if (vbasedev->needs_reset) {
+ vbasedev->ops->vfio_hot_reset_multi(vbasedev);
}
}
}
@@ -3876,7 +3906,7 @@ static VFIOGroup *vfio_get_group(int groupid, AddressSpace *as)
}
if (QLIST_EMPTY(&group_list)) {
- qemu_register_reset(vfio_pci_reset_handler, NULL);
+ qemu_register_reset(vfio_reset_handler, NULL);
}
QLIST_INSERT_HEAD(&group_list, group, next);
@@ -3908,7 +3938,7 @@ static void vfio_put_group(VFIOGroup *group)
g_free(group);
if (QLIST_EMPTY(&group_list)) {
- qemu_unregister_reset(vfio_pci_reset_handler, NULL);
+ qemu_unregister_reset(vfio_reset_handler, NULL);
}
}
@@ -3931,7 +3961,7 @@ static int vfio_get_device(VFIOGroup *group, const char *name,
vdev->vbasedev.fd = ret;
vdev->vbasedev.group = group;
- QLIST_INSERT_HEAD(&group->device_list, vdev, next);
+ QLIST_INSERT_HEAD(&group->device_list, &vdev->vbasedev, next);
/* Sanity check device */
ret = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_GET_INFO, &dev_info);
@@ -3948,7 +3978,7 @@ static int vfio_get_device(VFIOGroup *group, const char *name,
goto error;
}
- vdev->reset_works = !!(dev_info.flags & VFIO_DEVICE_FLAGS_RESET);
+ vdev->vbasedev.reset_works = !!(dev_info.flags & VFIO_DEVICE_FLAGS_RESET);
if (dev_info.num_regions < VFIO_PCI_CONFIG_REGION_INDEX + 1) {
error_report("vfio: unexpected number of io regions %u",
@@ -4059,7 +4089,7 @@ static int vfio_get_device(VFIOGroup *group, const char *name,
error:
if (ret) {
- QLIST_REMOVE(vdev, next);
+ QLIST_REMOVE(&vdev->vbasedev, next);
vdev->vbasedev.group = NULL;
close(vdev->vbasedev.fd);
}
@@ -4068,7 +4098,7 @@ error:
static void vfio_put_device(VFIOPCIDevice *vdev)
{
- QLIST_REMOVE(vdev, next);
+ QLIST_REMOVE(&vdev->vbasedev, next);
vdev->vbasedev.group = NULL;
trace_vfio_put_device(vdev->vbasedev.fd);
close(vdev->vbasedev.fd);
@@ -4186,7 +4216,8 @@ static void vfio_unregister_err_notifier(VFIOPCIDevice *vdev)
static int vfio_initfn(PCIDevice *pdev)
{
- VFIOPCIDevice *pvdev, *vdev = DO_UPCAST(VFIOPCIDevice, pdev, pdev);
+ VFIOPCIDevice *vdev = DO_UPCAST(VFIOPCIDevice, pdev, pdev);
+ VFIODevice *vbasedev_iter;
VFIOGroup *group;
char path[PATH_MAX], iommu_group_path[PATH_MAX], *group_name;
ssize_t len;
@@ -4204,6 +4235,8 @@ static int vfio_initfn(PCIDevice *pdev)
return -errno;
}
+ vdev->vbasedev.ops = &vfio_pci_ops;
+
vdev->vbasedev.type = VFIO_DEVICE_TYPE_PCI;
g_strdup_printf(vdev->vbasedev.name, "%04x:%02x:%02x.%01x",
vdev->host.domain, vdev->host.bus, vdev->host.slot,
@@ -4238,9 +4271,8 @@ static int vfio_initfn(PCIDevice *pdev)
vdev->host.domain, vdev->host.bus, vdev->host.slot,
vdev->host.function);
- QLIST_FOREACH(pvdev, &group->device_list, next) {
- if (strcmp(pvdev->vbasedev.name, vdev->vbasedev.name) == 0) {
-
+ QLIST_FOREACH(vbasedev_iter, &group->device_list, next) {
+ if (strcmp(vbasedev_iter->name, vdev->vbasedev.name) == 0) {
error_report("vfio: error: device %s is already attached", path);
vfio_put_group(group);
return -EBUSY;
@@ -4368,7 +4400,8 @@ static void vfio_pci_reset(DeviceState *dev)
vfio_pci_pre_reset(vdev);
- if (vdev->reset_works && (vdev->has_flr || !vdev->has_pm_reset) &&
+ if (vdev->vbasedev.reset_works &&
+ (vdev->has_flr || !vdev->has_pm_reset) &&
!ioctl(vdev->vbasedev.fd, VFIO_DEVICE_RESET)) {
trace_vfio_pci_reset_flr(vdev->host.domain, vdev->host.bus,
vdev->host.slot, vdev->host.function);
@@ -4381,7 +4414,7 @@ static void vfio_pci_reset(DeviceState *dev)
}
/* If nothing else works and the device supports PM reset, use it */
- if (vdev->reset_works && vdev->has_pm_reset &&
+ if (vdev->vbasedev.reset_works && vdev->has_pm_reset &&
!ioctl(vdev->vbasedev.fd, VFIO_DEVICE_RESET)) {
trace_vfio_pci_reset_pm(vdev->host.domain, vdev->host.bus,
vdev->host.slot, vdev->host.function);
--
1.8.3.2
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [Qemu-devel] [PATCH v8 07/19] hw/vfio/pci: Introduce VFIORegion
2014-11-30 18:35 [Qemu-devel] [PATCH v8 00/19] KVM platform device passthrough Eric Auger
` (5 preceding siblings ...)
2014-11-30 18:35 ` [Qemu-devel] [PATCH v8 06/19] hw/vfio/pci: handle reset at VFIODevice Eric Auger
@ 2014-11-30 18:35 ` Eric Auger
2014-11-30 18:35 ` [Qemu-devel] [PATCH v8 08/19] hw/vfio/pci: split vfio_get_device Eric Auger
` (12 subsequent siblings)
19 siblings, 0 replies; 22+ messages in thread
From: Eric Auger @ 2014-11-30 18:35 UTC (permalink / raw)
To: eric.auger, christoffer.dall, qemu-devel, agraf, pbonzini,
kim.phillips, a.rigo, manish.jaggi, joel.schopp, zhaoshenglong,
ard.biesheuvel
Cc: peter.maydell, patches, eric.auger, will.deacon, stuart.yoder,
Bharat.Bhushan, alex.williamson, a.motakis, kvmarm
This structure is going to be shared by VFIOPCIDevice and
VFIOPlatformDevice. VFIOBAR includes it.
vfio_eoi becomes an ops of VFIODevice specialized by parent device.
This makes possible to transform vfio_bar_write/read into generic
vfio_region_write/read that will be used by VFIOPlatformDevice too.
vfio_mmap_bar becomes vfio_map_region
Signed-off-by: Eric Auger <eric.auger@linaro.org>
---
v7->v8:
- integrate "Add skip_dump flag to ignore memory region during dump"
v4->v5:
- remove fd field from VFIORegion
- change error_report format string in vfio_region_write/read
- remove #ifdef DEBUG_VFIO in the same function
- correct missing initialization of bar region's vbasedev field
- change Object * parameter name of vfio_mmap_region and remove
useless OBJECT()
Conflicts:
hw/vfio/pci.c
Conflicts:
hw/vfio/pci.c
---
hw/vfio/pci.c | 193 ++++++++++++++++++++++++++++++----------------------------
trace-events | 4 +-
2 files changed, 103 insertions(+), 94 deletions(-)
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index e68865b..10c1697 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -77,15 +77,19 @@ typedef struct VFIOQuirk {
} data;
} VFIOQuirk;
-typedef struct VFIOBAR {
- off_t fd_offset; /* offset of BAR within device fd */
- int fd; /* device fd, allows us to pass VFIOBAR as opaque data */
+typedef struct VFIORegion {
+ struct VFIODevice *vbasedev;
+ off_t fd_offset; /* offset of region within device fd */
MemoryRegion mem; /* slow, read/write access */
MemoryRegion mmap_mem; /* direct mapped access */
void *mmap;
size_t size;
uint32_t flags; /* VFIO region flags (rd/wr/mmap) */
- uint8_t nr; /* cache the BAR number for debug */
+ uint8_t nr; /* cache the region number for debug */
+} VFIORegion;
+
+typedef struct VFIOBAR {
+ VFIORegion region;
bool ioport;
bool mem64;
QLIST_HEAD(, VFIOQuirk) quirks;
@@ -205,6 +209,7 @@ typedef struct VFIODevice {
struct VFIODeviceOps {
void (*vfio_compute_needs_reset)(VFIODevice *vdev);
int (*vfio_hot_reset_multi)(VFIODevice *vdev);
+ void (*vfio_eoi)(VFIODevice *vdev);
};
typedef struct VFIOPCIDevice {
@@ -388,8 +393,10 @@ static void vfio_intx_interrupt(void *opaque)
}
}
-static void vfio_eoi(VFIOPCIDevice *vdev)
+static void vfio_eoi(VFIODevice *vbasedev)
{
+ VFIOPCIDevice *vdev = container_of(vbasedev, VFIOPCIDevice, vbasedev);
+
if (!vdev->intx.pending) {
return;
}
@@ -399,7 +406,7 @@ static void vfio_eoi(VFIOPCIDevice *vdev)
vdev->intx.pending = false;
pci_irq_deassert(&vdev->pdev);
- vfio_unmask_single_irqindex(&vdev->vbasedev, VFIO_PCI_INTX_IRQ_INDEX);
+ vfio_unmask_single_irqindex(vbasedev, VFIO_PCI_INTX_IRQ_INDEX);
}
static void vfio_enable_intx_kvm(VFIOPCIDevice *vdev)
@@ -552,7 +559,7 @@ static void vfio_update_irq(PCIDevice *pdev)
vfio_enable_intx_kvm(vdev);
/* Re-enable the interrupt in cased we missed an EOI */
- vfio_eoi(vdev);
+ vfio_eoi(&vdev->vbasedev);
}
static int vfio_enable_intx(VFIOPCIDevice *vdev)
@@ -1089,10 +1096,11 @@ static void vfio_update_msi(VFIOPCIDevice *vdev)
/*
* IO Port/MMIO - Beware of the endians, VFIO is always little endian
*/
-static void vfio_bar_write(void *opaque, hwaddr addr,
- uint64_t data, unsigned size)
+static void vfio_region_write(void *opaque, hwaddr addr,
+ uint64_t data, unsigned size)
{
- VFIOBAR *bar = opaque;
+ VFIORegion *region = opaque;
+ VFIODevice *vbasedev = region->vbasedev;
union {
uint8_t byte;
uint16_t word;
@@ -1115,20 +1123,14 @@ static void vfio_bar_write(void *opaque, hwaddr addr,
break;
}
- if (pwrite(bar->fd, &buf, size, bar->fd_offset + addr) != size) {
- error_report("%s(,0x%"HWADDR_PRIx", 0x%"PRIx64", %d) failed: %m",
- __func__, addr, data, size);
+ if (pwrite(vbasedev->fd, &buf, size, region->fd_offset + addr) != size) {
+ error_report("%s(%s:region%d+0x%"HWADDR_PRIx", 0x%"PRIx64
+ ",%d) failed: %m",
+ __func__, vbasedev->name, region->nr,
+ addr, data, size);
}
-#ifdef DEBUG_VFIO
- {
- VFIOPCIDevice *vdev = container_of(bar, VFIOPCIDevice, bars[bar->nr]);
-
- trace_vfio_bar_write(vdev->host.domain, vdev->host.bus,
- vdev->host.slot, vdev->host.function,
- region->nr, addr, data, size);
- }
-#endif
+ trace_vfio_region_write(vbasedev->name, region->nr, addr, data, size);
/*
* A read or write to a BAR always signals an INTx EOI. This will
@@ -1138,13 +1140,14 @@ static void vfio_bar_write(void *opaque, hwaddr addr,
* which access will service the interrupt, so we're potentially
* getting quite a few host interrupts per guest interrupt.
*/
- vfio_eoi(container_of(bar, VFIOPCIDevice, bars[bar->nr]));
+ vbasedev->ops->vfio_eoi(vbasedev);
}
-static uint64_t vfio_bar_read(void *opaque,
- hwaddr addr, unsigned size)
+static uint64_t vfio_region_read(void *opaque,
+ hwaddr addr, unsigned size)
{
- VFIOBAR *bar = opaque;
+ VFIORegion *region = opaque;
+ VFIODevice *vbasedev = region->vbasedev;
union {
uint8_t byte;
uint16_t word;
@@ -1153,9 +1156,10 @@ static uint64_t vfio_bar_read(void *opaque,
} buf;
uint64_t data = 0;
- if (pread(bar->fd, &buf, size, bar->fd_offset + addr) != size) {
- error_report("%s(,0x%"HWADDR_PRIx", %d) failed: %m",
- __func__, addr, size);
+ if (pread(vbasedev->fd, &buf, size, region->fd_offset + addr) != size) {
+ error_report("%s(%s:region%d+0x%"HWADDR_PRIx", %d) failed: %m",
+ __func__, vbasedev->name, region->nr,
+ addr, size);
return (uint64_t)-1;
}
@@ -1174,25 +1178,17 @@ static uint64_t vfio_bar_read(void *opaque,
break;
}
-#ifdef DEBUG_VFIO
- {
- VFIOPCIDevice *vdev = container_of(bar, VFIOPCIDevice, bars[bar->nr]);
-
- trace_vfio_bar_read(vdev->host.domain, vdev->host.bus,
- vdev->host.slot, vdev->host.function,
- region->nr, addr, size, data);
- }
-#endif
+ trace_vfio_region_read(vbasedev->name, region->nr, addr, size, data);
/* Same as write above */
- vfio_eoi(container_of(bar, VFIOPCIDevice, bars[bar->nr]));
+ vbasedev->ops->vfio_eoi(vbasedev);
return data;
}
-static const MemoryRegionOps vfio_bar_ops = {
- .read = vfio_bar_read,
- .write = vfio_bar_write,
+static const MemoryRegionOps vfio_region_ops = {
+ .read = vfio_region_read,
+ .write = vfio_region_write,
.endianness = DEVICE_LITTLE_ENDIAN,
};
@@ -1529,8 +1525,8 @@ static uint64_t vfio_generic_window_quirk_read(void *opaque,
quirk->data.bar,
addr, size, data);
} else {
- data = vfio_bar_read(&vdev->bars[quirk->data.bar],
- addr + quirk->data.base_offset, size);
+ data = vfio_region_read(&vdev->bars[quirk->data.bar].region,
+ addr + quirk->data.base_offset, size);
}
return data;
@@ -1584,7 +1580,7 @@ static void vfio_generic_window_quirk_write(void *opaque, hwaddr addr,
return;
}
- vfio_bar_write(&vdev->bars[quirk->data.bar],
+ vfio_region_write(&vdev->bars[quirk->data.bar].region,
addr + quirk->data.base_offset, data, size);
}
@@ -1621,7 +1617,8 @@ static uint64_t vfio_generic_quirk_read(void *opaque,
quirk->data.bar,
addr + base, size, data);
} else {
- data = vfio_bar_read(&vdev->bars[quirk->data.bar], addr + base, size);
+ data = vfio_region_read(&vdev->bars[quirk->data.bar].region,
+ addr + base, size);
}
return data;
@@ -1653,7 +1650,8 @@ static void vfio_generic_quirk_write(void *opaque, hwaddr addr,
quirk->data.bar,
addr + base, data, size);
} else {
- vfio_bar_write(&vdev->bars[quirk->data.bar], addr + base, data, size);
+ vfio_region_write(&vdev->bars[quirk->data.bar].region,
+ addr + base, data, size);
}
}
@@ -1706,7 +1704,7 @@ static void vfio_vga_probe_ati_3c3_quirk(VFIOPCIDevice *vdev)
* As long as the BAR is >= 256 bytes it will be aligned such that the
* lower byte is always zero. Filter out anything else, if it exists.
*/
- if (!vdev->bars[4].ioport || vdev->bars[4].size < 256) {
+ if (!vdev->bars[4].ioport || vdev->bars[4].region.size < 256) {
return;
}
@@ -1758,7 +1756,7 @@ static void vfio_probe_ati_bar4_window_quirk(VFIOPCIDevice *vdev, int nr)
memory_region_init_io(&quirk->mem, OBJECT(vdev),
&vfio_generic_window_quirk, quirk,
"vfio-ati-bar4-window-quirk", 8);
- memory_region_add_subregion_overlap(&vdev->bars[nr].mem,
+ memory_region_add_subregion_overlap(&vdev->bars[nr].region.mem,
quirk->data.base_offset, &quirk->mem, 1);
QLIST_INSERT_HEAD(&vdev->bars[nr].quirks, quirk, next);
@@ -1837,7 +1835,8 @@ static uint64_t vfio_rtl8168_window_quirk_read(void *opaque,
vdev->host.domain, vdev->host.bus,
vdev->host.slot, vdev->host.function);
- return vfio_bar_read(&vdev->bars[quirk->data.bar], addr + 0x70, size);
+ return vfio_region_read(&vdev->bars[quirk->data.bar].region,
+ addr + 0x70, size);
}
static void vfio_rtl8168_window_quirk_write(void *opaque, hwaddr addr,
@@ -1879,7 +1878,8 @@ static void vfio_rtl8168_window_quirk_write(void *opaque, hwaddr addr,
vdev->host.domain, vdev->host.bus,
vdev->host.slot, vdev->host.function);
- vfio_bar_write(&vdev->bars[quirk->data.bar], addr + 0x70, data, size);
+ vfio_region_write(&vdev->bars[quirk->data.bar].region,
+ addr + 0x70, data, size);
}
static const MemoryRegionOps vfio_rtl8168_window_quirk = {
@@ -1909,7 +1909,7 @@ static void vfio_probe_rtl8168_bar2_window_quirk(VFIOPCIDevice *vdev, int nr)
memory_region_init_io(&quirk->mem, OBJECT(vdev), &vfio_rtl8168_window_quirk,
quirk, "vfio-rtl8168-window-quirk", 8);
- memory_region_add_subregion_overlap(&vdev->bars[nr].mem,
+ memory_region_add_subregion_overlap(&vdev->bars[nr].region.mem,
0x70, &quirk->mem, 1);
QLIST_INSERT_HEAD(&vdev->bars[nr].quirks, quirk, next);
@@ -1943,7 +1943,7 @@ static void vfio_probe_ati_bar2_4000_quirk(VFIOPCIDevice *vdev, int nr)
memory_region_init_io(&quirk->mem, OBJECT(vdev), &vfio_generic_quirk, quirk,
"vfio-ati-bar2-4000-quirk",
TARGET_PAGE_ALIGN(quirk->data.address_mask + 1));
- memory_region_add_subregion_overlap(&vdev->bars[nr].mem,
+ memory_region_add_subregion_overlap(&vdev->bars[nr].region.mem,
quirk->data.address_match & TARGET_PAGE_MASK,
&quirk->mem, 1);
@@ -2063,7 +2063,7 @@ static void vfio_vga_probe_nvidia_3d0_quirk(VFIOPCIDevice *vdev)
VFIOQuirk *quirk;
if (pci_get_word(pdev->config + PCI_VENDOR_ID) != PCI_VENDOR_ID_NVIDIA ||
- !vdev->bars[1].size) {
+ !vdev->bars[1].region.size) {
return;
}
@@ -2172,7 +2172,8 @@ static void vfio_probe_nvidia_bar5_window_quirk(VFIOPCIDevice *vdev, int nr)
memory_region_init_io(&quirk->mem, OBJECT(vdev),
&vfio_nvidia_bar5_window_quirk, quirk,
"vfio-nvidia-bar5-window-quirk", 16);
- memory_region_add_subregion_overlap(&vdev->bars[nr].mem, 0, &quirk->mem, 1);
+ memory_region_add_subregion_overlap(&vdev->bars[nr].region.mem,
+ 0, &quirk->mem, 1);
QLIST_INSERT_HEAD(&vdev->bars[nr].quirks, quirk, next);
@@ -2200,7 +2201,8 @@ static void vfio_nvidia_88000_quirk_write(void *opaque, hwaddr addr,
*/
if ((pdev->cap_present & QEMU_PCI_CAP_MSI) &&
vfio_range_contained(addr, size, pdev->msi_cap, PCI_MSI_FLAGS)) {
- vfio_bar_write(&vdev->bars[quirk->data.bar], addr + base, data, size);
+ vfio_region_write(&vdev->bars[quirk->data.bar].region,
+ addr + base, data, size);
}
}
@@ -2243,7 +2245,7 @@ static void vfio_probe_nvidia_bar0_88000_quirk(VFIOPCIDevice *vdev, int nr)
memory_region_init_io(&quirk->mem, OBJECT(vdev), &vfio_nvidia_88000_quirk,
quirk, "vfio-nvidia-bar0-88000-quirk",
TARGET_PAGE_ALIGN(quirk->data.address_mask + 1));
- memory_region_add_subregion_overlap(&vdev->bars[nr].mem,
+ memory_region_add_subregion_overlap(&vdev->bars[nr].region.mem,
quirk->data.address_match & TARGET_PAGE_MASK,
&quirk->mem, 1);
@@ -2270,7 +2272,8 @@ static void vfio_probe_nvidia_bar0_1800_quirk(VFIOPCIDevice *vdev, int nr)
/* Log the chipset ID */
trace_vfio_probe_nvidia_bar0_1800_quirk_id(
- (unsigned int)(vfio_bar_read(&vdev->bars[0], 0, 4) >> 20) & 0xff);
+ (unsigned int)(vfio_region_read(&vdev->bars[0].region, 0, 4) >> 20)
+ & 0xff);
quirk = g_malloc0(sizeof(*quirk));
quirk->vdev = vdev;
@@ -2282,7 +2285,7 @@ static void vfio_probe_nvidia_bar0_1800_quirk(VFIOPCIDevice *vdev, int nr)
memory_region_init_io(&quirk->mem, OBJECT(vdev), &vfio_generic_quirk, quirk,
"vfio-nvidia-bar0-1800-quirk",
TARGET_PAGE_ALIGN(quirk->data.address_mask + 1));
- memory_region_add_subregion_overlap(&vdev->bars[nr].mem,
+ memory_region_add_subregion_overlap(&vdev->bars[nr].region.mem,
quirk->data.address_match & TARGET_PAGE_MASK,
&quirk->mem, 1);
@@ -2340,7 +2343,7 @@ static void vfio_bar_quirk_teardown(VFIOPCIDevice *vdev, int nr)
while (!QLIST_EMPTY(&bar->quirks)) {
VFIOQuirk *quirk = QLIST_FIRST(&bar->quirks);
- memory_region_del_subregion(&bar->mem, &quirk->mem);
+ memory_region_del_subregion(&bar->region.mem, &quirk->mem);
object_unparent(OBJECT(&quirk->mem));
QLIST_REMOVE(quirk, next);
g_free(quirk);
@@ -2851,9 +2854,9 @@ static int vfio_setup_msix(VFIOPCIDevice *vdev, int pos)
int ret;
ret = msix_init(&vdev->pdev, vdev->msix->entries,
- &vdev->bars[vdev->msix->table_bar].mem,
+ &vdev->bars[vdev->msix->table_bar].region.mem,
vdev->msix->table_bar, vdev->msix->table_offset,
- &vdev->bars[vdev->msix->pba_bar].mem,
+ &vdev->bars[vdev->msix->pba_bar].region.mem,
vdev->msix->pba_bar, vdev->msix->pba_offset, pos);
if (ret < 0) {
if (ret == -ENOTSUP) {
@@ -2871,8 +2874,9 @@ static void vfio_teardown_msi(VFIOPCIDevice *vdev)
msi_uninit(&vdev->pdev);
if (vdev->msix) {
- msix_uninit(&vdev->pdev, &vdev->bars[vdev->msix->table_bar].mem,
- &vdev->bars[vdev->msix->pba_bar].mem);
+ msix_uninit(&vdev->pdev,
+ &vdev->bars[vdev->msix->table_bar].region.mem,
+ &vdev->bars[vdev->msix->pba_bar].region.mem);
}
}
@@ -2886,11 +2890,11 @@ static void vfio_mmap_set_enabled(VFIOPCIDevice *vdev, bool enabled)
for (i = 0; i < PCI_ROM_SLOT; i++) {
VFIOBAR *bar = &vdev->bars[i];
- if (!bar->size) {
+ if (!bar->region.size) {
continue;
}
- memory_region_set_enabled(&bar->mmap_mem, enabled);
+ memory_region_set_enabled(&bar->region.mmap_mem, enabled);
if (vdev->msix && vdev->msix->table_bar == i) {
memory_region_set_enabled(&vdev->msix->mmap_mem, enabled);
}
@@ -2901,53 +2905,55 @@ static void vfio_unmap_bar(VFIOPCIDevice *vdev, int nr)
{
VFIOBAR *bar = &vdev->bars[nr];
- if (!bar->size) {
+ if (!bar->region.size) {
return;
}
vfio_bar_quirk_teardown(vdev, nr);
- memory_region_del_subregion(&bar->mem, &bar->mmap_mem);
- munmap(bar->mmap, memory_region_size(&bar->mmap_mem));
+ memory_region_del_subregion(&bar->region.mem, &bar->region.mmap_mem);
+ munmap(bar->region.mmap, memory_region_size(&bar->region.mmap_mem));
if (vdev->msix && vdev->msix->table_bar == nr) {
- memory_region_del_subregion(&bar->mem, &vdev->msix->mmap_mem);
+ memory_region_del_subregion(&bar->region.mem, &vdev->msix->mmap_mem);
munmap(vdev->msix->mmap, memory_region_size(&vdev->msix->mmap_mem));
}
}
-static int vfio_mmap_bar(VFIOPCIDevice *vdev, VFIOBAR *bar,
- MemoryRegion *mem, MemoryRegion *submem,
- void **map, size_t size, off_t offset,
- const char *name)
+static int vfio_mmap_region(Object *obj, VFIORegion *region,
+ MemoryRegion *mem, MemoryRegion *submem,
+ void **map, size_t size, off_t offset,
+ const char *name)
{
int ret = 0;
+ VFIODevice *vbasedev = region->vbasedev;
- if (VFIO_ALLOW_MMAP && size && bar->flags & VFIO_REGION_INFO_FLAG_MMAP) {
+ if (VFIO_ALLOW_MMAP && size && region->flags &
+ VFIO_REGION_INFO_FLAG_MMAP) {
int prot = 0;
- if (bar->flags & VFIO_REGION_INFO_FLAG_READ) {
+ if (region->flags & VFIO_REGION_INFO_FLAG_READ) {
prot |= PROT_READ;
}
- if (bar->flags & VFIO_REGION_INFO_FLAG_WRITE) {
+ if (region->flags & VFIO_REGION_INFO_FLAG_WRITE) {
prot |= PROT_WRITE;
}
*map = mmap(NULL, size, prot, MAP_SHARED,
- bar->fd, bar->fd_offset + offset);
+ vbasedev->fd, region->fd_offset + offset);
if (*map == MAP_FAILED) {
*map = NULL;
ret = -errno;
goto empty_region;
}
- memory_region_init_ram_ptr(submem, OBJECT(vdev), name, size, *map);
+ memory_region_init_ram_ptr(submem, obj, name, size, *map);
memory_region_set_skip_dump(submem);
} else {
empty_region:
/* Create a zero sized sub-region to make cleanup easy. */
- memory_region_init(submem, OBJECT(vdev), name, 0);
+ memory_region_init(submem, obj, name, 0);
}
memory_region_add_subregion(mem, offset, submem);
@@ -2958,7 +2964,7 @@ empty_region:
static void vfio_map_bar(VFIOPCIDevice *vdev, int nr)
{
VFIOBAR *bar = &vdev->bars[nr];
- unsigned size = bar->size;
+ unsigned size = bar->region.size;
char name[64];
uint32_t pci_bar;
uint8_t type;
@@ -2988,9 +2994,9 @@ static void vfio_map_bar(VFIOPCIDevice *vdev, int nr)
~PCI_BASE_ADDRESS_MEM_MASK);
/* A "slow" read/write mapping underlies all BARs */
- memory_region_init_io(&bar->mem, OBJECT(vdev), &vfio_bar_ops,
+ memory_region_init_io(&bar->region.mem, OBJECT(vdev), &vfio_region_ops,
bar, name, size);
- pci_register_bar(&vdev->pdev, nr, type, &bar->mem);
+ pci_register_bar(&vdev->pdev, nr, type, &bar->region.mem);
/*
* We can't mmap areas overlapping the MSIX vector table, so we
@@ -3001,8 +3007,9 @@ static void vfio_map_bar(VFIOPCIDevice *vdev, int nr)
}
strncat(name, " mmap", sizeof(name) - strlen(name) - 1);
- if (vfio_mmap_bar(vdev, bar, &bar->mem,
- &bar->mmap_mem, &bar->mmap, size, 0, name)) {
+ if (vfio_mmap_region(OBJECT(vdev), &bar->region, &bar->region.mem,
+ &bar->region.mmap_mem, &bar->region.mmap,
+ size, 0, name)) {
error_report("%s unsupported. Performance may be slow", name);
}
@@ -3012,10 +3019,11 @@ static void vfio_map_bar(VFIOPCIDevice *vdev, int nr)
start = HOST_PAGE_ALIGN(vdev->msix->table_offset +
(vdev->msix->entries * PCI_MSIX_ENTRY_SIZE));
- size = start < bar->size ? bar->size - start : 0;
+ size = start < bar->region.size ? bar->region.size - start : 0;
strncat(name, " msix-hi", sizeof(name) - strlen(name) - 1);
/* VFIOMSIXInfo contains another MemoryRegion for this mapping */
- if (vfio_mmap_bar(vdev, bar, &bar->mem, &vdev->msix->mmap_mem,
+ if (vfio_mmap_region(OBJECT(vdev), &bar->region, &bar->region.mem,
+ &vdev->msix->mmap_mem,
&vdev->msix->mmap, size, start, name)) {
error_report("%s unsupported. Performance may be slow", name);
}
@@ -3601,6 +3609,7 @@ static void vfio_pci_compute_needs_reset(VFIODevice *vbasedev)
static VFIODeviceOps vfio_pci_ops = {
.vfio_compute_needs_reset = vfio_pci_compute_needs_reset,
.vfio_hot_reset_multi = vfio_pci_hot_reset_multi,
+ .vfio_eoi = vfio_eoi,
};
static void vfio_reset_handler(void *opaque)
@@ -4005,11 +4014,11 @@ static int vfio_get_device(VFIOGroup *group, const char *name,
(unsigned long)reg_info.offset,
(unsigned long)reg_info.flags);
- vdev->bars[i].flags = reg_info.flags;
- vdev->bars[i].size = reg_info.size;
- vdev->bars[i].fd_offset = reg_info.offset;
- vdev->bars[i].fd = vdev->vbasedev.fd;
- vdev->bars[i].nr = i;
+ vdev->bars[i].region.vbasedev = &vdev->vbasedev;
+ vdev->bars[i].region.flags = reg_info.flags;
+ vdev->bars[i].region.size = reg_info.size;
+ vdev->bars[i].region.fd_offset = reg_info.offset;
+ vdev->bars[i].region.nr = i;
QLIST_INIT(&vdev->bars[i].quirks);
}
diff --git a/trace-events b/trace-events
index cfe2db4..7a93178 100644
--- a/trace-events
+++ b/trace-events
@@ -1412,8 +1412,8 @@ vfio_pci_reset(int domain, int bus, int slot, int fn) " (%04x:%02x:%02x.%x)"
vfio_pci_reset_flr(int domain, int bus, int slot, int fn) "%04x:%02x:%02x.%x FLR/VFIO_DEVICE_RESET"
vfio_pci_reset_pm(int domain, int bus, int slot, int fn) "%04x:%02x:%02x.%x PCI PM Reset"
-vfio_bar_write(int domain, int bus, int slot, int fn, int index, uint64_t addr, uint64_t data, unsigned size) " (%04x:%02x:%02x.%x:region%d+0x%"PRIx64", 0x%"PRIx64 ", %d)"
-vfio_bar_read(int domain, int bus, int slot, int fn, int index, uint64_t addr, unsigned size, uint64_t data) " (%04x:%02x:%02x.%x:region%d+0x%"PRIx64", %d) = 0x%"PRIx64
+vfio_region_write(const char *name, int index, uint64_t addr, uint64_t data, unsigned size) " (%s:region%d+0x%"PRIx64", 0x%"PRIx64 ", %d)"
+vfio_region_read(const char *name, int index, uint64_t addr, unsigned size, uint64_t data) " (%s:region%d+0x%"PRIx64", %d) = 0x%"PRIx64
vfio_iommu_map_notify(uint64_t iova_start, uint64_t iova_end) "iommu map @ %"PRIx64" - %"PRIx64
vfio_listener_region_add_skip(uint64_t start, uint64_t end) "SKIPPING region_add %"PRIx64" - %"PRIx64
vfio_listener_region_add_iommu(uint64_t start, uint64_t end) "region_add [iommu] %"PRIx64" - %"PRIx64
--
1.8.3.2
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [Qemu-devel] [PATCH v8 08/19] hw/vfio/pci: split vfio_get_device
2014-11-30 18:35 [Qemu-devel] [PATCH v8 00/19] KVM platform device passthrough Eric Auger
` (6 preceding siblings ...)
2014-11-30 18:35 ` [Qemu-devel] [PATCH v8 07/19] hw/vfio/pci: Introduce VFIORegion Eric Auger
@ 2014-11-30 18:35 ` Eric Auger
2014-11-30 18:35 ` [Qemu-devel] [PATCH v8 09/19] hw/vfio/pci: rename group_list into vfio_group_list Eric Auger
` (11 subsequent siblings)
19 siblings, 0 replies; 22+ messages in thread
From: Eric Auger @ 2014-11-30 18:35 UTC (permalink / raw)
To: eric.auger, christoffer.dall, qemu-devel, agraf, pbonzini,
kim.phillips, a.rigo, manish.jaggi, joel.schopp, zhaoshenglong,
ard.biesheuvel
Cc: peter.maydell, patches, eric.auger, will.deacon, stuart.yoder,
Bharat.Bhushan, alex.williamson, a.motakis, kvmarm
vfio_get_device now takes a VFIODevice as argument. The function is split
into 2 parts: vfio_get_device which is generic and vfio_populate_device
which is bus specific.
3 new fields are introduced in VFIODevice to store dev_info.
vfio_put_base_device is created.
---
v5->v6:
- simplifies the split for vfio_get_device:
vfio_check_device, vfio_populate_regions, vfio_populate_interrupts
are now gathered into a unique specialization function dubbed
vfio_populate_device
v4->v5:
- cleanup up of error handling and get/put operations in
vfio_check_device, vfio_populate_regions, vfio_populate_interrupts and
vfio_get_device.
- correct misuse of errno
- vfio_populate_regions always returns 0
- VFIODevice .name deallocation done in vfio_put_device instead of
vfio_put_base_device
- vfio_put_base_device done at vfio_get_device level.
Signed-off-by: Eric Auger <eric.auger@linaro.org>
---
hw/vfio/pci.c | 130 +++++++++++++++++++++++++++++++++++-----------------------
trace-events | 10 ++---
2 files changed, 83 insertions(+), 57 deletions(-)
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 10c1697..60ff22b 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -204,12 +204,16 @@ typedef struct VFIODevice {
bool reset_works;
bool needs_reset;
VFIODeviceOps *ops;
+ unsigned int num_irqs;
+ unsigned int num_regions;
+ unsigned int flags;
} VFIODevice;
struct VFIODeviceOps {
void (*vfio_compute_needs_reset)(VFIODevice *vdev);
int (*vfio_hot_reset_multi)(VFIODevice *vdev);
void (*vfio_eoi)(VFIODevice *vdev);
+ int (*vfio_populate_device)(VFIODevice *vdev);
};
typedef struct VFIOPCIDevice {
@@ -296,6 +300,8 @@ static uint32_t vfio_pci_read_config(PCIDevice *pdev, uint32_t addr, int len);
static void vfio_pci_write_config(PCIDevice *pdev, uint32_t addr,
uint32_t val, int len);
static void vfio_mmap_set_enabled(VFIOPCIDevice *vdev, bool enabled);
+static void vfio_put_base_device(VFIODevice *vbasedev);
+static int vfio_populate_device(VFIODevice *vbasedev);
/*
* Common VFIO interrupt disable
@@ -3610,6 +3616,7 @@ static VFIODeviceOps vfio_pci_ops = {
.vfio_compute_needs_reset = vfio_pci_compute_needs_reset,
.vfio_hot_reset_multi = vfio_pci_hot_reset_multi,
.vfio_eoi = vfio_eoi,
+ .vfio_populate_device = vfio_populate_device,
};
static void vfio_reset_handler(void *opaque)
@@ -3951,70 +3958,45 @@ static void vfio_put_group(VFIOGroup *group)
}
}
-static int vfio_get_device(VFIOGroup *group, const char *name,
- VFIOPCIDevice *vdev)
+static int vfio_populate_device(VFIODevice *vbasedev)
{
- struct vfio_device_info dev_info = { .argsz = sizeof(dev_info) };
+ VFIOPCIDevice *vdev = container_of(vbasedev, VFIOPCIDevice, vbasedev);
struct vfio_region_info reg_info = { .argsz = sizeof(reg_info) };
struct vfio_irq_info irq_info = { .argsz = sizeof(irq_info) };
- int ret, i;
-
- ret = ioctl(group->fd, VFIO_GROUP_GET_DEVICE_FD, name);
- if (ret < 0) {
- error_report("vfio: error getting device %s from group %d: %m",
- name, group->groupid);
- error_printf("Verify all devices in group %d are bound to vfio-pci "
- "or pci-stub and not already in use\n", group->groupid);
- return ret;
- }
-
- vdev->vbasedev.fd = ret;
- vdev->vbasedev.group = group;
- QLIST_INSERT_HEAD(&group->device_list, &vdev->vbasedev, next);
+ int i, ret = -1;
/* Sanity check device */
- ret = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_GET_INFO, &dev_info);
- if (ret) {
- error_report("vfio: error getting device info: %m");
- goto error;
- }
-
- trace_vfio_get_device_irq(name, dev_info.flags,
- dev_info.num_regions, dev_info.num_irqs);
-
- if (!(dev_info.flags & VFIO_DEVICE_FLAGS_PCI)) {
+ if (!(vbasedev->flags & VFIO_DEVICE_FLAGS_PCI)) {
error_report("vfio: Um, this isn't a PCI device");
goto error;
}
- vdev->vbasedev.reset_works = !!(dev_info.flags & VFIO_DEVICE_FLAGS_RESET);
-
- if (dev_info.num_regions < VFIO_PCI_CONFIG_REGION_INDEX + 1) {
+ if (vbasedev->num_regions < VFIO_PCI_CONFIG_REGION_INDEX + 1) {
error_report("vfio: unexpected number of io regions %u",
- dev_info.num_regions);
+ vbasedev->num_regions);
goto error;
}
- if (dev_info.num_irqs < VFIO_PCI_MSIX_IRQ_INDEX + 1) {
- error_report("vfio: unexpected number of irqs %u", dev_info.num_irqs);
+ if (vbasedev->num_irqs < VFIO_PCI_MSIX_IRQ_INDEX + 1) {
+ error_report("vfio: unexpected number of irqs %u", vbasedev->num_irqs);
goto error;
}
for (i = VFIO_PCI_BAR0_REGION_INDEX; i < VFIO_PCI_ROM_REGION_INDEX; i++) {
reg_info.index = i;
- ret = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_GET_REGION_INFO, ®_info);
+ ret = ioctl(vbasedev->fd, VFIO_DEVICE_GET_REGION_INFO, ®_info);
if (ret) {
error_report("vfio: Error getting region %d info: %m", i);
goto error;
}
- trace_vfio_get_device_region(name, i,
- (unsigned long)reg_info.size,
- (unsigned long)reg_info.offset,
- (unsigned long)reg_info.flags);
+ trace_vfio_populate_device_region(vbasedev->name, i,
+ (unsigned long)reg_info.size,
+ (unsigned long)reg_info.offset,
+ (unsigned long)reg_info.flags);
- vdev->bars[i].region.vbasedev = &vdev->vbasedev;
+ vdev->bars[i].region.vbasedev = vbasedev;
vdev->bars[i].region.flags = reg_info.flags;
vdev->bars[i].region.size = reg_info.size;
vdev->bars[i].region.fd_offset = reg_info.offset;
@@ -4030,9 +4012,10 @@ static int vfio_get_device(VFIOGroup *group, const char *name,
goto error;
}
- trace_vfio_get_device_config(name, (unsigned long)reg_info.size,
- (unsigned long)reg_info.offset,
- (unsigned long)reg_info.flags);
+ trace_vfio_populate_device_config(vdev->vbasedev.name,
+ (unsigned long)reg_info.size,
+ (unsigned long)reg_info.offset,
+ (unsigned long)reg_info.flags);
vdev->config_size = reg_info.size;
if (vdev->config_size == PCI_CONFIG_SPACE_SIZE) {
@@ -4041,7 +4024,7 @@ static int vfio_get_device(VFIOGroup *group, const char *name,
vdev->config_offset = reg_info.offset;
if ((vdev->features & VFIO_FEATURE_ENABLE_VGA) &&
- dev_info.num_regions > VFIO_PCI_VGA_REGION_INDEX) {
+ vbasedev->num_regions > VFIO_PCI_VGA_REGION_INDEX) {
struct vfio_region_info vga_info = {
.argsz = sizeof(vga_info),
.index = VFIO_PCI_VGA_REGION_INDEX,
@@ -4085,7 +4068,7 @@ static int vfio_get_device(VFIOGroup *group, const char *name,
ret = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_GET_IRQ_INFO, &irq_info);
if (ret) {
/* This can fail for an old kernel or legacy PCI dev */
- trace_vfio_get_device_get_irq_info_failure();
+ trace_vfio_populate_device_get_irq_info_failure();
ret = 0;
} else if (irq_info.count == 1) {
vdev->pci_aer = true;
@@ -4097,25 +4080,68 @@ static int vfio_get_device(VFIOGroup *group, const char *name,
}
error:
+ return ret;
+}
+
+static int vfio_get_device(VFIOGroup *group, const char *name,
+ VFIODevice *vbasedev)
+{
+ struct vfio_device_info dev_info = { .argsz = sizeof(dev_info) };
+ int ret;
+
+ ret = ioctl(group->fd, VFIO_GROUP_GET_DEVICE_FD, name);
+ if (ret < 0) {
+ error_report("vfio: error getting device %s from group %d: %m",
+ name, group->groupid);
+ error_printf("Verify all devices in group %d are bound to vfio-<bus> "
+ "or pci-stub and not already in use\n", group->groupid);
+ return ret;
+ }
+
+ vbasedev->fd = ret;
+ vbasedev->group = group;
+ QLIST_INSERT_HEAD(&group->device_list, vbasedev, next);
+
+ ret = ioctl(vbasedev->fd, VFIO_DEVICE_GET_INFO, &dev_info);
+ if (ret) {
+ error_report("vfio: error getting device info: %m");
+ goto error;
+ }
+
+ vbasedev->num_irqs = dev_info.num_irqs;
+ vbasedev->num_regions = dev_info.num_regions;
+ vbasedev->flags = dev_info.flags;
+
+ trace_vfio_get_device(name, dev_info.flags,
+ dev_info.num_regions, dev_info.num_irqs);
+
+ vbasedev->reset_works = !!(dev_info.flags & VFIO_DEVICE_FLAGS_RESET);
+
+ ret = vbasedev->ops->vfio_populate_device(vbasedev);
+
+error:
if (ret) {
- QLIST_REMOVE(&vdev->vbasedev, next);
- vdev->vbasedev.group = NULL;
- close(vdev->vbasedev.fd);
+ vfio_put_base_device(vbasedev);
}
return ret;
}
+void vfio_put_base_device(VFIODevice *vbasedev)
+{
+ QLIST_REMOVE(vbasedev, next);
+ vbasedev->group = NULL;
+ trace_vfio_put_base_device(vbasedev->fd);
+ close(vbasedev->fd);
+}
+
static void vfio_put_device(VFIOPCIDevice *vdev)
{
- QLIST_REMOVE(&vdev->vbasedev, next);
- vdev->vbasedev.group = NULL;
- trace_vfio_put_device(vdev->vbasedev.fd);
- close(vdev->vbasedev.fd);
g_free(vdev->vbasedev.name);
if (vdev->msix) {
g_free(vdev->msix);
vdev->msix = NULL;
}
+ vfio_put_base_device(&vdev->vbasedev);
}
static void vfio_err_notifier_handler(void *opaque)
@@ -4288,7 +4314,7 @@ static int vfio_initfn(PCIDevice *pdev)
}
}
- ret = vfio_get_device(group, path, vdev);
+ ret = vfio_get_device(group, path, &vdev->vbasedev);
if (ret) {
error_report("vfio: failed to get device %s", path);
vfio_put_group(group);
diff --git a/trace-events b/trace-events
index 7a93178..55a559b 100644
--- a/trace-events
+++ b/trace-events
@@ -1403,10 +1403,10 @@ vfio_pci_hot_reset(int domain, int bus, int slot, int fn, const char *type) " (%
vfio_pci_hot_reset_has_dep_devices(int domain, int bus, int slot, int fn) "%04x:%02x:%02x.%x: hot reset dependent devices:"
vfio_pci_hot_reset_dep_devices(int domain, int bus, int slot, int function, int group_id) "\t%04x:%02x:%02x.%x group %d"
vfio_pci_hot_reset_result(int domain, int bus, int slot, int fn, const char *result) "%04x:%02x:%02x.%x hot reset: %s"
-vfio_get_device_region(const char *region_name, int index, unsigned long size, unsigned long offset, unsigned long flags) "Device %s region %d:\n size: 0x%lx, offset: 0x%lx, flags: 0x%lx"
-vfio_get_device_config(const char *name, unsigned long size, unsigned long offset, unsigned long flags) "Device %s config:\n size: 0x%lx, offset: 0x%lx, flags: 0x%lx"
-vfio_get_device_get_irq_info_failure(void) "VFIO_DEVICE_GET_IRQ_INFO failure: %m"
-vfio_get_device_irq(const char *name, unsigned flags, unsigned num_regions, unsigned num_irqs) "Device %s flags: %u, regions: %u, irgs: %u"
+vfio_populate_device_region(const char *region_name, int index, unsigned long size, unsigned long offset, unsigned long flags) "Device %s region %d:\n size: 0x%lx, offset: 0x%lx, flags: 0x%lx"
+vfio_populate_device_config(const char *name, unsigned long size, unsigned long offset, unsigned long flags) "Device %s config:\n size: 0x%lx, offset: 0x%lx, flags: 0x%lx"
+vfio_populate_device_get_irq_info_failure(void) "VFIO_DEVICE_GET_IRQ_INFO failure: %m"
+vfio_get_device(const char *name, unsigned flags, unsigned num_regions, unsigned num_irqs) "Device %s flags: %u, regions: %u, irgs: %u"
vfio_initfn(int domain, int bus, int slot, int fn, int group_id) " (%04x:%02x:%02x.%x) group %d"
vfio_pci_reset(int domain, int bus, int slot, int fn) " (%04x:%02x:%02x.%x)"
vfio_pci_reset_flr(int domain, int bus, int slot, int fn) "%04x:%02x:%02x.%x FLR/VFIO_DEVICE_RESET"
@@ -1422,7 +1422,7 @@ vfio_listener_region_del_skip(uint64_t start, uint64_t end) "SKIPPING region_del
vfio_listener_region_del(uint64_t start, uint64_t end) "region_del %"PRIx64" - %"PRIx64
vfio_disconnect_container(int fd) "close container->fd=%d"
vfio_put_group(int fd) "close group->fd=%d"
-vfio_put_device(int fd) "close vdev->fd=%d"
+vfio_put_base_device(int fd) "close vdev->fd=%d"
#hw/acpi/memory_hotplug.c
mhp_acpi_invalid_slot_selected(uint32_t slot) "0x%"PRIx32
--
1.8.3.2
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [Qemu-devel] [PATCH v8 09/19] hw/vfio/pci: rename group_list into vfio_group_list
2014-11-30 18:35 [Qemu-devel] [PATCH v8 00/19] KVM platform device passthrough Eric Auger
` (7 preceding siblings ...)
2014-11-30 18:35 ` [Qemu-devel] [PATCH v8 08/19] hw/vfio/pci: split vfio_get_device Eric Auger
@ 2014-11-30 18:35 ` Eric Auger
2014-11-30 18:35 ` [Qemu-devel] [PATCH v8 10/19] hw/vfio/pci: use name field in format strings Eric Auger
` (10 subsequent siblings)
19 siblings, 0 replies; 22+ messages in thread
From: Eric Auger @ 2014-11-30 18:35 UTC (permalink / raw)
To: eric.auger, christoffer.dall, qemu-devel, agraf, pbonzini,
kim.phillips, a.rigo, manish.jaggi, joel.schopp, zhaoshenglong,
ard.biesheuvel
Cc: peter.maydell, patches, eric.auger, will.deacon, stuart.yoder,
Bharat.Bhushan, alex.williamson, a.motakis, kvmarm
better fit in the rest of the namespace
Signed-off-by: Eric Auger <eric.auger@linaro.org>
---
hw/vfio/pci.c | 22 +++++++++++-----------
1 file changed, 11 insertions(+), 11 deletions(-)
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 60ff22b..d4a0e0f 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -282,7 +282,7 @@ static const VFIORomBlacklistEntry romblacklist[] = {
#define MSIX_CAP_LENGTH 12
static QLIST_HEAD(, VFIOGroup)
- group_list = QLIST_HEAD_INITIALIZER(group_list);
+ vfio_group_list = QLIST_HEAD_INITIALIZER(vfio_group_list);
#ifdef CONFIG_KVM
/*
@@ -3454,7 +3454,7 @@ static int vfio_pci_hot_reset(VFIOPCIDevice *vdev, bool single)
continue;
}
- QLIST_FOREACH(group, &group_list, next) {
+ QLIST_FOREACH(group, &vfio_group_list, next) {
if (group->groupid == devices[i].group_id) {
break;
}
@@ -3501,7 +3501,7 @@ static int vfio_pci_hot_reset(VFIOPCIDevice *vdev, bool single)
/* Determine how many group fds need to be passed */
count = 0;
- QLIST_FOREACH(group, &group_list, next) {
+ QLIST_FOREACH(group, &vfio_group_list, next) {
for (i = 0; i < info->count; i++) {
if (group->groupid == devices[i].group_id) {
count++;
@@ -3515,7 +3515,7 @@ static int vfio_pci_hot_reset(VFIOPCIDevice *vdev, bool single)
fds = &reset->group_fds[0];
/* Fill in group fds */
- QLIST_FOREACH(group, &group_list, next) {
+ QLIST_FOREACH(group, &vfio_group_list, next) {
for (i = 0; i < info->count; i++) {
if (group->groupid == devices[i].group_id) {
fds[reset->count++] = group->fd;
@@ -3550,7 +3550,7 @@ out:
continue;
}
- QLIST_FOREACH(group, &group_list, next) {
+ QLIST_FOREACH(group, &vfio_group_list, next) {
if (group->groupid == devices[i].group_id) {
break;
}
@@ -3624,13 +3624,13 @@ static void vfio_reset_handler(void *opaque)
VFIOGroup *group;
VFIODevice *vbasedev;
- QLIST_FOREACH(group, &group_list, next) {
+ QLIST_FOREACH(group, &vfio_group_list, next) {
QLIST_FOREACH(vbasedev, &group->device_list, next) {
vbasedev->ops->vfio_compute_needs_reset(vbasedev);
}
}
- QLIST_FOREACH(group, &group_list, next) {
+ QLIST_FOREACH(group, &vfio_group_list, next) {
QLIST_FOREACH(vbasedev, &group->device_list, next) {
if (vbasedev->needs_reset) {
vbasedev->ops->vfio_hot_reset_multi(vbasedev);
@@ -3879,7 +3879,7 @@ static VFIOGroup *vfio_get_group(int groupid, AddressSpace *as)
char path[32];
struct vfio_group_status status = { .argsz = sizeof(status) };
- QLIST_FOREACH(group, &group_list, next) {
+ QLIST_FOREACH(group, &vfio_group_list, next) {
if (group->groupid == groupid) {
/* Found it. Now is it already in the right context? */
if (group->container->space->as == as) {
@@ -3921,11 +3921,11 @@ static VFIOGroup *vfio_get_group(int groupid, AddressSpace *as)
goto close_fd_exit;
}
- if (QLIST_EMPTY(&group_list)) {
+ if (QLIST_EMPTY(&vfio_group_list)) {
qemu_register_reset(vfio_reset_handler, NULL);
}
- QLIST_INSERT_HEAD(&group_list, group, next);
+ QLIST_INSERT_HEAD(&vfio_group_list, group, next);
vfio_kvm_device_add_group(group);
@@ -3953,7 +3953,7 @@ static void vfio_put_group(VFIOGroup *group)
close(group->fd);
g_free(group);
- if (QLIST_EMPTY(&group_list)) {
+ if (QLIST_EMPTY(&vfio_group_list)) {
qemu_unregister_reset(vfio_reset_handler, NULL);
}
}
--
1.8.3.2
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [Qemu-devel] [PATCH v8 10/19] hw/vfio/pci: use name field in format strings
2014-11-30 18:35 [Qemu-devel] [PATCH v8 00/19] KVM platform device passthrough Eric Auger
` (8 preceding siblings ...)
2014-11-30 18:35 ` [Qemu-devel] [PATCH v8 09/19] hw/vfio/pci: rename group_list into vfio_group_list Eric Auger
@ 2014-11-30 18:35 ` Eric Auger
2014-11-30 18:35 ` [Qemu-devel] [PATCH v8 11/19] hw/vfio: create common module Eric Auger
` (9 subsequent siblings)
19 siblings, 0 replies; 22+ messages in thread
From: Eric Auger @ 2014-11-30 18:35 UTC (permalink / raw)
To: eric.auger, christoffer.dall, qemu-devel, agraf, pbonzini,
kim.phillips, a.rigo, manish.jaggi, joel.schopp, zhaoshenglong,
ard.biesheuvel
Cc: peter.maydell, patches, eric.auger, will.deacon, stuart.yoder,
Bharat.Bhushan, alex.williamson, a.motakis, kvmarm
Signed-off-by: Eric Auger <eric.auger@linaro.org>
Conflicts:
trace-events
---
hw/vfio/pci.c | 213 ++++++++++++++++------------------------------------------
trace-events | 109 ++++++++++++++++--------------
2 files changed, 116 insertions(+), 206 deletions(-)
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index d4a0e0f..6e15c8a 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -386,9 +386,7 @@ static void vfio_intx_interrupt(void *opaque)
return;
}
- trace_vfio_intx_interrupt(vdev->host.domain, vdev->host.bus,
- vdev->host.slot, vdev->host.function,
- 'A' + vdev->intx.pin);
+ trace_vfio_intx_interrupt(vdev->vbasedev.name, 'A' + vdev->intx.pin);
vdev->intx.pending = true;
pci_irq_assert(&vdev->pdev);
@@ -407,8 +405,7 @@ static void vfio_eoi(VFIODevice *vbasedev)
return;
}
- trace_vfio_eoi(vdev->host.domain, vdev->host.bus,
- vdev->host.slot, vdev->host.function);
+ trace_vfio_eoi(vbasedev->name);
vdev->intx.pending = false;
pci_irq_deassert(&vdev->pdev);
@@ -477,8 +474,7 @@ static void vfio_enable_intx_kvm(VFIOPCIDevice *vdev)
vdev->intx.kvm_accel = true;
- trace_vfio_enable_intx_kvm(vdev->host.domain, vdev->host.bus,
- vdev->host.slot, vdev->host.function);
+ trace_vfio_enable_intx_kvm(vdev->vbasedev.name);
return;
@@ -530,8 +526,7 @@ static void vfio_disable_intx_kvm(VFIOPCIDevice *vdev)
/* If we've missed an event, let it re-fire through QEMU */
vfio_unmask_single_irqindex(&vdev->vbasedev, VFIO_PCI_INTX_IRQ_INDEX);
- trace_vfio_disable_intx_kvm(vdev->host.domain, vdev->host.bus,
- vdev->host.slot, vdev->host.function);
+ trace_vfio_disable_intx_kvm(vdev->vbasedev.name);
#endif
}
@@ -550,8 +545,7 @@ static void vfio_update_irq(PCIDevice *pdev)
return; /* Nothing changed */
}
- trace_vfio_update_irq(vdev->host.domain, vdev->host.bus,
- vdev->host.slot, vdev->host.function,
+ trace_vfio_update_irq(vdev->vbasedev.name,
vdev->intx.route.irq, route.irq);
vfio_disable_intx_kvm(vdev);
@@ -627,8 +621,7 @@ static int vfio_enable_intx(VFIOPCIDevice *vdev)
vdev->interrupt = VFIO_INT_INTx;
- trace_vfio_enable_intx(vdev->host.domain, vdev->host.bus,
- vdev->host.slot, vdev->host.function);
+ trace_vfio_enable_intx(vdev->vbasedev.name);
return 0;
}
@@ -650,8 +643,7 @@ static void vfio_disable_intx(VFIOPCIDevice *vdev)
vdev->interrupt = VFIO_INT_NONE;
- trace_vfio_disable_intx(vdev->host.domain, vdev->host.bus,
- vdev->host.slot, vdev->host.function);
+ trace_vfio_disable_intx(vdev->vbasedev.name);
}
/*
@@ -678,9 +670,7 @@ static void vfio_msi_interrupt(void *opaque)
abort();
}
- trace_vfio_msi_interrupt(vdev->host.domain, vdev->host.bus,
- vdev->host.slot, vdev->host.function,
- nr, msg.address, msg.data);
+ trace_vfio_msi_interrupt(vbasedev->name, nr, msg.address, msg.data);
#endif
if (vdev->interrupt == VFIO_INT_MSIX) {
@@ -787,9 +777,7 @@ static int vfio_msix_vector_do_use(PCIDevice *pdev, unsigned int nr,
VFIOMSIVector *vector;
int ret;
- trace_vfio_msix_vector_do_use(vdev->host.domain, vdev->host.bus,
- vdev->host.slot, vdev->host.function,
- nr);
+ trace_vfio_msix_vector_do_use(vdev->vbasedev.name, nr);
vector = &vdev->msi_vectors[nr];
@@ -875,9 +863,7 @@ static void vfio_msix_vector_release(PCIDevice *pdev, unsigned int nr)
VFIOPCIDevice *vdev = DO_UPCAST(VFIOPCIDevice, pdev, pdev);
VFIOMSIVector *vector = &vdev->msi_vectors[nr];
- trace_vfio_msix_vector_release(vdev->host.domain, vdev->host.bus,
- vdev->host.slot, vdev->host.function,
- nr);
+ trace_vfio_msix_vector_release(vdev->vbasedev.name, nr);
/*
* There are still old guests that mask and unmask vectors on every
@@ -940,8 +926,7 @@ static void vfio_enable_msix(VFIOPCIDevice *vdev)
error_report("vfio: msix_set_vector_notifiers failed");
}
- trace_vfio_enable_msix(vdev->host.domain, vdev->host.bus,
- vdev->host.slot, vdev->host.function);
+ trace_vfio_enable_msix(vdev->vbasedev.name);
}
static void vfio_enable_msi(VFIOPCIDevice *vdev)
@@ -1017,9 +1002,7 @@ retry:
return;
}
- trace_vfio_enable_msi(vdev->host.domain, vdev->host.bus,
- vdev->host.slot, vdev->host.function,
- vdev->nr_vectors);
+ trace_vfio_enable_msi(vdev->vbasedev.name, vdev->nr_vectors);
}
static void vfio_disable_msi_common(VFIOPCIDevice *vdev)
@@ -1069,8 +1052,7 @@ static void vfio_disable_msix(VFIOPCIDevice *vdev)
vfio_disable_msi_common(vdev);
- trace_vfio_disable_msix(vdev->host.domain, vdev->host.bus,
- vdev->host.slot, vdev->host.function);
+ trace_vfio_disable_msix(vdev->vbasedev.name);
}
static void vfio_disable_msi(VFIOPCIDevice *vdev)
@@ -1078,8 +1060,7 @@ static void vfio_disable_msi(VFIOPCIDevice *vdev)
vfio_disable_irqindex(&vdev->vbasedev, VFIO_PCI_MSI_IRQ_INDEX);
vfio_disable_msi_common(vdev);
- trace_vfio_disable_msi(vdev->host.domain, vdev->host.bus,
- vdev->host.slot, vdev->host.function);
+ trace_vfio_disable_msi(vdev->vbasedev.name);
}
static void vfio_update_msi(VFIOPCIDevice *vdev)
@@ -1213,9 +1194,7 @@ static void vfio_pci_load_rom(VFIOPCIDevice *vdev)
return;
}
- trace_vfio_pci_load_rom(vdev->host.domain, vdev->host.bus,
- vdev->host.slot, vdev->host.function,
- (unsigned long)reg_info.size,
+ trace_vfio_pci_load_rom(vdev->vbasedev.name, (unsigned long)reg_info.size,
(unsigned long)reg_info.offset,
(unsigned long)reg_info.flags);
@@ -1225,9 +1204,7 @@ static void vfio_pci_load_rom(VFIOPCIDevice *vdev)
if (!vdev->rom_size) {
vdev->rom_read_failed = true;
error_report("vfio-pci: Cannot read device rom at "
- "%04x:%02x:%02x.%x",
- vdev->host.domain, vdev->host.bus, vdev->host.slot,
- vdev->host.function);
+ "%s", vdev->vbasedev.name);
error_printf("Device option ROM contents are probably invalid "
"(check dmesg).\nSkip option ROM probe with rombar=0, "
"or load from file with romfile=\n");
@@ -1289,9 +1266,7 @@ static uint64_t vfio_rom_read(void *opaque, hwaddr addr, unsigned size)
break;
}
- trace_vfio_rom_read(vdev->host.domain, vdev->host.bus,
- vdev->host.slot, vdev->host.function,
- addr, size, data);
+ trace_vfio_rom_read(vdev->vbasedev.name, addr, size, data);
return data;
}
@@ -1388,9 +1363,7 @@ static void vfio_pci_size_rom(VFIOPCIDevice *vdev)
}
}
- trace_vfio_pci_size_rom(vdev->host.domain, vdev->host.bus,
- vdev->host.slot, vdev->host.function,
- size);
+ trace_vfio_pci_size_rom(vdev->vbasedev.name, size);
snprintf(name, sizeof(name), "vfio[%04x:%02x:%02x.%x].rom",
vdev->host.domain, vdev->host.bus, vdev->host.slot,
@@ -1524,10 +1497,7 @@ static uint64_t vfio_generic_window_quirk_read(void *opaque,
quirk->data.address_val + offset, size);
trace_vfio_generic_window_quirk_read(memory_region_name(&quirk->mem),
- vdev->host.domain,
- vdev->host.bus,
- vdev->host.slot,
- vdev->host.function,
+ vdev->vbasedev.name,
quirk->data.bar,
addr, size, data);
} else {
@@ -1575,14 +1545,10 @@ static void vfio_generic_window_quirk_write(void *opaque, hwaddr addr,
vfio_pci_write_config(&vdev->pdev,
quirk->data.address_val + offset, data, size);
-
trace_vfio_generic_window_quirk_write(memory_region_name(&quirk->mem),
- vdev->host.domain,
- vdev->host.bus,
- vdev->host.slot,
- vdev->host.function,
- quirk->data.bar,
- addr, data, size);
+ vdev->vbasedev.name,
+ quirk->data.bar,
+ addr, data, size);
return;
}
@@ -1616,11 +1582,7 @@ static uint64_t vfio_generic_quirk_read(void *opaque,
data = vfio_pci_read_config(&vdev->pdev, addr - offset, size);
trace_vfio_generic_quirk_read(memory_region_name(&quirk->mem),
- vdev->host.domain,
- vdev->host.bus,
- vdev->host.slot,
- vdev->host.function,
- quirk->data.bar,
+ vdev->vbasedev.name, quirk->data.bar,
addr + base, size, data);
} else {
data = vfio_region_read(&vdev->bars[quirk->data.bar].region,
@@ -1649,11 +1611,7 @@ static void vfio_generic_quirk_write(void *opaque, hwaddr addr,
vfio_pci_write_config(&vdev->pdev, addr - offset, data, size);
trace_vfio_generic_quirk_write(memory_region_name(&quirk->mem),
- vdev->host.domain,
- vdev->host.bus,
- vdev->host.slot,
- vdev->host.function,
- quirk->data.bar,
+ vdev->vbasedev.name, quirk->data.bar,
addr + base, data, size);
} else {
vfio_region_write(&vdev->bars[quirk->data.bar].region,
@@ -1725,8 +1683,7 @@ static void vfio_vga_probe_ati_3c3_quirk(VFIOPCIDevice *vdev)
QLIST_INSERT_HEAD(&vdev->vga.region[QEMU_PCI_VGA_IO_HI].quirks,
quirk, next);
- trace_vfio_vga_probe_ati_3c3_quirk(vdev->host.domain, vdev->host.bus,
- vdev->host.slot, vdev->host.function);
+ trace_vfio_vga_probe_ati_3c3_quirk(vdev->vbasedev.name);
}
/*
@@ -1767,10 +1724,7 @@ static void vfio_probe_ati_bar4_window_quirk(VFIOPCIDevice *vdev, int nr)
QLIST_INSERT_HEAD(&vdev->bars[nr].quirks, quirk, next);
- trace_vfio_probe_ati_bar4_window_quirk(vdev->host.domain,
- vdev->host.bus,
- vdev->host.slot,
- vdev->host.function);
+ trace_vfio_probe_ati_bar4_window_quirk(vdev->vbasedev.name);
}
#define PCI_VENDOR_ID_REALTEK 0x10ec
@@ -1809,8 +1763,7 @@ static uint64_t vfio_rtl8168_window_quirk_read(void *opaque,
if (quirk->data.flags) {
trace_vfio_rtl8168_window_quirk_read_fake(
memory_region_name(&quirk->mem),
- vdev->host.domain, vdev->host.bus,
- vdev->host.slot, vdev->host.function);
+ vdev->vbasedev.name);
return quirk->data.address_match ^ 0x10000000U;
}
@@ -1821,9 +1774,7 @@ static uint64_t vfio_rtl8168_window_quirk_read(void *opaque,
trace_vfio_rtl8168_window_quirk_read_table(
memory_region_name(&quirk->mem),
- vdev->host.domain, vdev->host.bus,
- vdev->host.slot, vdev->host.function
- );
+ vdev->vbasedev.name);
if (!(vdev->pdev.cap_present & QEMU_PCI_CAP_MSIX)) {
return 0;
@@ -1836,10 +1787,8 @@ static uint64_t vfio_rtl8168_window_quirk_read(void *opaque,
}
}
- trace_vfio_rtl8168_window_quirk_read_direct(
- memory_region_name(&quirk->mem),
- vdev->host.domain, vdev->host.bus,
- vdev->host.slot, vdev->host.function);
+ trace_vfio_rtl8168_window_quirk_read_direct(memory_region_name(&quirk->mem),
+ vdev->vbasedev.name);
return vfio_region_read(&vdev->bars[quirk->data.bar].region,
addr + 0x70, size);
@@ -1859,8 +1808,7 @@ static void vfio_rtl8168_window_quirk_write(void *opaque, hwaddr addr,
trace_vfio_rtl8168_window_quirk_write_table(
memory_region_name(&quirk->mem),
- vdev->host.domain, vdev->host.bus,
- vdev->host.slot, vdev->host.function);
+ vdev->vbasedev.name);
io_mem_write(&vdev->pdev.msix_table_mmio,
(hwaddr)(quirk->data.address_match & 0xfff),
@@ -1881,8 +1829,7 @@ static void vfio_rtl8168_window_quirk_write(void *opaque, hwaddr addr,
trace_vfio_rtl8168_window_quirk_write_direct(
memory_region_name(&quirk->mem),
- vdev->host.domain, vdev->host.bus,
- vdev->host.slot, vdev->host.function);
+ vdev->vbasedev.name);
vfio_region_write(&vdev->bars[quirk->data.bar].region,
addr + 0x70, data, size);
@@ -1920,10 +1867,7 @@ static void vfio_probe_rtl8168_bar2_window_quirk(VFIOPCIDevice *vdev, int nr)
QLIST_INSERT_HEAD(&vdev->bars[nr].quirks, quirk, next);
- trace_vfio_probe_rtl8168_bar2_window_quirk(vdev->host.domain,
- vdev->host.bus,
- vdev->host.slot,
- vdev->host.function);
+ trace_vfio_probe_rtl8168_bar2_window_quirk(vdev->vbasedev.name);
}
/*
* Trap the BAR2 MMIO window to config space as well.
@@ -1955,10 +1899,7 @@ static void vfio_probe_ati_bar2_4000_quirk(VFIOPCIDevice *vdev, int nr)
QLIST_INSERT_HEAD(&vdev->bars[nr].quirks, quirk, next);
- trace_vfio_probe_ati_bar2_4000_quirk(vdev->host.domain,
- vdev->host.bus,
- vdev->host.slot,
- vdev->host.function);
+ trace_vfio_probe_ati_bar2_4000_quirk(vdev->vbasedev.name);
}
/*
@@ -2091,10 +2032,7 @@ static void vfio_vga_probe_nvidia_3d0_quirk(VFIOPCIDevice *vdev)
QLIST_INSERT_HEAD(&vdev->vga.region[QEMU_PCI_VGA_IO_HI].quirks,
quirk, next);
- trace_vfio_vga_probe_nvidia_3d0_quirk(vdev->host.domain,
- vdev->host.bus,
- vdev->host.slot,
- vdev->host.function);
+ trace_vfio_vga_probe_nvidia_3d0_quirk(vdev->vbasedev.name);
}
/*
@@ -2183,10 +2121,7 @@ static void vfio_probe_nvidia_bar5_window_quirk(VFIOPCIDevice *vdev, int nr)
QLIST_INSERT_HEAD(&vdev->bars[nr].quirks, quirk, next);
- trace_vfio_probe_nvidia_bar5_window_quirk(vdev->host.domain,
- vdev->host.bus,
- vdev->host.slot,
- vdev->host.function);
+ trace_vfio_probe_nvidia_bar5_window_quirk(vdev->vbasedev.name);
}
static void vfio_nvidia_88000_quirk_write(void *opaque, hwaddr addr,
@@ -2257,10 +2192,7 @@ static void vfio_probe_nvidia_bar0_88000_quirk(VFIOPCIDevice *vdev, int nr)
QLIST_INSERT_HEAD(&vdev->bars[nr].quirks, quirk, next);
- trace_vfio_probe_nvidia_bar0_88000_quirk(vdev->host.domain,
- vdev->host.bus,
- vdev->host.slot,
- vdev->host.function);
+ trace_vfio_probe_nvidia_bar0_88000_quirk(vdev->vbasedev.name);
}
/*
@@ -2297,10 +2229,7 @@ static void vfio_probe_nvidia_bar0_1800_quirk(VFIOPCIDevice *vdev, int nr)
QLIST_INSERT_HEAD(&vdev->bars[nr].quirks, quirk, next);
- trace_vfio_probe_nvidia_bar0_1800_quirk(vdev->host.domain,
- vdev->host.bus,
- vdev->host.slot,
- vdev->host.function);
+ trace_vfio_probe_nvidia_bar0_1800_quirk(vdev->vbasedev.name);
}
/*
@@ -2387,9 +2316,7 @@ static uint32_t vfio_pci_read_config(PCIDevice *pdev, uint32_t addr, int len)
val = (emu_val & emu_bits) | (phys_val & ~emu_bits);
- trace_vfio_pci_read_config(vdev->host.domain, vdev->host.bus,
- vdev->host.slot, vdev->host.function,
- addr, len, val);
+ trace_vfio_pci_read_config(vdev->vbasedev.name, addr, len, val);
return val;
}
@@ -2400,9 +2327,7 @@ static void vfio_pci_write_config(PCIDevice *pdev, uint32_t addr,
VFIOPCIDevice *vdev = DO_UPCAST(VFIOPCIDevice, pdev, pdev);
uint32_t val_le = cpu_to_le32(val);
- trace_vfio_pci_write_config(vdev->host.domain, vdev->host.bus,
- vdev->host.slot, vdev->host.function,
- addr, val, len);
+ trace_vfio_pci_write_config(vdev->vbasedev.name, addr, val, len);
/* Write everything to VFIO, let it filter out what we can't write */
if (pwrite(vdev->vbasedev.fd, &val_le, len, vdev->config_offset + addr)
@@ -2539,7 +2464,7 @@ static void vfio_iommu_map_notify(Notifier *n, void *data)
&xlat, &len, iotlb->perm & IOMMU_WO);
if (!memory_region_is_ram(mr)) {
error_report("iommu map to non memory area %"HWADDR_PRIx"\n",
- xlat);
+ xlat);
return;
}
/*
@@ -2784,8 +2709,7 @@ static int vfio_setup_msi(VFIOPCIDevice *vdev, int pos)
msi_maskbit = !!(ctrl & PCI_MSI_FLAGS_MASKBIT);
entries = 1 << ((ctrl & PCI_MSI_FLAGS_QMASK) >> 1);
- trace_vfio_setup_msi(vdev->host.domain, vdev->host.bus,
- vdev->host.slot, vdev->host.function, pos);
+ trace_vfio_setup_msi(vdev->vbasedev.name, pos);
ret = msi_init(&vdev->pdev, pos, entries, msi_64bit, msi_maskbit);
if (ret < 0) {
@@ -2846,9 +2770,8 @@ static int vfio_early_setup_msix(VFIOPCIDevice *vdev)
vdev->msix->pba_offset = pba & ~PCI_MSIX_FLAGS_BIRMASK;
vdev->msix->entries = (ctrl & PCI_MSIX_FLAGS_QSIZE) + 1;
- trace_vfio_early_setup_msix(vdev->host.domain, vdev->host.bus,
- vdev->host.slot, vdev->host.function,
- pos, vdev->msix->table_bar,
+ trace_vfio_early_setup_msix(vdev->vbasedev.name, pos,
+ vdev->msix->table_bar,
vdev->msix->table_offset,
vdev->msix->entries);
@@ -3224,8 +3147,7 @@ static void vfio_check_pcie_flr(VFIOPCIDevice *vdev, uint8_t pos)
uint32_t cap = pci_get_long(vdev->pdev.config + pos + PCI_EXP_DEVCAP);
if (cap & PCI_EXP_DEVCAP_FLR) {
- trace_vfio_check_pcie_flr(vdev->host.domain, vdev->host.bus,
- vdev->host.slot, vdev->host.function);
+ trace_vfio_check_pcie_flr(vdev->vbasedev.name);
vdev->has_flr = true;
}
}
@@ -3235,8 +3157,7 @@ static void vfio_check_pm_reset(VFIOPCIDevice *vdev, uint8_t pos)
uint16_t csr = pci_get_word(vdev->pdev.config + pos + PCI_PM_CTRL);
if (!(csr & PCI_PM_CTRL_NO_SOFT_RESET)) {
- trace_vfio_check_pm_reset(vdev->host.domain, vdev->host.bus,
- vdev->host.slot, vdev->host.function);
+ trace_vfio_check_pm_reset(vdev->vbasedev.name);
vdev->has_pm_reset = true;
}
}
@@ -3246,8 +3167,7 @@ static void vfio_check_af_flr(VFIOPCIDevice *vdev, uint8_t pos)
uint8_t cap = pci_get_byte(vdev->pdev.config + pos + PCI_AF_CAP);
if ((cap & PCI_AF_CAP_TP) && (cap & PCI_AF_CAP_FLR)) {
- trace_vfio_check_af_flr(vdev->host.domain, vdev->host.bus,
- vdev->host.slot, vdev->host.function);
+ trace_vfio_check_af_flr(vdev->vbasedev.name);
vdev->has_flr = true;
}
}
@@ -3398,9 +3318,7 @@ static int vfio_pci_hot_reset(VFIOPCIDevice *vdev, bool single)
int ret, i, count;
bool multi = false;
- trace_vfio_pci_hot_reset(vdev->host.domain, vdev->host.bus,
- vdev->host.slot, vdev->host.function,
- single ? "one" : "multi");
+ trace_vfio_pci_hot_reset(vdev->vbasedev.name, single ? "one" : "multi");
vfio_pci_pre_reset(vdev);
vdev->vbasedev.needs_reset = false;
@@ -3431,10 +3349,7 @@ static int vfio_pci_hot_reset(VFIOPCIDevice *vdev, bool single)
goto out_single;
}
- trace_vfio_pci_hot_reset_has_dep_devices(vdev->host.domain,
- vdev->host.bus,
- vdev->host.slot,
- vdev->host.function);
+ trace_vfio_pci_hot_reset_has_dep_devices(vdev->vbasedev.name);
/* Verify that we have all the groups required */
for (i = 0; i < info->count; i++) {
@@ -3462,10 +3377,9 @@ static int vfio_pci_hot_reset(VFIOPCIDevice *vdev, bool single)
if (!group) {
if (!vdev->has_pm_reset) {
- error_report("vfio: Cannot reset device %04x:%02x:%02x.%x, "
+ error_report("vfio: Cannot reset device %s, "
"depends on group %d which is not owned.",
- vdev->host.domain, vdev->host.bus, vdev->host.slot,
- vdev->host.function, devices[i].group_id);
+ vdev->vbasedev.name, devices[i].group_id);
}
ret = -EPERM;
goto out;
@@ -3480,8 +3394,7 @@ static int vfio_pci_hot_reset(VFIOPCIDevice *vdev, bool single)
if (vfio_pci_host_match(&host, &tmp->host)) {
if (single) {
error_report("vfio: found another in-use device "
- "%04x:%02x:%02x.%x\n", host.domain, host.bus,
- host.slot, host.function);
+ "%s\n", vbasedev_iter->name);
ret = -EINVAL;
goto out_single;
}
@@ -3528,10 +3441,7 @@ static int vfio_pci_hot_reset(VFIOPCIDevice *vdev, bool single)
ret = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_PCI_HOT_RESET, reset);
g_free(reset);
- trace_vfio_pci_hot_reset_result(vdev->host.domain,
- vdev->host.bus,
- vdev->host.slot,
- vdev->host.function,
+ trace_vfio_pci_hot_reset_result(vdev->vbasedev.name,
ret ? "%m" : "Success");
out:
@@ -4073,10 +3983,9 @@ static int vfio_populate_device(VFIODevice *vbasedev)
} else if (irq_info.count == 1) {
vdev->pci_aer = true;
} else {
- error_report("vfio: %04x:%02x:%02x.%x "
+ error_report("vfio: %s "
"Could not enable error recovery for the device",
- vdev->host.domain, vdev->host.bus, vdev->host.slot,
- vdev->host.function);
+ vbasedev->name);
}
error:
@@ -4293,8 +4202,7 @@ static int vfio_initfn(PCIDevice *pdev)
return -errno;
}
- trace_vfio_initfn(vdev->host.domain, vdev->host.bus,
- vdev->host.slot, vdev->host.function, groupid);
+ trace_vfio_initfn(vdev->vbasedev.name, groupid);
group = vfio_get_group(groupid, pci_device_iommu_address_space(pdev));
if (!group) {
@@ -4430,16 +4338,14 @@ static void vfio_pci_reset(DeviceState *dev)
PCIDevice *pdev = DO_UPCAST(PCIDevice, qdev, dev);
VFIOPCIDevice *vdev = DO_UPCAST(VFIOPCIDevice, pdev, pdev);
- trace_vfio_pci_reset(vdev->host.domain, vdev->host.bus,
- vdev->host.slot, vdev->host.function);
+ trace_vfio_pci_reset(vdev->vbasedev.name);
vfio_pci_pre_reset(vdev);
if (vdev->vbasedev.reset_works &&
(vdev->has_flr || !vdev->has_pm_reset) &&
!ioctl(vdev->vbasedev.fd, VFIO_DEVICE_RESET)) {
- trace_vfio_pci_reset_flr(vdev->host.domain, vdev->host.bus,
- vdev->host.slot, vdev->host.function);
+ trace_vfio_pci_reset_flr(vdev->vbasedev.name);
goto post_reset;
}
@@ -4451,8 +4357,7 @@ static void vfio_pci_reset(DeviceState *dev)
/* If nothing else works and the device supports PM reset, use it */
if (vdev->vbasedev.reset_works && vdev->has_pm_reset &&
!ioctl(vdev->vbasedev.fd, VFIO_DEVICE_RESET)) {
- trace_vfio_pci_reset_pm(vdev->host.domain, vdev->host.bus,
- vdev->host.slot, vdev->host.function);
+ trace_vfio_pci_reset_pm(vdev->vbasedev.name);
goto post_reset;
}
diff --git a/trace-events b/trace-events
index 55a559b..0e7aa53 100644
--- a/trace-events
+++ b/trace-events
@@ -1352,68 +1352,72 @@ pci_cfg_read(const char *dev, unsigned devid, unsigned fnid, unsigned offs, unsi
pci_cfg_write(const char *dev, unsigned devid, unsigned fnid, unsigned offs, unsigned val) "%s %02u:%u @0x%x <- 0x%x"
# hw/vfio/vfio-pci.c
-vfio_intx_interrupt(int domain, int bus, int slot, int fn, char line) "(%04x:%02x:%02x.%x) Pin %c"
-vfio_eoi(int domain, int bus, int slot, int fn) "(%04x:%02x:%02x.%x) EOI"
-vfio_enable_intx_kvm(int domain, int bus, int slot, int fn) "(%04x:%02x:%02x.%x) KVM INTx accel enabled"
-vfio_disable_intx_kvm(int domain, int bus, int slot, int fn) "(%04x:%02x:%02x.%x) KVM INTx accel disabled"
-vfio_update_irq(int domain, int bus, int slot, int fn, int new_irq, int target_irq) " (%04x:%02x:%02x.%x) IRQ moved %d -> %d"
-vfio_enable_intx(int domain, int bus, int slot, int fn) "(%04x:%02x:%02x.%x)"
-vfio_disable_intx(int domain, int bus, int slot, int fn) "(%04x:%02x:%02x.%x)"
-vfio_msi_interrupt(int domain, int bus, int slot, int fn, int index, uint64_t addr, int data) "(%04x:%02x:%02x.%x) vector %d 0x%"PRIx64"/0x%x"
-vfio_msix_vector_do_use(int domain, int bus, int slot, int fn, int index) "(%04x:%02x:%02x.%x) vector %d used"
-vfio_msix_vector_release(int domain, int bus, int slot, int fn, int index) "(%04x:%02x:%02x.%x) vector %d released"
-vfio_enable_msix(int domain, int bus, int slot, int fn) "(%04x:%02x:%02x.%x)"
-vfio_enable_msi(int domain, int bus, int slot, int fn, int nr_vectors) "(%04x:%02x:%02x.%x) Enabled %d MSI vectors"
-vfio_disable_msix(int domain, int bus, int slot, int fn) "(%04x:%02x:%02x.%x)"
-vfio_disable_msi(int domain, int bus, int slot, int fn) "(%04x:%02x:%02x.%x)"
-vfio_pci_load_rom(int domain, int bus, int slot, int fn, unsigned long size, unsigned long offset, unsigned long flags) "Device %04x:%02x:%02x.%x ROM:\n size: 0x%lx, offset: 0x%lx, flags: 0x%lx"
-vfio_rom_read(int domain, int bus, int slot, int fn, uint64_t addr, int size, uint64_t data) "(%04x:%02x:%02x.%x, 0x%"PRIx64", 0x%x) = 0x%"PRIx64
-vfio_pci_size_rom(int domain, int bus, int slot, int fn, int size) "%04x:%02x:%02x.%x ROM size 0x%x"
-vfio_vga_write(uint64_t addr, uint64_t data, int size) "(0x%"PRIx64", 0x%"PRIx64", %d)"
-vfio_vga_read(uint64_t addr, int size, uint64_t data) "(0x%"PRIx64", %d) = 0x%"PRIx64
-vfio_generic_window_quirk_read(const char * region_name, int domain, int bus, int slot, int fn, int index, uint64_t addr, int size, uint64_t data) "%s read(%04x:%02x:%02x.%x:BAR%d+0x%"PRIx64", %d) = 0x%"PRIx64
-vfio_generic_window_quirk_write(const char * region_name, int domain, int bus, int slot, int fn, int index, uint64_t addr, uint64_t data, int size) "%s write(%04x:%02x:%02x.%x:BAR%d+0x%"PRIx64", 0x%"PRIx64", %d)"
-vfio_generic_quirk_read(const char * region_name, int domain, int bus, int slot, int fn, int index, uint64_t addr, int size, uint64_t data) "%s read(%04x:%02x:%02x.%x:BAR%d+0x%"PRIx64", %d) = 0x%"PRIx64
-vfio_generic_quirk_write(const char * region_name, int domain, int bus, int slot, int fn, int index, uint64_t addr, uint64_t data, int size) "%s write(%04x:%02x:%02x.%x:BAR%d+0x%"PRIx64", 0x%"PRIx64", %d)"
+vfio_intx_interrupt(const char *name, char line) " (%s) Pin %c"
+vfio_eoi(const char *name) " (%s) EOI"
+vfio_enable_intx_kvm(const char *name) " (%s) KVM INTx accel enabled"
+vfio_disable_intx_kvm(const char *name) " (%s) KVM INTx accel disabled"
+vfio_update_irq(const char *name, int new_irq, int target_irq) " (%s) IRQ moved %d -> %d"
+vfio_enable_intx(const char *name) " (%s)"
+vfio_disable_intx(const char *name) " (%s)"
+vfio_msi_interrupt(const char *name, int index, uint64_t addr, int data) " (%s) vector %d 0x%"PRIx64"/0x%x"
+vfio_msix_vector_do_use(const char *name, int index) " (%s) vector %d used"
+vfio_msix_vector_release(const char *name, int index) " (%s) vector %d released"
+vfio_enable_msix(const char *name) " (%s)"
+vfio_enable_msi(const char *name, int nr_vectors) " (%s) Enabled %d MSI vectors"
+vfio_disable_msix(const char *name) " (%s)"
+vfio_disable_msi(const char *name) " (%s)"
+vfio_pci_load_rom(const char *name, unsigned long size, unsigned long offset, unsigned long flags) "Device %s ROM:\n size: 0x%lx, offset: 0x%lx, flags: 0x%lx"
+vfio_rom_read(const char *name, uint64_t addr, int size, uint64_t data) " (%s, 0x%"PRIx64", 0x%x) = 0x%"PRIx64
+vfio_pci_size_rom(const char *name, int size) "%s ROM size 0x%x"
+vfio_vga_write(uint64_t addr, uint64_t data, int size) " (0x%"PRIx64", 0x%"PRIx64", %d)"
+vfio_vga_read(uint64_t addr, int size, uint64_t data) " (0x%"PRIx64", %d) = 0x%"PRIx64
+# remove ) =
+vfio_generic_window_quirk_read(const char * region_name, const char *name, int index, uint64_t addr, int size, uint64_t data) "%s read(%s:BAR%d+0x%"PRIx64", %d = 0x%"PRIx64
+## remove )
+vfio_generic_window_quirk_write(const char * region_name, const char *name, int index, uint64_t addr, uint64_t data, int size) "%s write(%s:BAR%d+0x%"PRIx64", 0x%"PRIx64", %d"
+# remove ) =
+vfio_generic_quirk_read(const char * region_name, const char *name, int index, uint64_t addr, int size, uint64_t data) "%s read(%s:BAR%d+0x%"PRIx64", %d = 0x%"PRIx64
+# remove )
+vfio_generic_quirk_write(const char * region_name, const char *name, int index, uint64_t addr, uint64_t data, int size) "%s write(%s:BAR%d+0x%"PRIx64", 0x%"PRIx64", %d"
vfio_ati_3c3_quirk_read(uint64_t data) " (0x3c3, 1) = 0x%"PRIx64
-vfio_vga_probe_ati_3c3_quirk(int domain, int bus, int slot, int fn) "Enabled ATI/AMD quirk 0x3c3 BAR4 for device %04x:%02x:%02x.%x"
-vfio_probe_ati_bar4_window_quirk(int domain, int bus, int slot, int fn) "Enabled ATI/AMD BAR4 window quirk for device %04x:%02x:%02x.%x"
-vfio_rtl8168_window_quirk_read_fake(const char *region_name, int domain, int bus, int slot, int fn) "%s fake read(%04x:%02x:%02x.%d)"
-vfio_rtl8168_window_quirk_read_table(const char *region_name, int domain, int bus, int slot, int fn) "%s MSI-X table read(%04x:%02x:%02x.%d)"
-vfio_rtl8168_window_quirk_read_direct(const char *region_name, int domain, int bus, int slot, int fn) "%s direct read(%04x:%02x:%02x.%d)"
-vfio_rtl8168_window_quirk_write_table(const char *region_name, int domain, int bus, int slot, int fn) "%s MSI-X table write(%04x:%02x:%02x.%d)"
-vfio_rtl8168_window_quirk_write_direct(const char *region_name, int domain, int bus, int slot, int fn) "%s direct write(%04x:%02x:%02x.%d)"
-vfio_probe_rtl8168_bar2_window_quirk(int domain, int bus, int slot, int fn) "Enabled RTL8168 BAR2 window quirk for device %04x:%02x:%02x.%x"
-vfio_probe_ati_bar2_4000_quirk(int domain, int bus, int slot, int fn) "Enabled ATI/AMD BAR2 0x4000 quirk for device %04x:%02x:%02x.%x"
+vfio_vga_probe_ati_3c3_quirk(const char *name) "Enabled ATI/AMD quirk 0x3c3 BAR4for device %s"
+vfio_probe_ati_bar4_window_quirk(const char *name) "Enabled ATI/AMD BAR4 window quirk for device %s"
+#issue with )
+vfio_rtl8168_window_quirk_read_fake(const char *region_name, const char *name) "%s fake read(%s"
+vfio_rtl8168_window_quirk_read_table(const char *region_name, const char *name) "%s MSI-X table read(%s"
+vfio_rtl8168_window_quirk_read_direct(const char *region_name, const char *name) "%s direct read(%s"
+vfio_rtl8168_window_quirk_write_table(const char *region_name, const char *name) "%s MSI-X table write(%s"
+vfio_rtl8168_window_quirk_write_direct(const char *region_name, const char *name) "%s direct write(%s"
+vfio_probe_rtl8168_bar2_window_quirk(const char *name) "Enabled RTL8168 BAR2 window quirk for device %s"
+vfio_probe_ati_bar2_4000_quirk(const char *name) "Enabled ATI/AMD BAR2 0x4000 quirk for device %s"
vfio_nvidia_3d0_quirk_read(int size, uint64_t data) " (0x3d0, %d) = 0x%"PRIx64
vfio_nvidia_3d0_quirk_write(uint64_t data, int size) " (0x3d0, 0x%"PRIx64", %d)"
-vfio_vga_probe_nvidia_3d0_quirk(int domain, int bus, int slot, int fn) "Enabled NVIDIA VGA 0x3d0 quirk for device %04x:%02x:%02x.%x"
-vfio_probe_nvidia_bar5_window_quirk(int domain, int bus, int slot, int fn) "Enabled NVIDIA BAR5 window quirk for device %04x:%02x:%02x.%x"
-vfio_probe_nvidia_bar0_88000_quirk(int domain, int bus, int slot, int fn) "Enabled NVIDIA BAR0 0x88000 quirk for device %04x:%02x:%02x.%x"
+vfio_vga_probe_nvidia_3d0_quirk(const char *name) "Enabled NVIDIA VGA 0x3d0 quirk for device %s"
+vfio_probe_nvidia_bar5_window_quirk(const char *name) "Enabled NVIDIA BAR5 window quirk for device %s"
+vfio_probe_nvidia_bar0_88000_quirk(const char *name) "Enabled NVIDIA BAR0 0x88000 quirk for device %s"
vfio_probe_nvidia_bar0_1800_quirk_id(int id) "Nvidia NV%02x"
-vfio_probe_nvidia_bar0_1800_quirk(int domain, int bus, int slot, int fn) "Enabled NVIDIA BAR0 0x1800 quirk for device %04x:%02x:%02x.%x"
-vfio_pci_read_config(int domain, int bus, int slot, int fn, int addr, int len, int val) " (%04x:%02x:%02x.%x, @0x%x, len=0x%x) %x"
-vfio_pci_write_config(int domain, int bus, int slot, int fn, int addr, int val, int len) " (%04x:%02x:%02x.%x, @0x%x, 0x%x, len=0x%x)"
-vfio_setup_msi(int domain, int bus, int slot, int fn, int pos) "%04x:%02x:%02x.%x PCI MSI CAP @0x%x"
-vfio_early_setup_msix(int domain, int bus, int slot, int fn, int pos, int table_bar, int offset, int entries) "%04x:%02x:%02x.%x PCI MSI-X CAP @0x%x, BAR %d, offset 0x%x, entries %d"
-vfio_check_pcie_flr(int domain, int bus, int slot, int fn) "%04x:%02x:%02x.%x Supports FLR via PCIe cap"
-vfio_check_pm_reset(int domain, int bus, int slot, int fn) "%04x:%02x:%02x.%x Supports PM reset"
-vfio_check_af_flr(int domain, int bus, int slot, int fn) "%04x:%02x:%02x.%x Supports FLR via AF cap"
-vfio_pci_hot_reset(int domain, int bus, int slot, int fn, const char *type) " (%04x:%02x:%02x.%x) %s"
-vfio_pci_hot_reset_has_dep_devices(int domain, int bus, int slot, int fn) "%04x:%02x:%02x.%x: hot reset dependent devices:"
+vfio_probe_nvidia_bar0_1800_quirk(const char *name) "Enabled NVIDIA BAR0 0x1800 quirk for device %s"
+vfio_pci_read_config(const char *name, int addr, int len, int val) " (%s, @0x%x, len=0x%x) %x"
+vfio_pci_write_config(const char *name, int addr, int val, int len) " (%s, @0x%x, 0x%x, len=0x%x)"
+vfio_setup_msi(const char *name, int pos) "%s PCI MSI CAP @0x%x"
+vfio_early_setup_msix(const char *name, int pos, int table_bar, int offset, int entries) "%s PCI MSI-X CAP @0x%x, BAR %d, offset 0x%x, entries %d"
+vfio_check_pcie_flr(const char *name) "%s Supports FLR via PCIe cap"
+vfio_check_pm_reset(const char *name) "%s Supports PM reset"
+vfio_check_af_flr(const char *name) "%s Supports FLR via AF cap"
+vfio_pci_hot_reset(const char *name, const char *type) " (%s) %s"
+vfio_pci_hot_reset_has_dep_devices(const char *name) "%s: hot reset dependent devices:"
vfio_pci_hot_reset_dep_devices(int domain, int bus, int slot, int function, int group_id) "\t%04x:%02x:%02x.%x group %d"
-vfio_pci_hot_reset_result(int domain, int bus, int slot, int fn, const char *result) "%04x:%02x:%02x.%x hot reset: %s"
+vfio_pci_hot_reset_result(const char *name, const char *result) "%s hot reset: %s"
vfio_populate_device_region(const char *region_name, int index, unsigned long size, unsigned long offset, unsigned long flags) "Device %s region %d:\n size: 0x%lx, offset: 0x%lx, flags: 0x%lx"
vfio_populate_device_config(const char *name, unsigned long size, unsigned long offset, unsigned long flags) "Device %s config:\n size: 0x%lx, offset: 0x%lx, flags: 0x%lx"
vfio_populate_device_get_irq_info_failure(void) "VFIO_DEVICE_GET_IRQ_INFO failure: %m"
-vfio_get_device(const char *name, unsigned flags, unsigned num_regions, unsigned num_irqs) "Device %s flags: %u, regions: %u, irgs: %u"
-vfio_initfn(int domain, int bus, int slot, int fn, int group_id) " (%04x:%02x:%02x.%x) group %d"
-vfio_pci_reset(int domain, int bus, int slot, int fn) " (%04x:%02x:%02x.%x)"
-vfio_pci_reset_flr(int domain, int bus, int slot, int fn) "%04x:%02x:%02x.%x FLR/VFIO_DEVICE_RESET"
-vfio_pci_reset_pm(int domain, int bus, int slot, int fn) "%04x:%02x:%02x.%x PCI PM Reset"
+vfio_initfn(const char *name, int group_id) " (%s) group %d"
+vfio_pci_reset(const char *name) " (%s)"
+vfio_pci_reset_flr(const char *name) "%s FLR/VFIO_DEVICE_RESET"
+vfio_pci_reset_pm(const char *name) "%s PCI PM Reset"
vfio_region_write(const char *name, int index, uint64_t addr, uint64_t data, unsigned size) " (%s:region%d+0x%"PRIx64", 0x%"PRIx64 ", %d)"
-vfio_region_read(const char *name, int index, uint64_t addr, unsigned size, uint64_t data) " (%s:region%d+0x%"PRIx64", %d) = 0x%"PRIx64
+vfio_region_read(char *name, int index, uint64_t addr, unsigned size, uint64_t data) " (%s:region%d+0x%"PRIx64", %d) = 0x%"PRIx64
vfio_iommu_map_notify(uint64_t iova_start, uint64_t iova_end) "iommu map @ %"PRIx64" - %"PRIx64
vfio_listener_region_add_skip(uint64_t start, uint64_t end) "SKIPPING region_add %"PRIx64" - %"PRIx64
vfio_listener_region_add_iommu(uint64_t start, uint64_t end) "region_add [iommu] %"PRIx64" - %"PRIx64
@@ -1422,6 +1426,7 @@ vfio_listener_region_del_skip(uint64_t start, uint64_t end) "SKIPPING region_del
vfio_listener_region_del(uint64_t start, uint64_t end) "region_del %"PRIx64" - %"PRIx64
vfio_disconnect_container(int fd) "close container->fd=%d"
vfio_put_group(int fd) "close group->fd=%d"
+vfio_get_device(const char * name, unsigned int flags, unsigned int num_regions, unsigned int num_irqs) "Device %s flags: %u, regions: %u, irqs: %u"
vfio_put_base_device(int fd) "close vdev->fd=%d"
#hw/acpi/memory_hotplug.c
--
1.8.3.2
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [Qemu-devel] [PATCH v8 11/19] hw/vfio: create common module
2014-11-30 18:35 [Qemu-devel] [PATCH v8 00/19] KVM platform device passthrough Eric Auger
` (9 preceding siblings ...)
2014-11-30 18:35 ` [Qemu-devel] [PATCH v8 10/19] hw/vfio/pci: use name field in format strings Eric Auger
@ 2014-11-30 18:35 ` Eric Auger
2014-11-30 18:35 ` [Qemu-devel] [PATCH v8 12/19] hw/vfio/platform: add vfio-platform support Eric Auger
` (8 subsequent siblings)
19 siblings, 0 replies; 22+ messages in thread
From: Eric Auger @ 2014-11-30 18:35 UTC (permalink / raw)
To: eric.auger, christoffer.dall, qemu-devel, agraf, pbonzini,
kim.phillips, a.rigo, manish.jaggi, joel.schopp, zhaoshenglong,
ard.biesheuvel
Cc: peter.maydell, patches, Kim Phillips, eric.auger, will.deacon,
stuart.yoder, Bharat.Bhushan, alex.williamson, a.motakis, kvmarm
A new common module is created. It implements all functions
that have no device specificity (PCI, Platform).
This patch only consists in move (no functional changes)
Signed-off-by: Kim Phillips <kim.phillips@linaro.org>
Signed-off-by: Eric Auger <eric.auger@linaro.org>
---
v7 -> v8:
- integrate "Add skip_dump flag to ignore memory region
during dump"
- vfio_compute_needs_reset does not return bool anymore
v6 -> v7:
- integrate Revert "vfio: Make BARs native endian"
- remove VFIO_DEVICE_TYPE_PLATFORM in vfio-common.h,
will come in next patch
v5 -> v6:
- follow all evolutions of original PCI code from v5 to V6
- move declaration of vfio_region_ops, vfio_memory_listener,
vfio_group_list, vfio_address_spaces into vfio-common.h
v4 -> v5:
- integrate "sPAPR/IOMMU: Fix TCE entry permission"
- VFIOdevice .name dealloc removed from vfio_put_base_device
- add some includes according to vfio inclusion policy
v3 -> v4:
[Eric Auger]
move done after all PCI modifications to anticipate for
VFIO Platform needs. Purpose is to alleviate the whole
review process.
<= v3
First split done by Kim Phillips
Conflicts:
hw/vfio/pci.c
Conflicts:
hw/vfio/pci.c
---
hw/vfio/Makefile.objs | 1 +
hw/vfio/common.c | 959 ++++++++++++++++++++++++++++++++++++++
hw/vfio/pci.c | 1028 +----------------------------------------
include/hw/vfio/vfio-common.h | 151 ++++++
trace-events | 1 +
5 files changed, 1113 insertions(+), 1027 deletions(-)
create mode 100644 hw/vfio/common.c
create mode 100644 include/hw/vfio/vfio-common.h
diff --git a/hw/vfio/Makefile.objs b/hw/vfio/Makefile.objs
index 31c7dab..e31f30e 100644
--- a/hw/vfio/Makefile.objs
+++ b/hw/vfio/Makefile.objs
@@ -1,3 +1,4 @@
ifeq ($(CONFIG_LINUX), y)
+obj-$(CONFIG_SOFTMMU) += common.o
obj-$(CONFIG_PCI) += pci.o
endif
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
new file mode 100644
index 0000000..554467f
--- /dev/null
+++ b/hw/vfio/common.c
@@ -0,0 +1,959 @@
+/*
+ * generic functions used by VFIO devices
+ *
+ * Copyright Red Hat, Inc. 2012
+ *
+ * Authors:
+ * Alex Williamson <alex.williamson@redhat.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2. See
+ * the COPYING file in the top-level directory.
+ *
+ * Based on qemu-kvm device-assignment:
+ * Adapted for KVM by Qumranet.
+ * Copyright (c) 2007, Neocleus, Alex Novik (alex@neocleus.com)
+ * Copyright (c) 2007, Neocleus, Guy Zana (guy@neocleus.com)
+ * Copyright (C) 2008, Qumranet, Amit Shah (amit.shah@qumranet.com)
+ * Copyright (C) 2008, Red Hat, Amit Shah (amit.shah@redhat.com)
+ * Copyright (C) 2008, IBM, Muli Ben-Yehuda (muli@il.ibm.com)
+ */
+
+#include <sys/ioctl.h>
+#include <sys/mman.h>
+#include <linux/vfio.h>
+
+#include "hw/vfio/vfio-common.h"
+#include "hw/vfio/vfio.h"
+#include "exec/address-spaces.h"
+#include "exec/memory.h"
+#include "hw/hw.h"
+#include "qemu/error-report.h"
+#include "sysemu/kvm.h"
+#include "trace.h"
+
+struct vfio_group_head vfio_group_list =
+ QLIST_HEAD_INITIALIZER(vfio_address_spaces);
+struct vfio_as_head vfio_address_spaces =
+ QLIST_HEAD_INITIALIZER(vfio_address_spaces);
+
+#ifdef CONFIG_KVM
+/*
+ * We have a single VFIO pseudo device per KVM VM. Once created it lives
+ * for the life of the VM. Closing the file descriptor only drops our
+ * reference to it and the device's reference to kvm. Therefore once
+ * initialized, this file descriptor is only released on QEMU exit and
+ * we'll re-use it should another vfio device be attached before then.
+ */
+static int vfio_kvm_device_fd = -1;
+#endif
+
+/*
+ * Common VFIO interrupt disable
+ */
+void vfio_disable_irqindex(VFIODevice *vbasedev, int index)
+{
+ struct vfio_irq_set irq_set = {
+ .argsz = sizeof(irq_set),
+ .flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_TRIGGER,
+ .index = index,
+ .start = 0,
+ .count = 0,
+ };
+
+ ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, &irq_set);
+}
+
+void vfio_unmask_single_irqindex(VFIODevice *vbasedev, int index)
+{
+ struct vfio_irq_set irq_set = {
+ .argsz = sizeof(irq_set),
+ .flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_UNMASK,
+ .index = index,
+ .start = 0,
+ .count = 1,
+ };
+
+ ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, &irq_set);
+}
+
+void vfio_mask_single_irqindex(VFIODevice *vbasedev, int index)
+{
+ struct vfio_irq_set irq_set = {
+ .argsz = sizeof(irq_set),
+ .flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_MASK,
+ .index = index,
+ .start = 0,
+ .count = 1,
+ };
+
+ ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, &irq_set);
+}
+
+/*
+ * IO Port/MMIO - Beware of the endians, VFIO is always little endian
+ */
+void vfio_region_write(void *opaque, hwaddr addr,
+ uint64_t data, unsigned size)
+{
+ VFIORegion *region = opaque;
+ VFIODevice *vbasedev = region->vbasedev;
+ union {
+ uint8_t byte;
+ uint16_t word;
+ uint32_t dword;
+ uint64_t qword;
+ } buf;
+
+ switch (size) {
+ case 1:
+ buf.byte = data;
+ break;
+ case 2:
+ buf.word = cpu_to_le16(data);
+ break;
+ case 4:
+ buf.dword = cpu_to_le32(data);
+ break;
+ default:
+ hw_error("vfio: unsupported write size, %d bytes", size);
+ break;
+ }
+
+ if (pwrite(vbasedev->fd, &buf, size, region->fd_offset + addr) != size) {
+ error_report("%s(%s:region%d+0x%"HWADDR_PRIx", 0x%"PRIx64
+ ",%d) failed: %m",
+ __func__, vbasedev->name, region->nr,
+ addr, data, size);
+ }
+
+ trace_vfio_region_write(vbasedev->name, region->nr, addr, data, size);
+
+ /*
+ * A read or write to a BAR always signals an INTx EOI. This will
+ * do nothing if not pending (including not in INTx mode). We assume
+ * that a BAR access is in response to an interrupt and that BAR
+ * accesses will service the interrupt. Unfortunately, we don't know
+ * which access will service the interrupt, so we're potentially
+ * getting quite a few host interrupts per guest interrupt.
+ */
+ vbasedev->ops->vfio_eoi(vbasedev);
+}
+
+uint64_t vfio_region_read(void *opaque,
+ hwaddr addr, unsigned size)
+{
+ VFIORegion *region = opaque;
+ VFIODevice *vbasedev = region->vbasedev;
+ union {
+ uint8_t byte;
+ uint16_t word;
+ uint32_t dword;
+ uint64_t qword;
+ } buf;
+ uint64_t data = 0;
+
+ if (pread(vbasedev->fd, &buf, size, region->fd_offset + addr) != size) {
+ error_report("%s(%s:region%d+0x%"HWADDR_PRIx", %d) failed: %m",
+ __func__, vbasedev->name, region->nr,
+ addr, size);
+ return (uint64_t)-1;
+ }
+ switch (size) {
+ case 1:
+ data = buf.byte;
+ break;
+ case 2:
+ data = le16_to_cpu(buf.word);
+ break;
+ case 4:
+ data = le32_to_cpu(buf.dword);
+ break;
+ default:
+ hw_error("vfio: unsupported read size, %d bytes", size);
+ break;
+ }
+
+ trace_vfio_region_read(vbasedev->name, region->nr, addr, size, data);
+
+ /* Same as write above */
+ vbasedev->ops->vfio_eoi(vbasedev);
+
+ return data;
+}
+
+const MemoryRegionOps vfio_region_ops = {
+ .read = vfio_region_read,
+ .write = vfio_region_write,
+ .endianness = DEVICE_LITTLE_ENDIAN,
+};
+
+/*
+ * DMA - Mapping and unmapping for the "type1" IOMMU interface used on x86
+ */
+static int vfio_dma_unmap(VFIOContainer *container,
+ hwaddr iova, ram_addr_t size)
+{
+ struct vfio_iommu_type1_dma_unmap unmap = {
+ .argsz = sizeof(unmap),
+ .flags = 0,
+ .iova = iova,
+ .size = size,
+ };
+
+ if (ioctl(container->fd, VFIO_IOMMU_UNMAP_DMA, &unmap)) {
+ error_report("VFIO_UNMAP_DMA: %d\n", -errno);
+ return -errno;
+ }
+
+ return 0;
+}
+
+static int vfio_dma_map(VFIOContainer *container, hwaddr iova,
+ ram_addr_t size, void *vaddr, bool readonly)
+{
+ struct vfio_iommu_type1_dma_map map = {
+ .argsz = sizeof(map),
+ .flags = VFIO_DMA_MAP_FLAG_READ,
+ .vaddr = (__u64)(uintptr_t)vaddr,
+ .iova = iova,
+ .size = size,
+ };
+
+ if (!readonly) {
+ map.flags |= VFIO_DMA_MAP_FLAG_WRITE;
+ }
+
+ /*
+ * Try the mapping, if it fails with EBUSY, unmap the region and try
+ * again. This shouldn't be necessary, but we sometimes see it in
+ * the the VGA ROM space.
+ */
+ if (ioctl(container->fd, VFIO_IOMMU_MAP_DMA, &map) == 0 ||
+ (errno == EBUSY && vfio_dma_unmap(container, iova, size) == 0 &&
+ ioctl(container->fd, VFIO_IOMMU_MAP_DMA, &map) == 0)) {
+ return 0;
+ }
+
+ error_report("VFIO_MAP_DMA: %d\n", -errno);
+ return -errno;
+}
+
+static bool vfio_listener_skipped_section(MemoryRegionSection *section)
+{
+ return (!memory_region_is_ram(section->mr) &&
+ !memory_region_is_iommu(section->mr)) ||
+ /*
+ * Sizing an enabled 64-bit BAR can cause spurious mappings to
+ * addresses in the upper part of the 64-bit address space. These
+ * are never accessed by the CPU and beyond the address width of
+ * some IOMMU hardware. TODO: VFIO should tell us the IOMMU width.
+ */
+ section->offset_within_address_space & (1ULL << 63);
+}
+
+static void vfio_iommu_map_notify(Notifier *n, void *data)
+{
+ VFIOGuestIOMMU *giommu = container_of(n, VFIOGuestIOMMU, n);
+ VFIOContainer *container = giommu->container;
+ IOMMUTLBEntry *iotlb = data;
+ MemoryRegion *mr;
+ hwaddr xlat;
+ hwaddr len = iotlb->addr_mask + 1;
+ void *vaddr;
+ int ret;
+
+ trace_vfio_iommu_map_notify(iotlb->iova,
+ iotlb->iova + iotlb->addr_mask);
+
+ /*
+ * The IOMMU TLB entry we have just covers translation through
+ * this IOMMU to its immediate target. We need to translate
+ * it the rest of the way through to memory.
+ */
+ mr = address_space_translate(&address_space_memory,
+ iotlb->translated_addr,
+ &xlat, &len, iotlb->perm & IOMMU_WO);
+ if (!memory_region_is_ram(mr)) {
+ error_report("iommu map to non memory area %"HWADDR_PRIx"\n",
+ xlat);
+ return;
+ }
+ /*
+ * Translation truncates length to the IOMMU page size,
+ * check that it did not truncate too much.
+ */
+ if (len & iotlb->addr_mask) {
+ error_report("iommu has granularity incompatible with target AS\n");
+ return;
+ }
+
+ if ((iotlb->perm & IOMMU_RW) != IOMMU_NONE) {
+ vaddr = memory_region_get_ram_ptr(mr) + xlat;
+ ret = vfio_dma_map(container, iotlb->iova,
+ iotlb->addr_mask + 1, vaddr,
+ !(iotlb->perm & IOMMU_WO) || mr->readonly);
+ if (ret) {
+ error_report("vfio_dma_map(%p, 0x%"HWADDR_PRIx", "
+ "0x%"HWADDR_PRIx", %p) = %d (%m)",
+ container, iotlb->iova,
+ iotlb->addr_mask + 1, vaddr, ret);
+ }
+ } else {
+ ret = vfio_dma_unmap(container, iotlb->iova, iotlb->addr_mask + 1);
+ if (ret) {
+ error_report("vfio_dma_unmap(%p, 0x%"HWADDR_PRIx", "
+ "0x%"HWADDR_PRIx") = %d (%m)",
+ container, iotlb->iova,
+ iotlb->addr_mask + 1, ret);
+ }
+ }
+}
+
+static void vfio_listener_region_add(MemoryListener *listener,
+ MemoryRegionSection *section)
+{
+ VFIOContainer *container = container_of(listener, VFIOContainer,
+ iommu_data.type1.listener);
+ hwaddr iova, end;
+ Int128 llend;
+ void *vaddr;
+ int ret;
+
+ if (vfio_listener_skipped_section(section)) {
+ trace_vfio_listener_region_add_skip(
+ section->offset_within_address_space,
+ section->offset_within_address_space +
+ int128_get64(int128_sub(section->size, int128_one())));
+ return;
+ }
+
+ if (unlikely((section->offset_within_address_space & ~TARGET_PAGE_MASK) !=
+ (section->offset_within_region & ~TARGET_PAGE_MASK))) {
+ error_report("%s received unaligned region", __func__);
+ return;
+ }
+
+ iova = TARGET_PAGE_ALIGN(section->offset_within_address_space);
+ llend = int128_make64(section->offset_within_address_space);
+ llend = int128_add(llend, section->size);
+ llend = int128_and(llend, int128_exts64(TARGET_PAGE_MASK));
+
+ if (int128_ge(int128_make64(iova), llend)) {
+ return;
+ }
+
+ memory_region_ref(section->mr);
+
+ if (memory_region_is_iommu(section->mr)) {
+ VFIOGuestIOMMU *giommu;
+
+ trace_vfio_listener_region_add_iommu(iova,
+ int128_get64(int128_sub(llend, int128_one())));
+ /*
+ * FIXME: We should do some checking to see if the
+ * capabilities of the host VFIO IOMMU are adequate to model
+ * the guest IOMMU
+ *
+ * FIXME: For VFIO iommu types which have KVM acceleration to
+ * avoid bouncing all map/unmaps through qemu this way, this
+ * would be the right place to wire that up (tell the KVM
+ * device emulation the VFIO iommu handles to use).
+ */
+ /*
+ * This assumes that the guest IOMMU is empty of
+ * mappings at this point.
+ *
+ * One way of doing this is:
+ * 1. Avoid sharing IOMMUs between emulated devices or different
+ * IOMMU groups.
+ * 2. Implement VFIO_IOMMU_ENABLE in the host kernel to fail if
+ * there are some mappings in IOMMU.
+ *
+ * VFIO on SPAPR does that. Other IOMMU models may do that different,
+ * they must make sure there are no existing mappings or
+ * loop through existing mappings to map them into VFIO.
+ */
+ giommu = g_malloc0(sizeof(*giommu));
+ giommu->iommu = section->mr;
+ giommu->container = container;
+ giommu->n.notify = vfio_iommu_map_notify;
+ QLIST_INSERT_HEAD(&container->giommu_list, giommu, giommu_next);
+ memory_region_register_iommu_notifier(giommu->iommu, &giommu->n);
+
+ return;
+ }
+
+ /* Here we assume that memory_region_is_ram(section->mr)==true */
+
+ end = int128_get64(llend);
+ vaddr = memory_region_get_ram_ptr(section->mr) +
+ section->offset_within_region +
+ (iova - section->offset_within_address_space);
+
+ trace_vfio_listener_region_add_ram(iova, end - 1, vaddr);
+
+ ret = vfio_dma_map(container, iova, end - iova, vaddr, section->readonly);
+ if (ret) {
+ error_report("vfio_dma_map(%p, 0x%"HWADDR_PRIx", "
+ "0x%"HWADDR_PRIx", %p) = %d (%m)",
+ container, iova, end - iova, vaddr, ret);
+
+ /*
+ * On the initfn path, store the first error in the container so we
+ * can gracefully fail. Runtime, there's not much we can do other
+ * than throw a hardware error.
+ */
+ if (!container->iommu_data.type1.initialized) {
+ if (!container->iommu_data.type1.error) {
+ container->iommu_data.type1.error = ret;
+ }
+ } else {
+ hw_error("vfio: DMA mapping failed, unable to continue");
+ }
+ }
+}
+
+static void vfio_listener_region_del(MemoryListener *listener,
+ MemoryRegionSection *section)
+{
+ VFIOContainer *container = container_of(listener, VFIOContainer,
+ iommu_data.type1.listener);
+ hwaddr iova, end;
+ int ret;
+
+ if (vfio_listener_skipped_section(section)) {
+ trace_vfio_listener_region_del_skip(
+ section->offset_within_address_space,
+ section->offset_within_address_space +
+ int128_get64(int128_sub(section->size, int128_one())));
+ return;
+ }
+
+ if (unlikely((section->offset_within_address_space & ~TARGET_PAGE_MASK) !=
+ (section->offset_within_region & ~TARGET_PAGE_MASK))) {
+ error_report("%s received unaligned region", __func__);
+ return;
+ }
+
+ if (memory_region_is_iommu(section->mr)) {
+ VFIOGuestIOMMU *giommu;
+
+ QLIST_FOREACH(giommu, &container->giommu_list, giommu_next) {
+ if (giommu->iommu == section->mr) {
+ memory_region_unregister_iommu_notifier(&giommu->n);
+ QLIST_REMOVE(giommu, giommu_next);
+ g_free(giommu);
+ break;
+ }
+ }
+
+ /*
+ * FIXME: We assume the one big unmap below is adequate to
+ * remove any individual page mappings in the IOMMU which
+ * might have been copied into VFIO. This works for a page table
+ * based IOMMU where a big unmap flattens a large range of IO-PTEs.
+ * That may not be true for all IOMMU types.
+ */
+ }
+
+ iova = TARGET_PAGE_ALIGN(section->offset_within_address_space);
+ end = (section->offset_within_address_space + int128_get64(section->size)) &
+ TARGET_PAGE_MASK;
+
+ if (iova >= end) {
+ return;
+ }
+
+ trace_vfio_listener_region_del(iova, end - 1);
+
+ ret = vfio_dma_unmap(container, iova, end - iova);
+ memory_region_unref(section->mr);
+ if (ret) {
+ error_report("vfio_dma_unmap(%p, 0x%"HWADDR_PRIx", "
+ "0x%"HWADDR_PRIx") = %d (%m)",
+ container, iova, end - iova, ret);
+ }
+}
+
+const MemoryListener vfio_memory_listener = {
+ .region_add = vfio_listener_region_add,
+ .region_del = vfio_listener_region_del,
+};
+
+void vfio_listener_release(VFIOContainer *container)
+{
+ memory_listener_unregister(&container->iommu_data.type1.listener);
+}
+
+int vfio_mmap_region(Object *obj, VFIORegion *region,
+ MemoryRegion *mem, MemoryRegion *submem,
+ void **map, size_t size, off_t offset,
+ const char *name)
+{
+ int ret = 0;
+ VFIODevice *vbasedev = region->vbasedev;
+
+ if (VFIO_ALLOW_MMAP && size && region->flags &
+ VFIO_REGION_INFO_FLAG_MMAP) {
+ int prot = 0;
+
+ if (region->flags & VFIO_REGION_INFO_FLAG_READ) {
+ prot |= PROT_READ;
+ }
+
+ if (region->flags & VFIO_REGION_INFO_FLAG_WRITE) {
+ prot |= PROT_WRITE;
+ }
+
+ *map = mmap(NULL, size, prot, MAP_SHARED,
+ vbasedev->fd,
+ region->fd_offset + offset);
+ if (*map == MAP_FAILED) {
+ *map = NULL;
+ ret = -errno;
+ goto empty_region;
+ }
+
+ memory_region_init_ram_ptr(submem, obj, name, size, *map);
+ memory_region_set_skip_dump(submem);
+ } else {
+empty_region:
+ /* Create a zero sized sub-region to make cleanup easy. */
+ memory_region_init(submem, obj, name, 0);
+ }
+
+ memory_region_add_subregion(mem, offset, submem);
+
+ return ret;
+}
+
+void vfio_reset_handler(void *opaque)
+{
+ VFIOGroup *group;
+ VFIODevice *vbasedev;
+
+ QLIST_FOREACH(group, &vfio_group_list, next) {
+ QLIST_FOREACH(vbasedev, &group->device_list, next) {
+ vbasedev->ops->vfio_compute_needs_reset(vbasedev);
+ }
+ }
+
+ QLIST_FOREACH(group, &vfio_group_list, next) {
+ QLIST_FOREACH(vbasedev, &group->device_list, next) {
+ if (vbasedev->needs_reset) {
+ vbasedev->ops->vfio_hot_reset_multi(vbasedev);
+ }
+ }
+ }
+}
+
+static void vfio_kvm_device_add_group(VFIOGroup *group)
+{
+#ifdef CONFIG_KVM
+ struct kvm_device_attr attr = {
+ .group = KVM_DEV_VFIO_GROUP,
+ .attr = KVM_DEV_VFIO_GROUP_ADD,
+ .addr = (uint64_t)(unsigned long)&group->fd,
+ };
+
+ if (!kvm_enabled()) {
+ return;
+ }
+
+ if (vfio_kvm_device_fd < 0) {
+ struct kvm_create_device cd = {
+ .type = KVM_DEV_TYPE_VFIO,
+ };
+
+ if (kvm_vm_ioctl(kvm_state, KVM_CREATE_DEVICE, &cd)) {
+ error_report("KVM_CREATE_DEVICE: %m\n");
+ return;
+ }
+
+ vfio_kvm_device_fd = cd.fd;
+ }
+
+ if (ioctl(vfio_kvm_device_fd, KVM_SET_DEVICE_ATTR, &attr)) {
+ error_report("Failed to add group %d to KVM VFIO device: %m",
+ group->groupid);
+ }
+#endif
+}
+
+static void vfio_kvm_device_del_group(VFIOGroup *group)
+{
+#ifdef CONFIG_KVM
+ struct kvm_device_attr attr = {
+ .group = KVM_DEV_VFIO_GROUP,
+ .attr = KVM_DEV_VFIO_GROUP_DEL,
+ .addr = (uint64_t)(unsigned long)&group->fd,
+ };
+
+ if (vfio_kvm_device_fd < 0) {
+ return;
+ }
+
+ if (ioctl(vfio_kvm_device_fd, KVM_SET_DEVICE_ATTR, &attr)) {
+ error_report("Failed to remove group %d from KVM VFIO device: %m",
+ group->groupid);
+ }
+#endif
+}
+
+static VFIOAddressSpace *vfio_get_address_space(AddressSpace *as)
+{
+ VFIOAddressSpace *space;
+
+ QLIST_FOREACH(space, &vfio_address_spaces, list) {
+ if (space->as == as) {
+ return space;
+ }
+ }
+
+ /* No suitable VFIOAddressSpace, create a new one */
+ space = g_malloc0(sizeof(*space));
+ space->as = as;
+ QLIST_INIT(&space->containers);
+
+ QLIST_INSERT_HEAD(&vfio_address_spaces, space, list);
+
+ return space;
+}
+
+static void vfio_put_address_space(VFIOAddressSpace *space)
+{
+ if (QLIST_EMPTY(&space->containers)) {
+ QLIST_REMOVE(space, list);
+ g_free(space);
+ }
+}
+
+static int vfio_connect_container(VFIOGroup *group, AddressSpace *as)
+{
+ VFIOContainer *container;
+ int ret, fd;
+ VFIOAddressSpace *space;
+
+ space = vfio_get_address_space(as);
+
+ QLIST_FOREACH(container, &space->containers, next) {
+ if (!ioctl(group->fd, VFIO_GROUP_SET_CONTAINER, &container->fd)) {
+ group->container = container;
+ QLIST_INSERT_HEAD(&container->group_list, group, container_next);
+ return 0;
+ }
+ }
+
+ fd = qemu_open("/dev/vfio/vfio", O_RDWR);
+ if (fd < 0) {
+ error_report("vfio: failed to open /dev/vfio/vfio: %m");
+ ret = -errno;
+ goto put_space_exit;
+ }
+
+ ret = ioctl(fd, VFIO_GET_API_VERSION);
+ if (ret != VFIO_API_VERSION) {
+ error_report("vfio: supported vfio version: %d, "
+ "reported version: %d", VFIO_API_VERSION, ret);
+ ret = -EINVAL;
+ goto close_fd_exit;
+ }
+
+ container = g_malloc0(sizeof(*container));
+ container->space = space;
+ container->fd = fd;
+ if (ioctl(fd, VFIO_CHECK_EXTENSION, VFIO_TYPE1_IOMMU)) {
+ ret = ioctl(group->fd, VFIO_GROUP_SET_CONTAINER, &fd);
+ if (ret) {
+ error_report("vfio: failed to set group container: %m");
+ ret = -errno;
+ goto free_container_exit;
+ }
+
+ ret = ioctl(fd, VFIO_SET_IOMMU, VFIO_TYPE1_IOMMU);
+ if (ret) {
+ error_report("vfio: failed to set iommu for container: %m");
+ ret = -errno;
+ goto free_container_exit;
+ }
+
+ container->iommu_data.type1.listener = vfio_memory_listener;
+ container->iommu_data.release = vfio_listener_release;
+
+ memory_listener_register(&container->iommu_data.type1.listener,
+ &address_space_memory);
+
+ if (container->iommu_data.type1.error) {
+ ret = container->iommu_data.type1.error;
+ error_report("vfio: memory listener initialization failed for container");
+ goto listener_release_exit;
+ }
+
+ container->iommu_data.type1.initialized = true;
+
+ } else if (ioctl(fd, VFIO_CHECK_EXTENSION, VFIO_SPAPR_TCE_IOMMU)) {
+ ret = ioctl(group->fd, VFIO_GROUP_SET_CONTAINER, &fd);
+ if (ret) {
+ error_report("vfio: failed to set group container: %m");
+ ret = -errno;
+ goto free_container_exit;
+ }
+ ret = ioctl(fd, VFIO_SET_IOMMU, VFIO_SPAPR_TCE_IOMMU);
+ if (ret) {
+ error_report("vfio: failed to set iommu for container: %m");
+ ret = -errno;
+ goto free_container_exit;
+ }
+
+ /*
+ * The host kernel code implementing VFIO_IOMMU_DISABLE is called
+ * when container fd is closed so we do not call it explicitly
+ * in this file.
+ */
+ ret = ioctl(fd, VFIO_IOMMU_ENABLE);
+ if (ret) {
+ error_report("vfio: failed to enable container: %m");
+ ret = -errno;
+ goto free_container_exit;
+ }
+
+ container->iommu_data.type1.listener = vfio_memory_listener;
+ container->iommu_data.release = vfio_listener_release;
+
+ memory_listener_register(&container->iommu_data.type1.listener,
+ container->space->as);
+
+ } else {
+ error_report("vfio: No available IOMMU models");
+ ret = -EINVAL;
+ goto free_container_exit;
+ }
+
+ QLIST_INIT(&container->group_list);
+ QLIST_INSERT_HEAD(&space->containers, container, next);
+
+ group->container = container;
+ QLIST_INSERT_HEAD(&container->group_list, group, container_next);
+
+ return 0;
+listener_release_exit:
+ vfio_listener_release(container);
+
+free_container_exit:
+ g_free(container);
+
+close_fd_exit:
+ close(fd);
+
+put_space_exit:
+ vfio_put_address_space(space);
+
+ return ret;
+}
+
+static void vfio_disconnect_container(VFIOGroup *group)
+{
+ VFIOContainer *container = group->container;
+
+ if (ioctl(group->fd, VFIO_GROUP_UNSET_CONTAINER, &container->fd)) {
+ error_report("vfio: error disconnecting group %d from container",
+ group->groupid);
+ }
+
+ QLIST_REMOVE(group, container_next);
+ group->container = NULL;
+
+ if (QLIST_EMPTY(&container->group_list)) {
+ VFIOAddressSpace *space = container->space;
+
+ if (container->iommu_data.release) {
+ container->iommu_data.release(container);
+ }
+ QLIST_REMOVE(container, next);
+ trace_vfio_disconnect_container(container->fd);
+ close(container->fd);
+ g_free(container);
+
+ vfio_put_address_space(space);
+ }
+}
+
+VFIOGroup *vfio_get_group(int groupid, AddressSpace *as)
+{
+ VFIOGroup *group;
+ char path[32];
+ struct vfio_group_status status = { .argsz = sizeof(status) };
+
+ QLIST_FOREACH(group, &vfio_group_list, next) {
+ if (group->groupid == groupid) {
+ /* Found it. Now is it already in the right context? */
+ if (group->container->space->as == as) {
+ return group;
+ } else {
+ error_report("vfio: group %d used in multiple address spaces",
+ group->groupid);
+ return NULL;
+ }
+ }
+ }
+
+ group = g_malloc0(sizeof(*group));
+
+ snprintf(path, sizeof(path), "/dev/vfio/%d", groupid);
+ group->fd = qemu_open(path, O_RDWR);
+ if (group->fd < 0) {
+ error_report("vfio: error opening %s: %m", path);
+ goto free_group_exit;
+ }
+
+ if (ioctl(group->fd, VFIO_GROUP_GET_STATUS, &status)) {
+ error_report("vfio: error getting group status: %m");
+ goto close_fd_exit;
+ }
+
+ if (!(status.flags & VFIO_GROUP_FLAGS_VIABLE)) {
+ error_report("vfio: error, group %d is not viable, please ensure "
+ "all devices within the iommu_group are bound to their "
+ "vfio bus driver.", groupid);
+ goto close_fd_exit;
+ }
+
+ group->groupid = groupid;
+ QLIST_INIT(&group->device_list);
+
+ if (vfio_connect_container(group, as)) {
+ error_report("vfio: failed to setup container for group %d", groupid);
+ goto close_fd_exit;
+ }
+
+ if (QLIST_EMPTY(&vfio_group_list)) {
+ qemu_register_reset(vfio_reset_handler, NULL);
+ }
+
+ QLIST_INSERT_HEAD(&vfio_group_list, group, next);
+
+ vfio_kvm_device_add_group(group);
+
+ return group;
+
+close_fd_exit:
+ close(group->fd);
+
+free_group_exit:
+ g_free(group);
+
+ return NULL;
+}
+
+void vfio_put_group(VFIOGroup *group)
+{
+ if (!QLIST_EMPTY(&group->device_list)) {
+ return;
+ }
+
+ vfio_kvm_device_del_group(group);
+ vfio_disconnect_container(group);
+ QLIST_REMOVE(group, next);
+ trace_vfio_put_group(group->fd);
+ close(group->fd);
+ g_free(group);
+
+ if (QLIST_EMPTY(&vfio_group_list)) {
+ qemu_unregister_reset(vfio_reset_handler, NULL);
+ }
+}
+
+int vfio_get_device(VFIOGroup *group, const char *name,
+ VFIODevice *vbasedev)
+{
+ struct vfio_device_info dev_info = { .argsz = sizeof(dev_info) };
+ int ret;
+
+ ret = ioctl(group->fd, VFIO_GROUP_GET_DEVICE_FD, name);
+ if (ret < 0) {
+ error_report("vfio: error getting device %s from group %d: %m",
+ name, group->groupid);
+ error_printf("Verify all devices in group %d are bound to vfio-<bus> "
+ "or pci-stub and not already in use\n", group->groupid);
+ return ret;
+ }
+
+ vbasedev->fd = ret;
+ vbasedev->group = group;
+ QLIST_INSERT_HEAD(&group->device_list, vbasedev, next);
+
+ ret = ioctl(vbasedev->fd, VFIO_DEVICE_GET_INFO, &dev_info);
+ if (ret) {
+ error_report("vfio: error getting device info: %m");
+ goto error;
+ }
+
+ vbasedev->num_irqs = dev_info.num_irqs;
+ vbasedev->num_regions = dev_info.num_regions;
+ vbasedev->flags = dev_info.flags;
+
+ trace_vfio_get_device(name, dev_info.flags, dev_info.num_regions,
+ dev_info.num_irqs);
+
+ vbasedev->reset_works = !!(dev_info.flags & VFIO_DEVICE_FLAGS_RESET);
+
+ ret = vbasedev->ops->vfio_populate_device(vbasedev);
+
+error:
+ if (ret) {
+ vfio_put_base_device(vbasedev);
+ }
+ return ret;
+}
+
+void vfio_put_base_device(VFIODevice *vbasedev)
+{
+ QLIST_REMOVE(vbasedev, next);
+ vbasedev->group = NULL;
+ trace_vfio_put_base_device(vbasedev->fd);
+ close(vbasedev->fd);
+}
+
+static int vfio_container_do_ioctl(AddressSpace *as, int32_t groupid,
+ int req, void *param)
+{
+ VFIOGroup *group;
+ VFIOContainer *container;
+ int ret = -1;
+
+ group = vfio_get_group(groupid, as);
+ if (!group) {
+ error_report("vfio: group %d not registered", groupid);
+ return ret;
+ }
+
+ container = group->container;
+ if (group->container) {
+ ret = ioctl(container->fd, req, param);
+ if (ret < 0) {
+ error_report("vfio: failed to ioctl container: ret=%d, %s",
+ ret, strerror(errno));
+ }
+ }
+
+ vfio_put_group(group);
+
+ return ret;
+}
+
+int vfio_container_ioctl(AddressSpace *as, int32_t groupid,
+ int req, void *param)
+{
+ /* We allow only certain ioctls to the container */
+ switch (req) {
+ case VFIO_CHECK_EXTENSION:
+ case VFIO_IOMMU_SPAPR_TCE_GET_INFO:
+ break;
+ default:
+ /* Return an error on unknown requests */
+ error_report("vfio: unsupported ioctl %X", req);
+ return -1;
+ }
+
+ return vfio_container_do_ioctl(as, groupid, req, param);
+}
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 6e15c8a..b2beae5 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -41,16 +41,7 @@
#include "sysemu/sysemu.h"
#include "trace.h"
#include "hw/vfio/vfio.h"
-
-/* Extra debugging, trap acceleration paths for more logging */
-#define VFIO_ALLOW_MMAP 1
-#define VFIO_ALLOW_KVM_INTX 1
-#define VFIO_ALLOW_KVM_MSI 1
-#define VFIO_ALLOW_KVM_MSIX 1
-
-enum {
- VFIO_DEVICE_TYPE_PCI = 0,
-};
+#include "hw/vfio/vfio-common.h"
struct VFIOPCIDevice;
@@ -77,17 +68,6 @@ typedef struct VFIOQuirk {
} data;
} VFIOQuirk;
-typedef struct VFIORegion {
- struct VFIODevice *vbasedev;
- off_t fd_offset; /* offset of region within device fd */
- MemoryRegion mem; /* slow, read/write access */
- MemoryRegion mmap_mem; /* direct mapped access */
- void *mmap;
- size_t size;
- uint32_t flags; /* VFIO region flags (rd/wr/mmap) */
- uint8_t nr; /* cache the region number for debug */
-} VFIORegion;
-
typedef struct VFIOBAR {
VFIORegion region;
bool ioport;
@@ -143,45 +123,6 @@ enum {
VFIO_INT_MSIX = 3,
};
-typedef struct VFIOAddressSpace {
- AddressSpace *as;
- QLIST_HEAD(, VFIOContainer) containers;
- QLIST_ENTRY(VFIOAddressSpace) list;
-} VFIOAddressSpace;
-
-static QLIST_HEAD(, VFIOAddressSpace) vfio_address_spaces =
- QLIST_HEAD_INITIALIZER(vfio_address_spaces);
-
-struct VFIOGroup;
-
-typedef struct VFIOType1 {
- MemoryListener listener;
- int error;
- bool initialized;
-} VFIOType1;
-
-typedef struct VFIOContainer {
- VFIOAddressSpace *space;
- int fd; /* /dev/vfio/vfio, empowered by the attached groups */
- struct {
- /* enable abstraction to support various iommu backends */
- union {
- VFIOType1 type1;
- };
- void (*release)(struct VFIOContainer *);
- } iommu_data;
- QLIST_HEAD(, VFIOGuestIOMMU) giommu_list;
- QLIST_HEAD(, VFIOGroup) group_list;
- QLIST_ENTRY(VFIOContainer) next;
-} VFIOContainer;
-
-typedef struct VFIOGuestIOMMU {
- VFIOContainer *container;
- MemoryRegion *iommu;
- Notifier n;
- QLIST_ENTRY(VFIOGuestIOMMU) giommu_next;
-} VFIOGuestIOMMU;
-
/* Cache of MSI-X setup plus extra mmap and memory region for split BAR map */
typedef struct VFIOMSIXInfo {
uint8_t table_bar;
@@ -193,29 +134,6 @@ typedef struct VFIOMSIXInfo {
void *mmap;
} VFIOMSIXInfo;
-typedef struct VFIODeviceOps VFIODeviceOps;
-
-typedef struct VFIODevice {
- QLIST_ENTRY(VFIODevice) next;
- struct VFIOGroup *group;
- char *name;
- int fd;
- int type;
- bool reset_works;
- bool needs_reset;
- VFIODeviceOps *ops;
- unsigned int num_irqs;
- unsigned int num_regions;
- unsigned int flags;
-} VFIODevice;
-
-struct VFIODeviceOps {
- void (*vfio_compute_needs_reset)(VFIODevice *vdev);
- int (*vfio_hot_reset_multi)(VFIODevice *vdev);
- void (*vfio_eoi)(VFIODevice *vdev);
- int (*vfio_populate_device)(VFIODevice *vdev);
-};
-
typedef struct VFIOPCIDevice {
PCIDevice pdev;
VFIODevice vbasedev;
@@ -247,15 +165,6 @@ typedef struct VFIOPCIDevice {
bool rom_read_failed;
} VFIOPCIDevice;
-typedef struct VFIOGroup {
- int fd;
- int groupid;
- VFIOContainer *container;
- QLIST_HEAD(, VFIODevice) device_list;
- QLIST_ENTRY(VFIOGroup) next;
- QLIST_ENTRY(VFIOGroup) container_next;
-} VFIOGroup;
-
typedef struct VFIORomBlacklistEntry {
uint16_t vendor_id;
uint16_t device_id;
@@ -281,76 +190,14 @@ static const VFIORomBlacklistEntry romblacklist[] = {
#define MSIX_CAP_LENGTH 12
-static QLIST_HEAD(, VFIOGroup)
- vfio_group_list = QLIST_HEAD_INITIALIZER(vfio_group_list);
-
-#ifdef CONFIG_KVM
-/*
- * We have a single VFIO pseudo device per KVM VM. Once created it lives
- * for the life of the VM. Closing the file descriptor only drops our
- * reference to it and the device's reference to kvm. Therefore once
- * initialized, this file descriptor is only released on QEMU exit and
- * we'll re-use it should another vfio device be attached before then.
- */
-static int vfio_kvm_device_fd = -1;
-#endif
-
static void vfio_disable_interrupts(VFIOPCIDevice *vdev);
static uint32_t vfio_pci_read_config(PCIDevice *pdev, uint32_t addr, int len);
static void vfio_pci_write_config(PCIDevice *pdev, uint32_t addr,
uint32_t val, int len);
static void vfio_mmap_set_enabled(VFIOPCIDevice *vdev, bool enabled);
-static void vfio_put_base_device(VFIODevice *vbasedev);
static int vfio_populate_device(VFIODevice *vbasedev);
/*
- * Common VFIO interrupt disable
- */
-static void vfio_disable_irqindex(VFIODevice *vbasedev, int index)
-{
- struct vfio_irq_set irq_set = {
- .argsz = sizeof(irq_set),
- .flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_TRIGGER,
- .index = index,
- .start = 0,
- .count = 0,
- };
-
- ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, &irq_set);
-}
-
-/*
- * INTx
- */
-static void vfio_unmask_single_irqindex(VFIODevice *vbasedev, int index)
-{
- struct vfio_irq_set irq_set = {
- .argsz = sizeof(irq_set),
- .flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_UNMASK,
- .index = index,
- .start = 0,
- .count = 1,
- };
-
- ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, &irq_set);
-}
-
-#ifdef CONFIG_KVM /* Unused outside of CONFIG_KVM code */
-static void vfio_mask_single_irqindex(VFIODevice *vbasedev, int index)
-{
- struct vfio_irq_set irq_set = {
- .argsz = sizeof(irq_set),
- .flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_MASK,
- .index = index,
- .start = 0,
- .count = 1,
- };
-
- ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, &irq_set);
-}
-#endif
-
-/*
* Disabling BAR mmaping can be slow, but toggling it around INTx can
* also be a huge overhead. We try to get the best of both worlds by
* waiting until an interrupt to disable mmaps (subsequent transitions
@@ -1080,105 +927,6 @@ static void vfio_update_msi(VFIOPCIDevice *vdev)
}
}
-/*
- * IO Port/MMIO - Beware of the endians, VFIO is always little endian
- */
-static void vfio_region_write(void *opaque, hwaddr addr,
- uint64_t data, unsigned size)
-{
- VFIORegion *region = opaque;
- VFIODevice *vbasedev = region->vbasedev;
- union {
- uint8_t byte;
- uint16_t word;
- uint32_t dword;
- uint64_t qword;
- } buf;
-
- switch (size) {
- case 1:
- buf.byte = data;
- break;
- case 2:
- buf.word = cpu_to_le16(data);
- break;
- case 4:
- buf.dword = cpu_to_le32(data);
- break;
- default:
- hw_error("vfio: unsupported write size, %d bytes", size);
- break;
- }
-
- if (pwrite(vbasedev->fd, &buf, size, region->fd_offset + addr) != size) {
- error_report("%s(%s:region%d+0x%"HWADDR_PRIx", 0x%"PRIx64
- ",%d) failed: %m",
- __func__, vbasedev->name, region->nr,
- addr, data, size);
- }
-
- trace_vfio_region_write(vbasedev->name, region->nr, addr, data, size);
-
- /*
- * A read or write to a BAR always signals an INTx EOI. This will
- * do nothing if not pending (including not in INTx mode). We assume
- * that a BAR access is in response to an interrupt and that BAR
- * accesses will service the interrupt. Unfortunately, we don't know
- * which access will service the interrupt, so we're potentially
- * getting quite a few host interrupts per guest interrupt.
- */
- vbasedev->ops->vfio_eoi(vbasedev);
-}
-
-static uint64_t vfio_region_read(void *opaque,
- hwaddr addr, unsigned size)
-{
- VFIORegion *region = opaque;
- VFIODevice *vbasedev = region->vbasedev;
- union {
- uint8_t byte;
- uint16_t word;
- uint32_t dword;
- uint64_t qword;
- } buf;
- uint64_t data = 0;
-
- if (pread(vbasedev->fd, &buf, size, region->fd_offset + addr) != size) {
- error_report("%s(%s:region%d+0x%"HWADDR_PRIx", %d) failed: %m",
- __func__, vbasedev->name, region->nr,
- addr, size);
- return (uint64_t)-1;
- }
-
- switch (size) {
- case 1:
- data = buf.byte;
- break;
- case 2:
- data = le16_to_cpu(buf.word);
- break;
- case 4:
- data = le32_to_cpu(buf.dword);
- break;
- default:
- hw_error("vfio: unsupported read size, %d bytes", size);
- break;
- }
-
- trace_vfio_region_read(vbasedev->name, region->nr, addr, size, data);
-
- /* Same as write above */
- vbasedev->ops->vfio_eoi(vbasedev);
-
- return data;
-}
-
-static const MemoryRegionOps vfio_region_ops = {
- .read = vfio_region_read,
- .write = vfio_region_write,
- .endianness = DEVICE_LITTLE_ENDIAN,
-};
-
static void vfio_pci_load_rom(VFIOPCIDevice *vdev)
{
struct vfio_region_info reg_info = {
@@ -2377,305 +2125,6 @@ static void vfio_pci_write_config(PCIDevice *pdev, uint32_t addr,
}
/*
- * DMA - Mapping and unmapping for the "type1" IOMMU interface used on x86
- */
-static int vfio_dma_unmap(VFIOContainer *container,
- hwaddr iova, ram_addr_t size)
-{
- struct vfio_iommu_type1_dma_unmap unmap = {
- .argsz = sizeof(unmap),
- .flags = 0,
- .iova = iova,
- .size = size,
- };
-
- if (ioctl(container->fd, VFIO_IOMMU_UNMAP_DMA, &unmap)) {
- error_report("VFIO_UNMAP_DMA: %d\n", -errno);
- return -errno;
- }
-
- return 0;
-}
-
-static int vfio_dma_map(VFIOContainer *container, hwaddr iova,
- ram_addr_t size, void *vaddr, bool readonly)
-{
- struct vfio_iommu_type1_dma_map map = {
- .argsz = sizeof(map),
- .flags = VFIO_DMA_MAP_FLAG_READ,
- .vaddr = (__u64)(uintptr_t)vaddr,
- .iova = iova,
- .size = size,
- };
-
- if (!readonly) {
- map.flags |= VFIO_DMA_MAP_FLAG_WRITE;
- }
-
- /*
- * Try the mapping, if it fails with EBUSY, unmap the region and try
- * again. This shouldn't be necessary, but we sometimes see it in
- * the the VGA ROM space.
- */
- if (ioctl(container->fd, VFIO_IOMMU_MAP_DMA, &map) == 0 ||
- (errno == EBUSY && vfio_dma_unmap(container, iova, size) == 0 &&
- ioctl(container->fd, VFIO_IOMMU_MAP_DMA, &map) == 0)) {
- return 0;
- }
-
- error_report("VFIO_MAP_DMA: %d\n", -errno);
- return -errno;
-}
-
-static bool vfio_listener_skipped_section(MemoryRegionSection *section)
-{
- return (!memory_region_is_ram(section->mr) &&
- !memory_region_is_iommu(section->mr)) ||
- /*
- * Sizing an enabled 64-bit BAR can cause spurious mappings to
- * addresses in the upper part of the 64-bit address space. These
- * are never accessed by the CPU and beyond the address width of
- * some IOMMU hardware. TODO: VFIO should tell us the IOMMU width.
- */
- section->offset_within_address_space & (1ULL << 63);
-}
-
-static void vfio_iommu_map_notify(Notifier *n, void *data)
-{
- VFIOGuestIOMMU *giommu = container_of(n, VFIOGuestIOMMU, n);
- VFIOContainer *container = giommu->container;
- IOMMUTLBEntry *iotlb = data;
- MemoryRegion *mr;
- hwaddr xlat;
- hwaddr len = iotlb->addr_mask + 1;
- void *vaddr;
- int ret;
-
- trace_vfio_iommu_map_notify(iotlb->iova,
- iotlb->iova + iotlb->addr_mask);
-
- /*
- * The IOMMU TLB entry we have just covers translation through
- * this IOMMU to its immediate target. We need to translate
- * it the rest of the way through to memory.
- */
- mr = address_space_translate(&address_space_memory,
- iotlb->translated_addr,
- &xlat, &len, iotlb->perm & IOMMU_WO);
- if (!memory_region_is_ram(mr)) {
- error_report("iommu map to non memory area %"HWADDR_PRIx"\n",
- xlat);
- return;
- }
- /*
- * Translation truncates length to the IOMMU page size,
- * check that it did not truncate too much.
- */
- if (len & iotlb->addr_mask) {
- error_report("iommu has granularity incompatible with target AS\n");
- return;
- }
-
- if ((iotlb->perm & IOMMU_RW) != IOMMU_NONE) {
- vaddr = memory_region_get_ram_ptr(mr) + xlat;
-
- ret = vfio_dma_map(container, iotlb->iova,
- iotlb->addr_mask + 1, vaddr,
- !(iotlb->perm & IOMMU_WO) || mr->readonly);
- if (ret) {
- error_report("vfio_dma_map(%p, 0x%"HWADDR_PRIx", "
- "0x%"HWADDR_PRIx", %p) = %d (%m)",
- container, iotlb->iova,
- iotlb->addr_mask + 1, vaddr, ret);
- }
- } else {
- ret = vfio_dma_unmap(container, iotlb->iova, iotlb->addr_mask + 1);
- if (ret) {
- error_report("vfio_dma_unmap(%p, 0x%"HWADDR_PRIx", "
- "0x%"HWADDR_PRIx") = %d (%m)",
- container, iotlb->iova,
- iotlb->addr_mask + 1, ret);
- }
- }
-}
-
-static void vfio_listener_region_add(MemoryListener *listener,
- MemoryRegionSection *section)
-{
- VFIOContainer *container = container_of(listener, VFIOContainer,
- iommu_data.type1.listener);
- hwaddr iova, end;
- Int128 llend;
- void *vaddr;
- int ret;
-
- if (vfio_listener_skipped_section(section)) {
- trace_vfio_listener_region_add_skip(
- section->offset_within_address_space,
- section->offset_within_address_space +
- int128_get64(int128_sub(section->size, int128_one())));
- return;
- }
-
- if (unlikely((section->offset_within_address_space & ~TARGET_PAGE_MASK) !=
- (section->offset_within_region & ~TARGET_PAGE_MASK))) {
- error_report("%s received unaligned region", __func__);
- return;
- }
-
- iova = TARGET_PAGE_ALIGN(section->offset_within_address_space);
- llend = int128_make64(section->offset_within_address_space);
- llend = int128_add(llend, section->size);
- llend = int128_and(llend, int128_exts64(TARGET_PAGE_MASK));
-
- if (int128_ge(int128_make64(iova), llend)) {
- return;
- }
-
- memory_region_ref(section->mr);
-
- if (memory_region_is_iommu(section->mr)) {
- VFIOGuestIOMMU *giommu;
-
- trace_vfio_listener_region_add_iommu(iova,
- int128_get64(int128_sub(llend, int128_one())));
- /*
- * FIXME: We should do some checking to see if the
- * capabilities of the host VFIO IOMMU are adequate to model
- * the guest IOMMU
- *
- * FIXME: For VFIO iommu types which have KVM acceleration to
- * avoid bouncing all map/unmaps through qemu this way, this
- * would be the right place to wire that up (tell the KVM
- * device emulation the VFIO iommu handles to use).
- */
- /*
- * This assumes that the guest IOMMU is empty of
- * mappings at this point.
- *
- * One way of doing this is:
- * 1. Avoid sharing IOMMUs between emulated devices or different
- * IOMMU groups.
- * 2. Implement VFIO_IOMMU_ENABLE in the host kernel to fail if
- * there are some mappings in IOMMU.
- *
- * VFIO on SPAPR does that. Other IOMMU models may do that different,
- * they must make sure there are no existing mappings or
- * loop through existing mappings to map them into VFIO.
- */
- giommu = g_malloc0(sizeof(*giommu));
- giommu->iommu = section->mr;
- giommu->container = container;
- giommu->n.notify = vfio_iommu_map_notify;
- QLIST_INSERT_HEAD(&container->giommu_list, giommu, giommu_next);
- memory_region_register_iommu_notifier(giommu->iommu, &giommu->n);
-
- return;
- }
-
- /* Here we assume that memory_region_is_ram(section->mr)==true */
-
- end = int128_get64(llend);
- vaddr = memory_region_get_ram_ptr(section->mr) +
- section->offset_within_region +
- (iova - section->offset_within_address_space);
-
- trace_vfio_listener_region_add_ram(iova, end - 1, vaddr);
-
- ret = vfio_dma_map(container, iova, end - iova, vaddr, section->readonly);
- if (ret) {
- error_report("vfio_dma_map(%p, 0x%"HWADDR_PRIx", "
- "0x%"HWADDR_PRIx", %p) = %d (%m)",
- container, iova, end - iova, vaddr, ret);
-
- /*
- * On the initfn path, store the first error in the container so we
- * can gracefully fail. Runtime, there's not much we can do other
- * than throw a hardware error.
- */
- if (!container->iommu_data.type1.initialized) {
- if (!container->iommu_data.type1.error) {
- container->iommu_data.type1.error = ret;
- }
- } else {
- hw_error("vfio: DMA mapping failed, unable to continue");
- }
- }
-}
-
-static void vfio_listener_region_del(MemoryListener *listener,
- MemoryRegionSection *section)
-{
- VFIOContainer *container = container_of(listener, VFIOContainer,
- iommu_data.type1.listener);
- hwaddr iova, end;
- int ret;
-
- if (vfio_listener_skipped_section(section)) {
- trace_vfio_listener_region_del_skip(
- section->offset_within_address_space,
- section->offset_within_address_space +
- int128_get64(int128_sub(section->size, int128_one())));
- return;
- }
-
- if (unlikely((section->offset_within_address_space & ~TARGET_PAGE_MASK) !=
- (section->offset_within_region & ~TARGET_PAGE_MASK))) {
- error_report("%s received unaligned region", __func__);
- return;
- }
-
- if (memory_region_is_iommu(section->mr)) {
- VFIOGuestIOMMU *giommu;
-
- QLIST_FOREACH(giommu, &container->giommu_list, giommu_next) {
- if (giommu->iommu == section->mr) {
- memory_region_unregister_iommu_notifier(&giommu->n);
- QLIST_REMOVE(giommu, giommu_next);
- g_free(giommu);
- break;
- }
- }
-
- /*
- * FIXME: We assume the one big unmap below is adequate to
- * remove any individual page mappings in the IOMMU which
- * might have been copied into VFIO. This works for a page table
- * based IOMMU where a big unmap flattens a large range of IO-PTEs.
- * That may not be true for all IOMMU types.
- */
- }
-
- iova = TARGET_PAGE_ALIGN(section->offset_within_address_space);
- end = (section->offset_within_address_space + int128_get64(section->size)) &
- TARGET_PAGE_MASK;
-
- if (iova >= end) {
- return;
- }
-
- trace_vfio_listener_region_del(iova, end - 1);
-
- ret = vfio_dma_unmap(container, iova, end - iova);
- memory_region_unref(section->mr);
- if (ret) {
- error_report("vfio_dma_unmap(%p, 0x%"HWADDR_PRIx", "
- "0x%"HWADDR_PRIx") = %d (%m)",
- container, iova, end - iova, ret);
- }
-}
-
-static MemoryListener vfio_memory_listener = {
- .region_add = vfio_listener_region_add,
- .region_del = vfio_listener_region_del,
-};
-
-static void vfio_listener_release(VFIOContainer *container)
-{
- memory_listener_unregister(&container->iommu_data.type1.listener);
-}
-
-/*
* Interrupt setup
*/
static void vfio_disable_interrupts(VFIOPCIDevice *vdev)
@@ -2849,47 +2298,6 @@ static void vfio_unmap_bar(VFIOPCIDevice *vdev, int nr)
}
}
-static int vfio_mmap_region(Object *obj, VFIORegion *region,
- MemoryRegion *mem, MemoryRegion *submem,
- void **map, size_t size, off_t offset,
- const char *name)
-{
- int ret = 0;
- VFIODevice *vbasedev = region->vbasedev;
-
- if (VFIO_ALLOW_MMAP && size && region->flags &
- VFIO_REGION_INFO_FLAG_MMAP) {
- int prot = 0;
-
- if (region->flags & VFIO_REGION_INFO_FLAG_READ) {
- prot |= PROT_READ;
- }
-
- if (region->flags & VFIO_REGION_INFO_FLAG_WRITE) {
- prot |= PROT_WRITE;
- }
-
- *map = mmap(NULL, size, prot, MAP_SHARED,
- vbasedev->fd, region->fd_offset + offset);
- if (*map == MAP_FAILED) {
- *map = NULL;
- ret = -errno;
- goto empty_region;
- }
-
- memory_region_init_ram_ptr(submem, obj, name, size, *map);
- memory_region_set_skip_dump(submem);
- } else {
-empty_region:
- /* Create a zero sized sub-region to make cleanup easy. */
- memory_region_init(submem, obj, name, 0);
- }
-
- memory_region_add_subregion(mem, offset, submem);
-
- return ret;
-}
-
static void vfio_map_bar(VFIOPCIDevice *vdev, int nr)
{
VFIOBAR *bar = &vdev->bars[nr];
@@ -3529,345 +2937,6 @@ static VFIODeviceOps vfio_pci_ops = {
.vfio_populate_device = vfio_populate_device,
};
-static void vfio_reset_handler(void *opaque)
-{
- VFIOGroup *group;
- VFIODevice *vbasedev;
-
- QLIST_FOREACH(group, &vfio_group_list, next) {
- QLIST_FOREACH(vbasedev, &group->device_list, next) {
- vbasedev->ops->vfio_compute_needs_reset(vbasedev);
- }
- }
-
- QLIST_FOREACH(group, &vfio_group_list, next) {
- QLIST_FOREACH(vbasedev, &group->device_list, next) {
- if (vbasedev->needs_reset) {
- vbasedev->ops->vfio_hot_reset_multi(vbasedev);
- }
- }
- }
-}
-
-static void vfio_kvm_device_add_group(VFIOGroup *group)
-{
-#ifdef CONFIG_KVM
- struct kvm_device_attr attr = {
- .group = KVM_DEV_VFIO_GROUP,
- .attr = KVM_DEV_VFIO_GROUP_ADD,
- .addr = (uint64_t)(unsigned long)&group->fd,
- };
-
- if (!kvm_enabled()) {
- return;
- }
-
- if (vfio_kvm_device_fd < 0) {
- struct kvm_create_device cd = {
- .type = KVM_DEV_TYPE_VFIO,
- };
-
- if (kvm_vm_ioctl(kvm_state, KVM_CREATE_DEVICE, &cd)) {
- error_report("KVM_CREATE_DEVICE: %m\n");
- return;
- }
-
- vfio_kvm_device_fd = cd.fd;
- }
-
- if (ioctl(vfio_kvm_device_fd, KVM_SET_DEVICE_ATTR, &attr)) {
- error_report("Failed to add group %d to KVM VFIO device: %m",
- group->groupid);
- }
-#endif
-}
-
-static void vfio_kvm_device_del_group(VFIOGroup *group)
-{
-#ifdef CONFIG_KVM
- struct kvm_device_attr attr = {
- .group = KVM_DEV_VFIO_GROUP,
- .attr = KVM_DEV_VFIO_GROUP_DEL,
- .addr = (uint64_t)(unsigned long)&group->fd,
- };
-
- if (vfio_kvm_device_fd < 0) {
- return;
- }
-
- if (ioctl(vfio_kvm_device_fd, KVM_SET_DEVICE_ATTR, &attr)) {
- error_report("Failed to remove group %d from KVM VFIO device: %m",
- group->groupid);
- }
-#endif
-}
-
-static VFIOAddressSpace *vfio_get_address_space(AddressSpace *as)
-{
- VFIOAddressSpace *space;
-
- QLIST_FOREACH(space, &vfio_address_spaces, list) {
- if (space->as == as) {
- return space;
- }
- }
-
- /* No suitable VFIOAddressSpace, create a new one */
- space = g_malloc0(sizeof(*space));
- space->as = as;
- QLIST_INIT(&space->containers);
-
- QLIST_INSERT_HEAD(&vfio_address_spaces, space, list);
-
- return space;
-}
-
-static void vfio_put_address_space(VFIOAddressSpace *space)
-{
- if (QLIST_EMPTY(&space->containers)) {
- QLIST_REMOVE(space, list);
- g_free(space);
- }
-}
-
-static int vfio_connect_container(VFIOGroup *group, AddressSpace *as)
-{
- VFIOContainer *container;
- int ret, fd;
- VFIOAddressSpace *space;
-
- space = vfio_get_address_space(as);
-
- QLIST_FOREACH(container, &space->containers, next) {
- if (!ioctl(group->fd, VFIO_GROUP_SET_CONTAINER, &container->fd)) {
- group->container = container;
- QLIST_INSERT_HEAD(&container->group_list, group, container_next);
- return 0;
- }
- }
-
- fd = qemu_open("/dev/vfio/vfio", O_RDWR);
- if (fd < 0) {
- error_report("vfio: failed to open /dev/vfio/vfio: %m");
- ret = -errno;
- goto put_space_exit;
- }
-
- ret = ioctl(fd, VFIO_GET_API_VERSION);
- if (ret != VFIO_API_VERSION) {
- error_report("vfio: supported vfio version: %d, "
- "reported version: %d", VFIO_API_VERSION, ret);
- ret = -EINVAL;
- goto close_fd_exit;
- }
-
- container = g_malloc0(sizeof(*container));
- container->space = space;
- container->fd = fd;
-
- if (ioctl(fd, VFIO_CHECK_EXTENSION, VFIO_TYPE1_IOMMU)) {
- ret = ioctl(group->fd, VFIO_GROUP_SET_CONTAINER, &fd);
- if (ret) {
- error_report("vfio: failed to set group container: %m");
- ret = -errno;
- goto free_container_exit;
- }
-
- ret = ioctl(fd, VFIO_SET_IOMMU, VFIO_TYPE1_IOMMU);
- if (ret) {
- error_report("vfio: failed to set iommu for container: %m");
- ret = -errno;
- goto free_container_exit;
- }
-
- container->iommu_data.type1.listener = vfio_memory_listener;
- container->iommu_data.release = vfio_listener_release;
-
- memory_listener_register(&container->iommu_data.type1.listener,
- &address_space_memory);
-
- if (container->iommu_data.type1.error) {
- ret = container->iommu_data.type1.error;
- error_report("vfio: memory listener initialization failed for container");
- goto listener_release_exit;
- }
-
- container->iommu_data.type1.initialized = true;
-
- } else if (ioctl(fd, VFIO_CHECK_EXTENSION, VFIO_SPAPR_TCE_IOMMU)) {
- ret = ioctl(group->fd, VFIO_GROUP_SET_CONTAINER, &fd);
- if (ret) {
- error_report("vfio: failed to set group container: %m");
- ret = -errno;
- goto free_container_exit;
- }
-
- ret = ioctl(fd, VFIO_SET_IOMMU, VFIO_SPAPR_TCE_IOMMU);
- if (ret) {
- error_report("vfio: failed to set iommu for container: %m");
- ret = -errno;
- goto free_container_exit;
- }
-
- /*
- * The host kernel code implementing VFIO_IOMMU_DISABLE is called
- * when container fd is closed so we do not call it explicitly
- * in this file.
- */
- ret = ioctl(fd, VFIO_IOMMU_ENABLE);
- if (ret) {
- error_report("vfio: failed to enable container: %m");
- ret = -errno;
- goto free_container_exit;
- }
-
- container->iommu_data.type1.listener = vfio_memory_listener;
- container->iommu_data.release = vfio_listener_release;
-
- memory_listener_register(&container->iommu_data.type1.listener,
- container->space->as);
-
- } else {
- error_report("vfio: No available IOMMU models");
- ret = -EINVAL;
- goto free_container_exit;
- }
-
- QLIST_INIT(&container->group_list);
- QLIST_INSERT_HEAD(&space->containers, container, next);
-
- group->container = container;
- QLIST_INSERT_HEAD(&container->group_list, group, container_next);
-
- return 0;
-
-listener_release_exit:
- vfio_listener_release(container);
-
-free_container_exit:
- g_free(container);
-
-close_fd_exit:
- close(fd);
-
-put_space_exit:
- vfio_put_address_space(space);
-
- return ret;
-}
-
-static void vfio_disconnect_container(VFIOGroup *group)
-{
- VFIOContainer *container = group->container;
-
- if (ioctl(group->fd, VFIO_GROUP_UNSET_CONTAINER, &container->fd)) {
- error_report("vfio: error disconnecting group %d from container",
- group->groupid);
- }
-
- QLIST_REMOVE(group, container_next);
- group->container = NULL;
-
- if (QLIST_EMPTY(&container->group_list)) {
- VFIOAddressSpace *space = container->space;
-
- if (container->iommu_data.release) {
- container->iommu_data.release(container);
- }
- QLIST_REMOVE(container, next);
- trace_vfio_disconnect_container(container->fd);
- close(container->fd);
- g_free(container);
-
- vfio_put_address_space(space);
- }
-}
-
-static VFIOGroup *vfio_get_group(int groupid, AddressSpace *as)
-{
- VFIOGroup *group;
- char path[32];
- struct vfio_group_status status = { .argsz = sizeof(status) };
-
- QLIST_FOREACH(group, &vfio_group_list, next) {
- if (group->groupid == groupid) {
- /* Found it. Now is it already in the right context? */
- if (group->container->space->as == as) {
- return group;
- } else {
- error_report("vfio: group %d used in multiple address spaces",
- group->groupid);
- return NULL;
- }
- }
- }
-
- group = g_malloc0(sizeof(*group));
-
- snprintf(path, sizeof(path), "/dev/vfio/%d", groupid);
- group->fd = qemu_open(path, O_RDWR);
- if (group->fd < 0) {
- error_report("vfio: error opening %s: %m", path);
- goto free_group_exit;
- }
-
- if (ioctl(group->fd, VFIO_GROUP_GET_STATUS, &status)) {
- error_report("vfio: error getting group status: %m");
- goto close_fd_exit;
- }
-
- if (!(status.flags & VFIO_GROUP_FLAGS_VIABLE)) {
- error_report("vfio: error, group %d is not viable, please ensure "
- "all devices within the iommu_group are bound to their "
- "vfio bus driver.", groupid);
- goto close_fd_exit;
- }
-
- group->groupid = groupid;
- QLIST_INIT(&group->device_list);
-
- if (vfio_connect_container(group, as)) {
- error_report("vfio: failed to setup container for group %d", groupid);
- goto close_fd_exit;
- }
-
- if (QLIST_EMPTY(&vfio_group_list)) {
- qemu_register_reset(vfio_reset_handler, NULL);
- }
-
- QLIST_INSERT_HEAD(&vfio_group_list, group, next);
-
- vfio_kvm_device_add_group(group);
-
- return group;
-
-close_fd_exit:
- close(group->fd);
-
-free_group_exit:
- g_free(group);
-
- return NULL;
-}
-
-static void vfio_put_group(VFIOGroup *group)
-{
- if (!QLIST_EMPTY(&group->device_list)) {
- return;
- }
-
- vfio_kvm_device_del_group(group);
- vfio_disconnect_container(group);
- QLIST_REMOVE(group, next);
- trace_vfio_put_group(group->fd);
- close(group->fd);
- g_free(group);
-
- if (QLIST_EMPTY(&vfio_group_list)) {
- qemu_unregister_reset(vfio_reset_handler, NULL);
- }
-}
-
static int vfio_populate_device(VFIODevice *vbasedev)
{
VFIOPCIDevice *vdev = container_of(vbasedev, VFIOPCIDevice, vbasedev);
@@ -3992,57 +3061,6 @@ error:
return ret;
}
-static int vfio_get_device(VFIOGroup *group, const char *name,
- VFIODevice *vbasedev)
-{
- struct vfio_device_info dev_info = { .argsz = sizeof(dev_info) };
- int ret;
-
- ret = ioctl(group->fd, VFIO_GROUP_GET_DEVICE_FD, name);
- if (ret < 0) {
- error_report("vfio: error getting device %s from group %d: %m",
- name, group->groupid);
- error_printf("Verify all devices in group %d are bound to vfio-<bus> "
- "or pci-stub and not already in use\n", group->groupid);
- return ret;
- }
-
- vbasedev->fd = ret;
- vbasedev->group = group;
- QLIST_INSERT_HEAD(&group->device_list, vbasedev, next);
-
- ret = ioctl(vbasedev->fd, VFIO_DEVICE_GET_INFO, &dev_info);
- if (ret) {
- error_report("vfio: error getting device info: %m");
- goto error;
- }
-
- vbasedev->num_irqs = dev_info.num_irqs;
- vbasedev->num_regions = dev_info.num_regions;
- vbasedev->flags = dev_info.flags;
-
- trace_vfio_get_device(name, dev_info.flags,
- dev_info.num_regions, dev_info.num_irqs);
-
- vbasedev->reset_works = !!(dev_info.flags & VFIO_DEVICE_FLAGS_RESET);
-
- ret = vbasedev->ops->vfio_populate_device(vbasedev);
-
-error:
- if (ret) {
- vfio_put_base_device(vbasedev);
- }
- return ret;
-}
-
-void vfio_put_base_device(VFIODevice *vbasedev)
-{
- QLIST_REMOVE(vbasedev, next);
- vbasedev->group = NULL;
- trace_vfio_put_base_device(vbasedev->fd);
- close(vbasedev->fd);
-}
-
static void vfio_put_device(VFIOPCIDevice *vdev)
{
g_free(vdev->vbasedev.name);
@@ -4426,47 +3444,3 @@ static void register_vfio_pci_dev_type(void)
}
type_init(register_vfio_pci_dev_type)
-
-static int vfio_container_do_ioctl(AddressSpace *as, int32_t groupid,
- int req, void *param)
-{
- VFIOGroup *group;
- VFIOContainer *container;
- int ret = -1;
-
- group = vfio_get_group(groupid, as);
- if (!group) {
- error_report("vfio: group %d not registered", groupid);
- return ret;
- }
-
- container = group->container;
- if (group->container) {
- ret = ioctl(container->fd, req, param);
- if (ret < 0) {
- error_report("vfio: failed to ioctl container: ret=%d, %s",
- ret, strerror(errno));
- }
- }
-
- vfio_put_group(group);
-
- return ret;
-}
-
-int vfio_container_ioctl(AddressSpace *as, int32_t groupid,
- int req, void *param)
-{
- /* We allow only certain ioctls to the container */
- switch (req) {
- case VFIO_CHECK_EXTENSION:
- case VFIO_IOMMU_SPAPR_TCE_GET_INFO:
- break;
- default:
- /* Return an error on unknown requests */
- error_report("vfio: unsupported ioctl %X", req);
- return -1;
- }
-
- return vfio_container_do_ioctl(as, groupid, req, param);
-}
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
new file mode 100644
index 0000000..1d5bfe8
--- /dev/null
+++ b/include/hw/vfio/vfio-common.h
@@ -0,0 +1,151 @@
+/*
+ * common header for vfio based device assignment support
+ *
+ * Copyright Red Hat, Inc. 2012
+ *
+ * Authors:
+ * Alex Williamson <alex.williamson@redhat.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2. See
+ * the COPYING file in the top-level directory.
+ *
+ * Based on qemu-kvm device-assignment:
+ * Adapted for KVM by Qumranet.
+ * Copyright (c) 2007, Neocleus, Alex Novik (alex@neocleus.com)
+ * Copyright (c) 2007, Neocleus, Guy Zana (guy@neocleus.com)
+ * Copyright (C) 2008, Qumranet, Amit Shah (amit.shah@qumranet.com)
+ * Copyright (C) 2008, Red Hat, Amit Shah (amit.shah@redhat.com)
+ * Copyright (C) 2008, IBM, Muli Ben-Yehuda (muli@il.ibm.com)
+ */
+#ifndef HW_VFIO_VFIO_COMMON_H
+#define HW_VFIO_VFIO_COMMON_H
+
+#include "qemu-common.h"
+#include "exec/address-spaces.h"
+#include "exec/memory.h"
+#include "qemu/queue.h"
+#include "qemu/notify.h"
+
+/*#define DEBUG_VFIO*/
+#ifdef DEBUG_VFIO
+#define DPRINTF(fmt, ...) \
+ do { fprintf(stderr, "vfio: " fmt, ## __VA_ARGS__); } while (0)
+#else
+#define DPRINTF(fmt, ...) \
+ do { } while (0)
+#endif
+
+/* Extra debugging, trap acceleration paths for more logging */
+#define VFIO_ALLOW_MMAP 1
+#define VFIO_ALLOW_KVM_INTX 1
+#define VFIO_ALLOW_KVM_MSI 1
+#define VFIO_ALLOW_KVM_MSIX 1
+
+enum {
+ VFIO_DEVICE_TYPE_PCI = 0,
+};
+
+typedef struct VFIORegion {
+ struct VFIODevice *vbasedev;
+ off_t fd_offset; /* offset of region within device fd */
+ MemoryRegion mem; /* slow, read/write access */
+ MemoryRegion mmap_mem; /* direct mapped access */
+ void *mmap;
+ size_t size;
+ uint32_t flags; /* VFIO region flags (rd/wr/mmap) */
+ uint8_t nr; /* cache the region number for debug */
+} VFIORegion;
+
+typedef struct VFIOAddressSpace {
+ AddressSpace *as;
+ QLIST_HEAD(, VFIOContainer) containers;
+ QLIST_ENTRY(VFIOAddressSpace) list;
+} VFIOAddressSpace;
+
+struct VFIOGroup;
+
+typedef struct VFIOType1 {
+ MemoryListener listener;
+ int error;
+ bool initialized;
+} VFIOType1;
+
+typedef struct VFIOContainer {
+ VFIOAddressSpace *space;
+ int fd; /* /dev/vfio/vfio, empowered by the attached groups */
+ struct {
+ /* enable abstraction to support various iommu backends */
+ union {
+ VFIOType1 type1;
+ };
+ void (*release)(struct VFIOContainer *);
+ } iommu_data;
+ QLIST_HEAD(, VFIOGuestIOMMU) giommu_list;
+ QLIST_HEAD(, VFIOGroup) group_list;
+ QLIST_ENTRY(VFIOContainer) next;
+} VFIOContainer;
+
+typedef struct VFIOGuestIOMMU {
+ VFIOContainer *container;
+ MemoryRegion *iommu;
+ Notifier n;
+ QLIST_ENTRY(VFIOGuestIOMMU) giommu_next;
+} VFIOGuestIOMMU;
+
+typedef struct VFIODeviceOps VFIODeviceOps;
+
+typedef struct VFIODevice {
+ QLIST_ENTRY(VFIODevice) next;
+ struct VFIOGroup *group;
+ char *name;
+ int fd;
+ int type;
+ bool reset_works;
+ bool needs_reset;
+ VFIODeviceOps *ops;
+ unsigned int num_irqs;
+ unsigned int num_regions;
+ unsigned int flags;
+} VFIODevice;
+
+struct VFIODeviceOps {
+ void (*vfio_compute_needs_reset)(VFIODevice *vdev);
+ int (*vfio_hot_reset_multi)(VFIODevice *vdev);
+ void (*vfio_eoi)(VFIODevice *vdev);
+ int (*vfio_populate_device)(VFIODevice *vdev);
+};
+
+typedef struct VFIOGroup {
+ int fd;
+ int groupid;
+ VFIOContainer *container;
+ QLIST_HEAD(, VFIODevice) device_list;
+ QLIST_ENTRY(VFIOGroup) next;
+ QLIST_ENTRY(VFIOGroup) container_next;
+} VFIOGroup;
+
+void vfio_put_base_device(VFIODevice *vbasedev);
+void vfio_disable_irqindex(VFIODevice *vbasedev, int index);
+void vfio_unmask_single_irqindex(VFIODevice *vbasedev, int index);
+void vfio_mask_single_irqindex(VFIODevice *vbasedev, int index);
+void vfio_region_write(void *opaque, hwaddr addr,
+ uint64_t data, unsigned size);
+uint64_t vfio_region_read(void *opaque,
+ hwaddr addr, unsigned size);
+void vfio_listener_release(VFIOContainer *container);
+int vfio_mmap_region(Object *vdev, VFIORegion *region,
+ MemoryRegion *mem, MemoryRegion *submem,
+ void **map, size_t size, off_t offset,
+ const char *name);
+void vfio_reset_handler(void *opaque);
+VFIOGroup *vfio_get_group(int groupid, AddressSpace *as);
+void vfio_put_group(VFIOGroup *group);
+int vfio_get_device(VFIOGroup *group, const char *name,
+ VFIODevice *vbasedev);
+
+extern const MemoryRegionOps vfio_region_ops;
+extern const MemoryListener vfio_memory_listener;
+extern QLIST_HEAD(vfio_group_head, VFIOGroup) vfio_group_list;
+extern QLIST_HEAD(vfio_as_head, VFIOAddressSpace) vfio_address_spaces;
+
+#endif /* !HW_VFIO_VFIO_COMMON_H */
diff --git a/trace-events b/trace-events
index 0e7aa53..8acbcce 100644
--- a/trace-events
+++ b/trace-events
@@ -1416,6 +1416,7 @@ vfio_pci_reset(const char *name) " (%s)"
vfio_pci_reset_flr(const char *name) "%s FLR/VFIO_DEVICE_RESET"
vfio_pci_reset_pm(const char *name) "%s PCI PM Reset"
+# hw/vfio/vfio-common.c
vfio_region_write(const char *name, int index, uint64_t addr, uint64_t data, unsigned size) " (%s:region%d+0x%"PRIx64", 0x%"PRIx64 ", %d)"
vfio_region_read(char *name, int index, uint64_t addr, unsigned size, uint64_t data) " (%s:region%d+0x%"PRIx64", %d) = 0x%"PRIx64
vfio_iommu_map_notify(uint64_t iova_start, uint64_t iova_end) "iommu map @ %"PRIx64" - %"PRIx64
--
1.8.3.2
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [Qemu-devel] [PATCH v8 12/19] hw/vfio/platform: add vfio-platform support
2014-11-30 18:35 [Qemu-devel] [PATCH v8 00/19] KVM platform device passthrough Eric Auger
` (10 preceding siblings ...)
2014-11-30 18:35 ` [Qemu-devel] [PATCH v8 11/19] hw/vfio: create common module Eric Auger
@ 2014-11-30 18:35 ` Eric Auger
2014-11-30 18:35 ` [Qemu-devel] [PATCH v8 13/19] hw/vfio: calxeda xgmac device Eric Auger
` (7 subsequent siblings)
19 siblings, 0 replies; 22+ messages in thread
From: Eric Auger @ 2014-11-30 18:35 UTC (permalink / raw)
To: eric.auger, christoffer.dall, qemu-devel, agraf, pbonzini,
kim.phillips, a.rigo, manish.jaggi, joel.schopp, zhaoshenglong,
ard.biesheuvel
Cc: peter.maydell, patches, Kim Phillips, eric.auger, will.deacon,
stuart.yoder, Bharat.Bhushan, alex.williamson, a.motakis, kvmarm
Minimal VFIO platform implementation supporting
- register space user mapping,
- IRQ assignment based on eventfds handled on qemu side.
irqfd kernel acceleration comes in a subsequent patch.
Signed-off-by: Kim Phillips <kim.phillips@linaro.org>
Signed-off-by: Eric Auger <eric.auger@linaro.org>
---
v7 -> v8:
- change proto of vfio_platform_compute_needs_reset and sets
vbasedev->needs_reset to false there
- vfio_[un]mask_irqindex renamed into vfio_[un]mask_single_irqindex
- vfio_register_irq_starter renamed into vfio_kick_irqs
we now use a reset notifier instead of a machine init done notifier.
Enables to get rid of the VfioIrqStarterNotifierParams dangling
pointer. Previously we use pbus first_irq. This is no more possible
since the reset notifier takes a void * and first_irq is a field of
a const struct. So now we pass the DeviceState handle of the
interrupt controller. I tried to keep the code generic, reason why
I did not rely on an architecture specific accessor to retrieve
the gsi number (gic accessor as proposed by Alex). I would like to
avoid creating an ARM VFIO device model. I hope this model
model can work on other archs than arm (no multiple intc?);
wouldn't it be simpler to keep the previous first_irq parameter and
relax the const constraint.
v6 -> v7:
- compat is not exposed anymore as a user option. Rationale is
the vfio device became abstract and a specialization is needed
anyway. The derived device must set the compat string.
- in v6 vfio_start_irq_injection was exposed in vfio-platform.h.
A new function dubbed vfio_register_irq_starter replaces it. It
registers a machine init done notifier that programs & starts
all dynamic VFIO device IRQs. This function is supposed to be
called by the machine file. A set of static helper routines are
added too. It must be called before the creation of the platform
bus device.
v5 -> v6:
- vfio_device property renamed into host property
- correct error handling of VFIO_DEVICE_GET_IRQ_INFO ioctl
and remove PCI related comment
- remove declaration of vfio_setup_irqfd and irqfd_allowed
property.Both belong to next patch (irqfd)
- remove declaration of vfio_intp_interrupt in vfio-platform.h
- functions that can be static get this characteristic
- remove declarations of vfio_region_ops, vfio_memory_listener,
group_list, vfio_address_spaces. All are moved to vfio-common.h
- remove vfio_put_device declaration and definition
- print_regions removed. code moved into vfio_populate_regions
- replace DPRINTF by trace events
- new helper routine to set the trigger eventfd
- dissociate intp init from the injection enablement:
vfio_enable_intp renamed into vfio_init_intp and new function
named vfio_start_eventfd_injection
- injection start moved to vfio_start_irq_injection (not anymore
in vfio_populate_interrupt)
- new start_irq_fn field in VFIOPlatformDevice corresponding to
the function that will be used for starting injection
- user handled eventfd:
x add mutex to protect IRQ state & list manipulation,
x correct misleading comment in vfio_intp_interrupt.
x Fix bugs thanks to fake interrupt modality
- VFIOPlatformDeviceClass becomes abstract
- add error_setg in vfio_platform_realize
v4 -> v5:
- vfio-plaform.h included first
- cleanup error handling in *populate*, vfio_get_device,
vfio_enable_intp
- vfio_put_device not called anymore
- add some includes to follow vfio policy
v3 -> v4:
[Eric Auger]
- merge of "vfio: Add initial IRQ support in platform device"
to get a full functional patch although perfs are limited.
- removal of unrealize function since I currently understand
it is only used with device hot-plug feature.
v2 -> v3:
[Eric Auger]
- further factorization between PCI and platform (VFIORegion,
VFIODevice). same level of functionality.
<= v2:
[Kim Philipps]
- Initial Creation of the device supporting register space mapping
---
hw/vfio/Makefile.objs | 1 +
hw/vfio/platform.c | 629 ++++++++++++++++++++++++++++++++++++++++
include/hw/vfio/vfio-common.h | 1 +
include/hw/vfio/vfio-platform.h | 85 ++++++
trace-events | 12 +
5 files changed, 728 insertions(+)
create mode 100644 hw/vfio/platform.c
create mode 100644 include/hw/vfio/vfio-platform.h
diff --git a/hw/vfio/Makefile.objs b/hw/vfio/Makefile.objs
index e31f30e..c5c76fe 100644
--- a/hw/vfio/Makefile.objs
+++ b/hw/vfio/Makefile.objs
@@ -1,4 +1,5 @@
ifeq ($(CONFIG_LINUX), y)
obj-$(CONFIG_SOFTMMU) += common.o
obj-$(CONFIG_PCI) += pci.o
+obj-$(CONFIG_SOFTMMU) += platform.o
endif
diff --git a/hw/vfio/platform.c b/hw/vfio/platform.c
new file mode 100644
index 0000000..41f8693
--- /dev/null
+++ b/hw/vfio/platform.c
@@ -0,0 +1,629 @@
+/*
+ * vfio based device assignment support - platform devices
+ *
+ * Copyright Linaro Limited, 2014
+ *
+ * Authors:
+ * Kim Phillips <kim.phillips@linaro.org>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2. See
+ * the COPYING file in the top-level directory.
+ *
+ * Based on vfio based PCI device assignment support:
+ * Copyright Red Hat, Inc. 2012
+ */
+
+#include <linux/vfio.h>
+#include <sys/ioctl.h>
+
+#include "hw/vfio/vfio-platform.h"
+#include "qemu/error-report.h"
+#include "qemu/range.h"
+#include "sysemu/sysemu.h"
+#include "exec/memory.h"
+#include "qemu/queue.h"
+#include "hw/sysbus.h"
+#include "trace.h"
+#include "hw/platform-bus.h"
+
+static void vfio_intp_interrupt(VFIOINTp *intp);
+typedef void (*eventfd_user_side_handler_t)(VFIOINTp *intp);
+static int vfio_set_trigger_eventfd(VFIOINTp *intp,
+ eventfd_user_side_handler_t handler);
+
+/*
+ * Functions only used when eventfd are handled on user-side
+ * ie. without irqfd
+ */
+
+/**
+ * vfio_platform_eoi - IRQ completion routine
+ * @vbasedev: the VFIO device
+ *
+ * de-asserts the active virtual IRQ and unmask the physical IRQ
+ * (masked by the VFIO driver). Handle pending IRQs if any.
+ * eoi function is called on the first access to any MMIO region
+ * after an IRQ was triggered. It is assumed this access corresponds
+ * to the IRQ status register reset. With such a mechanism, a single
+ * IRQ can be handled at a time since there is no way to know which
+ * IRQ was completed by the guest (we would need additional details
+ * about the IRQ status register mask)
+ */
+static void vfio_platform_eoi(VFIODevice *vbasedev)
+{
+ VFIOINTp *intp;
+ VFIOPlatformDevice *vdev =
+ container_of(vbasedev, VFIOPlatformDevice, vbasedev);
+
+ qemu_mutex_lock(&vdev->intp_mutex);
+ QLIST_FOREACH(intp, &vdev->intp_list, next) {
+ if (intp->state == VFIO_IRQ_ACTIVE) {
+ trace_vfio_platform_eoi(intp->pin,
+ event_notifier_get_fd(&intp->interrupt));
+ intp->state = VFIO_IRQ_INACTIVE;
+
+ /* deassert the virtual IRQ and unmask physical one */
+ qemu_set_irq(intp->qemuirq, 0);
+ vfio_unmask_single_irqindex(vbasedev, intp->pin);
+
+ /* a single IRQ can be active at a time */
+ break;
+ }
+ }
+ /* in case there are pending IRQs, handle them one at a time */
+ if (!QSIMPLEQ_EMPTY(&vdev->pending_intp_queue)) {
+ intp = QSIMPLEQ_FIRST(&vdev->pending_intp_queue);
+ trace_vfio_platform_eoi_handle_pending(intp->pin);
+ qemu_mutex_unlock(&vdev->intp_mutex);
+ vfio_intp_interrupt(intp);
+ qemu_mutex_lock(&vdev->intp_mutex);
+ QSIMPLEQ_REMOVE_HEAD(&vdev->pending_intp_queue, pqnext);
+ qemu_mutex_unlock(&vdev->intp_mutex);
+ } else {
+ qemu_mutex_unlock(&vdev->intp_mutex);
+ }
+}
+
+/**
+ * vfio_mmap_set_enabled - enable/disable the fast path mode
+ * @vdev: the VFIO platform device
+ * @enabled: the target mmap state
+ *
+ * true ~ fast path = MMIO region is mmaped (no KVM TRAP)
+ * false ~ slow path = MMIO region is trapped and region callbacks
+ * are called slow path enables to trap the IRQ status register
+ * guest reset
+*/
+
+static void vfio_mmap_set_enabled(VFIOPlatformDevice *vdev, bool enabled)
+{
+ VFIORegion *region;
+ int i;
+
+ trace_vfio_platform_mmap_set_enabled(enabled);
+
+ for (i = 0; i < vdev->vbasedev.num_regions; i++) {
+ region = vdev->regions[i];
+
+ /* register space is unmapped to trap EOI */
+ memory_region_set_enabled(®ion->mmap_mem, enabled);
+ }
+}
+
+/**
+ * vfio_intp_mmap_enable - timer function, restores the fast path
+ * if there is no more active IRQ
+ * @opaque: actually points to the VFIO platform device
+ *
+ * Called on mmap timer timout, this function checks whether the
+ * IRQ is still active and in the negative restores the fast path.
+ * by construction a single eventfd is handled at a time.
+ * if the IRQ is still active, the timer is restarted.
+ */
+static void vfio_intp_mmap_enable(void *opaque)
+{
+ VFIOINTp *tmp;
+ VFIOPlatformDevice *vdev = (VFIOPlatformDevice *)opaque;
+
+ qemu_mutex_lock(&vdev->intp_mutex);
+ QLIST_FOREACH(tmp, &vdev->intp_list, next) {
+ if (tmp->state == VFIO_IRQ_ACTIVE) {
+ trace_vfio_platform_intp_mmap_enable(tmp->pin);
+ /* re-program the timer to check active status later */
+ timer_mod(vdev->mmap_timer,
+ qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) +
+ vdev->mmap_timeout);
+ qemu_mutex_unlock(&vdev->intp_mutex);
+ return;
+ }
+ }
+ vfio_mmap_set_enabled(vdev, true);
+ qemu_mutex_unlock(&vdev->intp_mutex);
+}
+
+/**
+ * vfio_intp_interrupt - The user-side eventfd handler
+ * @opaque: opaque pointer which in practice is the VFIOINTp*
+ *
+ * the function can be entered
+ * - in event handler context: this IRQ is inactive
+ * in that case, the vIRQ is injected into the guest if there
+ * is no other active or pending IRQ.
+ * - in IOhandler context: this IRQ is pending.
+ * there is no ACTIVE IRQ
+ */
+static void vfio_intp_interrupt(VFIOINTp *intp)
+{
+ int ret;
+ VFIOINTp *tmp;
+ VFIOPlatformDevice *vdev = intp->vdev;
+ bool delay_handling = false;
+
+ qemu_mutex_lock(&vdev->intp_mutex);
+ if (intp->state == VFIO_IRQ_INACTIVE) {
+ QLIST_FOREACH(tmp, &vdev->intp_list, next) {
+ if (tmp->state == VFIO_IRQ_ACTIVE ||
+ tmp->state == VFIO_IRQ_PENDING) {
+ delay_handling = true;
+ break;
+ }
+ }
+ }
+ if (delay_handling) {
+ /*
+ * the new IRQ gets a pending status and is pushed in
+ * the pending queue
+ */
+ intp->state = VFIO_IRQ_PENDING;
+ trace_vfio_intp_interrupt_set_pending(intp->pin);
+ QSIMPLEQ_INSERT_TAIL(&vdev->pending_intp_queue,
+ intp, pqnext);
+ ret = event_notifier_test_and_clear(&intp->interrupt);
+ qemu_mutex_unlock(&vdev->intp_mutex);
+ return;
+ }
+
+ /* no active IRQ, the new IRQ can be forwarded to the guest */
+ trace_vfio_platform_intp_interrupt(intp->pin,
+ event_notifier_get_fd(&intp->interrupt));
+
+ if (intp->state == VFIO_IRQ_INACTIVE) {
+ ret = event_notifier_test_and_clear(&intp->interrupt);
+ if (!ret) {
+ error_report("Error when clearing fd=%d (ret = %d)\n",
+ event_notifier_get_fd(&intp->interrupt), ret);
+ }
+ } /* else this is a pending IRQ that moves to ACTIVE state */
+
+ intp->state = VFIO_IRQ_ACTIVE;
+
+ /* sets slow path */
+ vfio_mmap_set_enabled(vdev, false);
+
+ /* trigger the virtual IRQ */
+ qemu_set_irq(intp->qemuirq, 1);
+
+ /* schedule the mmap timer which will restore mmap path after EOI*/
+ if (vdev->mmap_timeout) {
+ timer_mod(vdev->mmap_timer,
+ qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) +
+ vdev->mmap_timeout);
+ }
+ qemu_mutex_unlock(&vdev->intp_mutex);
+}
+
+/**
+ * vfio_start_eventfd_injection - starts the virtual IRQ injection using
+ * user-side handled eventfds
+ * @intp: the IRQ struct pointer
+ */
+
+static int vfio_start_eventfd_injection(VFIOINTp *intp)
+{
+ int ret;
+ VFIODevice *vbasedev = &intp->vdev->vbasedev;
+
+ vfio_mask_single_irqindex(vbasedev, intp->pin);
+
+ ret = vfio_set_trigger_eventfd(intp, vfio_intp_interrupt);
+ if (ret) {
+ error_report("vfio: Error: Failed to pass IRQ fd to the driver: %m");
+ vfio_unmask_single_irqindex(vbasedev, intp->pin);
+ return ret;
+ }
+ vfio_unmask_single_irqindex(vbasedev, intp->pin);
+ return 0;
+}
+
+/*
+ * Functions used whatever the injection method
+ */
+
+/**
+ * vfio_set_trigger_eventfd - set VFIO eventfd handling
+ * ie. program the VFIO driver to associates a given IRQ index
+ * with a fd handler
+ *
+ * @intp: IRQ struct pointer
+ * @handler: handler to be called on eventfd trigger
+ */
+static int vfio_set_trigger_eventfd(VFIOINTp *intp,
+ eventfd_user_side_handler_t handler)
+{
+ VFIODevice *vbasedev = &intp->vdev->vbasedev;
+ struct vfio_irq_set *irq_set;
+ int argsz, ret;
+ int32_t *pfd;
+
+ argsz = sizeof(*irq_set) + sizeof(*pfd);
+ irq_set = g_malloc0(argsz);
+ irq_set->argsz = argsz;
+ irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD | VFIO_IRQ_SET_ACTION_TRIGGER;
+ irq_set->index = intp->pin;
+ irq_set->start = 0;
+ irq_set->count = 1;
+ pfd = (int32_t *)&irq_set->data;
+ *pfd = event_notifier_get_fd(&intp->interrupt);
+ qemu_set_fd_handler(*pfd, (IOHandler *)handler, NULL, intp);
+ ret = ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, irq_set);
+ g_free(irq_set);
+ if (ret < 0) {
+ error_report("vfio: Failed to set trigger eventfd: %m");
+ qemu_set_fd_handler(*pfd, NULL, NULL, NULL);
+ }
+ return ret;
+}
+
+/* not implemented yet */
+static void vfio_platform_compute_needs_reset(VFIODevice *vbasedev)
+{
+ vbasedev->needs_reset = false;
+}
+
+/* not implemented yet */
+static int vfio_platform_hot_reset_multi(VFIODevice *vbasedev)
+{
+ return 0;
+}
+
+/**
+ * vfio_init_intp - allocate, initialize the IRQ struct pointer
+ * and add it into the list of IRQ
+ * @vbasedev: the VFIO device
+ * @index: VFIO device IRQ index
+ */
+static VFIOINTp *vfio_init_intp(VFIODevice *vbasedev, unsigned int index)
+{
+ int ret;
+ VFIOPlatformDevice *vdev =
+ container_of(vbasedev, VFIOPlatformDevice, vbasedev);
+ SysBusDevice *sbdev = SYS_BUS_DEVICE(vdev);
+ VFIOINTp *intp;
+
+ /* allocate and populate a new VFIOINTp structure put in a queue list */
+ intp = g_malloc0(sizeof(*intp));
+ intp->vdev = vdev;
+ intp->pin = index;
+ intp->state = VFIO_IRQ_INACTIVE;
+ sysbus_init_irq(sbdev, &intp->qemuirq);
+
+ /* Get an eventfd for trigger */
+ ret = event_notifier_init(&intp->interrupt, 0);
+ if (ret) {
+ g_free(intp);
+ error_report("vfio: Error: trigger event_notifier_init failed ");
+ return NULL;
+ }
+
+ /* store the new intp in qlist */
+ QLIST_INSERT_HEAD(&vdev->intp_list, intp, next);
+ return intp;
+}
+
+/**
+ * vfio_populate_device - initialize MMIO region and IRQ
+ * @vbasedev: the VFIO device
+ *
+ * query the VFIO device for exposed MMIO regions and IRQ and
+ * populate the associated fields in the device struct
+ */
+static int vfio_populate_device(VFIODevice *vbasedev)
+{
+ struct vfio_irq_info irq = { .argsz = sizeof(irq) };
+ struct vfio_region_info reg_info = { .argsz = sizeof(reg_info) };
+ VFIOINTp *intp;
+ int i, ret = 0;
+ VFIOPlatformDevice *vdev =
+ container_of(vbasedev, VFIOPlatformDevice, vbasedev);
+
+ vdev->regions = g_malloc0(sizeof(VFIORegion *) * vbasedev->num_regions);
+
+ for (i = 0; i < vbasedev->num_regions; i++) {
+ vdev->regions[i] = g_malloc0(sizeof(VFIORegion));
+ reg_info.index = i;
+ ret = ioctl(vbasedev->fd, VFIO_DEVICE_GET_REGION_INFO, ®_info);
+ if (ret) {
+ error_report("vfio: Error getting region %d info: %m", i);
+ goto error;
+ }
+ vdev->regions[i]->flags = reg_info.flags;
+ vdev->regions[i]->size = reg_info.size;
+ vdev->regions[i]->fd_offset = reg_info.offset;
+ vdev->regions[i]->nr = i;
+ vdev->regions[i]->vbasedev = vbasedev;
+
+ trace_vfio_platform_populate_regions(vdev->regions[i]->nr,
+ (unsigned long)vdev->regions[i]->flags,
+ (unsigned long)vdev->regions[i]->size,
+ vdev->regions[i]->vbasedev->fd,
+ (unsigned long)vdev->regions[i]->fd_offset);
+ }
+
+ vdev->mmap_timer = timer_new_ms(QEMU_CLOCK_VIRTUAL,
+ vfio_intp_mmap_enable, vdev);
+
+ QSIMPLEQ_INIT(&vdev->pending_intp_queue);
+
+ for (i = 0; i < vbasedev->num_irqs; i++) {
+ irq.index = i;
+
+ ret = ioctl(vbasedev->fd, VFIO_DEVICE_GET_IRQ_INFO, &irq);
+ if (ret) {
+ error_printf("vfio: error getting device %s irq info",
+ vbasedev->name);
+ return ret;
+ } else {
+ trace_vfio_platform_populate_interrupts(irq.index,
+ irq.count,
+ irq.flags);
+ intp = vfio_init_intp(vbasedev, irq.index);
+ if (!intp) {
+ error_report("vfio: Error installing IRQ %d up", i);
+ return ret;
+ }
+ }
+ }
+ return 0;
+error:
+ return ret;
+}
+
+/* specialized functions ofr VFIO Platform devices */
+static VFIODeviceOps vfio_platform_ops = {
+ .vfio_compute_needs_reset = vfio_platform_compute_needs_reset,
+ .vfio_hot_reset_multi = vfio_platform_hot_reset_multi,
+ .vfio_eoi = vfio_platform_eoi,
+ .vfio_populate_device = vfio_populate_device,
+};
+
+/**
+ * vfio_base_device_init - implements some of the VFIO mechanics
+ * @vbasedev: the VFIO device
+ *
+ * retrieves the group the device belongs to and get the device fd
+ * returns the VFIO device fd
+ * precondition: the device name must be initialized
+ */
+static int vfio_base_device_init(VFIODevice *vbasedev)
+{
+ VFIOGroup *group;
+ VFIODevice *vbasedev_iter;
+ char path[PATH_MAX], iommu_group_path[PATH_MAX], *group_name;
+ ssize_t len;
+ struct stat st;
+ int groupid;
+ int ret;
+
+ /* name must be set prior to the call */
+ if (!vbasedev->name) {
+ return -EINVAL;
+ }
+
+ /* Check that the host device exists */
+ snprintf(path, sizeof(path), "/sys/bus/platform/devices/%s/",
+ vbasedev->name);
+
+ if (stat(path, &st) < 0) {
+ error_report("vfio: error: no such host device: %s", path);
+ return -errno;
+ }
+
+ strncat(path, "iommu_group", sizeof(path) - strlen(path) - 1);
+ len = readlink(path, iommu_group_path, sizeof(path));
+ if (len <= 0 || len >= sizeof(path)) {
+ error_report("vfio: error no iommu_group for device");
+ return len < 0 ? -errno : ENAMETOOLONG;
+ }
+
+ iommu_group_path[len] = 0;
+ group_name = basename(iommu_group_path);
+
+ if (sscanf(group_name, "%d", &groupid) != 1) {
+ error_report("vfio: error reading %s: %m", path);
+ return -errno;
+ }
+
+ trace_vfio_platform_base_device_init(vbasedev->name, groupid);
+
+ group = vfio_get_group(groupid, &address_space_memory);
+ if (!group) {
+ error_report("vfio: failed to get group %d", groupid);
+ return -ENOENT;
+ }
+
+ snprintf(path, sizeof(path), "%s", vbasedev->name);
+
+ QLIST_FOREACH(vbasedev_iter, &group->device_list, next) {
+ if (strcmp(vbasedev_iter->name, vbasedev->name) == 0) {
+ error_report("vfio: error: device %s is already attached", path);
+ vfio_put_group(group);
+ return -EBUSY;
+ }
+ }
+ ret = vfio_get_device(group, path, vbasedev);
+ if (ret) {
+ error_report("vfio: failed to get device %s", path);
+ vfio_put_group(group);
+ }
+ return ret;
+}
+
+/**
+ * vfio_map_region - initialize the 2 mr (mmapped on ops) for a
+ * given index
+ * @vdev: the VFIO platform device
+ * @nr: the index of the region
+ *
+ * init the top memory region and the mmapped memroy region beneath
+ * VFIOPlatformDevice is used since VFIODevice is not a QOM Object
+ * and could not be passed to memory region functions
+*/
+static void vfio_map_region(VFIOPlatformDevice *vdev, int nr)
+{
+ VFIORegion *region = vdev->regions[nr];
+ unsigned size = region->size;
+ char name[64];
+
+ if (!size) {
+ return;
+ }
+
+ snprintf(name, sizeof(name), "VFIO %s region %d",
+ vdev->vbasedev.name, nr);
+
+ /* A "slow" read/write mapping underlies all regions */
+ memory_region_init_io(®ion->mem, OBJECT(vdev), &vfio_region_ops,
+ region, name, size);
+
+ strncat(name, " mmap", sizeof(name) - strlen(name) - 1);
+
+ if (vfio_mmap_region(OBJECT(vdev), region, ®ion->mem,
+ ®ion->mmap_mem, ®ion->mmap, size, 0, name)) {
+ error_report("%s unsupported. Performance may be slow", name);
+ }
+}
+
+/**
+ * vfio_platform_realize - the device realize function
+ * @dev: device state pointer
+ * @errp: error
+ *
+ * initialize the device, its memory regions and IRQ structures
+ * IRQ are started separately
+ */
+static void vfio_platform_realize(DeviceState *dev, Error **errp)
+{
+ VFIOPlatformDevice *vdev = VFIO_PLATFORM_DEVICE(dev);
+ SysBusDevice *sbdev = SYS_BUS_DEVICE(dev);
+ VFIODevice *vbasedev = &vdev->vbasedev;
+ int i, ret;
+
+ vbasedev->type = VFIO_DEVICE_TYPE_PLATFORM;
+ vbasedev->ops = &vfio_platform_ops;
+ vdev->start_irq_fn = vfio_start_eventfd_injection;
+
+ trace_vfio_platform_realize(vbasedev->name, vdev->compat);
+
+ ret = vfio_base_device_init(vbasedev);
+ if (ret) {
+ error_setg(errp, "vfio: vfio_base_device_init failed for %s",
+ vbasedev->name);
+ return;
+ }
+
+ for (i = 0; i < vbasedev->num_regions; i++) {
+ vfio_map_region(vdev, i);
+ sysbus_init_mmio(sbdev, &vdev->regions[i]->mem);
+ }
+}
+
+/*
+ * Mechanics to program/start irq injection on machine init done notifier:
+ * this is needed since at finalize time, the device IRQ are not yet
+ * bound to the platform bus IRQ. It is assumed here dynamic instantiation
+ * always is used. Binding to the platform bus IRQ happens on a machine
+ * init done notifier registered by the platform bus. Only at that time the
+ * absolute virtual IRQ = GSI is known and allows to setup IRQFD.
+ * as a consequence, we configure the IRQ routing in a reset notifier
+ * registered by the machine file through vfio_kick_irqs.
+ */
+
+/* Start injection of IRQ for a specific VFIO device */
+static int vfio_irq_starter(SysBusDevice *sbdev, void *opaque)
+{
+ DeviceState *intc = (DeviceState *)opaque;
+ VFIOPlatformDevice *vdev;
+ VFIOINTp *intp;
+ qemu_irq irq;
+ int gsi;
+
+ if (object_dynamic_cast(OBJECT(sbdev), TYPE_VFIO_PLATFORM)) {
+ vdev = VFIO_PLATFORM_DEVICE(sbdev);
+
+ QLIST_FOREACH(intp, &vdev->intp_list, next) {
+ gsi = 0;
+ while (1) {
+ irq = qdev_get_gpio_in(intc, gsi);
+ if (irq == intp->qemuirq) {
+ break;
+ }
+ gsi++;
+ }
+ intp->virtualID = gsi;
+ vdev->start_irq_fn(intp);
+ }
+ }
+ return 0;
+}
+
+/* loop on all VFIO platform devices and start their IRQ injection */
+void vfio_kick_irqs(void *data)
+{
+ DeviceState *intc = (DeviceState *)data;
+ DeviceState *dev =
+ qdev_find_recursive(sysbus_get_default(), TYPE_PLATFORM_BUS_DEVICE);
+ PlatformBusDevice *pbus = PLATFORM_BUS_DEVICE(dev);
+
+ if (pbus->done_gathering) {
+ foreach_dynamic_sysbus_device(vfio_irq_starter, intc);
+ }
+}
+
+static const VMStateDescription vfio_platform_vmstate = {
+ .name = TYPE_VFIO_PLATFORM,
+ .unmigratable = 1,
+};
+
+static Property vfio_platform_dev_properties[] = {
+ DEFINE_PROP_STRING("host", VFIOPlatformDevice, vbasedev.name),
+ DEFINE_PROP_UINT32("mmap-timeout-ms", VFIOPlatformDevice,
+ mmap_timeout, 1100),
+ DEFINE_PROP_END_OF_LIST(),
+};
+
+static void vfio_platform_class_init(ObjectClass *klass, void *data)
+{
+ DeviceClass *dc = DEVICE_CLASS(klass);
+
+ dc->realize = vfio_platform_realize;
+ dc->props = vfio_platform_dev_properties;
+ dc->vmsd = &vfio_platform_vmstate;
+ dc->desc = "VFIO-based platform device assignment";
+ set_bit(DEVICE_CATEGORY_MISC, dc->categories);
+}
+
+static const TypeInfo vfio_platform_dev_info = {
+ .name = TYPE_VFIO_PLATFORM,
+ .parent = TYPE_SYS_BUS_DEVICE,
+ .instance_size = sizeof(VFIOPlatformDevice),
+ .class_init = vfio_platform_class_init,
+ .class_size = sizeof(VFIOPlatformDeviceClass),
+ .abstract = true,
+};
+
+static void register_vfio_platform_dev_type(void)
+{
+ type_register_static(&vfio_platform_dev_info);
+}
+
+type_init(register_vfio_platform_dev_type)
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 1d5bfe8..b5af090 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -43,6 +43,7 @@
enum {
VFIO_DEVICE_TYPE_PCI = 0,
+ VFIO_DEVICE_TYPE_PLATFORM = 1,
};
typedef struct VFIORegion {
diff --git a/include/hw/vfio/vfio-platform.h b/include/hw/vfio/vfio-platform.h
new file mode 100644
index 0000000..449c049
--- /dev/null
+++ b/include/hw/vfio/vfio-platform.h
@@ -0,0 +1,85 @@
+/*
+ * vfio based device assignment support - platform devices
+ *
+ * Copyright Linaro Limited, 2014
+ *
+ * Authors:
+ * Kim Phillips <kim.phillips@linaro.org>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2. See
+ * the COPYING file in the top-level directory.
+ *
+ * Based on vfio based PCI device assignment support:
+ * Copyright Red Hat, Inc. 2012
+ */
+
+#ifndef HW_VFIO_VFIO_PLATFORM_H
+#define HW_VFIO_VFIO_PLATFORM_H
+
+#include "hw/sysbus.h"
+#include "hw/vfio/vfio-common.h"
+#include "qemu/event_notifier.h"
+#include "qemu/queue.h"
+#include "hw/irq.h"
+
+#define TYPE_VFIO_PLATFORM "vfio-platform"
+
+enum {
+ VFIO_IRQ_INACTIVE = 0,
+ VFIO_IRQ_PENDING = 1,
+ VFIO_IRQ_ACTIVE = 2,
+ /* VFIO_IRQ_ACTIVE_AND_PENDING cannot happen with VFIO */
+};
+
+typedef struct VFIOINTp {
+ QLIST_ENTRY(VFIOINTp) next; /* entry for IRQ list */
+ QSIMPLEQ_ENTRY(VFIOINTp) pqnext; /* entry for pending IRQ queue */
+ EventNotifier interrupt; /* eventfd triggered on interrupt */
+ EventNotifier unmask; /* eventfd for unmask on QEMU bypass */
+ qemu_irq qemuirq;
+ struct VFIOPlatformDevice *vdev; /* back pointer to device */
+ int state; /* inactive, pending, active */
+ bool kvm_accel; /* set when QEMU bypass through KVM enabled */
+ uint8_t pin; /* index */
+ uint8_t virtualID; /* virtual IRQ */
+} VFIOINTp;
+
+typedef int (*start_irq_fn_t)(VFIOINTp *intp);
+
+typedef struct VFIOPlatformDevice {
+ SysBusDevice sbdev;
+ VFIODevice vbasedev; /* not a QOM object */
+ VFIORegion **regions;
+ QLIST_HEAD(, VFIOINTp) intp_list; /* list of IRQ */
+ /* queue of pending IRQ */
+ QSIMPLEQ_HEAD(pending_intp_queue, VFIOINTp) pending_intp_queue;
+ char *compat; /* compatibility string */
+ uint32_t mmap_timeout; /* delay to re-enable mmaps after interrupt */
+ QEMUTimer *mmap_timer; /* enable mmaps after periods w/o interrupts */
+ start_irq_fn_t start_irq_fn;
+ QemuMutex intp_mutex;
+} VFIOPlatformDevice;
+
+
+typedef struct VFIOPlatformDeviceClass {
+ /*< private >*/
+ SysBusDeviceClass parent_class;
+ /*< public >*/
+} VFIOPlatformDeviceClass;
+
+#define VFIO_PLATFORM_DEVICE(obj) \
+ OBJECT_CHECK(VFIOPlatformDevice, (obj), TYPE_VFIO_PLATFORM)
+#define VFIO_PLATFORM_DEVICE_CLASS(klass) \
+ OBJECT_CLASS_CHECK(VFIOPlatformDeviceClass, (klass), TYPE_VFIO_PLATFORM)
+#define VFIO_PLATFORM_DEVICE_GET_CLASS(obj) \
+ OBJECT_GET_CLASS(VFIOPlatformDeviceClass, (obj), TYPE_VFIO_PLATFORM)
+
+/**
+ * vfio_kick_irqs - registers a reset notifier that starts IRQ injection
+ * for all VFIO dynamic sysbus devices attached to the platform bus.
+ *
+ * @opaque: handle to the interrupt controller DeviceState*
+ */
+void vfio_kick_irqs(void *opaque);
+
+#endif /*HW_VFIO_VFIO_PLATFORM_H*/
diff --git a/trace-events b/trace-events
index 8acbcce..7a9e780 100644
--- a/trace-events
+++ b/trace-events
@@ -1430,6 +1430,18 @@ vfio_put_group(int fd) "close group->fd=%d"
vfio_get_device(const char * name, unsigned int flags, unsigned int num_regions, unsigned int num_irqs) "Device %s flags: %u, regions: %u, irqs: %u"
vfio_put_base_device(int fd) "close vdev->fd=%d"
+# hw/vfio/platform.c
+vfio_platform_eoi(int pin, int fd) "EOI IRQ pin %d (fd=%d)"
+vfio_platform_mmap_set_enabled(bool enabled) "fast path = %d"
+vfio_platform_intp_mmap_enable(int pin) "IRQ #%d still active, stay in slow path"
+vfio_platform_intp_interrupt(int pin, int fd) "Handle IRQ #%d (fd = %d)"
+vfio_platform_populate_interrupts(int pin, int count, int flags) "- IRQ index %d: count %d, flags=0x%x"
+vfio_platform_populate_regions(int region_index, unsigned long flag, unsigned long size, int fd, unsigned long offset) "- region %d flags = 0x%lx, size = 0x%lx, fd= %d, offset = 0x%lx"
+vfio_platform_base_device_init(char *name, int groupid) "%s belongs to group #%d"
+vfio_platform_realize(char *name, char *compat) "vfio device %s, compat = %s"
+vfio_intp_interrupt_set_pending(int index) "irq %d is set PENDING"
+vfio_platform_eoi_handle_pending(int index) "handle PENDING IRQ %d"
+
#hw/acpi/memory_hotplug.c
mhp_acpi_invalid_slot_selected(uint32_t slot) "0x%"PRIx32
mhp_acpi_read_addr_lo(uint32_t slot, uint32_t addr) "slot[0x%"PRIx32"] addr lo: 0x%"PRIx32
--
1.8.3.2
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [Qemu-devel] [PATCH v8 13/19] hw/vfio: calxeda xgmac device
2014-11-30 18:35 [Qemu-devel] [PATCH v8 00/19] KVM platform device passthrough Eric Auger
` (11 preceding siblings ...)
2014-11-30 18:35 ` [Qemu-devel] [PATCH v8 12/19] hw/vfio/platform: add vfio-platform support Eric Auger
@ 2014-11-30 18:35 ` Eric Auger
2014-11-30 18:35 ` [Qemu-devel] [PATCH v8 14/19] hw/arm/virt: add support for VFIO devices Eric Auger
` (6 subsequent siblings)
19 siblings, 0 replies; 22+ messages in thread
From: Eric Auger @ 2014-11-30 18:35 UTC (permalink / raw)
To: eric.auger, christoffer.dall, qemu-devel, agraf, pbonzini,
kim.phillips, a.rigo, manish.jaggi, joel.schopp, zhaoshenglong,
ard.biesheuvel
Cc: peter.maydell, patches, eric.auger, will.deacon, stuart.yoder,
Bharat.Bhushan, alex.williamson, a.motakis, kvmarm
The platform device class has become abstract. This patch introduces
a calxeda xgmac device that can be be instantiated on command line
using such option.
-device vfio-calxeda-xgmac,host="fff51000.ethernet"
Signed-off-by: Eric Auger <eric.auger@linaro.org>
---
v7 -> v8:
- add a comment in the header about the MMIO regions and IRQ which
are exposed by the device
v5 -> v6
- back again following Alex Graf advises
- fix a bug related to compat override
v4 -> v5:
removed since device tree was moved to hw/arm/dyn_sysbus_devtree.c
v4: creation for device tree specialization
---
hw/vfio/Makefile.objs | 1 +
hw/vfio/calxeda_xgmac.c | 54 ++++++++++++++++++++++++++++++++++++
include/hw/vfio/vfio-calxeda-xgmac.h | 46 ++++++++++++++++++++++++++++++
3 files changed, 101 insertions(+)
create mode 100644 hw/vfio/calxeda_xgmac.c
create mode 100644 include/hw/vfio/vfio-calxeda-xgmac.h
diff --git a/hw/vfio/Makefile.objs b/hw/vfio/Makefile.objs
index c5c76fe..913ab14 100644
--- a/hw/vfio/Makefile.objs
+++ b/hw/vfio/Makefile.objs
@@ -2,4 +2,5 @@ ifeq ($(CONFIG_LINUX), y)
obj-$(CONFIG_SOFTMMU) += common.o
obj-$(CONFIG_PCI) += pci.o
obj-$(CONFIG_SOFTMMU) += platform.o
+obj-$(CONFIG_SOFTMMU) += calxeda_xgmac.o
endif
diff --git a/hw/vfio/calxeda_xgmac.c b/hw/vfio/calxeda_xgmac.c
new file mode 100644
index 0000000..199e076
--- /dev/null
+++ b/hw/vfio/calxeda_xgmac.c
@@ -0,0 +1,54 @@
+/*
+ * calxeda xgmac example VFIO device
+ *
+ * Copyright Linaro Limited, 2014
+ *
+ * Authors:
+ * Eric Auger <eric.auger@linaro.org>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2. See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#include "hw/vfio/vfio-calxeda-xgmac.h"
+
+static void calxeda_xgmac_realize(DeviceState *dev, Error **errp)
+{
+ VFIOPlatformDevice *vdev = VFIO_PLATFORM_DEVICE(dev);
+ VFIOCalxedaXgmacDeviceClass *k = VFIO_CALXEDA_XGMAC_DEVICE_GET_CLASS(dev);
+
+ vdev->compat = g_strdup("calxeda,hb-xgmac");
+
+ k->parent_realize(dev, errp);
+}
+
+static const VMStateDescription vfio_platform_vmstate = {
+ .name = TYPE_VFIO_CALXEDA_XGMAC,
+ .unmigratable = 1,
+};
+
+static void vfio_calxeda_xgmac_class_init(ObjectClass *klass, void *data)
+{
+ DeviceClass *dc = DEVICE_CLASS(klass);
+ VFIOCalxedaXgmacDeviceClass *vcxc =
+ VFIO_CALXEDA_XGMAC_DEVICE_CLASS(klass);
+ vcxc->parent_realize = dc->realize;
+ dc->realize = calxeda_xgmac_realize;
+ dc->desc = "VFIO Calxeda XGMAC";
+}
+
+static const TypeInfo vfio_calxeda_xgmac_dev_info = {
+ .name = TYPE_VFIO_CALXEDA_XGMAC,
+ .parent = TYPE_VFIO_PLATFORM,
+ .instance_size = sizeof(VFIOCalxedaXgmacDevice),
+ .class_init = vfio_calxeda_xgmac_class_init,
+ .class_size = sizeof(VFIOCalxedaXgmacDeviceClass),
+};
+
+static void register_calxeda_xgmac_dev_type(void)
+{
+ type_register_static(&vfio_calxeda_xgmac_dev_info);
+}
+
+type_init(register_calxeda_xgmac_dev_type)
diff --git a/include/hw/vfio/vfio-calxeda-xgmac.h b/include/hw/vfio/vfio-calxeda-xgmac.h
new file mode 100644
index 0000000..f994775
--- /dev/null
+++ b/include/hw/vfio/vfio-calxeda-xgmac.h
@@ -0,0 +1,46 @@
+/*
+ * VFIO calxeda xgmac device
+ *
+ * Copyright Linaro Limited, 2014
+ *
+ * Authors:
+ * Eric Auger <eric.auger@linaro.org>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2. See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#ifndef HW_VFIO_VFIO_CALXEDA_XGMAC_H
+#define HW_VFIO_VFIO_CALXEDA_XGMAC_H
+
+#include "hw/vfio/vfio-platform.h"
+
+#define TYPE_VFIO_CALXEDA_XGMAC "vfio-calxeda-xgmac"
+
+/**
+ * This device exposes:
+ * - a single MMIO region corresponding to its register space
+ * - 3 IRQS (main and 2 power related IRQs)
+ */
+typedef struct VFIOCalxedaXgmacDevice {
+ VFIOPlatformDevice vdev;
+} VFIOCalxedaXgmacDevice;
+
+typedef struct VFIOCalxedaXgmacDeviceClass {
+ /*< private >*/
+ VFIOPlatformDeviceClass parent_class;
+ /*< public >*/
+ DeviceRealize parent_realize;
+} VFIOCalxedaXgmacDeviceClass;
+
+#define VFIO_CALXEDA_XGMAC_DEVICE(obj) \
+ OBJECT_CHECK(VFIOCalxedaXgmacDevice, (obj), TYPE_VFIO_CALXEDA_XGMAC)
+#define VFIO_CALXEDA_XGMAC_DEVICE_CLASS(klass) \
+ OBJECT_CLASS_CHECK(VFIOCalxedaXgmacDeviceClass, (klass), \
+ TYPE_VFIO_CALXEDA_XGMAC)
+#define VFIO_CALXEDA_XGMAC_DEVICE_GET_CLASS(obj) \
+ OBJECT_GET_CLASS(VFIOCalxedaXgmacDeviceClass, (obj), \
+ TYPE_VFIO_CALXEDA_XGMAC)
+
+#endif
--
1.8.3.2
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [Qemu-devel] [PATCH v8 14/19] hw/arm/virt: add support for VFIO devices
2014-11-30 18:35 [Qemu-devel] [PATCH v8 00/19] KVM platform device passthrough Eric Auger
` (12 preceding siblings ...)
2014-11-30 18:35 ` [Qemu-devel] [PATCH v8 13/19] hw/vfio: calxeda xgmac device Eric Auger
@ 2014-11-30 18:35 ` Eric Auger
2014-11-30 18:35 ` [Qemu-devel] [PATCH v8 15/19] hw/arm/sysbus-fdt: enable vfio-calxeda-xgmac dynamic instantiation Eric Auger
` (5 subsequent siblings)
19 siblings, 0 replies; 22+ messages in thread
From: Eric Auger @ 2014-11-30 18:35 UTC (permalink / raw)
To: eric.auger, christoffer.dall, qemu-devel, agraf, pbonzini,
kim.phillips, a.rigo, manish.jaggi, joel.schopp, zhaoshenglong,
ard.biesheuvel
Cc: peter.maydell, patches, eric.auger, will.deacon, stuart.yoder,
Bharat.Bhushan, alex.williamson, a.motakis, kvmarm
VFIO devices are dynamic sysbus devices. They could already be
instantiated. However for them to be functional, IRQ injection must
be programmed and started. This programming must happen after the
sysbus devices are attached to the platform bus and IRQ are bound.
Only at that time the GSI they are connected to are identified and
irqfd can be programmed.
Binding happens in a machine init done notifier registered by the
platform bus init. The IRQ start is done in a reset notifier.
This patchs adds the registration of the IRQ start notifier in machvirt.
Signed-off-by: Eric Auger <eric.auger@linaro.org>
---
v7 -> v8:
- vfio_kick_irqs replaces older vfio_register_irq_starter. The new
function registers a reset notifier while the older registered a
machine init done notifier.
- Given the fact platform_bus_first_irq has become part of a const
struct its handle cannot be passed as a void* to the reset notifier.
We now pass the interrupt DeviceState*.
- create_gic now returns the DeviceState handle of the gic so that it
can be passed to the reset notifier registration
---
hw/arm/virt.c | 15 +++++++++++----
1 file changed, 11 insertions(+), 4 deletions(-)
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 37326a9..346b04a 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -44,6 +44,7 @@
#include "qemu/error-report.h"
#include "hw/arm/sysbus-fdt.h"
#include "hw/platform-bus.h"
+#include "hw/vfio/vfio-platform.h"
#define NUM_VIRTIO_TRANSPORTS 32
@@ -330,7 +331,7 @@ static void fdt_add_gic_node(const VirtBoardInfo *vbi)
qemu_fdt_setprop_cell(vbi->fdt, "/intc", "phandle", gic_phandle);
}
-static void create_gic(const VirtBoardInfo *vbi, qemu_irq *pic)
+static DeviceState *create_gic(const VirtBoardInfo *vbi, qemu_irq *pic)
{
/* We create a standalone GIC v2 */
DeviceState *gicdev;
@@ -378,6 +379,7 @@ static void create_gic(const VirtBoardInfo *vbi, qemu_irq *pic)
}
fdt_add_gic_node(vbi);
+ return gicdev;
}
static void create_uart(const VirtBoardInfo *vbi, qemu_irq *pic)
@@ -537,7 +539,8 @@ static void create_flash(const VirtBoardInfo *vbi)
}
static void create_platform_bus(VirtBoardInfo *vbi, qemu_irq *pic,
- const ARMPlatformBusSystemParams *system_params)
+ const ARMPlatformBusSystemParams *system_params,
+ DeviceState *gic)
{
DeviceState *dev;
SysBusDevice *s;
@@ -571,6 +574,9 @@ static void create_platform_bus(VirtBoardInfo *vbi, qemu_irq *pic,
memory_region_add_subregion(sysmem,
system_params->platform_bus_base,
sysbus_mmio_get_region(s, 0));
+
+ /* setup VFIO signaling/IRQFD for all VFIO platform sysbus devices */
+ qemu_register_reset(vfio_kick_irqs, gic);
}
static void *machvirt_dtb(const struct arm_boot_info *binfo, int *fdt_size)
@@ -589,6 +595,7 @@ static void machvirt_init(MachineState *machine)
MemoryRegion *ram = g_new(MemoryRegion, 1);
const char *cpu_model = machine->cpu_model;
VirtBoardInfo *vbi;
+ DeviceState *gic;
if (!cpu_model) {
cpu_model = "cortex-a15";
@@ -646,7 +653,7 @@ static void machvirt_init(MachineState *machine)
create_flash(vbi);
- create_gic(vbi, pic);
+ gic = create_gic(vbi, pic);
create_uart(vbi, pic);
@@ -658,7 +665,7 @@ static void machvirt_init(MachineState *machine)
*/
create_virtio_devices(vbi, pic);
- create_platform_bus(vbi, pic, &platform_bus_params);
+ create_platform_bus(vbi, pic, &platform_bus_params, gic);
vbi->bootinfo.ram_size = machine->ram_size;
vbi->bootinfo.kernel_filename = machine->kernel_filename;
--
1.8.3.2
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [Qemu-devel] [PATCH v8 15/19] hw/arm/sysbus-fdt: enable vfio-calxeda-xgmac dynamic instantiation
2014-11-30 18:35 [Qemu-devel] [PATCH v8 00/19] KVM platform device passthrough Eric Auger
` (13 preceding siblings ...)
2014-11-30 18:35 ` [Qemu-devel] [PATCH v8 14/19] hw/arm/virt: add support for VFIO devices Eric Auger
@ 2014-11-30 18:35 ` Eric Auger
2014-11-30 18:35 ` [Qemu-devel] [PATCH v8 16/19] hw/vfio/platform: Add irqfd support Eric Auger
` (4 subsequent siblings)
19 siblings, 0 replies; 22+ messages in thread
From: Eric Auger @ 2014-11-30 18:35 UTC (permalink / raw)
To: eric.auger, christoffer.dall, qemu-devel, agraf, pbonzini,
kim.phillips, a.rigo, manish.jaggi, joel.schopp, zhaoshenglong,
ard.biesheuvel
Cc: peter.maydell, patches, eric.auger, will.deacon, stuart.yoder,
Bharat.Bhushan, alex.williamson, a.motakis, kvmarm
vfio-calxeda-xgmac now can be instantiated using the -device option.
The node creation function generates a very basic dt node composed
of the compat, reg and interrupts properties
Signed-off-by: Eric Auger <eric.auger@linaro.org>
---
v7 -> v8:
- move the add_fdt_node_functions array declaration between the device
specific code and the generic code to avoid forward declarations of
decice specific functions
- rename add_basic_vfio_fdt_node into
add_calxeda_midway_xgmac_fdt_node
v6 -> v7:
- compat string re-formatting removed since compat string is not exposed
anymore as a user option
- VFIO IRQ kick-off removed from sysbus-fdt and moved to VFIO platform
device
---
hw/arm/sysbus-fdt.c | 88 +++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 88 insertions(+)
diff --git a/hw/arm/sysbus-fdt.c b/hw/arm/sysbus-fdt.c
index 7537267..86bbd06 100644
--- a/hw/arm/sysbus-fdt.c
+++ b/hw/arm/sysbus-fdt.c
@@ -26,6 +26,8 @@
#include "sysemu/device_tree.h"
#include "hw/platform-bus.h"
#include "sysemu/sysemu.h"
+#include "hw/vfio/vfio-platform.h"
+#include "hw/vfio/vfio-calxeda-xgmac.h"
/*
* internal struct that contains the information to create dynamic
@@ -53,11 +55,97 @@ typedef struct NodeCreationPair {
int (*add_fdt_node_fn)(SysBusDevice *sbdev, void *opaque);
} NodeCreationPair;
+/* Device Specific Code */
+
+/**
+ * add_calxeda_midway_xgmac_fdt_node
+ *
+ * Generates a very simple node with following properties:
+ * compatible string, regs, interrupts
+ */
+static int add_calxeda_midway_xgmac_fdt_node(SysBusDevice *sbdev, void *opaque)
+{
+ PlatformBusFdtData *data = opaque;
+ PlatformBusDevice *pbus = data->pbus;
+ void *fdt = data->fdt;
+ const char *parent_node = data->pbus_node_name;
+ int compat_str_len;
+ char *nodename;
+ int i, ret;
+ uint32_t *irq_attr;
+ uint64_t *reg_attr;
+ uint64_t mmio_base;
+ uint64_t irq_number;
+ VFIOPlatformDevice *vdev = VFIO_PLATFORM_DEVICE(sbdev);
+ VFIODevice *vbasedev = &vdev->vbasedev;
+ Object *obj = OBJECT(sbdev);
+
+ mmio_base = object_property_get_int(obj, "mmio[0]", NULL);
+
+ nodename = g_strdup_printf("%s/%s@%" PRIx64, parent_node,
+ vbasedev->name,
+ mmio_base);
+
+ qemu_fdt_add_subnode(fdt, nodename);
+
+ compat_str_len = strlen(vdev->compat) + 1;
+ qemu_fdt_setprop(fdt, nodename, "compatible",
+ vdev->compat, compat_str_len);
+
+ reg_attr = g_new(uint64_t, vbasedev->num_regions*4);
+
+ for (i = 0; i < vbasedev->num_regions; i++) {
+ mmio_base = platform_bus_get_mmio_addr(pbus, sbdev, i);
+ reg_attr[4*i] = 1;
+ reg_attr[4*i+1] = mmio_base;
+ reg_attr[4*i+2] = 1;
+ reg_attr[4*i+3] = memory_region_size(&vdev->regions[i]->mem);
+ }
+
+ ret = qemu_fdt_setprop_sized_cells_from_array(fdt, nodename, "reg",
+ vbasedev->num_regions*2, reg_attr);
+ if (ret < 0) {
+ error_report("could not set reg property of node %s", nodename);
+ goto fail;
+ }
+
+ irq_attr = g_new(uint32_t, vbasedev->num_irqs*3);
+
+ for (i = 0; i < vbasedev->num_irqs; i++) {
+ irq_number = platform_bus_get_irqn(pbus, sbdev , i)
+ + data->irq_start;
+ irq_attr[3*i] = cpu_to_be32(0);
+ irq_attr[3*i+1] = cpu_to_be32(irq_number);
+ irq_attr[3*i+2] = cpu_to_be32(0x4);
+ }
+
+ ret = qemu_fdt_setprop(fdt, nodename, "interrupts",
+ irq_attr, vbasedev->num_irqs*3*sizeof(uint32_t));
+ if (ret < 0) {
+ error_report("could not set interrupts property of node %s",
+ nodename);
+ goto fail;
+ }
+
+ g_free(nodename);
+ g_free(irq_attr);
+ g_free(reg_attr);
+
+ return 0;
+
+fail:
+
+ return -1;
+}
+
/* list of supported dynamic sysbus devices */
static const NodeCreationPair add_fdt_node_functions[] = {
+{TYPE_VFIO_CALXEDA_XGMAC, add_calxeda_midway_xgmac_fdt_node},
{"", NULL}, /*last element*/
};
+/* Generic Code */
+
/**
* add_fdt_node - add the device tree node of a dynamic sysbus device
*
--
1.8.3.2
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [Qemu-devel] [PATCH v8 16/19] hw/vfio/platform: Add irqfd support
2014-11-30 18:35 [Qemu-devel] [PATCH v8 00/19] KVM platform device passthrough Eric Auger
` (14 preceding siblings ...)
2014-11-30 18:35 ` [Qemu-devel] [PATCH v8 15/19] hw/arm/sysbus-fdt: enable vfio-calxeda-xgmac dynamic instantiation Eric Auger
@ 2014-11-30 18:35 ` Eric Auger
2014-11-30 18:35 ` [Qemu-devel] [PATCH v8 17/19] linux-headers: Update KVM headers from linux-next tag ToBeFilled Eric Auger
` (3 subsequent siblings)
19 siblings, 0 replies; 22+ messages in thread
From: Eric Auger @ 2014-11-30 18:35 UTC (permalink / raw)
To: eric.auger, christoffer.dall, qemu-devel, agraf, pbonzini,
kim.phillips, a.rigo, manish.jaggi, joel.schopp, zhaoshenglong,
ard.biesheuvel
Cc: peter.maydell, patches, eric.auger, will.deacon, stuart.yoder,
Bharat.Bhushan, alex.williamson, a.motakis, kvmarm
This patch aims at optimizing IRQ handling using irqfd framework.
Instead of handling the eventfds on user-side they are handled on
kernel side using
- the KVM irqfd framework,
- the VFIO driver virqfd framework.
the virtual IRQ completion is trapped at interrupt controller
This removes the need for fast/slow path swap.
Overall this brings significant performance improvements.
it depends on host kernel KVM irqfd.
Signed-off-by: Alvise Rigo <a.rigo@virtualopensystems.com>
Signed-off-by: Eric Auger <eric.auger@linaro.org>
---
v5 -> v6
- rely on kvm_irqfds_enabled() and kvm_resamplefds_enabled()
- guard KVM code with #ifdef CONFIG_KVM
v3 -> v4:
[Alvise Rigo]
Use of VFIO Platform driver v6 unmask/virqfd feature and removal
of resamplefd handler. Physical IRQ unmasking is now done in
VFIO driver.
v3:
[Eric Auger]
initial support with resamplefd handled on QEMU side since the
unmask was not supported on VFIO platform driver v5.
Conflicts:
hw/vfio/platform.c
---
hw/vfio/platform.c | 96 +++++++++++++++++++++++++++++++++++++++++
include/hw/vfio/vfio-platform.h | 1 +
trace-events | 2 +
3 files changed, 99 insertions(+)
diff --git a/hw/vfio/platform.c b/hw/vfio/platform.c
index 41f8693..97d98bf 100644
--- a/hw/vfio/platform.c
+++ b/hw/vfio/platform.c
@@ -25,6 +25,7 @@
#include "hw/sysbus.h"
#include "trace.h"
#include "hw/platform-bus.h"
+#include "sysemu/kvm.h"
static void vfio_intp_interrupt(VFIOINTp *intp);
typedef void (*eventfd_user_side_handler_t)(VFIOINTp *intp);
@@ -236,6 +237,83 @@ static int vfio_start_eventfd_injection(VFIOINTp *intp)
}
/*
+ * Functions used for irqfd
+ */
+
+#ifdef CONFIG_KVM
+
+/**
+ * vfio_set_resample_eventfd - sets the resamplefd for an IRQ
+ * @intp: the IRQ struct pointer
+ * programs the VFIO driver to unmask this IRQ when the
+ * intp->unmask eventfd is triggered
+ */
+static int vfio_set_resample_eventfd(VFIOINTp *intp)
+{
+ VFIODevice *vbasedev = &intp->vdev->vbasedev;
+ struct vfio_irq_set *irq_set;
+ int argsz, ret;
+ int32_t *pfd;
+
+ argsz = sizeof(*irq_set) + sizeof(*pfd);
+ irq_set = g_malloc0(argsz);
+ irq_set->argsz = argsz;
+ irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD | VFIO_IRQ_SET_ACTION_UNMASK;
+ irq_set->index = intp->pin;
+ irq_set->start = 0;
+ irq_set->count = 1;
+ pfd = (int32_t *)&irq_set->data;
+ *pfd = event_notifier_get_fd(&intp->unmask);
+ qemu_set_fd_handler(*pfd, NULL, NULL, intp);
+ ret = ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, irq_set);
+ g_free(irq_set);
+ if (ret < 0) {
+ error_report("vfio: Failed to set resample eventfd: %m");
+ qemu_set_fd_handler(*pfd, NULL, NULL, NULL);
+ }
+ return ret;
+}
+
+/**
+ * vfio_start_irqfd_injection - starts irqfd injection for an IRQ
+ * programs VFIO driver with both the trigger and resamplefd
+ * programs KVM with the gsi, trigger & resample eventfds
+ */
+static int vfio_start_irqfd_injection(VFIOINTp *intp)
+{
+ struct kvm_irqfd irqfd = {
+ .fd = event_notifier_get_fd(&intp->interrupt),
+ .resamplefd = event_notifier_get_fd(&intp->unmask),
+ .gsi = intp->virtualID,
+ .flags = KVM_IRQFD_FLAG_RESAMPLE,
+ };
+
+ if (kvm_vm_ioctl(kvm_state, KVM_IRQFD, &irqfd)) {
+ error_report("vfio: Error: Failed to assign the irqfd: %m");
+ goto fail_irqfd;
+ }
+ if (vfio_set_trigger_eventfd(intp, NULL) < 0) {
+ goto fail_vfio;
+ }
+ if (vfio_set_resample_eventfd(intp) < 0) {
+ goto fail_vfio;
+ }
+
+ intp->kvm_accel = true;
+ trace_vfio_platform_start_irqfd_injection(intp->pin, intp->virtualID,
+ irqfd.fd, irqfd.resamplefd);
+ return 0;
+
+fail_vfio:
+ irqfd.flags = KVM_IRQFD_FLAG_DEASSIGN;
+ kvm_vm_ioctl(kvm_state, KVM_IRQFD, &irqfd);
+fail_irqfd:
+ return -1;
+}
+
+#endif
+
+/*
* Functions used whatever the injection method
*/
@@ -314,6 +392,13 @@ static VFIOINTp *vfio_init_intp(VFIODevice *vbasedev, unsigned int index)
error_report("vfio: Error: trigger event_notifier_init failed ");
return NULL;
}
+ /* Get an eventfd for resample/unmask */
+ ret = event_notifier_init(&intp->unmask, 0);
+ if (ret) {
+ g_free(intp);
+ error_report("vfio: Error: resample event_notifier_init failed eoi");
+ return NULL;
+ }
/* store the new intp in qlist */
QLIST_INSERT_HEAD(&vdev->intp_list, intp, next);
@@ -520,7 +605,17 @@ static void vfio_platform_realize(DeviceState *dev, Error **errp)
vbasedev->type = VFIO_DEVICE_TYPE_PLATFORM;
vbasedev->ops = &vfio_platform_ops;
+
+#ifdef CONFIG_KVM
+ if (kvm_irqfds_enabled() && kvm_resamplefds_enabled() &&
+ vdev->irqfd_allowed) {
+ vdev->start_irq_fn = vfio_start_irqfd_injection;
+ } else {
+ vdev->start_irq_fn = vfio_start_eventfd_injection;
+ }
+#else
vdev->start_irq_fn = vfio_start_eventfd_injection;
+#endif
trace_vfio_platform_realize(vbasedev->name, vdev->compat);
@@ -598,6 +693,7 @@ static Property vfio_platform_dev_properties[] = {
DEFINE_PROP_STRING("host", VFIOPlatformDevice, vbasedev.name),
DEFINE_PROP_UINT32("mmap-timeout-ms", VFIOPlatformDevice,
mmap_timeout, 1100),
+ DEFINE_PROP_BOOL("x-irqfd", VFIOPlatformDevice, irqfd_allowed, true),
DEFINE_PROP_END_OF_LIST(),
};
diff --git a/include/hw/vfio/vfio-platform.h b/include/hw/vfio/vfio-platform.h
index 449c049..de0b5d5 100644
--- a/include/hw/vfio/vfio-platform.h
+++ b/include/hw/vfio/vfio-platform.h
@@ -58,6 +58,7 @@ typedef struct VFIOPlatformDevice {
QEMUTimer *mmap_timer; /* enable mmaps after periods w/o interrupts */
start_irq_fn_t start_irq_fn;
QemuMutex intp_mutex;
+ bool irqfd_allowed; /* debug option to force irqfd on/off */
} VFIOPlatformDevice;
diff --git a/trace-events b/trace-events
index 7a9e780..59a09f6 100644
--- a/trace-events
+++ b/trace-events
@@ -1441,6 +1441,8 @@ vfio_platform_base_device_init(char *name, int groupid) "%s belongs to group #%d
vfio_platform_realize(char *name, char *compat) "vfio device %s, compat = %s"
vfio_intp_interrupt_set_pending(int index) "irq %d is set PENDING"
vfio_platform_eoi_handle_pending(int index) "handle PENDING IRQ %d"
+vfio_platform_start_irqfd_injection(int index, int gsi, int fd, int resamplefd) "IRQ index=%d, gsi =%d, fd = %d, resamplefd = %d"
+vfio_start_eventfd_injection(int index, int fd) "IRQ index=%d, fd = %d"
#hw/acpi/memory_hotplug.c
mhp_acpi_invalid_slot_selected(uint32_t slot) "0x%"PRIx32
--
1.8.3.2
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [Qemu-devel] [PATCH v8 17/19] linux-headers: Update KVM headers from linux-next tag ToBeFilled
2014-11-30 18:35 [Qemu-devel] [PATCH v8 00/19] KVM platform device passthrough Eric Auger
` (15 preceding siblings ...)
2014-11-30 18:35 ` [Qemu-devel] [PATCH v8 16/19] hw/vfio/platform: Add irqfd support Eric Auger
@ 2014-11-30 18:35 ` Eric Auger
2014-11-30 18:35 ` [Qemu-devel] [PATCH v8 18/19] hw/vfio/common: vfio_kvm_device_fd moved in the common header Eric Auger
` (2 subsequent siblings)
19 siblings, 0 replies; 22+ messages in thread
From: Eric Auger @ 2014-11-30 18:35 UTC (permalink / raw)
To: eric.auger, christoffer.dall, qemu-devel, agraf, pbonzini,
kim.phillips, a.rigo, manish.jaggi, joel.schopp, zhaoshenglong,
ard.biesheuvel
Cc: peter.maydell, patches, eric.auger, will.deacon, stuart.yoder,
Bharat.Bhushan, alex.williamson, a.motakis, kvmarm
Syncup KVM related linux headers from linux-next tree using
scripts/update-linux-headers.sh.
Integrate updated KVM-VFIO API related to forwarded IRQ
Signed-off-by: Eric Auger <eric.auger@linaro.org>
---
| 10 ++++++++++
1 file changed, 10 insertions(+)
--git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
index 12045a1..9f798ab 100644
--- a/linux-headers/linux/kvm.h
+++ b/linux-headers/linux/kvm.h
@@ -946,6 +946,9 @@ struct kvm_device_attr {
#define KVM_DEV_VFIO_GROUP 1
#define KVM_DEV_VFIO_GROUP_ADD 1
#define KVM_DEV_VFIO_GROUP_DEL 2
+#define KVM_DEV_VFIO_DEVICE 2
+#define KVM_DEV_VFIO_DEVICE_FORWARD_IRQ 1
+#define KVM_DEV_VFIO_DEVICE_UNFORWARD_IRQ 2
enum kvm_device_type {
KVM_DEV_TYPE_FSL_MPIC_20 = 1,
@@ -963,6 +966,13 @@ enum kvm_device_type {
KVM_DEV_TYPE_MAX,
};
+struct kvm_arch_forwarded_irq {
+ __u32 fd; /* file desciptor of the VFIO device */
+ __u32 index; /* VFIO device IRQ index */
+ __u32 subindex; /* VFIO device IRQ subindex */
+ __u32 gsi; /* gsi, ie. virtual IRQ number */
+};
+
/*
* ioctls for VM fds
*/
--
1.8.3.2
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [Qemu-devel] [PATCH v8 18/19] hw/vfio/common: vfio_kvm_device_fd moved in the common header
2014-11-30 18:35 [Qemu-devel] [PATCH v8 00/19] KVM platform device passthrough Eric Auger
` (16 preceding siblings ...)
2014-11-30 18:35 ` [Qemu-devel] [PATCH v8 17/19] linux-headers: Update KVM headers from linux-next tag ToBeFilled Eric Auger
@ 2014-11-30 18:35 ` Eric Auger
2014-11-30 18:35 ` [Qemu-devel] [PATCH v8 19/19] hw/vfio/platform: add forwarded irq support Eric Auger
2014-12-08 17:04 ` [Qemu-devel] [PATCH v8 00/19] KVM platform device passthrough Alex Williamson
19 siblings, 0 replies; 22+ messages in thread
From: Eric Auger @ 2014-11-30 18:35 UTC (permalink / raw)
To: eric.auger, christoffer.dall, qemu-devel, agraf, pbonzini,
kim.phillips, a.rigo, manish.jaggi, joel.schopp, zhaoshenglong,
ard.biesheuvel
Cc: peter.maydell, patches, eric.auger, will.deacon, stuart.yoder,
Bharat.Bhushan, alex.williamson, a.motakis, kvmarm
the device is now used in platform for forwarded IRQ setup
Signed-off-by: Eric Auger <eric.auger@linaro.org>
---
hw/vfio/common.c | 3 ++-
include/hw/vfio/vfio-common.h | 5 +++++
2 files changed, 7 insertions(+), 1 deletion(-)
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 554467f..ba00ec9 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -44,9 +44,10 @@ struct vfio_as_head vfio_address_spaces =
* initialized, this file descriptor is only released on QEMU exit and
* we'll re-use it should another vfio device be attached before then.
*/
-static int vfio_kvm_device_fd = -1;
+int vfio_kvm_device_fd = -1;
#endif
+
/*
* Common VFIO interrupt disable
*/
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index b5af090..58fd786 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -41,6 +41,11 @@
#define VFIO_ALLOW_KVM_MSI 1
#define VFIO_ALLOW_KVM_MSIX 1
+#ifdef CONFIG_KVM
+extern int vfio_kvm_device_fd;
+#endif
+
+
enum {
VFIO_DEVICE_TYPE_PCI = 0,
VFIO_DEVICE_TYPE_PLATFORM = 1,
--
1.8.3.2
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [Qemu-devel] [PATCH v8 19/19] hw/vfio/platform: add forwarded irq support
2014-11-30 18:35 [Qemu-devel] [PATCH v8 00/19] KVM platform device passthrough Eric Auger
` (17 preceding siblings ...)
2014-11-30 18:35 ` [Qemu-devel] [PATCH v8 18/19] hw/vfio/common: vfio_kvm_device_fd moved in the common header Eric Auger
@ 2014-11-30 18:35 ` Eric Auger
2014-12-08 17:04 ` [Qemu-devel] [PATCH v8 00/19] KVM platform device passthrough Alex Williamson
19 siblings, 0 replies; 22+ messages in thread
From: Eric Auger @ 2014-11-30 18:35 UTC (permalink / raw)
To: eric.auger, christoffer.dall, qemu-devel, agraf, pbonzini,
kim.phillips, a.rigo, manish.jaggi, joel.schopp, zhaoshenglong,
ard.biesheuvel
Cc: peter.maydell, patches, eric.auger, will.deacon, stuart.yoder,
Bharat.Bhushan, alex.williamson, a.motakis, kvmarm
Tests whether the forwarded IRQ modality is available.
In the positive device IRQs are forwarded. This control is
achieved with KVM-VFIO device. with such a modality injection
still is handled through irqfds. However end of interrupt is
not trapped anymore. As soon as the guest completes its virtual
IRQ, the corresponding physical IRQ is completed and the same
physical IRQ can hit again.
A new x-forward property enables to force forwarding off although
enabled by the kernel.
Signed-off-by: Eric Auger <eric.auger@linaro.org>
---
hw/vfio/platform.c | 52 +++++++++++++++++++++++++++++++++++++++++
include/hw/vfio/vfio-platform.h | 2 ++
trace-events | 1 +
3 files changed, 55 insertions(+)
diff --git a/hw/vfio/platform.c b/hw/vfio/platform.c
index 97d98bf..7881b9b 100644
--- a/hw/vfio/platform.c
+++ b/hw/vfio/platform.c
@@ -237,6 +237,52 @@ static int vfio_start_eventfd_injection(VFIOINTp *intp)
}
/*
+ * Functions used with forwarding capability
+ */
+
+#ifdef CONFIG_KVM
+
+static bool has_kvm_vfio_forward_capability(void)
+{
+ struct kvm_device_attr attr = {
+ .group = KVM_DEV_VFIO_DEVICE,
+ .attr = KVM_DEV_VFIO_DEVICE_FORWARD_IRQ};
+
+ if (ioctl(vfio_kvm_device_fd, KVM_HAS_DEVICE_ATTR, &attr) == 0) {
+ return true;
+ } else {
+ return false;
+ }
+}
+
+static int vfio_set_forwarding(VFIOINTp *intp)
+{
+ int ret;
+ struct kvm_device_attr attr = {
+ .group = KVM_DEV_VFIO_DEVICE,
+ .attr = KVM_DEV_VFIO_DEVICE_FORWARD_IRQ};
+
+ intp->fwd_irq = g_malloc0(sizeof(*intp->fwd_irq));
+ intp->fwd_irq->fd = intp->vdev->vbasedev.fd;
+ intp->fwd_irq->index = intp->pin;
+ intp->fwd_irq->gsi = intp->virtualID;
+
+ attr.addr = (uint64_t)(unsigned long)intp->fwd_irq;
+
+ if (ioctl(vfio_kvm_device_fd, KVM_SET_DEVICE_ATTR, &attr)) {
+ error_report("Failed to forward IRQ %d through KVM VFIO device",
+ intp->pin);
+ g_free(intp->fwd_irq);
+ return -errno;
+ }
+ trace_vfio_start_fwd_injection(intp->pin);
+
+ return ret;
+}
+
+#endif
+
+/*
* Functions used for irqfd
*/
@@ -288,6 +334,11 @@ static int vfio_start_irqfd_injection(VFIOINTp *intp)
.flags = KVM_IRQFD_FLAG_RESAMPLE,
};
+ if (has_kvm_vfio_forward_capability() &&
+ intp->vdev->forward_allowed) {
+ vfio_set_forwarding(intp);
+ }
+
if (kvm_vm_ioctl(kvm_state, KVM_IRQFD, &irqfd)) {
error_report("vfio: Error: Failed to assign the irqfd: %m");
goto fail_irqfd;
@@ -694,6 +745,7 @@ static Property vfio_platform_dev_properties[] = {
DEFINE_PROP_UINT32("mmap-timeout-ms", VFIOPlatformDevice,
mmap_timeout, 1100),
DEFINE_PROP_BOOL("x-irqfd", VFIOPlatformDevice, irqfd_allowed, true),
+ DEFINE_PROP_BOOL("x-forward", VFIOPlatformDevice, forward_allowed, true),
DEFINE_PROP_END_OF_LIST(),
};
diff --git a/include/hw/vfio/vfio-platform.h b/include/hw/vfio/vfio-platform.h
index de0b5d5..d512bb3 100644
--- a/include/hw/vfio/vfio-platform.h
+++ b/include/hw/vfio/vfio-platform.h
@@ -42,6 +42,7 @@ typedef struct VFIOINTp {
bool kvm_accel; /* set when QEMU bypass through KVM enabled */
uint8_t pin; /* index */
uint8_t virtualID; /* virtual IRQ */
+ struct kvm_arch_forwarded_irq *fwd_irq;
} VFIOINTp;
typedef int (*start_irq_fn_t)(VFIOINTp *intp);
@@ -59,6 +60,7 @@ typedef struct VFIOPlatformDevice {
start_irq_fn_t start_irq_fn;
QemuMutex intp_mutex;
bool irqfd_allowed; /* debug option to force irqfd on/off */
+ bool forward_allowed; /* debug option to force forwarding on/off */
} VFIOPlatformDevice;
diff --git a/trace-events b/trace-events
index 59a09f6..0aea358 100644
--- a/trace-events
+++ b/trace-events
@@ -1431,6 +1431,7 @@ vfio_get_device(const char * name, unsigned int flags, unsigned int num_regions,
vfio_put_base_device(int fd) "close vdev->fd=%d"
# hw/vfio/platform.c
+vfio_start_fwd_injection(int pin) "forwarding set for IRQ pin %d"
vfio_platform_eoi(int pin, int fd) "EOI IRQ pin %d (fd=%d)"
vfio_platform_mmap_set_enabled(bool enabled) "fast path = %d"
vfio_platform_intp_mmap_enable(int pin) "IRQ #%d still active, stay in slow path"
--
1.8.3.2
^ permalink raw reply related [flat|nested] 22+ messages in thread
* Re: [Qemu-devel] [PATCH v8 00/19] KVM platform device passthrough
2014-11-30 18:35 [Qemu-devel] [PATCH v8 00/19] KVM platform device passthrough Eric Auger
` (18 preceding siblings ...)
2014-11-30 18:35 ` [Qemu-devel] [PATCH v8 19/19] hw/vfio/platform: add forwarded irq support Eric Auger
@ 2014-12-08 17:04 ` Alex Williamson
2014-12-08 17:35 ` Eric Auger
19 siblings, 1 reply; 22+ messages in thread
From: Alex Williamson @ 2014-12-08 17:04 UTC (permalink / raw)
To: Eric Auger
Cc: joel.schopp, kim.phillips, eric.auger, a.motakis, a.rigo,
peter.maydell, manish.jaggi, ard.biesheuvel, will.deacon,
qemu-devel, agraf, Bharat.Bhushan, stuart.yoder, patches,
zhaoshenglong, pbonzini, kvmarm, christoffer.dall
On Sun, 2014-11-30 at 18:35 +0000, Eric Auger wrote:
> This RFC series aims at enabling KVM platform device passthrough.
> It implements a VFIO platform device, derived from VFIO PCI device.
>
> The VFIO platform device uses the host VFIO platform driver which must
> be bound to the assigned device prior to the QEMU system start.
>
> - the guest can directly access the device register space
> - assigned device IRQs are transparently routed to the guest by
> QEMU/KVM (3 methods currently are supported: user-level eventfd
> handling, irqfd, forwarded IRQs)
> - iommu is transparently programmed to prevent the device from
> accessing physical pages outside of the guest address space
>
> This patch series is made of the following patch file groups:
>
> 1-11) PCI modifications to prepare for platform device introduction
Hi Eric,
Sorry for the delay, I've only been able to make a serious review of
1-11 and it looks good. I think we should start trying to pull these in
after QEMU2.2. Looking beyond patch 11/19, I'm surprised that the very
first next patch isn't a linux header sync so that the platform
populate_device routine can test the device info to make sure it's
operating on a vfio platform device. Thanks,
Alex
> 12-15) VFIO & calxeda midway platform device without irqfd support
> 16) VFIO platform device with irqfd support
> 17-19) VFIO platform device with IRQ forwarding support
>
> Each group is independent and should be separately upstreamable.
>
> Dependency List:
>
> QEMU dependencies:
> [1] [PATCH v5] machvirt dynamic sysbus device instantiation
> Eric Auger
> [2] [PATCH v3 0/2] actual checks of KVM_CAP_IRQFD and KVM_CAP_IRQFD_RESAMPLE,
> Eric Auger
> http://lists.nongnu.org/archive/html/qemu-devel/2014-09/msg00589.html
> [3] [PATCH v2] vfio: migration to trace points
> Eric Auger
> https://patchwork.ozlabs.org/patch/394785/
>
> Kernel Dependencies:
> [1] [PATCH v10 00/20] VFIO support for platform and AMBA devices on ARM
> Antonios Motakis
> http://comments.gmane.org/gmane.linux.kernel.iommu/7096
> [2] [PATCH v3 0/6] vfio: type1: support for ARM SMMUS with VFIO_IOMMU_TYPE1
> Antonios Motakis
> http://www.spinics.net/lists/kvm-arm/msg11738.html
> [3] [PATCH v4] ARM: KVM: add irqfd support
> Eric Auger
> https://lkml.org/lkml/2014/9/1/141
> [4] [RFC PATCH 0/9] ARM: Forwarding physical interrupts to a guest VM,
> Marc Zyngier
> http://lwn.net/Articles/603514/
> [5] [PATCH v3 0/9] KVM-VFIO IRQ forward control
> Eric Auger
> https://lkml.org/lkml/2014/9/1/344
>
> - kernel pieces can be found at:
> http://git.linaro.org/people/eric.auger/linux.git (branch 3.18-rc6-v10)
> - QEMU pieces can be found at:
> http://git.linaro.org/people/eric.auger/qemu.git (branch vfio_integ_v8)
>
> The patch series was tested on Calxeda Midway (ARMv7) where one xgmac
> is assigned to KVM host while the second one is assigned to the guest.
> Reworked PCI device is not tested.
>
> Wiki for Calxeda Midway setup:
> https://wiki.linaro.org/LEG/Engineering/Virtualization/Platform_Device_Passthrough_on_Midway
>
> History:
> v7->v8:
> - rebase on v2.2.0-rc3 and integrate
> "Add skip_dump flag to ignore memory region during dump"
> - KVM header evolution with subindex addition in kvm_arch_forwarded_irq
> - split [PATCH v7 03/16] hw/vfio/pci: introduce VFIODevice into 4 patches
> - vfio_compute_needs_reset does not return bool anymore
> - add some comments about exposed MMIO region and IRQ in calxeda xgmac
> device
> - vfio_[un]mask_irqindex renamed into vfio_[un]mask_single_irqindex
> - rework IRQ startup: former machine init done notifier is replaced by a
> reset notifier. machine file passes the interrupt controller
> DeviceState handle (not the platform bus first irq parameter).
> - sysbus-fdt:
> - move the add_fdt_node_functions array declaration between the device
> specific code and the generic code to avoid forward declarations of
> decice specific functions
> - rename add_basic_vfio_fdt_node into add_calxeda_midway_xgmac_fdt_node
> emphasizing the fact it is xgmac specific
>
> v6->v7:
> - fake injection test modality removed
> - VFIO_DEVICE_TYPE_PLATFORM only introduced with VFIO platform
> - new helper functions to start VFIO IRQ on machine init done notifier
> (introduced in hw/vfio/platform: add vfio-platform support and notifier
> registration invoked in hw/arm/virt: add support for VFIO devices).
> vfio_start_irq_injection is replaced by vfio_register_irq_starter.
>
> v5->v6:
> - rebase on 2.1rc5 PCI code
> - forwarded IRQ first integraton
> - vfio_device property renamed into host property
> - split IRQ setup in different functions that match the 3 supported
> injection techniques (user handled eventfd, irqfd, forwarded IRQ):
> removes dynamic switch between injection methods
> - introduce fake interrupts as a test modality:
> x makes possible to test multiple IRQ user-side handling.
> x this is a test feature only: enable to trigger a fd as if the
> real physical IRQ hit. No virtual IRQ is injected into the guest
> but handling is simulated so that the state machine can be tested
> - user handled eventfd:
> x add mutex to protect IRQ state & list manipulation,
> x correct misleading comment in vfio_intp_interrupt.
> x Fix bugs using fake interrupt modality
> - irqfd no more advertised in this patchset (handled in [3])
> - VFIOPlatformDeviceClass becomes abstract and Calxeda xgmac device
> and class is re-introduced (as per v4)
> - all DPRINTF removed in platform and replaced by trace-points
> - corrects compilation with configure --disable-kvm
> - simplifies the split for vfio_get_device and introduce a unique
> specialized function named vfio_populate_device
> - group_list renamed into vfio_group_list
> - hw/arm/dyn_sysbus_devtree.c currently only support vfio-calxeda-xgmac
> instantiation. Needs to be specialized for other VFIO devices
> - fix 2 bugs in dyn_sysbus_devtree(reg_attr index and compat)
>
> v4->v5:
> - rebase on v2.1.0 PCI code
> - take into account Alex Williamson comments on PCI code rework
> - trace updates in vfio_region_write/read
> - remove fd from VFIORegion
> - get/put ckeanup
> - bug fix: bar region's vbasedev field duly initialization
> - misc cleanups in platform device
> - device tree node generation removed from device and handled in
> hw/arm/dyn_sysbus_devtree.c
> - remove "hw/vfio: add an example calxeda_xgmac": with removal of
> device tree node generation we do not have so many things to
> implement in that derived device yet. May be re-introduced later
> on if needed typically for reset/migration.
> - no GSI routing table anymore
>
> v3->v4 changes (Eric Auger, Alvise Rigo)
> - rebase on last VFIO PCI code (v2.1.0-rc0)
> - full git history rework to ease PCI code change review
> - mv include files in hw/vfio
> - DPRINTF reformatting temporarily moved out
> - support of VFIO virq (removal of resamplefd handler on user-side)
> - integration with sysbus dynamic instantiation framwork
> - removal of unrealize and cleanup routines until it is better
> understood what is really needed
> - Support of VFIO for Amba devices should be handled in an inherited
> device to specialize the device tree generation (clock handle currently
> missing in framework however)
> - "Always use eventfd as notifying mechanism" temporarily moved out
> - static instantiation is not mainstream (although it remains possible)
> note if static instantiation is used, irqfd must be setup in machine file
> when virtual IRQ is known
> - create the GSI routing table on qemu side
>
> v2->v3 changes (Alvise Rigo, Eric Auger):
> - Following Alex W recommandations, further efforts to factorize the
> code between PCI:introduction of VFIODevice and VFIORegion
> as base classes
> - unique reset handler for platform and PCI
> - cleanup following Kim's comments
> - multiple IRQ support mechanics should be in place although not
> tested
> - Better handling of MMIO multiple regions
> - New features and fixes by Alvise (multiple compat string, exec
> flag, force eventfd usage, amba device tree support)
> - irqfd support
>
> v1->v2 changes (Kim Phillips, Eric Auger):
> - IRQ initial support (legacy mode where eventfds are handled on
> user side)
> - hacked dynamic instantiation
>
> v1 (Kim Phillips):
> - initial split between PCI and platform
> - MMIO support only
> - static instantiation
>
> Best Regards
>
> Eric
>
>
>
>
> Eric Auger (18):
> hw/vfio/pci: Rename VFIODevice into VFIOPCIDevice
> hw/vfio/pci: generalize mask/unmask to any IRQ index
> hw/vfio/pci: introduce minimalist VFIODevice with fd
> hw/vfio/pci: add type, name and group fields in VFIODevice
> hw/vfio/pci: handle reset at VFIODevice
> hw/vfio/pci: Introduce VFIORegion
> hw/vfio/pci: split vfio_get_device
> hw/vfio/pci: rename group_list into vfio_group_list
> hw/vfio/pci: use name field in format strings
> hw/vfio: create common module
> hw/vfio/platform: add vfio-platform support
> hw/vfio: calxeda xgmac device
> hw/arm/virt: add support for VFIO devices
> hw/arm/sysbus-fdt: enable vfio-calxeda-xgmac dynamic instantiation
> hw/vfio/platform: Add irqfd support
> linux-headers: Update KVM headers from linux-next tag ToBeFilled
> hw/vfio/common: vfio_kvm_device_fd moved in the common header
> hw/vfio/platform: add forwarded irq support
>
> Kim Phillips (1):
> vfio: move hw/misc/vfio.c to hw/vfio/pci.c Move vfio.h into
> include/hw/vfio
>
> LICENSE | 2 +-
> MAINTAINERS | 2 +-
> hw/Makefile.objs | 1 +
> hw/arm/sysbus-fdt.c | 88 ++
> hw/arm/virt.c | 15 +-
> hw/misc/Makefile.objs | 1 -
> hw/ppc/spapr_pci_vfio.c | 2 +-
> hw/vfio/Makefile.objs | 6 +
> hw/vfio/calxeda_xgmac.c | 54 ++
> hw/vfio/common.c | 960 +++++++++++++++++++
> hw/{misc/vfio.c => vfio/pci.c} | 1671 +++++++---------------------------
> hw/vfio/platform.c | 777 ++++++++++++++++
> include/hw/vfio/vfio-calxeda-xgmac.h | 46 +
> include/hw/vfio/vfio-common.h | 157 ++++
> include/hw/vfio/vfio-platform.h | 88 ++
> include/hw/{misc => vfio}/vfio.h | 0
> linux-headers/linux/kvm.h | 10 +
> trace-events | 137 +--
> 18 files changed, 2599 insertions(+), 1418 deletions(-)
> create mode 100644 hw/vfio/Makefile.objs
> create mode 100644 hw/vfio/calxeda_xgmac.c
> create mode 100644 hw/vfio/common.c
> rename hw/{misc/vfio.c => vfio/pci.c} (65%)
> create mode 100644 hw/vfio/platform.c
> create mode 100644 include/hw/vfio/vfio-calxeda-xgmac.h
> create mode 100644 include/hw/vfio/vfio-common.h
> create mode 100644 include/hw/vfio/vfio-platform.h
> rename include/hw/{misc => vfio}/vfio.h (100%)
>
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [Qemu-devel] [PATCH v8 00/19] KVM platform device passthrough
2014-12-08 17:04 ` [Qemu-devel] [PATCH v8 00/19] KVM platform device passthrough Alex Williamson
@ 2014-12-08 17:35 ` Eric Auger
0 siblings, 0 replies; 22+ messages in thread
From: Eric Auger @ 2014-12-08 17:35 UTC (permalink / raw)
To: Alex Williamson
Cc: joel.schopp, kim.phillips, eric.auger, a.motakis, a.rigo,
peter.maydell, manish.jaggi, ard.biesheuvel, will.deacon,
qemu-devel, agraf, Bharat.Bhushan, stuart.yoder, patches,
zhaoshenglong, pbonzini, kvmarm, christoffer.dall
On 12/08/2014 06:04 PM, Alex Williamson wrote:
> On Sun, 2014-11-30 at 18:35 +0000, Eric Auger wrote:
>> This RFC series aims at enabling KVM platform device passthrough.
>> It implements a VFIO platform device, derived from VFIO PCI device.
>>
>> The VFIO platform device uses the host VFIO platform driver which must
>> be bound to the assigned device prior to the QEMU system start.
>>
>> - the guest can directly access the device register space
>> - assigned device IRQs are transparently routed to the guest by
>> QEMU/KVM (3 methods currently are supported: user-level eventfd
>> handling, irqfd, forwarded IRQs)
>> - iommu is transparently programmed to prevent the device from
>> accessing physical pages outside of the guest address space
>>
>> This patch series is made of the following patch file groups:
>>
>> 1-11) PCI modifications to prepare for platform device introduction
>
> Hi Eric,
>
> Sorry for the delay, I've only been able to make a serious review of
> 1-11 and it looks good. I think we should start trying to pull these in
> after QEMU2.2.
That's a great news!
Looking beyond patch 11/19, I'm surprised that the very
> first next patch isn't a linux header sync so that the platform
> populate_device routine can test the device info to make sure it's
> operating on a vfio platform device.
Yes you're fully right. Currently I do not test it and it's a mistake. I
will add this linux header sync.
Thanks
Eric
Thanks,
>
> Alex
>
>> 12-15) VFIO & calxeda midway platform device without irqfd support
>> 16) VFIO platform device with irqfd support
>> 17-19) VFIO platform device with IRQ forwarding support
>>
>> Each group is independent and should be separately upstreamable.
>>
>> Dependency List:
>>
>> QEMU dependencies:
>> [1] [PATCH v5] machvirt dynamic sysbus device instantiation
>> Eric Auger
>> [2] [PATCH v3 0/2] actual checks of KVM_CAP_IRQFD and KVM_CAP_IRQFD_RESAMPLE,
>> Eric Auger
>> http://lists.nongnu.org/archive/html/qemu-devel/2014-09/msg00589.html
>> [3] [PATCH v2] vfio: migration to trace points
>> Eric Auger
>> https://patchwork.ozlabs.org/patch/394785/
>>
>> Kernel Dependencies:
>> [1] [PATCH v10 00/20] VFIO support for platform and AMBA devices on ARM
>> Antonios Motakis
>> http://comments.gmane.org/gmane.linux.kernel.iommu/7096
>> [2] [PATCH v3 0/6] vfio: type1: support for ARM SMMUS with VFIO_IOMMU_TYPE1
>> Antonios Motakis
>> http://www.spinics.net/lists/kvm-arm/msg11738.html
>> [3] [PATCH v4] ARM: KVM: add irqfd support
>> Eric Auger
>> https://lkml.org/lkml/2014/9/1/141
>> [4] [RFC PATCH 0/9] ARM: Forwarding physical interrupts to a guest VM,
>> Marc Zyngier
>> http://lwn.net/Articles/603514/
>> [5] [PATCH v3 0/9] KVM-VFIO IRQ forward control
>> Eric Auger
>> https://lkml.org/lkml/2014/9/1/344
>>
>> - kernel pieces can be found at:
>> http://git.linaro.org/people/eric.auger/linux.git (branch 3.18-rc6-v10)
>> - QEMU pieces can be found at:
>> http://git.linaro.org/people/eric.auger/qemu.git (branch vfio_integ_v8)
>>
>> The patch series was tested on Calxeda Midway (ARMv7) where one xgmac
>> is assigned to KVM host while the second one is assigned to the guest.
>> Reworked PCI device is not tested.
>>
>> Wiki for Calxeda Midway setup:
>> https://wiki.linaro.org/LEG/Engineering/Virtualization/Platform_Device_Passthrough_on_Midway
>>
>> History:
>> v7->v8:
>> - rebase on v2.2.0-rc3 and integrate
>> "Add skip_dump flag to ignore memory region during dump"
>> - KVM header evolution with subindex addition in kvm_arch_forwarded_irq
>> - split [PATCH v7 03/16] hw/vfio/pci: introduce VFIODevice into 4 patches
>> - vfio_compute_needs_reset does not return bool anymore
>> - add some comments about exposed MMIO region and IRQ in calxeda xgmac
>> device
>> - vfio_[un]mask_irqindex renamed into vfio_[un]mask_single_irqindex
>> - rework IRQ startup: former machine init done notifier is replaced by a
>> reset notifier. machine file passes the interrupt controller
>> DeviceState handle (not the platform bus first irq parameter).
>> - sysbus-fdt:
>> - move the add_fdt_node_functions array declaration between the device
>> specific code and the generic code to avoid forward declarations of
>> decice specific functions
>> - rename add_basic_vfio_fdt_node into add_calxeda_midway_xgmac_fdt_node
>> emphasizing the fact it is xgmac specific
>>
>> v6->v7:
>> - fake injection test modality removed
>> - VFIO_DEVICE_TYPE_PLATFORM only introduced with VFIO platform
>> - new helper functions to start VFIO IRQ on machine init done notifier
>> (introduced in hw/vfio/platform: add vfio-platform support and notifier
>> registration invoked in hw/arm/virt: add support for VFIO devices).
>> vfio_start_irq_injection is replaced by vfio_register_irq_starter.
>>
>> v5->v6:
>> - rebase on 2.1rc5 PCI code
>> - forwarded IRQ first integraton
>> - vfio_device property renamed into host property
>> - split IRQ setup in different functions that match the 3 supported
>> injection techniques (user handled eventfd, irqfd, forwarded IRQ):
>> removes dynamic switch between injection methods
>> - introduce fake interrupts as a test modality:
>> x makes possible to test multiple IRQ user-side handling.
>> x this is a test feature only: enable to trigger a fd as if the
>> real physical IRQ hit. No virtual IRQ is injected into the guest
>> but handling is simulated so that the state machine can be tested
>> - user handled eventfd:
>> x add mutex to protect IRQ state & list manipulation,
>> x correct misleading comment in vfio_intp_interrupt.
>> x Fix bugs using fake interrupt modality
>> - irqfd no more advertised in this patchset (handled in [3])
>> - VFIOPlatformDeviceClass becomes abstract and Calxeda xgmac device
>> and class is re-introduced (as per v4)
>> - all DPRINTF removed in platform and replaced by trace-points
>> - corrects compilation with configure --disable-kvm
>> - simplifies the split for vfio_get_device and introduce a unique
>> specialized function named vfio_populate_device
>> - group_list renamed into vfio_group_list
>> - hw/arm/dyn_sysbus_devtree.c currently only support vfio-calxeda-xgmac
>> instantiation. Needs to be specialized for other VFIO devices
>> - fix 2 bugs in dyn_sysbus_devtree(reg_attr index and compat)
>>
>> v4->v5:
>> - rebase on v2.1.0 PCI code
>> - take into account Alex Williamson comments on PCI code rework
>> - trace updates in vfio_region_write/read
>> - remove fd from VFIORegion
>> - get/put ckeanup
>> - bug fix: bar region's vbasedev field duly initialization
>> - misc cleanups in platform device
>> - device tree node generation removed from device and handled in
>> hw/arm/dyn_sysbus_devtree.c
>> - remove "hw/vfio: add an example calxeda_xgmac": with removal of
>> device tree node generation we do not have so many things to
>> implement in that derived device yet. May be re-introduced later
>> on if needed typically for reset/migration.
>> - no GSI routing table anymore
>>
>> v3->v4 changes (Eric Auger, Alvise Rigo)
>> - rebase on last VFIO PCI code (v2.1.0-rc0)
>> - full git history rework to ease PCI code change review
>> - mv include files in hw/vfio
>> - DPRINTF reformatting temporarily moved out
>> - support of VFIO virq (removal of resamplefd handler on user-side)
>> - integration with sysbus dynamic instantiation framwork
>> - removal of unrealize and cleanup routines until it is better
>> understood what is really needed
>> - Support of VFIO for Amba devices should be handled in an inherited
>> device to specialize the device tree generation (clock handle currently
>> missing in framework however)
>> - "Always use eventfd as notifying mechanism" temporarily moved out
>> - static instantiation is not mainstream (although it remains possible)
>> note if static instantiation is used, irqfd must be setup in machine file
>> when virtual IRQ is known
>> - create the GSI routing table on qemu side
>>
>> v2->v3 changes (Alvise Rigo, Eric Auger):
>> - Following Alex W recommandations, further efforts to factorize the
>> code between PCI:introduction of VFIODevice and VFIORegion
>> as base classes
>> - unique reset handler for platform and PCI
>> - cleanup following Kim's comments
>> - multiple IRQ support mechanics should be in place although not
>> tested
>> - Better handling of MMIO multiple regions
>> - New features and fixes by Alvise (multiple compat string, exec
>> flag, force eventfd usage, amba device tree support)
>> - irqfd support
>>
>> v1->v2 changes (Kim Phillips, Eric Auger):
>> - IRQ initial support (legacy mode where eventfds are handled on
>> user side)
>> - hacked dynamic instantiation
>>
>> v1 (Kim Phillips):
>> - initial split between PCI and platform
>> - MMIO support only
>> - static instantiation
>>
>> Best Regards
>>
>> Eric
>>
>>
>>
>>
>> Eric Auger (18):
>> hw/vfio/pci: Rename VFIODevice into VFIOPCIDevice
>> hw/vfio/pci: generalize mask/unmask to any IRQ index
>> hw/vfio/pci: introduce minimalist VFIODevice with fd
>> hw/vfio/pci: add type, name and group fields in VFIODevice
>> hw/vfio/pci: handle reset at VFIODevice
>> hw/vfio/pci: Introduce VFIORegion
>> hw/vfio/pci: split vfio_get_device
>> hw/vfio/pci: rename group_list into vfio_group_list
>> hw/vfio/pci: use name field in format strings
>> hw/vfio: create common module
>> hw/vfio/platform: add vfio-platform support
>> hw/vfio: calxeda xgmac device
>> hw/arm/virt: add support for VFIO devices
>> hw/arm/sysbus-fdt: enable vfio-calxeda-xgmac dynamic instantiation
>> hw/vfio/platform: Add irqfd support
>> linux-headers: Update KVM headers from linux-next tag ToBeFilled
>> hw/vfio/common: vfio_kvm_device_fd moved in the common header
>> hw/vfio/platform: add forwarded irq support
>>
>> Kim Phillips (1):
>> vfio: move hw/misc/vfio.c to hw/vfio/pci.c Move vfio.h into
>> include/hw/vfio
>>
>> LICENSE | 2 +-
>> MAINTAINERS | 2 +-
>> hw/Makefile.objs | 1 +
>> hw/arm/sysbus-fdt.c | 88 ++
>> hw/arm/virt.c | 15 +-
>> hw/misc/Makefile.objs | 1 -
>> hw/ppc/spapr_pci_vfio.c | 2 +-
>> hw/vfio/Makefile.objs | 6 +
>> hw/vfio/calxeda_xgmac.c | 54 ++
>> hw/vfio/common.c | 960 +++++++++++++++++++
>> hw/{misc/vfio.c => vfio/pci.c} | 1671 +++++++---------------------------
>> hw/vfio/platform.c | 777 ++++++++++++++++
>> include/hw/vfio/vfio-calxeda-xgmac.h | 46 +
>> include/hw/vfio/vfio-common.h | 157 ++++
>> include/hw/vfio/vfio-platform.h | 88 ++
>> include/hw/{misc => vfio}/vfio.h | 0
>> linux-headers/linux/kvm.h | 10 +
>> trace-events | 137 +--
>> 18 files changed, 2599 insertions(+), 1418 deletions(-)
>> create mode 100644 hw/vfio/Makefile.objs
>> create mode 100644 hw/vfio/calxeda_xgmac.c
>> create mode 100644 hw/vfio/common.c
>> rename hw/{misc/vfio.c => vfio/pci.c} (65%)
>> create mode 100644 hw/vfio/platform.c
>> create mode 100644 include/hw/vfio/vfio-calxeda-xgmac.h
>> create mode 100644 include/hw/vfio/vfio-common.h
>> create mode 100644 include/hw/vfio/vfio-platform.h
>> rename include/hw/{misc => vfio}/vfio.h (100%)
>>
>
>
>
^ permalink raw reply [flat|nested] 22+ messages in thread
end of thread, other threads:[~2014-12-08 17:37 UTC | newest]
Thread overview: 22+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-11-30 18:35 [Qemu-devel] [PATCH v8 00/19] KVM platform device passthrough Eric Auger
2014-11-30 18:35 ` [Qemu-devel] [PATCH v8 01/19] vfio: move hw/misc/vfio.c to hw/vfio/pci.c Move vfio.h into include/hw/vfio Eric Auger
2014-11-30 18:35 ` [Qemu-devel] [PATCH v8 02/19] hw/vfio/pci: Rename VFIODevice into VFIOPCIDevice Eric Auger
2014-11-30 18:35 ` [Qemu-devel] [PATCH v8 03/19] hw/vfio/pci: generalize mask/unmask to any IRQ index Eric Auger
2014-11-30 18:35 ` [Qemu-devel] [PATCH v8 04/19] hw/vfio/pci: introduce minimalist VFIODevice with fd Eric Auger
2014-11-30 18:35 ` [Qemu-devel] [PATCH v8 05/19] hw/vfio/pci: add type, name and group fields in VFIODevice Eric Auger
2014-11-30 18:35 ` [Qemu-devel] [PATCH v8 06/19] hw/vfio/pci: handle reset at VFIODevice Eric Auger
2014-11-30 18:35 ` [Qemu-devel] [PATCH v8 07/19] hw/vfio/pci: Introduce VFIORegion Eric Auger
2014-11-30 18:35 ` [Qemu-devel] [PATCH v8 08/19] hw/vfio/pci: split vfio_get_device Eric Auger
2014-11-30 18:35 ` [Qemu-devel] [PATCH v8 09/19] hw/vfio/pci: rename group_list into vfio_group_list Eric Auger
2014-11-30 18:35 ` [Qemu-devel] [PATCH v8 10/19] hw/vfio/pci: use name field in format strings Eric Auger
2014-11-30 18:35 ` [Qemu-devel] [PATCH v8 11/19] hw/vfio: create common module Eric Auger
2014-11-30 18:35 ` [Qemu-devel] [PATCH v8 12/19] hw/vfio/platform: add vfio-platform support Eric Auger
2014-11-30 18:35 ` [Qemu-devel] [PATCH v8 13/19] hw/vfio: calxeda xgmac device Eric Auger
2014-11-30 18:35 ` [Qemu-devel] [PATCH v8 14/19] hw/arm/virt: add support for VFIO devices Eric Auger
2014-11-30 18:35 ` [Qemu-devel] [PATCH v8 15/19] hw/arm/sysbus-fdt: enable vfio-calxeda-xgmac dynamic instantiation Eric Auger
2014-11-30 18:35 ` [Qemu-devel] [PATCH v8 16/19] hw/vfio/platform: Add irqfd support Eric Auger
2014-11-30 18:35 ` [Qemu-devel] [PATCH v8 17/19] linux-headers: Update KVM headers from linux-next tag ToBeFilled Eric Auger
2014-11-30 18:35 ` [Qemu-devel] [PATCH v8 18/19] hw/vfio/common: vfio_kvm_device_fd moved in the common header Eric Auger
2014-11-30 18:35 ` [Qemu-devel] [PATCH v8 19/19] hw/vfio/platform: add forwarded irq support Eric Auger
2014-12-08 17:04 ` [Qemu-devel] [PATCH v8 00/19] KVM platform device passthrough Alex Williamson
2014-12-08 17:35 ` Eric Auger
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).