* [PATCH v6 01/19] backends: Introduce HostIOMMUDevice abstract
2024-06-03 6:10 [PATCH v6 00/19] Add a host IOMMU device abstraction to check with vIOMMU Zhenzhong Duan
@ 2024-06-03 6:10 ` Zhenzhong Duan
2024-06-03 12:16 ` Cédric Le Goater
2024-06-03 12:46 ` Eric Auger
2024-06-03 6:10 ` [PATCH v6 02/19] vfio/container: Introduce TYPE_HOST_IOMMU_DEVICE_LEGACY_VFIO device Zhenzhong Duan
` (18 subsequent siblings)
19 siblings, 2 replies; 70+ messages in thread
From: Zhenzhong Duan @ 2024-06-03 6:10 UTC
To: qemu-devel
Cc: alex.williamson, clg, eric.auger, mst, peterx, jasowang, jgg,
nicolinc, joao.m.martins, clement.mathieu--drif, kevin.tian,
yi.l.liu, chao.p.peng, Zhenzhong Duan, Paolo Bonzini
Introduce HostIOMMUDevice as an abstraction of host IOMMU device.
Introduce .realize() to initialize HostIOMMUDevice further after
instance init.
Introduce a macro CONFIG_HOST_IOMMU_DEVICE to define the usage
for VFIO, and VDPA in the future.
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
MAINTAINERS | 2 ++
include/sysemu/host_iommu_device.h | 51 ++++++++++++++++++++++++++++++
backends/host_iommu_device.c | 30 ++++++++++++++++++
backends/Kconfig | 5 +++
backends/meson.build | 1 +
5 files changed, 89 insertions(+)
create mode 100644 include/sysemu/host_iommu_device.h
create mode 100644 backends/host_iommu_device.c
diff --git a/MAINTAINERS b/MAINTAINERS
index 448dc951c5..1cf2b25beb 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2196,6 +2196,8 @@ M: Zhenzhong Duan <zhenzhong.duan@intel.com>
S: Supported
F: backends/iommufd.c
F: include/sysemu/iommufd.h
+F: backends/host_iommu_device.c
+F: include/sysemu/host_iommu_device.h
F: include/qemu/chardev_open.h
F: util/chardev_open.c
F: docs/devel/vfio-iommufd.rst
diff --git a/include/sysemu/host_iommu_device.h b/include/sysemu/host_iommu_device.h
new file mode 100644
index 0000000000..2b58a94d62
--- /dev/null
+++ b/include/sysemu/host_iommu_device.h
@@ -0,0 +1,51 @@
+/*
+ * Host IOMMU device abstract declaration
+ *
+ * Copyright (C) 2024 Intel Corporation.
+ *
+ * Authors: Zhenzhong Duan <zhenzhong.duan@intel.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2. See
+ * the COPYING file in the top-level directory.
+ */
+
+#ifndef HOST_IOMMU_DEVICE_H
+#define HOST_IOMMU_DEVICE_H
+
+#include "qom/object.h"
+#include "qapi/error.h"
+
+#define TYPE_HOST_IOMMU_DEVICE "host-iommu-device"
+OBJECT_DECLARE_TYPE(HostIOMMUDevice, HostIOMMUDeviceClass, HOST_IOMMU_DEVICE)
+
+struct HostIOMMUDevice {
+ Object parent_obj;
+};
+
+/**
+ * struct HostIOMMUDeviceClass - The base class for all host IOMMU devices.
+ *
+ * Different type of host devices (e.g., VFIO or VDPA device) or devices
+ * with different backend (e.g., VFIO legacy container or IOMMUFD backend)
+ * can have different sub-classes.
+ */
+struct HostIOMMUDeviceClass {
+ ObjectClass parent_class;
+
+ /**
+ * @realize: initialize host IOMMU device instance further.
+ *
+ * Mandatory callback.
+ *
+ * @hiod: pointer to a host IOMMU device instance.
+ *
+ * @opaque: pointer to agent device of this host IOMMU device,
+ * i.e., for VFIO, pointer to VFIODevice
+ *
+ * @errp: pass an Error out when realize fails.
+ *
+ * Returns: true on success, false on failure.
+ */
+ bool (*realize)(HostIOMMUDevice *hiod, void *opaque, Error **errp);
+};
+#endif
diff --git a/backends/host_iommu_device.c b/backends/host_iommu_device.c
new file mode 100644
index 0000000000..41f2fdce20
--- /dev/null
+++ b/backends/host_iommu_device.c
@@ -0,0 +1,30 @@
+/*
+ * Host IOMMU device abstract
+ *
+ * Copyright (C) 2024 Intel Corporation.
+ *
+ * Authors: Zhenzhong Duan <zhenzhong.duan@intel.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2. See
+ * the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "sysemu/host_iommu_device.h"
+
+OBJECT_DEFINE_ABSTRACT_TYPE(HostIOMMUDevice,
+ host_iommu_device,
+ HOST_IOMMU_DEVICE,
+ OBJECT)
+
+static void host_iommu_device_class_init(ObjectClass *oc, void *data)
+{
+}
+
+static void host_iommu_device_init(Object *obj)
+{
+}
+
+static void host_iommu_device_finalize(Object *obj)
+{
+}
diff --git a/backends/Kconfig b/backends/Kconfig
index 2cb23f62fa..34ab29e994 100644
--- a/backends/Kconfig
+++ b/backends/Kconfig
@@ -3,3 +3,8 @@ source tpm/Kconfig
config IOMMUFD
bool
depends on VFIO
+
+config HOST_IOMMU_DEVICE
+ bool
+ default y
+ depends on VFIO
diff --git a/backends/meson.build b/backends/meson.build
index 8b2b111497..2e975d641e 100644
--- a/backends/meson.build
+++ b/backends/meson.build
@@ -25,6 +25,7 @@ if have_vhost_user
endif
system_ss.add(when: 'CONFIG_VIRTIO_CRYPTO', if_true: files('cryptodev-vhost.c'))
system_ss.add(when: 'CONFIG_IOMMUFD', if_true: files('iommufd.c'))
+system_ss.add(when: 'CONFIG_HOST_IOMMU_DEVICE', if_true: files('host_iommu_device.c'))
if have_vhost_user_crypto
system_ss.add(when: 'CONFIG_VIRTIO_CRYPTO', if_true: files('cryptodev-vhost-user.c'))
endif
--
2.34.1
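For readers less familiar with the pattern, the following is a minimal standalone sketch of what the patch describes: an abstract base class whose sub-classes must supply a .realize() callback that runs after instance init and receives the agent device as an opaque pointer. It deliberately does not use QEMU's QOM macros. Every Toy* name, the agent_id field and the string-based error reporting are assumptions made for illustration, not part of the series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for illustration only; these are NOT the QEMU QOM types. */
typedef struct ToyHostIOMMUDevice ToyHostIOMMUDevice;

typedef struct ToyHostIOMMUDeviceClass {
    const char *name;
    /* Mandatory for concrete sub-classes; left NULL in the abstract base. */
    bool (*realize)(ToyHostIOMMUDevice *hiod, void *opaque, const char **errp);
} ToyHostIOMMUDeviceClass;

struct ToyHostIOMMUDevice {
    const ToyHostIOMMUDeviceClass *klass;
    int agent_id;   /* hypothetical state filled in by realize() */
};

/* Hypothetical agent device, in the role VFIODevice plays for VFIO. */
typedef struct ToyAgent {
    int id;
} ToyAgent;

/* A concrete sub-class implementation of the mandatory callback. */
static bool toy_vfio_realize(ToyHostIOMMUDevice *hiod, void *opaque,
                             const char **errp)
{
    ToyAgent *agent = opaque;

    if (!agent) {
        *errp = "no agent device";
        return false;
    }
    hiod->agent_id = agent->id;
    return true;
}

/* The abstract base provides no realize(). */
static const ToyHostIOMMUDeviceClass toy_base_class = {
    .name = "toy-host-iommu-device",
    .realize = NULL,
};

static const ToyHostIOMMUDeviceClass toy_vfio_class = {
    .name = "toy-host-iommu-device-vfio",
    .realize = toy_vfio_realize,
};

/* Caller side: plain instance init first, then the class's realize(). */
static bool toy_hiod_realize(ToyHostIOMMUDevice *hiod,
                             const ToyHostIOMMUDeviceClass *klass,
                             void *opaque, const char **errp)
{
    assert(hiod && klass && errp);

    hiod->klass = klass;
    hiod->agent_id = -1;
    if (!klass->realize) {
        *errp = "realize callback is mandatory";
        return false;
    }
    return klass->realize(hiod, opaque, errp);
}
```

Returning bool and reporting the reason through the last parameter mirrors the shape of the realize() prototype in the header above. The sketch reports errors as a plain string only to stay self-contained.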
* Re: [PATCH v6 01/19] backends: Introduce HostIOMMUDevice abstract
From: Cédric Le Goater @ 2024-06-03 12:16 UTC
To: Zhenzhong Duan, qemu-devel
Cc: alex.williamson, eric.auger, mst, peterx, jasowang, jgg, nicolinc,
joao.m.martins, clement.mathieu--drif, kevin.tian, yi.l.liu,
chao.p.peng, Paolo Bonzini
On 6/3/24 08:10, Zhenzhong Duan wrote:
> Introduce HostIOMMUDevice as an abstraction of host IOMMU device.
>
> Introduce .realize() to initialize HostIOMMUDevice further after
> instance init.
>
> Introduce a macro CONFIG_HOST_IOMMU_DEVICE to define the usage
> for VFIO, and VDPA in the future.
>
> Suggested-by: Cédric Le Goater <clg@redhat.com>
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> ---
> MAINTAINERS | 2 ++
> include/sysemu/host_iommu_device.h | 51 ++++++++++++++++++++++++++++++
> backends/host_iommu_device.c | 30 ++++++++++++++++++
> backends/Kconfig | 5 +++
> backends/meson.build | 1 +
> 5 files changed, 89 insertions(+)
> create mode 100644 include/sysemu/host_iommu_device.h
> create mode 100644 backends/host_iommu_device.c
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 448dc951c5..1cf2b25beb 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -2196,6 +2196,8 @@ M: Zhenzhong Duan <zhenzhong.duan@intel.com>
> S: Supported
> F: backends/iommufd.c
> F: include/sysemu/iommufd.h
> +F: backends/host_iommu_device.c
> +F: include/sysemu/host_iommu_device.h
> F: include/qemu/chardev_open.h
> F: util/chardev_open.c
> F: docs/devel/vfio-iommufd.rst
> diff --git a/include/sysemu/host_iommu_device.h b/include/sysemu/host_iommu_device.h
> new file mode 100644
> index 0000000000..2b58a94d62
> --- /dev/null
> +++ b/include/sysemu/host_iommu_device.h
> @@ -0,0 +1,51 @@
> +/*
> + * Host IOMMU device abstract declaration
> + *
> + * Copyright (C) 2024 Intel Corporation.
> + *
> + * Authors: Zhenzhong Duan <zhenzhong.duan@intel.com>
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2. See
> + * the COPYING file in the top-level directory.
> + */
> +
> +#ifndef HOST_IOMMU_DEVICE_H
> +#define HOST_IOMMU_DEVICE_H
> +
> +#include "qom/object.h"
> +#include "qapi/error.h"
> +
> +#define TYPE_HOST_IOMMU_DEVICE "host-iommu-device"
> +OBJECT_DECLARE_TYPE(HostIOMMUDevice, HostIOMMUDeviceClass, HOST_IOMMU_DEVICE)
> +
> +struct HostIOMMUDevice {
> + Object parent_obj;
> +};
> +
> +/**
> + * struct HostIOMMUDeviceClass - The base class for all host IOMMU devices.
> + *
> + * Different type of host devices (e.g., VFIO or VDPA device) or devices
> + * with different backend (e.g., VFIO legacy container or IOMMUFD backend)
> + * can have different sub-classes.
> + */
> +struct HostIOMMUDeviceClass {
> + ObjectClass parent_class;
> +
> + /**
> + * @realize: initialize host IOMMU device instance further.
> + *
> + * Mandatory callback.
> + *
> + * @hiod: pointer to a host IOMMU device instance.
> + *
> + * @opaque: pointer to agent device of this host IOMMU device,
> + * i.e., for VFIO, pointer to VFIODevice
> + *
> + * @errp: pass an Error out when realize fails.
> + *
> + * Returns: true on success, false on failure.
> + */
> + bool (*realize)(HostIOMMUDevice *hiod, void *opaque, Error **errp);
> +};
> +#endif
> diff --git a/backends/host_iommu_device.c b/backends/host_iommu_device.c
> new file mode 100644
> index 0000000000..41f2fdce20
> --- /dev/null
> +++ b/backends/host_iommu_device.c
> @@ -0,0 +1,30 @@
> +/*
> + * Host IOMMU device abstract
> + *
> + * Copyright (C) 2024 Intel Corporation.
> + *
> + * Authors: Zhenzhong Duan <zhenzhong.duan@intel.com>
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2. See
> + * the COPYING file in the top-level directory.
> + */
> +
> +#include "qemu/osdep.h"
> +#include "sysemu/host_iommu_device.h"
> +
> +OBJECT_DEFINE_ABSTRACT_TYPE(HostIOMMUDevice,
> + host_iommu_device,
> + HOST_IOMMU_DEVICE,
> + OBJECT)
> +
> +static void host_iommu_device_class_init(ObjectClass *oc, void *data)
> +{
> +}
> +
> +static void host_iommu_device_init(Object *obj)
> +{
> +}
> +
> +static void host_iommu_device_finalize(Object *obj)
> +{
> +}
> diff --git a/backends/Kconfig b/backends/Kconfig
> index 2cb23f62fa..34ab29e994 100644
> --- a/backends/Kconfig
> +++ b/backends/Kconfig
> @@ -3,3 +3,8 @@ source tpm/Kconfig
> config IOMMUFD
> bool
> depends on VFIO
> +
> +config HOST_IOMMU_DEVICE
> + bool
> + default y
> + depends on VFIO
And you can drop HOST_IOMMU_DEVICE config
> diff --git a/backends/meson.build b/backends/meson.build
> index 8b2b111497..2e975d641e 100644
> --- a/backends/meson.build
> +++ b/backends/meson.build
> @@ -25,6 +25,7 @@ if have_vhost_user
> endif
> system_ss.add(when: 'CONFIG_VIRTIO_CRYPTO', if_true: files('cryptodev-vhost.c'))
> system_ss.add(when: 'CONFIG_IOMMUFD', if_true: files('iommufd.c'))
> +system_ss.add(when: 'CONFIG_HOST_IOMMU_DEVICE', if_true: files('host_iommu_device.c'))
and I would move host_iommu_device.c build under host_os == 'linux'
Thanks,
C.
> if have_vhost_user_crypto
> system_ss.add(when: 'CONFIG_VIRTIO_CRYPTO', if_true: files('cryptodev-vhost-user.c'))
> endif
* RE: [PATCH v6 01/19] backends: Introduce HostIOMMUDevice abstract
From: Duan, Zhenzhong @ 2024-06-04 3:10 UTC
To: Cédric Le Goater, qemu-devel@nongnu.org
Cc: alex.williamson@redhat.com, eric.auger@redhat.com, mst@redhat.com,
peterx@redhat.com, jasowang@redhat.com, jgg@nvidia.com,
nicolinc@nvidia.com, joao.m.martins@oracle.com,
clement.mathieu--drif@eviden.com, Tian, Kevin, Liu, Yi L,
Peng, Chao P, Paolo Bonzini
>-----Original Message-----
>From: Cédric Le Goater <clg@redhat.com>
>Subject: Re: [PATCH v6 01/19] backends: Introduce HostIOMMUDevice abstract
>
>On 6/3/24 08:10, Zhenzhong Duan wrote:
>> Introduce HostIOMMUDevice as an abstraction of host IOMMU device.
>>
>> Introduce .realize() to initialize HostIOMMUDevice further after
>> instance init.
>>
>> Introduce a macro CONFIG_HOST_IOMMU_DEVICE to define the usage
>> for VFIO, and VDPA in the future.
>>
>> Suggested-by: Cédric Le Goater <clg@redhat.com>
>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>> ---
>> MAINTAINERS | 2 ++
>> include/sysemu/host_iommu_device.h | 51 ++++++++++++++++++++++++++++++
>> backends/host_iommu_device.c | 30 ++++++++++++++++++
>> backends/Kconfig | 5 +++
>> backends/meson.build | 1 +
>> 5 files changed, 89 insertions(+)
>> create mode 100644 include/sysemu/host_iommu_device.h
>> create mode 100644 backends/host_iommu_device.c
>>
>> diff --git a/MAINTAINERS b/MAINTAINERS
>> index 448dc951c5..1cf2b25beb 100644
>> --- a/MAINTAINERS
>> +++ b/MAINTAINERS
>> @@ -2196,6 +2196,8 @@ M: Zhenzhong Duan <zhenzhong.duan@intel.com>
>> S: Supported
>> F: backends/iommufd.c
>> F: include/sysemu/iommufd.h
>> +F: backends/host_iommu_device.c
>> +F: include/sysemu/host_iommu_device.h
>> F: include/qemu/chardev_open.h
>> F: util/chardev_open.c
>> F: docs/devel/vfio-iommufd.rst
>> diff --git a/include/sysemu/host_iommu_device.h b/include/sysemu/host_iommu_device.h
>> new file mode 100644
>> index 0000000000..2b58a94d62
>> --- /dev/null
>> +++ b/include/sysemu/host_iommu_device.h
>> @@ -0,0 +1,51 @@
>> +/*
>> + * Host IOMMU device abstract declaration
>> + *
>> + * Copyright (C) 2024 Intel Corporation.
>> + *
>> + * Authors: Zhenzhong Duan <zhenzhong.duan@intel.com>
>> + *
>> + * This work is licensed under the terms of the GNU GPL, version 2. See
>> + * the COPYING file in the top-level directory.
>> + */
>> +
>> +#ifndef HOST_IOMMU_DEVICE_H
>> +#define HOST_IOMMU_DEVICE_H
>> +
>> +#include "qom/object.h"
>> +#include "qapi/error.h"
>> +
>> +#define TYPE_HOST_IOMMU_DEVICE "host-iommu-device"
>> +OBJECT_DECLARE_TYPE(HostIOMMUDevice, HostIOMMUDeviceClass, HOST_IOMMU_DEVICE)
>> +
>> +struct HostIOMMUDevice {
>> + Object parent_obj;
>> +};
>> +
>> +/**
>> + * struct HostIOMMUDeviceClass - The base class for all host IOMMU devices.
>> + *
>> + * Different type of host devices (e.g., VFIO or VDPA device) or devices
>> + * with different backend (e.g., VFIO legacy container or IOMMUFD backend)
>> + * can have different sub-classes.
>> + */
>> +struct HostIOMMUDeviceClass {
>> + ObjectClass parent_class;
>> +
>> + /**
>> + * @realize: initialize host IOMMU device instance further.
>> + *
>> + * Mandatory callback.
>> + *
>> + * @hiod: pointer to a host IOMMU device instance.
>> + *
>> + * @opaque: pointer to agent device of this host IOMMU device,
>> + * i.e., for VFIO, pointer to VFIODevice
>> + *
>> + * @errp: pass an Error out when realize fails.
>> + *
>> + * Returns: true on success, false on failure.
>> + */
>> + bool (*realize)(HostIOMMUDevice *hiod, void *opaque, Error **errp);
>> +};
>> +#endif
>> diff --git a/backends/host_iommu_device.c b/backends/host_iommu_device.c
>> new file mode 100644
>> index 0000000000..41f2fdce20
>> --- /dev/null
>> +++ b/backends/host_iommu_device.c
>> @@ -0,0 +1,30 @@
>> +/*
>> + * Host IOMMU device abstract
>> + *
>> + * Copyright (C) 2024 Intel Corporation.
>> + *
>> + * Authors: Zhenzhong Duan <zhenzhong.duan@intel.com>
>> + *
>> + * This work is licensed under the terms of the GNU GPL, version 2. See
>> + * the COPYING file in the top-level directory.
>> + */
>> +
>> +#include "qemu/osdep.h"
>> +#include "sysemu/host_iommu_device.h"
>> +
>> +OBJECT_DEFINE_ABSTRACT_TYPE(HostIOMMUDevice,
>> + host_iommu_device,
>> + HOST_IOMMU_DEVICE,
>> + OBJECT)
>> +
>> +static void host_iommu_device_class_init(ObjectClass *oc, void *data)
>> +{
>> +}
>> +
>> +static void host_iommu_device_init(Object *obj)
>> +{
>> +}
>> +
>> +static void host_iommu_device_finalize(Object *obj)
>> +{
>> +}
>> diff --git a/backends/Kconfig b/backends/Kconfig
>> index 2cb23f62fa..34ab29e994 100644
>> --- a/backends/Kconfig
>> +++ b/backends/Kconfig
>> @@ -3,3 +3,8 @@ source tpm/Kconfig
>> config IOMMUFD
>> bool
>> depends on VFIO
>> +
>> +config HOST_IOMMU_DEVICE
>> + bool
>> + default y
>> + depends on VFIO
>
>And you can drop HOST_IOMMU_DEVICE config
Will do.
>
>> diff --git a/backends/meson.build b/backends/meson.build
>> index 8b2b111497..2e975d641e 100644
>> --- a/backends/meson.build
>> +++ b/backends/meson.build
>> @@ -25,6 +25,7 @@ if have_vhost_user
>> endif
>> system_ss.add(when: 'CONFIG_VIRTIO_CRYPTO', if_true: files('cryptodev-vhost.c'))
>> system_ss.add(when: 'CONFIG_IOMMUFD', if_true: files('iommufd.c'))
>> +system_ss.add(when: 'CONFIG_HOST_IOMMU_DEVICE', if_true: files('host_iommu_device.c'))
>
>and I would move host_iommu_device.c build under host_os == 'linux'
Will do.
Thanks
Zhenzhong
* Re: [PATCH v6 01/19] backends: Introduce HostIOMMUDevice abstract
From: Eric Auger @ 2024-06-03 12:46 UTC
To: Zhenzhong Duan, qemu-devel
Cc: alex.williamson, clg, mst, peterx, jasowang, jgg, nicolinc,
joao.m.martins, clement.mathieu--drif, kevin.tian, yi.l.liu,
chao.p.peng, Paolo Bonzini
On 6/3/24 08:10, Zhenzhong Duan wrote:
> Introduce HostIOMMUDevice as an abstraction of host IOMMU device.
>
> Introduce .realize() to initialize HostIOMMUDevice further after
> instance init.
>
> Introduce a macro CONFIG_HOST_IOMMU_DEVICE to define the usage
> for VFIO, and VDPA in the future.
>
> Suggested-by: Cédric Le Goater <clg@redhat.com>
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> ---
> MAINTAINERS | 2 ++
> include/sysemu/host_iommu_device.h | 51 ++++++++++++++++++++++++++++++
> backends/host_iommu_device.c | 30 ++++++++++++++++++
> backends/Kconfig | 5 +++
> backends/meson.build | 1 +
> 5 files changed, 89 insertions(+)
> create mode 100644 include/sysemu/host_iommu_device.h
> create mode 100644 backends/host_iommu_device.c
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 448dc951c5..1cf2b25beb 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -2196,6 +2196,8 @@ M: Zhenzhong Duan <zhenzhong.duan@intel.com>
> S: Supported
> F: backends/iommufd.c
> F: include/sysemu/iommufd.h
> +F: backends/host_iommu_device.c
> +F: include/sysemu/host_iommu_device.h
> F: include/qemu/chardev_open.h
> F: util/chardev_open.c
> F: docs/devel/vfio-iommufd.rst
> diff --git a/include/sysemu/host_iommu_device.h b/include/sysemu/host_iommu_device.h
> new file mode 100644
> index 0000000000..2b58a94d62
> --- /dev/null
> +++ b/include/sysemu/host_iommu_device.h
> @@ -0,0 +1,51 @@
> +/*
> + * Host IOMMU device abstract declaration
> + *
> + * Copyright (C) 2024 Intel Corporation.
> + *
> + * Authors: Zhenzhong Duan <zhenzhong.duan@intel.com>
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2. See
> + * the COPYING file in the top-level directory.
> + */
> +
> +#ifndef HOST_IOMMU_DEVICE_H
> +#define HOST_IOMMU_DEVICE_H
> +
> +#include "qom/object.h"
> +#include "qapi/error.h"
> +
> +#define TYPE_HOST_IOMMU_DEVICE "host-iommu-device"
> +OBJECT_DECLARE_TYPE(HostIOMMUDevice, HostIOMMUDeviceClass, HOST_IOMMU_DEVICE)
> +
> +struct HostIOMMUDevice {
> + Object parent_obj;
> +};
> +
> +/**
> + * struct HostIOMMUDeviceClass - The base class for all host IOMMU devices.
> + *
> + * Different type of host devices (e.g., VFIO or VDPA device) or devices
s/type/types
> + * with different backend (e.g., VFIO legacy container or IOMMUFD backend)
> + * can have different sub-classes.
will have different implementations of the HostIOMMUDeviceClass?
> + */
> +struct HostIOMMUDeviceClass {
> + ObjectClass parent_class;
> +
> + /**
> + * @realize: initialize host IOMMU device instance further.
> + *
> + * Mandatory callback.
> + *
> + * @hiod: pointer to a host IOMMU device instance.
> + *
> + * @opaque: pointer to agent device of this host IOMMU device,
> + * i.e., for VFIO, pointer to VFIODevice
VFIO base device or VDPA device?
> + *
> + * @errp: pass an Error out when realize fails.
> + *
> + * Returns: true on success, false on failure.
> + */
> + bool (*realize)(HostIOMMUDevice *hiod, void *opaque, Error **errp);
I think I would introduce the get_cap API here as well to give a minimal
consistency to the class API.
> +};
> +#endif
> diff --git a/backends/host_iommu_device.c b/backends/host_iommu_device.c
> new file mode 100644
> index 0000000000..41f2fdce20
> --- /dev/null
> +++ b/backends/host_iommu_device.c
> @@ -0,0 +1,30 @@
> +/*
> + * Host IOMMU device abstract
> + *
> + * Copyright (C) 2024 Intel Corporation.
> + *
> + * Authors: Zhenzhong Duan <zhenzhong.duan@intel.com>
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2. See
> + * the COPYING file in the top-level directory.
> + */
> +
> +#include "qemu/osdep.h"
> +#include "sysemu/host_iommu_device.h"
> +
> +OBJECT_DEFINE_ABSTRACT_TYPE(HostIOMMUDevice,
> + host_iommu_device,
> + HOST_IOMMU_DEVICE,
> + OBJECT)
> +
> +static void host_iommu_device_class_init(ObjectClass *oc, void *data)
> +{
> +}
> +
> +static void host_iommu_device_init(Object *obj)
> +{
> +}
> +
> +static void host_iommu_device_finalize(Object *obj)
> +{
> +}
> diff --git a/backends/Kconfig b/backends/Kconfig
> index 2cb23f62fa..34ab29e994 100644
> --- a/backends/Kconfig
> +++ b/backends/Kconfig
> @@ -3,3 +3,8 @@ source tpm/Kconfig
> config IOMMUFD
> bool
> depends on VFIO
> +
> +config HOST_IOMMU_DEVICE
> + bool
> + default y
> + depends on VFIO
> diff --git a/backends/meson.build b/backends/meson.build
> index 8b2b111497..2e975d641e 100644
> --- a/backends/meson.build
> +++ b/backends/meson.build
> @@ -25,6 +25,7 @@ if have_vhost_user
> endif
> system_ss.add(when: 'CONFIG_VIRTIO_CRYPTO', if_true: files('cryptodev-vhost.c'))
> system_ss.add(when: 'CONFIG_IOMMUFD', if_true: files('iommufd.c'))
> +system_ss.add(when: 'CONFIG_HOST_IOMMU_DEVICE', if_true: files('host_iommu_device.c'))
> if have_vhost_user_crypto
> system_ss.add(when: 'CONFIG_VIRTIO_CRYPTO', if_true: files('cryptodev-vhost-user.c'))
> endif
* RE: [PATCH v6 01/19] backends: Introduce HostIOMMUDevice abstract
From: Duan, Zhenzhong @ 2024-06-04 3:41 UTC
To: eric.auger@redhat.com, qemu-devel@nongnu.org
Cc: alex.williamson@redhat.com, clg@redhat.com, mst@redhat.com,
peterx@redhat.com, jasowang@redhat.com, jgg@nvidia.com,
nicolinc@nvidia.com, joao.m.martins@oracle.com,
clement.mathieu--drif@eviden.com, Tian, Kevin, Liu, Yi L,
Peng, Chao P, Paolo Bonzini
>-----Original Message-----
>From: Eric Auger <eric.auger@redhat.com>
>Subject: Re: [PATCH v6 01/19] backends: Introduce HostIOMMUDevice abstract
>
>
>
>On 6/3/24 08:10, Zhenzhong Duan wrote:
>> Introduce HostIOMMUDevice as an abstraction of host IOMMU device.
>>
>> Introduce .realize() to initialize HostIOMMUDevice further after
>> instance init.
>>
>> Introduce a macro CONFIG_HOST_IOMMU_DEVICE to define the usage
>> for VFIO, and VDPA in the future.
>>
>> Suggested-by: Cédric Le Goater <clg@redhat.com>
>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>> ---
>> MAINTAINERS | 2 ++
>> include/sysemu/host_iommu_device.h | 51 ++++++++++++++++++++++++++++++
>> backends/host_iommu_device.c | 30 ++++++++++++++++++
>> backends/Kconfig | 5 +++
>> backends/meson.build | 1 +
>> 5 files changed, 89 insertions(+)
>> create mode 100644 include/sysemu/host_iommu_device.h
>> create mode 100644 backends/host_iommu_device.c
>>
>> diff --git a/MAINTAINERS b/MAINTAINERS
>> index 448dc951c5..1cf2b25beb 100644
>> --- a/MAINTAINERS
>> +++ b/MAINTAINERS
>> @@ -2196,6 +2196,8 @@ M: Zhenzhong Duan <zhenzhong.duan@intel.com>
>> S: Supported
>> F: backends/iommufd.c
>> F: include/sysemu/iommufd.h
>> +F: backends/host_iommu_device.c
>> +F: include/sysemu/host_iommu_device.h
>> F: include/qemu/chardev_open.h
>> F: util/chardev_open.c
>> F: docs/devel/vfio-iommufd.rst
>> diff --git a/include/sysemu/host_iommu_device.h b/include/sysemu/host_iommu_device.h
>> new file mode 100644
>> index 0000000000..2b58a94d62
>> --- /dev/null
>> +++ b/include/sysemu/host_iommu_device.h
>> @@ -0,0 +1,51 @@
>> +/*
>> + * Host IOMMU device abstract declaration
>> + *
>> + * Copyright (C) 2024 Intel Corporation.
>> + *
>> + * Authors: Zhenzhong Duan <zhenzhong.duan@intel.com>
>> + *
>> + * This work is licensed under the terms of the GNU GPL, version 2. See
>> + * the COPYING file in the top-level directory.
>> + */
>> +
>> +#ifndef HOST_IOMMU_DEVICE_H
>> +#define HOST_IOMMU_DEVICE_H
>> +
>> +#include "qom/object.h"
>> +#include "qapi/error.h"
>> +
>> +#define TYPE_HOST_IOMMU_DEVICE "host-iommu-device"
>> +OBJECT_DECLARE_TYPE(HostIOMMUDevice, HostIOMMUDeviceClass, HOST_IOMMU_DEVICE)
>> +
>> +struct HostIOMMUDevice {
>> + Object parent_obj;
>> +};
>> +
>> +/**
>> + * struct HostIOMMUDeviceClass - The base class for all host IOMMU devices.
>> + *
>> + * Different type of host devices (e.g., VFIO or VDPA device) or devices
>s/type/types
Will fix.
>> + * with different backend (e.g., VFIO legacy container or IOMMUFD backend)
>> + * can have different sub-classes.
>will have different implementations of the HostIOMMUDeviceClass?
Will do.
>> + */
>> +struct HostIOMMUDeviceClass {
>> + ObjectClass parent_class;
>> +
>> + /**
>> + * @realize: initialize host IOMMU device instance further.
>> + *
>> + * Mandatory callback.
>> + *
>> + * @hiod: pointer to a host IOMMU device instance.
>> + *
>> + * @opaque: pointer to agent device of this host IOMMU device,
>> + * i.e., for VFIO, pointer to VFIODevice
>VFIO base device or VDPA device?
Will do.
>> + *
>> + * @errp: pass an Error out when realize fails.
>> + *
>> + * Returns: true on success, false on failure.
>> + */
>> + bool (*realize)(HostIOMMUDevice *hiod, void *opaque, Error **errp);
>
>I think I would introduce the get_cap API here as well to give a minimal
>consistency to the class API.
Ok, will merge patch6 into this one.
[PATCH v6 05/19] backends/host_iommu_device: Introduce HostIOMMUDeviceCaps
Thanks
Zhenzhong
>> +};
>> +#endif
>> diff --git a/backends/host_iommu_device.c b/backends/host_iommu_device.c
>> new file mode 100644
>> index 0000000000..41f2fdce20
>> --- /dev/null
>> +++ b/backends/host_iommu_device.c
>> @@ -0,0 +1,30 @@
>> +/*
>> + * Host IOMMU device abstract
>> + *
>> + * Copyright (C) 2024 Intel Corporation.
>> + *
>> + * Authors: Zhenzhong Duan <zhenzhong.duan@intel.com>
>> + *
>> + * This work is licensed under the terms of the GNU GPL, version 2. See
>> + * the COPYING file in the top-level directory.
>> + */
>> +
>> +#include "qemu/osdep.h"
>> +#include "sysemu/host_iommu_device.h"
>> +
>> +OBJECT_DEFINE_ABSTRACT_TYPE(HostIOMMUDevice,
>> + host_iommu_device,
>> + HOST_IOMMU_DEVICE,
>> + OBJECT)
>> +
>> +static void host_iommu_device_class_init(ObjectClass *oc, void *data)
>> +{
>> +}
>> +
>> +static void host_iommu_device_init(Object *obj)
>> +{
>> +}
>> +
>> +static void host_iommu_device_finalize(Object *obj)
>> +{
>> +}
>> diff --git a/backends/Kconfig b/backends/Kconfig
>> index 2cb23f62fa..34ab29e994 100644
>> --- a/backends/Kconfig
>> +++ b/backends/Kconfig
>> @@ -3,3 +3,8 @@ source tpm/Kconfig
>> config IOMMUFD
>> bool
>> depends on VFIO
>> +
>> +config HOST_IOMMU_DEVICE
>> + bool
>> + default y
>> + depends on VFIO
>> diff --git a/backends/meson.build b/backends/meson.build
>> index 8b2b111497..2e975d641e 100644
>> --- a/backends/meson.build
>> +++ b/backends/meson.build
>> @@ -25,6 +25,7 @@ if have_vhost_user
>> endif
>> system_ss.add(when: 'CONFIG_VIRTIO_CRYPTO', if_true: files('cryptodev-vhost.c'))
>> system_ss.add(when: 'CONFIG_IOMMUFD', if_true: files('iommufd.c'))
>> +system_ss.add(when: 'CONFIG_HOST_IOMMU_DEVICE', if_true: files('host_iommu_device.c'))
>> if have_vhost_user_crypto
>> system_ss.add(when: 'CONFIG_VIRTIO_CRYPTO', if_true: files('cryptodev-vhost-user.c'))
>> endif
* [PATCH v6 02/19] vfio/container: Introduce TYPE_HOST_IOMMU_DEVICE_LEGACY_VFIO device
From: Zhenzhong Duan @ 2024-06-03 6:10 UTC
To: qemu-devel
Cc: alex.williamson, clg, eric.auger, mst, peterx, jasowang, jgg,
nicolinc, joao.m.martins, clement.mathieu--drif, kevin.tian,
yi.l.liu, chao.p.peng, Zhenzhong Duan
TYPE_HOST_IOMMU_DEVICE_LEGACY_VFIO represents a host IOMMU device under
VFIO legacy container backend.
It will have its own realize implementation.
Suggested-by: Eric Auger <eric.auger@redhat.com>
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
include/hw/vfio/vfio-common.h | 3 +++
hw/vfio/container.c | 5 ++++-
2 files changed, 7 insertions(+), 1 deletion(-)
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 4cb1ab8645..75b167979a 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -31,6 +31,7 @@
#endif
#include "sysemu/sysemu.h"
#include "hw/vfio/vfio-container-base.h"
+#include "sysemu/host_iommu_device.h"
#define VFIO_MSG_PREFIX "vfio %s: "
@@ -171,6 +172,8 @@ typedef struct VFIOGroup {
bool ram_block_discard_allowed;
} VFIOGroup;
+#define TYPE_HOST_IOMMU_DEVICE_LEGACY_VFIO TYPE_HOST_IOMMU_DEVICE "-legacy-vfio"
+
typedef struct VFIODMABuf {
QemuDmaBuf *buf;
uint32_t pos_x, pos_y, pos_updates;
diff --git a/hw/vfio/container.c b/hw/vfio/container.c
index 096cc97258..c4fca2dfca 100644
--- a/hw/vfio/container.c
+++ b/hw/vfio/container.c
@@ -1141,7 +1141,10 @@ static const TypeInfo types[] = {
.name = TYPE_VFIO_IOMMU_LEGACY,
.parent = TYPE_VFIO_IOMMU,
.class_init = vfio_iommu_legacy_class_init,
- },
+ }, {
+ .name = TYPE_HOST_IOMMU_DEVICE_LEGACY_VFIO,
+ .parent = TYPE_HOST_IOMMU_DEVICE,
+ }
};
DEFINE_TYPES(types)
--
2.34.1
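A small standalone illustration of how the new type name is formed: the macro in vfio-common.h relies on C's concatenation of adjacent string literals, so the sub-type name is the base name plus a suffix. The two macros and the base string are copied from the patches above; the helper function is hypothetical and exists only to make the result checkable.

```c
#include <assert.h>
#include <string.h>

/* Copied from the patches above. */
#define TYPE_HOST_IOMMU_DEVICE "host-iommu-device"
#define TYPE_HOST_IOMMU_DEVICE_LEGACY_VFIO TYPE_HOST_IOMMU_DEVICE "-legacy-vfio"

/* Hypothetical helper: returns the concatenated type name. */
static const char *legacy_vfio_type_name(void)
{
    const char *name = TYPE_HOST_IOMMU_DEVICE_LEGACY_VFIO;

    /* Adjacent literals are merged at compile time into one string. */
    assert(strlen(name) ==
           strlen(TYPE_HOST_IOMMU_DEVICE) + strlen("-legacy-vfio"));
    return name;
}
```

The shared prefix is only a naming convention. In the patch, the actual inheritance is expressed by the .parent = TYPE_HOST_IOMMU_DEVICE entry in the TypeInfo array.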
* [PATCH v6 03/19] backends/iommufd: Introduce abstract TYPE_HOST_IOMMU_DEVICE_IOMMUFD device
From: Zhenzhong Duan @ 2024-06-03 6:10 UTC
To: qemu-devel
Cc: alex.williamson, clg, eric.auger, mst, peterx, jasowang, jgg,
nicolinc, joao.m.martins, clement.mathieu--drif, kevin.tian,
yi.l.liu, chao.p.peng, Zhenzhong Duan
TYPE_HOST_IOMMU_DEVICE_IOMMUFD represents a host IOMMU device under
iommufd backend.
It will have its own .get_cap() implementation.
Opportunistically, add missed header to include/sysemu/iommufd.h.
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
include/sysemu/iommufd.h | 16 ++++++++++++++++
backends/iommufd.c | 35 ++++++++++++++++++-----------------
2 files changed, 34 insertions(+), 17 deletions(-)
diff --git a/include/sysemu/iommufd.h b/include/sysemu/iommufd.h
index 293bfbe967..f6e6d6e1f9 100644
--- a/include/sysemu/iommufd.h
+++ b/include/sysemu/iommufd.h
@@ -1,9 +1,23 @@
+/*
+ * iommufd container backend declaration
+ *
+ * Copyright (C) 2024 Intel Corporation.
+ * Copyright Red Hat, Inc. 2024
+ *
+ * Authors: Yi Liu <yi.l.liu@intel.com>
+ * Eric Auger <eric.auger@redhat.com>
+ * Zhenzhong Duan <zhenzhong.duan@intel.com>
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
#ifndef SYSEMU_IOMMUFD_H
#define SYSEMU_IOMMUFD_H
#include "qom/object.h"
#include "exec/hwaddr.h"
#include "exec/cpu-common.h"
+#include "sysemu/host_iommu_device.h"
#define TYPE_IOMMUFD_BACKEND "iommufd"
OBJECT_DECLARE_TYPE(IOMMUFDBackend, IOMMUFDBackendClass, IOMMUFD_BACKEND)
@@ -33,4 +47,6 @@ int iommufd_backend_map_dma(IOMMUFDBackend *be, uint32_t ioas_id, hwaddr iova,
ram_addr_t size, void *vaddr, bool readonly);
int iommufd_backend_unmap_dma(IOMMUFDBackend *be, uint32_t ioas_id,
hwaddr iova, ram_addr_t size);
+
+#define TYPE_HOST_IOMMU_DEVICE_IOMMUFD TYPE_HOST_IOMMU_DEVICE "-iommufd"
#endif
diff --git a/backends/iommufd.c b/backends/iommufd.c
index c506afbdac..012f18d8d8 100644
--- a/backends/iommufd.c
+++ b/backends/iommufd.c
@@ -208,23 +208,24 @@ int iommufd_backend_unmap_dma(IOMMUFDBackend *be, uint32_t ioas_id,
return ret;
}
-static const TypeInfo iommufd_backend_info = {
- .name = TYPE_IOMMUFD_BACKEND,
- .parent = TYPE_OBJECT,
- .instance_size = sizeof(IOMMUFDBackend),
- .instance_init = iommufd_backend_init,
- .instance_finalize = iommufd_backend_finalize,
- .class_size = sizeof(IOMMUFDBackendClass),
- .class_init = iommufd_backend_class_init,
- .interfaces = (InterfaceInfo[]) {
- { TYPE_USER_CREATABLE },
- { }
+static const TypeInfo types[] = {
+ {
+ .name = TYPE_IOMMUFD_BACKEND,
+ .parent = TYPE_OBJECT,
+ .instance_size = sizeof(IOMMUFDBackend),
+ .instance_init = iommufd_backend_init,
+ .instance_finalize = iommufd_backend_finalize,
+ .class_size = sizeof(IOMMUFDBackendClass),
+ .class_init = iommufd_backend_class_init,
+ .interfaces = (InterfaceInfo[]) {
+ { TYPE_USER_CREATABLE },
+ { }
+ }
+ }, {
+ .name = TYPE_HOST_IOMMU_DEVICE_IOMMUFD,
+ .parent = TYPE_HOST_IOMMU_DEVICE,
+ .abstract = true,
}
};
-static void register_types(void)
-{
- type_register_static(&iommufd_backend_info);
-}
-
-type_init(register_types);
+DEFINE_TYPES(types)
--
2.34.1
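For readers unfamiliar with abstract types, here is a minimal, self-contained
sketch of what the ".abstract = true" flag above implies. It is a toy model
and not QEMU's QOM API: ToyTypeInfo, toy_find(), toy_can_instantiate() and
toy_is_a() are hypothetical names, and the toy root type has no parent
(in QEMU the root would be TYPE_OBJECT).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a QOM-like type description; not QEMU's TypeInfo. */
typedef struct ToyTypeInfo {
    const char *name;
    const char *parent;   /* NULL for the toy root type */
    bool abstract;        /* abstract types cannot be instantiated */
} ToyTypeInfo;

static const ToyTypeInfo toy_types[] = {
    { "host-iommu-device",              NULL,                        true  },
    { "host-iommu-device-iommufd",      "host-iommu-device",         true  },
    { "host-iommu-device-iommufd-vfio", "host-iommu-device-iommufd", false },
};

/* Look a type up by name; returns NULL if it is not registered. */
static const ToyTypeInfo *toy_find(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < sizeof(toy_types) / sizeof(toy_types[0]); i++) {
        if (strcmp(toy_types[i].name, name) == 0) {
            return &toy_types[i];
        }
    }
    return NULL;
}

/* Only registered, non-abstract types may be instantiated. */
static bool toy_can_instantiate(const char *name)
{
    const ToyTypeInfo *t = toy_find(name);

    return t != NULL && !t->abstract;
}

/* Walk the parent chain to test whether @name derives from @ancestor. */
static bool toy_is_a(const char *name, const char *ancestor)
{
    const ToyTypeInfo *t = toy_find(name);

    assert(ancestor != NULL);
    while (t) {
        if (strcmp(t->name, ancestor) == 0) {
            return true;
        }
        t = t->parent ? toy_find(t->parent) : NULL;
    }
    return false;
}
```

In this toy model the iommufd type only serves as a common base: it cannot be
instantiated itself, while a concrete derived type (such as the VFIO one from
the next patch) can, and still counts as a "host-iommu-device".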
* Re: [PATCH v6 03/19] backends/iommufd: Introduce abstract TYPE_HOST_IOMMU_DEVICE_IOMMUFD device
2024-06-03 6:10 ` [PATCH v6 03/19] backends/iommufd: Introduce abstract TYPE_HOST_IOMMU_DEVICE_IOMMUFD device Zhenzhong Duan
@ 2024-06-03 12:50 ` Eric Auger
2024-06-04 3:43 ` Duan, Zhenzhong
0 siblings, 1 reply; 70+ messages in thread
From: Eric Auger @ 2024-06-03 12:50 UTC (permalink / raw)
To: Zhenzhong Duan, qemu-devel
Cc: alex.williamson, clg, mst, peterx, jasowang, jgg, nicolinc,
joao.m.martins, clement.mathieu--drif, kevin.tian, yi.l.liu,
chao.p.peng
Hi Zhenzhong,
On 6/3/24 08:10, Zhenzhong Duan wrote:
> TYPE_HOST_IOMMU_DEVICE_IOMMUFD represents a host IOMMU device under
> iommufd backend.
>
> It will have its own .get_cap() implementation.
>
> Opportunistically, add missed header to include/sysemu/iommufd.h.
I would explain why it is abstract, i.e. because it is going to be
derived into VFIO- or VDPA-typed devices.
Besides, I think I would simply squash patches 3 and 4.
Thanks
Eric
>
> Suggested-by: Cédric Le Goater <clg@redhat.com>
> Signed-off-by: Yi Liu <yi.l.liu@intel.com>
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> ---
> include/sysemu/iommufd.h | 16 ++++++++++++++++
> backends/iommufd.c | 35 ++++++++++++++++++-----------------
> 2 files changed, 34 insertions(+), 17 deletions(-)
>
> diff --git a/include/sysemu/iommufd.h b/include/sysemu/iommufd.h
> index 293bfbe967..f6e6d6e1f9 100644
> --- a/include/sysemu/iommufd.h
> +++ b/include/sysemu/iommufd.h
> @@ -1,9 +1,23 @@
> +/*
> + * iommufd container backend declaration
> + *
> + * Copyright (C) 2024 Intel Corporation.
> + * Copyright Red Hat, Inc. 2024
> + *
> + * Authors: Yi Liu <yi.l.liu@intel.com>
> + * Eric Auger <eric.auger@redhat.com>
> + * Zhenzhong Duan <zhenzhong.duan@intel.com>
> + *
> + * SPDX-License-Identifier: GPL-2.0-or-later
> + */
> +
> #ifndef SYSEMU_IOMMUFD_H
> #define SYSEMU_IOMMUFD_H
>
> #include "qom/object.h"
> #include "exec/hwaddr.h"
> #include "exec/cpu-common.h"
> +#include "sysemu/host_iommu_device.h"
>
> #define TYPE_IOMMUFD_BACKEND "iommufd"
> OBJECT_DECLARE_TYPE(IOMMUFDBackend, IOMMUFDBackendClass, IOMMUFD_BACKEND)
> @@ -33,4 +47,6 @@ int iommufd_backend_map_dma(IOMMUFDBackend *be, uint32_t ioas_id, hwaddr iova,
> ram_addr_t size, void *vaddr, bool readonly);
> int iommufd_backend_unmap_dma(IOMMUFDBackend *be, uint32_t ioas_id,
> hwaddr iova, ram_addr_t size);
> +
> +#define TYPE_HOST_IOMMU_DEVICE_IOMMUFD TYPE_HOST_IOMMU_DEVICE "-iommufd"
> #endif
> diff --git a/backends/iommufd.c b/backends/iommufd.c
> index c506afbdac..012f18d8d8 100644
> --- a/backends/iommufd.c
> +++ b/backends/iommufd.c
> @@ -208,23 +208,24 @@ int iommufd_backend_unmap_dma(IOMMUFDBackend *be, uint32_t ioas_id,
> return ret;
> }
>
> -static const TypeInfo iommufd_backend_info = {
> - .name = TYPE_IOMMUFD_BACKEND,
> - .parent = TYPE_OBJECT,
> - .instance_size = sizeof(IOMMUFDBackend),
> - .instance_init = iommufd_backend_init,
> - .instance_finalize = iommufd_backend_finalize,
> - .class_size = sizeof(IOMMUFDBackendClass),
> - .class_init = iommufd_backend_class_init,
> - .interfaces = (InterfaceInfo[]) {
> - { TYPE_USER_CREATABLE },
> - { }
> +static const TypeInfo types[] = {
> + {
> + .name = TYPE_IOMMUFD_BACKEND,
> + .parent = TYPE_OBJECT,
> + .instance_size = sizeof(IOMMUFDBackend),
> + .instance_init = iommufd_backend_init,
> + .instance_finalize = iommufd_backend_finalize,
> + .class_size = sizeof(IOMMUFDBackendClass),
> + .class_init = iommufd_backend_class_init,
> + .interfaces = (InterfaceInfo[]) {
> + { TYPE_USER_CREATABLE },
> + { }
> + }
> + }, {
> + .name = TYPE_HOST_IOMMU_DEVICE_IOMMUFD,
> + .parent = TYPE_HOST_IOMMU_DEVICE,
> + .abstract = true,
> }
> };
>
> -static void register_types(void)
> -{
> - type_register_static(&iommufd_backend_info);
> -}
> -
> -type_init(register_types);
> +DEFINE_TYPES(types)
* RE: [PATCH v6 03/19] backends/iommufd: Introduce abstract TYPE_HOST_IOMMU_DEVICE_IOMMUFD device
2024-06-03 12:50 ` Eric Auger
@ 2024-06-04 3:43 ` Duan, Zhenzhong
0 siblings, 0 replies; 70+ messages in thread
From: Duan, Zhenzhong @ 2024-06-04 3:43 UTC (permalink / raw)
To: eric.auger@redhat.com, qemu-devel@nongnu.org
Cc: alex.williamson@redhat.com, clg@redhat.com, mst@redhat.com,
peterx@redhat.com, jasowang@redhat.com, jgg@nvidia.com,
nicolinc@nvidia.com, joao.m.martins@oracle.com,
clement.mathieu--drif@eviden.com, Tian, Kevin, Liu, Yi L,
Peng, Chao P
>-----Original Message-----
>From: Eric Auger <eric.auger@redhat.com>
>Subject: Re: [PATCH v6 03/19] backends/iommufd: Introduce abstract
>TYPE_HOST_IOMMU_DEVICE_IOMMUFD device
>
>
>Hi Zhenzhong,
>On 6/3/24 08:10, Zhenzhong Duan wrote:
>> TYPE_HOST_IOMMU_DEVICE_IOMMUFD represents a host IOMMU device
>under
>> iommufd backend.
>>
>> It will have its own .get_cap() implementation.
>>
>> Opportunistically, add missed header to include/sysemu/iommufd.h.
>
>I would explain why it is abstract, ie. because it is going to be
>derived into VFIO or VDPA type'd device.
Sure.
>
>Besides I think I would simply squash patches 3 and 4
Will do.
Thanks
Zhenzhong
>
>Thanks
>
>Eric
>>
>> Suggested-by: Cédric Le Goater <clg@redhat.com>
>> Signed-off-by: Yi Liu <yi.l.liu@intel.com>
>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>> ---
>> include/sysemu/iommufd.h | 16 ++++++++++++++++
>> backends/iommufd.c | 35 ++++++++++++++++++-----------------
>> 2 files changed, 34 insertions(+), 17 deletions(-)
>>
>> diff --git a/include/sysemu/iommufd.h b/include/sysemu/iommufd.h
>> index 293bfbe967..f6e6d6e1f9 100644
>> --- a/include/sysemu/iommufd.h
>> +++ b/include/sysemu/iommufd.h
>> @@ -1,9 +1,23 @@
>> +/*
>> + * iommufd container backend declaration
>> + *
>> + * Copyright (C) 2024 Intel Corporation.
>> + * Copyright Red Hat, Inc. 2024
>> + *
>> + * Authors: Yi Liu <yi.l.liu@intel.com>
>> + * Eric Auger <eric.auger@redhat.com>
>> + * Zhenzhong Duan <zhenzhong.duan@intel.com>
>> + *
>> + * SPDX-License-Identifier: GPL-2.0-or-later
>> + */
>> +
>> #ifndef SYSEMU_IOMMUFD_H
>> #define SYSEMU_IOMMUFD_H
>>
>> #include "qom/object.h"
>> #include "exec/hwaddr.h"
>> #include "exec/cpu-common.h"
>> +#include "sysemu/host_iommu_device.h"
>>
>> #define TYPE_IOMMUFD_BACKEND "iommufd"
>> OBJECT_DECLARE_TYPE(IOMMUFDBackend, IOMMUFDBackendClass,
>IOMMUFD_BACKEND)
>> @@ -33,4 +47,6 @@ int iommufd_backend_map_dma(IOMMUFDBackend
>*be, uint32_t ioas_id, hwaddr iova,
>> ram_addr_t size, void *vaddr, bool readonly);
>> int iommufd_backend_unmap_dma(IOMMUFDBackend *be, uint32_t
>ioas_id,
>> hwaddr iova, ram_addr_t size);
>> +
>> +#define TYPE_HOST_IOMMU_DEVICE_IOMMUFD
>TYPE_HOST_IOMMU_DEVICE "-iommufd"
>> #endif
>> diff --git a/backends/iommufd.c b/backends/iommufd.c
>> index c506afbdac..012f18d8d8 100644
>> --- a/backends/iommufd.c
>> +++ b/backends/iommufd.c
>> @@ -208,23 +208,24 @@ int
>iommufd_backend_unmap_dma(IOMMUFDBackend *be, uint32_t ioas_id,
>> return ret;
>> }
>>
>> -static const TypeInfo iommufd_backend_info = {
>> - .name = TYPE_IOMMUFD_BACKEND,
>> - .parent = TYPE_OBJECT,
>> - .instance_size = sizeof(IOMMUFDBackend),
>> - .instance_init = iommufd_backend_init,
>> - .instance_finalize = iommufd_backend_finalize,
>> - .class_size = sizeof(IOMMUFDBackendClass),
>> - .class_init = iommufd_backend_class_init,
>> - .interfaces = (InterfaceInfo[]) {
>> - { TYPE_USER_CREATABLE },
>> - { }
>> +static const TypeInfo types[] = {
>> + {
>> + .name = TYPE_IOMMUFD_BACKEND,
>> + .parent = TYPE_OBJECT,
>> + .instance_size = sizeof(IOMMUFDBackend),
>> + .instance_init = iommufd_backend_init,
>> + .instance_finalize = iommufd_backend_finalize,
>> + .class_size = sizeof(IOMMUFDBackendClass),
>> + .class_init = iommufd_backend_class_init,
>> + .interfaces = (InterfaceInfo[]) {
>> + { TYPE_USER_CREATABLE },
>> + { }
>> + }
>> + }, {
>> + .name = TYPE_HOST_IOMMU_DEVICE_IOMMUFD,
>> + .parent = TYPE_HOST_IOMMU_DEVICE,
>> + .abstract = true,
>> }
>> };
>>
>> -static void register_types(void)
>> -{
>> - type_register_static(&iommufd_backend_info);
>> -}
>> -
>> -type_init(register_types);
>> +DEFINE_TYPES(types)
* [PATCH v6 04/19] vfio/iommufd: Introduce TYPE_HOST_IOMMU_DEVICE_IOMMUFD_VFIO device
2024-06-03 6:10 [PATCH v6 00/19] Add a host IOMMU device abstraction to check with vIOMMU Zhenzhong Duan
` (2 preceding siblings ...)
2024-06-03 6:10 ` [PATCH v6 03/19] backends/iommufd: Introduce abstract TYPE_HOST_IOMMU_DEVICE_IOMMUFD device Zhenzhong Duan
@ 2024-06-03 6:10 ` Zhenzhong Duan
2024-06-03 6:10 ` [PATCH v6 05/19] backends/host_iommu_device: Introduce HostIOMMUDeviceCaps Zhenzhong Duan
` (15 subsequent siblings)
19 siblings, 0 replies; 70+ messages in thread
From: Zhenzhong Duan @ 2024-06-03 6:10 UTC (permalink / raw)
To: qemu-devel
Cc: alex.williamson, clg, eric.auger, mst, peterx, jasowang, jgg,
nicolinc, joao.m.martins, clement.mathieu--drif, kevin.tian,
yi.l.liu, chao.p.peng, Zhenzhong Duan
TYPE_HOST_IOMMU_DEVICE_IOMMUFD_VFIO represents a host IOMMU device under
the VFIO iommufd backend. It will be created when a VFIO device is attached
and passed to the vIOMMU.
It will have its own .realize() implementation.
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
include/hw/vfio/vfio-common.h | 3 +++
hw/vfio/iommufd.c | 5 ++++-
2 files changed, 7 insertions(+), 1 deletion(-)
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 75b167979a..56d1717211 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -32,6 +32,7 @@
#include "sysemu/sysemu.h"
#include "hw/vfio/vfio-container-base.h"
#include "sysemu/host_iommu_device.h"
+#include "sysemu/iommufd.h"
#define VFIO_MSG_PREFIX "vfio %s: "
@@ -173,6 +174,8 @@ typedef struct VFIOGroup {
} VFIOGroup;
#define TYPE_HOST_IOMMU_DEVICE_LEGACY_VFIO TYPE_HOST_IOMMU_DEVICE "-legacy-vfio"
+#define TYPE_HOST_IOMMU_DEVICE_IOMMUFD_VFIO \
+ TYPE_HOST_IOMMU_DEVICE_IOMMUFD "-vfio"
typedef struct VFIODMABuf {
QemuDmaBuf *buf;
diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
index 554f9a6292..e4a507d55c 100644
--- a/hw/vfio/iommufd.c
+++ b/hw/vfio/iommufd.c
@@ -624,7 +624,10 @@ static const TypeInfo types[] = {
.name = TYPE_VFIO_IOMMU_IOMMUFD,
.parent = TYPE_VFIO_IOMMU,
.class_init = vfio_iommu_iommufd_class_init,
- },
+ }, {
+ .name = TYPE_HOST_IOMMU_DEVICE_IOMMUFD_VFIO,
+ .parent = TYPE_HOST_IOMMU_DEVICE_IOMMUFD,
+ }
};
DEFINE_TYPES(types)
--
2.34.1
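As a side note on the type names in this patch and the previous one: the
macros rely on C's concatenation of adjacent string literals. The
self-contained sketch below copies the three defines from the series and
checks the resulting strings; type_name_has_prefix() is a hypothetical
helper added only for illustration.

```c
#include <assert.h>
#include <string.h>

#define TYPE_HOST_IOMMU_DEVICE "host-iommu-device"
#define TYPE_HOST_IOMMU_DEVICE_IOMMUFD TYPE_HOST_IOMMU_DEVICE "-iommufd"
#define TYPE_HOST_IOMMU_DEVICE_IOMMUFD_VFIO \
            TYPE_HOST_IOMMU_DEVICE_IOMMUFD "-vfio"

/* Hypothetical helper: returns 1 if @name starts with @prefix, else 0. */
static int type_name_has_prefix(const char *name, const char *prefix)
{
    assert(name != NULL && prefix != NULL);
    return strncmp(name, prefix, strlen(prefix)) == 0;
}
```

Note that the shared prefix is only a naming convention. The actual
inheritance is given by the .parent field of each TypeInfo entry, not by
the name.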
* [PATCH v6 05/19] backends/host_iommu_device: Introduce HostIOMMUDeviceCaps
2024-06-03 6:10 [PATCH v6 00/19] Add a host IOMMU device abstraction to check with vIOMMU Zhenzhong Duan
` (3 preceding siblings ...)
2024-06-03 6:10 ` [PATCH v6 04/19] vfio/iommufd: Introduce TYPE_HOST_IOMMU_DEVICE_IOMMUFD_VFIO device Zhenzhong Duan
@ 2024-06-03 6:10 ` Zhenzhong Duan
2024-06-03 12:40 ` Cédric Le Goater
2024-06-03 12:51 ` Eric Auger
2024-06-03 6:10 ` [PATCH v6 06/19] range: Introduce range_get_last_bit() Zhenzhong Duan
` (14 subsequent siblings)
19 siblings, 2 replies; 70+ messages in thread
From: Zhenzhong Duan @ 2024-06-03 6:10 UTC (permalink / raw)
To: qemu-devel
Cc: alex.williamson, clg, eric.auger, mst, peterx, jasowang, jgg,
nicolinc, joao.m.martins, clement.mathieu--drif, kevin.tian,
yi.l.liu, chao.p.peng, Zhenzhong Duan
HostIOMMUDeviceCaps's elements map to the host IOMMU's capabilities.
Different platform IOMMUs can support different elements.
Currently there are only two elements, type and aw_bits: type indicates
the host platform IOMMU type, e.g., Intel VT-d, ARM SMMU, etc.; aw_bits
indicates the host IOMMU address width.
Introduce a .get_cap() handler to check if HOST_IOMMU_DEVICE_CAP_XXX
is supported.
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
include/sysemu/host_iommu_device.h | 37 ++++++++++++++++++++++++++++++
1 file changed, 37 insertions(+)
diff --git a/include/sysemu/host_iommu_device.h b/include/sysemu/host_iommu_device.h
index 2b58a94d62..d47d1034b1 100644
--- a/include/sysemu/host_iommu_device.h
+++ b/include/sysemu/host_iommu_device.h
@@ -15,11 +15,25 @@
#include "qom/object.h"
#include "qapi/error.h"
+/**
+ * struct HostIOMMUDeviceCaps - Define host IOMMU device capabilities.
+ *
+ * @type: host platform IOMMU type.
+ *
+ * @aw_bits: host IOMMU address width. 0xff if no limitation.
+ */
+typedef struct HostIOMMUDeviceCaps {
+ uint32_t type;
+ uint8_t aw_bits;
+} HostIOMMUDeviceCaps;
+
#define TYPE_HOST_IOMMU_DEVICE "host-iommu-device"
OBJECT_DECLARE_TYPE(HostIOMMUDevice, HostIOMMUDeviceClass, HOST_IOMMU_DEVICE)
struct HostIOMMUDevice {
Object parent_obj;
+
+ HostIOMMUDeviceCaps caps;
};
/**
@@ -47,5 +61,28 @@ struct HostIOMMUDeviceClass {
* Returns: true on success, false on failure.
*/
bool (*realize)(HostIOMMUDevice *hiod, void *opaque, Error **errp);
+ /**
+ * @get_cap: check if a host IOMMU device capability is supported.
+ *
+ * Optional callback, if not implemented, hint not supporting query
+ * of @cap.
+ *
+ * @hiod: pointer to a host IOMMU device instance.
+ *
+ * @cap: capability to check.
+ *
+ * @errp: pass an Error out when fails to query capability.
+ *
+ * Returns: <0 on failure, 0 if a @cap is unsupported, or else
+ * 1 or some positive value for some special @cap,
+ * i.e., HOST_IOMMU_DEVICE_CAP_AW_BITS.
+ */
+ int (*get_cap)(HostIOMMUDevice *hiod, int cap, Error **errp);
};
+
+/*
+ * Host IOMMU device capability list.
+ */
+#define HOST_IOMMU_DEVICE_CAP_IOMMU_TYPE 0
+#define HOST_IOMMU_DEVICE_CAP_AW_BITS 1
#endif
--
2.34.1
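To make the documented return convention of .get_cap() concrete, here is a
minimal standalone sketch. It is not the handler from this series (the real
implementations come in later patches and take a HostIOMMUDevice and an
Error **); sketch_get_cap() and sketch_describe() are hypothetical names,
and returning -EINVAL for an unknown @cap is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define HOST_IOMMU_DEVICE_CAP_IOMMU_TYPE    0
#define HOST_IOMMU_DEVICE_CAP_AW_BITS       1

typedef struct HostIOMMUDeviceCaps {
    uint32_t type;
    uint8_t aw_bits;
} HostIOMMUDeviceCaps;

/*
 * Hypothetical query helper following the documented convention:
 * <0 on failure, 0 if unsupported, positive value otherwise.
 */
static int sketch_get_cap(const HostIOMMUDeviceCaps *caps, int cap)
{
    assert(caps != NULL);
    switch (cap) {
    case HOST_IOMMU_DEVICE_CAP_IOMMU_TYPE:
        return (int)caps->type;
    case HOST_IOMMU_DEVICE_CAP_AW_BITS:
        return caps->aw_bits;
    default:
        /* Assumption: an unknown capability is reported as an error. */
        return -EINVAL;
    }
}

/* Hypothetical caller-side interpretation of the return value. */
static const char *sketch_describe(int ret)
{
    if (ret < 0) {
        return "error";
    }
    return ret == 0 ? "unsupported" : "supported";
}
```

One consequence of this convention is that a capability whose value is 0
cannot be told apart from an unsupported one, so callers need to treat 0 as
"no usable value".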
* Re: [PATCH v6 05/19] backends/host_iommu_device: Introduce HostIOMMUDeviceCaps
2024-06-03 6:10 ` [PATCH v6 05/19] backends/host_iommu_device: Introduce HostIOMMUDeviceCaps Zhenzhong Duan
@ 2024-06-03 12:40 ` Cédric Le Goater
2024-06-03 12:51 ` Eric Auger
1 sibling, 0 replies; 70+ messages in thread
From: Cédric Le Goater @ 2024-06-03 12:40 UTC (permalink / raw)
To: Zhenzhong Duan, qemu-devel
Cc: alex.williamson, eric.auger, mst, peterx, jasowang, jgg, nicolinc,
joao.m.martins, clement.mathieu--drif, kevin.tian, yi.l.liu,
chao.p.peng
On 6/3/24 08:10, Zhenzhong Duan wrote:
> HostIOMMUDeviceCaps's elements map to the host IOMMU's capabilities.
> Different platform IOMMU can support different elements.
>
> Currently only two elements, type and aw_bits, type hints the host
> platform IOMMU type, i.e., INTEL vtd, ARM smmu, etc; aw_bits hints
> host IOMMU address width.
>
> Introduce .get_cap() handler to check if HOST_IOMMU_DEVICE_CAP_XXX
> is supported.
>
> Suggested-by: Cédric Le Goater <clg@redhat.com>
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> ---
> include/sysemu/host_iommu_device.h | 37 ++++++++++++++++++++++++++++++
> 1 file changed, 37 insertions(+)
>
> diff --git a/include/sysemu/host_iommu_device.h b/include/sysemu/host_iommu_device.h
> index 2b58a94d62..d47d1034b1 100644
> --- a/include/sysemu/host_iommu_device.h
> +++ b/include/sysemu/host_iommu_device.h
> @@ -15,11 +15,25 @@
> #include "qom/object.h"
> #include "qapi/error.h"
>
> +/**
> + * struct HostIOMMUDeviceCaps - Define host IOMMU device capabilities.
> + *
> + * @type: host platform IOMMU type.
> + *
> + * @aw_bits: host IOMMU address width. 0xff if no limitation.
Could we introduce a define for the special value 0xff? This would answer
Eric's question.
Thanks,
C.
> + */
> +typedef struct HostIOMMUDeviceCaps {
> + uint32_t type;
> + uint8_t aw_bits;
> +} HostIOMMUDeviceCaps;
> +
> #define TYPE_HOST_IOMMU_DEVICE "host-iommu-device"
> OBJECT_DECLARE_TYPE(HostIOMMUDevice, HostIOMMUDeviceClass, HOST_IOMMU_DEVICE)
>
> struct HostIOMMUDevice {
> Object parent_obj;
> +
> + HostIOMMUDeviceCaps caps;
> };
>
> /**
> @@ -47,5 +61,28 @@ struct HostIOMMUDeviceClass {
> * Returns: true on success, false on failure.
> */
> bool (*realize)(HostIOMMUDevice *hiod, void *opaque, Error **errp);
> + /**
> + * @get_cap: check if a host IOMMU device capability is supported.
> + *
> + * Optional callback, if not implemented, hint not supporting query
> + * of @cap.
> + *
> + * @hiod: pointer to a host IOMMU device instance.
> + *
> + * @cap: capability to check.
> + *
> + * @errp: pass an Error out when fails to query capability.
> + *
> + * Returns: <0 on failure, 0 if a @cap is unsupported, or else
> + * 1 or some positive value for some special @cap,
> + * i.e., HOST_IOMMU_DEVICE_CAP_AW_BITS.
> + */
> + int (*get_cap)(HostIOMMUDevice *hiod, int cap, Error **errp);
> };
> +
> +/*
> + * Host IOMMU device capability list.
> + */
> +#define HOST_IOMMU_DEVICE_CAP_IOMMU_TYPE 0
> +#define HOST_IOMMU_DEVICE_CAP_AW_BITS 1
> #endif
* Re: [PATCH v6 05/19] backends/host_iommu_device: Introduce HostIOMMUDeviceCaps
2024-06-03 6:10 ` [PATCH v6 05/19] backends/host_iommu_device: Introduce HostIOMMUDeviceCaps Zhenzhong Duan
2024-06-03 12:40 ` Cédric Le Goater
@ 2024-06-03 12:51 ` Eric Auger
1 sibling, 0 replies; 70+ messages in thread
From: Eric Auger @ 2024-06-03 12:51 UTC (permalink / raw)
To: Zhenzhong Duan, qemu-devel
Cc: alex.williamson, clg, mst, peterx, jasowang, jgg, nicolinc,
joao.m.martins, clement.mathieu--drif, kevin.tian, yi.l.liu,
chao.p.peng
On 6/3/24 08:10, Zhenzhong Duan wrote:
> HostIOMMUDeviceCaps's elements map to the host IOMMU's capabilities.
> Different platform IOMMU can support different elements.
>
> Currently only two elements, type and aw_bits, type hints the host
> platform IOMMU type, i.e., INTEL vtd, ARM smmu, etc; aw_bits hints
> host IOMMU address width.
>
> Introduce .get_cap() handler to check if HOST_IOMMU_DEVICE_CAP_XXX
> is supported.
>
> Suggested-by: Cédric Le Goater <clg@redhat.com>
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
I would either squash this patch into patch 1 or, if you prefer to keep
it separate, add it just after.
Eric
> ---
> include/sysemu/host_iommu_device.h | 37 ++++++++++++++++++++++++++++++
> 1 file changed, 37 insertions(+)
>
> diff --git a/include/sysemu/host_iommu_device.h b/include/sysemu/host_iommu_device.h
> index 2b58a94d62..d47d1034b1 100644
> --- a/include/sysemu/host_iommu_device.h
> +++ b/include/sysemu/host_iommu_device.h
> @@ -15,11 +15,25 @@
> #include "qom/object.h"
> #include "qapi/error.h"
>
> +/**
> + * struct HostIOMMUDeviceCaps - Define host IOMMU device capabilities.
> + *
> + * @type: host platform IOMMU type.
> + *
> + * @aw_bits: host IOMMU address width. 0xff if no limitation.
> + */
> +typedef struct HostIOMMUDeviceCaps {
> + uint32_t type;
> + uint8_t aw_bits;
> +} HostIOMMUDeviceCaps;
> +
> #define TYPE_HOST_IOMMU_DEVICE "host-iommu-device"
> OBJECT_DECLARE_TYPE(HostIOMMUDevice, HostIOMMUDeviceClass, HOST_IOMMU_DEVICE)
>
> struct HostIOMMUDevice {
> Object parent_obj;
> +
> + HostIOMMUDeviceCaps caps;
> };
>
> /**
> @@ -47,5 +61,28 @@ struct HostIOMMUDeviceClass {
> * Returns: true on success, false on failure.
> */
> bool (*realize)(HostIOMMUDevice *hiod, void *opaque, Error **errp);
> + /**
> + * @get_cap: check if a host IOMMU device capability is supported.
> + *
> + * Optional callback, if not implemented, hint not supporting query
> + * of @cap.
> + *
> + * @hiod: pointer to a host IOMMU device instance.
> + *
> + * @cap: capability to check.
> + *
> + * @errp: pass an Error out when fails to query capability.
> + *
> + * Returns: <0 on failure, 0 if a @cap is unsupported, or else
> + * 1 or some positive value for some special @cap,
> + * i.e., HOST_IOMMU_DEVICE_CAP_AW_BITS.
> + */
> + int (*get_cap)(HostIOMMUDevice *hiod, int cap, Error **errp);
> };
> +
> +/*
> + * Host IOMMU device capability list.
> + */
> +#define HOST_IOMMU_DEVICE_CAP_IOMMU_TYPE 0
> +#define HOST_IOMMU_DEVICE_CAP_AW_BITS 1
> #endif
* [PATCH v6 06/19] range: Introduce range_get_last_bit()
2024-06-03 6:10 [PATCH v6 00/19] Add a host IOMMU device abstraction to check with vIOMMU Zhenzhong Duan
` (4 preceding siblings ...)
2024-06-03 6:10 ` [PATCH v6 05/19] backends/host_iommu_device: Introduce HostIOMMUDeviceCaps Zhenzhong Duan
@ 2024-06-03 6:10 ` Zhenzhong Duan
2024-06-03 12:57 ` Eric Auger
2024-06-03 6:10 ` [PATCH v6 07/19] vfio/container: Implement HostIOMMUDeviceClass::realize() handler Zhenzhong Duan
` (13 subsequent siblings)
19 siblings, 1 reply; 70+ messages in thread
From: Zhenzhong Duan @ 2024-06-03 6:10 UTC (permalink / raw)
To: qemu-devel
Cc: alex.williamson, clg, eric.auger, mst, peterx, jasowang, jgg,
nicolinc, joao.m.martins, clement.mathieu--drif, kevin.tian,
yi.l.liu, chao.p.peng, Zhenzhong Duan
This helper gets the position of the highest set bit of the upper bound.
If the range is empty or the upper bound is zero, -1 is returned.
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
include/qemu/range.h | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/include/qemu/range.h b/include/qemu/range.h
index 205e1da76d..4ce694a398 100644
--- a/include/qemu/range.h
+++ b/include/qemu/range.h
@@ -20,6 +20,8 @@
#ifndef QEMU_RANGE_H
#define QEMU_RANGE_H
+#include "qemu/bitops.h"
+
/*
* Operations on 64 bit address ranges.
* Notes:
@@ -217,6 +219,15 @@ static inline int ranges_overlap(uint64_t first1, uint64_t len1,
return !(last2 < first1 || last1 < first2);
}
+/* Get highest non-zero bit position of a range */
+static inline int range_get_last_bit(Range *range)
+{
+ if (range_is_empty(range)) {
+ return -1;
+ }
+ return 63 - clz64(range->upb);
+}
+
/*
* Return -1 if @a < @b, 1 @a > @b, and 0 if they touch or overlap.
* Both @a and @b must not be empty.
--
2.34.1
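For experimenting with the helper outside of the QEMU tree, the following is
a minimal standalone sketch of the same computation. SketchRange,
sketch_range_is_empty() and sketch_clz64() are simplified stand-ins and not
the QEMU definitions; the sketch assumes an inclusive [lob, upb] range that
is empty when lob > upb, and a count-leading-zeros that returns 64 for 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for a range with inclusive bounds [lob, upb]. */
typedef struct SketchRange {
    uint64_t lob;
    uint64_t upb;
} SketchRange;

/* Assumption: the range is empty when the lower bound exceeds the upper. */
static bool sketch_range_is_empty(const SketchRange *range)
{
    return range->lob > range->upb;
}

/* Portable count-leading-zeros; returns 64 when @val is 0. */
static int sketch_clz64(uint64_t val)
{
    int n = 0;

    if (val == 0) {
        return 64;
    }
    while (!(val & (UINT64_C(1) << 63))) {
        val <<= 1;
        n++;
    }
    return n;
}

/* Position of the highest set bit of the upper bound, or -1. */
static int sketch_range_get_last_bit(const SketchRange *range)
{
    assert(range != NULL);
    if (sketch_range_is_empty(range)) {
        return -1;
    }
    /* For upb == 0 this yields 63 - 64 = -1. */
    return 63 - sketch_clz64(range->upb);
}
```

For example, a range ending at 0xffffffff gives 31, so the corresponding
address width is 31 + 1 = 32 bits.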
* Re: [PATCH v6 06/19] range: Introduce range_get_last_bit()
2024-06-03 6:10 ` [PATCH v6 06/19] range: Introduce range_get_last_bit() Zhenzhong Duan
@ 2024-06-03 12:57 ` Eric Auger
0 siblings, 0 replies; 70+ messages in thread
From: Eric Auger @ 2024-06-03 12:57 UTC (permalink / raw)
To: Zhenzhong Duan, qemu-devel
Cc: alex.williamson, clg, mst, peterx, jasowang, jgg, nicolinc,
joao.m.martins, clement.mathieu--drif, kevin.tian, yi.l.liu,
chao.p.peng
On 6/3/24 08:10, Zhenzhong Duan wrote:
> This helper get the highest 1 bit position of the upper bound.
>
> If the range is empty or upper bound is zero, -1 is returned.
>
> Suggested-by: Cédric Le Goater <clg@redhat.com>
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Eric
> ---
> include/qemu/range.h | 11 +++++++++++
> 1 file changed, 11 insertions(+)
>
> diff --git a/include/qemu/range.h b/include/qemu/range.h
> index 205e1da76d..4ce694a398 100644
> --- a/include/qemu/range.h
> +++ b/include/qemu/range.h
> @@ -20,6 +20,8 @@
> #ifndef QEMU_RANGE_H
> #define QEMU_RANGE_H
>
> +#include "qemu/bitops.h"
> +
> /*
> * Operations on 64 bit address ranges.
> * Notes:
> @@ -217,6 +219,15 @@ static inline int ranges_overlap(uint64_t first1, uint64_t len1,
> return !(last2 < first1 || last1 < first2);
> }
>
> +/* Get highest non-zero bit position of a range */
> +static inline int range_get_last_bit(Range *range)
> +{
> + if (range_is_empty(range)) {
> + return -1;
> + }
> + return 63 - clz64(range->upb);
> +}
> +
> /*
> * Return -1 if @a < @b, 1 @a > @b, and 0 if they touch or overlap.
> * Both @a and @b must not be empty.
* [PATCH v6 07/19] vfio/container: Implement HostIOMMUDeviceClass::realize() handler
2024-06-03 6:10 [PATCH v6 00/19] Add a host IOMMU device abstraction to check with vIOMMU Zhenzhong Duan
` (5 preceding siblings ...)
2024-06-03 6:10 ` [PATCH v6 06/19] range: Introduce range_get_last_bit() Zhenzhong Duan
@ 2024-06-03 6:10 ` Zhenzhong Duan
2024-06-03 11:23 ` Eric Auger
2024-06-03 6:10 ` [PATCH v6 08/19] backends/iommufd: Introduce helper function iommufd_backend_get_device_info() Zhenzhong Duan
` (12 subsequent siblings)
19 siblings, 1 reply; 70+ messages in thread
From: Zhenzhong Duan @ 2024-06-03 6:10 UTC (permalink / raw)
To: qemu-devel
Cc: alex.williamson, clg, eric.auger, mst, peterx, jasowang, jgg,
nicolinc, joao.m.martins, clement.mathieu--drif, kevin.tian,
yi.l.liu, chao.p.peng, Zhenzhong Duan
Utilize range_get_last_bit() to get the host IOMMU address width and
package it in HostIOMMUDeviceCaps for query with .get_cap().
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
hw/vfio/container.c | 26 ++++++++++++++++++++++++++
1 file changed, 26 insertions(+)
diff --git a/hw/vfio/container.c b/hw/vfio/container.c
index c4fca2dfca..48800fe92f 100644
--- a/hw/vfio/container.c
+++ b/hw/vfio/container.c
@@ -1136,6 +1136,31 @@ static void vfio_iommu_legacy_class_init(ObjectClass *klass, void *data)
vioc->pci_hot_reset = vfio_legacy_pci_hot_reset;
};
+static bool hiod_legacy_vfio_realize(HostIOMMUDevice *hiod, void *opaque,
+ Error **errp)
+{
+ VFIODevice *vdev = opaque;
+ /* iova_ranges is a sorted list */
+ GList *l = g_list_last(vdev->bcontainer->iova_ranges);
+
+ /* Only support query HOST_IOMMU_DEVICE_CAP_AW_BITS with legacy backend */
+ if (l) {
+ Range *range = l->data;
+ hiod->caps.aw_bits = range_get_last_bit(range) + 1;
+ } else {
+ hiod->caps.aw_bits = 0xff;
+ }
+
+ return true;
+}
+
+static void hiod_legacy_vfio_class_init(ObjectClass *oc, void *data)
+{
+ HostIOMMUDeviceClass *hioc = HOST_IOMMU_DEVICE_CLASS(oc);
+
+ hioc->realize = hiod_legacy_vfio_realize;
+};
+
static const TypeInfo types[] = {
{
.name = TYPE_VFIO_IOMMU_LEGACY,
@@ -1144,6 +1169,7 @@ static const TypeInfo types[] = {
}, {
.name = TYPE_HOST_IOMMU_DEVICE_LEGACY_VFIO,
.parent = TYPE_HOST_IOMMU_DEVICE,
+ .class_init = hiod_legacy_vfio_class_init,
}
};
--
2.34.1
* Re: [PATCH v6 07/19] vfio/container: Implement HostIOMMUDeviceClass::realize() handler
2024-06-03 6:10 ` [PATCH v6 07/19] vfio/container: Implement HostIOMMUDeviceClass::realize() handler Zhenzhong Duan
@ 2024-06-03 11:23 ` Eric Auger
2024-06-04 2:45 ` Duan, Zhenzhong
0 siblings, 1 reply; 70+ messages in thread
From: Eric Auger @ 2024-06-03 11:23 UTC (permalink / raw)
To: Zhenzhong Duan, qemu-devel
Cc: alex.williamson, clg, mst, peterx, jasowang, jgg, nicolinc,
joao.m.martins, clement.mathieu--drif, kevin.tian, yi.l.liu,
chao.p.peng
Hi Zhenzhong,
On 6/3/24 08:10, Zhenzhong Duan wrote:
> Utilize range_get_last_bit() to get host IOMMU address width and
> package it in HostIOMMUDeviceCaps for query with .get_cap().
>
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> ---
> hw/vfio/container.c | 26 ++++++++++++++++++++++++++
> 1 file changed, 26 insertions(+)
>
> diff --git a/hw/vfio/container.c b/hw/vfio/container.c
> index c4fca2dfca..48800fe92f 100644
> --- a/hw/vfio/container.c
> +++ b/hw/vfio/container.c
> @@ -1136,6 +1136,31 @@ static void vfio_iommu_legacy_class_init(ObjectClass *klass, void *data)
> vioc->pci_hot_reset = vfio_legacy_pci_hot_reset;
> };
>
> +static bool hiod_legacy_vfio_realize(HostIOMMUDevice *hiod, void *opaque,
> + Error **errp)
> +{
> + VFIODevice *vdev = opaque;
> + /* iova_ranges is a sorted list */
> + GList *l = g_list_last(vdev->bcontainer->iova_ranges);
> +
> + /* Only support query HOST_IOMMU_DEVICE_CAP_AW_BITS with legacy backend */
I don't get the comment as HOST_IOMMU_DEVICE_CAP_AW_BITS support seems
to be introduced in [PATCH v6 11/19] backends/iommufd: Implement
HostIOMMUDeviceClass::get_cap() handler
> + if (l) {
> + Range *range = l->data;
> + hiod->caps.aw_bits = range_get_last_bit(range) + 1;
> + } else {
> + hiod->caps.aw_bits = 0xff;
why this value?
> + }
> +
> + return true;
> +}
> +
> +static void hiod_legacy_vfio_class_init(ObjectClass *oc, void *data)
> +{
> + HostIOMMUDeviceClass *hioc = HOST_IOMMU_DEVICE_CLASS(oc);
> +
> + hioc->realize = hiod_legacy_vfio_realize;
> +};
> +
> static const TypeInfo types[] = {
> {
> .name = TYPE_VFIO_IOMMU_LEGACY,
> @@ -1144,6 +1169,7 @@ static const TypeInfo types[] = {
> }, {
> .name = TYPE_HOST_IOMMU_DEVICE_LEGACY_VFIO,
> .parent = TYPE_HOST_IOMMU_DEVICE,
> + .class_init = hiod_legacy_vfio_class_init,
> }
> };
>
Thanks
Eric
* RE: [PATCH v6 07/19] vfio/container: Implement HostIOMMUDeviceClass::realize() handler
2024-06-03 11:23 ` Eric Auger
@ 2024-06-04 2:45 ` Duan, Zhenzhong
2024-06-04 7:45 ` Eric Auger
0 siblings, 1 reply; 70+ messages in thread
From: Duan, Zhenzhong @ 2024-06-04 2:45 UTC (permalink / raw)
To: eric.auger@redhat.com, qemu-devel@nongnu.org
Cc: alex.williamson@redhat.com, clg@redhat.com, mst@redhat.com,
peterx@redhat.com, jasowang@redhat.com, jgg@nvidia.com,
nicolinc@nvidia.com, joao.m.martins@oracle.com,
clement.mathieu--drif@eviden.com, Tian, Kevin, Liu, Yi L,
Peng, Chao P
Hi Eric,
>-----Original Message-----
>From: Eric Auger <eric.auger@redhat.com>
>Subject: Re: [PATCH v6 07/19] vfio/container: Implement
>HostIOMMUDeviceClass::realize() handler
>
>Hi Zhenzhong,
>
>On 6/3/24 08:10, Zhenzhong Duan wrote:
>> Utilize range_get_last_bit() to get host IOMMU address width and
>> package it in HostIOMMUDeviceCaps for query with .get_cap().
>>
>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>> ---
>> hw/vfio/container.c | 26 ++++++++++++++++++++++++++
>> 1 file changed, 26 insertions(+)
>>
>> diff --git a/hw/vfio/container.c b/hw/vfio/container.c
>> index c4fca2dfca..48800fe92f 100644
>> --- a/hw/vfio/container.c
>> +++ b/hw/vfio/container.c
>> @@ -1136,6 +1136,31 @@ static void
>vfio_iommu_legacy_class_init(ObjectClass *klass, void *data)
>> vioc->pci_hot_reset = vfio_legacy_pci_hot_reset;
>> };
>>
>> +static bool hiod_legacy_vfio_realize(HostIOMMUDevice *hiod, void
>*opaque,
>> + Error **errp)
>> +{
>> + VFIODevice *vdev = opaque;
>> + /* iova_ranges is a sorted list */
>> + GList *l = g_list_last(vdev->bcontainer->iova_ranges);
>> +
>> + /* Only support query HOST_IOMMU_DEVICE_CAP_AW_BITS with
>legacy backend */
>I don't get the comment as HOST_IOMMU_DEVICE_CAP_AW_BITS support
>seems
>to be introduced in [PATCH v6 11/19] backends/iommufd: Implement
>HostIOMMUDeviceClass::get_cap() handler
Sorry about my poor English; I mean the legacy backend only supports
HOST_IOMMU_DEVICE_CAP_AW_BITS, no other caps.
Maybe just:
/* Only support query HOST_IOMMU_DEVICE_CAP_AW_BITS */
>> + if (l) {
>> + Range *range = l->data;
>> + hiod->caps.aw_bits = range_get_last_bit(range) + 1;
>> + } else {
>> + hiod->caps.aw_bits = 0xff;
>why this value?
0xff means there is no limitation on aw_bits from the host side, so the
aw_bits check should always pass. This can happen when an old kernel
doesn't support querying iova ranges.
Will add a define like:
#define HOST_IOMMU_DEVICE_CAP_AW_BITS_MAX 0xff
Thanks
Zhenzhong
>> + }
>> +
>> + return true;
>> +}
>> +
>> +static void hiod_legacy_vfio_class_init(ObjectClass *oc, void *data)
>> +{
>> + HostIOMMUDeviceClass *hioc = HOST_IOMMU_DEVICE_CLASS(oc);
>> +
>> + hioc->realize = hiod_legacy_vfio_realize;
>> +};
>> +
>> static const TypeInfo types[] = {
>> {
>> .name = TYPE_VFIO_IOMMU_LEGACY,
>> @@ -1144,6 +1169,7 @@ static const TypeInfo types[] = {
>> }, {
>> .name = TYPE_HOST_IOMMU_DEVICE_LEGACY_VFIO,
>> .parent = TYPE_HOST_IOMMU_DEVICE,
>> + .class_init = hiod_legacy_vfio_class_init,
>> }
>> };
>>
>
>Thanks
>
>Eric
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [PATCH v6 07/19] vfio/container: Implement HostIOMMUDeviceClass::realize() handler
2024-06-04 2:45 ` Duan, Zhenzhong
@ 2024-06-04 7:45 ` Eric Auger
2024-06-04 7:59 ` Duan, Zhenzhong
0 siblings, 1 reply; 70+ messages in thread
From: Eric Auger @ 2024-06-04 7:45 UTC (permalink / raw)
To: Duan, Zhenzhong, qemu-devel@nongnu.org
Cc: alex.williamson@redhat.com, clg@redhat.com, mst@redhat.com,
peterx@redhat.com, jasowang@redhat.com, jgg@nvidia.com,
nicolinc@nvidia.com, joao.m.martins@oracle.com,
clement.mathieu--drif@eviden.com, Tian, Kevin, Liu, Yi L,
Peng, Chao P
Hi Zhenzhong,
On 6/4/24 04:45, Duan, Zhenzhong wrote:
> Hi Eric,
>
>> -----Original Message-----
>> From: Eric Auger <eric.auger@redhat.com>
>> Subject: Re: [PATCH v6 07/19] vfio/container: Implement
>> HostIOMMUDeviceClass::realize() handler
>>
>> Hi Zhenzhong,
>>
>> On 6/3/24 08:10, Zhenzhong Duan wrote:
>>> Utilize range_get_last_bit() to get host IOMMU address width and
>>> package it in HostIOMMUDeviceCaps for query with .get_cap().
>>>
>>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>>> ---
>>> hw/vfio/container.c | 26 ++++++++++++++++++++++++++
>>> 1 file changed, 26 insertions(+)
>>>
>>> diff --git a/hw/vfio/container.c b/hw/vfio/container.c
>>> index c4fca2dfca..48800fe92f 100644
>>> --- a/hw/vfio/container.c
>>> +++ b/hw/vfio/container.c
>>> @@ -1136,6 +1136,31 @@ static void
>> vfio_iommu_legacy_class_init(ObjectClass *klass, void *data)
>>> vioc->pci_hot_reset = vfio_legacy_pci_hot_reset;
>>> };
>>>
>>> +static bool hiod_legacy_vfio_realize(HostIOMMUDevice *hiod, void
>> *opaque,
>>> + Error **errp)
>>> +{
>>> + VFIODevice *vdev = opaque;
>>> + /* iova_ranges is a sorted list */
>>> + GList *l = g_list_last(vdev->bcontainer->iova_ranges);
>>> +
>>> + /* Only support query HOST_IOMMU_DEVICE_CAP_AW_BITS with
>> legacy backend */
>> I don't get the comment as HOST_IOMMU_DEVICE_CAP_AW_BITS support
>> seems
>> to be introduced in [PATCH v6 11/19] backends/iommufd: Implement
>> HostIOMMUDeviceClass::get_cap() handler
> Sorry about my poor English, I mean legacy backend only support
> HOST_IOMMU_DEVICE_CAP_AW_BITS, no other caps.
> May be only:
>
> /* Only support query HOST_IOMMU_DEVICE_CAP_AW_BITS */
no problem. Then I would put this comment in the commit msg instead.
Something like "the realize function populates the capabilities. For now
only the aw_bits caps is computed".
>
>>> + if (l) {
>>> + Range *range = l->data;
>>> + hiod->caps.aw_bits = range_get_last_bit(range) + 1;
>>> + } else {
>>> + hiod->caps.aw_bits = 0xff;
>> why this value?
> 0xff means no limitation on aw_bits from host side. Aw_bits check
> should always pass. This could be a case that an old kernel doesn't
> support query iova ranges.
>
> Will add a define like:
>
> #define HOST_IOMMU_DEVICE_CAP_AW_BITS_MAX 0xff
Wouldn't 64 bits be a better choice? Also maybe add a comment explaining
that iova_ranges may be void for old kernels that do not support the query?
Eric
>
> Thanks
> Zhenzhong
>
>>> + }
>>> +
>>> + return true;
>>> +}
>>> +
>>> +static void hiod_legacy_vfio_class_init(ObjectClass *oc, void *data)
>>> +{
>>> + HostIOMMUDeviceClass *hioc = HOST_IOMMU_DEVICE_CLASS(oc);
>>> +
>>> + hioc->realize = hiod_legacy_vfio_realize;
>>> +};
>>> +
>>> static const TypeInfo types[] = {
>>> {
>>> .name = TYPE_VFIO_IOMMU_LEGACY,
>>> @@ -1144,6 +1169,7 @@ static const TypeInfo types[] = {
>>> }, {
>>> .name = TYPE_HOST_IOMMU_DEVICE_LEGACY_VFIO,
>>> .parent = TYPE_HOST_IOMMU_DEVICE,
>>> + .class_init = hiod_legacy_vfio_class_init,
>>> }
>>> };
>>>
>> Thanks
>>
>> Eric
^ permalink raw reply [flat|nested] 70+ messages in thread
* RE: [PATCH v6 07/19] vfio/container: Implement HostIOMMUDeviceClass::realize() handler
2024-06-04 7:45 ` Eric Auger
@ 2024-06-04 7:59 ` Duan, Zhenzhong
0 siblings, 0 replies; 70+ messages in thread
From: Duan, Zhenzhong @ 2024-06-04 7:59 UTC (permalink / raw)
To: eric.auger@redhat.com, qemu-devel@nongnu.org
Cc: alex.williamson@redhat.com, clg@redhat.com, mst@redhat.com,
peterx@redhat.com, jasowang@redhat.com, jgg@nvidia.com,
nicolinc@nvidia.com, joao.m.martins@oracle.com,
clement.mathieu--drif@eviden.com, Tian, Kevin, Liu, Yi L,
Peng, Chao P
>-----Original Message-----
>From: Eric Auger <eric.auger@redhat.com>
>Subject: Re: [PATCH v6 07/19] vfio/container: Implement
>HostIOMMUDeviceClass::realize() handler
>
>Hi Zhenzhong,
>
>On 6/4/24 04:45, Duan, Zhenzhong wrote:
>> Hi Eric,
>>
>>> -----Original Message-----
>>> From: Eric Auger <eric.auger@redhat.com>
>>> Subject: Re: [PATCH v6 07/19] vfio/container: Implement
>>> HostIOMMUDeviceClass::realize() handler
>>>
>>> Hi Zhenzhong,
>>>
>>> On 6/3/24 08:10, Zhenzhong Duan wrote:
>>>> Utilize range_get_last_bit() to get host IOMMU address width and
>>>> package it in HostIOMMUDeviceCaps for query with .get_cap().
>>>>
>>>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>>>> ---
>>>> hw/vfio/container.c | 26 ++++++++++++++++++++++++++
>>>> 1 file changed, 26 insertions(+)
>>>>
>>>> diff --git a/hw/vfio/container.c b/hw/vfio/container.c
>>>> index c4fca2dfca..48800fe92f 100644
>>>> --- a/hw/vfio/container.c
>>>> +++ b/hw/vfio/container.c
>>>> @@ -1136,6 +1136,31 @@ static void
>>> vfio_iommu_legacy_class_init(ObjectClass *klass, void *data)
>>>> vioc->pci_hot_reset = vfio_legacy_pci_hot_reset;
>>>> };
>>>>
>>>> +static bool hiod_legacy_vfio_realize(HostIOMMUDevice *hiod, void
>>> *opaque,
>>>> + Error **errp)
>>>> +{
>>>> + VFIODevice *vdev = opaque;
>>>> + /* iova_ranges is a sorted list */
>>>> + GList *l = g_list_last(vdev->bcontainer->iova_ranges);
>>>> +
>>>> + /* Only support query HOST_IOMMU_DEVICE_CAP_AW_BITS with
>>> legacy backend */
>>> I don't get the comment as HOST_IOMMU_DEVICE_CAP_AW_BITS support seems
>>> to be introduced in [PATCH v6 11/19] backends/iommufd: Implement
>>> HostIOMMUDeviceClass::get_cap() handler
>> Sorry about my poor English, I mean legacy backend only support
>> HOST_IOMMU_DEVICE_CAP_AW_BITS, no other caps.
>> May be only:
>>
>> /* Only support query HOST_IOMMU_DEVICE_CAP_AW_BITS */
>no problem. Then I would put this comment in the commit msg instead.
>Something like "the realize function populates the capabilities. For now
>only the aw_bits caps is computed".
Sure.
>
>
>>
>>>> + if (l) {
>>>> + Range *range = l->data;
>>>> + hiod->caps.aw_bits = range_get_last_bit(range) + 1;
>>>> + } else {
>>>> + hiod->caps.aw_bits = 0xff;
>>> why this value?
>> 0xff means no limitation on aw_bits from host side. Aw_bits check
>> should always pass. This could be a case that an old kernel doesn't
>> support query iova ranges.
>>
>> Will add a define like:
>>
>> #define HOST_IOMMU_DEVICE_CAP_AW_BITS_MAX 0xff
>Wouldn't 64 bits be a better choice?
Yes, 64 bits is large enough, will do it.
> Also maybe add a comment explaining
>that iova_ranges may be void for old kernels that do not support the query?
Sure.
Thanks
Zhenzhong
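
To make the outcome of this sub-thread concrete, here is a minimal sketch of deriving aw_bits from the last entry of a sorted IOVA range list, with a fallback of 64 when the list is empty (for example on an old kernel that cannot report ranges). It is not the code from the patch: IovaRange, AW_BITS_MAX, range_last_bit() and aw_bits_from_ranges() are hypothetical names that stand in for QEMU's Range, the proposed define and range_get_last_bit(), and __builtin_clzll assumes GCC or Clang.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for QEMU's Range: an inclusive [lob, upb] interval. */
typedef struct {
    uint64_t lob;
    uint64_t upb;
} IovaRange;

/* Illustrative fallback: no limit is known from the host side. */
#define AW_BITS_MAX 64

/* Index of the highest set bit of the upper bound, or -1 if upb is 0. */
static int range_last_bit(const IovaRange *r)
{
    if (r->upb == 0) {
        return -1;
    }
    /* GCC/Clang builtin; undefined for 0, which is handled above. */
    return 63 - __builtin_clzll((unsigned long long)r->upb);
}

/*
 * ranges must be sorted by address, so the last entry holds the highest
 * usable IOVA. An empty list means the host could not report ranges.
 */
static int aw_bits_from_ranges(const IovaRange *ranges, size_t n)
{
    if (n == 0) {
        return AW_BITS_MAX;
    }
    assert(ranges != NULL);
    return range_last_bit(&ranges[n - 1]) + 1;
}
```

With a last range ending at 0xffffffffffff the sketch yields 48, and with no ranges it yields 64, so a width check of the form "vIOMMU width <= host width" cannot fail in the fallback case.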
^ permalink raw reply [flat|nested] 70+ messages in thread
* [PATCH v6 08/19] backends/iommufd: Introduce helper function iommufd_backend_get_device_info()
2024-06-03 6:10 [PATCH v6 00/19] Add a host IOMMU device abstraction to check with vIOMMU Zhenzhong Duan
` (6 preceding siblings ...)
2024-06-03 6:10 ` [PATCH v6 07/19] vfio/container: Implement HostIOMMUDeviceClass::realize() handler Zhenzhong Duan
@ 2024-06-03 6:10 ` Zhenzhong Duan
2024-06-03 6:10 ` [PATCH v6 09/19] vfio/iommufd: Implement HostIOMMUDeviceClass::realize() handler Zhenzhong Duan
` (11 subsequent siblings)
19 siblings, 0 replies; 70+ messages in thread
From: Zhenzhong Duan @ 2024-06-03 6:10 UTC (permalink / raw)
To: qemu-devel
Cc: alex.williamson, clg, eric.auger, mst, peterx, jasowang, jgg,
nicolinc, joao.m.martins, clement.mathieu--drif, kevin.tian,
yi.l.liu, chao.p.peng, Zhenzhong Duan, Yi Sun
Introduce a helper function iommufd_backend_get_device_info() to get
host IOMMU related information through iommufd uAPI.
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
include/sysemu/iommufd.h | 3 +++
backends/iommufd.c | 22 ++++++++++++++++++++++
2 files changed, 25 insertions(+)
diff --git a/include/sysemu/iommufd.h b/include/sysemu/iommufd.h
index f6e6d6e1f9..9edfec6045 100644
--- a/include/sysemu/iommufd.h
+++ b/include/sysemu/iommufd.h
@@ -47,6 +47,9 @@ int iommufd_backend_map_dma(IOMMUFDBackend *be, uint32_t ioas_id, hwaddr iova,
ram_addr_t size, void *vaddr, bool readonly);
int iommufd_backend_unmap_dma(IOMMUFDBackend *be, uint32_t ioas_id,
hwaddr iova, ram_addr_t size);
+bool iommufd_backend_get_device_info(IOMMUFDBackend *be, uint32_t devid,
+ uint32_t *type, void *data, uint32_t len,
+ Error **errp);
#define TYPE_HOST_IOMMU_DEVICE_IOMMUFD TYPE_HOST_IOMMU_DEVICE "-iommufd"
#endif
diff --git a/backends/iommufd.c b/backends/iommufd.c
index 012f18d8d8..c7e969d6f7 100644
--- a/backends/iommufd.c
+++ b/backends/iommufd.c
@@ -208,6 +208,28 @@ int iommufd_backend_unmap_dma(IOMMUFDBackend *be, uint32_t ioas_id,
return ret;
}
+bool iommufd_backend_get_device_info(IOMMUFDBackend *be, uint32_t devid,
+ uint32_t *type, void *data, uint32_t len,
+ Error **errp)
+{
+ struct iommu_hw_info info = {
+ .size = sizeof(info),
+ .dev_id = devid,
+ .data_len = len,
+ .data_uptr = (uintptr_t)data,
+ };
+
+ if (ioctl(be->fd, IOMMU_GET_HW_INFO, &info)) {
+ error_setg_errno(errp, errno, "Failed to get hardware info");
+ return false;
+ }
+
+ g_assert(type);
+ *type = info.out_data_type;
+
+ return true;
+}
+
static const TypeInfo types[] = {
{
.name = TYPE_IOMMUFD_BACKEND,
--
2.34.1
^ permalink raw reply related [flat|nested] 70+ messages in thread
* [PATCH v6 09/19] vfio/iommufd: Implement HostIOMMUDeviceClass::realize() handler
2024-06-03 6:10 [PATCH v6 00/19] Add a host IOMMU device abstraction to check with vIOMMU Zhenzhong Duan
` (7 preceding siblings ...)
2024-06-03 6:10 ` [PATCH v6 08/19] backends/iommufd: Introduce helper function iommufd_backend_get_device_info() Zhenzhong Duan
@ 2024-06-03 6:10 ` Zhenzhong Duan
2024-06-03 11:30 ` Eric Auger
2024-06-06 9:26 ` Eric Auger
2024-06-03 6:10 ` [PATCH v6 10/19] vfio/container: Implement HostIOMMUDeviceClass::get_cap() handler Zhenzhong Duan
` (10 subsequent siblings)
19 siblings, 2 replies; 70+ messages in thread
From: Zhenzhong Duan @ 2024-06-03 6:10 UTC (permalink / raw)
To: qemu-devel
Cc: alex.williamson, clg, eric.auger, mst, peterx, jasowang, jgg,
nicolinc, joao.m.martins, clement.mathieu--drif, kevin.tian,
yi.l.liu, chao.p.peng, Zhenzhong Duan, Marcel Apfelbaum
It calls iommufd_backend_get_device_info() to get host IOMMU
related information and translate it into HostIOMMUDeviceCaps
for query with .get_cap().
Introduce macro VTD_MGAW_FROM_CAP to get MGAW, which equals
(aw_bits - 1).
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
include/hw/i386/intel_iommu.h | 1 +
hw/vfio/iommufd.c | 37 +++++++++++++++++++++++++++++++++++
2 files changed, 38 insertions(+)
diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
index 7fa0a695c8..7d694b0813 100644
--- a/include/hw/i386/intel_iommu.h
+++ b/include/hw/i386/intel_iommu.h
@@ -47,6 +47,7 @@ OBJECT_DECLARE_SIMPLE_TYPE(IntelIOMMUState, INTEL_IOMMU_DEVICE)
#define VTD_HOST_AW_48BIT 48
#define VTD_HOST_ADDRESS_WIDTH VTD_HOST_AW_39BIT
#define VTD_HAW_MASK(aw) ((1ULL << (aw)) - 1)
+#define VTD_MGAW_FROM_CAP(cap) (((cap) >> 16) & 0x3fULL)
#define DMAR_REPORT_F_INTR (1)
diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
index e4a507d55c..9d2e95e20e 100644
--- a/hw/vfio/iommufd.c
+++ b/hw/vfio/iommufd.c
@@ -25,6 +25,7 @@
#include "qemu/cutils.h"
#include "qemu/chardev_open.h"
#include "pci.h"
+#include "hw/i386/intel_iommu_internal.h"
static int iommufd_cdev_map(const VFIOContainerBase *bcontainer, hwaddr iova,
ram_addr_t size, void *vaddr, bool readonly)
@@ -619,6 +620,41 @@ static void vfio_iommu_iommufd_class_init(ObjectClass *klass, void *data)
vioc->pci_hot_reset = iommufd_cdev_pci_hot_reset;
};
+static bool hiod_iommufd_vfio_realize(HostIOMMUDevice *hiod, void *opaque,
+ Error **errp)
+{
+ VFIODevice *vdev = opaque;
+ HostIOMMUDeviceCaps *caps = &hiod->caps;
+ enum iommu_hw_info_type type;
+ union {
+ struct iommu_hw_info_vtd vtd;
+ } data;
+
+ if (!iommufd_backend_get_device_info(vdev->iommufd, vdev->devid,
+ &type, &data, sizeof(data), errp)) {
+ return false;
+ }
+
+ caps->type = type;
+
+ switch (type) {
+ case IOMMU_HW_INFO_TYPE_INTEL_VTD:
+ caps->aw_bits = VTD_MGAW_FROM_CAP(data.vtd.cap_reg) + 1;
+ break;
+ case IOMMU_HW_INFO_TYPE_NONE:
+ break;
+ }
+
+ return true;
+}
+
+static void hiod_iommufd_vfio_class_init(ObjectClass *oc, void *data)
+{
+ HostIOMMUDeviceClass *hiodc = HOST_IOMMU_DEVICE_CLASS(oc);
+
+ hiodc->realize = hiod_iommufd_vfio_realize;
+};
+
static const TypeInfo types[] = {
{
.name = TYPE_VFIO_IOMMU_IOMMUFD,
@@ -627,6 +663,7 @@ static const TypeInfo types[] = {
}, {
.name = TYPE_HOST_IOMMU_DEVICE_IOMMUFD_VFIO,
.parent = TYPE_HOST_IOMMU_DEVICE_IOMMUFD,
+ .class_init = hiod_iommufd_vfio_class_init,
}
};
--
2.34.1
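
As a worked example of the MGAW arithmetic described in the commit message above, the sketch below decodes the address width from a VT-d capability register value. It assumes only what the patch's macro states: MGAW sits at bits 21:16 and stores the width minus one. The names MGAW_SHIFT, MGAW_MASK, vtd_aw_bits_from_cap() and vtd_cap_with_aw_bits() are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* MGAW field position, matching the shift and mask used by the patch's macro. */
#define MGAW_SHIFT 16
#define MGAW_MASK  0x3fULL

/* The field holds (width - 1), so add one to get the address width in bits. */
static inline unsigned vtd_aw_bits_from_cap(uint64_t cap)
{
    return (unsigned)((cap >> MGAW_SHIFT) & MGAW_MASK) + 1;
}

/* Illustration helper: build a register value that advertises aw_bits. */
static inline uint64_t vtd_cap_with_aw_bits(unsigned aw_bits)
{
    assert(aw_bits >= 1 && aw_bits <= 64);
    return (uint64_t)(aw_bits - 1) << MGAW_SHIFT;
}
```

For instance, an MGAW field of 0x2f (47) decodes to a 48-bit width, and 0x26 (38) decodes to 39 bits. Bits outside 21:16 do not affect the result.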
^ permalink raw reply related [flat|nested] 70+ messages in thread
* Re: [PATCH v6 09/19] vfio/iommufd: Implement HostIOMMUDeviceClass::realize() handler
2024-06-03 6:10 ` [PATCH v6 09/19] vfio/iommufd: Implement HostIOMMUDeviceClass::realize() handler Zhenzhong Duan
@ 2024-06-03 11:30 ` Eric Auger
2024-06-04 2:58 ` Duan, Zhenzhong
2024-06-06 9:26 ` Eric Auger
1 sibling, 1 reply; 70+ messages in thread
From: Eric Auger @ 2024-06-03 11:30 UTC (permalink / raw)
To: Zhenzhong Duan, qemu-devel
Cc: alex.williamson, clg, mst, peterx, jasowang, jgg, nicolinc,
joao.m.martins, clement.mathieu--drif, kevin.tian, yi.l.liu,
chao.p.peng, Marcel Apfelbaum
Hi Zhenzhong,
On 6/3/24 08:10, Zhenzhong Duan wrote:
> It calls iommufd_backend_get_device_info() to get host IOMMU
> related information and translate it into HostIOMMUDeviceCaps
> for query with .get_cap().
>
> Introduce macro VTD_MGAW_FROM_CAP to get MGAW which equals to
> (aw_bits - 1).
>
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> ---
> include/hw/i386/intel_iommu.h | 1 +
> hw/vfio/iommufd.c | 37 +++++++++++++++++++++++++++++++++++
> 2 files changed, 38 insertions(+)
>
> diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
> index 7fa0a695c8..7d694b0813 100644
> --- a/include/hw/i386/intel_iommu.h
> +++ b/include/hw/i386/intel_iommu.h
> @@ -47,6 +47,7 @@ OBJECT_DECLARE_SIMPLE_TYPE(IntelIOMMUState, INTEL_IOMMU_DEVICE)
> #define VTD_HOST_AW_48BIT 48
> #define VTD_HOST_ADDRESS_WIDTH VTD_HOST_AW_39BIT
> #define VTD_HAW_MASK(aw) ((1ULL << (aw)) - 1)
> +#define VTD_MGAW_FROM_CAP(cap) ((cap >> 16) & 0x3fULL)
>
> #define DMAR_REPORT_F_INTR (1)
>
> diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
> index e4a507d55c..9d2e95e20e 100644
> --- a/hw/vfio/iommufd.c
> +++ b/hw/vfio/iommufd.c
> @@ -25,6 +25,7 @@
> #include "qemu/cutils.h"
> #include "qemu/chardev_open.h"
> #include "pci.h"
> +#include "hw/i386/intel_iommu_internal.h"
>
> static int iommufd_cdev_map(const VFIOContainerBase *bcontainer, hwaddr iova,
> ram_addr_t size, void *vaddr, bool readonly)
> @@ -619,6 +620,41 @@ static void vfio_iommu_iommufd_class_init(ObjectClass *klass, void *data)
> vioc->pci_hot_reset = iommufd_cdev_pci_hot_reset;
> };
>
> +static bool hiod_iommufd_vfio_realize(HostIOMMUDevice *hiod, void *opaque,
> + Error **errp)
> +{
> + VFIODevice *vdev = opaque;
> + HostIOMMUDeviceCaps *caps = &hiod->caps;
> + enum iommu_hw_info_type type;
> + union {
> + struct iommu_hw_info_vtd vtd;
> + } data;
> +
> + if (!iommufd_backend_get_device_info(vdev->iommufd, vdev->devid,
> + &type, &data, sizeof(data), errp)) {
> + return false;
> + }
> +
> + caps->type = type;
> +
> + switch (type) {
> + case IOMMU_HW_INFO_TYPE_INTEL_VTD:
> + caps->aw_bits = VTD_MGAW_FROM_CAP(data.vtd.cap_reg) + 1;
Please can you remind me of why you can't reuse the iova_ranges method.
isn't it generic enough?
> + break;
> + case IOMMU_HW_INFO_TYPE_NONE:
so what about other types?
Eric
> + break;
> + }
> +
> + return true;
> +}
> +
> +static void hiod_iommufd_vfio_class_init(ObjectClass *oc, void *data)
> +{
> + HostIOMMUDeviceClass *hiodc = HOST_IOMMU_DEVICE_CLASS(oc);
> +
> + hiodc->realize = hiod_iommufd_vfio_realize;
> +};
> +
> static const TypeInfo types[] = {
> {
> .name = TYPE_VFIO_IOMMU_IOMMUFD,
> @@ -627,6 +663,7 @@ static const TypeInfo types[] = {
> }, {
> .name = TYPE_HOST_IOMMU_DEVICE_IOMMUFD_VFIO,
> .parent = TYPE_HOST_IOMMU_DEVICE_IOMMUFD,
> + .class_init = hiod_iommufd_vfio_class_init,
> }
> };
>
^ permalink raw reply [flat|nested] 70+ messages in thread
* RE: [PATCH v6 09/19] vfio/iommufd: Implement HostIOMMUDeviceClass::realize() handler
2024-06-03 11:30 ` Eric Auger
@ 2024-06-04 2:58 ` Duan, Zhenzhong
2024-06-04 7:31 ` Eric Auger
0 siblings, 1 reply; 70+ messages in thread
From: Duan, Zhenzhong @ 2024-06-04 2:58 UTC (permalink / raw)
To: eric.auger@redhat.com, qemu-devel@nongnu.org
Cc: alex.williamson@redhat.com, clg@redhat.com, mst@redhat.com,
peterx@redhat.com, jasowang@redhat.com, jgg@nvidia.com,
nicolinc@nvidia.com, joao.m.martins@oracle.com,
clement.mathieu--drif@eviden.com, Tian, Kevin, Liu, Yi L,
Peng, Chao P, Marcel Apfelbaum
>-----Original Message-----
>From: Eric Auger <eric.auger@redhat.com>
>Subject: Re: [PATCH v6 09/19] vfio/iommufd: Implement
>HostIOMMUDeviceClass::realize() handler
>
>Hi Zhenzhong,
>
>On 6/3/24 08:10, Zhenzhong Duan wrote:
>> It calls iommufd_backend_get_device_info() to get host IOMMU
>> related information and translate it into HostIOMMUDeviceCaps
>> for query with .get_cap().
>>
>> Introduce macro VTD_MGAW_FROM_CAP to get MGAW which equals to
>> (aw_bits - 1).
>>
>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>> ---
>> include/hw/i386/intel_iommu.h | 1 +
>> hw/vfio/iommufd.c | 37
>+++++++++++++++++++++++++++++++++++
>> 2 files changed, 38 insertions(+)
>>
>> diff --git a/include/hw/i386/intel_iommu.h
>b/include/hw/i386/intel_iommu.h
>> index 7fa0a695c8..7d694b0813 100644
>> --- a/include/hw/i386/intel_iommu.h
>> +++ b/include/hw/i386/intel_iommu.h
>> @@ -47,6 +47,7 @@ OBJECT_DECLARE_SIMPLE_TYPE(IntelIOMMUState,
>INTEL_IOMMU_DEVICE)
>> #define VTD_HOST_AW_48BIT 48
>> #define VTD_HOST_ADDRESS_WIDTH VTD_HOST_AW_39BIT
>> #define VTD_HAW_MASK(aw) ((1ULL << (aw)) - 1)
>> +#define VTD_MGAW_FROM_CAP(cap) ((cap >> 16) & 0x3fULL)
>>
>> #define DMAR_REPORT_F_INTR (1)
>>
>> diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
>> index e4a507d55c..9d2e95e20e 100644
>> --- a/hw/vfio/iommufd.c
>> +++ b/hw/vfio/iommufd.c
>> @@ -25,6 +25,7 @@
>> #include "qemu/cutils.h"
>> #include "qemu/chardev_open.h"
>> #include "pci.h"
>> +#include "hw/i386/intel_iommu_internal.h"
>>
>> static int iommufd_cdev_map(const VFIOContainerBase *bcontainer,
>hwaddr iova,
>> ram_addr_t size, void *vaddr, bool readonly)
>> @@ -619,6 +620,41 @@ static void
>vfio_iommu_iommufd_class_init(ObjectClass *klass, void *data)
>> vioc->pci_hot_reset = iommufd_cdev_pci_hot_reset;
>> };
>>
>> +static bool hiod_iommufd_vfio_realize(HostIOMMUDevice *hiod, void
>*opaque,
>> + Error **errp)
>> +{
>> + VFIODevice *vdev = opaque;
>> + HostIOMMUDeviceCaps *caps = &hiod->caps;
>> + enum iommu_hw_info_type type;
>> + union {
>> + struct iommu_hw_info_vtd vtd;
>> + } data;
>> +
>> + if (!iommufd_backend_get_device_info(vdev->iommufd, vdev->devid,
>> + &type, &data, sizeof(data), errp)) {
>> + return false;
>> + }
>> +
>> + caps->type = type;
>> +
>> + switch (type) {
>> + case IOMMU_HW_INFO_TYPE_INTEL_VTD:
>> + caps->aw_bits = VTD_MGAW_FROM_CAP(data.vtd.cap_reg) + 1;
>Please can you remind me of why you can't reuse the iova_ranges method.
>isn't it generic enough?
Yes, iova_ranges method is only for iova_ranges, we want to make
HostIOMMUDevice.get_cap() a common interface.
When we want to pass iova_ranges, we can add HOST_IOMMU_DEVICE_CAP_IOVA_RANGES
and HostIOMMUDevice.iova_ranges.
>> + break;
>> + case IOMMU_HW_INFO_TYPE_NONE:
>so what about other types?
There are no other types for now. When there are, we can easily add the code:
case IOMMU_HW_INFO_TYPE_ARM_SMMU:
caps->aw_bits = xxx;
Thanks
Zhenzhong
>
>Eric
>> + break;
>> + }
>> +
>> + return true;
>> +}
>> +
>> +static void hiod_iommufd_vfio_class_init(ObjectClass *oc, void *data)
>> +{
>> + HostIOMMUDeviceClass *hiodc = HOST_IOMMU_DEVICE_CLASS(oc);
>> +
>> + hiodc->realize = hiod_iommufd_vfio_realize;
>> +};
>> +
>> static const TypeInfo types[] = {
>> {
>> .name = TYPE_VFIO_IOMMU_IOMMUFD,
>> @@ -627,6 +663,7 @@ static const TypeInfo types[] = {
>> }, {
>> .name = TYPE_HOST_IOMMU_DEVICE_IOMMUFD_VFIO,
>> .parent = TYPE_HOST_IOMMU_DEVICE_IOMMUFD,
>> + .class_init = hiod_iommufd_vfio_class_init,
>> }
>> };
>>
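
On the "what about other types?" question, one possible policy is sketched below: decode the width for the one known type and fall back to "no host limit" for everything else. This is an assumption for illustration, not what the patch does (the patch leaves aw_bits untouched for IOMMU_HW_INFO_TYPE_NONE and has no default branch). The enum, AW_BITS_UNLIMITED and aw_bits_for_type() are hypothetical names; the numeric values 0 and 1 mirror the uAPI's NONE and INTEL_VTD types.

```c
#include <stdint.h>

/* Hypothetical mirror of the hardware info types used in the patch. */
enum hw_info_type {
    HW_INFO_TYPE_NONE = 0,
    HW_INFO_TYPE_INTEL_VTD = 1,
};

/* Illustrative "no limit from the host side" value. */
#define AW_BITS_UNLIMITED 64

/*
 * vtd_cap_reg is only meaningful for the VT-d type; it is ignored otherwise.
 * Unknown or absent types get the permissive fallback (an assumed policy).
 */
static unsigned aw_bits_for_type(uint32_t type, uint64_t vtd_cap_reg)
{
    switch (type) {
    case HW_INFO_TYPE_INTEL_VTD:
        /* MGAW lives in bits 21:16 and stores (width - 1). */
        return (unsigned)((vtd_cap_reg >> 16) & 0x3fULL) + 1;
    case HW_INFO_TYPE_NONE:
    default:
        return AW_BITS_UNLIMITED;
    }
}
```

A default branch like this keeps the function total as new types are added to the uAPI, at the cost of silently accepting types whose real limit is unknown.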
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [PATCH v6 09/19] vfio/iommufd: Implement HostIOMMUDeviceClass::realize() handler
2024-06-04 2:58 ` Duan, Zhenzhong
@ 2024-06-04 7:31 ` Eric Auger
2024-06-04 7:51 ` Duan, Zhenzhong
0 siblings, 1 reply; 70+ messages in thread
From: Eric Auger @ 2024-06-04 7:31 UTC (permalink / raw)
To: Duan, Zhenzhong, qemu-devel@nongnu.org
Cc: alex.williamson@redhat.com, clg@redhat.com, mst@redhat.com,
peterx@redhat.com, jasowang@redhat.com, jgg@nvidia.com,
nicolinc@nvidia.com, joao.m.martins@oracle.com,
clement.mathieu--drif@eviden.com, Tian, Kevin, Liu, Yi L,
Peng, Chao P, Marcel Apfelbaum
On 6/4/24 04:58, Duan, Zhenzhong wrote:
>
>> -----Original Message-----
>> From: Eric Auger <eric.auger@redhat.com>
>> Subject: Re: [PATCH v6 09/19] vfio/iommufd: Implement
>> HostIOMMUDeviceClass::realize() handler
>>
>> Hi Zhenzhong,
>>
>> On 6/3/24 08:10, Zhenzhong Duan wrote:
>>> It calls iommufd_backend_get_device_info() to get host IOMMU
>>> related information and translate it into HostIOMMUDeviceCaps
>>> for query with .get_cap().
>>>
>>> Introduce macro VTD_MGAW_FROM_CAP to get MGAW which equals to
>>> (aw_bits - 1).
>>>
>>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>>> ---
>>> include/hw/i386/intel_iommu.h | 1 +
>>> hw/vfio/iommufd.c | 37
>> +++++++++++++++++++++++++++++++++++
>>> 2 files changed, 38 insertions(+)
>>>
>>> diff --git a/include/hw/i386/intel_iommu.h
>> b/include/hw/i386/intel_iommu.h
>>> index 7fa0a695c8..7d694b0813 100644
>>> --- a/include/hw/i386/intel_iommu.h
>>> +++ b/include/hw/i386/intel_iommu.h
>>> @@ -47,6 +47,7 @@ OBJECT_DECLARE_SIMPLE_TYPE(IntelIOMMUState,
>> INTEL_IOMMU_DEVICE)
>>> #define VTD_HOST_AW_48BIT 48
>>> #define VTD_HOST_ADDRESS_WIDTH VTD_HOST_AW_39BIT
>>> #define VTD_HAW_MASK(aw) ((1ULL << (aw)) - 1)
>>> +#define VTD_MGAW_FROM_CAP(cap) ((cap >> 16) & 0x3fULL)
>>>
>>> #define DMAR_REPORT_F_INTR (1)
>>>
>>> diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
>>> index e4a507d55c..9d2e95e20e 100644
>>> --- a/hw/vfio/iommufd.c
>>> +++ b/hw/vfio/iommufd.c
>>> @@ -25,6 +25,7 @@
>>> #include "qemu/cutils.h"
>>> #include "qemu/chardev_open.h"
>>> #include "pci.h"
>>> +#include "hw/i386/intel_iommu_internal.h"
>>>
>>> static int iommufd_cdev_map(const VFIOContainerBase *bcontainer,
>> hwaddr iova,
>>> ram_addr_t size, void *vaddr, bool readonly)
>>> @@ -619,6 +620,41 @@ static void
>> vfio_iommu_iommufd_class_init(ObjectClass *klass, void *data)
>>> vioc->pci_hot_reset = iommufd_cdev_pci_hot_reset;
>>> };
>>>
>>> +static bool hiod_iommufd_vfio_realize(HostIOMMUDevice *hiod, void
>> *opaque,
>>> + Error **errp)
>>> +{
>>> + VFIODevice *vdev = opaque;
>>> + HostIOMMUDeviceCaps *caps = &hiod->caps;
>>> + enum iommu_hw_info_type type;
>>> + union {
>>> + struct iommu_hw_info_vtd vtd;
>>> + } data;
>>> +
>>> + if (!iommufd_backend_get_device_info(vdev->iommufd, vdev->devid,
>>> + &type, &data, sizeof(data), errp)) {
>>> + return false;
>>> + }
>>> +
>>> + caps->type = type;
>>> +
>>> + switch (type) {
>>> + case IOMMU_HW_INFO_TYPE_INTEL_VTD:
>>> + caps->aw_bits = VTD_MGAW_FROM_CAP(data.vtd.cap_reg) + 1;
>> Please can you remind me of why you can't reuse the iova_ranges method.
>> isn't it generic enough?
> Yes, iova_ranges method is only for iova_ranges, we want to make
> HostIOMMUDevice.get_cap() a common interface.
>
> When we want to pass iova_ranges, we can add HOST_IOMMU_DEVICE_CAP_IOVA_RANGES
> and HostIOMMUDevice.iova_ranges.
I rather meant that iova_ranges is part of VFIOContainerBase and you
could reuse the technique used in hiod_legacy_vfio_realize, relying on a
common helper instead of using
VTD_MGAW_FROM_CAP(data.vtd.cap_reg). Doesn't it work?
>
>>> + break;
>>> + case IOMMU_HW_INFO_TYPE_NONE:
>> so what about other types?
> There is no other types for now. When there is, we can easily add the code
Thanks
Eric
>
> case IOMMU_HW_INFO_TYPE_ARM_SMMU:
> caps->aw_bits = xxx;
>
> Thanks
> Zhenzhong
>
>> Eric
>>> + break;
>>> + }
>>> +
>>> + return true;
>>> +}
>>> +
>>> +static void hiod_iommufd_vfio_class_init(ObjectClass *oc, void *data)
>>> +{
>>> + HostIOMMUDeviceClass *hiodc = HOST_IOMMU_DEVICE_CLASS(oc);
>>> +
>>> + hiodc->realize = hiod_iommufd_vfio_realize;
>>> +};
>>> +
>>> static const TypeInfo types[] = {
>>> {
>>> .name = TYPE_VFIO_IOMMU_IOMMUFD,
>>> @@ -627,6 +663,7 @@ static const TypeInfo types[] = {
>>> }, {
>>> .name = TYPE_HOST_IOMMU_DEVICE_IOMMUFD_VFIO,
>>> .parent = TYPE_HOST_IOMMU_DEVICE_IOMMUFD,
>>> + .class_init = hiod_iommufd_vfio_class_init,
>>> }
>>> };
>>>
^ permalink raw reply [flat|nested] 70+ messages in thread
* RE: [PATCH v6 09/19] vfio/iommufd: Implement HostIOMMUDeviceClass::realize() handler
2024-06-04 7:31 ` Eric Auger
@ 2024-06-04 7:51 ` Duan, Zhenzhong
2024-06-04 8:08 ` Eric Auger
0 siblings, 1 reply; 70+ messages in thread
From: Duan, Zhenzhong @ 2024-06-04 7:51 UTC (permalink / raw)
To: eric.auger@redhat.com, qemu-devel@nongnu.org
Cc: alex.williamson@redhat.com, clg@redhat.com, mst@redhat.com,
peterx@redhat.com, jasowang@redhat.com, jgg@nvidia.com,
nicolinc@nvidia.com, joao.m.martins@oracle.com,
clement.mathieu--drif@eviden.com, Tian, Kevin, Liu, Yi L,
Peng, Chao P, Marcel Apfelbaum
>-----Original Message-----
>From: Eric Auger <eric.auger@redhat.com>
>Subject: Re: [PATCH v6 09/19] vfio/iommufd: Implement
>HostIOMMUDeviceClass::realize() handler
>
>
>
>On 6/4/24 04:58, Duan, Zhenzhong wrote:
>>
>>> -----Original Message-----
>>> From: Eric Auger <eric.auger@redhat.com>
>>> Subject: Re: [PATCH v6 09/19] vfio/iommufd: Implement
>>> HostIOMMUDeviceClass::realize() handler
>>>
>>> Hi Zhenzhong,
>>>
>>> On 6/3/24 08:10, Zhenzhong Duan wrote:
>>>> It calls iommufd_backend_get_device_info() to get host IOMMU
>>>> related information and translate it into HostIOMMUDeviceCaps
>>>> for query with .get_cap().
>>>>
>>>> Introduce macro VTD_MGAW_FROM_CAP to get MGAW which equals to
>>>> (aw_bits - 1).
>>>>
>>>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>>>> ---
>>>> include/hw/i386/intel_iommu.h | 1 +
>>>> hw/vfio/iommufd.c | 37
>>> +++++++++++++++++++++++++++++++++++
>>>> 2 files changed, 38 insertions(+)
>>>>
>>>> diff --git a/include/hw/i386/intel_iommu.h
>>> b/include/hw/i386/intel_iommu.h
>>>> index 7fa0a695c8..7d694b0813 100644
>>>> --- a/include/hw/i386/intel_iommu.h
>>>> +++ b/include/hw/i386/intel_iommu.h
>>>> @@ -47,6 +47,7 @@ OBJECT_DECLARE_SIMPLE_TYPE(IntelIOMMUState,
>>> INTEL_IOMMU_DEVICE)
>>>> #define VTD_HOST_AW_48BIT 48
>>>> #define VTD_HOST_ADDRESS_WIDTH VTD_HOST_AW_39BIT
>>>> #define VTD_HAW_MASK(aw) ((1ULL << (aw)) - 1)
>>>> +#define VTD_MGAW_FROM_CAP(cap) ((cap >> 16) & 0x3fULL)
>>>>
>>>> #define DMAR_REPORT_F_INTR (1)
>>>>
>>>> diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
>>>> index e4a507d55c..9d2e95e20e 100644
>>>> --- a/hw/vfio/iommufd.c
>>>> +++ b/hw/vfio/iommufd.c
>>>> @@ -25,6 +25,7 @@
>>>> #include "qemu/cutils.h"
>>>> #include "qemu/chardev_open.h"
>>>> #include "pci.h"
>>>> +#include "hw/i386/intel_iommu_internal.h"
>>>>
>>>> static int iommufd_cdev_map(const VFIOContainerBase *bcontainer,
>>> hwaddr iova,
>>>> ram_addr_t size, void *vaddr, bool readonly)
>>>> @@ -619,6 +620,41 @@ static void
>>> vfio_iommu_iommufd_class_init(ObjectClass *klass, void *data)
>>>> vioc->pci_hot_reset = iommufd_cdev_pci_hot_reset;
>>>> };
>>>>
>>>> +static bool hiod_iommufd_vfio_realize(HostIOMMUDevice *hiod, void
>>> *opaque,
>>>> + Error **errp)
>>>> +{
>>>> + VFIODevice *vdev = opaque;
>>>> + HostIOMMUDeviceCaps *caps = &hiod->caps;
>>>> + enum iommu_hw_info_type type;
>>>> + union {
>>>> + struct iommu_hw_info_vtd vtd;
>>>> + } data;
>>>> +
>>>> + if (!iommufd_backend_get_device_info(vdev->iommufd, vdev->devid,
>>>> + &type, &data, sizeof(data), errp)) {
>>>> + return false;
>>>> + }
>>>> +
>>>> + caps->type = type;
>>>> +
>>>> + switch (type) {
>>>> + case IOMMU_HW_INFO_TYPE_INTEL_VTD:
>>>> + caps->aw_bits = VTD_MGAW_FROM_CAP(data.vtd.cap_reg) + 1;
>>> Please can you remind me of why you can't reuse the iova_ranges method.
>>> isn't it generic enough?
>> Yes, iova_ranges method is only for iova_ranges, we want to make
>> HostIOMMUDevice.get_cap() a common interface.
>>
>> When we want to pass iova_ranges, we can add HOST_IOMMU_DEVICE_CAP_IOVA_RANGES
>> and HostIOMMUDevice.iova_ranges.
>
>I rather meant that iova_ranges is part of VFIOContainerBase and you
>could reuse the technics used in hiod_legacy_vfio_realize, relying on a
>common helper instead of using
>
>VTD_MGAW_FROM_CAP(data.vtd.cap_reg). Doesn't it work?
Got your point.
Yes, it does work and should give the same result.
That means the iommufd backend supports two ways to get aw_bits.
The only reason is that I feel VTD_MGAW_FROM_CAP(data.vtd.cap_reg) is a bit simpler,
and other bits are picked up in the nesting series, see:
case IOMMU_HW_INFO_TYPE_INTEL_VTD:
caps->aw_bits = VTD_MGAW_FROM_CAP(data.vtd.cap_reg) + 1;
caps->nesting = !!(data.vtd.ecap_reg & VTD_ECAP_NEST);
caps->fs1gp = !!(data.vtd.cap_reg & VTD_CAP_FS1GP);
caps->errata = data.vtd.flags & IOMMU_HW_INFO_VTD_ERRATA_772415_SPR17;
I'm fine with using iova_ranges to calculate aw_bits for the iommufd backend if you prefer that.
Thanks
Zhenzhong
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [PATCH v6 09/19] vfio/iommufd: Implement HostIOMMUDeviceClass::realize() handler
2024-06-04 7:51 ` Duan, Zhenzhong
@ 2024-06-04 8:08 ` Eric Auger
2024-06-04 8:39 ` Duan, Zhenzhong
0 siblings, 1 reply; 70+ messages in thread
From: Eric Auger @ 2024-06-04 8:08 UTC (permalink / raw)
To: Duan, Zhenzhong, qemu-devel@nongnu.org
Cc: alex.williamson@redhat.com, clg@redhat.com, mst@redhat.com,
peterx@redhat.com, jasowang@redhat.com, jgg@nvidia.com,
nicolinc@nvidia.com, joao.m.martins@oracle.com,
clement.mathieu--drif@eviden.com, Tian, Kevin, Liu, Yi L,
Peng, Chao P, Marcel Apfelbaum
Hi,
On 6/4/24 09:51, Duan, Zhenzhong wrote:
>
>> -----Original Message-----
>> From: Eric Auger <eric.auger@redhat.com>
>> Subject: Re: [PATCH v6 09/19] vfio/iommufd: Implement
>> HostIOMMUDeviceClass::realize() handler
>>
>>
>>
>> On 6/4/24 04:58, Duan, Zhenzhong wrote:
>>>> -----Original Message-----
>>>> From: Eric Auger <eric.auger@redhat.com>
>>>> Subject: Re: [PATCH v6 09/19] vfio/iommufd: Implement
>>>> HostIOMMUDeviceClass::realize() handler
>>>>
>>>> Hi Zhenzhong,
>>>>
>>>> On 6/3/24 08:10, Zhenzhong Duan wrote:
>>>>> It calls iommufd_backend_get_device_info() to get host IOMMU
>>>>> related information and translate it into HostIOMMUDeviceCaps
>>>>> for query with .get_cap().
>>>>>
>>>>> Introduce macro VTD_MGAW_FROM_CAP to get MGAW which equals
>> to
>>>>> (aw_bits - 1).
>>>>>
>>>>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>>>>> ---
>>>>> include/hw/i386/intel_iommu.h | 1 +
>>>>> hw/vfio/iommufd.c | 37
>>>> +++++++++++++++++++++++++++++++++++
>>>>> 2 files changed, 38 insertions(+)
>>>>>
>>>>> diff --git a/include/hw/i386/intel_iommu.h
>>>> b/include/hw/i386/intel_iommu.h
>>>>> index 7fa0a695c8..7d694b0813 100644
>>>>> --- a/include/hw/i386/intel_iommu.h
>>>>> +++ b/include/hw/i386/intel_iommu.h
>>>>> @@ -47,6 +47,7 @@ OBJECT_DECLARE_SIMPLE_TYPE(IntelIOMMUState,
>>>> INTEL_IOMMU_DEVICE)
>>>>> #define VTD_HOST_AW_48BIT 48
>>>>> #define VTD_HOST_ADDRESS_WIDTH VTD_HOST_AW_39BIT
>>>>> #define VTD_HAW_MASK(aw) ((1ULL << (aw)) - 1)
>>>>> +#define VTD_MGAW_FROM_CAP(cap) ((cap >> 16) & 0x3fULL)
>>>>>
>>>>> #define DMAR_REPORT_F_INTR (1)
>>>>>
>>>>> diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
>>>>> index e4a507d55c..9d2e95e20e 100644
>>>>> --- a/hw/vfio/iommufd.c
>>>>> +++ b/hw/vfio/iommufd.c
>>>>> @@ -25,6 +25,7 @@
>>>>> #include "qemu/cutils.h"
>>>>> #include "qemu/chardev_open.h"
>>>>> #include "pci.h"
>>>>> +#include "hw/i386/intel_iommu_internal.h"
>>>>>
>>>>> static int iommufd_cdev_map(const VFIOContainerBase *bcontainer,
>>>> hwaddr iova,
>>>>> ram_addr_t size, void *vaddr, bool readonly)
>>>>> @@ -619,6 +620,41 @@ static void
>>>> vfio_iommu_iommufd_class_init(ObjectClass *klass, void *data)
>>>>> vioc->pci_hot_reset = iommufd_cdev_pci_hot_reset;
>>>>> };
>>>>>
>>>>> +static bool hiod_iommufd_vfio_realize(HostIOMMUDevice *hiod, void
>>>> *opaque,
>>>>> + Error **errp)
>>>>> +{
>>>>> + VFIODevice *vdev = opaque;
>>>>> + HostIOMMUDeviceCaps *caps = &hiod->caps;
>>>>> + enum iommu_hw_info_type type;
>>>>> + union {
>>>>> + struct iommu_hw_info_vtd vtd;
>>>>> + } data;
>>>>> +
>>>>> + if (!iommufd_backend_get_device_info(vdev->iommufd, vdev-
>>> devid,
>>>>> + &type, &data, sizeof(data), errp)) {
>>>>> + return false;
>>>>> + }
>>>>> +
>>>>> + caps->type = type;
>>>>> +
>>>>> + switch (type) {
>>>>> + case IOMMU_HW_INFO_TYPE_INTEL_VTD:
>>>>> + caps->aw_bits = VTD_MGAW_FROM_CAP(data.vtd.cap_reg) + 1;
>>>> Please can you remind me of why you can't reuse the iova_ranges
>> method.
>>>> isn't it generic enough?
>>> Yes, iova_ranges method is only for iova_ranges, we want to make
>>> HostIOMMUDevice.get_cap() a common interface.
>>>
>>> When we want to pass iova_ranges, we can add
>> HOST_IOMMU_DEVICE_CAP_IOVA_RANGES
>>> and HostIOMMUDevice.iova_ranges.
>> I rather meant that iova_ranges is part of VFIOContainerBase and you
>> could reuse the technics used in hiod_legacy_vfio_realize, relying on a
>> common helper instead of using
>>
>> VTD_MGAW_FROM_CAP(data.vtd.cap_reg). Doesn't it work?
> Get your point.
> Yes, It does work and should have same result.
> That means iommufd backend support two ways to get aw_bits.
>
> Only reason is I feel VTD_MGAW_FROM_CAP(data.vtd.cap_reg) is a bit simpler
> and there are other bits picked in nesting series, see:
>
> case IOMMU_HW_INFO_TYPE_INTEL_VTD:
> caps->aw_bits = VTD_MGAW_FROM_CAP(data.vtd.cap_reg) + 1;
> caps->nesting = !!(data.vtd.ecap_reg & VTD_ECAP_NEST);
> caps->fs1gp = !!(data.vtd.cap_reg & VTD_CAP_FS1GP);
> caps->errata = data.vtd.flags & IOMMU_HW_INFO_VTD_ERRATA_772415_SPR17;
>
> I'm fine to use iova_ranges to calculate aw_bits for iommufd backend if you prefer that.
Yes, I think I would prefer that, because this technique also works for other
IOMMUs, not only VT-d. It can also rely on code common to the legacy and
iommufd backends. The nesting series can bring the rest later.
Eric
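
For illustration, a rough sketch of the generic approach preferred here: derive aw_bits from the highest usable IOVA rather than from a vendor-specific register. The IovaRange type and the function name are assumptions made up for this example; the real code would walk the container's iova_ranges list through the common helper used by the legacy backend.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one entry of a container's iova_ranges list. */
typedef struct {
    uint64_t start;
    uint64_t end;   /* inclusive upper bound */
} IovaRange;

/* Number of address bits needed to cover the highest usable IOVA. */
static int aw_bits_from_iova_ranges(const IovaRange *ranges, size_t n)
{
    uint64_t max_end = 0;
    int bits = 0;

    if (n == 0) {
        /* Assumption: no range information means no known limit. */
        return 64;
    }
    for (size_t i = 0; i < n; i++) {
        if (ranges[i].end > max_end) {
            max_end = ranges[i].end;
        }
    }
    /* Position of the highest set bit, counted from 1. */
    while (max_end) {
        bits++;
        max_end >>= 1;
    }
    return bits;
}
```

Nothing in this computation depends on the IOMMU vendor, so the same helper could serve both the legacy and iommufd backends.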
>
> Thanks
> Zhenzhong
^ permalink raw reply [flat|nested] 70+ messages in thread
* RE: [PATCH v6 09/19] vfio/iommufd: Implement HostIOMMUDeviceClass::realize() handler
2024-06-04 8:08 ` Eric Auger
@ 2024-06-04 8:39 ` Duan, Zhenzhong
0 siblings, 0 replies; 70+ messages in thread
From: Duan, Zhenzhong @ 2024-06-04 8:39 UTC (permalink / raw)
To: eric.auger@redhat.com, qemu-devel@nongnu.org
Cc: alex.williamson@redhat.com, clg@redhat.com, mst@redhat.com,
peterx@redhat.com, jasowang@redhat.com, jgg@nvidia.com,
nicolinc@nvidia.com, joao.m.martins@oracle.com,
clement.mathieu--drif@eviden.com, Tian, Kevin, Liu, Yi L,
Peng, Chao P, Marcel Apfelbaum
>-----Original Message-----
>From: Eric Auger <eric.auger@redhat.com>
>Subject: Re: [PATCH v6 09/19] vfio/iommufd: Implement
>HostIOMMUDeviceClass::realize() handler
>
>
>Hi,
>On 6/4/24 09:51, Duan, Zhenzhong wrote:
>>
>>> -----Original Message-----
>>> From: Eric Auger <eric.auger@redhat.com>
>>> Subject: Re: [PATCH v6 09/19] vfio/iommufd: Implement
>>> HostIOMMUDeviceClass::realize() handler
>>>
>>>
>>>
>>> On 6/4/24 04:58, Duan, Zhenzhong wrote:
>>>>> -----Original Message-----
>>>>> From: Eric Auger <eric.auger@redhat.com>
>>>>> Subject: Re: [PATCH v6 09/19] vfio/iommufd: Implement
>>>>> HostIOMMUDeviceClass::realize() handler
>>>>>
>>>>> Hi Zhenzhong,
>>>>>
>>>>> On 6/3/24 08:10, Zhenzhong Duan wrote:
>>>>>> It calls iommufd_backend_get_device_info() to get host IOMMU
>>>>>> related information and translate it into HostIOMMUDeviceCaps
>>>>>> for query with .get_cap().
>>>>>>
>>>>>> Introduce macro VTD_MGAW_FROM_CAP to get MGAW which
>equals
>>> to
>>>>>> (aw_bits - 1).
>>>>>>
>>>>>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>>>>>> ---
>>>>>> include/hw/i386/intel_iommu.h | 1 +
>>>>>> hw/vfio/iommufd.c | 37
>>>>> +++++++++++++++++++++++++++++++++++
>>>>>> 2 files changed, 38 insertions(+)
>>>>>>
>>>>>> diff --git a/include/hw/i386/intel_iommu.h
>>>>> b/include/hw/i386/intel_iommu.h
>>>>>> index 7fa0a695c8..7d694b0813 100644
>>>>>> --- a/include/hw/i386/intel_iommu.h
>>>>>> +++ b/include/hw/i386/intel_iommu.h
>>>>>> @@ -47,6 +47,7 @@
>OBJECT_DECLARE_SIMPLE_TYPE(IntelIOMMUState,
>>>>> INTEL_IOMMU_DEVICE)
>>>>>> #define VTD_HOST_AW_48BIT 48
>>>>>> #define VTD_HOST_ADDRESS_WIDTH VTD_HOST_AW_39BIT
>>>>>> #define VTD_HAW_MASK(aw) ((1ULL << (aw)) - 1)
>>>>>> +#define VTD_MGAW_FROM_CAP(cap) ((cap >> 16) & 0x3fULL)
>>>>>>
>>>>>> #define DMAR_REPORT_F_INTR (1)
>>>>>>
>>>>>> diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
>>>>>> index e4a507d55c..9d2e95e20e 100644
>>>>>> --- a/hw/vfio/iommufd.c
>>>>>> +++ b/hw/vfio/iommufd.c
>>>>>> @@ -25,6 +25,7 @@
>>>>>> #include "qemu/cutils.h"
>>>>>> #include "qemu/chardev_open.h"
>>>>>> #include "pci.h"
>>>>>> +#include "hw/i386/intel_iommu_internal.h"
>>>>>>
>>>>>> static int iommufd_cdev_map(const VFIOContainerBase *bcontainer,
>>>>> hwaddr iova,
>>>>>> ram_addr_t size, void *vaddr, bool readonly)
>>>>>> @@ -619,6 +620,41 @@ static void
>>>>> vfio_iommu_iommufd_class_init(ObjectClass *klass, void *data)
>>>>>> vioc->pci_hot_reset = iommufd_cdev_pci_hot_reset;
>>>>>> };
>>>>>>
>>>>>> +static bool hiod_iommufd_vfio_realize(HostIOMMUDevice *hiod,
>void
>>>>> *opaque,
>>>>>> + Error **errp)
>>>>>> +{
>>>>>> + VFIODevice *vdev = opaque;
>>>>>> + HostIOMMUDeviceCaps *caps = &hiod->caps;
>>>>>> + enum iommu_hw_info_type type;
>>>>>> + union {
>>>>>> + struct iommu_hw_info_vtd vtd;
>>>>>> + } data;
>>>>>> +
>>>>>> + if (!iommufd_backend_get_device_info(vdev->iommufd, vdev-
>>>> devid,
>>>>>> + &type, &data, sizeof(data), errp)) {
>>>>>> + return false;
>>>>>> + }
>>>>>> +
>>>>>> + caps->type = type;
>>>>>> +
>>>>>> + switch (type) {
>>>>>> + case IOMMU_HW_INFO_TYPE_INTEL_VTD:
>>>>>> + caps->aw_bits = VTD_MGAW_FROM_CAP(data.vtd.cap_reg) +
>1;
>>>>> Please can you remind me of why you can't reuse the iova_ranges
>>> method.
>>>>> isn't it generic enough?
>>>> Yes, iova_ranges method is only for iova_ranges, we want to make
>>>> HostIOMMUDevice.get_cap() a common interface.
>>>>
>>>> When we want to pass iova_ranges, we can add
>>> HOST_IOMMU_DEVICE_CAP_IOVA_RANGES
>>>> and HostIOMMUDevice.iova_ranges.
>>> I rather meant that iova_ranges is part of VFIOContainerBase and you
>>> could reuse the technics used in hiod_legacy_vfio_realize, relying on a
>>> common helper instead of using
>>>
>>> VTD_MGAW_FROM_CAP(data.vtd.cap_reg). Doesn't it work?
>> Get your point.
>> Yes, It does work and should have same result.
>> That means iommufd backend support two ways to get aw_bits.
>>
>> Only reason is I feel VTD_MGAW_FROM_CAP(data.vtd.cap_reg) is a bit
>simpler
>> and there are other bits picked in nesting series, see:
>>
>> case IOMMU_HW_INFO_TYPE_INTEL_VTD:
>> caps->aw_bits = VTD_MGAW_FROM_CAP(data.vtd.cap_reg) + 1;
>> caps->nesting = !!(data.vtd.ecap_reg & VTD_ECAP_NEST);
>> caps->fs1gp = !!(data.vtd.cap_reg & VTD_CAP_FS1GP);
>> caps->errata = data.vtd.flags &
>IOMMU_HW_INFO_VTD_ERRATA_772415_SPR17;
>>
>> I'm fine to use iova_ranges to calculate aw_bits for iommufd backend if
>you prefer that.
>Yes I think I would prefer because this technics also work for other
>iommus and not only VTD. It also can rely on common code between legacy
>and iommufd. The nesting series can bring the rest later
OK, will do.
Thanks
Zhenzhong
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [PATCH v6 09/19] vfio/iommufd: Implement HostIOMMUDeviceClass::realize() handler
2024-06-03 6:10 ` [PATCH v6 09/19] vfio/iommufd: Implement HostIOMMUDeviceClass::realize() handler Zhenzhong Duan
2024-06-03 11:30 ` Eric Auger
@ 2024-06-06 9:26 ` Eric Auger
2024-06-06 9:32 ` Eric Auger
1 sibling, 1 reply; 70+ messages in thread
From: Eric Auger @ 2024-06-06 9:26 UTC (permalink / raw)
To: Zhenzhong Duan, qemu-devel
Cc: alex.williamson, clg, mst, peterx, jasowang, jgg, nicolinc,
joao.m.martins, clement.mathieu--drif, kevin.tian, yi.l.liu,
chao.p.peng, Marcel Apfelbaum
Hi Zhenzhong,
On 6/3/24 08:10, Zhenzhong Duan wrote:
> It calls iommufd_backend_get_device_info() to get host IOMMU
> related information and translate it into HostIOMMUDeviceCaps
> for query with .get_cap().
>
> Introduce macro VTD_MGAW_FROM_CAP to get MGAW which equals to
> (aw_bits - 1).
>
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> ---
> include/hw/i386/intel_iommu.h | 1 +
> hw/vfio/iommufd.c | 37 +++++++++++++++++++++++++++++++++++
> 2 files changed, 38 insertions(+)
>
> diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
> index 7fa0a695c8..7d694b0813 100644
> --- a/include/hw/i386/intel_iommu.h
> +++ b/include/hw/i386/intel_iommu.h
> @@ -47,6 +47,7 @@ OBJECT_DECLARE_SIMPLE_TYPE(IntelIOMMUState, INTEL_IOMMU_DEVICE)
> #define VTD_HOST_AW_48BIT 48
> #define VTD_HOST_ADDRESS_WIDTH VTD_HOST_AW_39BIT
> #define VTD_HAW_MASK(aw) ((1ULL << (aw)) - 1)
> +#define VTD_MGAW_FROM_CAP(cap) ((cap >> 16) & 0x3fULL)
>
> #define DMAR_REPORT_F_INTR (1)
>
> diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
> index e4a507d55c..9d2e95e20e 100644
> --- a/hw/vfio/iommufd.c
> +++ b/hw/vfio/iommufd.c
> @@ -25,6 +25,7 @@
> #include "qemu/cutils.h"
> #include "qemu/chardev_open.h"
> #include "pci.h"
> +#include "hw/i386/intel_iommu_internal.h"
>
> static int iommufd_cdev_map(const VFIOContainerBase *bcontainer, hwaddr iova,
> ram_addr_t size, void *vaddr, bool readonly)
> @@ -619,6 +620,41 @@ static void vfio_iommu_iommufd_class_init(ObjectClass *klass, void *data)
> vioc->pci_hot_reset = iommufd_cdev_pci_hot_reset;
> };
>
> +static bool hiod_iommufd_vfio_realize(HostIOMMUDevice *hiod, void *opaque,
> + Error **errp)
> +{
> + VFIODevice *vdev = opaque;
I think it would make sense to store vdev in hiod. This would allow
postponing some computations to the HostIOMMUDevice ops instead of doing
everything in realize().
For instance, to retrieve the usable iova_ranges, I will need to access
the base container in the associated ops.
Thanks
Eric
> + HostIOMMUDeviceCaps *caps = &hiod->caps;
> + enum iommu_hw_info_type type;
> + union {
> + struct iommu_hw_info_vtd vtd;
> + } data;
> +
> + if (!iommufd_backend_get_device_info(vdev->iommufd, vdev->devid,
> + &type, &data, sizeof(data), errp)) {
> + return false;
> + }
> +
> + caps->type = type;
> +
> + switch (type) {
> + case IOMMU_HW_INFO_TYPE_INTEL_VTD:
> + caps->aw_bits = VTD_MGAW_FROM_CAP(data.vtd.cap_reg) + 1;
> + break;
> + case IOMMU_HW_INFO_TYPE_NONE:
> + break;
> + }
> +
> + return true;
> +}
> +
> +static void hiod_iommufd_vfio_class_init(ObjectClass *oc, void *data)
> +{
> + HostIOMMUDeviceClass *hiodc = HOST_IOMMU_DEVICE_CLASS(oc);
> +
> + hiodc->realize = hiod_iommufd_vfio_realize;
> +};
> +
> static const TypeInfo types[] = {
> {
> .name = TYPE_VFIO_IOMMU_IOMMUFD,
> @@ -627,6 +663,7 @@ static const TypeInfo types[] = {
> }, {
> .name = TYPE_HOST_IOMMU_DEVICE_IOMMUFD_VFIO,
> .parent = TYPE_HOST_IOMMU_DEVICE_IOMMUFD,
> + .class_init = hiod_iommufd_vfio_class_init,
> }
> };
>
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [PATCH v6 09/19] vfio/iommufd: Implement HostIOMMUDeviceClass::realize() handler
2024-06-06 9:26 ` Eric Auger
@ 2024-06-06 9:32 ` Eric Auger
2024-06-06 10:19 ` Duan, Zhenzhong
0 siblings, 1 reply; 70+ messages in thread
From: Eric Auger @ 2024-06-06 9:32 UTC (permalink / raw)
To: Zhenzhong Duan, qemu-devel
Cc: alex.williamson, clg, mst, peterx, jasowang, jgg, nicolinc,
joao.m.martins, clement.mathieu--drif, kevin.tian, yi.l.liu,
chao.p.peng, Marcel Apfelbaum
On 6/6/24 11:26, Eric Auger wrote:
> Hi Zhenzhong,
> On 6/3/24 08:10, Zhenzhong Duan wrote:
>> It calls iommufd_backend_get_device_info() to get host IOMMU
>> related information and translate it into HostIOMMUDeviceCaps
>> for query with .get_cap().
>>
>> Introduce macro VTD_MGAW_FROM_CAP to get MGAW which equals to
>> (aw_bits - 1).
>>
>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>> ---
>> include/hw/i386/intel_iommu.h | 1 +
>> hw/vfio/iommufd.c | 37 +++++++++++++++++++++++++++++++++++
>> 2 files changed, 38 insertions(+)
>>
>> diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
>> index 7fa0a695c8..7d694b0813 100644
>> --- a/include/hw/i386/intel_iommu.h
>> +++ b/include/hw/i386/intel_iommu.h
>> @@ -47,6 +47,7 @@ OBJECT_DECLARE_SIMPLE_TYPE(IntelIOMMUState, INTEL_IOMMU_DEVICE)
>> #define VTD_HOST_AW_48BIT 48
>> #define VTD_HOST_ADDRESS_WIDTH VTD_HOST_AW_39BIT
>> #define VTD_HAW_MASK(aw) ((1ULL << (aw)) - 1)
>> +#define VTD_MGAW_FROM_CAP(cap) ((cap >> 16) & 0x3fULL)
>>
>> #define DMAR_REPORT_F_INTR (1)
>>
>> diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
>> index e4a507d55c..9d2e95e20e 100644
>> --- a/hw/vfio/iommufd.c
>> +++ b/hw/vfio/iommufd.c
>> @@ -25,6 +25,7 @@
>> #include "qemu/cutils.h"
>> #include "qemu/chardev_open.h"
>> #include "pci.h"
>> +#include "hw/i386/intel_iommu_internal.h"
>>
>> static int iommufd_cdev_map(const VFIOContainerBase *bcontainer, hwaddr iova,
>> ram_addr_t size, void *vaddr, bool readonly)
>> @@ -619,6 +620,41 @@ static void vfio_iommu_iommufd_class_init(ObjectClass *klass, void *data)
>> vioc->pci_hot_reset = iommufd_cdev_pci_hot_reset;
>> };
>>
>> +static bool hiod_iommufd_vfio_realize(HostIOMMUDevice *hiod, void *opaque,
>> + Error **errp)
>> +{
>> + VFIODevice *vdev = opaque;
> I think it would make sense to store vdev in hiod. This would allow to
> postpone some computations in the HostIOMMUDevice ops instead of doing
> everything in the realize.
> For instance to retrieve the usable iova_ranges I will need to access
> the base container in the associated ops.
This would need to be opaque, though, since the agent device can be either
a VFIODevice or a VDPA object.
Eric
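
For illustration, a minimal sketch of the idea of keeping an opaque agent pointer in the host IOMMU device, so that backend-specific ops can defer work until it is needed. All type and function names below are hypothetical simplifications, not the QEMU definitions.

```c
#include <stddef.h>

/* Hypothetical, simplified types for this sketch only. */
typedef struct HostIOMMUDeviceSketch {
    const char *name;
    void *agent;            /* opaque: a VFIO device, a VDPA object, ... */
} HostIOMMUDeviceSketch;

typedef struct VfioAgentSketch {
    int n_iova_ranges;      /* stands in for the base container's list */
} VfioAgentSketch;

/* realize() only records the agent; nothing is computed yet. */
static void hiod_sketch_realize(HostIOMMUDeviceSketch *hiod,
                                const char *name, void *opaque)
{
    hiod->name = name;
    hiod->agent = opaque;
}

/* A VFIO-specific op knows the concrete agent type and casts back lazily. */
static int hiod_vfio_sketch_count_ranges(const HostIOMMUDeviceSketch *hiod)
{
    const VfioAgentSketch *vdev = hiod->agent;

    return vdev ? vdev->n_iova_ranges : 0;
}
```

Only the backend that set the agent casts it back, so generic code never needs to know whether it is a VFIO or VDPA object.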
>
> Thanks
>
> Eric
>> + HostIOMMUDeviceCaps *caps = &hiod->caps;
>> + enum iommu_hw_info_type type;
>> + union {
>> + struct iommu_hw_info_vtd vtd;
>> + } data;
>> +
>> + if (!iommufd_backend_get_device_info(vdev->iommufd, vdev->devid,
>> + &type, &data, sizeof(data), errp)) {
>> + return false;
>> + }
>> +
>> + caps->type = type;
>> +
>> + switch (type) {
>> + case IOMMU_HW_INFO_TYPE_INTEL_VTD:
>> + caps->aw_bits = VTD_MGAW_FROM_CAP(data.vtd.cap_reg) + 1;
>> + break;
>> + case IOMMU_HW_INFO_TYPE_NONE:
>> + break;
>> + }
>> +
>> + return true;
>> +}
>> +
>> +static void hiod_iommufd_vfio_class_init(ObjectClass *oc, void *data)
>> +{
>> + HostIOMMUDeviceClass *hiodc = HOST_IOMMU_DEVICE_CLASS(oc);
>> +
>> + hiodc->realize = hiod_iommufd_vfio_realize;
>> +};
>> +
>> static const TypeInfo types[] = {
>> {
>> .name = TYPE_VFIO_IOMMU_IOMMUFD,
>> @@ -627,6 +663,7 @@ static const TypeInfo types[] = {
>> }, {
>> .name = TYPE_HOST_IOMMU_DEVICE_IOMMUFD_VFIO,
>> .parent = TYPE_HOST_IOMMU_DEVICE_IOMMUFD,
>> + .class_init = hiod_iommufd_vfio_class_init,
>> }
>> };
>>
>
^ permalink raw reply [flat|nested] 70+ messages in thread
* RE: [PATCH v6 09/19] vfio/iommufd: Implement HostIOMMUDeviceClass::realize() handler
2024-06-06 9:32 ` Eric Auger
@ 2024-06-06 10:19 ` Duan, Zhenzhong
0 siblings, 0 replies; 70+ messages in thread
From: Duan, Zhenzhong @ 2024-06-06 10:19 UTC (permalink / raw)
To: eric.auger@redhat.com, qemu-devel@nongnu.org
Cc: alex.williamson@redhat.com, clg@redhat.com, mst@redhat.com,
peterx@redhat.com, jasowang@redhat.com, jgg@nvidia.com,
nicolinc@nvidia.com, joao.m.martins@oracle.com,
clement.mathieu--drif@eviden.com, Tian, Kevin, Liu, Yi L,
Peng, Chao P, Marcel Apfelbaum
Hi Eric,
>-----Original Message-----
>From: Eric Auger <eric.auger@redhat.com>
>Subject: Re: [PATCH v6 09/19] vfio/iommufd: Implement
>HostIOMMUDeviceClass::realize() handler
>
>
>
>On 6/6/24 11:26, Eric Auger wrote:
>> Hi Zhenzhong,
>> On 6/3/24 08:10, Zhenzhong Duan wrote:
>>> It calls iommufd_backend_get_device_info() to get host IOMMU
>>> related information and translate it into HostIOMMUDeviceCaps
>>> for query with .get_cap().
>>>
>>> Introduce macro VTD_MGAW_FROM_CAP to get MGAW which equals to
>>> (aw_bits - 1).
>>>
>>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>>> ---
>>> include/hw/i386/intel_iommu.h | 1 +
>>> hw/vfio/iommufd.c | 37
>+++++++++++++++++++++++++++++++++++
>>> 2 files changed, 38 insertions(+)
>>>
>>> diff --git a/include/hw/i386/intel_iommu.h
>b/include/hw/i386/intel_iommu.h
>>> index 7fa0a695c8..7d694b0813 100644
>>> --- a/include/hw/i386/intel_iommu.h
>>> +++ b/include/hw/i386/intel_iommu.h
>>> @@ -47,6 +47,7 @@ OBJECT_DECLARE_SIMPLE_TYPE(IntelIOMMUState,
>INTEL_IOMMU_DEVICE)
>>> #define VTD_HOST_AW_48BIT 48
>>> #define VTD_HOST_ADDRESS_WIDTH VTD_HOST_AW_39BIT
>>> #define VTD_HAW_MASK(aw) ((1ULL << (aw)) - 1)
>>> +#define VTD_MGAW_FROM_CAP(cap) ((cap >> 16) & 0x3fULL)
>>>
>>> #define DMAR_REPORT_F_INTR (1)
>>>
>>> diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
>>> index e4a507d55c..9d2e95e20e 100644
>>> --- a/hw/vfio/iommufd.c
>>> +++ b/hw/vfio/iommufd.c
>>> @@ -25,6 +25,7 @@
>>> #include "qemu/cutils.h"
>>> #include "qemu/chardev_open.h"
>>> #include "pci.h"
>>> +#include "hw/i386/intel_iommu_internal.h"
>>>
>>> static int iommufd_cdev_map(const VFIOContainerBase *bcontainer,
>hwaddr iova,
>>> ram_addr_t size, void *vaddr, bool readonly)
>>> @@ -619,6 +620,41 @@ static void
>vfio_iommu_iommufd_class_init(ObjectClass *klass, void *data)
>>> vioc->pci_hot_reset = iommufd_cdev_pci_hot_reset;
>>> };
>>>
>>> +static bool hiod_iommufd_vfio_realize(HostIOMMUDevice *hiod, void
>*opaque,
>>> + Error **errp)
>>> +{
>>> + VFIODevice *vdev = opaque;
>> I think it would make sense to store vdev in hiod. This would allow to
>> postpone some computations in the HostIOMMUDevice ops instead of
>doing
>> everything in the realize.
>> For instance to retrieve the usable iova_ranges I will need to access
>> the base container in the associated ops.
>
>this would need to be opaque since the agent device can be either
>VFIODevice or VDPA object though
This would give the vIOMMU access to all fields of the VFIODevice or VDPA object,
and I'm not sure whether VDPA supports iova_ranges.
What about exposing only what we need, as below?
If VDPA doesn't support iova_ranges, get_cap() should return 0.
--- a/include/sysemu/host_iommu_device.h
+++ b/include/sysemu/host_iommu_device.h
@@ -32,6 +32,7 @@ typedef struct HostIOMMUDeviceCaps {
bool nesting;
bool fs1gp;
uint32_t errata;
+ GList *iova_ranges;
} HostIOMMUDeviceCaps;
#define TYPE_HOST_IOMMU_DEVICE "host-iommu-device"
@@ -96,6 +97,7 @@ struct HostIOMMUDeviceClass {
#define HOST_IOMMU_DEVICE_CAP_NESTING 2
#define HOST_IOMMU_DEVICE_CAP_FS1GP 3
#define HOST_IOMMU_DEVICE_CAP_ERRATA 4
+#define HOST_IOMMU_DEVICE_CAP_IOVA_RANGES 5
/**
* enum host_iommu_device_iommu_hw_info_type - IOMMU Hardware Info Types
diff --git a/hw/vfio/container.c b/hw/vfio/container.c
index 26e6f7fb4f..4c3e9e45c3 100644
--- a/hw/vfio/container.c
+++ b/hw/vfio/container.c
@@ -1145,6 +1145,7 @@ static bool hiod_legacy_vfio_realize(HostIOMMUDevice *hiod, void *opaque,
hiod->name = g_strdup(vdev->name);
hiod->caps.aw_bits = vfio_device_get_aw_bits(vdev);
+ hiod->caps.iova_ranges = vdev->bcontainer->iova_ranges;
return true;
}
@@ -1157,6 +1158,8 @@ static int hiod_legacy_vfio_get_cap(HostIOMMUDevice *hiod, int cap,
switch (cap) {
case HOST_IOMMU_DEVICE_CAP_AW_BITS:
return caps->aw_bits;
+ case HOST_IOMMU_DEVICE_CAP_IOVA_RANGES:
+ return 1;
default:
error_setg(errp, "%s: unsupported capability %x", hiod->name, cap);
return -EINVAL;
Thanks
Zhenzhong
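
For illustration, a standalone sketch of the convention proposed above: scalar capabilities return their value from get_cap(), while a list-valued capability returns 1 or 0 to say whether the list is available, and the caller then reads it from the caps structure. The capability IDs, the CapsSketch type and both function names are assumptions for this example only.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical capability IDs for this sketch. */
#define CAP_AW_BITS      1
#define CAP_IOVA_RANGES  5

typedef struct {
    int aw_bits;
    const void *iova_ranges;    /* NULL when the backend has none */
} CapsSketch;

/* Scalar caps return their value; list-valued caps return 1 or 0. */
static int get_cap_sketch(const CapsSketch *caps, int cap)
{
    switch (cap) {
    case CAP_AW_BITS:
        return caps->aw_bits;
    case CAP_IOVA_RANGES:
        return caps->iova_ranges != NULL;
    default:
        return -EINVAL;
    }
}

/* Caller side: only touch the list after get_cap() reported support. */
static const void *usable_iova_ranges(const CapsSketch *caps)
{
    return get_cap_sketch(caps, CAP_IOVA_RANGES) > 0 ? caps->iova_ranges
                                                     : NULL;
}
```

Unlike the diff above, which returns 1 unconditionally for the legacy backend, this sketch returns 0 when no list is present, matching the behaviour suggested for VDPA.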
^ permalink raw reply related [flat|nested] 70+ messages in thread
* [PATCH v6 10/19] vfio/container: Implement HostIOMMUDeviceClass::get_cap() handler
2024-06-03 6:10 [PATCH v6 00/19] Add a host IOMMU device abstraction to check with vIOMMU Zhenzhong Duan
` (8 preceding siblings ...)
2024-06-03 6:10 ` [PATCH v6 09/19] vfio/iommufd: Implement HostIOMMUDeviceClass::realize() handler Zhenzhong Duan
@ 2024-06-03 6:10 ` Zhenzhong Duan
2024-06-03 6:10 ` [PATCH v6 11/19] backends/iommufd: " Zhenzhong Duan
` (9 subsequent siblings)
19 siblings, 0 replies; 70+ messages in thread
From: Zhenzhong Duan @ 2024-06-03 6:10 UTC (permalink / raw)
To: qemu-devel
Cc: alex.williamson, clg, eric.auger, mst, peterx, jasowang, jgg,
nicolinc, joao.m.martins, clement.mathieu--drif, kevin.tian,
yi.l.liu, chao.p.peng, Zhenzhong Duan
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
hw/vfio/container.c | 15 +++++++++++++++
1 file changed, 15 insertions(+)
diff --git a/hw/vfio/container.c b/hw/vfio/container.c
index 48800fe92f..a46c275a88 100644
--- a/hw/vfio/container.c
+++ b/hw/vfio/container.c
@@ -1154,11 +1154,26 @@ static bool hiod_legacy_vfio_realize(HostIOMMUDevice *hiod, void *opaque,
return true;
}
+static int hiod_legacy_vfio_get_cap(HostIOMMUDevice *hiod, int cap,
+ Error **errp)
+{
+ HostIOMMUDeviceCaps *caps = &hiod->caps;
+
+ switch (cap) {
+ case HOST_IOMMU_DEVICE_CAP_AW_BITS:
+ return caps->aw_bits;
+ default:
+ error_setg(errp, "Not support get cap %x", cap);
+ return -EINVAL;
+ }
+}
+
static void hiod_legacy_vfio_class_init(ObjectClass *oc, void *data)
{
HostIOMMUDeviceClass *hioc = HOST_IOMMU_DEVICE_CLASS(oc);
hioc->realize = hiod_legacy_vfio_realize;
+ hioc->get_cap = hiod_legacy_vfio_get_cap;
};
static const TypeInfo types[] = {
--
2.34.1
^ permalink raw reply related [flat|nested] 70+ messages in thread
* [PATCH v6 11/19] backends/iommufd: Implement HostIOMMUDeviceClass::get_cap() handler
2024-06-03 6:10 [PATCH v6 00/19] Add a host IOMMU device abstraction to check with vIOMMU Zhenzhong Duan
` (9 preceding siblings ...)
2024-06-03 6:10 ` [PATCH v6 10/19] vfio/container: Implement HostIOMMUDeviceClass::get_cap() handler Zhenzhong Duan
@ 2024-06-03 6:10 ` Zhenzhong Duan
2024-06-03 11:32 ` Eric Auger
2024-06-03 6:10 ` [PATCH v6 12/19] vfio: Introduce VFIOIOMMUClass::hiod_typename attribute Zhenzhong Duan
` (8 subsequent siblings)
19 siblings, 1 reply; 70+ messages in thread
From: Zhenzhong Duan @ 2024-06-03 6:10 UTC (permalink / raw)
To: qemu-devel
Cc: alex.williamson, clg, eric.auger, mst, peterx, jasowang, jgg,
nicolinc, joao.m.martins, clement.mathieu--drif, kevin.tian,
yi.l.liu, chao.p.peng, Zhenzhong Duan
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
backends/iommufd.c | 23 +++++++++++++++++++++++
1 file changed, 23 insertions(+)
diff --git a/backends/iommufd.c b/backends/iommufd.c
index c7e969d6f7..f2f7a762a0 100644
--- a/backends/iommufd.c
+++ b/backends/iommufd.c
@@ -230,6 +230,28 @@ bool iommufd_backend_get_device_info(IOMMUFDBackend *be, uint32_t devid,
return true;
}
+static int hiod_iommufd_get_cap(HostIOMMUDevice *hiod, int cap, Error **errp)
+{
+ HostIOMMUDeviceCaps *caps = &hiod->caps;
+
+ switch (cap) {
+ case HOST_IOMMU_DEVICE_CAP_IOMMU_TYPE:
+ return caps->type;
+ case HOST_IOMMU_DEVICE_CAP_AW_BITS:
+ return caps->aw_bits;
+ default:
+ error_setg(errp, "Not support get cap %x", cap);
+ return -EINVAL;
+ }
+}
+
+static void hiod_iommufd_class_init(ObjectClass *oc, void *data)
+{
+ HostIOMMUDeviceClass *hioc = HOST_IOMMU_DEVICE_CLASS(oc);
+
+ hioc->get_cap = hiod_iommufd_get_cap;
+};
+
static const TypeInfo types[] = {
{
.name = TYPE_IOMMUFD_BACKEND,
@@ -246,6 +268,7 @@ static const TypeInfo types[] = {
}, {
.name = TYPE_HOST_IOMMU_DEVICE_IOMMUFD,
.parent = TYPE_HOST_IOMMU_DEVICE,
+ .class_init = hiod_iommufd_class_init,
.abstract = true,
}
};
--
2.34.1
^ permalink raw reply related [flat|nested] 70+ messages in thread
* Re: [PATCH v6 11/19] backends/iommufd: Implement HostIOMMUDeviceClass::get_cap() handler
2024-06-03 6:10 ` [PATCH v6 11/19] backends/iommufd: " Zhenzhong Duan
@ 2024-06-03 11:32 ` Eric Auger
2024-06-03 12:35 ` Cédric Le Goater
0 siblings, 1 reply; 70+ messages in thread
From: Eric Auger @ 2024-06-03 11:32 UTC (permalink / raw)
To: Zhenzhong Duan, qemu-devel
Cc: alex.williamson, clg, mst, peterx, jasowang, jgg, nicolinc,
joao.m.martins, clement.mathieu--drif, kevin.tian, yi.l.liu,
chao.p.peng
On 6/3/24 08:10, Zhenzhong Duan wrote:
> Suggested-by: Cédric Le Goater <clg@redhat.com>
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> ---
> backends/iommufd.c | 23 +++++++++++++++++++++++
> 1 file changed, 23 insertions(+)
>
> diff --git a/backends/iommufd.c b/backends/iommufd.c
> index c7e969d6f7..f2f7a762a0 100644
> --- a/backends/iommufd.c
> +++ b/backends/iommufd.c
> @@ -230,6 +230,28 @@ bool iommufd_backend_get_device_info(IOMMUFDBackend *be, uint32_t devid,
> return true;
> }
>
> +static int hiod_iommufd_get_cap(HostIOMMUDevice *hiod, int cap, Error **errp)
> +{
> + HostIOMMUDeviceCaps *caps = &hiod->caps;
> +
> + switch (cap) {
> + case HOST_IOMMU_DEVICE_CAP_IOMMU_TYPE:
> + return caps->type;
> + case HOST_IOMMU_DEVICE_CAP_AW_BITS:
> + return caps->aw_bits;
> + default:
> + error_setg(errp, "Not support get cap %x", cap);
Can't you add details about the faulting HostIOMMUDevice, for instance by
tracing the devid?
I would rephrase the error message as "No support for capability 0x%x".
Eric
> + return -EINVAL;
> + }
> +}
> +
> +static void hiod_iommufd_class_init(ObjectClass *oc, void *data)
> +{
> + HostIOMMUDeviceClass *hioc = HOST_IOMMU_DEVICE_CLASS(oc);
> +
> + hioc->get_cap = hiod_iommufd_get_cap;
> +};
> +
> static const TypeInfo types[] = {
> {
> .name = TYPE_IOMMUFD_BACKEND,
> @@ -246,6 +268,7 @@ static const TypeInfo types[] = {
> }, {
> .name = TYPE_HOST_IOMMU_DEVICE_IOMMUFD,
> .parent = TYPE_HOST_IOMMU_DEVICE,
> + .class_init = hiod_iommufd_class_init,
> .abstract = true,
> }
> };
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [PATCH v6 11/19] backends/iommufd: Implement HostIOMMUDeviceClass::get_cap() handler
2024-06-03 11:32 ` Eric Auger
@ 2024-06-03 12:35 ` Cédric Le Goater
2024-06-04 3:23 ` Duan, Zhenzhong
0 siblings, 1 reply; 70+ messages in thread
From: Cédric Le Goater @ 2024-06-03 12:35 UTC (permalink / raw)
To: eric.auger, Zhenzhong Duan, qemu-devel
Cc: alex.williamson, mst, peterx, jasowang, jgg, nicolinc,
joao.m.martins, clement.mathieu--drif, kevin.tian, yi.l.liu,
chao.p.peng
On 6/3/24 13:32, Eric Auger wrote:
>
>
> On 6/3/24 08:10, Zhenzhong Duan wrote:
>> Suggested-by: Cédric Le Goater <clg@redhat.com>
>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>> ---
>> backends/iommufd.c | 23 +++++++++++++++++++++++
>> 1 file changed, 23 insertions(+)
>>
>> diff --git a/backends/iommufd.c b/backends/iommufd.c
>> index c7e969d6f7..f2f7a762a0 100644
>> --- a/backends/iommufd.c
>> +++ b/backends/iommufd.c
>> @@ -230,6 +230,28 @@ bool iommufd_backend_get_device_info(IOMMUFDBackend *be, uint32_t devid,
>> return true;
>> }
>>
>> +static int hiod_iommufd_get_cap(HostIOMMUDevice *hiod, int cap, Error **errp)
>> +{
>> + HostIOMMUDeviceCaps *caps = &hiod->caps;
>> +
>> + switch (cap) {
>> + case HOST_IOMMU_DEVICE_CAP_IOMMU_TYPE:
>> + return caps->type;
>> + case HOST_IOMMU_DEVICE_CAP_AW_BITS:
>> + return caps->aw_bits;
>> + default:
>> + error_setg(errp, "Not support get cap %x", cap);
> can't you add details about the faulting HostIOMMUDevice by tracing the
> devid for instance?
yes.
> I would rephrase the error message into No support for capability 0x%x
I was going to propose "Unsupported capability ..."
Thanks,
C.
>
> Eric
>> + return -EINVAL;
>> + }
>> +}
>> +
>> +static void hiod_iommufd_class_init(ObjectClass *oc, void *data)
>> +{
>> + HostIOMMUDeviceClass *hioc = HOST_IOMMU_DEVICE_CLASS(oc);
>> +
>> + hioc->get_cap = hiod_iommufd_get_cap;
>> +};
>> +
>> static const TypeInfo types[] = {
>> {
>> .name = TYPE_IOMMUFD_BACKEND,
>> @@ -246,6 +268,7 @@ static const TypeInfo types[] = {
>> }, {
>> .name = TYPE_HOST_IOMMU_DEVICE_IOMMUFD,
>> .parent = TYPE_HOST_IOMMU_DEVICE,
>> + .class_init = hiod_iommufd_class_init,
>> .abstract = true,
>> }
>> };
>
^ permalink raw reply [flat|nested] 70+ messages in thread
* RE: [PATCH v6 11/19] backends/iommufd: Implement HostIOMMUDeviceClass::get_cap() handler
2024-06-03 12:35 ` Cédric Le Goater
@ 2024-06-04 3:23 ` Duan, Zhenzhong
2024-06-04 8:10 ` Eric Auger
0 siblings, 1 reply; 70+ messages in thread
From: Duan, Zhenzhong @ 2024-06-04 3:23 UTC (permalink / raw)
To: Cédric Le Goater, eric.auger@redhat.com,
qemu-devel@nongnu.org
Cc: alex.williamson@redhat.com, mst@redhat.com, peterx@redhat.com,
jasowang@redhat.com, jgg@nvidia.com, nicolinc@nvidia.com,
joao.m.martins@oracle.com, clement.mathieu--drif@eviden.com,
Tian, Kevin, Liu, Yi L, Peng, Chao P
Hi Cédric, Eric,
>-----Original Message-----
>From: Cédric Le Goater <clg@redhat.com>
>Subject: Re: [PATCH v6 11/19] backends/iommufd: Implement
>HostIOMMUDeviceClass::get_cap() handler
>
>On 6/3/24 13:32, Eric Auger wrote:
>>
>>
>> On 6/3/24 08:10, Zhenzhong Duan wrote:
>>> Suggested-by: Cédric Le Goater <clg@redhat.com>
>>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>>> ---
>>> backends/iommufd.c | 23 +++++++++++++++++++++++
>>> 1 file changed, 23 insertions(+)
>>>
>>> diff --git a/backends/iommufd.c b/backends/iommufd.c
>>> index c7e969d6f7..f2f7a762a0 100644
>>> --- a/backends/iommufd.c
>>> +++ b/backends/iommufd.c
>>> @@ -230,6 +230,28 @@ bool
>iommufd_backend_get_device_info(IOMMUFDBackend *be, uint32_t devid,
>>> return true;
>>> }
>>>
>>> +static int hiod_iommufd_get_cap(HostIOMMUDevice *hiod, int cap,
>Error **errp)
>>> +{
>>> + HostIOMMUDeviceCaps *caps = &hiod->caps;
>>> +
>>> + switch (cap) {
>>> + case HOST_IOMMU_DEVICE_CAP_IOMMU_TYPE:
>>> + return caps->type;
>>> + case HOST_IOMMU_DEVICE_CAP_AW_BITS:
>>> + return caps->aw_bits;
>>> + default:
>>> + error_setg(errp, "Not support get cap %x", cap);
>> can't you add details about the faulting HostIOMMUDevice by tracing the
>> devid for instance?
>
>yes.
devid was left out to keep this series simpler.
It is added in the nesting series: https://github.com/yiliu1765/qemu/commit/5333b1a0ae03b3c5119b46a1af786d199f103889
Would you like devid added in this series for tracing purposes, or is adding the trace in the nesting series fine with you?
>
>> I would rephrase the error message into No support for capability 0x%x
>
>I was going to propose "Unsupported capability ..."
Sounds better, will do.
Thanks
Zhenzhong
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [PATCH v6 11/19] backends/iommufd: Implement HostIOMMUDeviceClass::get_cap() handler
2024-06-04 3:23 ` Duan, Zhenzhong
@ 2024-06-04 8:10 ` Eric Auger
2024-06-04 8:46 ` Duan, Zhenzhong
0 siblings, 1 reply; 70+ messages in thread
From: Eric Auger @ 2024-06-04 8:10 UTC (permalink / raw)
To: Duan, Zhenzhong, Cédric Le Goater, qemu-devel@nongnu.org
Cc: alex.williamson@redhat.com, mst@redhat.com, peterx@redhat.com,
jasowang@redhat.com, jgg@nvidia.com, nicolinc@nvidia.com,
joao.m.martins@oracle.com, clement.mathieu--drif@eviden.com,
Tian, Kevin, Liu, Yi L, Peng, Chao P
On 6/4/24 05:23, Duan, Zhenzhong wrote:
> Hi Cédric, Eric,
>
>> -----Original Message-----
>> From: Cédric Le Goater <clg@redhat.com>
>> Subject: Re: [PATCH v6 11/19] backends/iommufd: Implement
>> HostIOMMUDeviceClass::get_cap() handler
>>
>> On 6/3/24 13:32, Eric Auger wrote:
>>>
>>> On 6/3/24 08:10, Zhenzhong Duan wrote:
>>>> Suggested-by: Cédric Le Goater <clg@redhat.com>
>>>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>>>> ---
>>>> backends/iommufd.c | 23 +++++++++++++++++++++++
>>>> 1 file changed, 23 insertions(+)
>>>>
>>>> diff --git a/backends/iommufd.c b/backends/iommufd.c
>>>> index c7e969d6f7..f2f7a762a0 100644
>>>> --- a/backends/iommufd.c
>>>> +++ b/backends/iommufd.c
>>>> @@ -230,6 +230,28 @@ bool iommufd_backend_get_device_info(IOMMUFDBackend *be, uint32_t devid,
>>>> return true;
>>>> }
>>>>
>>>> +static int hiod_iommufd_get_cap(HostIOMMUDevice *hiod, int cap, Error **errp)
>>>> +{
>>>> + HostIOMMUDeviceCaps *caps = &hiod->caps;
>>>> +
>>>> + switch (cap) {
>>>> + case HOST_IOMMU_DEVICE_CAP_IOMMU_TYPE:
>>>> + return caps->type;
>>>> + case HOST_IOMMU_DEVICE_CAP_AW_BITS:
>>>> + return caps->aw_bits;
>>>> + default:
>>>> + error_setg(errp, "Not support get cap %x", cap);
>>> can't you add details about the faulting HostIOMMUDevice by tracing the
>>> devid for instance?
>> yes.
> devid isn't added to make this series simpler.
> It's added in nesting series, https://github.com/yiliu1765/qemu/commit/5333b1a0ae03b3c5119b46a1af786d199f103889
>
> Do you want to add devid in this series for tracing purpose or adding trace in nesting series is fine for you?
What would be nice is a common way to identify a HostIOMMUDevice.
Can't we use the name of the VFIO/VDPA device? devid does not exist on
a legacy container. At the least, some kind of wrapper to extract the
name may be relevant.
Eric
>
>>> I would rephrase the error message into No support for capability 0x%x
>> I was going to propose "Unsupported capability ..."
> Sounds better, will do.
>
> Thanks
> Zhenzhong
>
* RE: [PATCH v6 11/19] backends/iommufd: Implement HostIOMMUDeviceClass::get_cap() handler
2024-06-04 8:10 ` Eric Auger
@ 2024-06-04 8:46 ` Duan, Zhenzhong
2024-06-04 9:37 ` Eric Auger
0 siblings, 1 reply; 70+ messages in thread
From: Duan, Zhenzhong @ 2024-06-04 8:46 UTC (permalink / raw)
To: eric.auger@redhat.com, Cédric Le Goater,
qemu-devel@nongnu.org
Cc: alex.williamson@redhat.com, mst@redhat.com, peterx@redhat.com,
jasowang@redhat.com, jgg@nvidia.com, nicolinc@nvidia.com,
joao.m.martins@oracle.com, clement.mathieu--drif@eviden.com,
Tian, Kevin, Liu, Yi L, Peng, Chao P
>-----Original Message-----
>From: Eric Auger <eric.auger@redhat.com>
>Subject: Re: [PATCH v6 11/19] backends/iommufd: Implement
>HostIOMMUDeviceClass::get_cap() handler
>
>
>
>On 6/4/24 05:23, Duan, Zhenzhong wrote:
>> Hi Cédric, Eric,
>>
>>> -----Original Message-----
>>> From: Cédric Le Goater <clg@redhat.com>
>>> Subject: Re: [PATCH v6 11/19] backends/iommufd: Implement
>>> HostIOMMUDeviceClass::get_cap() handler
>>>
>>> On 6/3/24 13:32, Eric Auger wrote:
>>>>
>>>> On 6/3/24 08:10, Zhenzhong Duan wrote:
>>>>> Suggested-by: Cédric Le Goater <clg@redhat.com>
>>>>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>>>>> ---
>>>>> backends/iommufd.c | 23 +++++++++++++++++++++++
>>>>> 1 file changed, 23 insertions(+)
>>>>>
>>>>> diff --git a/backends/iommufd.c b/backends/iommufd.c
>>>>> index c7e969d6f7..f2f7a762a0 100644
>>>>> --- a/backends/iommufd.c
>>>>> +++ b/backends/iommufd.c
>>>>> @@ -230,6 +230,28 @@ bool iommufd_backend_get_device_info(IOMMUFDBackend *be, uint32_t devid,
>>>>> return true;
>>>>> }
>>>>>
>>>>> +static int hiod_iommufd_get_cap(HostIOMMUDevice *hiod, int cap, Error **errp)
>>>>> +{
>>>>> + HostIOMMUDeviceCaps *caps = &hiod->caps;
>>>>> +
>>>>> + switch (cap) {
>>>>> + case HOST_IOMMU_DEVICE_CAP_IOMMU_TYPE:
>>>>> + return caps->type;
>>>>> + case HOST_IOMMU_DEVICE_CAP_AW_BITS:
>>>>> + return caps->aw_bits;
>>>>> + default:
>>>>> + error_setg(errp, "Not support get cap %x", cap);
>>>> can't you add details about the faulting HostIOMMUDevice by tracing the
>>>> devid for instance?
>>> yes.
>> devid isn't added to make this series simpler.
>> It's added in nesting series,
>> https://github.com/yiliu1765/qemu/commit/5333b1a0ae03b3c5119b46a1af786d199f103889
>>
>> Do you want to add devid in this series for tracing purpose or adding trace in nesting series is fine for you?
>
>what would be nice is to get a common way to identify a HostIOMMUDevice,
>can't we use the name of the VFIO/VDPA device? devid does not exist on
>legacy container. At least a kind of wrapper may be relevant to extract
>the name.
Getting the name directly is not easy; we could save a copy in .realize(), like below:
--- a/include/sysemu/host_iommu_device.h
+++ b/include/sysemu/host_iommu_device.h
@@ -33,6 +33,7 @@ OBJECT_DECLARE_TYPE(HostIOMMUDevice, HostIOMMUDeviceClass, HOST_IOMMU_DEVICE)
struct HostIOMMUDevice {
Object parent_obj;
+ char *name;
HostIOMMUDeviceCaps caps;
};
diff --git a/backends/iommufd.c b/backends/iommufd.c
index f2f7a762a0..84fefbc9ee 100644
--- a/backends/iommufd.c
+++ b/backends/iommufd.c
@@ -240,7 +240,7 @@ static int hiod_iommufd_get_cap(HostIOMMUDevice *hiod, int cap, Error **errp)
case HOST_IOMMU_DEVICE_CAP_AW_BITS:
return caps->aw_bits;
default:
- error_setg(errp, "Not support get cap %x", cap);
+ error_setg(errp, "%s: unsupported capability %x", hiod->name, cap);
return -EINVAL;
}
}
diff --git a/hw/vfio/container.c b/hw/vfio/container.c
index a830426647..e78538efec 100644
--- a/hw/vfio/container.c
+++ b/hw/vfio/container.c
@@ -1152,6 +1152,7 @@ static bool hiod_legacy_vfio_realize(HostIOMMUDevice *hiod, void *opaque,
} else {
hiod->caps.aw_bits = 0xff;
}
+ hiod->name = g_strdup(vdev->name);
return true;
}
@@ -1165,7 +1166,7 @@ static int hiod_legacy_vfio_get_cap(HostIOMMUDevice *hiod, int cap,
case HOST_IOMMU_DEVICE_CAP_AW_BITS:
return caps->aw_bits;
default:
- error_setg(errp, "Not support get cap %x", cap);
+ error_setg(errp, "%s: unsupported capability %x", hiod->name, cap);
return -EINVAL;
}
}
diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
index 8fd8d52bc2..2df3aed47f 100644
--- a/hw/vfio/iommufd.c
+++ b/hw/vfio/iommufd.c
@@ -637,6 +637,7 @@ static bool hiod_iommufd_vfio_realize(HostIOMMUDevice *hiod, void *opaque,
return false;
}
+ hiod->name = g_strdup(vdev->name);
caps->type = type;
* Re: [PATCH v6 11/19] backends/iommufd: Implement HostIOMMUDeviceClass::get_cap() handler
2024-06-04 8:46 ` Duan, Zhenzhong
@ 2024-06-04 9:37 ` Eric Auger
0 siblings, 0 replies; 70+ messages in thread
From: Eric Auger @ 2024-06-04 9:37 UTC (permalink / raw)
To: Duan, Zhenzhong, Cédric Le Goater, qemu-devel@nongnu.org
Cc: alex.williamson@redhat.com, mst@redhat.com, peterx@redhat.com,
jasowang@redhat.com, jgg@nvidia.com, nicolinc@nvidia.com,
joao.m.martins@oracle.com, clement.mathieu--drif@eviden.com,
Tian, Kevin, Liu, Yi L, Peng, Chao P
On 6/4/24 10:46, Duan, Zhenzhong wrote:
>
>> -----Original Message-----
>> From: Eric Auger <eric.auger@redhat.com>
>> Subject: Re: [PATCH v6 11/19] backends/iommufd: Implement
>> HostIOMMUDeviceClass::get_cap() handler
>>
>>
>>
>> On 6/4/24 05:23, Duan, Zhenzhong wrote:
>>> Hi Cédric, Eric,
>>>
>>>> -----Original Message-----
>>>> From: Cédric Le Goater <clg@redhat.com>
>>>> Subject: Re: [PATCH v6 11/19] backends/iommufd: Implement
>>>> HostIOMMUDeviceClass::get_cap() handler
>>>>
>>>> On 6/3/24 13:32, Eric Auger wrote:
>>>>> On 6/3/24 08:10, Zhenzhong Duan wrote:
>>>>>> Suggested-by: Cédric Le Goater <clg@redhat.com>
>>>>>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>>>>>> ---
>>>>>> backends/iommufd.c | 23 +++++++++++++++++++++++
>>>>>> 1 file changed, 23 insertions(+)
>>>>>>
>>>>>> diff --git a/backends/iommufd.c b/backends/iommufd.c
>>>>>> index c7e969d6f7..f2f7a762a0 100644
>>>>>> --- a/backends/iommufd.c
>>>>>> +++ b/backends/iommufd.c
>>>>>> @@ -230,6 +230,28 @@ bool iommufd_backend_get_device_info(IOMMUFDBackend *be, uint32_t devid,
>>>>>> return true;
>>>>>> }
>>>>>>
>>>>>> +static int hiod_iommufd_get_cap(HostIOMMUDevice *hiod, int cap, Error **errp)
>>>>>> +{
>>>>>> + HostIOMMUDeviceCaps *caps = &hiod->caps;
>>>>>> +
>>>>>> + switch (cap) {
>>>>>> + case HOST_IOMMU_DEVICE_CAP_IOMMU_TYPE:
>>>>>> + return caps->type;
>>>>>> + case HOST_IOMMU_DEVICE_CAP_AW_BITS:
>>>>>> + return caps->aw_bits;
>>>>>> + default:
>>>>>> + error_setg(errp, "Not support get cap %x", cap);
>>>>> can't you add details about the faulting HostIOMMUDevice by tracing the
>>>>> devid for instance?
>>>> yes.
>>> devid isn't added to make this series simpler.
>>> It's added in nesting series,
>>> https://github.com/yiliu1765/qemu/commit/5333b1a0ae03b3c5119b46a1af786d199f103889
>>> Do you want to add devid in this series for tracing purpose or adding trace in nesting series is fine for you?
>>
>> what would be nice is to get a common way to identify a HostIOMMUDevice,
>> can't we use the name of the VFIO/VDPA device? devid does not exist on
>> legacy container. At least a kind of wrapper may be relevant to extract
>> the name.
> Getting name directly is not easy, we can save a copy in .realize(), like below:
Sounds good, as long as the copy is also deallocated.
Eric
>
> --- a/include/sysemu/host_iommu_device.h
> +++ b/include/sysemu/host_iommu_device.h
> @@ -33,6 +33,7 @@ OBJECT_DECLARE_TYPE(HostIOMMUDevice, HostIOMMUDeviceClass, HOST_IOMMU_DEVICE)
> struct HostIOMMUDevice {
> Object parent_obj;
>
> + char *name;
> HostIOMMUDeviceCaps caps;
> };
>
> diff --git a/backends/iommufd.c b/backends/iommufd.c
> index f2f7a762a0..84fefbc9ee 100644
> --- a/backends/iommufd.c
> +++ b/backends/iommufd.c
> @@ -240,7 +240,7 @@ static int hiod_iommufd_get_cap(HostIOMMUDevice *hiod, int cap, Error **errp)
> case HOST_IOMMU_DEVICE_CAP_AW_BITS:
> return caps->aw_bits;
> default:
> - error_setg(errp, "Not support get cap %x", cap);
> + error_setg(errp, "%s: unsupported capability %x", hiod->name, cap);
> return -EINVAL;
> }
> }
> diff --git a/hw/vfio/container.c b/hw/vfio/container.c
> index a830426647..e78538efec 100644
> --- a/hw/vfio/container.c
> +++ b/hw/vfio/container.c
> @@ -1152,6 +1152,7 @@ static bool hiod_legacy_vfio_realize(HostIOMMUDevice *hiod, void *opaque,
> } else {
> hiod->caps.aw_bits = 0xff;
> }
> + hiod->name = g_strdup(vdev->name);
>
> return true;
> }
> @@ -1165,7 +1166,7 @@ static int hiod_legacy_vfio_get_cap(HostIOMMUDevice *hiod, int cap,
> case HOST_IOMMU_DEVICE_CAP_AW_BITS:
> return caps->aw_bits;
> default:
> - error_setg(errp, "Not support get cap %x", cap);
> + error_setg(errp, "%s: unsupported capability %x", hiod->name, cap);
> return -EINVAL;
> }
> }
> diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
> index 8fd8d52bc2..2df3aed47f 100644
> --- a/hw/vfio/iommufd.c
> +++ b/hw/vfio/iommufd.c
> @@ -637,6 +637,7 @@ static bool hiod_iommufd_vfio_realize(HostIOMMUDevice *hiod, void *opaque,
> return false;
> }
>
> + hiod->name = g_strdup(vdev->name);
> caps->type = type;
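
The "save a copy in .realize(), free it on teardown" idea from this exchange
can be sketched standalone as below. All names here (DemoHostDev,
demo_realize, demo_finalize) are hypothetical, and plain malloc/free is
assumed in place of g_strdup()/g_free(); where the matching free would live
in the real series is not decided in this thread.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for HostIOMMUDevice; not the QEMU type. */
typedef struct DemoHostDev {
    char *name;   /* owned copy: set in realize, released in finalize */
} DemoHostDev;

/*
 * Copy the caller's name so the object does not depend on the
 * lifetime of the source string. Returns 0 on success, -1 on OOM.
 */
static int demo_realize(DemoHostDev *d, const char *src_name)
{
    size_t len;

    assert(d && src_name);
    len = strlen(src_name) + 1;
    d->name = malloc(len);
    if (!d->name) {
        return -1;
    }
    memcpy(d->name, src_name, len);
    return 0;
}

/* Matching teardown: free the copy and clear the pointer. */
static void demo_finalize(DemoHostDev *d)
{
    free(d->name);
    d->name = NULL;
}
```

Clearing the pointer after the free keeps a repeated teardown harmless,
since free(NULL) is a no-op.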
* [PATCH v6 12/19] vfio: Introduce VFIOIOMMUClass::hiod_typename attribute
2024-06-03 6:10 [PATCH v6 00/19] Add a host IOMMU device abstraction to check with vIOMMU Zhenzhong Duan
` (10 preceding siblings ...)
2024-06-03 6:10 ` [PATCH v6 11/19] backends/iommufd: " Zhenzhong Duan
@ 2024-06-03 6:10 ` Zhenzhong Duan
2024-06-03 6:10 ` [PATCH v6 13/19] vfio: Create host IOMMU device instance Zhenzhong Duan
` (7 subsequent siblings)
19 siblings, 0 replies; 70+ messages in thread
From: Zhenzhong Duan @ 2024-06-03 6:10 UTC (permalink / raw)
To: qemu-devel
Cc: alex.williamson, clg, eric.auger, mst, peterx, jasowang, jgg,
nicolinc, joao.m.martins, clement.mathieu--drif, kevin.tian,
yi.l.liu, chao.p.peng, Zhenzhong Duan
Initialize the VFIOIOMMUClass::hiod_typename attribute based on the
VFIO backend type.
This attribute will facilitate HostIOMMUDevice creation in
vfio_attach_device().
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
include/hw/vfio/vfio-container-base.h | 3 +++
hw/vfio/container.c | 2 ++
hw/vfio/iommufd.c | 2 ++
3 files changed, 7 insertions(+)
diff --git a/include/hw/vfio/vfio-container-base.h b/include/hw/vfio/vfio-container-base.h
index 2776481fc9..442c0dfc4c 100644
--- a/include/hw/vfio/vfio-container-base.h
+++ b/include/hw/vfio/vfio-container-base.h
@@ -109,6 +109,9 @@ DECLARE_CLASS_CHECKERS(VFIOIOMMUClass, VFIO_IOMMU, TYPE_VFIO_IOMMU)
struct VFIOIOMMUClass {
InterfaceClass parent_class;
+ /* Properties */
+ const char *hiod_typename;
+
/* basic feature */
bool (*setup)(VFIOContainerBase *bcontainer, Error **errp);
int (*dma_map)(const VFIOContainerBase *bcontainer,
diff --git a/hw/vfio/container.c b/hw/vfio/container.c
index a46c275a88..a830426647 100644
--- a/hw/vfio/container.c
+++ b/hw/vfio/container.c
@@ -1126,6 +1126,8 @@ static void vfio_iommu_legacy_class_init(ObjectClass *klass, void *data)
{
VFIOIOMMUClass *vioc = VFIO_IOMMU_CLASS(klass);
+ vioc->hiod_typename = TYPE_HOST_IOMMU_DEVICE_LEGACY_VFIO;
+
vioc->setup = vfio_legacy_setup;
vioc->dma_map = vfio_legacy_dma_map;
vioc->dma_unmap = vfio_legacy_dma_unmap;
diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
index 9d2e95e20e..8fd8d52bc2 100644
--- a/hw/vfio/iommufd.c
+++ b/hw/vfio/iommufd.c
@@ -613,6 +613,8 @@ static void vfio_iommu_iommufd_class_init(ObjectClass *klass, void *data)
{
VFIOIOMMUClass *vioc = VFIO_IOMMU_CLASS(klass);
+ vioc->hiod_typename = TYPE_HOST_IOMMU_DEVICE_IOMMUFD_VFIO;
+
vioc->dma_map = iommufd_cdev_map;
vioc->dma_unmap = iommufd_cdev_unmap;
vioc->attach_device = iommufd_cdev_attach;
--
2.34.1
* [PATCH v6 13/19] vfio: Create host IOMMU device instance
2024-06-03 6:10 [PATCH v6 00/19] Add a host IOMMU device abstraction to check with vIOMMU Zhenzhong Duan
` (11 preceding siblings ...)
2024-06-03 6:10 ` [PATCH v6 12/19] vfio: Introduce VFIOIOMMUClass::hiod_typename attribute Zhenzhong Duan
@ 2024-06-03 6:10 ` Zhenzhong Duan
2024-06-03 12:59 ` Eric Auger
2024-06-03 6:10 ` [PATCH v6 14/19] hw/pci: Introduce helper function pci_device_get_iommu_bus_devfn() Zhenzhong Duan
` (6 subsequent siblings)
19 siblings, 1 reply; 70+ messages in thread
From: Zhenzhong Duan @ 2024-06-03 6:10 UTC (permalink / raw)
To: qemu-devel
Cc: alex.williamson, clg, eric.auger, mst, peterx, jasowang, jgg,
nicolinc, joao.m.martins, clement.mathieu--drif, kevin.tian,
yi.l.liu, chao.p.peng, Zhenzhong Duan
Create a host IOMMU device instance in vfio_attach_device() and call
.realize() to initialize it further.
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
include/hw/vfio/vfio-common.h | 1 +
hw/vfio/common.c | 16 +++++++++++++++-
2 files changed, 16 insertions(+), 1 deletion(-)
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 56d1717211..c0851e83bb 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -127,6 +127,7 @@ typedef struct VFIODevice {
OnOffAuto pre_copy_dirty_page_tracking;
bool dirty_pages_supported;
bool dirty_tracking;
+ HostIOMMUDevice *hiod;
int devid;
IOMMUFDBackend *iommufd;
} VFIODevice;
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index f9619a1dfb..f20a7b5bba 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -1528,6 +1528,7 @@ bool vfio_attach_device(char *name, VFIODevice *vbasedev,
{
const VFIOIOMMUClass *ops =
VFIO_IOMMU_CLASS(object_class_by_name(TYPE_VFIO_IOMMU_LEGACY));
+ HostIOMMUDevice *hiod;
if (vbasedev->iommufd) {
ops = VFIO_IOMMU_CLASS(object_class_by_name(TYPE_VFIO_IOMMU_IOMMUFD));
@@ -1535,7 +1536,19 @@ bool vfio_attach_device(char *name, VFIODevice *vbasedev,
assert(ops);
- return ops->attach_device(name, vbasedev, as, errp);
+ if (!ops->attach_device(name, vbasedev, as, errp)) {
+ return false;
+ }
+
+ hiod = HOST_IOMMU_DEVICE(object_new(ops->hiod_typename));
+ if (!HOST_IOMMU_DEVICE_GET_CLASS(hiod)->realize(hiod, vbasedev, errp)) {
+ object_unref(hiod);
+ ops->detach_device(vbasedev);
+ return false;
+ }
+ vbasedev->hiod = hiod;
+
+ return true;
}
void vfio_detach_device(VFIODevice *vbasedev)
@@ -1543,5 +1556,6 @@ void vfio_detach_device(VFIODevice *vbasedev)
if (!vbasedev->bcontainer) {
return;
}
+ object_unref(vbasedev->hiod);
vbasedev->bcontainer->ops->detach_device(vbasedev);
}
--
2.34.1
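
The control flow of the patch above (attach first, then create and realize
the host IOMMU device, rolling back the attach if realize fails) can be
sketched standalone as below. Everything named demo_* or Demo* is a
hypothetical stand-in, and the realize_ok field exists only so the failure
path can be exercised; none of it is QEMU API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

typedef struct DemoDevice {
    bool attached;     /* set by the backend attach step */
    void *hiod;        /* stand-in for the HostIOMMUDevice pointer */
    bool realize_ok;   /* test knob: make realize succeed or fail */
} DemoDevice;

static bool demo_backend_attach(DemoDevice *dev)
{
    dev->attached = true;
    return true;
}

static void demo_backend_detach(DemoDevice *dev)
{
    dev->attached = false;
}

static bool demo_hiod_realize(void *hiod, DemoDevice *dev)
{
    (void)hiod;
    return dev->realize_ok;
}

/* Attach, then create and realize; undo the attach on failure. */
static bool demo_attach_device(DemoDevice *dev)
{
    void *hiod;

    assert(dev);
    if (!demo_backend_attach(dev)) {
        return false;
    }

    hiod = malloc(1);
    if (!hiod || !demo_hiod_realize(hiod, dev)) {
        free(hiod);
        demo_backend_detach(dev);
        return false;
    }
    dev->hiod = hiod;
    return true;
}

/* Release the host IOMMU device before detaching from the backend. */
static void demo_detach_device(DemoDevice *dev)
{
    if (!dev->attached) {
        return;
    }
    free(dev->hiod);
    dev->hiod = NULL;
    demo_backend_detach(dev);
}
```

The point of the ordering is that a failed realize leaves the device exactly
as it was before the call: not attached and with no object stored.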
* Re: [PATCH v6 13/19] vfio: Create host IOMMU device instance
2024-06-03 6:10 ` [PATCH v6 13/19] vfio: Create host IOMMU device instance Zhenzhong Duan
@ 2024-06-03 12:59 ` Eric Auger
2024-06-04 3:47 ` Duan, Zhenzhong
0 siblings, 1 reply; 70+ messages in thread
From: Eric Auger @ 2024-06-03 12:59 UTC (permalink / raw)
To: Zhenzhong Duan, qemu-devel
Cc: alex.williamson, clg, mst, peterx, jasowang, jgg, nicolinc,
joao.m.martins, clement.mathieu--drif, kevin.tian, yi.l.liu,
chao.p.peng
Hi Zhenzhong,
On 6/3/24 08:10, Zhenzhong Duan wrote:
> Create host IOMMU device instance in vfio_attach_device() and call
> .realize() to initialize it further.
I would squash this with the previous patch
Eric
>
> Suggested-by: Cédric Le Goater <clg@redhat.com>
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> ---
> include/hw/vfio/vfio-common.h | 1 +
> hw/vfio/common.c | 16 +++++++++++++++-
> 2 files changed, 16 insertions(+), 1 deletion(-)
>
> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
> index 56d1717211..c0851e83bb 100644
> --- a/include/hw/vfio/vfio-common.h
> +++ b/include/hw/vfio/vfio-common.h
> @@ -127,6 +127,7 @@ typedef struct VFIODevice {
> OnOffAuto pre_copy_dirty_page_tracking;
> bool dirty_pages_supported;
> bool dirty_tracking;
> + HostIOMMUDevice *hiod;
> int devid;
> IOMMUFDBackend *iommufd;
> } VFIODevice;
> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> index f9619a1dfb..f20a7b5bba 100644
> --- a/hw/vfio/common.c
> +++ b/hw/vfio/common.c
> @@ -1528,6 +1528,7 @@ bool vfio_attach_device(char *name, VFIODevice *vbasedev,
> {
> const VFIOIOMMUClass *ops =
> VFIO_IOMMU_CLASS(object_class_by_name(TYPE_VFIO_IOMMU_LEGACY));
> + HostIOMMUDevice *hiod;
>
> if (vbasedev->iommufd) {
> ops = VFIO_IOMMU_CLASS(object_class_by_name(TYPE_VFIO_IOMMU_IOMMUFD));
> @@ -1535,7 +1536,19 @@ bool vfio_attach_device(char *name, VFIODevice *vbasedev,
>
> assert(ops);
>
> - return ops->attach_device(name, vbasedev, as, errp);
> + if (!ops->attach_device(name, vbasedev, as, errp)) {
> + return false;
> + }
> +
> + hiod = HOST_IOMMU_DEVICE(object_new(ops->hiod_typename));
> + if (!HOST_IOMMU_DEVICE_GET_CLASS(hiod)->realize(hiod, vbasedev, errp)) {
> + object_unref(hiod);
> + ops->detach_device(vbasedev);
> + return false;
> + }
> + vbasedev->hiod = hiod;
> +
> + return true;
> }
>
> void vfio_detach_device(VFIODevice *vbasedev)
> @@ -1543,5 +1556,6 @@ void vfio_detach_device(VFIODevice *vbasedev)
> if (!vbasedev->bcontainer) {
> return;
> }
> + object_unref(vbasedev->hiod);
> vbasedev->bcontainer->ops->detach_device(vbasedev);
> }
* RE: [PATCH v6 13/19] vfio: Create host IOMMU device instance
2024-06-03 12:59 ` Eric Auger
@ 2024-06-04 3:47 ` Duan, Zhenzhong
0 siblings, 0 replies; 70+ messages in thread
From: Duan, Zhenzhong @ 2024-06-04 3:47 UTC (permalink / raw)
To: eric.auger@redhat.com, qemu-devel@nongnu.org
Cc: alex.williamson@redhat.com, clg@redhat.com, mst@redhat.com,
peterx@redhat.com, jasowang@redhat.com, jgg@nvidia.com,
nicolinc@nvidia.com, joao.m.martins@oracle.com,
clement.mathieu--drif@eviden.com, Tian, Kevin, Liu, Yi L,
Peng, Chao P
>-----Original Message-----
>From: Eric Auger <eric.auger@redhat.com>
>Subject: Re: [PATCH v6 13/19] vfio: Create host IOMMU device instance
>
>Hi Zhenzhong,
>
>On 6/3/24 08:10, Zhenzhong Duan wrote:
>> Create host IOMMU device instance in vfio_attach_device() and call
>> .realize() to initialize it further.
>
>I would squash this with the previous patch
Will do.
Thanks
Zhenzhong
>
>Eric
>>
>> Suggested-by: Cédric Le Goater <clg@redhat.com>
>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>> ---
>> include/hw/vfio/vfio-common.h | 1 +
>> hw/vfio/common.c | 16 +++++++++++++++-
>> 2 files changed, 16 insertions(+), 1 deletion(-)
>>
>> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
>> index 56d1717211..c0851e83bb 100644
>> --- a/include/hw/vfio/vfio-common.h
>> +++ b/include/hw/vfio/vfio-common.h
>> @@ -127,6 +127,7 @@ typedef struct VFIODevice {
>> OnOffAuto pre_copy_dirty_page_tracking;
>> bool dirty_pages_supported;
>> bool dirty_tracking;
>> + HostIOMMUDevice *hiod;
>> int devid;
>> IOMMUFDBackend *iommufd;
>> } VFIODevice;
>> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
>> index f9619a1dfb..f20a7b5bba 100644
>> --- a/hw/vfio/common.c
>> +++ b/hw/vfio/common.c
>> @@ -1528,6 +1528,7 @@ bool vfio_attach_device(char *name, VFIODevice *vbasedev,
>> {
>> const VFIOIOMMUClass *ops =
>> VFIO_IOMMU_CLASS(object_class_by_name(TYPE_VFIO_IOMMU_LEGACY));
>> + HostIOMMUDevice *hiod;
>>
>> if (vbasedev->iommufd) {
>> ops = VFIO_IOMMU_CLASS(object_class_by_name(TYPE_VFIO_IOMMU_IOMMUFD));
>> @@ -1535,7 +1536,19 @@ bool vfio_attach_device(char *name, VFIODevice *vbasedev,
>>
>> assert(ops);
>>
>> - return ops->attach_device(name, vbasedev, as, errp);
>> + if (!ops->attach_device(name, vbasedev, as, errp)) {
>> + return false;
>> + }
>> +
>> + hiod = HOST_IOMMU_DEVICE(object_new(ops->hiod_typename));
>> + if (!HOST_IOMMU_DEVICE_GET_CLASS(hiod)->realize(hiod, vbasedev, errp)) {
>> + object_unref(hiod);
>> + ops->detach_device(vbasedev);
>> + return false;
>> + }
>> + vbasedev->hiod = hiod;
>> +
>> + return true;
>> }
>>
>> void vfio_detach_device(VFIODevice *vbasedev)
>> @@ -1543,5 +1556,6 @@ void vfio_detach_device(VFIODevice *vbasedev)
>> if (!vbasedev->bcontainer) {
>> return;
>> }
>> + object_unref(vbasedev->hiod);
>> vbasedev->bcontainer->ops->detach_device(vbasedev);
>> }
* [PATCH v6 14/19] hw/pci: Introduce helper function pci_device_get_iommu_bus_devfn()
2024-06-03 6:10 [PATCH v6 00/19] Add a host IOMMU device abstraction to check with vIOMMU Zhenzhong Duan
` (12 preceding siblings ...)
2024-06-03 6:10 ` [PATCH v6 13/19] vfio: Create host IOMMU device instance Zhenzhong Duan
@ 2024-06-03 6:10 ` Zhenzhong Duan
2024-06-03 13:40 ` Eric Auger
2024-06-03 6:10 ` [PATCH v6 15/19] hw/pci: Introduce pci_device_[set|unset]_iommu_device() Zhenzhong Duan
` (5 subsequent siblings)
19 siblings, 1 reply; 70+ messages in thread
From: Zhenzhong Duan @ 2024-06-03 6:10 UTC (permalink / raw)
To: qemu-devel
Cc: alex.williamson, clg, eric.auger, mst, peterx, jasowang, jgg,
nicolinc, joao.m.martins, clement.mathieu--drif, kevin.tian,
yi.l.liu, chao.p.peng, Zhenzhong Duan, Yi Sun, Marcel Apfelbaum
Extract pci_device_get_iommu_bus_devfn() from
pci_device_iommu_address_space() to facilitate the
implementation of pci_device_[set|unset]_iommu_device()
in the following patch.
No functional change intended.
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
hw/pci/pci.c | 48 +++++++++++++++++++++++++++++++++++++++++++++---
1 file changed, 45 insertions(+), 3 deletions(-)
diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 324c1302d2..02a4bb2af6 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -2648,11 +2648,27 @@ static void pci_device_class_base_init(ObjectClass *klass, void *data)
}
}
-AddressSpace *pci_device_iommu_address_space(PCIDevice *dev)
+/*
+ * Get IOMMU root bus, aliased bus and devfn of a PCI device
+ *
+ * IOMMU root bus is needed by all call sites to call into iommu_ops.
+ * For call sites which don't need aliased BDF, passing NULL to
+ * aliased_[bus|devfn] is allowed.
+ *
+ * @piommu_bus: return root #PCIBus backed by an IOMMU for the PCI device.
+ *
+ * @aliased_bus: return aliased #PCIBus of the PCI device, optional.
+ *
+ * @aliased_devfn: return aliased devfn of the PCI device, optional.
+ */
+static void pci_device_get_iommu_bus_devfn(PCIDevice *dev,
+ PCIBus **piommu_bus,
+ PCIBus **aliased_bus,
+ int *aliased_devfn)
{
PCIBus *bus = pci_get_bus(dev);
PCIBus *iommu_bus = bus;
- uint8_t devfn = dev->devfn;
+ int devfn = dev->devfn;
while (iommu_bus && !iommu_bus->iommu_ops && iommu_bus->parent_dev) {
PCIBus *parent_bus = pci_get_bus(iommu_bus->parent_dev);
@@ -2693,7 +2709,33 @@ AddressSpace *pci_device_iommu_address_space(PCIDevice *dev)
iommu_bus = parent_bus;
}
- if (!pci_bus_bypass_iommu(bus) && iommu_bus->iommu_ops) {
+
+ assert(0 <= devfn && devfn < PCI_DEVFN_MAX);
+ assert(iommu_bus);
+
+ if (pci_bus_bypass_iommu(bus) || !iommu_bus->iommu_ops) {
+ iommu_bus = NULL;
+ }
+
+ *piommu_bus = iommu_bus;
+
+ if (aliased_bus) {
+ *aliased_bus = bus;
+ }
+
+ if (aliased_devfn) {
+ *aliased_devfn = devfn;
+ }
+}
+
+AddressSpace *pci_device_iommu_address_space(PCIDevice *dev)
+{
+ PCIBus *bus;
+ PCIBus *iommu_bus;
+ int devfn;
+
+ pci_device_get_iommu_bus_devfn(dev, &iommu_bus, &bus, &devfn);
+ if (iommu_bus) {
return iommu_bus->iommu_ops->get_address_space(bus,
iommu_bus->iommu_opaque, devfn);
}
--
2.34.1
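
The helper's calling convention (one mandatory output, two optional outputs
that may be NULL) can be sketched standalone as below. DemoBus and
demo_get_iommu_bus_devfn are hypothetical, and the sketch deliberately leaves
out the requester ID aliasing and the bypass check that the real function
performs; it only walks up to the nearest bus that has an IOMMU registered.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily reduced stand-in for PCIBus. */
typedef struct DemoBus {
    struct DemoBus *parent;  /* NULL at the root */
    int has_iommu_ops;       /* non-zero if an IOMMU is registered here */
} DemoBus;

/*
 * piommu_bus is mandatory and is set to NULL when no bus on the way
 * up has IOMMU ops. aliased_bus and aliased_devfn are optional: pass
 * NULL when the caller does not need them.
 */
static void demo_get_iommu_bus_devfn(DemoBus *bus, int devfn,
                                     DemoBus **piommu_bus,
                                     DemoBus **aliased_bus,
                                     int *aliased_devfn)
{
    DemoBus *iommu_bus = bus;

    assert(bus && piommu_bus);

    while (!iommu_bus->has_iommu_ops && iommu_bus->parent) {
        iommu_bus = iommu_bus->parent;
    }

    *piommu_bus = iommu_bus->has_iommu_ops ? iommu_bus : NULL;

    if (aliased_bus) {
        *aliased_bus = bus;
    }
    if (aliased_devfn) {
        *aliased_devfn = devfn;
    }
}
```

Folding the "no usable IOMMU" case into a NULL piommu_bus lets each caller
test a single pointer instead of repeating the condition.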
* Re: [PATCH v6 14/19] hw/pci: Introduce helper function pci_device_get_iommu_bus_devfn()
2024-06-03 6:10 ` [PATCH v6 14/19] hw/pci: Introduce helper function pci_device_get_iommu_bus_devfn() Zhenzhong Duan
@ 2024-06-03 13:40 ` Eric Auger
0 siblings, 0 replies; 70+ messages in thread
From: Eric Auger @ 2024-06-03 13:40 UTC (permalink / raw)
To: Zhenzhong Duan, qemu-devel
Cc: alex.williamson, clg, mst, peterx, jasowang, jgg, nicolinc,
joao.m.martins, clement.mathieu--drif, kevin.tian, yi.l.liu,
chao.p.peng, Yi Sun, Marcel Apfelbaum
On 6/3/24 08:10, Zhenzhong Duan wrote:
> Extract out pci_device_get_iommu_bus_devfn() from
> pci_device_iommu_address_space() to facilitate
> implementation of pci_device_[set|unset]_iommu_device()
> in following patch.
>
> No functional change intended.
>
> Signed-off-by: Yi Liu <yi.l.liu@intel.com>
> Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Eric
> ---
> hw/pci/pci.c | 48 +++++++++++++++++++++++++++++++++++++++++++++---
> 1 file changed, 45 insertions(+), 3 deletions(-)
>
> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> index 324c1302d2..02a4bb2af6 100644
> --- a/hw/pci/pci.c
> +++ b/hw/pci/pci.c
> @@ -2648,11 +2648,27 @@ static void pci_device_class_base_init(ObjectClass *klass, void *data)
> }
> }
>
> -AddressSpace *pci_device_iommu_address_space(PCIDevice *dev)
> +/*
> + * Get IOMMU root bus, aliased bus and devfn of a PCI device
> + *
> + * IOMMU root bus is needed by all call sites to call into iommu_ops.
> + * For call sites which don't need aliased BDF, passing NULL to
> + * aliased_[bus|devfn] is allowed.
> + *
> + * @piommu_bus: return root #PCIBus backed by an IOMMU for the PCI device.
> + *
> + * @aliased_bus: return aliased #PCIBus of the PCI device, optional.
> + *
> + * @aliased_devfn: return aliased devfn of the PCI device, optional.
> + */
> +static void pci_device_get_iommu_bus_devfn(PCIDevice *dev,
> + PCIBus **piommu_bus,
> + PCIBus **aliased_bus,
> + int *aliased_devfn)
> {
> PCIBus *bus = pci_get_bus(dev);
> PCIBus *iommu_bus = bus;
> - uint8_t devfn = dev->devfn;
> + int devfn = dev->devfn;
>
> while (iommu_bus && !iommu_bus->iommu_ops && iommu_bus->parent_dev) {
> PCIBus *parent_bus = pci_get_bus(iommu_bus->parent_dev);
> @@ -2693,7 +2709,33 @@ AddressSpace *pci_device_iommu_address_space(PCIDevice *dev)
>
> iommu_bus = parent_bus;
> }
> - if (!pci_bus_bypass_iommu(bus) && iommu_bus->iommu_ops) {
> +
> + assert(0 <= devfn && devfn < PCI_DEVFN_MAX);
> + assert(iommu_bus);
> +
> + if (pci_bus_bypass_iommu(bus) || !iommu_bus->iommu_ops) {
> + iommu_bus = NULL;
> + }
> +
> + *piommu_bus = iommu_bus;
> +
> + if (aliased_bus) {
> + *aliased_bus = bus;
> + }
> +
> + if (aliased_devfn) {
> + *aliased_devfn = devfn;
> + }
> +}
> +
> +AddressSpace *pci_device_iommu_address_space(PCIDevice *dev)
> +{
> + PCIBus *bus;
> + PCIBus *iommu_bus;
> + int devfn;
> +
> + pci_device_get_iommu_bus_devfn(dev, &iommu_bus, &bus, &devfn);
> + if (iommu_bus) {
> return iommu_bus->iommu_ops->get_address_space(bus,
> iommu_bus->iommu_opaque, devfn);
> }
* [PATCH v6 15/19] hw/pci: Introduce pci_device_[set|unset]_iommu_device()
2024-06-03 6:10 [PATCH v6 00/19] Add a host IOMMU device abstraction to check with vIOMMU Zhenzhong Duan
` (13 preceding siblings ...)
2024-06-03 6:10 ` [PATCH v6 14/19] hw/pci: Introduce helper function pci_device_get_iommu_bus_devfn() Zhenzhong Duan
@ 2024-06-03 6:10 ` Zhenzhong Duan
2024-06-03 13:54 ` Eric Auger
2024-06-03 6:10 ` [PATCH v6 16/19] vfio/pci: Pass HostIOMMUDevice to vIOMMU Zhenzhong Duan
` (4 subsequent siblings)
19 siblings, 1 reply; 70+ messages in thread
From: Zhenzhong Duan @ 2024-06-03 6:10 UTC (permalink / raw)
To: qemu-devel
Cc: alex.williamson, clg, eric.auger, mst, peterx, jasowang, jgg,
nicolinc, joao.m.martins, clement.mathieu--drif, kevin.tian,
yi.l.liu, chao.p.peng, Yi Sun, Zhenzhong Duan, Marcel Apfelbaum
From: Yi Liu <yi.l.liu@intel.com>
pci_device_[set|unset]_iommu_device() call pci_device_get_iommu_bus_devfn()
to get iommu_bus->iommu_ops and then call the [set|unset]_iommu_device
callback to set/unset the HostIOMMUDevice for a given PCI device.
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
include/hw/pci/pci.h | 38 +++++++++++++++++++++++++++++++++++++-
hw/pci/pci.c | 27 +++++++++++++++++++++++++++
2 files changed, 64 insertions(+), 1 deletion(-)
diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index eaa3fc99d8..c84cc9b99a 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -3,6 +3,7 @@
#include "exec/memory.h"
#include "sysemu/dma.h"
+#include "sysemu/host_iommu_device.h"
/* PCI includes legacy ISA access. */
#include "hw/isa/isa.h"
@@ -383,10 +384,45 @@ typedef struct PCIIOMMUOps {
*
* @devfn: device and function number
*/
- AddressSpace * (*get_address_space)(PCIBus *bus, void *opaque, int devfn);
+ AddressSpace * (*get_address_space)(PCIBus *bus, void *opaque, int devfn);
+ /**
+ * @set_iommu_device: attach a HostIOMMUDevice to a vIOMMU
+ *
+ * Optional callback, if not implemented in vIOMMU, then vIOMMU can't
+ * retrieve host information from the associated HostIOMMUDevice.
+ *
+ * @bus: the #PCIBus of the PCI device.
+ *
+ * @opaque: the data passed to pci_setup_iommu().
+ *
+ * @devfn: device and function number of the PCI device.
+ *
+ * @dev: the data structure representing host IOMMU device.
+ *
+ * @errp: pass an Error out only when return false
+ *
+ * Returns: true if HostIOMMUDevice is attached or else false with errp set.
+ */
+ bool (*set_iommu_device)(PCIBus *bus, void *opaque, int devfn,
+ HostIOMMUDevice *dev, Error **errp);
+ /**
+ * @unset_iommu_device: detach a HostIOMMUDevice from a vIOMMU
+ *
+ * Optional callback.
+ *
+ * @bus: the #PCIBus of the PCI device.
+ *
+ * @opaque: the data passed to pci_setup_iommu().
+ *
+ * @devfn: device and function number of the PCI device.
+ */
+ void (*unset_iommu_device)(PCIBus *bus, void *opaque, int devfn);
} PCIIOMMUOps;
AddressSpace *pci_device_iommu_address_space(PCIDevice *dev);
+bool pci_device_set_iommu_device(PCIDevice *dev, HostIOMMUDevice *hiod,
+ Error **errp);
+void pci_device_unset_iommu_device(PCIDevice *dev);
/**
* pci_setup_iommu: Initialize specific IOMMU handlers for a PCIBus
diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 02a4bb2af6..c8a8aab306 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -2742,6 +2742,33 @@ AddressSpace *pci_device_iommu_address_space(PCIDevice *dev)
return &address_space_memory;
}
+bool pci_device_set_iommu_device(PCIDevice *dev, HostIOMMUDevice *hiod,
+ Error **errp)
+{
+ PCIBus *iommu_bus;
+
+ /* set_iommu_device requires device's direct BDF instead of aliased BDF */
+ pci_device_get_iommu_bus_devfn(dev, &iommu_bus, NULL, NULL);
+ if (iommu_bus && iommu_bus->iommu_ops->set_iommu_device) {
+ return iommu_bus->iommu_ops->set_iommu_device(pci_get_bus(dev),
+ iommu_bus->iommu_opaque,
+ dev->devfn, hiod, errp);
+ }
+ return true;
+}
+
+void pci_device_unset_iommu_device(PCIDevice *dev)
+{
+ PCIBus *iommu_bus;
+
+ pci_device_get_iommu_bus_devfn(dev, &iommu_bus, NULL, NULL);
+ if (iommu_bus && iommu_bus->iommu_ops->unset_iommu_device) {
+ return iommu_bus->iommu_ops->unset_iommu_device(pci_get_bus(dev),
+ iommu_bus->iommu_opaque,
+ dev->devfn);
+ }
+}
+
void pci_setup_iommu(PCIBus *bus, const PCIIOMMUOps *ops, void *opaque)
{
/*
--
2.34.1
^ permalink raw reply related [flat|nested] 70+ messages in thread
* Re: [PATCH v6 15/19] hw/pci: Introduce pci_device_[set|unset]_iommu_device()
2024-06-03 6:10 ` [PATCH v6 15/19] hw/pci: Introduce pci_device_[set|unset]_iommu_device() Zhenzhong Duan
@ 2024-06-03 13:54 ` Eric Auger
0 siblings, 0 replies; 70+ messages in thread
From: Eric Auger @ 2024-06-03 13:54 UTC (permalink / raw)
To: Zhenzhong Duan, qemu-devel
Cc: alex.williamson, clg, mst, peterx, jasowang, jgg, nicolinc,
joao.m.martins, clement.mathieu--drif, kevin.tian, yi.l.liu,
chao.p.peng, Yi Sun, Marcel Apfelbaum
On 6/3/24 08:10, Zhenzhong Duan wrote:
> From: Yi Liu <yi.l.liu@intel.com>
>
> pci_device_[set|unset]_iommu_device() call pci_device_get_iommu_bus_devfn()
> to get iommu_bus->iommu_ops and call [set|unset]_iommu_device callback to
> set/unset HostIOMMUDevice for a given PCI device.
>
> Signed-off-by: Yi Liu <yi.l.liu@intel.com>
> Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> ---
> include/hw/pci/pci.h | 38 +++++++++++++++++++++++++++++++++++++-
> hw/pci/pci.c | 27 +++++++++++++++++++++++++++
> 2 files changed, 64 insertions(+), 1 deletion(-)
>
> diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
> index eaa3fc99d8..c84cc9b99a 100644
> --- a/include/hw/pci/pci.h
> +++ b/include/hw/pci/pci.h
> @@ -3,6 +3,7 @@
>
> #include "exec/memory.h"
> #include "sysemu/dma.h"
> +#include "sysemu/host_iommu_device.h"
>
> /* PCI includes legacy ISA access. */
> #include "hw/isa/isa.h"
> @@ -383,10 +384,45 @@ typedef struct PCIIOMMUOps {
> *
> * @devfn: device and function number
> */
> - AddressSpace * (*get_address_space)(PCIBus *bus, void *opaque, int devfn);
> + AddressSpace * (*get_address_space)(PCIBus *bus, void *opaque, int devfn);
> + /**
> + * @set_iommu_device: attach a HostIOMMUDevice to a vIOMMU
> + *
> + * Optional callback, if not implemented in vIOMMU, then vIOMMU can't
> + * retrieve host information from the associated HostIOMMUDevice.
> + *
> + * @bus: the #PCIBus of the PCI device.
> + *
> + * @opaque: the data passed to pci_setup_iommu().
> + *
> + * @devfn: device and function number of the PCI device.
> + *
> + * @dev: the data structure representing host IOMMU device.
the #HostIOMMUDevice to attach?
> + *
> + * @errp: pass an Error out only when return false
> + *
> + * Returns: true if HostIOMMUDevice is attached or else false with errp set.
> + */
> + bool (*set_iommu_device)(PCIBus *bus, void *opaque, int devfn,
> + HostIOMMUDevice *dev, Error **errp);
> + /**
> + * @unset_iommu_device: detach a HostIOMMUDevice from a vIOMMU
> + *
> + * Optional callback.
> + *
> + * @bus: the #PCIBus of the PCI device.
> + *
> + * @opaque: the data passed to pci_setup_iommu().
> + *
> + * @devfn: device and function number of the PCI device.
> + */
> + void (*unset_iommu_device)(PCIBus *bus, void *opaque, int devfn);
> } PCIIOMMUOps;
>
> AddressSpace *pci_device_iommu_address_space(PCIDevice *dev);
> +bool pci_device_set_iommu_device(PCIDevice *dev, HostIOMMUDevice *hiod,
> + Error **errp);
> +void pci_device_unset_iommu_device(PCIDevice *dev);
>
> /**
> * pci_setup_iommu: Initialize specific IOMMU handlers for a PCIBus
> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> index 02a4bb2af6..c8a8aab306 100644
> --- a/hw/pci/pci.c
> +++ b/hw/pci/pci.c
> @@ -2742,6 +2742,33 @@ AddressSpace *pci_device_iommu_address_space(PCIDevice *dev)
> return &address_space_memory;
> }
>
> +bool pci_device_set_iommu_device(PCIDevice *dev, HostIOMMUDevice *hiod,
> + Error **errp)
> +{
> + PCIBus *iommu_bus;
> +
> + /* set_iommu_device requires device's direct BDF instead of aliased BDF */
> + pci_device_get_iommu_bus_devfn(dev, &iommu_bus, NULL, NULL);
> + if (iommu_bus && iommu_bus->iommu_ops->set_iommu_device) {
> + return iommu_bus->iommu_ops->set_iommu_device(pci_get_bus(dev),
> + iommu_bus->iommu_opaque,
> + dev->devfn, hiod, errp);
> + }
> + return true;
> +}
> +
> +void pci_device_unset_iommu_device(PCIDevice *dev)
> +{
> + PCIBus *iommu_bus;
> +
> + pci_device_get_iommu_bus_devfn(dev, &iommu_bus, NULL, NULL);
> + if (iommu_bus && iommu_bus->iommu_ops->unset_iommu_device) {
> + return iommu_bus->iommu_ops->unset_iommu_device(pci_get_bus(dev),
> + iommu_bus->iommu_opaque,
> + dev->devfn);
> + }
> +}
> +
> void pci_setup_iommu(PCIBus *bus, const PCIIOMMUOps *ops, void *opaque)
> {
> /*
Other than that:
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Eric
^ permalink raw reply [flat|nested] 70+ messages in thread
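The dispatch pattern in the patch above (an optional callback, where a missing callback counts as success for "set" and as a no-op for "unset") can be sketched in isolation. This is a minimal illustration only: DemoOps, DemoBus and the demo_* functions are hypothetical stand-ins, not QEMU types or APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for PCIIOMMUOps and PCIBus; not QEMU types. */
typedef struct DemoOps {
    bool (*set_dev)(void *opaque, int devfn, void *hiod);
    void (*unset_dev)(void *opaque, int devfn);
} DemoOps;

typedef struct DemoBus {
    const DemoOps *ops;
    void *opaque;
} DemoBus;

/* A missing bus or a missing callback is treated as success. */
static bool demo_set_iommu_device(DemoBus *bus, int devfn, void *hiod)
{
    if (bus && bus->ops && bus->ops->set_dev) {
        return bus->ops->set_dev(bus->opaque, devfn, hiod);
    }
    return true;
}

/* A missing bus or a missing callback makes this a no-op. */
static void demo_unset_iommu_device(DemoBus *bus, int devfn)
{
    if (bus && bus->ops && bus->ops->unset_dev) {
        bus->ops->unset_dev(bus->opaque, devfn);
    }
}

/* Toy vIOMMU: the opaque pointer is an int slot holding the attached
 * devfn, or -1 when nothing is attached. */
static bool demo_viommu_set(void *opaque, int devfn, void *hiod)
{
    int *slot = opaque;

    assert(hiod);
    if (*slot != -1) {
        return false;   /* something is already attached */
    }
    *slot = devfn;
    return true;
}

static void demo_viommu_unset(void *opaque, int devfn)
{
    int *slot = opaque;

    if (*slot == devfn) {
        *slot = -1;
    }
}
```

With this shape, a caller does not need to know whether the vIOMMU implements the callbacks: attaching through a bus without them still returns true, and a second attach to an occupied slot returns false.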
* [PATCH v6 16/19] vfio/pci: Pass HostIOMMUDevice to vIOMMU
2024-06-03 6:10 [PATCH v6 00/19] Add a host IOMMU device abstraction to check with vIOMMU Zhenzhong Duan
` (14 preceding siblings ...)
2024-06-03 6:10 ` [PATCH v6 15/19] hw/pci: Introduce pci_device_[set|unset]_iommu_device() Zhenzhong Duan
@ 2024-06-03 6:10 ` Zhenzhong Duan
2024-06-03 13:54 ` Eric Auger
2024-06-03 6:10 ` [PATCH v6 17/19] intel_iommu: Extract out vtd_cap_init() to initialize cap/ecap Zhenzhong Duan
` (3 subsequent siblings)
19 siblings, 1 reply; 70+ messages in thread
From: Zhenzhong Duan @ 2024-06-03 6:10 UTC (permalink / raw)
To: qemu-devel
Cc: alex.williamson, clg, eric.auger, mst, peterx, jasowang, jgg,
nicolinc, joao.m.martins, clement.mathieu--drif, kevin.tian,
yi.l.liu, chao.p.peng, Zhenzhong Duan, Yi Sun
With HostIOMMUDevice passed, the vIOMMU can check compatibility with the
host IOMMU, call into IOMMUFD-specific methods, etc.
Originally-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
hw/vfio/pci.c | 19 ++++++++++++++-----
1 file changed, 14 insertions(+), 5 deletions(-)
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 74a79bdf61..d8a76c1ee0 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -3121,10 +3121,15 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
vfio_bars_register(vdev);
- if (!vfio_add_capabilities(vdev, errp)) {
+ if (!pci_device_set_iommu_device(pdev, vbasedev->hiod, errp)) {
+ error_prepend(errp, "Failed to set iommu_device: ");
goto out_teardown;
}
+ if (!vfio_add_capabilities(vdev, errp)) {
+ goto out_unset_idev;
+ }
+
if (vdev->vga) {
vfio_vga_quirk_setup(vdev);
}
@@ -3141,7 +3146,7 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
error_setg(errp,
"cannot support IGD OpRegion feature on hotplugged "
"device");
- goto out_teardown;
+ goto out_unset_idev;
}
ret = vfio_get_dev_region_info(vbasedev,
@@ -3150,11 +3155,11 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
if (ret) {
error_setg_errno(errp, -ret,
"does not support requested IGD OpRegion feature");
- goto out_teardown;
+ goto out_unset_idev;
}
if (!vfio_pci_igd_opregion_init(vdev, opregion, errp)) {
- goto out_teardown;
+ goto out_unset_idev;
}
}
@@ -3238,6 +3243,8 @@ out_deregister:
if (vdev->intx.mmap_timer) {
timer_free(vdev->intx.mmap_timer);
}
+out_unset_idev:
+ pci_device_unset_iommu_device(pdev);
out_teardown:
vfio_teardown_msi(vdev);
vfio_bars_exit(vdev);
@@ -3266,6 +3273,7 @@ static void vfio_instance_finalize(Object *obj)
static void vfio_exitfn(PCIDevice *pdev)
{
VFIOPCIDevice *vdev = VFIO_PCI(pdev);
+ VFIODevice *vbasedev = &vdev->vbasedev;
vfio_unregister_req_notifier(vdev);
vfio_unregister_err_notifier(vdev);
@@ -3280,7 +3288,8 @@ static void vfio_exitfn(PCIDevice *pdev)
vfio_teardown_msi(vdev);
vfio_pci_disable_rp_atomics(vdev);
vfio_bars_exit(vdev);
- vfio_migration_exit(&vdev->vbasedev);
+ vfio_migration_exit(vbasedev);
+ pci_device_unset_iommu_device(pdev);
}
static void vfio_pci_reset(DeviceState *dev)
--
2.34.1
^ permalink raw reply related [flat|nested] 70+ messages in thread
* Re: [PATCH v6 16/19] vfio/pci: Pass HostIOMMUDevice to vIOMMU
2024-06-03 6:10 ` [PATCH v6 16/19] vfio/pci: Pass HostIOMMUDevice to vIOMMU Zhenzhong Duan
@ 2024-06-03 13:54 ` Eric Auger
0 siblings, 0 replies; 70+ messages in thread
From: Eric Auger @ 2024-06-03 13:54 UTC (permalink / raw)
To: Zhenzhong Duan, qemu-devel
Cc: alex.williamson, clg, mst, peterx, jasowang, jgg, nicolinc,
joao.m.martins, clement.mathieu--drif, kevin.tian, yi.l.liu,
chao.p.peng, Yi Sun
On 6/3/24 08:10, Zhenzhong Duan wrote:
> With HostIOMMUDevice passed, vIOMMU can check compatibility with host
> IOMMU, call into IOMMUFD specific methods, etc.
>
> Originally-by: Yi Liu <yi.l.liu@intel.com>
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> ---
> hw/vfio/pci.c | 19 ++++++++++++++-----
> 1 file changed, 14 insertions(+), 5 deletions(-)
>
> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
> index 74a79bdf61..d8a76c1ee0 100644
> --- a/hw/vfio/pci.c
> +++ b/hw/vfio/pci.c
> @@ -3121,10 +3121,15 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
>
> vfio_bars_register(vdev);
>
> - if (!vfio_add_capabilities(vdev, errp)) {
> + if (!pci_device_set_iommu_device(pdev, vbasedev->hiod, errp)) {
> + error_prepend(errp, "Failed to set iommu_device: ");
> goto out_teardown;
> }
>
> + if (!vfio_add_capabilities(vdev, errp)) {
> + goto out_unset_idev;
> + }
> +
> if (vdev->vga) {
> vfio_vga_quirk_setup(vdev);
> }
> @@ -3141,7 +3146,7 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
> error_setg(errp,
> "cannot support IGD OpRegion feature on hotplugged "
> "device");
> - goto out_teardown;
> + goto out_unset_idev;
> }
>
> ret = vfio_get_dev_region_info(vbasedev,
> @@ -3150,11 +3155,11 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
> if (ret) {
> error_setg_errno(errp, -ret,
> "does not support requested IGD OpRegion feature");
> - goto out_teardown;
> + goto out_unset_idev;
> }
>
> if (!vfio_pci_igd_opregion_init(vdev, opregion, errp)) {
> - goto out_teardown;
> + goto out_unset_idev;
> }
> }
>
> @@ -3238,6 +3243,8 @@ out_deregister:
> if (vdev->intx.mmap_timer) {
> timer_free(vdev->intx.mmap_timer);
> }
> +out_unset_idev:
> + pci_device_unset_iommu_device(pdev);
> out_teardown:
> vfio_teardown_msi(vdev);
> vfio_bars_exit(vdev);
> @@ -3266,6 +3273,7 @@ static void vfio_instance_finalize(Object *obj)
> static void vfio_exitfn(PCIDevice *pdev)
> {
> VFIOPCIDevice *vdev = VFIO_PCI(pdev);
> + VFIODevice *vbasedev = &vdev->vbasedev;
>
> vfio_unregister_req_notifier(vdev);
> vfio_unregister_err_notifier(vdev);
> @@ -3280,7 +3288,8 @@ static void vfio_exitfn(PCIDevice *pdev)
> vfio_teardown_msi(vdev);
> vfio_pci_disable_rp_atomics(vdev);
> vfio_bars_exit(vdev);
> - vfio_migration_exit(&vdev->vbasedev);
> + vfio_migration_exit(vbasedev);
> + pci_device_unset_iommu_device(pdev);
> }
>
> static void vfio_pci_reset(DeviceState *dev)
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Eric
^ permalink raw reply [flat|nested] 70+ messages in thread
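The error-path change in this patch follows the usual goto-unwind rule: a failure jumps to the label that undoes only the steps that have already succeeded, and the labels fall through in reverse order of setup. A minimal sketch, assuming a hypothetical DemoDev with boolean flags in place of the real VFIO state:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flags standing in for the real VFIO setup steps. */
typedef struct DemoDev {
    bool bars_registered;
    bool idev_set;
    bool fail_set_idev;   /* force step 2 to fail */
    bool fail_caps;       /* force step 3 to fail */
} DemoDev;

static bool demo_realize(DemoDev *d)
{
    d->bars_registered = true;      /* step 1: register BARs */

    if (d->fail_set_idev) {         /* step 2: attach to the vIOMMU */
        goto out_teardown;          /* nothing attached yet */
    }
    d->idev_set = true;

    if (d->fail_caps) {             /* step 3: add capabilities */
        goto out_unset_idev;        /* attached, so detach first */
    }
    return true;

out_unset_idev:
    d->idev_set = false;            /* undo step 2 */
out_teardown:
    d->bars_registered = false;     /* undo step 1 */
    return false;
}
```

A failure in step 2 skips the detach, while a failure in step 3 or later runs it before falling through to the common teardown, which mirrors why the later error paths in the diff switch from out_teardown to out_unset_idev.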
* [PATCH v6 17/19] intel_iommu: Extract out vtd_cap_init() to initialize cap/ecap
2024-06-03 6:10 [PATCH v6 00/19] Add a host IOMMU device abstraction to check with vIOMMU Zhenzhong Duan
` (15 preceding siblings ...)
2024-06-03 6:10 ` [PATCH v6 16/19] vfio/pci: Pass HostIOMMUDevice to vIOMMU Zhenzhong Duan
@ 2024-06-03 6:10 ` Zhenzhong Duan
2024-06-03 6:10 ` [PATCH v6 18/19] intel_iommu: Implement [set|unset]_iommu_device() callbacks Zhenzhong Duan
` (2 subsequent siblings)
19 siblings, 0 replies; 70+ messages in thread
From: Zhenzhong Duan @ 2024-06-03 6:10 UTC (permalink / raw)
To: qemu-devel
Cc: alex.williamson, clg, eric.auger, mst, peterx, jasowang, jgg,
nicolinc, joao.m.martins, clement.mathieu--drif, kevin.tian,
yi.l.liu, chao.p.peng, Zhenzhong Duan, Marcel Apfelbaum,
Paolo Bonzini, Richard Henderson, Eduardo Habkost
Extract cap/ecap initialization into vtd_cap_init() to make the code
cleaner.
No functional change intended.
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
hw/i386/intel_iommu.c | 93 ++++++++++++++++++++++++-------------------
1 file changed, 51 insertions(+), 42 deletions(-)
diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index cc8e59674e..519063c8f8 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -3934,30 +3934,10 @@ static void vtd_iommu_replay(IOMMUMemoryRegion *iommu_mr, IOMMUNotifier *n)
return;
}
-/* Do the initialization. It will also be called when reset, so pay
- * attention when adding new initialization stuff.
- */
-static void vtd_init(IntelIOMMUState *s)
+static void vtd_cap_init(IntelIOMMUState *s)
{
X86IOMMUState *x86_iommu = X86_IOMMU_DEVICE(s);
- memset(s->csr, 0, DMAR_REG_SIZE);
- memset(s->wmask, 0, DMAR_REG_SIZE);
- memset(s->w1cmask, 0, DMAR_REG_SIZE);
- memset(s->womask, 0, DMAR_REG_SIZE);
-
- s->root = 0;
- s->root_scalable = false;
- s->dmar_enabled = false;
- s->intr_enabled = false;
- s->iq_head = 0;
- s->iq_tail = 0;
- s->iq = 0;
- s->iq_size = 0;
- s->qi_enabled = false;
- s->iq_last_desc_type = VTD_INV_DESC_NONE;
- s->iq_dw = false;
- s->next_frcd_reg = 0;
s->cap = VTD_CAP_FRO | VTD_CAP_NFR | VTD_CAP_ND |
VTD_CAP_MAMV | VTD_CAP_PSI | VTD_CAP_SLLPS |
VTD_CAP_MGAW(s->aw_bits);
@@ -3974,27 +3954,6 @@ static void vtd_init(IntelIOMMUState *s)
}
s->ecap = VTD_ECAP_QI | VTD_ECAP_IRO;
- /*
- * Rsvd field masks for spte
- */
- vtd_spte_rsvd[0] = ~0ULL;
- vtd_spte_rsvd[1] = VTD_SPTE_PAGE_L1_RSVD_MASK(s->aw_bits,
- x86_iommu->dt_supported);
- vtd_spte_rsvd[2] = VTD_SPTE_PAGE_L2_RSVD_MASK(s->aw_bits);
- vtd_spte_rsvd[3] = VTD_SPTE_PAGE_L3_RSVD_MASK(s->aw_bits);
- vtd_spte_rsvd[4] = VTD_SPTE_PAGE_L4_RSVD_MASK(s->aw_bits);
-
- vtd_spte_rsvd_large[2] = VTD_SPTE_LPAGE_L2_RSVD_MASK(s->aw_bits,
- x86_iommu->dt_supported);
- vtd_spte_rsvd_large[3] = VTD_SPTE_LPAGE_L3_RSVD_MASK(s->aw_bits,
- x86_iommu->dt_supported);
-
- if (s->scalable_mode || s->snoop_control) {
- vtd_spte_rsvd[1] &= ~VTD_SPTE_SNP;
- vtd_spte_rsvd_large[2] &= ~VTD_SPTE_SNP;
- vtd_spte_rsvd_large[3] &= ~VTD_SPTE_SNP;
- }
-
if (x86_iommu_ir_supported(x86_iommu)) {
s->ecap |= VTD_ECAP_IR | VTD_ECAP_MHMV;
if (s->intr_eim == ON_OFF_AUTO_ON) {
@@ -4027,6 +3986,56 @@ static void vtd_init(IntelIOMMUState *s)
if (s->pasid) {
s->ecap |= VTD_ECAP_PASID;
}
+}
+
+/*
+ * Do the initialization. It will also be called when reset, so pay
+ * attention when adding new initialization stuff.
+ */
+static void vtd_init(IntelIOMMUState *s)
+{
+ X86IOMMUState *x86_iommu = X86_IOMMU_DEVICE(s);
+
+ memset(s->csr, 0, DMAR_REG_SIZE);
+ memset(s->wmask, 0, DMAR_REG_SIZE);
+ memset(s->w1cmask, 0, DMAR_REG_SIZE);
+ memset(s->womask, 0, DMAR_REG_SIZE);
+
+ s->root = 0;
+ s->root_scalable = false;
+ s->dmar_enabled = false;
+ s->intr_enabled = false;
+ s->iq_head = 0;
+ s->iq_tail = 0;
+ s->iq = 0;
+ s->iq_size = 0;
+ s->qi_enabled = false;
+ s->iq_last_desc_type = VTD_INV_DESC_NONE;
+ s->iq_dw = false;
+ s->next_frcd_reg = 0;
+
+ vtd_cap_init(s);
+
+ /*
+ * Rsvd field masks for spte
+ */
+ vtd_spte_rsvd[0] = ~0ULL;
+ vtd_spte_rsvd[1] = VTD_SPTE_PAGE_L1_RSVD_MASK(s->aw_bits,
+ x86_iommu->dt_supported);
+ vtd_spte_rsvd[2] = VTD_SPTE_PAGE_L2_RSVD_MASK(s->aw_bits);
+ vtd_spte_rsvd[3] = VTD_SPTE_PAGE_L3_RSVD_MASK(s->aw_bits);
+ vtd_spte_rsvd[4] = VTD_SPTE_PAGE_L4_RSVD_MASK(s->aw_bits);
+
+ vtd_spte_rsvd_large[2] = VTD_SPTE_LPAGE_L2_RSVD_MASK(s->aw_bits,
+ x86_iommu->dt_supported);
+ vtd_spte_rsvd_large[3] = VTD_SPTE_LPAGE_L3_RSVD_MASK(s->aw_bits,
+ x86_iommu->dt_supported);
+
+ if (s->scalable_mode || s->snoop_control) {
+ vtd_spte_rsvd[1] &= ~VTD_SPTE_SNP;
+ vtd_spte_rsvd_large[2] &= ~VTD_SPTE_SNP;
+ vtd_spte_rsvd_large[3] &= ~VTD_SPTE_SNP;
+ }
vtd_reset_caches(s);
--
2.34.1
^ permalink raw reply related [flat|nested] 70+ messages in thread
* [PATCH v6 18/19] intel_iommu: Implement [set|unset]_iommu_device() callbacks
2024-06-03 6:10 [PATCH v6 00/19] Add a host IOMMU device abstraction to check with vIOMMU Zhenzhong Duan
` (16 preceding siblings ...)
2024-06-03 6:10 ` [PATCH v6 17/19] intel_iommu: Extract out vtd_cap_init() to initialize cap/ecap Zhenzhong Duan
@ 2024-06-03 6:10 ` Zhenzhong Duan
2024-06-03 10:12 ` CLEMENT MATHIEU--DRIF
` (2 more replies)
2024-06-03 6:10 ` [PATCH v6 19/19] intel_iommu: Check compatibility with host IOMMU capabilities Zhenzhong Duan
2024-06-03 12:43 ` [PATCH v6 00/19] Add a host IOMMU device abstraction to check with vIOMMU Eric Auger
19 siblings, 3 replies; 70+ messages in thread
From: Zhenzhong Duan @ 2024-06-03 6:10 UTC (permalink / raw)
To: qemu-devel
Cc: alex.williamson, clg, eric.auger, mst, peterx, jasowang, jgg,
nicolinc, joao.m.martins, clement.mathieu--drif, kevin.tian,
yi.l.liu, chao.p.peng, Yi Sun, Zhenzhong Duan, Paolo Bonzini,
Richard Henderson, Eduardo Habkost, Marcel Apfelbaum
From: Yi Liu <yi.l.liu@intel.com>
Implement [set|unset]_iommu_device() callbacks in Intel vIOMMU.
In the set callback, a new structure VTDHostIOMMUDevice, which holds
a reference to the HostIOMMUDevice, is stored in a hash table
indexed by PCI BDF.
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
hw/i386/intel_iommu_internal.h | 9 ++++
include/hw/i386/intel_iommu.h | 2 +
hw/i386/intel_iommu.c | 76 ++++++++++++++++++++++++++++++++++
3 files changed, 87 insertions(+)
diff --git a/hw/i386/intel_iommu_internal.h b/hw/i386/intel_iommu_internal.h
index f8cf99bddf..b800d62ca0 100644
--- a/hw/i386/intel_iommu_internal.h
+++ b/hw/i386/intel_iommu_internal.h
@@ -28,6 +28,7 @@
#ifndef HW_I386_INTEL_IOMMU_INTERNAL_H
#define HW_I386_INTEL_IOMMU_INTERNAL_H
#include "hw/i386/intel_iommu.h"
+#include "sysemu/host_iommu_device.h"
/*
* Intel IOMMU register specification
@@ -537,4 +538,12 @@ typedef struct VTDRootEntry VTDRootEntry;
#define VTD_SL_IGN_COM 0xbff0000000000000ULL
#define VTD_SL_TM (1ULL << 62)
+
+typedef struct VTDHostIOMMUDevice {
+ IntelIOMMUState *iommu_state;
+ PCIBus *bus;
+ uint8_t devfn;
+ HostIOMMUDevice *dev;
+ QLIST_ENTRY(VTDHostIOMMUDevice) next;
+} VTDHostIOMMUDevice;
#endif
diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
index 7d694b0813..2bbde41e45 100644
--- a/include/hw/i386/intel_iommu.h
+++ b/include/hw/i386/intel_iommu.h
@@ -293,6 +293,8 @@ struct IntelIOMMUState {
/* list of registered notifiers */
QLIST_HEAD(, VTDAddressSpace) vtd_as_with_notifiers;
+ GHashTable *vtd_host_iommu_dev; /* VTDHostIOMMUDevice */
+
/* interrupt remapping */
bool intr_enabled; /* Whether guest enabled IR */
dma_addr_t intr_root; /* Interrupt remapping table pointer */
diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 519063c8f8..747c988bc4 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -237,6 +237,13 @@ static gboolean vtd_as_equal(gconstpointer v1, gconstpointer v2)
(key1->pasid == key2->pasid);
}
+static gboolean vtd_as_idev_equal(gconstpointer v1, gconstpointer v2)
+{
+ const struct vtd_as_key *key1 = v1;
+ const struct vtd_as_key *key2 = v2;
+
+ return (key1->bus == key2->bus) && (key1->devfn == key2->devfn);
+}
/*
* Note that we use pointer to PCIBus as the key, so hashing/shifting
* based on the pointer value is intended. Note that we deal with
@@ -3812,6 +3819,70 @@ VTDAddressSpace *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus,
return vtd_dev_as;
}
+static bool vtd_dev_set_iommu_device(PCIBus *bus, void *opaque, int devfn,
+ HostIOMMUDevice *hiod, Error **errp)
+{
+ IntelIOMMUState *s = opaque;
+ VTDHostIOMMUDevice *vtd_hdev;
+ struct vtd_as_key key = {
+ .bus = bus,
+ .devfn = devfn,
+ };
+ struct vtd_as_key *new_key;
+
+ assert(hiod);
+
+ vtd_iommu_lock(s);
+
+ vtd_hdev = g_hash_table_lookup(s->vtd_host_iommu_dev, &key);
+
+ if (vtd_hdev) {
+ error_setg(errp, "IOMMUFD device already exist");
+ vtd_iommu_unlock(s);
+ return false;
+ }
+
+ vtd_hdev = g_malloc0(sizeof(VTDHostIOMMUDevice));
+ vtd_hdev->bus = bus;
+ vtd_hdev->devfn = (uint8_t)devfn;
+ vtd_hdev->iommu_state = s;
+ vtd_hdev->dev = hiod;
+
+ new_key = g_malloc(sizeof(*new_key));
+ new_key->bus = bus;
+ new_key->devfn = devfn;
+
+ object_ref(hiod);
+ g_hash_table_insert(s->vtd_host_iommu_dev, new_key, vtd_hdev);
+
+ vtd_iommu_unlock(s);
+
+ return true;
+}
+
+static void vtd_dev_unset_iommu_device(PCIBus *bus, void *opaque, int devfn)
+{
+ IntelIOMMUState *s = opaque;
+ VTDHostIOMMUDevice *vtd_hdev;
+ struct vtd_as_key key = {
+ .bus = bus,
+ .devfn = devfn,
+ };
+
+ vtd_iommu_lock(s);
+
+ vtd_hdev = g_hash_table_lookup(s->vtd_host_iommu_dev, &key);
+ if (!vtd_hdev) {
+ vtd_iommu_unlock(s);
+ return;
+ }
+
+ g_hash_table_remove(s->vtd_host_iommu_dev, &key);
+ object_unref(vtd_hdev->dev);
+
+ vtd_iommu_unlock(s);
+}
+
/* Unmap the whole range in the notifier's scope. */
static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n)
{
@@ -4116,6 +4187,8 @@ static AddressSpace *vtd_host_dma_iommu(PCIBus *bus, void *opaque, int devfn)
static PCIIOMMUOps vtd_iommu_ops = {
.get_address_space = vtd_host_dma_iommu,
+ .set_iommu_device = vtd_dev_set_iommu_device,
+ .unset_iommu_device = vtd_dev_unset_iommu_device,
};
static bool vtd_decide_config(IntelIOMMUState *s, Error **errp)
@@ -4235,6 +4308,9 @@ static void vtd_realize(DeviceState *dev, Error **errp)
g_free, g_free);
s->vtd_address_spaces = g_hash_table_new_full(vtd_as_hash, vtd_as_equal,
g_free, g_free);
+ s->vtd_host_iommu_dev = g_hash_table_new_full(vtd_as_hash,
+ vtd_as_idev_equal,
+ g_free, g_free);
vtd_init(s);
pci_setup_iommu(bus, &vtd_iommu_ops, dev);
/* Pseudo address space under root PCI bus. */
--
2.34.1
^ permalink raw reply related [flat|nested] 70+ messages in thread
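The patch pairs a new equality function that compares only bus and devfn with the existing hash function. For any hash table, keys that compare equal must also hash to the same value, so the hash may depend only on the fields the equality looks at. A minimal sketch of that constraint follows; DemoKey and both functions are hypothetical, and the pasid member is an assumption modelled on the pasid comparison visible in the vtd_as_equal() context of the diff:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical key; the real struct vtd_as_key is not reproduced here. */
typedef struct DemoKey {
    const void *bus;
    uint8_t devfn;
    uint32_t pasid;   /* deliberately ignored by both functions below */
} DemoKey;

/* Equality on (bus, devfn) only. */
static bool demo_key_equal(const DemoKey *a, const DemoKey *b)
{
    return a->bus == b->bus && a->devfn == b->devfn;
}

/* The hash uses the same two fields, so equal keys hash equally. */
static unsigned demo_key_hash(const DemoKey *k)
{
    unsigned v = (unsigned)(uintptr_t)k->bus;

    return (v << 8) | k->devfn;
}
```

Under these assumptions, two keys that differ only in pasid land in the same bucket and compare equal, which matters when a stored key leaves that field unset.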
* Re: [PATCH v6 18/19] intel_iommu: Implement [set|unset]_iommu_device() callbacks
2024-06-03 6:10 ` [PATCH v6 18/19] intel_iommu: Implement [set|unset]_iommu_device() callbacks Zhenzhong Duan
@ 2024-06-03 10:12 ` CLEMENT MATHIEU--DRIF
2024-06-03 11:02 ` Duan, Zhenzhong
2024-06-03 14:13 ` Eric Auger
2024-06-03 14:47 ` Eric Auger
2 siblings, 1 reply; 70+ messages in thread
From: CLEMENT MATHIEU--DRIF @ 2024-06-03 10:12 UTC (permalink / raw)
To: Zhenzhong Duan, qemu-devel@nongnu.org
Cc: alex.williamson@redhat.com, clg@redhat.com, eric.auger@redhat.com,
mst@redhat.com, peterx@redhat.com, jasowang@redhat.com,
jgg@nvidia.com, nicolinc@nvidia.com, joao.m.martins@oracle.com,
kevin.tian@intel.com, yi.l.liu@intel.com, chao.p.peng@intel.com,
Yi Sun, Paolo Bonzini, Richard Henderson, Eduardo Habkost,
Marcel Apfelbaum
On 03/06/2024 08:10, Zhenzhong Duan wrote:
> From: Yi Liu <yi.l.liu@intel.com>
>
> Implement [set|unset]_iommu_device() callbacks in Intel vIOMMU.
> In set call, a new structure VTDHostIOMMUDevice which holds
> a reference to HostIOMMUDevice is stored in hash table
> indexed by PCI BDF.
>
> Signed-off-by: Yi Liu <yi.l.liu@intel.com>
> Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> ---
> hw/i386/intel_iommu_internal.h | 9 ++++
> include/hw/i386/intel_iommu.h | 2 +
> hw/i386/intel_iommu.c | 76 ++++++++++++++++++++++++++++++++++
> 3 files changed, 87 insertions(+)
>
> diff --git a/hw/i386/intel_iommu_internal.h b/hw/i386/intel_iommu_internal.h
> index f8cf99bddf..b800d62ca0 100644
> --- a/hw/i386/intel_iommu_internal.h
> +++ b/hw/i386/intel_iommu_internal.h
> @@ -28,6 +28,7 @@
> #ifndef HW_I386_INTEL_IOMMU_INTERNAL_H
> #define HW_I386_INTEL_IOMMU_INTERNAL_H
> #include "hw/i386/intel_iommu.h"
> +#include "sysemu/host_iommu_device.h"
>
> /*
> * Intel IOMMU register specification
> @@ -537,4 +538,12 @@ typedef struct VTDRootEntry VTDRootEntry;
> #define VTD_SL_IGN_COM 0xbff0000000000000ULL
> #define VTD_SL_TM (1ULL << 62)
>
> +
> +typedef struct VTDHostIOMMUDevice {
> + IntelIOMMUState *iommu_state;
> + PCIBus *bus;
> + uint8_t devfn;
> + HostIOMMUDevice *dev;
> + QLIST_ENTRY(VTDHostIOMMUDevice) next;
> +} VTDHostIOMMUDevice;
> #endif
> diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
> index 7d694b0813..2bbde41e45 100644
> --- a/include/hw/i386/intel_iommu.h
> +++ b/include/hw/i386/intel_iommu.h
> @@ -293,6 +293,8 @@ struct IntelIOMMUState {
> /* list of registered notifiers */
> QLIST_HEAD(, VTDAddressSpace) vtd_as_with_notifiers;
>
> + GHashTable *vtd_host_iommu_dev; /* VTDHostIOMMUDevice */
> +
> /* interrupt remapping */
> bool intr_enabled; /* Whether guest enabled IR */
> dma_addr_t intr_root; /* Interrupt remapping table pointer */
> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> index 519063c8f8..747c988bc4 100644
> --- a/hw/i386/intel_iommu.c
> +++ b/hw/i386/intel_iommu.c
> @@ -237,6 +237,13 @@ static gboolean vtd_as_equal(gconstpointer v1, gconstpointer v2)
> (key1->pasid == key2->pasid);
> }
>
> +static gboolean vtd_as_idev_equal(gconstpointer v1, gconstpointer v2)
> +{
> + const struct vtd_as_key *key1 = v1;
> + const struct vtd_as_key *key2 = v2;
> +
> + return (key1->bus == key2->bus) && (key1->devfn == key2->devfn);
> +}
> /*
> * Note that we use pointer to PCIBus as the key, so hashing/shifting
> * based on the pointer value is intended. Note that we deal with
> @@ -3812,6 +3819,70 @@ VTDAddressSpace *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus,
> return vtd_dev_as;
> }
>
> +static bool vtd_dev_set_iommu_device(PCIBus *bus, void *opaque, int devfn,
> + HostIOMMUDevice *hiod, Error **errp)
> +{
> + IntelIOMMUState *s = opaque;
> + VTDHostIOMMUDevice *vtd_hdev;
> + struct vtd_as_key key = {
> + .bus = bus,
> + .devfn = devfn,
> + };
> + struct vtd_as_key *new_key;
> +
> + assert(hiod);
> +
> + vtd_iommu_lock(s);
> +
> + vtd_hdev = g_hash_table_lookup(s->vtd_host_iommu_dev, &key);
> +
> + if (vtd_hdev) {
> + error_setg(errp, "IOMMUFD device already exist");
> + vtd_iommu_unlock(s);
> + return false;
> + }
> +
> + vtd_hdev = g_malloc0(sizeof(VTDHostIOMMUDevice));
> + vtd_hdev->bus = bus;
> + vtd_hdev->devfn = (uint8_t)devfn;
> + vtd_hdev->iommu_state = s;
> + vtd_hdev->dev = hiod;
> +
> + new_key = g_malloc(sizeof(*new_key));
> + new_key->bus = bus;
> + new_key->devfn = devfn;
> +
> + object_ref(hiod);
> + g_hash_table_insert(s->vtd_host_iommu_dev, new_key, vtd_hdev);
> +
> + vtd_iommu_unlock(s);
> +
> + return true;
> +}
> +
> +static void vtd_dev_unset_iommu_device(PCIBus *bus, void *opaque, int devfn)
> +{
> + IntelIOMMUState *s = opaque;
> + VTDHostIOMMUDevice *vtd_hdev;
> + struct vtd_as_key key = {
> + .bus = bus,
> + .devfn = devfn,
> + };
> +
> + vtd_iommu_lock(s);
> +
> + vtd_hdev = g_hash_table_lookup(s->vtd_host_iommu_dev, &key);
> + if (!vtd_hdev) {
> + vtd_iommu_unlock(s);
> + return;
> + }
> +
> + g_hash_table_remove(s->vtd_host_iommu_dev, &key);
> + object_unref(vtd_hdev->dev);
Not sure, but isn't that a potential use-after-free?
> +
> + vtd_iommu_unlock(s);
> +}
> +
> /* Unmap the whole range in the notifier's scope. */
> static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n)
> {
> @@ -4116,6 +4187,8 @@ static AddressSpace *vtd_host_dma_iommu(PCIBus *bus, void *opaque, int devfn)
>
> static PCIIOMMUOps vtd_iommu_ops = {
> .get_address_space = vtd_host_dma_iommu,
> + .set_iommu_device = vtd_dev_set_iommu_device,
> + .unset_iommu_device = vtd_dev_unset_iommu_device,
> };
>
> static bool vtd_decide_config(IntelIOMMUState *s, Error **errp)
> @@ -4235,6 +4308,9 @@ static void vtd_realize(DeviceState *dev, Error **errp)
> g_free, g_free);
> s->vtd_address_spaces = g_hash_table_new_full(vtd_as_hash, vtd_as_equal,
> g_free, g_free);
> + s->vtd_host_iommu_dev = g_hash_table_new_full(vtd_as_hash,
> + vtd_as_idev_equal,
> + g_free, g_free);
> vtd_init(s);
> pci_setup_iommu(bus, &vtd_iommu_ops, dev);
> /* Pseudo address space under root PCI bus. */
> --
> 2.34.1
>
^ permalink raw reply [flat|nested] 70+ messages in thread
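On the use-after-free question raised in this review: the table in the patch is created with g_free as its value destroy function, so removing the entry frees vtd_hdev before vtd_hdev->dev is read. One possible ordering that avoids this is to copy the pointer out before the removal. The sketch below shows only that ordering, with a hypothetical one-slot table and refcount in place of GLib and QOM; it is not necessarily the fix adopted in later revisions of the series.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical refcounted object standing in for HostIOMMUDevice. */
typedef struct DemoObj {
    int refs;
} DemoObj;

/* Hypothetical entry standing in for VTDHostIOMMUDevice. */
typedef struct DemoEntry {
    DemoObj *dev;
} DemoEntry;

/* One-slot "table" that owns its value and frees it on removal, the
 * way a hash table with a value destroy function would. */
typedef struct DemoTable {
    DemoEntry *slot;
} DemoTable;

static void demo_table_remove(DemoTable *t)
{
    free(t->slot);
    t->slot = NULL;
}

static void demo_unset(DemoTable *t)
{
    DemoEntry *e = t->slot;
    DemoObj *dev;

    if (!e) {
        return;
    }
    dev = e->dev;            /* copy while the entry is still alive */
    demo_table_remove(t);    /* frees e; e must not be touched again */
    dev->refs--;             /* drops the reference taken at set time */
}
```

An alternative with the same effect would be to drop the reference first and remove the entry afterwards; either way the entry is not dereferenced once the table has freed it.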
* RE: [PATCH v6 18/19] intel_iommu: Implement [set|unset]_iommu_device() callbacks
2024-06-03 10:12 ` CLEMENT MATHIEU--DRIF
@ 2024-06-03 11:02 ` Duan, Zhenzhong
2024-06-03 12:56 ` Cédric Le Goater
0 siblings, 1 reply; 70+ messages in thread
From: Duan, Zhenzhong @ 2024-06-03 11:02 UTC (permalink / raw)
To: CLEMENT MATHIEU--DRIF, qemu-devel@nongnu.org
Cc: alex.williamson@redhat.com, clg@redhat.com, eric.auger@redhat.com,
mst@redhat.com, peterx@redhat.com, jasowang@redhat.com,
jgg@nvidia.com, nicolinc@nvidia.com, joao.m.martins@oracle.com,
Tian, Kevin, Liu, Yi L, Peng, Chao P, Yi Sun, Paolo Bonzini,
Richard Henderson, Eduardo Habkost, Marcel Apfelbaum
>-----Original Message-----
>From: CLEMENT MATHIEU--DRIF <clement.mathieu--drif@eviden.com>
>Subject: Re: [PATCH v6 18/19] intel_iommu: Implement [set|unset]_iommu_device() callbacks
>
>
>On 03/06/2024 08:10, Zhenzhong Duan wrote:
>> From: Yi Liu <yi.l.liu@intel.com>
>>
>> Implement [set|unset]_iommu_device() callbacks in Intel vIOMMU.
>> In set call, a new structure VTDHostIOMMUDevice which holds
>> a reference to HostIOMMUDevice is stored in hash table
>> indexed by PCI BDF.
>>
>> Signed-off-by: Yi Liu <yi.l.liu@intel.com>
>> Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>> ---
>> hw/i386/intel_iommu_internal.h | 9 ++++
>> include/hw/i386/intel_iommu.h | 2 +
>> hw/i386/intel_iommu.c | 76 ++++++++++++++++++++++++++++++++++
>> 3 files changed, 87 insertions(+)
>>
>> diff --git a/hw/i386/intel_iommu_internal.h b/hw/i386/intel_iommu_internal.h
>> index f8cf99bddf..b800d62ca0 100644
>> --- a/hw/i386/intel_iommu_internal.h
>> +++ b/hw/i386/intel_iommu_internal.h
>> @@ -28,6 +28,7 @@
>> #ifndef HW_I386_INTEL_IOMMU_INTERNAL_H
>> #define HW_I386_INTEL_IOMMU_INTERNAL_H
>> #include "hw/i386/intel_iommu.h"
>> +#include "sysemu/host_iommu_device.h"
>>
>> /*
>> * Intel IOMMU register specification
>> @@ -537,4 +538,12 @@ typedef struct VTDRootEntry VTDRootEntry;
>> #define VTD_SL_IGN_COM 0xbff0000000000000ULL
>> #define VTD_SL_TM (1ULL << 62)
>>
>> +
>> +typedef struct VTDHostIOMMUDevice {
>> + IntelIOMMUState *iommu_state;
>> + PCIBus *bus;
>> + uint8_t devfn;
>> + HostIOMMUDevice *dev;
>> + QLIST_ENTRY(VTDHostIOMMUDevice) next;
>> +} VTDHostIOMMUDevice;
>> #endif
>> diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
>> index 7d694b0813..2bbde41e45 100644
>> --- a/include/hw/i386/intel_iommu.h
>> +++ b/include/hw/i386/intel_iommu.h
>> @@ -293,6 +293,8 @@ struct IntelIOMMUState {
>> /* list of registered notifiers */
>> QLIST_HEAD(, VTDAddressSpace) vtd_as_with_notifiers;
>>
>> + GHashTable *vtd_host_iommu_dev; /* VTDHostIOMMUDevice */
>> +
>> /* interrupt remapping */
>> bool intr_enabled; /* Whether guest enabled IR */
>> dma_addr_t intr_root; /* Interrupt remapping table pointer */
>> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
>> index 519063c8f8..747c988bc4 100644
>> --- a/hw/i386/intel_iommu.c
>> +++ b/hw/i386/intel_iommu.c
>> @@ -237,6 +237,13 @@ static gboolean vtd_as_equal(gconstpointer v1,
>gconstpointer v2)
>> (key1->pasid == key2->pasid);
>> }
>>
>> +static gboolean vtd_as_idev_equal(gconstpointer v1, gconstpointer v2)
>> +{
>> + const struct vtd_as_key *key1 = v1;
>> + const struct vtd_as_key *key2 = v2;
>> +
>> + return (key1->bus == key2->bus) && (key1->devfn == key2->devfn);
>> +}
>> /*
>> * Note that we use pointer to PCIBus as the key, so hashing/shifting
>> * based on the pointer value is intended. Note that we deal with
>> @@ -3812,6 +3819,70 @@ VTDAddressSpace
>*vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus,
>> return vtd_dev_as;
>> }
>>
>> +static bool vtd_dev_set_iommu_device(PCIBus *bus, void *opaque, int
>devfn,
>> + HostIOMMUDevice *hiod, Error **errp)
>> +{
>> + IntelIOMMUState *s = opaque;
>> + VTDHostIOMMUDevice *vtd_hdev;
>> + struct vtd_as_key key = {
>> + .bus = bus,
>> + .devfn = devfn,
>> + };
>> + struct vtd_as_key *new_key;
>> +
>> + assert(hiod);
>> +
>> + vtd_iommu_lock(s);
>> +
>> + vtd_hdev = g_hash_table_lookup(s->vtd_host_iommu_dev, &key);
>> +
>> + if (vtd_hdev) {
>> + error_setg(errp, "IOMMUFD device already exist");
>> + vtd_iommu_unlock(s);
>> + return false;
>> + }
>> +
>> + vtd_hdev = g_malloc0(sizeof(VTDHostIOMMUDevice));
>> + vtd_hdev->bus = bus;
>> + vtd_hdev->devfn = (uint8_t)devfn;
>> + vtd_hdev->iommu_state = s;
>> + vtd_hdev->dev = hiod;
>> +
>> + new_key = g_malloc(sizeof(*new_key));
>> + new_key->bus = bus;
>> + new_key->devfn = devfn;
>> +
>> + object_ref(hiod);
>> + g_hash_table_insert(s->vtd_host_iommu_dev, new_key, vtd_hdev);
>> +
>> + vtd_iommu_unlock(s);
>> +
>> + return true;
>> +}
>> +
>> +static void vtd_dev_unset_iommu_device(PCIBus *bus, void *opaque, int
>devfn)
>> +{
>> + IntelIOMMUState *s = opaque;
>> + VTDHostIOMMUDevice *vtd_hdev;
>> + struct vtd_as_key key = {
>> + .bus = bus,
>> + .devfn = devfn,
>> + };
>> +
>> + vtd_iommu_lock(s);
>> +
>> + vtd_hdev = g_hash_table_lookup(s->vtd_host_iommu_dev, &key);
>> + if (!vtd_hdev) {
>> + vtd_iommu_unlock(s);
>> + return;
>> + }
>> +
>> + g_hash_table_remove(s->vtd_host_iommu_dev, &key);
>> + object_unref(vtd_hdev->dev);
>Not sure but isn't that a potential use after free?
Good catch! Will fix. Should be:
object_unref(vtd_hdev->dev);
g_hash_table_remove(s->vtd_host_iommu_dev, &key);
Thanks
Zhenzhong
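The ordering issue can be illustrated with a minimal sketch in plain C. It uses neither GLib nor QOM: `Obj`, `Entry`, `Table`, `obj_ref()`, `obj_unref()`, `table_set()`, `table_remove()` and `table_unset()` are hypothetical stand-ins for HostIOMMUDevice, VTDHostIOMMUDevice, the hash table, and the calls made in the patch. The only assumption carried over from the patch is that removing an entry frees the value, as the `g_free` value destructor passed to `g_hash_table_new_full()` would.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a refcounted object (HostIOMMUDevice). */
typedef struct Obj {
    int refcount;
} Obj;

/* Hypothetical stand-in for VTDHostIOMMUDevice: holds one reference. */
typedef struct Entry {
    Obj *dev;
} Entry;

/* One-slot "table" standing in for a hash table that frees its values. */
typedef struct Table {
    Entry *slot;
} Table;

static void obj_ref(Obj *o)   { o->refcount++; }
static void obj_unref(Obj *o) { o->refcount--; }

/* Removing the entry frees the value, as a g_free-like destructor would. */
static void table_remove(Table *t)
{
    free(t->slot);
    t->slot = NULL;
}

/* Mirrors the "set" path: take a reference, then store the entry. */
static void table_set(Table *t, Obj *o)
{
    Entry *e = calloc(1, sizeof(*e));
    assert(e != NULL);
    e->dev = o;
    obj_ref(o);
    t->slot = e;
}

/* Mirrors the fixed "unset" path: unref first, remove second. */
static void table_unset(Table *t)
{
    Entry *e = t->slot;

    if (!e) {
        return;
    }
    obj_unref(e->dev);  /* e is still alive here */
    table_remove(t);    /* e is freed; it must not be touched afterwards */
}
```

Swapping the two calls in `table_unset()` would read `e->dev` after `e` has been freed, which is the use after free pointed out above.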
>> +
>> + vtd_iommu_unlock(s);
>> +}
>> +
>> /* Unmap the whole range in the notifier's scope. */
>> static void vtd_address_space_unmap(VTDAddressSpace *as,
>IOMMUNotifier *n)
>> {
>> @@ -4116,6 +4187,8 @@ static AddressSpace
>*vtd_host_dma_iommu(PCIBus *bus, void *opaque, int devfn)
>>
>> static PCIIOMMUOps vtd_iommu_ops = {
>> .get_address_space = vtd_host_dma_iommu,
>> + .set_iommu_device = vtd_dev_set_iommu_device,
>> + .unset_iommu_device = vtd_dev_unset_iommu_device,
>> };
>>
>> static bool vtd_decide_config(IntelIOMMUState *s, Error **errp)
>> @@ -4235,6 +4308,9 @@ static void vtd_realize(DeviceState *dev, Error
>**errp)
>> g_free, g_free);
>> s->vtd_address_spaces = g_hash_table_new_full(vtd_as_hash,
>vtd_as_equal,
>> g_free, g_free);
>> + s->vtd_host_iommu_dev = g_hash_table_new_full(vtd_as_hash,
>> + vtd_as_idev_equal,
>> + g_free, g_free);
>> vtd_init(s);
>> pci_setup_iommu(bus, &vtd_iommu_ops, dev);
>> /* Pseudo address space under root PCI bus. */
>> --
>> 2.34.1
>>
* Re: [PATCH v6 18/19] intel_iommu: Implement [set|unset]_iommu_device() callbacks
2024-06-03 11:02 ` Duan, Zhenzhong
@ 2024-06-03 12:56 ` Cédric Le Goater
2024-06-04 3:46 ` Duan, Zhenzhong
0 siblings, 1 reply; 70+ messages in thread
From: Cédric Le Goater @ 2024-06-03 12:56 UTC (permalink / raw)
To: Duan, Zhenzhong, CLEMENT MATHIEU--DRIF, qemu-devel@nongnu.org
Cc: alex.williamson@redhat.com, eric.auger@redhat.com, mst@redhat.com,
peterx@redhat.com, jasowang@redhat.com, jgg@nvidia.com,
nicolinc@nvidia.com, joao.m.martins@oracle.com, Tian, Kevin,
Liu, Yi L, Peng, Chao P, Yi Sun, Paolo Bonzini, Richard Henderson,
Eduardo Habkost, Marcel Apfelbaum
On 6/3/24 13:02, Duan, Zhenzhong wrote:
>
>
>> -----Original Message-----
>> From: CLEMENT MATHIEU--DRIF <clement.mathieu--drif@eviden.com>
>> Subject: Re: [PATCH v6 18/19] intel_iommu: Implement
>> [set|unset]_iommu_device() callbacks
>>
>>
>> On 03/06/2024 08:10, Zhenzhong Duan wrote:
>>>
>>>
>>> From: Yi Liu <yi.l.liu@intel.com>
>>>
>>> Implement [set|unset]_iommu_device() callbacks in Intel vIOMMU.
>>> In set call, a new structure VTDHostIOMMUDevice which holds
>>> a reference to HostIOMMUDevice is stored in hash table
>>> indexed by PCI BDF.
>>>
>>> Signed-off-by: Yi Liu <yi.l.liu@intel.com>
>>> Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
>>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>>> ---
>>> hw/i386/intel_iommu_internal.h | 9 ++++
>>> include/hw/i386/intel_iommu.h | 2 +
>>> hw/i386/intel_iommu.c | 76
>> ++++++++++++++++++++++++++++++++++
>>> 3 files changed, 87 insertions(+)
>>>
>>> diff --git a/hw/i386/intel_iommu_internal.h
>> b/hw/i386/intel_iommu_internal.h
>>> index f8cf99bddf..b800d62ca0 100644
>>> --- a/hw/i386/intel_iommu_internal.h
>>> +++ b/hw/i386/intel_iommu_internal.h
>>> @@ -28,6 +28,7 @@
>>> #ifndef HW_I386_INTEL_IOMMU_INTERNAL_H
>>> #define HW_I386_INTEL_IOMMU_INTERNAL_H
>>> #include "hw/i386/intel_iommu.h"
>>> +#include "sysemu/host_iommu_device.h"
>>>
>>> /*
>>> * Intel IOMMU register specification
>>> @@ -537,4 +538,12 @@ typedef struct VTDRootEntry VTDRootEntry;
>>> #define VTD_SL_IGN_COM 0xbff0000000000000ULL
>>> #define VTD_SL_TM (1ULL << 62)
>>>
>>> +
>>> +typedef struct VTDHostIOMMUDevice {
>>> + IntelIOMMUState *iommu_state;
>>> + PCIBus *bus;
>>> + uint8_t devfn;
>>> + HostIOMMUDevice *dev;
>>> + QLIST_ENTRY(VTDHostIOMMUDevice) next;
>>> +} VTDHostIOMMUDevice;
>>> #endif
>>> diff --git a/include/hw/i386/intel_iommu.h
>> b/include/hw/i386/intel_iommu.h
>>> index 7d694b0813..2bbde41e45 100644
>>> --- a/include/hw/i386/intel_iommu.h
>>> +++ b/include/hw/i386/intel_iommu.h
>>> @@ -293,6 +293,8 @@ struct IntelIOMMUState {
>>> /* list of registered notifiers */
>>> QLIST_HEAD(, VTDAddressSpace) vtd_as_with_notifiers;
>>>
>>> + GHashTable *vtd_host_iommu_dev; /* VTDHostIOMMUDevice
>> */
>>> +
>>> /* interrupt remapping */
>>> bool intr_enabled; /* Whether guest enabled IR */
>>> dma_addr_t intr_root; /* Interrupt remapping table pointer */
>>> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
>>> index 519063c8f8..747c988bc4 100644
>>> --- a/hw/i386/intel_iommu.c
>>> +++ b/hw/i386/intel_iommu.c
>>> @@ -237,6 +237,13 @@ static gboolean vtd_as_equal(gconstpointer v1,
>> gconstpointer v2)
>>> (key1->pasid == key2->pasid);
>>> }
>>>
>>> +static gboolean vtd_as_idev_equal(gconstpointer v1, gconstpointer v2)
>>> +{
>>> + const struct vtd_as_key *key1 = v1;
>>> + const struct vtd_as_key *key2 = v2;
>>> +
>>> + return (key1->bus == key2->bus) && (key1->devfn == key2->devfn);
>>> +}
>>> /*
>>> * Note that we use pointer to PCIBus as the key, so hashing/shifting
>>> * based on the pointer value is intended. Note that we deal with
>>> @@ -3812,6 +3819,70 @@ VTDAddressSpace
>> *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus,
>>> return vtd_dev_as;
>>> }
>>>
>>> +static bool vtd_dev_set_iommu_device(PCIBus *bus, void *opaque, int
>> devfn,
>>> + HostIOMMUDevice *hiod, Error **errp)
>>> +{
>>> + IntelIOMMUState *s = opaque;
>>> + VTDHostIOMMUDevice *vtd_hdev;
>>> + struct vtd_as_key key = {
>>> + .bus = bus,
>>> + .devfn = devfn,
>>> + };
>>> + struct vtd_as_key *new_key;
>>> +
>>> + assert(hiod);
>>> +
>>> + vtd_iommu_lock(s);
>>> +
>>> + vtd_hdev = g_hash_table_lookup(s->vtd_host_iommu_dev, &key);
>>> +
>>> + if (vtd_hdev) {
>>> + error_setg(errp, "IOMMUFD device already exist");
>>> + vtd_iommu_unlock(s);
>>> + return false;
>>> + }
>>> +
>>> + vtd_hdev = g_malloc0(sizeof(VTDHostIOMMUDevice));
>>> + vtd_hdev->bus = bus;
>>> + vtd_hdev->devfn = (uint8_t)devfn;
>>> + vtd_hdev->iommu_state = s;
>>> + vtd_hdev->dev = hiod;
>>> +
>>> + new_key = g_malloc(sizeof(*new_key));
>>> + new_key->bus = bus;
>>> + new_key->devfn = devfn;
>>> +
>>> + object_ref(hiod);
>>> + g_hash_table_insert(s->vtd_host_iommu_dev, new_key, vtd_hdev);
>>> +
>>> + vtd_iommu_unlock(s);
>>> +
>>> + return true;
>>> +}
>>> +
>>> +static void vtd_dev_unset_iommu_device(PCIBus *bus, void *opaque, int
>> devfn)
>>> +{
>>> + IntelIOMMUState *s = opaque;
>>> + VTDHostIOMMUDevice *vtd_hdev;
>>> + struct vtd_as_key key = {
>>> + .bus = bus,
>>> + .devfn = devfn,
>>> + };
>>> +
>>> + vtd_iommu_lock(s);
>>> +
>>> + vtd_hdev = g_hash_table_lookup(s->vtd_host_iommu_dev, &key);
>>> + if (!vtd_hdev) {
>>> + vtd_iommu_unlock(s);
>>> + return;
>>> + }
>>> +
>>> + g_hash_table_remove(s->vtd_host_iommu_dev, &key);
>>> + object_unref(vtd_hdev->dev);
>> Not sure but isn't that a potential use after free?
>
> Good catch! Will fix. Should be:
>
> object_unref(vtd_hdev->dev);
> g_hash_table_remove(s->vtd_host_iommu_dev, &key);
You could also implement a custom destroy function for the hash table.
Thanks,
C.
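A rough sketch of that alternative, again in plain C rather than GLib: the table is given a value destructor that drops the reference and then frees the entry, so the caller only removes the entry. All names here (`Obj`, `Entry`, `Table`, `DestroyFn`, `entry_new()`, `entry_destroy()`, `table_insert()`, `table_remove()`) are hypothetical; in the real code the destructor would be the last argument of `g_hash_table_new_full()` and would call `object_unref()` before `g_free()`.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a refcounted object (HostIOMMUDevice). */
typedef struct Obj {
    int refcount;
} Obj;

/* Hypothetical stand-in for VTDHostIOMMUDevice. */
typedef struct Entry {
    Obj *dev;
} Entry;

typedef void (*DestroyFn)(void *value);

/* One-slot "table" that owns its value through a destructor callback. */
typedef struct Table {
    void *slot;
    DestroyFn value_destroy;
} Table;

/* Allocate an entry and take a reference on the object it points to. */
static Entry *entry_new(Obj *o)
{
    Entry *e = calloc(1, sizeof(*e));
    assert(e != NULL);
    o->refcount++;
    e->dev = o;
    return e;
}

/* Custom destructor: release the reference, then free the entry itself. */
static void entry_destroy(void *value)
{
    Entry *e = value;

    e->dev->refcount--;  /* stand-in for object_unref(e->dev) */
    free(e);
}

/* Store a value, destroying any value already held in the slot. */
static void table_insert(Table *t, void *value)
{
    if (t->slot) {
        t->value_destroy(t->slot);
    }
    t->slot = value;
}

/* Removal runs the destructor, so callers never touch a freed entry. */
static void table_remove(Table *t)
{
    if (t->slot) {
        t->value_destroy(t->slot);
        t->slot = NULL;
    }
}
```

The trade-off is that the release no longer sits in the unset callback, so it is less visibly paired with the reference taken in the set callback.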
>
> Thanks
> Zhenzhong
>
>>> +
>>> + vtd_iommu_unlock(s);
>>> +}
>>> +
>>> /* Unmap the whole range in the notifier's scope. */
>>> static void vtd_address_space_unmap(VTDAddressSpace *as,
>> IOMMUNotifier *n)
>>> {
>>> @@ -4116,6 +4187,8 @@ static AddressSpace
>> *vtd_host_dma_iommu(PCIBus *bus, void *opaque, int devfn)
>>>
>>> static PCIIOMMUOps vtd_iommu_ops = {
>>> .get_address_space = vtd_host_dma_iommu,
>>> + .set_iommu_device = vtd_dev_set_iommu_device,
>>> + .unset_iommu_device = vtd_dev_unset_iommu_device,
>>> };
>>>
>>> static bool vtd_decide_config(IntelIOMMUState *s, Error **errp)
>>> @@ -4235,6 +4308,9 @@ static void vtd_realize(DeviceState *dev, Error
>> **errp)
>>> g_free, g_free);
>>> s->vtd_address_spaces = g_hash_table_new_full(vtd_as_hash,
>> vtd_as_equal,
>>> g_free, g_free);
>>> + s->vtd_host_iommu_dev = g_hash_table_new_full(vtd_as_hash,
>>> + vtd_as_idev_equal,
>>> + g_free, g_free);
>>> vtd_init(s);
>>> pci_setup_iommu(bus, &vtd_iommu_ops, dev);
>>> /* Pseudo address space under root PCI bus. */
>>> --
>>> 2.34.1
>>>
* RE: [PATCH v6 18/19] intel_iommu: Implement [set|unset]_iommu_device() callbacks
2024-06-03 12:56 ` Cédric Le Goater
@ 2024-06-04 3:46 ` Duan, Zhenzhong
0 siblings, 0 replies; 70+ messages in thread
From: Duan, Zhenzhong @ 2024-06-04 3:46 UTC (permalink / raw)
To: Cédric Le Goater, CLEMENT MATHIEU--DRIF,
qemu-devel@nongnu.org
Cc: alex.williamson@redhat.com, eric.auger@redhat.com, mst@redhat.com,
peterx@redhat.com, jasowang@redhat.com, jgg@nvidia.com,
nicolinc@nvidia.com, joao.m.martins@oracle.com, Tian, Kevin,
Liu, Yi L, Peng, Chao P, Yi Sun, Paolo Bonzini, Richard Henderson,
Eduardo Habkost, Marcel Apfelbaum
>-----Original Message-----
>From: Cédric Le Goater <clg@redhat.com>
>Subject: Re: [PATCH v6 18/19] intel_iommu: Implement
>[set|unset]_iommu_device() callbacks
>
>On 6/3/24 13:02, Duan, Zhenzhong wrote:
>>
>>
>>> -----Original Message-----
>>> From: CLEMENT MATHIEU--DRIF <clement.mathieu--drif@eviden.com>
>>> Subject: Re: [PATCH v6 18/19] intel_iommu: Implement
>>> [set|unset]_iommu_device() callbacks
>>>
>>>
>>> On 03/06/2024 08:10, Zhenzhong Duan wrote:
>>>>
>>>>
>>>> From: Yi Liu <yi.l.liu@intel.com>
>>>>
>>>> Implement [set|unset]_iommu_device() callbacks in Intel vIOMMU.
>>>> In set call, a new structure VTDHostIOMMUDevice which holds
>>>> a reference to HostIOMMUDevice is stored in hash table
>>>> indexed by PCI BDF.
>>>>
>>>> Signed-off-by: Yi Liu <yi.l.liu@intel.com>
>>>> Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
>>>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>>>> ---
>>>> hw/i386/intel_iommu_internal.h | 9 ++++
>>>> include/hw/i386/intel_iommu.h | 2 +
>>>> hw/i386/intel_iommu.c | 76
>>> ++++++++++++++++++++++++++++++++++
>>>> 3 files changed, 87 insertions(+)
>>>>
>>>> diff --git a/hw/i386/intel_iommu_internal.h
>>> b/hw/i386/intel_iommu_internal.h
>>>> index f8cf99bddf..b800d62ca0 100644
>>>> --- a/hw/i386/intel_iommu_internal.h
>>>> +++ b/hw/i386/intel_iommu_internal.h
>>>> @@ -28,6 +28,7 @@
>>>> #ifndef HW_I386_INTEL_IOMMU_INTERNAL_H
>>>> #define HW_I386_INTEL_IOMMU_INTERNAL_H
>>>> #include "hw/i386/intel_iommu.h"
>>>> +#include "sysemu/host_iommu_device.h"
>>>>
>>>> /*
>>>> * Intel IOMMU register specification
>>>> @@ -537,4 +538,12 @@ typedef struct VTDRootEntry VTDRootEntry;
>>>> #define VTD_SL_IGN_COM 0xbff0000000000000ULL
>>>> #define VTD_SL_TM (1ULL << 62)
>>>>
>>>> +
>>>> +typedef struct VTDHostIOMMUDevice {
>>>> + IntelIOMMUState *iommu_state;
>>>> + PCIBus *bus;
>>>> + uint8_t devfn;
>>>> + HostIOMMUDevice *dev;
>>>> + QLIST_ENTRY(VTDHostIOMMUDevice) next;
>>>> +} VTDHostIOMMUDevice;
>>>> #endif
>>>> diff --git a/include/hw/i386/intel_iommu.h
>>> b/include/hw/i386/intel_iommu.h
>>>> index 7d694b0813..2bbde41e45 100644
>>>> --- a/include/hw/i386/intel_iommu.h
>>>> +++ b/include/hw/i386/intel_iommu.h
>>>> @@ -293,6 +293,8 @@ struct IntelIOMMUState {
>>>> /* list of registered notifiers */
>>>> QLIST_HEAD(, VTDAddressSpace) vtd_as_with_notifiers;
>>>>
>>>> + GHashTable *vtd_host_iommu_dev; /*
>VTDHostIOMMUDevice
>>> */
>>>> +
>>>> /* interrupt remapping */
>>>> bool intr_enabled; /* Whether guest enabled IR */
>>>> dma_addr_t intr_root; /* Interrupt remapping table pointer
>*/
>>>> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
>>>> index 519063c8f8..747c988bc4 100644
>>>> --- a/hw/i386/intel_iommu.c
>>>> +++ b/hw/i386/intel_iommu.c
>>>> @@ -237,6 +237,13 @@ static gboolean vtd_as_equal(gconstpointer
>v1,
>>> gconstpointer v2)
>>>> (key1->pasid == key2->pasid);
>>>> }
>>>>
>>>> +static gboolean vtd_as_idev_equal(gconstpointer v1, gconstpointer v2)
>>>> +{
>>>> + const struct vtd_as_key *key1 = v1;
>>>> + const struct vtd_as_key *key2 = v2;
>>>> +
>>>> + return (key1->bus == key2->bus) && (key1->devfn == key2->devfn);
>>>> +}
>>>> /*
>>>> * Note that we use pointer to PCIBus as the key, so hashing/shifting
>>>> * based on the pointer value is intended. Note that we deal with
>>>> @@ -3812,6 +3819,70 @@ VTDAddressSpace
>>> *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus,
>>>> return vtd_dev_as;
>>>> }
>>>>
>>>> +static bool vtd_dev_set_iommu_device(PCIBus *bus, void *opaque, int
>>> devfn,
>>>> + HostIOMMUDevice *hiod, Error **errp)
>>>> +{
>>>> + IntelIOMMUState *s = opaque;
>>>> + VTDHostIOMMUDevice *vtd_hdev;
>>>> + struct vtd_as_key key = {
>>>> + .bus = bus,
>>>> + .devfn = devfn,
>>>> + };
>>>> + struct vtd_as_key *new_key;
>>>> +
>>>> + assert(hiod);
>>>> +
>>>> + vtd_iommu_lock(s);
>>>> +
>>>> + vtd_hdev = g_hash_table_lookup(s->vtd_host_iommu_dev, &key);
>>>> +
>>>> + if (vtd_hdev) {
>>>> + error_setg(errp, "IOMMUFD device already exist");
>>>> + vtd_iommu_unlock(s);
>>>> + return false;
>>>> + }
>>>> +
>>>> + vtd_hdev = g_malloc0(sizeof(VTDHostIOMMUDevice));
>>>> + vtd_hdev->bus = bus;
>>>> + vtd_hdev->devfn = (uint8_t)devfn;
>>>> + vtd_hdev->iommu_state = s;
>>>> + vtd_hdev->dev = hiod;
>>>> +
>>>> + new_key = g_malloc(sizeof(*new_key));
>>>> + new_key->bus = bus;
>>>> + new_key->devfn = devfn;
>>>> +
>>>> + object_ref(hiod);
>>>> + g_hash_table_insert(s->vtd_host_iommu_dev, new_key, vtd_hdev);
>>>> +
>>>> + vtd_iommu_unlock(s);
>>>> +
>>>> + return true;
>>>> +}
>>>> +
>>>> +static void vtd_dev_unset_iommu_device(PCIBus *bus, void *opaque,
>int
>>> devfn)
>>>> +{
>>>> + IntelIOMMUState *s = opaque;
>>>> + VTDHostIOMMUDevice *vtd_hdev;
>>>> + struct vtd_as_key key = {
>>>> + .bus = bus,
>>>> + .devfn = devfn,
>>>> + };
>>>> +
>>>> + vtd_iommu_lock(s);
>>>> +
>>>> + vtd_hdev = g_hash_table_lookup(s->vtd_host_iommu_dev, &key);
>>>> + if (!vtd_hdev) {
>>>> + vtd_iommu_unlock(s);
>>>> + return;
>>>> + }
>>>> +
>>>> + g_hash_table_remove(s->vtd_host_iommu_dev, &key);
>>>> + object_unref(vtd_hdev->dev);
>>> Not sure but isn't that a potential use after free?
>>
>> Good catch! Will fix. Should be:
>>
>> object_unref(vtd_hdev->dev);
>> g_hash_table_remove(s->vtd_host_iommu_dev, &key);
>
>you could also implement a custom destroy hash function.
Yes, but I'd like to keep the explicit object_unref() here so that it pairs with the object_ref() call in vtd_dev_set_iommu_device().
Thanks
Zhenzhong
* Re: [PATCH v6 18/19] intel_iommu: Implement [set|unset]_iommu_device() callbacks
2024-06-03 6:10 ` [PATCH v6 18/19] intel_iommu: Implement [set|unset]_iommu_device() callbacks Zhenzhong Duan
2024-06-03 10:12 ` CLEMENT MATHIEU--DRIF
@ 2024-06-03 14:13 ` Eric Auger
2024-06-04 5:40 ` Duan, Zhenzhong
2024-06-03 14:47 ` Eric Auger
2 siblings, 1 reply; 70+ messages in thread
From: Eric Auger @ 2024-06-03 14:13 UTC (permalink / raw)
To: Zhenzhong Duan, qemu-devel
Cc: alex.williamson, clg, mst, peterx, jasowang, jgg, nicolinc,
joao.m.martins, clement.mathieu--drif, kevin.tian, yi.l.liu,
chao.p.peng, Yi Sun, Paolo Bonzini, Richard Henderson,
Eduardo Habkost, Marcel Apfelbaum
Hi Zhenzhong,
On 6/3/24 08:10, Zhenzhong Duan wrote:
> From: Yi Liu <yi.l.liu@intel.com>
>
> Implement [set|unset]_iommu_device() callbacks in Intel vIOMMU.
> In set call, a new structure VTDHostIOMMUDevice which holds
> a reference to HostIOMMUDevice is stored in hash table
> indexed by PCI BDF.
Maybe specify that this is not the aliased one?
>
> Signed-off-by: Yi Liu <yi.l.liu@intel.com>
> Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> ---
> hw/i386/intel_iommu_internal.h | 9 ++++
> include/hw/i386/intel_iommu.h | 2 +
> hw/i386/intel_iommu.c | 76 ++++++++++++++++++++++++++++++++++
> 3 files changed, 87 insertions(+)
>
> diff --git a/hw/i386/intel_iommu_internal.h b/hw/i386/intel_iommu_internal.h
> index f8cf99bddf..b800d62ca0 100644
> --- a/hw/i386/intel_iommu_internal.h
> +++ b/hw/i386/intel_iommu_internal.h
> @@ -28,6 +28,7 @@
> #ifndef HW_I386_INTEL_IOMMU_INTERNAL_H
> #define HW_I386_INTEL_IOMMU_INTERNAL_H
> #include "hw/i386/intel_iommu.h"
> +#include "sysemu/host_iommu_device.h"
>
> /*
> * Intel IOMMU register specification
> @@ -537,4 +538,12 @@ typedef struct VTDRootEntry VTDRootEntry;
> #define VTD_SL_IGN_COM 0xbff0000000000000ULL
> #define VTD_SL_TM (1ULL << 62)
>
> +
> +typedef struct VTDHostIOMMUDevice {
> + IntelIOMMUState *iommu_state;
Why do you need the iommu_state?
> + PCIBus *bus;
> + uint8_t devfn;
> + HostIOMMUDevice *dev;
> + QLIST_ENTRY(VTDHostIOMMUDevice) next;
> +} VTDHostIOMMUDevice;
How VTD specific is it?
> #endif
> diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
> index 7d694b0813..2bbde41e45 100644
> --- a/include/hw/i386/intel_iommu.h
> +++ b/include/hw/i386/intel_iommu.h
> @@ -293,6 +293,8 @@ struct IntelIOMMUState {
> /* list of registered notifiers */
> QLIST_HEAD(, VTDAddressSpace) vtd_as_with_notifiers;
>
> + GHashTable *vtd_host_iommu_dev; /* VTDHostIOMMUDevice */
> +
> /* interrupt remapping */
> bool intr_enabled; /* Whether guest enabled IR */
> dma_addr_t intr_root; /* Interrupt remapping table pointer */
> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> index 519063c8f8..747c988bc4 100644
> --- a/hw/i386/intel_iommu.c
> +++ b/hw/i386/intel_iommu.c
> @@ -237,6 +237,13 @@ static gboolean vtd_as_equal(gconstpointer v1, gconstpointer v2)
> (key1->pasid == key2->pasid);
> }
>
> +static gboolean vtd_as_idev_equal(gconstpointer v1, gconstpointer v2)
> +{
> + const struct vtd_as_key *key1 = v1;
> + const struct vtd_as_key *key2 = v2;
> +
> + return (key1->bus == key2->bus) && (key1->devfn == key2->devfn);
> +}
can't you reuse the key with pasid?
> /*
> * Note that we use pointer to PCIBus as the key, so hashing/shifting
> * based on the pointer value is intended. Note that we deal with
> @@ -3812,6 +3819,70 @@ VTDAddressSpace *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus,
> return vtd_dev_as;
> }
>
> +static bool vtd_dev_set_iommu_device(PCIBus *bus, void *opaque, int devfn,
> + HostIOMMUDevice *hiod, Error **errp)
> +{
> + IntelIOMMUState *s = opaque;
> + VTDHostIOMMUDevice *vtd_hdev;
> + struct vtd_as_key key = {
> + .bus = bus,
> + .devfn = devfn,
> + };
> + struct vtd_as_key *new_key;
> +
> + assert(hiod);
> +
> + vtd_iommu_lock(s);
> +
> + vtd_hdev = g_hash_table_lookup(s->vtd_host_iommu_dev, &key);
> +
> + if (vtd_hdev) {
> + error_setg(errp, "IOMMUFD device already exist");
s/IOMMUFD/Host IOMMU device?
> + vtd_iommu_unlock(s);
> + return false;
> + }
> +
> + vtd_hdev = g_malloc0(sizeof(VTDHostIOMMUDevice));
> + vtd_hdev->bus = bus;
> + vtd_hdev->devfn = (uint8_t)devfn;
> + vtd_hdev->iommu_state = s;
> + vtd_hdev->dev = hiod;
> +
> + new_key = g_malloc(sizeof(*new_key));
> + new_key->bus = bus;
> + new_key->devfn = devfn;
> +
> + object_ref(hiod);
> + g_hash_table_insert(s->vtd_host_iommu_dev, new_key, vtd_hdev);
> +
> + vtd_iommu_unlock(s);
> +
> + return true;
> +}
> +
> +static void vtd_dev_unset_iommu_device(PCIBus *bus, void *opaque, int devfn)
> +{
> + IntelIOMMUState *s = opaque;
> + VTDHostIOMMUDevice *vtd_hdev;
> + struct vtd_as_key key = {
> + .bus = bus,
> + .devfn = devfn,
> + };
> +
> + vtd_iommu_lock(s);
> +
> + vtd_hdev = g_hash_table_lookup(s->vtd_host_iommu_dev, &key);
> + if (!vtd_hdev) {
> + vtd_iommu_unlock(s);
> + return;
> + }
> +
> + g_hash_table_remove(s->vtd_host_iommu_dev, &key);
> + object_unref(vtd_hdev->dev);
> +
> + vtd_iommu_unlock(s);
> +}
> +
> /* Unmap the whole range in the notifier's scope. */
> static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n)
> {
> @@ -4116,6 +4187,8 @@ static AddressSpace *vtd_host_dma_iommu(PCIBus *bus, void *opaque, int devfn)
>
> static PCIIOMMUOps vtd_iommu_ops = {
> .get_address_space = vtd_host_dma_iommu,
> + .set_iommu_device = vtd_dev_set_iommu_device,
> + .unset_iommu_device = vtd_dev_unset_iommu_device,
> };
>
> static bool vtd_decide_config(IntelIOMMUState *s, Error **errp)
> @@ -4235,6 +4308,9 @@ static void vtd_realize(DeviceState *dev, Error **errp)
> g_free, g_free);
> s->vtd_address_spaces = g_hash_table_new_full(vtd_as_hash, vtd_as_equal,
> g_free, g_free);
> + s->vtd_host_iommu_dev = g_hash_table_new_full(vtd_as_hash,
> + vtd_as_idev_equal,
> + g_free, g_free);
> vtd_init(s);
> pci_setup_iommu(bus, &vtd_iommu_ops, dev);
> /* Pseudo address space under root PCI bus. */
Thanks
Eric
* RE: [PATCH v6 18/19] intel_iommu: Implement [set|unset]_iommu_device() callbacks
2024-06-03 14:13 ` Eric Auger
@ 2024-06-04 5:40 ` Duan, Zhenzhong
2024-06-04 8:14 ` Eric Auger
0 siblings, 1 reply; 70+ messages in thread
From: Duan, Zhenzhong @ 2024-06-04 5:40 UTC (permalink / raw)
To: eric.auger@redhat.com, qemu-devel@nongnu.org
Cc: alex.williamson@redhat.com, clg@redhat.com, mst@redhat.com,
peterx@redhat.com, jasowang@redhat.com, jgg@nvidia.com,
nicolinc@nvidia.com, joao.m.martins@oracle.com,
clement.mathieu--drif@eviden.com, Tian, Kevin, Liu, Yi L,
Peng, Chao P, Yi Sun, Paolo Bonzini, Richard Henderson,
Eduardo Habkost, Marcel Apfelbaum
>-----Original Message-----
>From: Eric Auger <eric.auger@redhat.com>
>Subject: Re: [PATCH v6 18/19] intel_iommu: Implement
>[set|unset]_iommu_device() callbacks
>
>Hi Zhenzhong,
>
>On 6/3/24 08:10, Zhenzhong Duan wrote:
>> From: Yi Liu <yi.l.liu@intel.com>
>>
>> Implement [set|unset]_iommu_device() callbacks in Intel vIOMMU.
>> In set call, a new structure VTDHostIOMMUDevice which holds
>> a reference to HostIOMMUDevice is stored in hash table
>> indexed by PCI BDF.
>maybe precise that this is not the aliased one?
Sure.
>>
>> Signed-off-by: Yi Liu <yi.l.liu@intel.com>
>> Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>> ---
>> hw/i386/intel_iommu_internal.h | 9 ++++
>> include/hw/i386/intel_iommu.h | 2 +
>> hw/i386/intel_iommu.c | 76
>++++++++++++++++++++++++++++++++++
>> 3 files changed, 87 insertions(+)
>>
>> diff --git a/hw/i386/intel_iommu_internal.h
>b/hw/i386/intel_iommu_internal.h
>> index f8cf99bddf..b800d62ca0 100644
>> --- a/hw/i386/intel_iommu_internal.h
>> +++ b/hw/i386/intel_iommu_internal.h
>> @@ -28,6 +28,7 @@
>> #ifndef HW_I386_INTEL_IOMMU_INTERNAL_H
>> #define HW_I386_INTEL_IOMMU_INTERNAL_H
>> #include "hw/i386/intel_iommu.h"
>> +#include "sysemu/host_iommu_device.h"
>>
>> /*
>> * Intel IOMMU register specification
>> @@ -537,4 +538,12 @@ typedef struct VTDRootEntry VTDRootEntry;
>> #define VTD_SL_IGN_COM 0xbff0000000000000ULL
>> #define VTD_SL_TM (1ULL << 62)
>>
>> +
>> +typedef struct VTDHostIOMMUDevice {
>> + IntelIOMMUState *iommu_state;
>Why do you need the iommu_state?
It is used in the nesting series.
>> + PCIBus *bus;
>> + uint8_t devfn;
>> + HostIOMMUDevice *dev;
>> + QLIST_ENTRY(VTDHostIOMMUDevice) next;
>> +} VTDHostIOMMUDevice;
>How VTD specific is it?
In the nesting series, it has the iommu_state and errata members,
which are VTD specific.
>> #endif
>> diff --git a/include/hw/i386/intel_iommu.h
>b/include/hw/i386/intel_iommu.h
>> index 7d694b0813..2bbde41e45 100644
>> --- a/include/hw/i386/intel_iommu.h
>> +++ b/include/hw/i386/intel_iommu.h
>> @@ -293,6 +293,8 @@ struct IntelIOMMUState {
>> /* list of registered notifiers */
>> QLIST_HEAD(, VTDAddressSpace) vtd_as_with_notifiers;
>>
>> + GHashTable *vtd_host_iommu_dev; /* VTDHostIOMMUDevice
>*/
>> +
>> /* interrupt remapping */
>> bool intr_enabled; /* Whether guest enabled IR */
>> dma_addr_t intr_root; /* Interrupt remapping table pointer */
>> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
>> index 519063c8f8..747c988bc4 100644
>> --- a/hw/i386/intel_iommu.c
>> +++ b/hw/i386/intel_iommu.c
>> @@ -237,6 +237,13 @@ static gboolean vtd_as_equal(gconstpointer v1,
>gconstpointer v2)
>> (key1->pasid == key2->pasid);
>> }
>>
>> +static gboolean vtd_as_idev_equal(gconstpointer v1, gconstpointer v2)
>> +{
>> + const struct vtd_as_key *key1 = v1;
>> + const struct vtd_as_key *key2 = v2;
>> +
>> + return (key1->bus == key2->bus) && (key1->devfn == key2->devfn);
>> +}
>can't you reuse the key with pasid?
s->vtd_host_iommu_dev isn't indexed by pasid, only by BDF.
Maybe I'd better define its own key struct, hash() and equal() functions.
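As a sketch of what such a dedicated key could look like, here it is in plain C with no GLib types. The names `hiod_key`, `hiod_key_hash()` and `hiod_key_equal()` are hypothetical, and `const void *bus` stands in for `PCIBus *`. The hash is built from the bus pointer value and devfn, in line with the existing comment that hashing on the PCIBus pointer value is intended. In QEMU the functions would use `guint`, `gboolean` and `gconstpointer` so that they can be passed to `g_hash_table_new_full()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical key for the host IOMMU device table: BDF only, no pasid. */
struct hiod_key {
    const void *bus;  /* stands in for PCIBus * */
    uint8_t devfn;
};

/* Hash on the bus pointer value, with devfn in the low 8 bits. */
static unsigned int hiod_key_hash(const void *v)
{
    const struct hiod_key *key = v;
    unsigned int value = (unsigned int)(uintptr_t)key->bus;

    return (value << 8) | key->devfn;
}

/* Two keys match when they name the same bus and the same devfn. */
static bool hiod_key_equal(const void *v1, const void *v2)
{
    const struct hiod_key *key1 = v1;
    const struct hiod_key *key2 = v2;

    return key1->bus == key2->bus && key1->devfn == key2->devfn;
}
```

With a key type of its own, every field takes part in the comparison, so there is no pasid member left uninitialized or ignored.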
>> /*
>> * Note that we use pointer to PCIBus as the key, so hashing/shifting
>> * based on the pointer value is intended. Note that we deal with
>> @@ -3812,6 +3819,70 @@ VTDAddressSpace
>*vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus,
>> return vtd_dev_as;
>> }
>>
>> +static bool vtd_dev_set_iommu_device(PCIBus *bus, void *opaque, int
>devfn,
>> + HostIOMMUDevice *hiod, Error **errp)
>> +{
>> + IntelIOMMUState *s = opaque;
>> + VTDHostIOMMUDevice *vtd_hdev;
>> + struct vtd_as_key key = {
>> + .bus = bus,
>> + .devfn = devfn,
>> + };
>> + struct vtd_as_key *new_key;
>> +
>> + assert(hiod);
>> +
>> + vtd_iommu_lock(s);
>> +
>> + vtd_hdev = g_hash_table_lookup(s->vtd_host_iommu_dev, &key);
>> +
>> + if (vtd_hdev) {
>> + error_setg(errp, "IOMMUFD device already exist");
>s/IOMMUFD/Host IOMMU device?
Good catch, will fix.
Thanks
Zhenzhong
>> + vtd_iommu_unlock(s);
>> + return false;
>> + }
>> +
>> + vtd_hdev = g_malloc0(sizeof(VTDHostIOMMUDevice));
>> + vtd_hdev->bus = bus;
>> + vtd_hdev->devfn = (uint8_t)devfn;
>> + vtd_hdev->iommu_state = s;
>> + vtd_hdev->dev = hiod;
>> +
>> + new_key = g_malloc(sizeof(*new_key));
>> + new_key->bus = bus;
>> + new_key->devfn = devfn;
>> +
>> + object_ref(hiod);
>> + g_hash_table_insert(s->vtd_host_iommu_dev, new_key, vtd_hdev);
>> +
>> + vtd_iommu_unlock(s);
>> +
>> + return true;
>> +}
>> +
>> +static void vtd_dev_unset_iommu_device(PCIBus *bus, void *opaque, int
>devfn)
>> +{
>> + IntelIOMMUState *s = opaque;
>> + VTDHostIOMMUDevice *vtd_hdev;
>> + struct vtd_as_key key = {
>> + .bus = bus,
>> + .devfn = devfn,
>> + };
>> +
>> + vtd_iommu_lock(s);
>> +
>> + vtd_hdev = g_hash_table_lookup(s->vtd_host_iommu_dev, &key);
>> + if (!vtd_hdev) {
>> + vtd_iommu_unlock(s);
>> + return;
>> + }
>> +
>> + g_hash_table_remove(s->vtd_host_iommu_dev, &key);
>> + object_unref(vtd_hdev->dev);
>> +
>> + vtd_iommu_unlock(s);
>> +}
>> +
>> /* Unmap the whole range in the notifier's scope. */
>> static void vtd_address_space_unmap(VTDAddressSpace *as,
>IOMMUNotifier *n)
>> {
>> @@ -4116,6 +4187,8 @@ static AddressSpace
>*vtd_host_dma_iommu(PCIBus *bus, void *opaque, int devfn)
>>
>> static PCIIOMMUOps vtd_iommu_ops = {
>> .get_address_space = vtd_host_dma_iommu,
>> + .set_iommu_device = vtd_dev_set_iommu_device,
>> + .unset_iommu_device = vtd_dev_unset_iommu_device,
>> };
>>
>> static bool vtd_decide_config(IntelIOMMUState *s, Error **errp)
>> @@ -4235,6 +4308,9 @@ static void vtd_realize(DeviceState *dev, Error
>**errp)
>> g_free, g_free);
>> s->vtd_address_spaces = g_hash_table_new_full(vtd_as_hash,
>vtd_as_equal,
>> g_free, g_free);
>> + s->vtd_host_iommu_dev = g_hash_table_new_full(vtd_as_hash,
>> + vtd_as_idev_equal,
>> + g_free, g_free);
>> vtd_init(s);
>> pci_setup_iommu(bus, &vtd_iommu_ops, dev);
>> /* Pseudo address space under root PCI bus. */
>Thanks
>
>Eric
* Re: [PATCH v6 18/19] intel_iommu: Implement [set|unset]_iommu_device() callbacks
2024-06-04 5:40 ` Duan, Zhenzhong
@ 2024-06-04 8:14 ` Eric Auger
2024-06-04 8:48 ` Duan, Zhenzhong
0 siblings, 1 reply; 70+ messages in thread
From: Eric Auger @ 2024-06-04 8:14 UTC (permalink / raw)
To: Duan, Zhenzhong, qemu-devel@nongnu.org
Cc: alex.williamson@redhat.com, clg@redhat.com, mst@redhat.com,
peterx@redhat.com, jasowang@redhat.com, jgg@nvidia.com,
nicolinc@nvidia.com, joao.m.martins@oracle.com,
clement.mathieu--drif@eviden.com, Tian, Kevin, Liu, Yi L,
Peng, Chao P, Yi Sun, Paolo Bonzini, Richard Henderson,
Eduardo Habkost, Marcel Apfelbaum
On 6/4/24 07:40, Duan, Zhenzhong wrote:
>
>> -----Original Message-----
>> From: Eric Auger <eric.auger@redhat.com>
>> Subject: Re: [PATCH v6 18/19] intel_iommu: Implement
>> [set|unset]_iommu_device() callbacks
>>
>> Hi Zhenzhong,
>>
>> On 6/3/24 08:10, Zhenzhong Duan wrote:
>>> From: Yi Liu <yi.l.liu@intel.com>
>>>
>>> Implement [set|unset]_iommu_device() callbacks in Intel vIOMMU.
>>> In set call, a new structure VTDHostIOMMUDevice which holds
>>> a reference to HostIOMMUDevice is stored in hash table
>>> indexed by PCI BDF.
>> maybe precise that this is not the aliased one?
> Sure.
>
>>> Signed-off-by: Yi Liu <yi.l.liu@intel.com>
>>> Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
>>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>>> ---
>>> hw/i386/intel_iommu_internal.h | 9 ++++
>>> include/hw/i386/intel_iommu.h | 2 +
>>> hw/i386/intel_iommu.c | 76
>> ++++++++++++++++++++++++++++++++++
>>> 3 files changed, 87 insertions(+)
>>>
>>> diff --git a/hw/i386/intel_iommu_internal.h
>> b/hw/i386/intel_iommu_internal.h
>>> index f8cf99bddf..b800d62ca0 100644
>>> --- a/hw/i386/intel_iommu_internal.h
>>> +++ b/hw/i386/intel_iommu_internal.h
>>> @@ -28,6 +28,7 @@
>>> #ifndef HW_I386_INTEL_IOMMU_INTERNAL_H
>>> #define HW_I386_INTEL_IOMMU_INTERNAL_H
>>> #include "hw/i386/intel_iommu.h"
>>> +#include "sysemu/host_iommu_device.h"
>>>
>>> /*
>>> * Intel IOMMU register specification
>>> @@ -537,4 +538,12 @@ typedef struct VTDRootEntry VTDRootEntry;
>>> #define VTD_SL_IGN_COM 0xbff0000000000000ULL
>>> #define VTD_SL_TM (1ULL << 62)
>>>
>>> +
>>> +typedef struct VTDHostIOMMUDevice {
>>> + IntelIOMMUState *iommu_state;
>> Why do you need the iommu_state?
> It is used in nesting series.
>
>>> + PCIBus *bus;
>>> + uint8_t devfn;
>>> + HostIOMMUDevice *dev;
>>> + QLIST_ENTRY(VTDHostIOMMUDevice) next;
>>> +} VTDHostIOMMUDevice;
>> How VTD specific is it?
> In nesting series, it has element iommu_state and errata
> which are VTD specific.
So at least I would add a comment in the commit message explaining this.
>
>>> #endif
>>> diff --git a/include/hw/i386/intel_iommu.h
>> b/include/hw/i386/intel_iommu.h
>>> index 7d694b0813..2bbde41e45 100644
>>> --- a/include/hw/i386/intel_iommu.h
>>> +++ b/include/hw/i386/intel_iommu.h
>>> @@ -293,6 +293,8 @@ struct IntelIOMMUState {
>>> /* list of registered notifiers */
>>> QLIST_HEAD(, VTDAddressSpace) vtd_as_with_notifiers;
>>>
>>> + GHashTable *vtd_host_iommu_dev; /* VTDHostIOMMUDevice
>> */
>>> +
>>> /* interrupt remapping */
>>> bool intr_enabled; /* Whether guest enabled IR */
>>> dma_addr_t intr_root; /* Interrupt remapping table pointer */
>>> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
>>> index 519063c8f8..747c988bc4 100644
>>> --- a/hw/i386/intel_iommu.c
>>> +++ b/hw/i386/intel_iommu.c
>>> @@ -237,6 +237,13 @@ static gboolean vtd_as_equal(gconstpointer v1,
>> gconstpointer v2)
>>> (key1->pasid == key2->pasid);
>>> }
>>>
>>> +static gboolean vtd_as_idev_equal(gconstpointer v1, gconstpointer v2)
>>> +{
>>> + const struct vtd_as_key *key1 = v1;
>>> + const struct vtd_as_key *key2 = v2;
>>> +
>>> + return (key1->bus == key2->bus) && (key1->devfn == key2->devfn);
>>> +}
>> can't you reuse the key with pasid?
> s->vtd_host_iommu_dev isn't indexed by pasid but only BDF.
> Maybe I'd better to define its own key struct, hash() and equal() functions.
You could set a default PASID, but it's up to you.
Eric
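
Editorial sketch of the two keying options discussed above, as a minimal illustration rather than the code from the series. All names here (sketch_as_key, sketch_hiod_key, SKETCH_DEFAULT_PASID and the helpers) are hypothetical stand-ins, and the choice of 0 as the default PASID is an assumption made only for this example. Option A reuses a PASID-aware key and always fills in a default PASID; option B defines a dedicated key holding only bus and devfn.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; these are not the QEMU definitions. */
#define SKETCH_DEFAULT_PASID 0u   /* assumed default value, for this sketch only */

/* Option A: reuse a PASID-aware key and always store a default PASID. */
struct sketch_as_key {
    const void *bus;   /* the bus pointer itself is part of the key */
    uint8_t devfn;
    uint32_t pasid;
};

/* Full comparison: bus, devfn and PASID must all match. */
bool sketch_as_equal(const struct sketch_as_key *a,
                     const struct sketch_as_key *b)
{
    return a->bus == b->bus && a->devfn == b->devfn && a->pasid == b->pasid;
}

/* Build a key for a per-device table under option A. */
struct sketch_as_key sketch_as_key_for_device(const void *bus, uint8_t devfn)
{
    struct sketch_as_key key = { bus, devfn, SKETCH_DEFAULT_PASID };
    return key;
}

/* Option B: a dedicated key that only carries bus and devfn. */
struct sketch_hiod_key {
    const void *bus;
    uint8_t devfn;
};

/* BDF-only comparison; there is no PASID field to ignore or default. */
bool sketch_hiod_equal(const struct sketch_hiod_key *a,
                       const struct sketch_hiod_key *b)
{
    return a->bus == b->bus && a->devfn == b->devfn;
}
```

With option A, every key stored in the per-device table must be built through the same helper so the PASID field never differs between insert and lookup; option B makes that mistake impossible at the cost of one more key type and comparison function.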
>
>>> /*
>>> * Note that we use pointer to PCIBus as the key, so hashing/shifting
>>> * based on the pointer value is intended. Note that we deal with
>>> @@ -3812,6 +3819,70 @@ VTDAddressSpace
>> *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus,
>>> return vtd_dev_as;
>>> }
>>>
>>> +static bool vtd_dev_set_iommu_device(PCIBus *bus, void *opaque, int
>> devfn,
>>> + HostIOMMUDevice *hiod, Error **errp)
>>> +{
>>> + IntelIOMMUState *s = opaque;
>>> + VTDHostIOMMUDevice *vtd_hdev;
>>> + struct vtd_as_key key = {
>>> + .bus = bus,
>>> + .devfn = devfn,
>>> + };
>>> + struct vtd_as_key *new_key;
>>> +
>>> + assert(hiod);
>>> +
>>> + vtd_iommu_lock(s);
>>> +
>>> + vtd_hdev = g_hash_table_lookup(s->vtd_host_iommu_dev, &key);
>>> +
>>> + if (vtd_hdev) {
>>> + error_setg(errp, "IOMMUFD device already exist");
>> s/IOMMUFD/Host IOMMU device?
> Good catch, will fix.
>
> Thanks
> Zhenzhong
>
>>> + vtd_iommu_unlock(s);
>>> + return false;
>>> + }
>>> +
>>> + vtd_hdev = g_malloc0(sizeof(VTDHostIOMMUDevice));
>>> + vtd_hdev->bus = bus;
>>> + vtd_hdev->devfn = (uint8_t)devfn;
>>> + vtd_hdev->iommu_state = s;
>>> + vtd_hdev->dev = hiod;
>>> +
>>> + new_key = g_malloc(sizeof(*new_key));
>>> + new_key->bus = bus;
>>> + new_key->devfn = devfn;
>>> +
>>> + object_ref(hiod);
>>> + g_hash_table_insert(s->vtd_host_iommu_dev, new_key, vtd_hdev);
>>> +
>>> + vtd_iommu_unlock(s);
>>> +
>>> + return true;
>>> +}
>>> +
>>> +static void vtd_dev_unset_iommu_device(PCIBus *bus, void *opaque, int
>> devfn)
>>> +{
>>> + IntelIOMMUState *s = opaque;
>>> + VTDHostIOMMUDevice *vtd_hdev;
>>> + struct vtd_as_key key = {
>>> + .bus = bus,
>>> + .devfn = devfn,
>>> + };
>>> +
>>> + vtd_iommu_lock(s);
>>> +
>>> + vtd_hdev = g_hash_table_lookup(s->vtd_host_iommu_dev, &key);
>>> + if (!vtd_hdev) {
>>> + vtd_iommu_unlock(s);
>>> + return;
>>> + }
>>> +
>>> + g_hash_table_remove(s->vtd_host_iommu_dev, &key);
>>> + object_unref(vtd_hdev->dev);
>>> +
>>> + vtd_iommu_unlock(s);
>>> +}
>>> +
>>> /* Unmap the whole range in the notifier's scope. */
>>> static void vtd_address_space_unmap(VTDAddressSpace *as,
>> IOMMUNotifier *n)
>>> {
>>> @@ -4116,6 +4187,8 @@ static AddressSpace
>> *vtd_host_dma_iommu(PCIBus *bus, void *opaque, int devfn)
>>> static PCIIOMMUOps vtd_iommu_ops = {
>>> .get_address_space = vtd_host_dma_iommu,
>>> + .set_iommu_device = vtd_dev_set_iommu_device,
>>> + .unset_iommu_device = vtd_dev_unset_iommu_device,
>>> };
>>>
>>> static bool vtd_decide_config(IntelIOMMUState *s, Error **errp)
>>> @@ -4235,6 +4308,9 @@ static void vtd_realize(DeviceState *dev, Error
>> **errp)
>>> g_free, g_free);
>>> s->vtd_address_spaces = g_hash_table_new_full(vtd_as_hash,
>> vtd_as_equal,
>>> g_free, g_free);
>>> + s->vtd_host_iommu_dev = g_hash_table_new_full(vtd_as_hash,
>>> + vtd_as_idev_equal,
>>> + g_free, g_free);
>>> vtd_init(s);
>>> pci_setup_iommu(bus, &vtd_iommu_ops, dev);
>>> /* Pseudo address space under root PCI bus. */
>> Thanks
>>
>> Eric
* RE: [PATCH v6 18/19] intel_iommu: Implement [set|unset]_iommu_device() callbacks
2024-06-04 8:14 ` Eric Auger
@ 2024-06-04 8:48 ` Duan, Zhenzhong
2024-06-04 9:38 ` Eric Auger
0 siblings, 1 reply; 70+ messages in thread
From: Duan, Zhenzhong @ 2024-06-04 8:48 UTC (permalink / raw)
To: eric.auger@redhat.com, qemu-devel@nongnu.org
Cc: alex.williamson@redhat.com, clg@redhat.com, mst@redhat.com,
peterx@redhat.com, jasowang@redhat.com, jgg@nvidia.com,
nicolinc@nvidia.com, joao.m.martins@oracle.com,
clement.mathieu--drif@eviden.com, Tian, Kevin, Liu, Yi L,
Peng, Chao P, Yi Sun, Paolo Bonzini, Richard Henderson,
Eduardo Habkost, Marcel Apfelbaum
>-----Original Message-----
>From: Eric Auger <eric.auger@redhat.com>
>Subject: Re: [PATCH v6 18/19] intel_iommu: Implement
>[set|unset]_iommu_device() callbacks
>
>
>
>On 6/4/24 07:40, Duan, Zhenzhong wrote:
>>
>>> -----Original Message-----
>>> From: Eric Auger <eric.auger@redhat.com>
>>> Subject: Re: [PATCH v6 18/19] intel_iommu: Implement
>>> [set|unset]_iommu_device() callbacks
>>>
>>> Hi Zhenzhong,
>>>
>>> On 6/3/24 08:10, Zhenzhong Duan wrote:
>>>> From: Yi Liu <yi.l.liu@intel.com>
>>>>
>>>> Implement [set|unset]_iommu_device() callbacks in Intel vIOMMU.
>>>> In set call, a new structure VTDHostIOMMUDevice which holds
>>>> a reference to HostIOMMUDevice is stored in hash table
>>>> indexed by PCI BDF.
>>> maybe precise that this is not the aliased one?
>> Sure.
>>
>>>> Signed-off-by: Yi Liu <yi.l.liu@intel.com>
>>>> Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
>>>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>>>> ---
>>>> hw/i386/intel_iommu_internal.h | 9 ++++
>>>> include/hw/i386/intel_iommu.h | 2 +
>>>> hw/i386/intel_iommu.c | 76
>>> ++++++++++++++++++++++++++++++++++
>>>> 3 files changed, 87 insertions(+)
>>>>
>>>> diff --git a/hw/i386/intel_iommu_internal.h
>>> b/hw/i386/intel_iommu_internal.h
>>>> index f8cf99bddf..b800d62ca0 100644
>>>> --- a/hw/i386/intel_iommu_internal.h
>>>> +++ b/hw/i386/intel_iommu_internal.h
>>>> @@ -28,6 +28,7 @@
>>>> #ifndef HW_I386_INTEL_IOMMU_INTERNAL_H
>>>> #define HW_I386_INTEL_IOMMU_INTERNAL_H
>>>> #include "hw/i386/intel_iommu.h"
>>>> +#include "sysemu/host_iommu_device.h"
>>>>
>>>> /*
>>>> * Intel IOMMU register specification
>>>> @@ -537,4 +538,12 @@ typedef struct VTDRootEntry VTDRootEntry;
>>>> #define VTD_SL_IGN_COM 0xbff0000000000000ULL
>>>> #define VTD_SL_TM (1ULL << 62)
>>>>
>>>> +
>>>> +typedef struct VTDHostIOMMUDevice {
>>>> + IntelIOMMUState *iommu_state;
>>> Why do you need the iommu_state?
>> It is used in nesting series.
>>
>>>> + PCIBus *bus;
>>>> + uint8_t devfn;
>>>> + HostIOMMUDevice *dev;
>>>> + QLIST_ENTRY(VTDHostIOMMUDevice) next;
>>>> +} VTDHostIOMMUDevice;
>>> How VTD specific is it?
>> In nesting series, it has element iommu_state and errata
>> which are VTD specific.
>
>so at least I would add a comment in the commit message explaining this.
I'd like to drop it and reintroduce it in the nesting series.
Thanks
Zhenzhong
* Re: [PATCH v6 18/19] intel_iommu: Implement [set|unset]_iommu_device() callbacks
2024-06-04 8:48 ` Duan, Zhenzhong
@ 2024-06-04 9:38 ` Eric Auger
0 siblings, 0 replies; 70+ messages in thread
From: Eric Auger @ 2024-06-04 9:38 UTC (permalink / raw)
To: Duan, Zhenzhong, qemu-devel@nongnu.org
Cc: alex.williamson@redhat.com, clg@redhat.com, mst@redhat.com,
peterx@redhat.com, jasowang@redhat.com, jgg@nvidia.com,
nicolinc@nvidia.com, joao.m.martins@oracle.com,
clement.mathieu--drif@eviden.com, Tian, Kevin, Liu, Yi L,
Peng, Chao P, Yi Sun, Paolo Bonzini, Richard Henderson,
Eduardo Habkost, Marcel Apfelbaum
On 6/4/24 10:48, Duan, Zhenzhong wrote:
>
>> -----Original Message-----
>> From: Eric Auger <eric.auger@redhat.com>
>> Subject: Re: [PATCH v6 18/19] intel_iommu: Implement
>> [set|unset]_iommu_device() callbacks
>>
>>
>>
>> On 6/4/24 07:40, Duan, Zhenzhong wrote:
>>>> -----Original Message-----
>>>> From: Eric Auger <eric.auger@redhat.com>
>>>> Subject: Re: [PATCH v6 18/19] intel_iommu: Implement
>>>> [set|unset]_iommu_device() callbacks
>>>>
>>>> Hi Zhenzhong,
>>>>
>>>> On 6/3/24 08:10, Zhenzhong Duan wrote:
>>>>> From: Yi Liu <yi.l.liu@intel.com>
>>>>>
>>>>> Implement [set|unset]_iommu_device() callbacks in Intel vIOMMU.
>>>>> In set call, a new structure VTDHostIOMMUDevice which holds
>>>>> a reference to HostIOMMUDevice is stored in hash table
>>>>> indexed by PCI BDF.
>>>> maybe precise that this is not the aliased one?
>>> Sure.
>>>
>>>>> Signed-off-by: Yi Liu <yi.l.liu@intel.com>
>>>>> Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
>>>>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>>>>> ---
>>>>> hw/i386/intel_iommu_internal.h | 9 ++++
>>>>> include/hw/i386/intel_iommu.h | 2 +
>>>>> hw/i386/intel_iommu.c | 76
>>>> ++++++++++++++++++++++++++++++++++
>>>>> 3 files changed, 87 insertions(+)
>>>>>
>>>>> diff --git a/hw/i386/intel_iommu_internal.h
>>>> b/hw/i386/intel_iommu_internal.h
>>>>> index f8cf99bddf..b800d62ca0 100644
>>>>> --- a/hw/i386/intel_iommu_internal.h
>>>>> +++ b/hw/i386/intel_iommu_internal.h
>>>>> @@ -28,6 +28,7 @@
>>>>> #ifndef HW_I386_INTEL_IOMMU_INTERNAL_H
>>>>> #define HW_I386_INTEL_IOMMU_INTERNAL_H
>>>>> #include "hw/i386/intel_iommu.h"
>>>>> +#include "sysemu/host_iommu_device.h"
>>>>>
>>>>> /*
>>>>> * Intel IOMMU register specification
>>>>> @@ -537,4 +538,12 @@ typedef struct VTDRootEntry VTDRootEntry;
>>>>> #define VTD_SL_IGN_COM 0xbff0000000000000ULL
>>>>> #define VTD_SL_TM (1ULL << 62)
>>>>>
>>>>> +
>>>>> +typedef struct VTDHostIOMMUDevice {
>>>>> + IntelIOMMUState *iommu_state;
>>>> Why do you need the iommu_state?
>>> It is used in nesting series.
>>>
>>>>> + PCIBus *bus;
>>>>> + uint8_t devfn;
>>>>> + HostIOMMUDevice *dev;
>>>>> + QLIST_ENTRY(VTDHostIOMMUDevice) next;
>>>>> +} VTDHostIOMMUDevice;
>>>> How VTD specific is it?
>>> In nesting series, it has element iommu_state and errata
>>> which are VTD specific.
>> so at least I would add a comment in the commit message explaining this.
> I'd like to drop it and reintroduce it in nesting series.
OK, then just justify the choice of introducing another struct: other
VTD-specific fields will be introduced later on.
Eric
>
> Thanks
> Zhenzhong
>
* Re: [PATCH v6 18/19] intel_iommu: Implement [set|unset]_iommu_device() callbacks
2024-06-03 6:10 ` [PATCH v6 18/19] intel_iommu: Implement [set|unset]_iommu_device() callbacks Zhenzhong Duan
2024-06-03 10:12 ` CLEMENT MATHIEU--DRIF
2024-06-03 14:13 ` Eric Auger
@ 2024-06-03 14:47 ` Eric Auger
2024-06-04 5:46 ` Duan, Zhenzhong
2 siblings, 1 reply; 70+ messages in thread
From: Eric Auger @ 2024-06-03 14:47 UTC (permalink / raw)
To: Zhenzhong Duan, qemu-devel
Cc: alex.williamson, clg, mst, peterx, jasowang, jgg, nicolinc,
joao.m.martins, clement.mathieu--drif, kevin.tian, yi.l.liu,
chao.p.peng, Yi Sun, Paolo Bonzini, Richard Henderson,
Eduardo Habkost, Marcel Apfelbaum
On 6/3/24 08:10, Zhenzhong Duan wrote:
> From: Yi Liu <yi.l.liu@intel.com>
>
> Implement [set|unset]_iommu_device() callbacks in Intel vIOMMU.
> In set call, a new structure VTDHostIOMMUDevice which holds
> a reference to HostIOMMUDevice is stored in hash table
> indexed by PCI BDF.
>
> Signed-off-by: Yi Liu <yi.l.liu@intel.com>
> Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> ---
> hw/i386/intel_iommu_internal.h | 9 ++++
> include/hw/i386/intel_iommu.h | 2 +
> hw/i386/intel_iommu.c | 76 ++++++++++++++++++++++++++++++++++
> 3 files changed, 87 insertions(+)
>
> diff --git a/hw/i386/intel_iommu_internal.h b/hw/i386/intel_iommu_internal.h
> index f8cf99bddf..b800d62ca0 100644
> --- a/hw/i386/intel_iommu_internal.h
> +++ b/hw/i386/intel_iommu_internal.h
> @@ -28,6 +28,7 @@
> #ifndef HW_I386_INTEL_IOMMU_INTERNAL_H
> #define HW_I386_INTEL_IOMMU_INTERNAL_H
> #include "hw/i386/intel_iommu.h"
> +#include "sysemu/host_iommu_device.h"
>
> /*
> * Intel IOMMU register specification
> @@ -537,4 +538,12 @@ typedef struct VTDRootEntry VTDRootEntry;
> #define VTD_SL_IGN_COM 0xbff0000000000000ULL
> #define VTD_SL_TM (1ULL << 62)
>
> +
> +typedef struct VTDHostIOMMUDevice {
> + IntelIOMMUState *iommu_state;
> + PCIBus *bus;
> + uint8_t devfn;
> + HostIOMMUDevice *dev;
> + QLIST_ENTRY(VTDHostIOMMUDevice) next;
> +} VTDHostIOMMUDevice;
> #endif
> diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
> index 7d694b0813..2bbde41e45 100644
> --- a/include/hw/i386/intel_iommu.h
> +++ b/include/hw/i386/intel_iommu.h
> @@ -293,6 +293,8 @@ struct IntelIOMMUState {
> /* list of registered notifiers */
> QLIST_HEAD(, VTDAddressSpace) vtd_as_with_notifiers;
>
> + GHashTable *vtd_host_iommu_dev; /* VTDHostIOMMUDevice */
> +
> /* interrupt remapping */
> bool intr_enabled; /* Whether guest enabled IR */
> dma_addr_t intr_root; /* Interrupt remapping table pointer */
> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> index 519063c8f8..747c988bc4 100644
> --- a/hw/i386/intel_iommu.c
> +++ b/hw/i386/intel_iommu.c
> @@ -237,6 +237,13 @@ static gboolean vtd_as_equal(gconstpointer v1, gconstpointer v2)
> (key1->pasid == key2->pasid);
> }
>
> +static gboolean vtd_as_idev_equal(gconstpointer v1, gconstpointer v2)
> +{
> + const struct vtd_as_key *key1 = v1;
> + const struct vtd_as_key *key2 = v2;
> +
> + return (key1->bus == key2->bus) && (key1->devfn == key2->devfn);
> +}
> /*
> * Note that we use pointer to PCIBus as the key, so hashing/shifting
> * based on the pointer value is intended. Note that we deal with
> @@ -3812,6 +3819,70 @@ VTDAddressSpace *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus,
> return vtd_dev_as;
> }
>
> +static bool vtd_dev_set_iommu_device(PCIBus *bus, void *opaque, int devfn,
> + HostIOMMUDevice *hiod, Error **errp)
> +{
> + IntelIOMMUState *s = opaque;
> + VTDHostIOMMUDevice *vtd_hdev;
> + struct vtd_as_key key = {
> + .bus = bus,
> + .devfn = devfn,
> + };
> + struct vtd_as_key *new_key;
> +
> + assert(hiod);
> +
> + vtd_iommu_lock(s);
> +
> + vtd_hdev = g_hash_table_lookup(s->vtd_host_iommu_dev, &key);
> +
> + if (vtd_hdev) {
> + error_setg(errp, "IOMMUFD device already exist");
> + vtd_iommu_unlock(s);
> + return false;
> + }
> +
> + vtd_hdev = g_malloc0(sizeof(VTDHostIOMMUDevice));
> + vtd_hdev->bus = bus;
> + vtd_hdev->devfn = (uint8_t)devfn;
> + vtd_hdev->iommu_state = s;
> + vtd_hdev->dev = hiod;
I am still not totally clear about why we couldn't reuse the VTDAddressSpace
instance for this bus/devfn. Is it a matter of aliased versus non-aliased
bus/devfn, or a matter of PASID difference? An AddressSpace could back
an assigned device, in which case a HostIOMMUDevice could be added to
the latter. I think this should be explained in the commit message.
Eric
> +
> + new_key = g_malloc(sizeof(*new_key));
> + new_key->bus = bus;
> + new_key->devfn = devfn;
> +
> + object_ref(hiod);
> + g_hash_table_insert(s->vtd_host_iommu_dev, new_key, vtd_hdev);
> +
> + vtd_iommu_unlock(s);
> +
> + return true;
> +}
> +
> +static void vtd_dev_unset_iommu_device(PCIBus *bus, void *opaque, int devfn)
> +{
> + IntelIOMMUState *s = opaque;
> + VTDHostIOMMUDevice *vtd_hdev;
> + struct vtd_as_key key = {
> + .bus = bus,
> + .devfn = devfn,
> + };
> +
> + vtd_iommu_lock(s);
> +
> + vtd_hdev = g_hash_table_lookup(s->vtd_host_iommu_dev, &key);
> + if (!vtd_hdev) {
> + vtd_iommu_unlock(s);
> + return;
> + }
> +
> + g_hash_table_remove(s->vtd_host_iommu_dev, &key);
> + object_unref(vtd_hdev->dev);
> +
> + vtd_iommu_unlock(s);
> +}
> +
> /* Unmap the whole range in the notifier's scope. */
> static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n)
> {
> @@ -4116,6 +4187,8 @@ static AddressSpace *vtd_host_dma_iommu(PCIBus *bus, void *opaque, int devfn)
>
> static PCIIOMMUOps vtd_iommu_ops = {
> .get_address_space = vtd_host_dma_iommu,
> + .set_iommu_device = vtd_dev_set_iommu_device,
> + .unset_iommu_device = vtd_dev_unset_iommu_device,
> };
>
> static bool vtd_decide_config(IntelIOMMUState *s, Error **errp)
> @@ -4235,6 +4308,9 @@ static void vtd_realize(DeviceState *dev, Error **errp)
> g_free, g_free);
> s->vtd_address_spaces = g_hash_table_new_full(vtd_as_hash, vtd_as_equal,
> g_free, g_free);
> + s->vtd_host_iommu_dev = g_hash_table_new_full(vtd_as_hash,
> + vtd_as_idev_equal,
> + g_free, g_free);
> vtd_init(s);
> pci_setup_iommu(bus, &vtd_iommu_ops, dev);
> /* Pseudo address space under root PCI bus. */
* RE: [PATCH v6 18/19] intel_iommu: Implement [set|unset]_iommu_device() callbacks
2024-06-03 14:47 ` Eric Auger
@ 2024-06-04 5:46 ` Duan, Zhenzhong
2024-06-04 8:17 ` Eric Auger
0 siblings, 1 reply; 70+ messages in thread
From: Duan, Zhenzhong @ 2024-06-04 5:46 UTC (permalink / raw)
To: eric.auger@redhat.com, qemu-devel@nongnu.org
Cc: alex.williamson@redhat.com, clg@redhat.com, mst@redhat.com,
peterx@redhat.com, jasowang@redhat.com, jgg@nvidia.com,
nicolinc@nvidia.com, joao.m.martins@oracle.com,
clement.mathieu--drif@eviden.com, Tian, Kevin, Liu, Yi L,
Peng, Chao P, Yi Sun, Paolo Bonzini, Richard Henderson,
Eduardo Habkost, Marcel Apfelbaum
>-----Original Message-----
>From: Eric Auger <eric.auger@redhat.com>
>Subject: Re: [PATCH v6 18/19] intel_iommu: Implement
>[set|unset]_iommu_device() callbacks
>
>
>
>On 6/3/24 08:10, Zhenzhong Duan wrote:
>> From: Yi Liu <yi.l.liu@intel.com>
>>
>> Implement [set|unset]_iommu_device() callbacks in Intel vIOMMU.
>> In set call, a new structure VTDHostIOMMUDevice which holds
>> a reference to HostIOMMUDevice is stored in hash table
>> indexed by PCI BDF.
>>
>> Signed-off-by: Yi Liu <yi.l.liu@intel.com>
>> Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>> ---
>> hw/i386/intel_iommu_internal.h | 9 ++++
>> include/hw/i386/intel_iommu.h | 2 +
>> hw/i386/intel_iommu.c | 76
>++++++++++++++++++++++++++++++++++
>> 3 files changed, 87 insertions(+)
>>
>> diff --git a/hw/i386/intel_iommu_internal.h
>b/hw/i386/intel_iommu_internal.h
>> index f8cf99bddf..b800d62ca0 100644
>> --- a/hw/i386/intel_iommu_internal.h
>> +++ b/hw/i386/intel_iommu_internal.h
>> @@ -28,6 +28,7 @@
>> #ifndef HW_I386_INTEL_IOMMU_INTERNAL_H
>> #define HW_I386_INTEL_IOMMU_INTERNAL_H
>> #include "hw/i386/intel_iommu.h"
>> +#include "sysemu/host_iommu_device.h"
>>
>> /*
>> * Intel IOMMU register specification
>> @@ -537,4 +538,12 @@ typedef struct VTDRootEntry VTDRootEntry;
>> #define VTD_SL_IGN_COM 0xbff0000000000000ULL
>> #define VTD_SL_TM (1ULL << 62)
>>
>> +
>> +typedef struct VTDHostIOMMUDevice {
>> + IntelIOMMUState *iommu_state;
>> + PCIBus *bus;
>> + uint8_t devfn;
>> + HostIOMMUDevice *dev;
>> + QLIST_ENTRY(VTDHostIOMMUDevice) next;
>> +} VTDHostIOMMUDevice;
>> #endif
>> diff --git a/include/hw/i386/intel_iommu.h
>b/include/hw/i386/intel_iommu.h
>> index 7d694b0813..2bbde41e45 100644
>> --- a/include/hw/i386/intel_iommu.h
>> +++ b/include/hw/i386/intel_iommu.h
>> @@ -293,6 +293,8 @@ struct IntelIOMMUState {
>> /* list of registered notifiers */
>> QLIST_HEAD(, VTDAddressSpace) vtd_as_with_notifiers;
>>
>> + GHashTable *vtd_host_iommu_dev; /* VTDHostIOMMUDevice
>*/
>> +
>> /* interrupt remapping */
>> bool intr_enabled; /* Whether guest enabled IR */
>> dma_addr_t intr_root; /* Interrupt remapping table pointer */
>> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
>> index 519063c8f8..747c988bc4 100644
>> --- a/hw/i386/intel_iommu.c
>> +++ b/hw/i386/intel_iommu.c
>> @@ -237,6 +237,13 @@ static gboolean vtd_as_equal(gconstpointer v1,
>gconstpointer v2)
>> (key1->pasid == key2->pasid);
>> }
>>
>> +static gboolean vtd_as_idev_equal(gconstpointer v1, gconstpointer v2)
>> +{
>> + const struct vtd_as_key *key1 = v1;
>> + const struct vtd_as_key *key2 = v2;
>> +
>> + return (key1->bus == key2->bus) && (key1->devfn == key2->devfn);
>> +}
>> /*
>> * Note that we use pointer to PCIBus as the key, so hashing/shifting
>> * based on the pointer value is intended. Note that we deal with
>> @@ -3812,6 +3819,70 @@ VTDAddressSpace
>*vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus,
>> return vtd_dev_as;
>> }
>>
>> +static bool vtd_dev_set_iommu_device(PCIBus *bus, void *opaque, int
>devfn,
>> + HostIOMMUDevice *hiod, Error **errp)
>> +{
>> + IntelIOMMUState *s = opaque;
>> + VTDHostIOMMUDevice *vtd_hdev;
>> + struct vtd_as_key key = {
>> + .bus = bus,
>> + .devfn = devfn,
>> + };
>> + struct vtd_as_key *new_key;
>> +
>> + assert(hiod);
>> +
>> + vtd_iommu_lock(s);
>> +
>> + vtd_hdev = g_hash_table_lookup(s->vtd_host_iommu_dev, &key);
>> +
>> + if (vtd_hdev) {
>> + error_setg(errp, "IOMMUFD device already exist");
>> + vtd_iommu_unlock(s);
>> + return false;
>> + }
>> +
>> + vtd_hdev = g_malloc0(sizeof(VTDHostIOMMUDevice));
>> + vtd_hdev->bus = bus;
>> + vtd_hdev->devfn = (uint8_t)devfn;
>> + vtd_hdev->iommu_state = s;
>> + vtd_hdev->dev = hiod;
>I am still not totally clear about why we couldn't reuse VTDAddressSpace
>instance for this bus/devid. Is it a matter of aliased versus non
>aliased bus/devfn, or a matter of pasid diff. An AddressSpace could back
>an assigned device in which case a HostIOMMUDevice could be added to
>this latter. I think this should be explained in the commit msg
Yes, as you said, it's a matter of aliased vs. non-aliased BDF.
VTDAddressSpace is per aliased BDF, while VTDHostIOMMUDevice is per non-aliased BDF.
Multiple assigned devices can sit under the same virtual IOMMU group and share the same
VTDAddressSpace, but each of them has its own VTDHostIOMMUDevice.
I will refine the commit message.
Thanks
Zhenzhong
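
Editorial sketch of the aliased vs. non-aliased distinction described above, under stated assumptions. It models two lookup tables with a tiny fixed-size array: one keyed by the aliased BDF (standing in for the per-address-space table) and one keyed by the device's own BDF (standing in for the per-host-device table). Every name here (sketch_bdf, sketch_table, sketch_assigned_dev, sketch_register) is hypothetical, and the linear-scan table is only a stand-in for the hash tables used in the patch.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX 8   /* arbitrary capacity for this illustration */

/* Bus number plus devfn; a simplified stand-in for a PCI BDF. */
struct sketch_bdf {
    uint8_t bus;
    uint8_t devfn;
};

/* A minimal set of BDF keys, scanned linearly. */
struct sketch_table {
    struct sketch_bdf keys[SKETCH_MAX];
    size_t count;
};

/* Return the index of key, adding it if absent; -1 if the table is full. */
int sketch_table_get_or_add(struct sketch_table *t, struct sketch_bdf key)
{
    for (size_t i = 0; i < t->count; i++) {
        if (t->keys[i].bus == key.bus && t->keys[i].devfn == key.devfn) {
            return (int)i;
        }
    }
    if (t->count == SKETCH_MAX) {
        return -1;
    }
    t->keys[t->count] = key;
    t->count++;
    return (int)(t->count - 1);
}

/* An assigned device: its own BDF and the BDF it is aliased to. */
struct sketch_assigned_dev {
    struct sketch_bdf real;
    struct sketch_bdf alias;
};

/*
 * Register one device: the address-space table is keyed by the alias,
 * the host-device table by the device's own (non-aliased) BDF.
 */
void sketch_register(struct sketch_table *as_by_alias,
                     struct sketch_table *hiod_by_bdf,
                     const struct sketch_assigned_dev *dev)
{
    sketch_table_get_or_add(as_by_alias, dev->alias);
    sketch_table_get_or_add(hiod_by_bdf, dev->real);
}
```

Registering two devices that share one alias leaves a single entry in the alias-keyed table and two entries in the table keyed by their own BDFs, which is the one-to-many relationship the reply describes.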
>
>Eric
>> +
>> + new_key = g_malloc(sizeof(*new_key));
>> + new_key->bus = bus;
>> + new_key->devfn = devfn;
>> +
>> + object_ref(hiod);
>> + g_hash_table_insert(s->vtd_host_iommu_dev, new_key, vtd_hdev);
>> +
>> + vtd_iommu_unlock(s);
>> +
>> + return true;
>> +}
>> +
>> +static void vtd_dev_unset_iommu_device(PCIBus *bus, void *opaque, int
>devfn)
>> +{
>> + IntelIOMMUState *s = opaque;
>> + VTDHostIOMMUDevice *vtd_hdev;
>> + struct vtd_as_key key = {
>> + .bus = bus,
>> + .devfn = devfn,
>> + };
>> +
>> + vtd_iommu_lock(s);
>> +
>> + vtd_hdev = g_hash_table_lookup(s->vtd_host_iommu_dev, &key);
>> + if (!vtd_hdev) {
>> + vtd_iommu_unlock(s);
>> + return;
>> + }
>> +
>> + g_hash_table_remove(s->vtd_host_iommu_dev, &key);
>> + object_unref(vtd_hdev->dev);
>> +
>> + vtd_iommu_unlock(s);
>> +}
>> +
>> /* Unmap the whole range in the notifier's scope. */
>> static void vtd_address_space_unmap(VTDAddressSpace *as,
>IOMMUNotifier *n)
>> {
>> @@ -4116,6 +4187,8 @@ static AddressSpace
>*vtd_host_dma_iommu(PCIBus *bus, void *opaque, int devfn)
>>
>> static PCIIOMMUOps vtd_iommu_ops = {
>> .get_address_space = vtd_host_dma_iommu,
>> + .set_iommu_device = vtd_dev_set_iommu_device,
>> + .unset_iommu_device = vtd_dev_unset_iommu_device,
>> };
>>
>> static bool vtd_decide_config(IntelIOMMUState *s, Error **errp)
>> @@ -4235,6 +4308,9 @@ static void vtd_realize(DeviceState *dev, Error
>**errp)
>> g_free, g_free);
>> s->vtd_address_spaces = g_hash_table_new_full(vtd_as_hash,
>vtd_as_equal,
>> g_free, g_free);
>> + s->vtd_host_iommu_dev = g_hash_table_new_full(vtd_as_hash,
>> + vtd_as_idev_equal,
>> + g_free, g_free);
>> vtd_init(s);
>> pci_setup_iommu(bus, &vtd_iommu_ops, dev);
>> /* Pseudo address space under root PCI bus. */
* Re: [PATCH v6 18/19] intel_iommu: Implement [set|unset]_iommu_device() callbacks
2024-06-04 5:46 ` Duan, Zhenzhong
@ 2024-06-04 8:17 ` Eric Auger
2024-06-06 4:04 ` Duan, Zhenzhong
0 siblings, 1 reply; 70+ messages in thread
From: Eric Auger @ 2024-06-04 8:17 UTC (permalink / raw)
To: Duan, Zhenzhong, qemu-devel@nongnu.org
Cc: alex.williamson@redhat.com, clg@redhat.com, mst@redhat.com,
peterx@redhat.com, jasowang@redhat.com, jgg@nvidia.com,
nicolinc@nvidia.com, joao.m.martins@oracle.com,
clement.mathieu--drif@eviden.com, Tian, Kevin, Liu, Yi L,
Peng, Chao P, Yi Sun, Paolo Bonzini, Richard Henderson,
Eduardo Habkost, Marcel Apfelbaum
On 6/4/24 07:46, Duan, Zhenzhong wrote:
>
>> -----Original Message-----
>> From: Eric Auger <eric.auger@redhat.com>
>> Subject: Re: [PATCH v6 18/19] intel_iommu: Implement
>> [set|unset]_iommu_device() callbacks
>>
>>
>>
>> On 6/3/24 08:10, Zhenzhong Duan wrote:
>>> From: Yi Liu <yi.l.liu@intel.com>
>>>
>>> Implement [set|unset]_iommu_device() callbacks in Intel vIOMMU.
>>> In set call, a new structure VTDHostIOMMUDevice which holds
>>> a reference to HostIOMMUDevice is stored in hash table
>>> indexed by PCI BDF.
>>>
>>> Signed-off-by: Yi Liu <yi.l.liu@intel.com>
>>> Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
>>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>>> ---
>>> hw/i386/intel_iommu_internal.h | 9 ++++
>>> include/hw/i386/intel_iommu.h | 2 +
>>> hw/i386/intel_iommu.c | 76
>> ++++++++++++++++++++++++++++++++++
>>> 3 files changed, 87 insertions(+)
>>>
>>> diff --git a/hw/i386/intel_iommu_internal.h
>> b/hw/i386/intel_iommu_internal.h
>>> index f8cf99bddf..b800d62ca0 100644
>>> --- a/hw/i386/intel_iommu_internal.h
>>> +++ b/hw/i386/intel_iommu_internal.h
>>> @@ -28,6 +28,7 @@
>>> #ifndef HW_I386_INTEL_IOMMU_INTERNAL_H
>>> #define HW_I386_INTEL_IOMMU_INTERNAL_H
>>> #include "hw/i386/intel_iommu.h"
>>> +#include "sysemu/host_iommu_device.h"
>>>
>>> /*
>>> * Intel IOMMU register specification
>>> @@ -537,4 +538,12 @@ typedef struct VTDRootEntry VTDRootEntry;
>>> #define VTD_SL_IGN_COM 0xbff0000000000000ULL
>>> #define VTD_SL_TM (1ULL << 62)
>>>
>>> +
>>> +typedef struct VTDHostIOMMUDevice {
>>> + IntelIOMMUState *iommu_state;
>>> + PCIBus *bus;
>>> + uint8_t devfn;
>>> + HostIOMMUDevice *dev;
>>> + QLIST_ENTRY(VTDHostIOMMUDevice) next;
>>> +} VTDHostIOMMUDevice;
>>> #endif
>>> diff --git a/include/hw/i386/intel_iommu.h
>> b/include/hw/i386/intel_iommu.h
>>> index 7d694b0813..2bbde41e45 100644
>>> --- a/include/hw/i386/intel_iommu.h
>>> +++ b/include/hw/i386/intel_iommu.h
>>> @@ -293,6 +293,8 @@ struct IntelIOMMUState {
>>> /* list of registered notifiers */
>>> QLIST_HEAD(, VTDAddressSpace) vtd_as_with_notifiers;
>>>
>>> + GHashTable *vtd_host_iommu_dev; /* VTDHostIOMMUDevice
>> */
>>> +
>>> /* interrupt remapping */
>>> bool intr_enabled; /* Whether guest enabled IR */
>>> dma_addr_t intr_root; /* Interrupt remapping table pointer */
>>> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
>>> index 519063c8f8..747c988bc4 100644
>>> --- a/hw/i386/intel_iommu.c
>>> +++ b/hw/i386/intel_iommu.c
>>> @@ -237,6 +237,13 @@ static gboolean vtd_as_equal(gconstpointer v1,
>> gconstpointer v2)
>>> (key1->pasid == key2->pasid);
>>> }
>>>
>>> +static gboolean vtd_as_idev_equal(gconstpointer v1, gconstpointer v2)
>>> +{
>>> + const struct vtd_as_key *key1 = v1;
>>> + const struct vtd_as_key *key2 = v2;
>>> +
>>> + return (key1->bus == key2->bus) && (key1->devfn == key2->devfn);
>>> +}
>>> /*
>>> * Note that we use pointer to PCIBus as the key, so hashing/shifting
>>> * based on the pointer value is intended. Note that we deal with
>>> @@ -3812,6 +3819,70 @@ VTDAddressSpace
>> *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus,
>>> return vtd_dev_as;
>>> }
>>>
>>> +static bool vtd_dev_set_iommu_device(PCIBus *bus, void *opaque, int
>> devfn,
>>> + HostIOMMUDevice *hiod, Error **errp)
>>> +{
>>> + IntelIOMMUState *s = opaque;
>>> + VTDHostIOMMUDevice *vtd_hdev;
>>> + struct vtd_as_key key = {
>>> + .bus = bus,
>>> + .devfn = devfn,
>>> + };
>>> + struct vtd_as_key *new_key;
>>> +
>>> + assert(hiod);
>>> +
>>> + vtd_iommu_lock(s);
>>> +
>>> + vtd_hdev = g_hash_table_lookup(s->vtd_host_iommu_dev, &key);
>>> +
>>> + if (vtd_hdev) {
>>> + error_setg(errp, "IOMMUFD device already exist");
>>> + vtd_iommu_unlock(s);
>>> + return false;
>>> + }
>>> +
>>> + vtd_hdev = g_malloc0(sizeof(VTDHostIOMMUDevice));
>>> + vtd_hdev->bus = bus;
>>> + vtd_hdev->devfn = (uint8_t)devfn;
>>> + vtd_hdev->iommu_state = s;
>>> + vtd_hdev->dev = hiod;
>> I am still not totally clear about why we couldn't reuse VTDAddressSpace
>> instance for this bus/devid. Is it a matter of aliased versus non
>> aliased bus/devfn, or a matter of pasid diff. An AddressSpace could back
>> an assigned device in which case a HostIOMMUDevice could be added to
>> this latter. I think this should be explained in the commit msg
> Yes, as you said, it's a matter of aliased vs non aliased BDF.
>
> VTDAddressSpace is per aliased BDF while VTDHostIOMMUDevice is per non aliased BDF.
> There can be multiple assigned devices under same virtual iommu group and share same
> VTDAddressSpace, but they have their own VTDHostIOMMUDevice.
>
> Will refine commit msg.
OK, thank you for the confirmation. A general concern is that this is the kind
of code we are going to duplicate in each vIOMMU. This is beyond the
scope of this series, but we should really think about introducing a
common base object for vIOMMU. Unfortunately, there are issues related to
multiple inheritance that may prevent us from using the usual QOM
inheritance, but just as we have done for backends and HostIOMMUDevice, we
may implement inheritance another way.
Eric
>
> Thanks
> Zhenzhong
>
>> Eric
>>> +
>>> + new_key = g_malloc(sizeof(*new_key));
>>> + new_key->bus = bus;
>>> + new_key->devfn = devfn;
>>> +
>>> + object_ref(hiod);
>>> + g_hash_table_insert(s->vtd_host_iommu_dev, new_key, vtd_hdev);
>>> +
>>> + vtd_iommu_unlock(s);
>>> +
>>> + return true;
>>> +}
>>> +
>>> +static void vtd_dev_unset_iommu_device(PCIBus *bus, void *opaque, int
>> devfn)
>>> +{
>>> + IntelIOMMUState *s = opaque;
>>> + VTDHostIOMMUDevice *vtd_hdev;
>>> + struct vtd_as_key key = {
>>> + .bus = bus,
>>> + .devfn = devfn,
>>> + };
>>> +
>>> + vtd_iommu_lock(s);
>>> +
>>> + vtd_hdev = g_hash_table_lookup(s->vtd_host_iommu_dev, &key);
>>> + if (!vtd_hdev) {
>>> + vtd_iommu_unlock(s);
>>> + return;
>>> + }
>>> +
>>> + g_hash_table_remove(s->vtd_host_iommu_dev, &key);
>>> + object_unref(vtd_hdev->dev);
>>> +
>>> + vtd_iommu_unlock(s);
>>> +}
>>> +
>>> /* Unmap the whole range in the notifier's scope. */
>>> static void vtd_address_space_unmap(VTDAddressSpace *as,
>> IOMMUNotifier *n)
>>> {
>>> @@ -4116,6 +4187,8 @@ static AddressSpace
>> *vtd_host_dma_iommu(PCIBus *bus, void *opaque, int devfn)
>>> static PCIIOMMUOps vtd_iommu_ops = {
>>> .get_address_space = vtd_host_dma_iommu,
>>> + .set_iommu_device = vtd_dev_set_iommu_device,
>>> + .unset_iommu_device = vtd_dev_unset_iommu_device,
>>> };
>>>
>>> static bool vtd_decide_config(IntelIOMMUState *s, Error **errp)
>>> @@ -4235,6 +4308,9 @@ static void vtd_realize(DeviceState *dev, Error
>> **errp)
>>> g_free, g_free);
>>> s->vtd_address_spaces = g_hash_table_new_full(vtd_as_hash,
>> vtd_as_equal,
>>> g_free, g_free);
>>> + s->vtd_host_iommu_dev = g_hash_table_new_full(vtd_as_hash,
>>> + vtd_as_idev_equal,
>>> + g_free, g_free);
>>> vtd_init(s);
>>> pci_setup_iommu(bus, &vtd_iommu_ops, dev);
>>> /* Pseudo address space under root PCI bus. */
* RE: [PATCH v6 18/19] intel_iommu: Implement [set|unset]_iommu_device() callbacks
2024-06-04 8:17 ` Eric Auger
@ 2024-06-06 4:04 ` Duan, Zhenzhong
0 siblings, 0 replies; 70+ messages in thread
From: Duan, Zhenzhong @ 2024-06-06 4:04 UTC (permalink / raw)
To: eric.auger@redhat.com, qemu-devel@nongnu.org
Cc: alex.williamson@redhat.com, clg@redhat.com, mst@redhat.com,
peterx@redhat.com, jasowang@redhat.com, jgg@nvidia.com,
nicolinc@nvidia.com, joao.m.martins@oracle.com,
clement.mathieu--drif@eviden.com, Tian, Kevin, Liu, Yi L,
Peng, Chao P, Yi Sun, Paolo Bonzini, Richard Henderson,
Eduardo Habkost, Marcel Apfelbaum
>-----Original Message-----
>From: Eric Auger <eric.auger@redhat.com>
>Subject: Re: [PATCH v6 18/19] intel_iommu: Implement
>[set|unset]_iommu_device() callbacks
>
>
>
>On 6/4/24 07:46, Duan, Zhenzhong wrote:
>>
>>> -----Original Message-----
>>> From: Eric Auger <eric.auger@redhat.com>
>>> Subject: Re: [PATCH v6 18/19] intel_iommu: Implement
>>> [set|unset]_iommu_device() callbacks
>>>
>>>
>>>
>>> On 6/3/24 08:10, Zhenzhong Duan wrote:
>>>> From: Yi Liu <yi.l.liu@intel.com>
>>>>
>>>> Implement [set|unset]_iommu_device() callbacks in Intel vIOMMU.
>>>> In set call, a new structure VTDHostIOMMUDevice which holds
>>>> a reference to HostIOMMUDevice is stored in hash table
>>>> indexed by PCI BDF.
>>>>
>>>> Signed-off-by: Yi Liu <yi.l.liu@intel.com>
>>>> Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
>>>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>>>> ---
>>>> hw/i386/intel_iommu_internal.h | 9 ++++
>>>> include/hw/i386/intel_iommu.h | 2 +
>>>> hw/i386/intel_iommu.c | 76
>>> ++++++++++++++++++++++++++++++++++
>>>> 3 files changed, 87 insertions(+)
>>>>
>>>> diff --git a/hw/i386/intel_iommu_internal.h
>>> b/hw/i386/intel_iommu_internal.h
>>>> index f8cf99bddf..b800d62ca0 100644
>>>> --- a/hw/i386/intel_iommu_internal.h
>>>> +++ b/hw/i386/intel_iommu_internal.h
>>>> @@ -28,6 +28,7 @@
>>>> #ifndef HW_I386_INTEL_IOMMU_INTERNAL_H
>>>> #define HW_I386_INTEL_IOMMU_INTERNAL_H
>>>> #include "hw/i386/intel_iommu.h"
>>>> +#include "sysemu/host_iommu_device.h"
>>>>
>>>> /*
>>>> * Intel IOMMU register specification
>>>> @@ -537,4 +538,12 @@ typedef struct VTDRootEntry VTDRootEntry;
>>>> #define VTD_SL_IGN_COM 0xbff0000000000000ULL
>>>> #define VTD_SL_TM (1ULL << 62)
>>>>
>>>> +
>>>> +typedef struct VTDHostIOMMUDevice {
>>>> + IntelIOMMUState *iommu_state;
>>>> + PCIBus *bus;
>>>> + uint8_t devfn;
>>>> + HostIOMMUDevice *dev;
>>>> + QLIST_ENTRY(VTDHostIOMMUDevice) next;
>>>> +} VTDHostIOMMUDevice;
>>>> #endif
>>>> diff --git a/include/hw/i386/intel_iommu.h
>>> b/include/hw/i386/intel_iommu.h
>>>> index 7d694b0813..2bbde41e45 100644
>>>> --- a/include/hw/i386/intel_iommu.h
>>>> +++ b/include/hw/i386/intel_iommu.h
>>>> @@ -293,6 +293,8 @@ struct IntelIOMMUState {
>>>> /* list of registered notifiers */
>>>> QLIST_HEAD(, VTDAddressSpace) vtd_as_with_notifiers;
>>>>
>>>> + GHashTable *vtd_host_iommu_dev; /*
>VTDHostIOMMUDevice
>>> */
>>>> +
>>>> /* interrupt remapping */
>>>> bool intr_enabled; /* Whether guest enabled IR */
>>>> dma_addr_t intr_root; /* Interrupt remapping table pointer */
>>>> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
>>>> index 519063c8f8..747c988bc4 100644
>>>> --- a/hw/i386/intel_iommu.c
>>>> +++ b/hw/i386/intel_iommu.c
>>>> @@ -237,6 +237,13 @@ static gboolean vtd_as_equal(gconstpointer
>v1,
>>> gconstpointer v2)
>>>> (key1->pasid == key2->pasid);
>>>> }
>>>>
>>>> +static gboolean vtd_as_idev_equal(gconstpointer v1, gconstpointer v2)
>>>> +{
>>>> + const struct vtd_as_key *key1 = v1;
>>>> + const struct vtd_as_key *key2 = v2;
>>>> +
>>>> + return (key1->bus == key2->bus) && (key1->devfn == key2->devfn);
>>>> +}
>>>> /*
>>>> * Note that we use pointer to PCIBus as the key, so hashing/shifting
>>>> * based on the pointer value is intended. Note that we deal with
>>>> @@ -3812,6 +3819,70 @@ VTDAddressSpace
>>> *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus,
>>>> return vtd_dev_as;
>>>> }
>>>>
>>>> +static bool vtd_dev_set_iommu_device(PCIBus *bus, void *opaque, int
>>> devfn,
>>>> + HostIOMMUDevice *hiod, Error **errp)
>>>> +{
>>>> + IntelIOMMUState *s = opaque;
>>>> + VTDHostIOMMUDevice *vtd_hdev;
>>>> + struct vtd_as_key key = {
>>>> + .bus = bus,
>>>> + .devfn = devfn,
>>>> + };
>>>> + struct vtd_as_key *new_key;
>>>> +
>>>> + assert(hiod);
>>>> +
>>>> + vtd_iommu_lock(s);
>>>> +
>>>> + vtd_hdev = g_hash_table_lookup(s->vtd_host_iommu_dev, &key);
>>>> +
>>>> + if (vtd_hdev) {
>>>> + error_setg(errp, "IOMMUFD device already exist");
>>>> + vtd_iommu_unlock(s);
>>>> + return false;
>>>> + }
>>>> +
>>>> + vtd_hdev = g_malloc0(sizeof(VTDHostIOMMUDevice));
>>>> + vtd_hdev->bus = bus;
>>>> + vtd_hdev->devfn = (uint8_t)devfn;
>>>> + vtd_hdev->iommu_state = s;
>>>> + vtd_hdev->dev = hiod;
>>> I am still not totally clear about why we couldn't reuse VTDAddressSpace
>>> instance for this bus/devid. Is it a matter of aliased versus non
>>> aliased bus/devfn, or a matter of pasid diff. An AddressSpace could back
>>> an assigned device in which case a HostIOMMUDevice could be added to
>>> this latter. I think this should be explained in the commit msg
>> Yes, as you said, it's a matter of aliased vs non aliased BDF.
>>
>> VTDAddressSpace is per aliased BDF while VTDHostIOMMUDevice is per
>non aliased BDF.
>> There can be multiple assigned devices under same virtual iommu group
>and share same
>> VTDAddressSpace, but they have their own VTDHostIOMMUDevice.
>>
>> Will refine commit msg.
>
>OK thank you for the confirmation. A general concern is this is the kind
>of code we are going to duplicate in each vIOMMU.
The hash table code can be common, but will we have other duplicate code?
I feel most of the code is VTD-specific.
> This is beyond the
>scope of this series but we shall really think about introducing a
>common base object for vIOMMU. Unfortunately there are issues related to
>multiple inheritance that may prevent us from using usual QOM
>inheritance but just as we have done for backends and HostIOMMUDevice
>we may implement inheritance another way.
Yes, virtio-iommu is different from the others; it inherits from TYPE_VIRTIO_DEVICE.
I don't quite follow what you mean by implementing inheritance another way; could you explain a bit?
Thanks
Zhenzhong
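
To illustrate the aliased vs non-aliased BDF point above, here is a
minimal standalone sketch of a per-device table keyed by (bus, devfn)
only. It is not the QEMU code: struct dev_key, struct host_dev_table and
the host_dev_*() helpers are hypothetical names, and a fixed-size array
stands in for the GHashTable used in the patch. The idea it shows is that
two devices with different devfn values each get their own entry, and
that registering the same (bus, devfn) twice is rejected.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical key: the bus pointer is used as an identity, plus devfn. */
struct dev_key {
    const void *bus;
    uint8_t devfn;
};

#define MAX_HOST_DEVS 8

/* Hypothetical table: one slot per non-aliased (bus, devfn). */
struct host_dev_table {
    struct dev_key keys[MAX_HOST_DEVS];
    const void *hiod[MAX_HOST_DEVS];
    size_t used;
};

/* Same comparison as vtd_as_idev_equal(): bus and devfn, no pasid. */
static bool dev_key_equal(const struct dev_key *a, const struct dev_key *b)
{
    return a->bus == b->bus && a->devfn == b->devfn;
}

static const void *host_dev_lookup(const struct host_dev_table *t,
                                   struct dev_key key)
{
    for (size_t i = 0; i < t->used; i++) {
        if (dev_key_equal(&t->keys[i], &key)) {
            return t->hiod[i];
        }
    }
    return NULL;
}

/* Returns false if the device is already registered or the table is full. */
static bool host_dev_set(struct host_dev_table *t, struct dev_key key,
                         const void *hiod)
{
    assert(hiod != NULL);
    if (host_dev_lookup(t, key) != NULL || t->used == MAX_HOST_DEVS) {
        return false;
    }
    t->keys[t->used] = key;
    t->hiod[t->used] = hiod;
    t->used++;
    return true;
}

/* Returns false if no entry exists for this (bus, devfn). */
static bool host_dev_unset(struct host_dev_table *t, struct dev_key key)
{
    for (size_t i = 0; i < t->used; i++) {
        if (dev_key_equal(&t->keys[i], &key)) {
            /* Fill the hole with the last entry; order does not matter. */
            t->used--;
            t->keys[i] = t->keys[t->used];
            t->hiod[i] = t->hiod[t->used];
            return true;
        }
    }
    return false;
}
```

In this sketch, two functions on the same bus (say devfn 0x08 and 0x09)
can each be registered even if they would share one address space through
an aliased BDF, which is the distinction described above.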
^ permalink raw reply [flat|nested] 70+ messages in thread
* [PATCH v6 19/19] intel_iommu: Check compatibility with host IOMMU capabilities
2024-06-03 6:10 [PATCH v6 00/19] Add a host IOMMU device abstraction to check with vIOMMU Zhenzhong Duan
` (17 preceding siblings ...)
2024-06-03 6:10 ` [PATCH v6 18/19] intel_iommu: Implement [set|unset]_iommu_device() callbacks Zhenzhong Duan
@ 2024-06-03 6:10 ` Zhenzhong Duan
2024-06-03 14:23 ` Eric Auger
2024-06-03 12:43 ` [PATCH v6 00/19] Add a host IOMMU device abstraction to check with vIOMMU Eric Auger
19 siblings, 1 reply; 70+ messages in thread
From: Zhenzhong Duan @ 2024-06-03 6:10 UTC (permalink / raw)
To: qemu-devel
Cc: alex.williamson, clg, eric.auger, mst, peterx, jasowang, jgg,
nicolinc, joao.m.martins, clement.mathieu--drif, kevin.tian,
yi.l.liu, chao.p.peng, Zhenzhong Duan, Marcel Apfelbaum,
Paolo Bonzini, Richard Henderson, Eduardo Habkost
If check fails, host device (either VFIO or VDPA device) is not
compatible with current vIOMMU config and should not be passed to
guest.
Only aw_bits is checked for now, we don't care other capabilities
before scalable modern mode is introduced.
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
hw/i386/intel_iommu.c | 29 +++++++++++++++++++++++++++++
1 file changed, 29 insertions(+)
diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 747c988bc4..d8202a77dd 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -3819,6 +3819,30 @@ VTDAddressSpace *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus,
return vtd_dev_as;
}
+static bool vtd_check_hdev(IntelIOMMUState *s, HostIOMMUDevice *hiod,
+ Error **errp)
+{
+ HostIOMMUDeviceClass *hiodc = HOST_IOMMU_DEVICE_GET_CLASS(hiod);
+ int ret;
+
+ if (!hiodc->get_cap) {
+ error_setg(errp, ".get_cap() not implemented");
+ return false;
+ }
+
+ /* Common checks */
+ ret = hiodc->get_cap(hiod, HOST_IOMMU_DEVICE_CAP_AW_BITS, errp);
+ if (ret < 0) {
+ return false;
+ }
+ if (s->aw_bits > ret) {
+ error_setg(errp, "aw-bits %d > host aw-bits %d", s->aw_bits, ret);
+ return false;
+ }
+
+ return true;
+}
+
static bool vtd_dev_set_iommu_device(PCIBus *bus, void *opaque, int devfn,
HostIOMMUDevice *hiod, Error **errp)
{
@@ -3842,6 +3866,11 @@ static bool vtd_dev_set_iommu_device(PCIBus *bus, void *opaque, int devfn,
return false;
}
+ if (!vtd_check_hdev(s, hiod, errp)) {
+ vtd_iommu_unlock(s);
+ return false;
+ }
+
vtd_hdev = g_malloc0(sizeof(VTDHostIOMMUDevice));
vtd_hdev->bus = bus;
vtd_hdev->devfn = (uint8_t)devfn;
--
2.34.1
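
For readers who want to try the check outside of QEMU, the logic of
vtd_check_hdev() can be approximated by the standalone sketch below. It
is an illustration under stated assumptions, not the patch itself:
struct host_dev, CAP_AW_BITS, host_dev_get_cap() and check_host_dev() are
hypothetical stand-ins, and a caller-provided buffer replaces Error **errp.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical capability id, standing in for HOST_IOMMU_DEVICE_CAP_AW_BITS. */
enum { CAP_AW_BITS = 1 };

/* Hypothetical host device with an optional capability query callback. */
struct host_dev {
    int (*get_cap)(const struct host_dev *hd, int cap);
    int aw_bits;
};

/* Returns the capability value, or a negative value if it is unknown. */
static int host_dev_get_cap(const struct host_dev *hd, int cap)
{
    if (cap == CAP_AW_BITS) {
        return hd->aw_bits;
    }
    return -1;
}

/*
 * Fails when the callback is missing, when the query fails, or when the
 * virtual IOMMU address width exceeds what the host reports.
 */
static bool check_host_dev(int viommu_aw_bits, const struct host_dev *hd,
                           char *err, size_t errlen)
{
    int ret;

    if (!hd->get_cap) {
        snprintf(err, errlen, ".get_cap() not implemented");
        return false;
    }

    ret = hd->get_cap(hd, CAP_AW_BITS);
    if (ret < 0) {
        snprintf(err, errlen, "failed to query host aw-bits");
        return false;
    }
    if (viommu_aw_bits > ret) {
        snprintf(err, errlen, "aw-bits %d > host aw-bits %d",
                 viommu_aw_bits, ret);
        return false;
    }

    return true;
}
```

With a host reporting 39 bits, a vIOMMU configured for 39 bits passes and
one configured for 48 bits is rejected, which is the case the patch is
meant to catch before the device is passed to the guest.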
^ permalink raw reply related [flat|nested] 70+ messages in thread
* Re: [PATCH v6 19/19] intel_iommu: Check compatibility with host IOMMU capabilities
2024-06-03 6:10 ` [PATCH v6 19/19] intel_iommu: Check compatibility with host IOMMU capabilities Zhenzhong Duan
@ 2024-06-03 14:23 ` Eric Auger
2024-06-04 5:46 ` Duan, Zhenzhong
0 siblings, 1 reply; 70+ messages in thread
From: Eric Auger @ 2024-06-03 14:23 UTC (permalink / raw)
To: Zhenzhong Duan, qemu-devel
Cc: alex.williamson, clg, mst, peterx, jasowang, jgg, nicolinc,
joao.m.martins, clement.mathieu--drif, kevin.tian, yi.l.liu,
chao.p.peng, Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
Eduardo Habkost
Hi Zhenzhong,
On 6/3/24 08:10, Zhenzhong Duan wrote:
> If check fails, host device (either VFIO or VDPA device) is not
> compatible with current vIOMMU config and should not be passed to
> guest.
>
> Only aw_bits is checked for now, we don't care other capabilities
we don't care about other caps
> before scalable modern mode is introduced.
>
> Signed-off-by: Yi Liu <yi.l.liu@intel.com>
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> ---
> hw/i386/intel_iommu.c | 29 +++++++++++++++++++++++++++++
> 1 file changed, 29 insertions(+)
>
> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> index 747c988bc4..d8202a77dd 100644
> --- a/hw/i386/intel_iommu.c
> +++ b/hw/i386/intel_iommu.c
> @@ -3819,6 +3819,30 @@ VTDAddressSpace *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus,
> return vtd_dev_as;
> }
>
> +static bool vtd_check_hdev(IntelIOMMUState *s, HostIOMMUDevice *hiod,
> + Error **errp)
> +{
> + HostIOMMUDeviceClass *hiodc = HOST_IOMMU_DEVICE_GET_CLASS(hiod);
> + int ret;
> +
> + if (!hiodc->get_cap) {
> + error_setg(errp, ".get_cap() not implemented");
> + return false;
> + }
> +
> + /* Common checks */
> + ret = hiodc->get_cap(hiod, HOST_IOMMU_DEVICE_CAP_AW_BITS, errp);
> + if (ret < 0) {
> + return false;
> + }
> + if (s->aw_bits > ret) {
> + error_setg(errp, "aw-bits %d > host aw-bits %d", s->aw_bits, ret);
> + return false;
> + }
> +
> + return true;
> +}
> +
> static bool vtd_dev_set_iommu_device(PCIBus *bus, void *opaque, int devfn,
> HostIOMMUDevice *hiod, Error **errp)
> {
> @@ -3842,6 +3866,11 @@ static bool vtd_dev_set_iommu_device(PCIBus *bus, void *opaque, int devfn,
> return false;
> }
>
> + if (!vtd_check_hdev(s, hiod, errp)) {
> + vtd_iommu_unlock(s);
> + return false;
> + }
> +
> vtd_hdev = g_malloc0(sizeof(VTDHostIOMMUDevice));
> vtd_hdev->bus = bus;
> vtd_hdev->devfn = (uint8_t)devfn;
Eric
^ permalink raw reply [flat|nested] 70+ messages in thread
* RE: [PATCH v6 19/19] intel_iommu: Check compatibility with host IOMMU capabilities
2024-06-03 14:23 ` Eric Auger
@ 2024-06-04 5:46 ` Duan, Zhenzhong
0 siblings, 0 replies; 70+ messages in thread
From: Duan, Zhenzhong @ 2024-06-04 5:46 UTC (permalink / raw)
To: eric.auger@redhat.com, qemu-devel@nongnu.org
Cc: alex.williamson@redhat.com, clg@redhat.com, mst@redhat.com,
peterx@redhat.com, jasowang@redhat.com, jgg@nvidia.com,
nicolinc@nvidia.com, joao.m.martins@oracle.com,
clement.mathieu--drif@eviden.com, Tian, Kevin, Liu, Yi L,
Peng, Chao P, Marcel Apfelbaum, Paolo Bonzini, Richard Henderson,
Eduardo Habkost
>-----Original Message-----
>From: Eric Auger <eric.auger@redhat.com>
>Subject: Re: [PATCH v6 19/19] intel_iommu: Check compatibility with host
>IOMMU capabilities
>
>Hi Zhenzhong,
>
>On 6/3/24 08:10, Zhenzhong Duan wrote:
>> If check fails, host device (either VFIO or VDPA device) is not
>> compatible with current vIOMMU config and should not be passed to
>> guest.
>>
>> Only aw_bits is checked for now, we don't care other capabilities
>we don't care about other caps
Will do.
Thanks
Zhenzhong
>> before scalable modern mode is introduced.
>>
>> Signed-off-by: Yi Liu <yi.l.liu@intel.com>
>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>> ---
>> hw/i386/intel_iommu.c | 29 +++++++++++++++++++++++++++++
>> 1 file changed, 29 insertions(+)
>>
>> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
>> index 747c988bc4..d8202a77dd 100644
>> --- a/hw/i386/intel_iommu.c
>> +++ b/hw/i386/intel_iommu.c
>> @@ -3819,6 +3819,30 @@ VTDAddressSpace
>*vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus,
>> return vtd_dev_as;
>> }
>>
>> +static bool vtd_check_hdev(IntelIOMMUState *s, HostIOMMUDevice
>*hiod,
>> + Error **errp)
>> +{
>> + HostIOMMUDeviceClass *hiodc =
>HOST_IOMMU_DEVICE_GET_CLASS(hiod);
>> + int ret;
>> +
>> + if (!hiodc->get_cap) {
>> + error_setg(errp, ".get_cap() not implemented");
>> + return false;
>> + }
>> +
>> + /* Common checks */
>> + ret = hiodc->get_cap(hiod, HOST_IOMMU_DEVICE_CAP_AW_BITS,
>errp);
>> + if (ret < 0) {
>> + return false;
>> + }
>> + if (s->aw_bits > ret) {
>> + error_setg(errp, "aw-bits %d > host aw-bits %d", s->aw_bits, ret);
>> + return false;
>> + }
>> +
>> + return true;
>> +}
>> +
>> static bool vtd_dev_set_iommu_device(PCIBus *bus, void *opaque, int
>devfn,
>> HostIOMMUDevice *hiod, Error **errp)
>> {
>> @@ -3842,6 +3866,11 @@ static bool
>vtd_dev_set_iommu_device(PCIBus *bus, void *opaque, int devfn,
>> return false;
>> }
>>
>> + if (!vtd_check_hdev(s, hiod, errp)) {
>> + vtd_iommu_unlock(s);
>> + return false;
>> + }
>> +
>> vtd_hdev = g_malloc0(sizeof(VTDHostIOMMUDevice));
>> vtd_hdev->bus = bus;
>> vtd_hdev->devfn = (uint8_t)devfn;
>Eric
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [PATCH v6 00/19] Add a host IOMMU device abstraction to check with vIOMMU
2024-06-03 6:10 [PATCH v6 00/19] Add a host IOMMU device abstraction to check with vIOMMU Zhenzhong Duan
` (18 preceding siblings ...)
2024-06-03 6:10 ` [PATCH v6 19/19] intel_iommu: Check compatibility with host IOMMU capabilities Zhenzhong Duan
@ 2024-06-03 12:43 ` Eric Auger
2024-06-04 3:32 ` Duan, Zhenzhong
19 siblings, 1 reply; 70+ messages in thread
From: Eric Auger @ 2024-06-03 12:43 UTC (permalink / raw)
To: Zhenzhong Duan, qemu-devel
Cc: alex.williamson, clg, mst, peterx, jasowang, jgg, nicolinc,
joao.m.martins, clement.mathieu--drif, kevin.tian, yi.l.liu,
chao.p.peng
Hi Zhenzhong,
On 6/3/24 08:10, Zhenzhong Duan wrote:
> Hi,
>
> This series introduce a HostIOMMUDevice abstraction and sub-classes.
> Also HostIOMMUDeviceCaps structure in HostIOMMUDevice and a new interface
> between vIOMMU and HostIOMMUDevice.
I think we should have a textual description of what a
HostIOMMUDevice is, because the terminology may be confusing: the
reader could understand it as an abstraction for the physical IOMMU itself.
Would it be correct to say:
A HostIOMMUDevice is an abstraction for an assigned device that is
protected by a physical IOMMU (aka host IOMMU). The userspace
interaction with this physical IOMMU can be done either through the VFIO
IOMMU type 1 legacy backend or the new iommufd backend. The assigned
device can be a VFIO device or a VDPA device. The HostIOMMUDevice is
needed to interact with the host IOMMU that protects the assigned
device. It is especially useful when the device is also protected by a
virtual IOMMU, as the latter uses the translation services of the
physical IOMMU and is constrained by it. In that context the
HostIOMMUDevice can be passed to the virtual IOMMU to collect physical
IOMMU capabilities such as the supported address width. In the future,
the virtual IOMMU will use the HostIOMMUDevice to program the guest page
tables in the first translation stage of the physical IOMMU.
If this description is correct, I would also suggest embedding it
in the patch 1 commit msg.
Thanks
Eric
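
As a rough picture of the description above, the sketch below models a
base object that caches one capability, with realize() filling it from a
backend-specific source and get_cap() reading it back. Everything here is
hypothetical and simplified: the struct and function names are not the
QEMU ones, plain function pointers stand in for QOM classes, and the way
each backend obtains the address width is an assumption made only for
illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical capability id. */
enum host_cap { HOST_CAP_AW_BITS };

struct host_iommu_dev;

/* Hypothetical "class": per-backend realize() and get_cap() handlers. */
struct host_iommu_dev_class {
    bool (*realize)(struct host_iommu_dev *hd, const void *opaque);
    int (*get_cap)(const struct host_iommu_dev *hd, enum host_cap cap);
};

/* Hypothetical base object shared by all backends. */
struct host_iommu_dev {
    const struct host_iommu_dev_class *klass;
    uint8_t aw_bits; /* cached capability, filled by realize() */
};

/* Assumed legacy-style backend: derive the width from the last usable IOVA. */
static bool legacy_realize(struct host_iommu_dev *hd, const void *opaque)
{
    uint64_t last_iova = *(const uint64_t *)opaque;
    uint8_t bits = 0;

    while (last_iova) {
        bits++;
        last_iova >>= 1;
    }
    hd->aw_bits = bits;
    return bits != 0;
}

/* Assumed second backend: the host reports the width directly. */
static bool reported_realize(struct host_iommu_dev *hd, const void *opaque)
{
    hd->aw_bits = *(const uint8_t *)opaque;
    return hd->aw_bits != 0;
}

/* Shared query: returns the cached value, or -1 for an unknown capability. */
static int common_get_cap(const struct host_iommu_dev *hd, enum host_cap cap)
{
    return cap == HOST_CAP_AW_BITS ? hd->aw_bits : -1;
}

static const struct host_iommu_dev_class legacy_class = {
    legacy_realize, common_get_cap,
};

static const struct host_iommu_dev_class reported_class = {
    reported_realize, common_get_cap,
};
```

The point of the split is that a virtual IOMMU only needs the get_cap()
side and does not have to know which backend produced the value.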
>
> HostIOMMUDeviceClass::realize() is introduced to initialize
> HostIOMMUDeviceCaps and other fields of HostIOMMUDevice variants.
>
> HostIOMMUDeviceClass::get_cap() is introduced to query host IOMMU
> device capabilities.
>
> The class tree is as below:
>
>                                  HostIOMMUDevice
>                                         | .caps
>                                         | .realize()
>                                         | .get_cap()
>                                         |
>             .-----------------------------------------------------.
>             |                           |                         |
> HostIOMMUDeviceLegacyVFIO  {HostIOMMUDeviceLegacyVDPA}  HostIOMMUDeviceIOMMUFD
>             |                           |                         | [.iommufd]
>                                                                   | [.devid]
>                                                                   | [.ioas_id]
>                                                                   | [.attach_hwpt()]
>                                                                   | [.detach_hwpt()]
>                                                                   |
>                                                       .----------------------.
>                                                       |                      |
>                                      HostIOMMUDeviceIOMMUFDVFIO  {HostIOMMUDeviceIOMMUFDVDPA}
>                                                       | [.vdev]              | {.vdev}
>
> * The attributes in [] will be implemented in nesting series.
> * The classes in {} will be implemented in future.
> * .vdev in different class points to different agent device,
> * i.e., for VFIO it points to VFIODevice.
>
> PATCH1-4: Introduce HostIOMMUDevice and its sub classes
> PATCH5-11: Introduce HostIOMMUDeviceCaps, implement .realize() and .get_cap() handler
> PATCH12-16: Create HostIOMMUDevice instance and pass to vIOMMU
> PATCH17-19: Implement compatibility check between host IOMMU and vIOMMU(intel_iommu)
>
> Test done:
> make check
> vfio device hotplug/unplug with different backend on linux
> reboot
> build test on linux and windows11
>
> Qemu code can be found at:
> https://github.com/yiliu1765/qemu/tree/zhenzhong/iommufd_nesting_preq_v6
>
> Besides the compatibility check in this series, in nesting series, this
> host IOMMU device is extended for much wider usage. For anyone interested
> on the nesting series, here is the link:
> https://github.com/yiliu1765/qemu/tree/zhenzhong/iommufd_nesting_rfcv2
>
> Thanks
> Zhenzhong
>
> Changelog:
> v6:
> - Open coded host_iommu_device_get_cap() to avoid #ifdef in intel_iommu.c (Cédric)
>
> v5:
> - pci_device_set_iommu_device return true (Cédric)
> - fix build failure on windows (thanks Cédric found that issue)
>
> v4:
> - move properties vdev, iommufd and devid to nesting series where need it (Cédric)
> - fix 32bit build with clz64 (Cédric)
> - change check_cap naming to get_cap (Cédric)
> - return bool if error is passed through errp (Cédric)
> - drop HostIOMMUDevice[LegacyVFIO|IOMMUFD|IOMMUFDVFIO] declaration (Cédric)
> - drop HOST_IOMMU_DEVICE_CAP_IOMMUFD (Cédric)
> - replace include directive with forward declaration (Cédric)
>
> v3:
> - refine declaration and doc for HostIOMMUDevice (Cédric, Philippe)
> - introduce HostIOMMUDeviceCaps, .realize() and .check_cap() (Cédric)
> - introduce helper range_get_last_bit() for range operation (Cédric)
> - separate pci_device_get_iommu_bus_devfn() in a prereq patch (Cédric)
> - replace HIOD_ abbreviation with HOST_IOMMU_DEVICE_ (Cédric)
> - add header in include/sysemu/iommufd.h (Cédric)
>
> v2:
> - use QOM to abstract host IOMMU device and its sub-classes (Cédric)
> - move host IOMMU device creation in attach_device() (Cédric)
> - refine pci_device_set/unset_iommu_device doc further (Eric)
> - define host IOMMU info format of different backend
> - implement get_host_iommu_info() for different backend (Cédric)
> - drop cap/ecap update logic (MST)
> - check aw-bits from get_host_iommu_info() in legacy mode
>
> v1:
> - use HostIOMMUDevice handle instead of union in VFIODevice (Eric)
> - change host_iommu_device_init to host_iommu_device_create
> - allocate HostIOMMUDevice in host_iommu_device_create callback
> and set the VFIODevice base_hdev handle (Eric)
> - refine pci_device_set/unset_iommu_device doc (Eric)
> - use HostIOMMUDevice handle instead of union in VTDHostIOMMUDevice (Eric)
> - convert HostIOMMUDevice to sub object pointer in vtd_check_hdev
>
> rfcv2:
> - introduce common abstract HostIOMMUDevice and sub struct for different BEs (Eric, Cédric)
> - remove iommufd_device.[ch] (Cédric)
> - remove duplicate iommufd/devid define from VFIODevice (Eric)
> - drop the p in aliased_pbus and aliased_pdevfn (Eric)
> - assert devfn and iommu_bus in pci_device_get_iommu_bus_devfn (Cédric, Eric)
> - use errp in iommufd_device_get_info (Eric)
> - split and simplify cap/ecap check/sync code in intel_iommu.c (Cédric)
> - move VTDHostIOMMUDevice declaration to intel_iommu_internal.h (Cédric)
> - make '(vtd->cap_reg >> 16) & 0x3fULL' a MACRO and add missed '+1' (Cédric)
> - block migration if vIOMMU cap/ecap updated based on host IOMMU cap/ecap
> - add R-B
>
> Yi Liu (2):
> hw/pci: Introduce pci_device_[set|unset]_iommu_device()
> intel_iommu: Implement [set|unset]_iommu_device() callbacks
>
> Zhenzhong Duan (17):
> backends: Introduce HostIOMMUDevice abstract
> vfio/container: Introduce TYPE_HOST_IOMMU_DEVICE_LEGACY_VFIO device
> backends/iommufd: Introduce abstract TYPE_HOST_IOMMU_DEVICE_IOMMUFD
> device
> vfio/iommufd: Introduce TYPE_HOST_IOMMU_DEVICE_IOMMUFD_VFIO device
> backends/host_iommu_device: Introduce HostIOMMUDeviceCaps
> range: Introduce range_get_last_bit()
> vfio/container: Implement HostIOMMUDeviceClass::realize() handler
> backends/iommufd: Introduce helper function
> iommufd_backend_get_device_info()
> vfio/iommufd: Implement HostIOMMUDeviceClass::realize() handler
> vfio/container: Implement HostIOMMUDeviceClass::get_cap() handler
> backends/iommufd: Implement HostIOMMUDeviceClass::get_cap() handler
> vfio: Introduce VFIOIOMMUClass::hiod_typename attribute
> vfio: Create host IOMMU device instance
> hw/pci: Introduce helper function pci_device_get_iommu_bus_devfn()
> vfio/pci: Pass HostIOMMUDevice to vIOMMU
> intel_iommu: Extract out vtd_cap_init() to initialize cap/ecap
> intel_iommu: Check compatibility with host IOMMU capabilities
>
> MAINTAINERS | 2 +
> hw/i386/intel_iommu_internal.h | 9 ++
> include/hw/i386/intel_iommu.h | 3 +
> include/hw/pci/pci.h | 38 ++++-
> include/hw/vfio/vfio-common.h | 7 +
> include/hw/vfio/vfio-container-base.h | 3 +
> include/qemu/range.h | 11 ++
> include/sysemu/host_iommu_device.h | 88 ++++++++++++
> include/sysemu/iommufd.h | 19 +++
> backends/host_iommu_device.c | 30 ++++
> backends/iommufd.c | 76 ++++++++--
> hw/i386/intel_iommu.c | 198 ++++++++++++++++++++------
> hw/pci/pci.c | 75 +++++++++-
> hw/vfio/common.c | 16 ++-
> hw/vfio/container.c | 48 ++++++-
> hw/vfio/iommufd.c | 44 +++++-
> hw/vfio/pci.c | 19 ++-
> backends/Kconfig | 5 +
> backends/meson.build | 1 +
> 19 files changed, 623 insertions(+), 69 deletions(-)
> create mode 100644 include/sysemu/host_iommu_device.h
> create mode 100644 backends/host_iommu_device.c
>
^ permalink raw reply [flat|nested] 70+ messages in thread
* RE: [PATCH v6 00/19] Add a host IOMMU device abstraction to check with vIOMMU
2024-06-03 12:43 ` [PATCH v6 00/19] Add a host IOMMU device abstraction to check with vIOMMU Eric Auger
@ 2024-06-04 3:32 ` Duan, Zhenzhong
0 siblings, 0 replies; 70+ messages in thread
From: Duan, Zhenzhong @ 2024-06-04 3:32 UTC (permalink / raw)
To: eric.auger@redhat.com, qemu-devel@nongnu.org
Cc: alex.williamson@redhat.com, clg@redhat.com, mst@redhat.com,
peterx@redhat.com, jasowang@redhat.com, jgg@nvidia.com,
nicolinc@nvidia.com, joao.m.martins@oracle.com,
clement.mathieu--drif@eviden.com, Tian, Kevin, Liu, Yi L,
Peng, Chao P
>-----Original Message-----
>From: Eric Auger <eric.auger@redhat.com>
>Subject: Re: [PATCH v6 00/19] Add a host IOMMU device abstraction to
>check with vIOMMU
>
>Hi Zhenzhong,
>
>On 6/3/24 08:10, Zhenzhong Duan wrote:
>> Hi,
>>
>> This series introduce a HostIOMMUDevice abstraction and sub-classes.
>> Also HostIOMMUDeviceCaps structure in HostIOMMUDevice and a new
>interface
>> between vIOMMU and HostIOMMUDevice.
>
>I think we should have a textual description of what is a
>HostIOMMUDevice. Because to me the terminology may be confusing as the
>reader can understand this is an abstraction for the physical IOMMU.
>
>Would it be correct to say:
>
>A HostIOMMUDevice is an abstraction for an assigned device that is
>protected by a physical IOMMU (aka host IOMMU). The userspace
>interaction with this physical IOMMU can be done either through the VFIO
>IOMMU type 1 legacy backend or the new iommufd backend. The assigned
>device can be a VFIO device or a VDPA device. The HostIOMMUDevice is
>needed to interact with the host IOMMU that protects the assigned
>device. It is especially useful when the device is also protected by a
>virtual IOMMU, as the latter uses the translation services of the
>physical IOMMU and is constrained by it. In that context the
>HostIOMMUDevice can be passed to the virtual IOMMU to collect physical
>IOMMU capabilities such as the supported address width. In the future,
>the virtual IOMMU will use the HostIOMMUDevice to program the guest
>page
>tables in the first translation stage of the physical IOMMU.
Great, thanks Eric.
>
>If such kind of description is correct, I would also suggest to embed it
>in the patch 1 commit msg.
Sure.
Thanks
Zhenzhong
>
>Thanks
>
>Eric
>
>
>>
>> HostIOMMUDeviceClass::realize() is introduced to initialize
>> HostIOMMUDeviceCaps and other fields of HostIOMMUDevice variants.
>>
>> HostIOMMUDeviceClass::get_cap() is introduced to query host IOMMU
>> device capabilities.
>>
>> The class tree is as below:
>>
>>                                  HostIOMMUDevice
>>                                         | .caps
>>                                         | .realize()
>>                                         | .get_cap()
>>                                         |
>>             .-----------------------------------------------------.
>>             |                           |                         |
>> HostIOMMUDeviceLegacyVFIO  {HostIOMMUDeviceLegacyVDPA}  HostIOMMUDeviceIOMMUFD
>>             |                           |                         | [.iommufd]
>>                                                                   | [.devid]
>>                                                                   | [.ioas_id]
>>                                                                   | [.attach_hwpt()]
>>                                                                   | [.detach_hwpt()]
>>                                                                   |
>>                                                       .----------------------.
>>                                                       |                      |
>>                                      HostIOMMUDeviceIOMMUFDVFIO  {HostIOMMUDeviceIOMMUFDVDPA}
>>                                                       | [.vdev]              | {.vdev}
>>
>> * The attributes in [] will be implemented in nesting series.
>> * The classes in {} will be implemented in future.
>> * .vdev in different class points to different agent device,
>> * i.e., for VFIO it points to VFIODevice.
>>
>> PATCH1-4: Introduce HostIOMMUDevice and its sub classes
>> PATCH5-11: Introduce HostIOMMUDeviceCaps, implement .realize()
>and .get_cap() handler
>> PATCH12-16: Create HostIOMMUDevice instance and pass to vIOMMU
>> PATCH17-19: Implement compatibility check between host IOMMU and
>vIOMMU(intel_iommu)
>>
>> Test done:
>> make check
>> vfio device hotplug/unplug with different backend on linux
>> reboot
>> build test on linux and windows11
>>
>> Qemu code can be found at:
>> https://github.com/yiliu1765/qemu/tree/zhenzhong/iommufd_nesting_preq_v6
>>
>> Besides the compatibility check in this series, in nesting series, this
>> host IOMMU device is extended for much wider usage. For anyone
>interested
>> on the nesting series, here is the link:
>> https://github.com/yiliu1765/qemu/tree/zhenzhong/iommufd_nesting_rfcv2
>>
>> Thanks
>> Zhenzhong
>>
>> Changelog:
>> v6:
>> - Open coded host_iommu_device_get_cap() to avoid #ifdef in
>intel_iommu.c (Cédric)
>>
>> v5:
>> - pci_device_set_iommu_device return true (Cédric)
>> - fix build failure on windows (thanks Cédric found that issue)
>>
>> v4:
>> - move properties vdev, iommufd and devid to nesting series where need it
>(Cédric)
>> - fix 32bit build with clz64 (Cédric)
>> - change check_cap naming to get_cap (Cédric)
>> - return bool if error is passed through errp (Cédric)
>> - drop HostIOMMUDevice[LegacyVFIO|IOMMUFD|IOMMUFDVFIO]
>declaration (Cédric)
>> - drop HOST_IOMMU_DEVICE_CAP_IOMMUFD (Cédric)
>> - replace include directive with forward declaration (Cédric)
>>
>> v3:
>> - refine declaration and doc for HostIOMMUDevice (Cédric, Philippe)
>> - introduce HostIOMMUDeviceCaps, .realize() and .check_cap() (Cédric)
>> - introduce helper range_get_last_bit() for range operation (Cédric)
>> - separate pci_device_get_iommu_bus_devfn() in a prereq patch (Cédric)
>> - replace HIOD_ abbreviation with HOST_IOMMU_DEVICE_ (Cédric)
>> - add header in include/sysemu/iommufd.h (Cédric)
>>
>> v2:
>> - use QOM to abstract host IOMMU device and its sub-classes (Cédric)
>> - move host IOMMU device creation in attach_device() (Cédric)
>> - refine pci_device_set/unset_iommu_device doc further (Eric)
>> - define host IOMMU info format of different backend
>> - implement get_host_iommu_info() for different backend (Cédric)
>> - drop cap/ecap update logic (MST)
>> - check aw-bits from get_host_iommu_info() in legacy mode
>>
>> v1:
>> - use HostIOMMUDevice handle instead of union in VFIODevice (Eric)
>> - change host_iommu_device_init to host_iommu_device_create
>> - allocate HostIOMMUDevice in host_iommu_device_create callback
>> and set the VFIODevice base_hdev handle (Eric)
>> - refine pci_device_set/unset_iommu_device doc (Eric)
>> - use HostIOMMUDevice handle instead of union in VTDHostIOMMUDevice
>(Eric)
>> - convert HostIOMMUDevice to sub object pointer in vtd_check_hdev
>>
>> rfcv2:
>> - introduce common abstract HostIOMMUDevice and sub struct for
>different BEs (Eric, Cédric)
>> - remove iommufd_device.[ch] (Cédric)
>> - remove duplicate iommufd/devid define from VFIODevice (Eric)
>> - drop the p in aliased_pbus and aliased_pdevfn (Eric)
>> - assert devfn and iommu_bus in pci_device_get_iommu_bus_devfn
>(Cédric, Eric)
>> - use errp in iommufd_device_get_info (Eric)
>> - split and simplify cap/ecap check/sync code in intel_iommu.c (Cédric)
>> - move VTDHostIOMMUDevice declaration to intel_iommu_internal.h
>(Cédric)
>> - make '(vtd->cap_reg >> 16) & 0x3fULL' a MACRO and add missed '+1'
>(Cédric)
>> - block migration if vIOMMU cap/ecap updated based on host IOMMU
>cap/ecap
>> - add R-B
>>
>> Yi Liu (2):
>> hw/pci: Introduce pci_device_[set|unset]_iommu_device()
>> intel_iommu: Implement [set|unset]_iommu_device() callbacks
>>
>> Zhenzhong Duan (17):
>> backends: Introduce HostIOMMUDevice abstract
>> vfio/container: Introduce TYPE_HOST_IOMMU_DEVICE_LEGACY_VFIO
>device
>> backends/iommufd: Introduce abstract
>TYPE_HOST_IOMMU_DEVICE_IOMMUFD
>> device
>> vfio/iommufd: Introduce TYPE_HOST_IOMMU_DEVICE_IOMMUFD_VFIO
>device
>> backends/host_iommu_device: Introduce HostIOMMUDeviceCaps
>> range: Introduce range_get_last_bit()
>> vfio/container: Implement HostIOMMUDeviceClass::realize() handler
>> backends/iommufd: Introduce helper function
>> iommufd_backend_get_device_info()
>> vfio/iommufd: Implement HostIOMMUDeviceClass::realize() handler
>> vfio/container: Implement HostIOMMUDeviceClass::get_cap() handler
>> backends/iommufd: Implement HostIOMMUDeviceClass::get_cap()
>handler
>> vfio: Introduce VFIOIOMMUClass::hiod_typename attribute
>> vfio: Create host IOMMU device instance
>> hw/pci: Introduce helper function pci_device_get_iommu_bus_devfn()
>> vfio/pci: Pass HostIOMMUDevice to vIOMMU
>> intel_iommu: Extract out vtd_cap_init() to initialize cap/ecap
>> intel_iommu: Check compatibility with host IOMMU capabilities
>>
>> MAINTAINERS | 2 +
>> hw/i386/intel_iommu_internal.h | 9 ++
>> include/hw/i386/intel_iommu.h | 3 +
>> include/hw/pci/pci.h | 38 ++++-
>> include/hw/vfio/vfio-common.h | 7 +
>> include/hw/vfio/vfio-container-base.h | 3 +
>> include/qemu/range.h | 11 ++
>> include/sysemu/host_iommu_device.h | 88 ++++++++++++
>> include/sysemu/iommufd.h | 19 +++
>> backends/host_iommu_device.c | 30 ++++
>> backends/iommufd.c | 76 ++++++++--
>> hw/i386/intel_iommu.c | 198 ++++++++++++++++++++------
>> hw/pci/pci.c | 75 +++++++++-
>> hw/vfio/common.c | 16 ++-
>> hw/vfio/container.c | 48 ++++++-
>> hw/vfio/iommufd.c | 44 +++++-
>> hw/vfio/pci.c | 19 ++-
>> backends/Kconfig | 5 +
>> backends/meson.build | 1 +
>> 19 files changed, 623 insertions(+), 69 deletions(-)
>> create mode 100644 include/sysemu/host_iommu_device.h
>> create mode 100644 backends/host_iommu_device.c
>>
^ permalink raw reply [flat|nested] 70+ messages in thread