* [PATCH RFC v3 0/4] vdpa: decouple reset of iotlb mapping from device reset
@ 2023-09-09 13:43 Si-Wei Liu
2023-09-09 13:43 ` [PATCH RFC v3 1/4] vdpa: introduce .reset_map operation callback Si-Wei Liu
` (3 more replies)
0 siblings, 4 replies; 13+ messages in thread
From: Si-Wei Liu @ 2023-09-09 13:43 UTC (permalink / raw)
To: eperezma, jasowang, mst, xuanzhuo, dtatulea; +Cc: virtualization
In order to avoid the needlessly high setup and teardown cost
of iotlb mappings during live migration, it is crucial to
decouple the vhost-vdpa iotlb abstraction from the virtio
device life cycle, i.e. iotlb mappings should be left
intact across virtio device reset [1]. For this to work, an
on-chip IOMMU parent device should implement a separate
.reset_map() operation callback that restores the 1:1 DMA
mapping without having to resort to the .reset() callback,
which is mainly used to reset virtio-specific device state.
This new .reset_map() callback is invoked only when the
vhost-vdpa driver is about to be removed and detached from
the vdpa bus, so that other vdpa bus drivers, e.g.
virtio-vdpa, can start with the 1:1 DMA mapping when they
are attached. For context: on-chip IOMMU parent devices
create the 1:1 DMA mapping at vdpa device add, and
implicitly destroy it when the first .set_map or .dma_map
callback is invoked.
[1] Reducing vdpa migration downtime because of memory pin / maps
https://www.mail-archive.com/qemu-devel@nongnu.org/msg953755.html
---
RFC v3:
- fix missing return due to merge error in patch #4
RFC v2:
- rebased on top of the "[PATCH RFC v2 0/3] vdpa: dedicated descriptor table group" series:
https://lore.kernel.org/virtualization/1694248959-13369-1-git-send-email-si-wei.liu@oracle.com/
---
Si-Wei Liu (4):
vdpa: introduce .reset_map operation callback
vdpa/mlx5: implement .reset_map driver op
vhost-vdpa: should restore 1:1 dma mapping before detaching driver
vhost-vdpa: introduce IOTLB_PERSIST backend feature bit
drivers/vdpa/mlx5/core/mlx5_vdpa.h | 1 +
drivers/vdpa/mlx5/core/mr.c | 70 +++++++++++++++++++++++---------------
drivers/vdpa/mlx5/net/mlx5_vnet.c | 18 +++++++---
drivers/vhost/vdpa.c | 32 ++++++++++++++++-
include/linux/vdpa.h | 7 ++++
include/uapi/linux/vhost_types.h | 2 ++
6 files changed, 96 insertions(+), 34 deletions(-)
--
1.8.3.1
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization
^ permalink raw reply [flat|nested] 13+ messages in thread

* [PATCH RFC v3 1/4] vdpa: introduce .reset_map operation callback
2023-09-09 13:43 [PATCH RFC v3 0/4] vdpa: decouple reset of iotlb mapping from device reset Si-Wei Liu
@ 2023-09-09 13:43 ` Si-Wei Liu
2023-09-09 13:43 ` [PATCH RFC v3 2/4] vdpa/mlx5: implement .reset_map driver op Si-Wei Liu
` (2 subsequent siblings)
3 siblings, 0 replies; 13+ messages in thread
From: Si-Wei Liu @ 2023-09-09 13:43 UTC (permalink / raw)
To: eperezma, jasowang, mst, xuanzhuo, dtatulea; +Cc: virtualization

An on-chip IOMMU parent driver can use this callback to restore
the memory mapping to its initial state.

Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
---
 include/linux/vdpa.h | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
index 17a4efa..daecf55 100644
--- a/include/linux/vdpa.h
+++ b/include/linux/vdpa.h
@@ -324,6 +324,12 @@ struct vdpa_map_file {
  *				@iova: iova to be unmapped
  *				@size: size of the area
  *				Returns integer: success (0) or error (< 0)
+ * @reset_map:			Reset device memory mapping (optional)
+ *				Needed for devices that use device-
+ *				specific DMA translation (on-chip IOMMU)
+ *				@vdev: vdpa device
+ *				@asid: address space identifier
+ *				Returns integer: success (0) or error (< 0)
  * @get_vq_dma_dev:		Get the dma device for a specific
  *				virtqueue (optional)
  *				@vdev: vdpa device
@@ -401,6 +407,7 @@ struct vdpa_config_ops {
 			u64 iova, u64 size, u64 pa, u32 perm, void *opaque);
 	int (*dma_unmap)(struct vdpa_device *vdev, unsigned int asid,
 			 u64 iova, u64 size);
+	int (*reset_map)(struct vdpa_device *vdev, unsigned int asid);
 	int (*set_group_asid)(struct vdpa_device *vdev, unsigned int group,
 			      unsigned int asid);
 	struct device *(*get_vq_dma_dev)(struct vdpa_device *vdev, u16 idx);
--
1.8.3.1

^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH RFC v3 2/4] vdpa/mlx5: implement .reset_map driver op
2023-09-09 13:43 [PATCH RFC v3 0/4] vdpa: decouple reset of iotlb mapping from device reset Si-Wei Liu
2023-09-09 13:43 ` [PATCH RFC v3 1/4] vdpa: introduce .reset_map operation callback Si-Wei Liu
@ 2023-09-09 13:43 ` Si-Wei Liu
2023-09-11 3:48 ` Jason Wang
2023-09-09 13:43 ` [PATCH RFC v3 3/4] vhost-vdpa: should restore 1:1 dma mapping before detaching driver Si-Wei Liu
2023-09-09 13:43 ` [PATCH RFC v3 4/4] vhost-vdpa: introduce IOTLB_PERSIST backend feature bit Si-Wei Liu
3 siblings, 1 reply; 13+ messages in thread
From: Si-Wei Liu @ 2023-09-09 13:43 UTC (permalink / raw)
To: eperezma, jasowang, mst, xuanzhuo, dtatulea; +Cc: virtualization

Today, mlx5_vdpa starts by preallocating a 1:1 DMA mapping at
device creation time; this 1:1 mapping is implicitly destroyed
when the first .set_map call is invoked. Every time the .reset
callback is invoked, any mapping left behind is dropped and
reset back to the initial 1:1 DMA mapping.

In order to reduce excessive memory mapping cost during live
migration, it is desirable to decouple the vhost-vdpa iotlb
abstraction from the virtio device life cycle, i.e. mappings
should be left intact across virtio device reset. Leverage the
.reset_map callback to reset memory mapping, so that the device
.reset routine can run free from having to clean up memory
mappings.

Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>

---
RFC v1 -> v2:
- fix error path when both CVQ and DVQ fall in same asid
---
 drivers/vdpa/mlx5/core/mlx5_vdpa.h |  1 +
 drivers/vdpa/mlx5/core/mr.c        | 70 +++++++++++++++++++++++---------------
 drivers/vdpa/mlx5/net/mlx5_vnet.c  | 18 +++++++---
 3 files changed, 56 insertions(+), 33 deletions(-)

diff --git a/drivers/vdpa/mlx5/core/mlx5_vdpa.h b/drivers/vdpa/mlx5/core/mlx5_vdpa.h
index b53420e..5c9a25a 100644
--- a/drivers/vdpa/mlx5/core/mlx5_vdpa.h
+++ b/drivers/vdpa/mlx5/core/mlx5_vdpa.h
@@ -123,6 +123,7 @@ int mlx5_vdpa_create_mr(struct mlx5_vdpa_dev *mvdev, struct vhost_iotlb *iotlb,
 			unsigned int asid);
 void mlx5_vdpa_destroy_mr(struct mlx5_vdpa_dev *mvdev);
 void mlx5_vdpa_destroy_mr_asid(struct mlx5_vdpa_dev *mvdev, unsigned int asid);
+int mlx5_vdpa_reset_mr(struct mlx5_vdpa_dev *mvdev, unsigned int asid);
 
 #define mlx5_vdpa_warn(__dev, format, ...)                                        \
 	dev_warn((__dev)->mdev->device, "%s:%d:(pid %d) warning: " format, __func__, __LINE__, \
diff --git a/drivers/vdpa/mlx5/core/mr.c b/drivers/vdpa/mlx5/core/mr.c
index 5a1971fc..ec2c7b4e1 100644
--- a/drivers/vdpa/mlx5/core/mr.c
+++ b/drivers/vdpa/mlx5/core/mr.c
@@ -489,21 +489,15 @@ static void destroy_user_mr(struct mlx5_vdpa_dev *mvdev, struct mlx5_vdpa_mr *mr
 	}
 }
 
-static void _mlx5_vdpa_destroy_cvq_mr(struct mlx5_vdpa_dev *mvdev, unsigned int asid)
+static void _mlx5_vdpa_destroy_cvq_mr(struct mlx5_vdpa_dev *mvdev)
 {
-	if (mvdev->group2asid[MLX5_VDPA_CVQ_GROUP] != asid)
-		return;
-
 	prune_iotlb(mvdev);
 }
 
-static void _mlx5_vdpa_destroy_dvq_mr(struct mlx5_vdpa_dev *mvdev, unsigned int asid)
+static void _mlx5_vdpa_destroy_dvq_mr(struct mlx5_vdpa_dev *mvdev)
 {
 	struct mlx5_vdpa_mr *mr = &mvdev->mr;
 
-	if (mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP] != asid)
-		return;
-
 	if (!mr->initialized)
 		return;
 
@@ -521,8 +515,10 @@ void mlx5_vdpa_destroy_mr_asid(struct mlx5_vdpa_dev *mvdev, unsigned int asid)
 
 	mutex_lock(&mr->mkey_mtx);
 
-	_mlx5_vdpa_destroy_dvq_mr(mvdev, asid);
-	_mlx5_vdpa_destroy_cvq_mr(mvdev, asid);
+	if (mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP] == asid)
+		_mlx5_vdpa_destroy_dvq_mr(mvdev);
+	if (mvdev->group2asid[MLX5_VDPA_CVQ_GROUP] == asid)
+		_mlx5_vdpa_destroy_cvq_mr(mvdev);
 
 	mutex_unlock(&mr->mkey_mtx);
 }
@@ -534,25 +530,17 @@ void mlx5_vdpa_destroy_mr(struct mlx5_vdpa_dev *mvdev)
 }
 
 static int _mlx5_vdpa_create_cvq_mr(struct mlx5_vdpa_dev *mvdev,
-				    struct vhost_iotlb *iotlb,
-				    unsigned int asid)
+				    struct vhost_iotlb *iotlb)
 {
-	if (mvdev->group2asid[MLX5_VDPA_CVQ_GROUP] != asid)
-		return 0;
-
 	return dup_iotlb(mvdev, iotlb);
 }
 
 static int _mlx5_vdpa_create_dvq_mr(struct mlx5_vdpa_dev *mvdev,
-				    struct vhost_iotlb *iotlb,
-				    unsigned int asid)
+				    struct vhost_iotlb *iotlb)
 {
 	struct mlx5_vdpa_mr *mr = &mvdev->mr;
 	int err;
 
-	if (mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP] != asid)
-		return 0;
-
 	if (mr->initialized)
 		return 0;
 
@@ -574,18 +562,22 @@ static int _mlx5_vdpa_create_mr(struct mlx5_vdpa_dev *mvdev,
 {
 	int err;
 
-	err = _mlx5_vdpa_create_dvq_mr(mvdev, iotlb, asid);
-	if (err)
-		return err;
-
-	err = _mlx5_vdpa_create_cvq_mr(mvdev, iotlb, asid);
-	if (err)
-		goto out_err;
+	if (mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP] == asid) {
+		err = _mlx5_vdpa_create_dvq_mr(mvdev, iotlb);
+		if (err)
+			return err;
+	}
+	if (mvdev->group2asid[MLX5_VDPA_CVQ_GROUP] == asid) {
+		err = _mlx5_vdpa_create_cvq_mr(mvdev, iotlb);
+		if (err)
+			goto out_err;
+	}
 
 	return 0;
 
 out_err:
-	_mlx5_vdpa_destroy_dvq_mr(mvdev, asid);
+	if (mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP] == asid)
+		_mlx5_vdpa_destroy_dvq_mr(mvdev);
 
 	return err;
 }
@@ -601,6 +593,28 @@ int mlx5_vdpa_create_mr(struct mlx5_vdpa_dev *mvdev, struct vhost_iotlb *iotlb,
 	return err;
 }
 
+int mlx5_vdpa_reset_mr(struct mlx5_vdpa_dev *mvdev, unsigned int asid)
+{
+	struct mlx5_vdpa_mr *mr = &mvdev->mr;
+	int err = 0;
+
+	if (asid != 0)
+		return 0;
+
+	mutex_lock(&mr->mkey_mtx);
+	if (!mr->user_mr)
+		goto out;
+	_mlx5_vdpa_destroy_dvq_mr(mvdev);
+	if (MLX5_CAP_GEN(mvdev->mdev, umem_uid_0)) {
+		err = _mlx5_vdpa_create_dvq_mr(mvdev, NULL);
+		if (err)
+			mlx5_vdpa_warn(mvdev, "create DMA MR failed\n");
+	}
+out:
+	mutex_unlock(&mr->mkey_mtx);
+	return err;
+}
+
 int mlx5_vdpa_handle_set_map(struct mlx5_vdpa_dev *mvdev, struct vhost_iotlb *iotlb,
 			     bool *change_map, unsigned int asid)
 {
diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
index 37be945..3cb5db6 100644
--- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
+++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
@@ -2824,7 +2824,6 @@ static int mlx5_vdpa_reset(struct vdpa_device *vdev)
 	unregister_link_notifier(ndev);
 	teardown_driver(ndev);
 	clear_vqs_ready(ndev);
-	mlx5_vdpa_destroy_mr(&ndev->mvdev);
 	ndev->mvdev.status = 0;
 	ndev->mvdev.suspended = false;
 	ndev->cur_num_vqs = 0;
@@ -2835,10 +2834,6 @@ static int mlx5_vdpa_reset(struct vdpa_device *vdev)
 	init_group_to_asid_map(mvdev);
 	++mvdev->generation;
 
-	if (MLX5_CAP_GEN(mvdev->mdev, umem_uid_0)) {
-		if (mlx5_vdpa_create_mr(mvdev, NULL, 0))
-			mlx5_vdpa_warn(mvdev, "create MR failed\n");
-	}
 	up_write(&ndev->reslock);
 
 	return 0;
@@ -2903,6 +2898,18 @@ static int mlx5_vdpa_set_map(struct vdpa_device *vdev, unsigned int asid,
 	return err;
 }
 
+static int mlx5_vdpa_reset_map(struct vdpa_device *vdev, unsigned int asid)
+{
+	struct mlx5_vdpa_dev *mvdev = to_mvdev(vdev);
+	struct mlx5_vdpa_net *ndev = to_mlx5_vdpa_ndev(mvdev);
+	int err;
+
+	down_write(&ndev->reslock);
+	err = mlx5_vdpa_reset_mr(mvdev, asid);
+	up_write(&ndev->reslock);
+	return err;
+}
+
 static struct device *mlx5_get_vq_dma_dev(struct vdpa_device *vdev, u16 idx)
 {
 	struct mlx5_vdpa_dev *mvdev = to_mvdev(vdev);
@@ -3162,6 +3169,7 @@ static int mlx5_set_group_asid(struct vdpa_device *vdev, u32 group,
 	.set_config = mlx5_vdpa_set_config,
 	.get_generation = mlx5_vdpa_get_generation,
 	.set_map = mlx5_vdpa_set_map,
+	.reset_map = mlx5_vdpa_reset_map,
 	.set_group_asid = mlx5_set_group_asid,
 	.get_vq_dma_dev = mlx5_get_vq_dma_dev,
 	.free = mlx5_vdpa_free,
--
1.8.3.1

^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH RFC v3 2/4] vdpa/mlx5: implement .reset_map driver op
2023-09-09 13:43 ` [PATCH RFC v3 2/4] vdpa/mlx5: implement .reset_map driver op Si-Wei Liu
@ 2023-09-11 3:48 ` Jason Wang
2023-09-12 0:01 ` Si-Wei Liu
0 siblings, 1 reply; 13+ messages in thread
From: Jason Wang @ 2023-09-11 3:48 UTC (permalink / raw)
To: Si-Wei Liu; +Cc: eperezma, virtualization, xuanzhuo, mst

On Sat, Sep 9, 2023 at 9:46 PM Si-Wei Liu <si-wei.liu@oracle.com> wrote:
>
> Today, mlx5_vdpa gets started by preallocate 1:1 DMA mapping at
> device creation time, while this 1:1 mapping will be implicitly
> destroyed when the first .set_map call is invoked. Everytime
> when the .reset callback is invoked, any mapping left behind will
> be dropped then reset back to the initial 1:1 DMA mapping.
>
> In order to reduce excessive memory mapping cost during live
> migration, it is desirable to decouple the vhost-vdpa iotlb
> abstraction from the virtio device life cycle, i.e. mappings
> should be left intact across virtio device reset. Leverage the
> .reset_map callback to reset memory mapping, then the device
> .reset routine can run free from having to clean up memory
> mappings.

It's not clear the direct relationship between the persistent mapping
and reset_map. Could we do it step by step? For example, remove the
mlx5_vdpa_destroy_mr() in mlx5_vdpa_reset() when PERSIST_IOTLB exists?
And then we can deal with the resetting and others on top, or it needs
some explanation for why reset_map() must be done first.

Thanks

[remainder of quoted patch trimmed]

^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH RFC v3 2/4] vdpa/mlx5: implement .reset_map driver op
2023-09-11 3:48 ` Jason Wang
@ 2023-09-12 0:01 ` Si-Wei Liu
2023-09-12 6:53 ` Jason Wang
0 siblings, 1 reply; 13+ messages in thread
From: Si-Wei Liu @ 2023-09-12 0:01 UTC (permalink / raw)
To: Jason Wang; +Cc: eperezma, virtualization, xuanzhuo, mst

On 9/10/2023 8:48 PM, Jason Wang wrote:
> On Sat, Sep 9, 2023 at 9:46 PM Si-Wei Liu <si-wei.liu@oracle.com> wrote:
>> Today, mlx5_vdpa gets started by preallocate 1:1 DMA mapping at
>> device creation time, while this 1:1 mapping will be implicitly
>> destroyed when the first .set_map call is invoked. Everytime
>> when the .reset callback is invoked, any mapping left behind will
>> be dropped then reset back to the initial 1:1 DMA mapping.
>>
>> In order to reduce excessive memory mapping cost during live
>> migration, it is desirable to decouple the vhost-vdpa iotlb
>> abstraction from the virtio device life cycle, i.e. mappings
>> should be left intact across virtio device reset. Leverage the
>> .reset_map callback to reset memory mapping, then the device
>> .reset routine can run free from having to clean up memory
>> mappings.
> It's not clear the direct relationship between the persistent mapping
> and reset_map.
Consider .reset_map as a simplified abstraction for on-chip IOMMU model,
decoupling memory mapping mode switching from the current vdpa_reset
hack. Slightly different than platform IOMMU iommu_domain_alloc/free,
but works the best with existing .dma_map/.set_map APIs. As said in the
other email, the distinction cannot be hidden, as there are bus drivers
with varied mapping needs. On the other hand, I can live with the
iommu_domain_alloc/free flavor strictly following the platform IOMMU
model, but not sure if worth the complexity.
> Could we do it step by step? For example, remove the
> mlx5_vdpa_destroy_mr() in mlx5_vdpa_reset() when PERSIST_IOTLB exists?
I think today there's no way for the parent driver to negotiate backend
features with userspace, for e.g. parent won't be able to perform
mlx5_vdpa_destroy_mr for the virtio-vdpa case when PERSIST_IOTLB doesn't
exist. And this backend features stuff is a vhost specific thing, not
specifically tied to vdpa itself. How do we get it extended and
propagated up to the vdpa bus layer?
> And then we can deal with the resetting and others on top,
For this proposed fix, dealing with vdpa_reset from vhost-vdpa is not
specifically an issue, but how to get the mapping reverted back to 1:1
identity/passthrough when users want to switch from vhost-vdpa to
virtio-vdpa is.
> or it needs
> some explanation for why reset_map() must be done first.
Yep, I can add more to the commit log.

Thanks,
-Siwei

>
> Thanks
>
[remainder of quoted patch trimmed]

^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH RFC v3 2/4] vdpa/mlx5: implement .reset_map driver op
2023-09-12 0:01 ` Si-Wei Liu
@ 2023-09-12 6:53 ` Jason Wang
2023-09-15 5:45 ` Si-Wei Liu
0 siblings, 1 reply; 13+ messages in thread
From: Jason Wang @ 2023-09-12 6:53 UTC (permalink / raw)
To: Si-Wei Liu; +Cc: eperezma, virtualization, xuanzhuo, mst

On Tue, Sep 12, 2023 at 8:02 AM Si-Wei Liu <si-wei.liu@oracle.com> wrote:
>
> On 9/10/2023 8:48 PM, Jason Wang wrote:
> > On Sat, Sep 9, 2023 at 9:46 PM Si-Wei Liu <si-wei.liu@oracle.com> wrote:
> >> Today, mlx5_vdpa gets started by preallocate 1:1 DMA mapping at
> >> device creation time, while this 1:1 mapping will be implicitly
> >> destroyed when the first .set_map call is invoked. Everytime
> >> when the .reset callback is invoked, any mapping left behind will
> >> be dropped then reset back to the initial 1:1 DMA mapping.
> >>
> >> In order to reduce excessive memory mapping cost during live
> >> migration, it is desirable to decouple the vhost-vdpa iotlb
> >> abstraction from the virtio device life cycle, i.e. mappings
> >> should be left intact across virtio device reset. Leverage the
> >> .reset_map callback to reset memory mapping, then the device
> >> .reset routine can run free from having to clean up memory
> >> mappings.
> > It's not clear the direct relationship between the persistent mapping
> > and reset_map.
> Consider .reset_map as a simplified abstraction for on-chip IOMMU model,
> decoupling memory mapping mode switching from the current vdpa_reset
> hack. Slightly different than platform IOMMU iommu_domain_alloc/free,
> but works the best with existing .dma_map/.set_map APIs.

Note that iommu_domain_alloc/free doesn't imply any mappings (even the
identity mapping).

> As said in the
> other email, the distinction cannot be hidden, as there are bus drivers
> with varied mapping needs. On the other hand, I can live with the
> iommu_domain_alloc/free flavor strictly following the platform IOMMU
> model, but not sure if worth the complexity.

I'm not sure I get this, maybe you can post some RFC or pseudo code?

> > Could we do it step by step? For example, remove the
> > mlx5_vdpa_destroy_mr() in mlx5_vdpa_reset() when PERSIST_IOTLB exists?
> I think today there's no way for the parent driver to negotiate backend
> features with userspace, for e.g. parent won't be able to perform
> mlx5_vdpa_destroy_mr for the virtio-vdpa case when PERSIST_IOTLB doesn't
> exist. And this backend features stuff is a vhost specific thing, not
> specifically tied to vdpa itself. How do we get it extended and
> propagated up to the vdpa bus layer?

Just to make sure we are on the same page, I just want to know what
happens if we simply remove mlx5_vdpa_destroy_mr() in mlx5_vdpa_reset()?

> > And then we can deal with the resetting and others on top,
> For this proposed fix, dealing with vdpa_reset from vhost-vdpa is not
> specifically an issue, but how to get the mapping reverted back to 1:1
> identity/passthrough when users want to switch from vhost-vdpa to
> virtio-vdpa is.
>
> > or it needs
> > some explanation for why reset_map() must be done first.
> Yep, I can add more to the commit log.

Thanks

[remainder of quoted patch trimmed]
mlx5_vdpa_dev *mvdev, > >> - struct vhost_iotlb *iotlb, > >> - unsigned int asid) > >> + struct vhost_iotlb *iotlb) > >> { > >> struct mlx5_vdpa_mr *mr = &mvdev->mr; > >> int err; > >> > >> - if (mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP] != asid) > >> - return 0; > >> - > >> if (mr->initialized) > >> return 0; > >> > >> @@ -574,18 +562,22 @@ static int _mlx5_vdpa_create_mr(struct mlx5_vdpa_dev *mvdev, > >> { > >> int err; > >> > >> - err = _mlx5_vdpa_create_dvq_mr(mvdev, iotlb, asid); > >> - if (err) > >> - return err; > >> - > >> - err = _mlx5_vdpa_create_cvq_mr(mvdev, iotlb, asid); > >> - if (err) > >> - goto out_err; > >> + if (mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP] == asid) { > >> + err = _mlx5_vdpa_create_dvq_mr(mvdev, iotlb); > >> + if (err) > >> + return err; > >> + } > >> + if (mvdev->group2asid[MLX5_VDPA_CVQ_GROUP] == asid) { > >> + err = _mlx5_vdpa_create_cvq_mr(mvdev, iotlb); > >> + if (err) > >> + goto out_err; > >> + } > >> > >> return 0; > >> > >> out_err: > >> - _mlx5_vdpa_destroy_dvq_mr(mvdev, asid); > >> + if (mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP] == asid) > >> + _mlx5_vdpa_destroy_dvq_mr(mvdev); > >> > >> return err; > >> } > >> @@ -601,6 +593,28 @@ int mlx5_vdpa_create_mr(struct mlx5_vdpa_dev *mvdev, struct vhost_iotlb *iotlb, > >> return err; > >> } > >> > >> +int mlx5_vdpa_reset_mr(struct mlx5_vdpa_dev *mvdev, unsigned int asid) > >> +{ > >> + struct mlx5_vdpa_mr *mr = &mvdev->mr; > >> + int err = 0; > >> + > >> + if (asid != 0) > >> + return 0; > >> + > >> + mutex_lock(&mr->mkey_mtx); > >> + if (!mr->user_mr) > >> + goto out; > >> + _mlx5_vdpa_destroy_dvq_mr(mvdev); > >> + if (MLX5_CAP_GEN(mvdev->mdev, umem_uid_0)) { > >> + err = _mlx5_vdpa_create_dvq_mr(mvdev, NULL); > >> + if (err) > >> + mlx5_vdpa_warn(mvdev, "create DMA MR failed\n"); > >> + } > >> +out: > >> + mutex_unlock(&mr->mkey_mtx); > >> + return err; > >> +} > >> + > >> int mlx5_vdpa_handle_set_map(struct mlx5_vdpa_dev *mvdev, struct vhost_iotlb *iotlb, > >> bool 
*change_map, unsigned int asid) > >> { > >> diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c > >> index 37be945..3cb5db6 100644 > >> --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c > >> +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c > >> @@ -2824,7 +2824,6 @@ static int mlx5_vdpa_reset(struct vdpa_device *vdev) > >> unregister_link_notifier(ndev); > >> teardown_driver(ndev); > >> clear_vqs_ready(ndev); > >> - mlx5_vdpa_destroy_mr(&ndev->mvdev); > >> ndev->mvdev.status = 0; > >> ndev->mvdev.suspended = false; > >> ndev->cur_num_vqs = 0; > >> @@ -2835,10 +2834,6 @@ static int mlx5_vdpa_reset(struct vdpa_device *vdev) > >> init_group_to_asid_map(mvdev); > >> ++mvdev->generation; > >> > >> - if (MLX5_CAP_GEN(mvdev->mdev, umem_uid_0)) { > >> - if (mlx5_vdpa_create_mr(mvdev, NULL, 0)) > >> - mlx5_vdpa_warn(mvdev, "create MR failed\n"); > >> - } > >> up_write(&ndev->reslock); > >> > >> return 0; > >> @@ -2903,6 +2898,18 @@ static int mlx5_vdpa_set_map(struct vdpa_device *vdev, unsigned int asid, > >> return err; > >> } > >> > >> +static int mlx5_vdpa_reset_map(struct vdpa_device *vdev, unsigned int asid) > >> +{ > >> + struct mlx5_vdpa_dev *mvdev = to_mvdev(vdev); > >> + struct mlx5_vdpa_net *ndev = to_mlx5_vdpa_ndev(mvdev); > >> + int err; > >> + > >> + down_write(&ndev->reslock); > >> + err = mlx5_vdpa_reset_mr(mvdev, asid); > >> + up_write(&ndev->reslock); > >> + return err; > >> +} > >> + > >> static struct device *mlx5_get_vq_dma_dev(struct vdpa_device *vdev, u16 idx) > >> { > >> struct mlx5_vdpa_dev *mvdev = to_mvdev(vdev); > >> @@ -3162,6 +3169,7 @@ static int mlx5_set_group_asid(struct vdpa_device *vdev, u32 group, > >> .set_config = mlx5_vdpa_set_config, > >> .get_generation = mlx5_vdpa_get_generation, > >> .set_map = mlx5_vdpa_set_map, > >> + .reset_map = mlx5_vdpa_reset_map, > >> .set_group_asid = mlx5_set_group_asid, > >> .get_vq_dma_dev = mlx5_get_vq_dma_dev, > >> .free = mlx5_vdpa_free, > >> -- > >> 1.8.3.1 > >> > 
_______________________________________________ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH RFC v3 2/4] vdpa/mlx5: implement .reset_map driver op 2023-09-12 6:53 ` Jason Wang @ 2023-09-15 5:45 ` Si-Wei Liu 0 siblings, 0 replies; 13+ messages in thread From: Si-Wei Liu @ 2023-09-15 5:45 UTC (permalink / raw) To: Jason Wang; +Cc: eperezma, virtualization, xuanzhuo, mst On 9/11/2023 11:53 PM, Jason Wang wrote: > On Tue, Sep 12, 2023 at 8:02 AM Si-Wei Liu <si-wei.liu@oracle.com> wrote: >> >> >> On 9/10/2023 8:48 PM, Jason Wang wrote: >>> On Sat, Sep 9, 2023 at 9:46 PM Si-Wei Liu <si-wei.liu@oracle.com> wrote: >>>> Today, mlx5_vdpa gets started by preallocate 1:1 DMA mapping at >>>> device creation time, while this 1:1 mapping will be implicitly >>>> destroyed when the first .set_map call is invoked. Everytime >>>> when the .reset callback is invoked, any mapping left behind will >>>> be dropped then reset back to the initial 1:1 DMA mapping. >>>> >>>> In order to reduce excessive memory mapping cost during live >>>> migration, it is desirable to decouple the vhost-vdpa iotlb >>>> abstraction from the virtio device life cycle, i.e. mappings >>>> should be left intact across virtio device reset. Leverage the >>>> .reset_map callback to reset memory mapping, then the device >>>> .reset routine can run free from having to clean up memory >>>> mappings. >>> It's not clear the direct relationship between the persistent mapping >>> and reset_map. >> Consider .reset_map as a simplified abstraction for on-chip IOMMU model, >> decoupling memory mapping mode switching from the current vdpa_reset >> hack. Slightly different than platform IOMMU iommu_domain_alloc/free, >> but works the best with existing .dma_map/.set_map APIs. > Note that iommu_domain_alloc/free doesn't imply any mappings (even the > identity mapping). Forget about this part, it just exposes the multi-dimension aspect of iommu domain unnecessarily, and I think we both don't like to. 
Although this was intended to make virtio-vdpa work seamlessly when it is used over mlx5-vdpa, similar to the DMA device deviation introduced to the vDPA driver API. Thanks, -Siwei > >> As said in the >> other email, the distinction cannot be hidden, as there are bus drivers >> with varied mapping needs. On the other hand, I can live with the >> iommu_domain_alloc/free flavor strictly following the platform IOMMU >> model, but not sure if worth the complexity. > I'm not sure I get this, maybe you can post some RFC or pseudo code? > >>> Could we do it step by step? For example, remove the >>> mlx5_vdpa_destroy_mr() in mlx5_vdpa_reset() when PERSIST_IOTLB exists? >> I think today there's no way for the parent driver to negotiate backend >> features with userspace, for e.g. parent won't be able to perform >> mlx5_vdpa_destroy_mr for the virtio-vdpa case when PERSIST_IOTLB doesn't >> exist. And this backend features stuff is a vhost specific thing, not >> specifically tied to vdpa itself. How do we get it extended and >> propagated up to the vdpa bus layer? > Just to make sure we are on the same page, I just want to know what > happens if we simply remove mlx5_vdpa_destroy_mr() in > mlx5_vdpa_reset()? > >>> And then we can deal with the resetting and others on top, >> For this proposed fix, dealing with vdpa_reset from vhost-vdpa is not >> specifically an issue, but how to get the mapping reverted back to 1:1 >> identity/passthrough when users want to switch from vhost-vdpa to >> virtio-vdpa is. >> >>> or it needs >>> some explanation for why reset_map() must be done first. >> Yep, I can add more to the commit log. 
> Thanks > >> Thanks, >> -Siwei >> >>> Thanks >>> >>>> Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com> >>>> >>>> --- >>>> RFC v1 -> v2: >>>> - fix error path when both CVQ and DVQ fall in same asid >>>> --- >>>> drivers/vdpa/mlx5/core/mlx5_vdpa.h | 1 + >>>> drivers/vdpa/mlx5/core/mr.c | 70 +++++++++++++++++++++++--------------- >>>> drivers/vdpa/mlx5/net/mlx5_vnet.c | 18 +++++++--- >>>> 3 files changed, 56 insertions(+), 33 deletions(-) >>>> >>>> diff --git a/drivers/vdpa/mlx5/core/mlx5_vdpa.h b/drivers/vdpa/mlx5/core/mlx5_vdpa.h >>>> index b53420e..5c9a25a 100644 >>>> --- a/drivers/vdpa/mlx5/core/mlx5_vdpa.h >>>> +++ b/drivers/vdpa/mlx5/core/mlx5_vdpa.h >>>> @@ -123,6 +123,7 @@ int mlx5_vdpa_create_mr(struct mlx5_vdpa_dev *mvdev, struct vhost_iotlb *iotlb, >>>> unsigned int asid); >>>> void mlx5_vdpa_destroy_mr(struct mlx5_vdpa_dev *mvdev); >>>> void mlx5_vdpa_destroy_mr_asid(struct mlx5_vdpa_dev *mvdev, unsigned int asid); >>>> +int mlx5_vdpa_reset_mr(struct mlx5_vdpa_dev *mvdev, unsigned int asid); >>>> >>>> #define mlx5_vdpa_warn(__dev, format, ...) 
\ >>>> dev_warn((__dev)->mdev->device, "%s:%d:(pid %d) warning: " format, __func__, __LINE__, \ >>>> diff --git a/drivers/vdpa/mlx5/core/mr.c b/drivers/vdpa/mlx5/core/mr.c >>>> index 5a1971fc..ec2c7b4e1 100644 >>>> --- a/drivers/vdpa/mlx5/core/mr.c >>>> +++ b/drivers/vdpa/mlx5/core/mr.c >>>> @@ -489,21 +489,15 @@ static void destroy_user_mr(struct mlx5_vdpa_dev *mvdev, struct mlx5_vdpa_mr *mr >>>> } >>>> } >>>> >>>> -static void _mlx5_vdpa_destroy_cvq_mr(struct mlx5_vdpa_dev *mvdev, unsigned int asid) >>>> +static void _mlx5_vdpa_destroy_cvq_mr(struct mlx5_vdpa_dev *mvdev) >>>> { >>>> - if (mvdev->group2asid[MLX5_VDPA_CVQ_GROUP] != asid) >>>> - return; >>>> - >>>> prune_iotlb(mvdev); >>>> } >>>> >>>> -static void _mlx5_vdpa_destroy_dvq_mr(struct mlx5_vdpa_dev *mvdev, unsigned int asid) >>>> +static void _mlx5_vdpa_destroy_dvq_mr(struct mlx5_vdpa_dev *mvdev) >>>> { >>>> struct mlx5_vdpa_mr *mr = &mvdev->mr; >>>> >>>> - if (mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP] != asid) >>>> - return; >>>> - >>>> if (!mr->initialized) >>>> return; >>>> >>>> @@ -521,8 +515,10 @@ void mlx5_vdpa_destroy_mr_asid(struct mlx5_vdpa_dev *mvdev, unsigned int asid) >>>> >>>> mutex_lock(&mr->mkey_mtx); >>>> >>>> - _mlx5_vdpa_destroy_dvq_mr(mvdev, asid); >>>> - _mlx5_vdpa_destroy_cvq_mr(mvdev, asid); >>>> + if (mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP] == asid) >>>> + _mlx5_vdpa_destroy_dvq_mr(mvdev); >>>> + if (mvdev->group2asid[MLX5_VDPA_CVQ_GROUP] == asid) >>>> + _mlx5_vdpa_destroy_cvq_mr(mvdev); >>>> >>>> mutex_unlock(&mr->mkey_mtx); >>>> } >>>> @@ -534,25 +530,17 @@ void mlx5_vdpa_destroy_mr(struct mlx5_vdpa_dev *mvdev) >>>> } >>>> >>>> static int _mlx5_vdpa_create_cvq_mr(struct mlx5_vdpa_dev *mvdev, >>>> - struct vhost_iotlb *iotlb, >>>> - unsigned int asid) >>>> + struct vhost_iotlb *iotlb) >>>> { >>>> - if (mvdev->group2asid[MLX5_VDPA_CVQ_GROUP] != asid) >>>> - return 0; >>>> - >>>> return dup_iotlb(mvdev, iotlb); >>>> } >>>> >>>> static int _mlx5_vdpa_create_dvq_mr(struct 
mlx5_vdpa_dev *mvdev, >>>> - struct vhost_iotlb *iotlb, >>>> - unsigned int asid) >>>> + struct vhost_iotlb *iotlb) >>>> { >>>> struct mlx5_vdpa_mr *mr = &mvdev->mr; >>>> int err; >>>> >>>> - if (mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP] != asid) >>>> - return 0; >>>> - >>>> if (mr->initialized) >>>> return 0; >>>> >>>> @@ -574,18 +562,22 @@ static int _mlx5_vdpa_create_mr(struct mlx5_vdpa_dev *mvdev, >>>> { >>>> int err; >>>> >>>> - err = _mlx5_vdpa_create_dvq_mr(mvdev, iotlb, asid); >>>> - if (err) >>>> - return err; >>>> - >>>> - err = _mlx5_vdpa_create_cvq_mr(mvdev, iotlb, asid); >>>> - if (err) >>>> - goto out_err; >>>> + if (mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP] == asid) { >>>> + err = _mlx5_vdpa_create_dvq_mr(mvdev, iotlb); >>>> + if (err) >>>> + return err; >>>> + } >>>> + if (mvdev->group2asid[MLX5_VDPA_CVQ_GROUP] == asid) { >>>> + err = _mlx5_vdpa_create_cvq_mr(mvdev, iotlb); >>>> + if (err) >>>> + goto out_err; >>>> + } >>>> >>>> return 0; >>>> >>>> out_err: >>>> - _mlx5_vdpa_destroy_dvq_mr(mvdev, asid); >>>> + if (mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP] == asid) >>>> + _mlx5_vdpa_destroy_dvq_mr(mvdev); >>>> >>>> return err; >>>> } >>>> @@ -601,6 +593,28 @@ int mlx5_vdpa_create_mr(struct mlx5_vdpa_dev *mvdev, struct vhost_iotlb *iotlb, >>>> return err; >>>> } >>>> >>>> +int mlx5_vdpa_reset_mr(struct mlx5_vdpa_dev *mvdev, unsigned int asid) >>>> +{ >>>> + struct mlx5_vdpa_mr *mr = &mvdev->mr; >>>> + int err = 0; >>>> + >>>> + if (asid != 0) >>>> + return 0; >>>> + >>>> + mutex_lock(&mr->mkey_mtx); >>>> + if (!mr->user_mr) >>>> + goto out; >>>> + _mlx5_vdpa_destroy_dvq_mr(mvdev); >>>> + if (MLX5_CAP_GEN(mvdev->mdev, umem_uid_0)) { >>>> + err = _mlx5_vdpa_create_dvq_mr(mvdev, NULL); >>>> + if (err) >>>> + mlx5_vdpa_warn(mvdev, "create DMA MR failed\n"); >>>> + } >>>> +out: >>>> + mutex_unlock(&mr->mkey_mtx); >>>> + return err; >>>> +} >>>> + >>>> int mlx5_vdpa_handle_set_map(struct mlx5_vdpa_dev *mvdev, struct vhost_iotlb *iotlb, >>>> bool 
*change_map, unsigned int asid) >>>> { >>>> diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c >>>> index 37be945..3cb5db6 100644 >>>> --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c >>>> +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c >>>> @@ -2824,7 +2824,6 @@ static int mlx5_vdpa_reset(struct vdpa_device *vdev) >>>> unregister_link_notifier(ndev); >>>> teardown_driver(ndev); >>>> clear_vqs_ready(ndev); >>>> - mlx5_vdpa_destroy_mr(&ndev->mvdev); >>>> ndev->mvdev.status = 0; >>>> ndev->mvdev.suspended = false; >>>> ndev->cur_num_vqs = 0; >>>> @@ -2835,10 +2834,6 @@ static int mlx5_vdpa_reset(struct vdpa_device *vdev) >>>> init_group_to_asid_map(mvdev); >>>> ++mvdev->generation; >>>> >>>> - if (MLX5_CAP_GEN(mvdev->mdev, umem_uid_0)) { >>>> - if (mlx5_vdpa_create_mr(mvdev, NULL, 0)) >>>> - mlx5_vdpa_warn(mvdev, "create MR failed\n"); >>>> - } >>>> up_write(&ndev->reslock); >>>> >>>> return 0; >>>> @@ -2903,6 +2898,18 @@ static int mlx5_vdpa_set_map(struct vdpa_device *vdev, unsigned int asid, >>>> return err; >>>> } >>>> >>>> +static int mlx5_vdpa_reset_map(struct vdpa_device *vdev, unsigned int asid) >>>> +{ >>>> + struct mlx5_vdpa_dev *mvdev = to_mvdev(vdev); >>>> + struct mlx5_vdpa_net *ndev = to_mlx5_vdpa_ndev(mvdev); >>>> + int err; >>>> + >>>> + down_write(&ndev->reslock); >>>> + err = mlx5_vdpa_reset_mr(mvdev, asid); >>>> + up_write(&ndev->reslock); >>>> + return err; >>>> +} >>>> + >>>> static struct device *mlx5_get_vq_dma_dev(struct vdpa_device *vdev, u16 idx) >>>> { >>>> struct mlx5_vdpa_dev *mvdev = to_mvdev(vdev); >>>> @@ -3162,6 +3169,7 @@ static int mlx5_set_group_asid(struct vdpa_device *vdev, u32 group, >>>> .set_config = mlx5_vdpa_set_config, >>>> .get_generation = mlx5_vdpa_get_generation, >>>> .set_map = mlx5_vdpa_set_map, >>>> + .reset_map = mlx5_vdpa_reset_map, >>>> .set_group_asid = mlx5_set_group_asid, >>>> .get_vq_dma_dev = mlx5_get_vq_dma_dev, >>>> .free = mlx5_vdpa_free, >>>> -- >>>> 1.8.3.1 >>>> 
* [PATCH RFC v3 3/4] vhost-vdpa: should restore 1:1 dma mapping before detaching driver 2023-09-09 13:43 [PATCH RFC v3 0/4] vdpa: decouple reset of iotlb mapping from device reset Si-Wei Liu 2023-09-09 13:43 ` [PATCH RFC v3 1/4] vdpa: introduce .reset_map operation callback Si-Wei Liu 2023-09-09 13:43 ` [PATCH RFC v3 2/4] vdpa/mlx5: implement .reset_map driver op Si-Wei Liu @ 2023-09-09 13:43 ` Si-Wei Liu 2023-09-09 13:43 ` [PATCH RFC v3 4/4] vhost-vdpa: introduce IOTLB_PERSIST backend feature bit Si-Wei Liu 3 siblings, 0 replies; 13+ messages in thread From: Si-Wei Liu @ 2023-09-09 13:43 UTC (permalink / raw) To: eperezma, jasowang, mst, xuanzhuo, dtatulea; +Cc: virtualization Devices with on-chip IOMMU may need to restore iotlb to 1:1 identity mapping from IOVA to PA. Before vhost-vdpa is going away, give them a chance to clean up and reset iotlb back to 1:1 identify mapping mode. This is done so that any vdpa bus driver may start with 1:1 identity mapping by default. Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com> --- drivers/vhost/vdpa.c | 17 +++++++++++++++++ 1 file changed, 17 insertions(+) diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c index eabac06..71fbd559 100644 --- a/drivers/vhost/vdpa.c +++ b/drivers/vhost/vdpa.c @@ -131,6 +131,15 @@ static struct vhost_vdpa_as *vhost_vdpa_find_alloc_as(struct vhost_vdpa *v, return vhost_vdpa_alloc_as(v, asid); } +static void vhost_vdpa_reset_map(struct vhost_vdpa *v, u32 asid) +{ + struct vdpa_device *vdpa = v->vdpa; + const struct vdpa_config_ops *ops = vdpa->config; + + if (ops->reset_map) + ops->reset_map(vdpa, asid); +} + static int vhost_vdpa_remove_as(struct vhost_vdpa *v, u32 asid) { struct vhost_vdpa_as *as = asid_to_as(v, asid); @@ -140,6 +149,14 @@ static int vhost_vdpa_remove_as(struct vhost_vdpa *v, u32 asid) hlist_del(&as->hash_link); vhost_vdpa_iotlb_unmap(v, &as->iotlb, 0ULL, 0ULL - 1, asid); + /* + * Devices with on-chip IOMMU need to restore iotlb + * to 1:1 identity mapping before 
vhost-vdpa is going + * to be removed and detached from the device. Give + * them a chance to do so, as this cannot be done + * efficiently via the whole-range unmap call above. + */ + vhost_vdpa_reset_map(v, asid); kfree(as); return 0; -- 1.8.3.1 _______________________________________________ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization ^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH RFC v3 4/4] vhost-vdpa: introduce IOTLB_PERSIST backend feature bit 2023-09-09 13:43 [PATCH RFC v3 0/4] vdpa: decouple reset of iotlb mapping from device reset Si-Wei Liu ` (2 preceding siblings ...) 2023-09-09 13:43 ` [PATCH RFC v3 3/4] vhost-vdpa: should restore 1:1 dma mapping before detaching driver Si-Wei Liu @ 2023-09-09 13:43 ` Si-Wei Liu 2023-09-11 3:52 ` Jason Wang 3 siblings, 1 reply; 13+ messages in thread From: Si-Wei Liu @ 2023-09-09 13:43 UTC (permalink / raw) To: eperezma, jasowang, mst, xuanzhuo, dtatulea; +Cc: virtualization Userspace needs this feature flag to distinguish if vhost-vdpa iotlb in the kernel supports persistent IOTLB mapping across device reset. There are two cases that backend may claim this feature bit on: - parent device that has to work with platform IOMMU - parent device with on-chip IOMMU that has the expected .reset_map support in driver Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com> --- RFC v2 -> v3: - fix missing return due to merge error --- drivers/vhost/vdpa.c | 16 +++++++++++++++- include/uapi/linux/vhost_types.h | 2 ++ 2 files changed, 17 insertions(+), 1 deletion(-) diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c index 71fbd559..b404504 100644 --- a/drivers/vhost/vdpa.c +++ b/drivers/vhost/vdpa.c @@ -414,6 +414,14 @@ static bool vhost_vdpa_has_desc_group(const struct vhost_vdpa *v) return ops->get_vq_desc_group; } +static bool vhost_vdpa_has_persistent_map(const struct vhost_vdpa *v) +{ + struct vdpa_device *vdpa = v->vdpa; + const struct vdpa_config_ops *ops = vdpa->config; + + return (!ops->set_map && !ops->dma_map) || ops->reset_map; +} + static long vhost_vdpa_get_features(struct vhost_vdpa *v, u64 __user *featurep) { struct vdpa_device *vdpa = v->vdpa; @@ -716,7 +724,8 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep, if (features & ~(VHOST_VDPA_BACKEND_FEATURES | BIT_ULL(VHOST_BACKEND_F_DESC_ASID) | BIT_ULL(VHOST_BACKEND_F_SUSPEND) | - BIT_ULL(VHOST_BACKEND_F_RESUME))) + 
BIT_ULL(VHOST_BACKEND_F_RESUME) | + BIT_ULL(VHOST_BACKEND_F_IOTLB_PERSIST))) return -EOPNOTSUPP; if ((features & BIT_ULL(VHOST_BACKEND_F_SUSPEND)) && !vhost_vdpa_can_suspend(v)) @@ -730,6 +739,9 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep, if ((features & BIT_ULL(VHOST_BACKEND_F_DESC_ASID)) && !vhost_vdpa_has_desc_group(v)) return -EOPNOTSUPP; + if ((features & BIT_ULL(VHOST_BACKEND_F_IOTLB_PERSIST)) && + !vhost_vdpa_has_persistent_map(v)) + return -EOPNOTSUPP; vhost_set_backend_features(&v->vdev, features); return 0; } @@ -785,6 +797,8 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep, features |= BIT_ULL(VHOST_BACKEND_F_RESUME); if (vhost_vdpa_has_desc_group(v)) features |= BIT_ULL(VHOST_BACKEND_F_DESC_ASID); + if (vhost_vdpa_has_persistent_map(v)) + features |= BIT_ULL(VHOST_BACKEND_F_IOTLB_PERSIST); if (copy_to_user(featurep, &features, sizeof(features))) r = -EFAULT; break; diff --git a/include/uapi/linux/vhost_types.h b/include/uapi/linux/vhost_types.h index 6acc604..0fdb6f0 100644 --- a/include/uapi/linux/vhost_types.h +++ b/include/uapi/linux/vhost_types.h @@ -186,5 +186,7 @@ struct vhost_vdpa_iova_range { * buffers may reside. Requires VHOST_BACKEND_F_IOTLB_ASID. */ #define VHOST_BACKEND_F_DESC_ASID 0x6 +/* IOTLB don't flush memory mapping across device reset */ +#define VHOST_BACKEND_F_IOTLB_PERSIST 0x7 #endif -- 1.8.3.1 _______________________________________________ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH RFC v3 4/4] vhost-vdpa: introduce IOTLB_PERSIST backend feature bit 2023-09-09 13:43 ` [PATCH RFC v3 4/4] vhost-vdpa: introduce IOTLB_PERSIST backend feature bit Si-Wei Liu @ 2023-09-11 3:52 ` Jason Wang 2023-09-12 0:28 ` Si-Wei Liu 0 siblings, 1 reply; 13+ messages in thread From: Jason Wang @ 2023-09-11 3:52 UTC (permalink / raw) To: Si-Wei Liu; +Cc: eperezma, virtualization, xuanzhuo, mst On Sat, Sep 9, 2023 at 9:46 PM Si-Wei Liu <si-wei.liu@oracle.com> wrote: > > Userspace needs this feature flag to distinguish if vhost-vdpa > iotlb in the kernel supports persistent IOTLB mapping across > device reset. As discussed, the IOTLB persists for devices with platform IOMMU at least. You've mentioned that this behaviour is covered by Qemu since it reset IOTLB on reset. I wonder what happens if we simply decouple the IOTLB reset from vDPA reset in Qemu. Could we meet bugs there? Btw, is there a Qemu patch for reference for this new feature? Thanks > There are two cases that backend may claim > this feature bit on: > > - parent device that has to work with platform IOMMU > - parent device with on-chip IOMMU that has the expected > .reset_map support in driver > > Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com> > --- > RFC v2 -> v3: > - fix missing return due to merge error > > --- > drivers/vhost/vdpa.c | 16 +++++++++++++++- > include/uapi/linux/vhost_types.h | 2 ++ > 2 files changed, 17 insertions(+), 1 deletion(-) > > diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c > index 71fbd559..b404504 100644 > --- a/drivers/vhost/vdpa.c > +++ b/drivers/vhost/vdpa.c > @@ -414,6 +414,14 @@ static bool vhost_vdpa_has_desc_group(const struct vhost_vdpa *v) > return ops->get_vq_desc_group; > } > > +static bool vhost_vdpa_has_persistent_map(const struct vhost_vdpa *v) > +{ > + struct vdpa_device *vdpa = v->vdpa; > + const struct vdpa_config_ops *ops = vdpa->config; > + > + return (!ops->set_map && !ops->dma_map) || ops->reset_map; > +} > + > static long 
vhost_vdpa_get_features(struct vhost_vdpa *v, u64 __user *featurep) > { > struct vdpa_device *vdpa = v->vdpa; > @@ -716,7 +724,8 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep, > if (features & ~(VHOST_VDPA_BACKEND_FEATURES | > BIT_ULL(VHOST_BACKEND_F_DESC_ASID) | > BIT_ULL(VHOST_BACKEND_F_SUSPEND) | > - BIT_ULL(VHOST_BACKEND_F_RESUME))) > + BIT_ULL(VHOST_BACKEND_F_RESUME) | > + BIT_ULL(VHOST_BACKEND_F_IOTLB_PERSIST))) > return -EOPNOTSUPP; > if ((features & BIT_ULL(VHOST_BACKEND_F_SUSPEND)) && > !vhost_vdpa_can_suspend(v)) > @@ -730,6 +739,9 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep, > if ((features & BIT_ULL(VHOST_BACKEND_F_DESC_ASID)) && > !vhost_vdpa_has_desc_group(v)) > return -EOPNOTSUPP; > + if ((features & BIT_ULL(VHOST_BACKEND_F_IOTLB_PERSIST)) && > + !vhost_vdpa_has_persistent_map(v)) > + return -EOPNOTSUPP; > vhost_set_backend_features(&v->vdev, features); > return 0; > } > @@ -785,6 +797,8 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep, > features |= BIT_ULL(VHOST_BACKEND_F_RESUME); > if (vhost_vdpa_has_desc_group(v)) > features |= BIT_ULL(VHOST_BACKEND_F_DESC_ASID); > + if (vhost_vdpa_has_persistent_map(v)) > + features |= BIT_ULL(VHOST_BACKEND_F_IOTLB_PERSIST); > if (copy_to_user(featurep, &features, sizeof(features))) > r = -EFAULT; > break; > diff --git a/include/uapi/linux/vhost_types.h b/include/uapi/linux/vhost_types.h > index 6acc604..0fdb6f0 100644 > --- a/include/uapi/linux/vhost_types.h > +++ b/include/uapi/linux/vhost_types.h > @@ -186,5 +186,7 @@ struct vhost_vdpa_iova_range { > * buffers may reside. Requires VHOST_BACKEND_F_IOTLB_ASID. 
> */ > #define VHOST_BACKEND_F_DESC_ASID 0x6 > +/* IOTLB don't flush memory mapping across device reset */ > +#define VHOST_BACKEND_F_IOTLB_PERSIST 0x7 > > #endif > -- > 1.8.3.1 > _______________________________________________ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH RFC v3 4/4] vhost-vdpa: introduce IOTLB_PERSIST backend feature bit 2023-09-11 3:52 ` Jason Wang @ 2023-09-12 0:28 ` Si-Wei Liu 2023-09-12 7:01 ` Jason Wang 0 siblings, 1 reply; 13+ messages in thread From: Si-Wei Liu @ 2023-09-12 0:28 UTC (permalink / raw) To: Jason Wang; +Cc: eperezma, virtualization, xuanzhuo, mst On 9/10/2023 8:52 PM, Jason Wang wrote: > On Sat, Sep 9, 2023 at 9:46 PM Si-Wei Liu <si-wei.liu@oracle.com> wrote: >> Userspace needs this feature flag to distinguish if vhost-vdpa >> iotlb in the kernel supports persistent IOTLB mapping across >> device reset. > As discussed, the IOTLB persists for devices with platform IOMMU at > least. You've mentioned that this behaviour is covered by Qemu since > it reset IOTLB on reset. I wonder what happens if we simply decouple > the IOTLB reset from vDPA reset in Qemu. Could we meet bugs there? Not sure I understand. Simple decouple meaning to remove memory_listener registration/unregistration calls *unconditionally* from the vDPA dev start/stop path in QEMU, or make it conditional around the existence of PERSIST_IOTLB? If unconditional, it breaks older host kernel, as the older kernel still silently drops all mapping across vDPA reset (VM reboot), ending up with network loss afterwards; if make the QEMU change conditional on PERSIST_IOTLB without the .reset_map API, it can't cover the virtio-vdpa 1:1 identity mapping case. > Btw, is there a Qemu patch for reference for this new feature? There's a WIP version, not ready yet for review: https://github.com/siwliu-kernel/qemu branch: vdpa-svq-asid-poc Will need to clean up code and split to smaller patches before I can post it, if the kernel part can be settled. 
Thanks, -Siwei > > Thanks > >> There are two cases that backend may claim >> this feature bit on: >> >> - parent device that has to work with platform IOMMU >> - parent device with on-chip IOMMU that has the expected >> .reset_map support in driver >> >> Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com> >> --- >> RFC v2 -> v3: >> - fix missing return due to merge error >> >> --- >> drivers/vhost/vdpa.c | 16 +++++++++++++++- >> include/uapi/linux/vhost_types.h | 2 ++ >> 2 files changed, 17 insertions(+), 1 deletion(-) >> >> diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c >> index 71fbd559..b404504 100644 >> --- a/drivers/vhost/vdpa.c >> +++ b/drivers/vhost/vdpa.c >> @@ -414,6 +414,14 @@ static bool vhost_vdpa_has_desc_group(const struct vhost_vdpa *v) >> return ops->get_vq_desc_group; >> } >> >> +static bool vhost_vdpa_has_persistent_map(const struct vhost_vdpa *v) >> +{ >> + struct vdpa_device *vdpa = v->vdpa; >> + const struct vdpa_config_ops *ops = vdpa->config; >> + >> + return (!ops->set_map && !ops->dma_map) || ops->reset_map; >> +} >> + >> static long vhost_vdpa_get_features(struct vhost_vdpa *v, u64 __user *featurep) >> { >> struct vdpa_device *vdpa = v->vdpa; >> @@ -716,7 +724,8 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep, >> if (features & ~(VHOST_VDPA_BACKEND_FEATURES | >> BIT_ULL(VHOST_BACKEND_F_DESC_ASID) | >> BIT_ULL(VHOST_BACKEND_F_SUSPEND) | >> - BIT_ULL(VHOST_BACKEND_F_RESUME))) >> + BIT_ULL(VHOST_BACKEND_F_RESUME) | >> + BIT_ULL(VHOST_BACKEND_F_IOTLB_PERSIST))) >> return -EOPNOTSUPP; >> if ((features & BIT_ULL(VHOST_BACKEND_F_SUSPEND)) && >> !vhost_vdpa_can_suspend(v)) >> @@ -730,6 +739,9 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep, >> if ((features & BIT_ULL(VHOST_BACKEND_F_DESC_ASID)) && >> !vhost_vdpa_has_desc_group(v)) >> return -EOPNOTSUPP; >> + if ((features & BIT_ULL(VHOST_BACKEND_F_IOTLB_PERSIST)) && >> + !vhost_vdpa_has_persistent_map(v)) >> + return -EOPNOTSUPP; >> 
vhost_set_backend_features(&v->vdev, features); >> return 0; >> } >> @@ -785,6 +797,8 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep, >> features |= BIT_ULL(VHOST_BACKEND_F_RESUME); >> if (vhost_vdpa_has_desc_group(v)) >> features |= BIT_ULL(VHOST_BACKEND_F_DESC_ASID); >> + if (vhost_vdpa_has_persistent_map(v)) >> + features |= BIT_ULL(VHOST_BACKEND_F_IOTLB_PERSIST); >> if (copy_to_user(featurep, &features, sizeof(features))) >> r = -EFAULT; >> break; >> diff --git a/include/uapi/linux/vhost_types.h b/include/uapi/linux/vhost_types.h >> index 6acc604..0fdb6f0 100644 >> --- a/include/uapi/linux/vhost_types.h >> +++ b/include/uapi/linux/vhost_types.h >> @@ -186,5 +186,7 @@ struct vhost_vdpa_iova_range { >> * buffers may reside. Requires VHOST_BACKEND_F_IOTLB_ASID. >> */ >> #define VHOST_BACKEND_F_DESC_ASID 0x6 >> +/* IOTLB don't flush memory mapping across device reset */ >> +#define VHOST_BACKEND_F_IOTLB_PERSIST 0x7 >> >> #endif >> -- >> 1.8.3.1 >> _______________________________________________ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH RFC v3 4/4] vhost-vdpa: introduce IOTLB_PERSIST backend feature bit
  2023-09-12  0:28       ` Si-Wei Liu
@ 2023-09-12  7:01         ` Jason Wang
  2023-09-15  5:55           ` Si-Wei Liu
  0 siblings, 1 reply; 13+ messages in thread
From: Jason Wang @ 2023-09-12 7:01 UTC (permalink / raw)
To: Si-Wei Liu; +Cc: eperezma, virtualization, xuanzhuo, mst

On Tue, Sep 12, 2023 at 8:28 AM Si-Wei Liu <si-wei.liu@oracle.com> wrote:
>
> On 9/10/2023 8:52 PM, Jason Wang wrote:
> > On Sat, Sep 9, 2023 at 9:46 PM Si-Wei Liu <si-wei.liu@oracle.com> wrote:
> >> Userspace needs this feature flag to distinguish if vhost-vdpa
> >> iotlb in the kernel supports persistent IOTLB mapping across
> >> device reset.
> > As discussed, the IOTLB persists for devices with platform IOMMU at
> > least. You've mentioned that this behaviour is covered by Qemu since
> > it reset IOTLB on reset. I wonder what happens if we simply decouple
> > the IOTLB reset from vDPA reset in Qemu. Could we meet bugs there?
> Not sure I understand. Simple decouple meaning to remove memory_listener
> registration/unregistration calls *unconditionally* from the vDPA dev
> start/stop path in QEMU, or make it conditional around the existence of
> PERSIST_IOTLB?

If my memory is correct, currently the listeners were
registered/unregistered during start/stop. I mean what if we
register/unregister during init/free?

> If unconditional, it breaks older host kernel, as the
> older kernel still silently drops all mapping across vDPA reset (VM
> reboot),

Ok, right.

> ending up with network loss afterwards; if make the QEMU change
> conditional on PERSIST_IOTLB without the .reset_map API, it can't cover
> the virtio-vdpa 1:1 identity mapping case.

Ok, I see. Let's add the above and explain why it can't cover the 1:1
mapping somewhere (probably the commit log) in the next version.

So I think we can probably introduce reset_map() but not say it's for
on-chip IOMMU. We can probably say, it's for resetting the vendor
specific mapping into initialization state?

> > Btw, is there a Qemu patch for reference for this new feature?
> There's a WIP version, not ready yet for review:
>
> https://github.com/siwliu-kernel/qemu
> branch: vdpa-svq-asid-poc
>
> Will need to clean up code and split to smaller patches before I can
> post it, if the kernel part can be settled.

Ok.

Thanks

> Thanks,
> -Siwei
>
> [...]
* Re: [PATCH RFC v3 4/4] vhost-vdpa: introduce IOTLB_PERSIST backend feature bit
  2023-09-12  7:01         ` Jason Wang
@ 2023-09-15  5:55           ` Si-Wei Liu
  0 siblings, 0 replies; 13+ messages in thread
From: Si-Wei Liu @ 2023-09-15 5:55 UTC (permalink / raw)
To: Jason Wang; +Cc: eperezma, virtualization, xuanzhuo, mst

On 9/12/2023 12:01 AM, Jason Wang wrote:
> On Tue, Sep 12, 2023 at 8:28 AM Si-Wei Liu <si-wei.liu@oracle.com> wrote:
>>
>> On 9/10/2023 8:52 PM, Jason Wang wrote:
>>> On Sat, Sep 9, 2023 at 9:46 PM Si-Wei Liu <si-wei.liu@oracle.com> wrote:
>>>> Userspace needs this feature flag to distinguish if vhost-vdpa
>>>> iotlb in the kernel supports persistent IOTLB mapping across
>>>> device reset.
>>> As discussed, the IOTLB persists for devices with platform IOMMU at
>>> least. You've mentioned that this behaviour is covered by Qemu since
>>> it reset IOTLB on reset. I wonder what happens if we simply decouple
>>> the IOTLB reset from vDPA reset in Qemu. Could we meet bugs there?
>> Not sure I understand. Simple decouple meaning to remove memory_listener
>> registration/unregistration calls *unconditionally* from the vDPA dev
>> start/stop path in QEMU, or make it conditional around the existence of
>> PERSIST_IOTLB?
> If my memory is correct, currently the listeners were
> registered/unregistered during start/stop. I mean what if we
> register/unregister during init/free?
Yes, the move to init/cleanup was assumed in the below response.

>> If unconditional, it breaks older host kernel, as the
>> older kernel still silently drops all mapping across vDPA reset (VM
>> reboot),
> Ok, right.
>
>> ending up with network loss afterwards; if make the QEMU change
>> conditional on PERSIST_IOTLB without the .reset_map API, it can't cover
>> the virtio-vdpa 1:1 identity mapping case.
> Ok, I see. Let's add the above and explain why it can't cover the 1:1
> mapping somewhere (probably the commit log) in the next version.
OK. Will do.

> So I think we can probably introduce reset_map() but not say it's for
> on-chip IOMMU. We can probably say, it's for resetting the vendor
> specific mapping into initialization state?
For sure. That's exactly the intent, though I didn't specifically tie
on-chip to two-dimensional or identity mapping. Yes, I can reword to
"vendor specific" in the next rev to avoid confusion and ambiguity.

Thanks,
-Siwei

> [...]
end of thread, other threads: [~2023-09-15  5:56 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed
-- links below jump to the message on this page --)
2023-09-09 13:43 [PATCH RFC v3 0/4] vdpa: decouple reset of iotlb mapping from device reset Si-Wei Liu
2023-09-09 13:43 ` [PATCH RFC v3 1/4] vdpa: introduce .reset_map operation callback Si-Wei Liu
2023-09-09 13:43 ` [PATCH RFC v3 2/4] vdpa/mlx5: implement .reset_map driver op Si-Wei Liu
2023-09-11  3:48   ` Jason Wang
2023-09-12  0:01     ` Si-Wei Liu
2023-09-12  6:53       ` Jason Wang
2023-09-15  5:45         ` Si-Wei Liu
2023-09-09 13:43 ` [PATCH RFC v3 3/4] vhost-vdpa: should restore 1:1 dma mapping before detaching driver Si-Wei Liu
2023-09-09 13:43 ` [PATCH RFC v3 4/4] vhost-vdpa: introduce IOTLB_PERSIST backend feature bit Si-Wei Liu
2023-09-11  3:52   ` Jason Wang
2023-09-12  0:28     ` Si-Wei Liu
2023-09-12  7:01       ` Jason Wang
2023-09-15  5:55         ` Si-Wei Liu