All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 1/1] drm/virtio: Implement device_attach
@ 2024-01-29 10:31 ` Julia Zhang
  0 siblings, 0 replies; 15+ messages in thread
From: Julia Zhang @ 2024-01-29 10:31 UTC (permalink / raw)
  To: Gurchetan Singh, Chia-I Wu, David Airlie, Gerd Hoffmann,
	linux-kernel, dri-devel, amd-gfx, virtualization
  Cc: Pierre-Eric Pelloux-Prayer, Erik Faye-Lund, Marek Olšák,
	Chen Jiqian, Huang Rui, Honglei Huang, Daniel Vetter,
	Alex Deucher, Julia Zhang, David Airlie, Christian König

As vram objects don't have backing pages and thus can't implement
drm_gem_object_funcs.get_sg_table callback. This removes drm dma-buf
callbacks in virtgpu_gem_map_dma_buf()/virtgpu_gem_unmap_dma_buf()
and implement virtgpu specific map/unmap/attach callbacks to support
both of shmem objects and vram objects.

Signed-off-by: Julia Zhang <julia.zhang@amd.com>
---
 drivers/gpu/drm/virtio/virtgpu_prime.c | 40 +++++++++++++++++++++++---
 1 file changed, 36 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/virtio/virtgpu_prime.c b/drivers/gpu/drm/virtio/virtgpu_prime.c
index 44425f20d91a..b490a5343b06 100644
--- a/drivers/gpu/drm/virtio/virtgpu_prime.c
+++ b/drivers/gpu/drm/virtio/virtgpu_prime.c
@@ -49,11 +49,26 @@ virtgpu_gem_map_dma_buf(struct dma_buf_attachment *attach,
 {
 	struct drm_gem_object *obj = attach->dmabuf->priv;
 	struct virtio_gpu_object *bo = gem_to_virtio_gpu_obj(obj);
+	struct sg_table *sgt;
+	int ret;
 
 	if (virtio_gpu_is_vram(bo))
 		return virtio_gpu_vram_map_dma_buf(bo, attach->dev, dir);
 
-	return drm_gem_map_dma_buf(attach, dir);
+	sgt = drm_prime_pages_to_sg(obj->dev,
+				    to_drm_gem_shmem_obj(obj)->pages,
+				    obj->size >> PAGE_SHIFT);
+	if (IS_ERR(sgt))
+		return sgt;
+
+	ret = dma_map_sgtable(attach->dev, sgt, dir, DMA_ATTR_SKIP_CPU_SYNC);
+	if (ret) {
+		sg_free_table(sgt);
+		kfree(sgt);
+		return ERR_PTR(ret);
+	}
+
+	return sgt;
 }
 
 static void virtgpu_gem_unmap_dma_buf(struct dma_buf_attachment *attach,
@@ -63,12 +78,29 @@ static void virtgpu_gem_unmap_dma_buf(struct dma_buf_attachment *attach,
 	struct drm_gem_object *obj = attach->dmabuf->priv;
 	struct virtio_gpu_object *bo = gem_to_virtio_gpu_obj(obj);
 
+	if (!sgt)
+		return;
+
 	if (virtio_gpu_is_vram(bo)) {
 		virtio_gpu_vram_unmap_dma_buf(attach->dev, sgt, dir);
-		return;
+	} else {
+		dma_unmap_sgtable(attach->dev, sgt, dir, DMA_ATTR_SKIP_CPU_SYNC);
+		sg_free_table(sgt);
+		kfree(sgt);
 	}
+}
+
+static int virtgpu_gem_device_attach(struct dma_buf *dma_buf,
+				     struct dma_buf_attachment *attach)
+{
+	struct drm_gem_object *obj = attach->dmabuf->priv;
+	struct virtio_gpu_object *bo = gem_to_virtio_gpu_obj(obj);
+	int ret = 0;
+
+	if (!virtio_gpu_is_vram(bo) && obj->funcs->pin)
+		ret = obj->funcs->pin(obj);
 
-	drm_gem_unmap_dma_buf(attach, sgt, dir);
+	return ret;
 }
 
 static const struct virtio_dma_buf_ops virtgpu_dmabuf_ops =  {
@@ -83,7 +115,7 @@ static const struct virtio_dma_buf_ops virtgpu_dmabuf_ops =  {
 		.vmap = drm_gem_dmabuf_vmap,
 		.vunmap = drm_gem_dmabuf_vunmap,
 	},
-	.device_attach = drm_gem_map_attach,
+	.device_attach = virtgpu_gem_device_attach,
 	.get_uuid = virtgpu_virtio_get_uuid,
 };
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v2 1/1] drm/virtio: Implement device_attach
@ 2024-01-29 10:31 ` Julia Zhang
  0 siblings, 0 replies; 15+ messages in thread
From: Julia Zhang @ 2024-01-29 10:31 UTC (permalink / raw)
  To: Gurchetan Singh, Chia-I Wu, David Airlie, Gerd Hoffmann,
	linux-kernel, dri-devel, amd-gfx, virtualization
  Cc: Alex Deucher, Christian König, Daniel Vetter, David Airlie,
	Erik Faye-Lund, Marek Olšák, Pierre-Eric Pelloux-Prayer,
	Honglei Huang, Chen Jiqian, Huang Rui, Julia Zhang

As vram objects don't have backing pages and thus can't implement
drm_gem_object_funcs.get_sg_table callback. This removes drm dma-buf
callbacks in virtgpu_gem_map_dma_buf()/virtgpu_gem_unmap_dma_buf()
and implement virtgpu specific map/unmap/attach callbacks to support
both of shmem objects and vram objects.

Signed-off-by: Julia Zhang <julia.zhang@amd.com>
---
 drivers/gpu/drm/virtio/virtgpu_prime.c | 40 +++++++++++++++++++++++---
 1 file changed, 36 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/virtio/virtgpu_prime.c b/drivers/gpu/drm/virtio/virtgpu_prime.c
index 44425f20d91a..b490a5343b06 100644
--- a/drivers/gpu/drm/virtio/virtgpu_prime.c
+++ b/drivers/gpu/drm/virtio/virtgpu_prime.c
@@ -49,11 +49,26 @@ virtgpu_gem_map_dma_buf(struct dma_buf_attachment *attach,
 {
 	struct drm_gem_object *obj = attach->dmabuf->priv;
 	struct virtio_gpu_object *bo = gem_to_virtio_gpu_obj(obj);
+	struct sg_table *sgt;
+	int ret;
 
 	if (virtio_gpu_is_vram(bo))
 		return virtio_gpu_vram_map_dma_buf(bo, attach->dev, dir);
 
-	return drm_gem_map_dma_buf(attach, dir);
+	sgt = drm_prime_pages_to_sg(obj->dev,
+				    to_drm_gem_shmem_obj(obj)->pages,
+				    obj->size >> PAGE_SHIFT);
+	if (IS_ERR(sgt))
+		return sgt;
+
+	ret = dma_map_sgtable(attach->dev, sgt, dir, DMA_ATTR_SKIP_CPU_SYNC);
+	if (ret) {
+		sg_free_table(sgt);
+		kfree(sgt);
+		return ERR_PTR(ret);
+	}
+
+	return sgt;
 }
 
 static void virtgpu_gem_unmap_dma_buf(struct dma_buf_attachment *attach,
@@ -63,12 +78,29 @@ static void virtgpu_gem_unmap_dma_buf(struct dma_buf_attachment *attach,
 	struct drm_gem_object *obj = attach->dmabuf->priv;
 	struct virtio_gpu_object *bo = gem_to_virtio_gpu_obj(obj);
 
+	if (!sgt)
+		return;
+
 	if (virtio_gpu_is_vram(bo)) {
 		virtio_gpu_vram_unmap_dma_buf(attach->dev, sgt, dir);
-		return;
+	} else {
+		dma_unmap_sgtable(attach->dev, sgt, dir, DMA_ATTR_SKIP_CPU_SYNC);
+		sg_free_table(sgt);
+		kfree(sgt);
 	}
+}
+
+static int virtgpu_gem_device_attach(struct dma_buf *dma_buf,
+				     struct dma_buf_attachment *attach)
+{
+	struct drm_gem_object *obj = attach->dmabuf->priv;
+	struct virtio_gpu_object *bo = gem_to_virtio_gpu_obj(obj);
+	int ret = 0;
+
+	if (!virtio_gpu_is_vram(bo) && obj->funcs->pin)
+		ret = obj->funcs->pin(obj);
 
-	drm_gem_unmap_dma_buf(attach, sgt, dir);
+	return ret;
 }
 
 static const struct virtio_dma_buf_ops virtgpu_dmabuf_ops =  {
@@ -83,7 +115,7 @@ static const struct virtio_dma_buf_ops virtgpu_dmabuf_ops =  {
 		.vmap = drm_gem_dmabuf_vmap,
 		.vunmap = drm_gem_dmabuf_vunmap,
 	},
-	.device_attach = drm_gem_map_attach,
+	.device_attach = virtgpu_gem_device_attach,
 	.get_uuid = virtgpu_virtio_get_uuid,
 };
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 1/1] drm/virtio: Implement device_attach
  2024-01-29 10:31 ` Julia Zhang
@ 2024-01-29 13:12   ` Christian König
  -1 siblings, 0 replies; 15+ messages in thread
From: Christian König @ 2024-01-29 13:12 UTC (permalink / raw)
  To: Julia Zhang, Gurchetan Singh, Chia-I Wu, David Airlie,
	Gerd Hoffmann, linux-kernel, dri-devel, amd-gfx, virtualization
  Cc: Pierre-Eric Pelloux-Prayer, Erik Faye-Lund, Marek Olšák,
	Chen Jiqian, Huang Rui, Honglei Huang, Daniel Vetter,
	Alex Deucher, David Airlie

Am 29.01.24 um 11:31 schrieb Julia Zhang:
> As vram objects don't have backing pages and thus can't implement
> drm_gem_object_funcs.get_sg_table callback. This removes drm dma-buf
> callbacks in virtgpu_gem_map_dma_buf()/virtgpu_gem_unmap_dma_buf()
> and implement virtgpu specific map/unmap/attach callbacks to support
> both of shmem objects and vram objects.
>
> Signed-off-by: Julia Zhang <julia.zhang@amd.com>

I need to find more time to look into the code, but of hand I would say 
that this is the correct solution.

Regards,
Christian.

> ---
>   drivers/gpu/drm/virtio/virtgpu_prime.c | 40 +++++++++++++++++++++++---
>   1 file changed, 36 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/virtio/virtgpu_prime.c b/drivers/gpu/drm/virtio/virtgpu_prime.c
> index 44425f20d91a..b490a5343b06 100644
> --- a/drivers/gpu/drm/virtio/virtgpu_prime.c
> +++ b/drivers/gpu/drm/virtio/virtgpu_prime.c
> @@ -49,11 +49,26 @@ virtgpu_gem_map_dma_buf(struct dma_buf_attachment *attach,
>   {
>   	struct drm_gem_object *obj = attach->dmabuf->priv;
>   	struct virtio_gpu_object *bo = gem_to_virtio_gpu_obj(obj);
> +	struct sg_table *sgt;
> +	int ret;
>   
>   	if (virtio_gpu_is_vram(bo))
>   		return virtio_gpu_vram_map_dma_buf(bo, attach->dev, dir);
>   
> -	return drm_gem_map_dma_buf(attach, dir);
> +	sgt = drm_prime_pages_to_sg(obj->dev,
> +				    to_drm_gem_shmem_obj(obj)->pages,
> +				    obj->size >> PAGE_SHIFT);
> +	if (IS_ERR(sgt))
> +		return sgt;
> +
> +	ret = dma_map_sgtable(attach->dev, sgt, dir, DMA_ATTR_SKIP_CPU_SYNC);
> +	if (ret) {
> +		sg_free_table(sgt);
> +		kfree(sgt);
> +		return ERR_PTR(ret);
> +	}
> +
> +	return sgt;
>   }
>   
>   static void virtgpu_gem_unmap_dma_buf(struct dma_buf_attachment *attach,
> @@ -63,12 +78,29 @@ static void virtgpu_gem_unmap_dma_buf(struct dma_buf_attachment *attach,
>   	struct drm_gem_object *obj = attach->dmabuf->priv;
>   	struct virtio_gpu_object *bo = gem_to_virtio_gpu_obj(obj);
>   
> +	if (!sgt)
> +		return;
> +
>   	if (virtio_gpu_is_vram(bo)) {
>   		virtio_gpu_vram_unmap_dma_buf(attach->dev, sgt, dir);
> -		return;
> +	} else {
> +		dma_unmap_sgtable(attach->dev, sgt, dir, DMA_ATTR_SKIP_CPU_SYNC);
> +		sg_free_table(sgt);
> +		kfree(sgt);
>   	}
> +}
> +
> +static int virtgpu_gem_device_attach(struct dma_buf *dma_buf,
> +				     struct dma_buf_attachment *attach)
> +{
> +	struct drm_gem_object *obj = attach->dmabuf->priv;
> +	struct virtio_gpu_object *bo = gem_to_virtio_gpu_obj(obj);
> +	int ret = 0;
> +
> +	if (!virtio_gpu_is_vram(bo) && obj->funcs->pin)
> +		ret = obj->funcs->pin(obj);
>   
> -	drm_gem_unmap_dma_buf(attach, sgt, dir);
> +	return ret;
>   }
>   
>   static const struct virtio_dma_buf_ops virtgpu_dmabuf_ops =  {
> @@ -83,7 +115,7 @@ static const struct virtio_dma_buf_ops virtgpu_dmabuf_ops =  {
>   		.vmap = drm_gem_dmabuf_vmap,
>   		.vunmap = drm_gem_dmabuf_vunmap,
>   	},
> -	.device_attach = drm_gem_map_attach,
> +	.device_attach = virtgpu_gem_device_attach,
>   	.get_uuid = virtgpu_virtio_get_uuid,
>   };
>   


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 1/1] drm/virtio: Implement device_attach
@ 2024-01-29 13:12   ` Christian König
  0 siblings, 0 replies; 15+ messages in thread
From: Christian König @ 2024-01-29 13:12 UTC (permalink / raw)
  To: Julia Zhang, Gurchetan Singh, Chia-I Wu, David Airlie,
	Gerd Hoffmann, linux-kernel, dri-devel, amd-gfx, virtualization
  Cc: Alex Deucher, Daniel Vetter, David Airlie, Erik Faye-Lund,
	Marek Olšák, Pierre-Eric Pelloux-Prayer, Honglei Huang,
	Chen Jiqian, Huang Rui

Am 29.01.24 um 11:31 schrieb Julia Zhang:
> As vram objects don't have backing pages and thus can't implement
> drm_gem_object_funcs.get_sg_table callback. This removes drm dma-buf
> callbacks in virtgpu_gem_map_dma_buf()/virtgpu_gem_unmap_dma_buf()
> and implement virtgpu specific map/unmap/attach callbacks to support
> both of shmem objects and vram objects.
>
> Signed-off-by: Julia Zhang <julia.zhang@amd.com>

I need to find more time to look into the code, but of hand I would say 
that this is the correct solution.

Regards,
Christian.

> ---
>   drivers/gpu/drm/virtio/virtgpu_prime.c | 40 +++++++++++++++++++++++---
>   1 file changed, 36 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/virtio/virtgpu_prime.c b/drivers/gpu/drm/virtio/virtgpu_prime.c
> index 44425f20d91a..b490a5343b06 100644
> --- a/drivers/gpu/drm/virtio/virtgpu_prime.c
> +++ b/drivers/gpu/drm/virtio/virtgpu_prime.c
> @@ -49,11 +49,26 @@ virtgpu_gem_map_dma_buf(struct dma_buf_attachment *attach,
>   {
>   	struct drm_gem_object *obj = attach->dmabuf->priv;
>   	struct virtio_gpu_object *bo = gem_to_virtio_gpu_obj(obj);
> +	struct sg_table *sgt;
> +	int ret;
>   
>   	if (virtio_gpu_is_vram(bo))
>   		return virtio_gpu_vram_map_dma_buf(bo, attach->dev, dir);
>   
> -	return drm_gem_map_dma_buf(attach, dir);
> +	sgt = drm_prime_pages_to_sg(obj->dev,
> +				    to_drm_gem_shmem_obj(obj)->pages,
> +				    obj->size >> PAGE_SHIFT);
> +	if (IS_ERR(sgt))
> +		return sgt;
> +
> +	ret = dma_map_sgtable(attach->dev, sgt, dir, DMA_ATTR_SKIP_CPU_SYNC);
> +	if (ret) {
> +		sg_free_table(sgt);
> +		kfree(sgt);
> +		return ERR_PTR(ret);
> +	}
> +
> +	return sgt;
>   }
>   
>   static void virtgpu_gem_unmap_dma_buf(struct dma_buf_attachment *attach,
> @@ -63,12 +78,29 @@ static void virtgpu_gem_unmap_dma_buf(struct dma_buf_attachment *attach,
>   	struct drm_gem_object *obj = attach->dmabuf->priv;
>   	struct virtio_gpu_object *bo = gem_to_virtio_gpu_obj(obj);
>   
> +	if (!sgt)
> +		return;
> +
>   	if (virtio_gpu_is_vram(bo)) {
>   		virtio_gpu_vram_unmap_dma_buf(attach->dev, sgt, dir);
> -		return;
> +	} else {
> +		dma_unmap_sgtable(attach->dev, sgt, dir, DMA_ATTR_SKIP_CPU_SYNC);
> +		sg_free_table(sgt);
> +		kfree(sgt);
>   	}
> +}
> +
> +static int virtgpu_gem_device_attach(struct dma_buf *dma_buf,
> +				     struct dma_buf_attachment *attach)
> +{
> +	struct drm_gem_object *obj = attach->dmabuf->priv;
> +	struct virtio_gpu_object *bo = gem_to_virtio_gpu_obj(obj);
> +	int ret = 0;
> +
> +	if (!virtio_gpu_is_vram(bo) && obj->funcs->pin)
> +		ret = obj->funcs->pin(obj);
>   
> -	drm_gem_unmap_dma_buf(attach, sgt, dir);
> +	return ret;
>   }
>   
>   static const struct virtio_dma_buf_ops virtgpu_dmabuf_ops =  {
> @@ -83,7 +115,7 @@ static const struct virtio_dma_buf_ops virtgpu_dmabuf_ops =  {
>   		.vmap = drm_gem_dmabuf_vmap,
>   		.vunmap = drm_gem_dmabuf_vunmap,
>   	},
> -	.device_attach = drm_gem_map_attach,
> +	.device_attach = virtgpu_gem_device_attach,
>   	.get_uuid = virtgpu_virtio_get_uuid,
>   };
>   


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 1/1] drm/virtio: Implement device_attach
  2024-01-29 10:31 ` Julia Zhang
  (?)
@ 2024-01-30 11:10   ` Daniel Vetter
  -1 siblings, 0 replies; 15+ messages in thread
From: Daniel Vetter @ 2024-01-30 11:10 UTC (permalink / raw)
  To: Julia Zhang
  Cc: Pierre-Eric Pelloux-Prayer, amd-gfx, Chen Jiqian,
	Christian König, Marek Olšák, Erik Faye-Lund,
	linux-kernel, dri-devel, Gurchetan Singh, Huang Rui,
	Gerd Hoffmann, Daniel Vetter, Honglei Huang, Alex Deucher,
	David Airlie, virtualization, David Airlie, Chia-I Wu

On Mon, Jan 29, 2024 at 06:31:19PM +0800, Julia Zhang wrote:
> As vram objects don't have backing pages and thus can't implement
> drm_gem_object_funcs.get_sg_table callback. This removes drm dma-buf
> callbacks in virtgpu_gem_map_dma_buf()/virtgpu_gem_unmap_dma_buf()
> and implement virtgpu specific map/unmap/attach callbacks to support
> both of shmem objects and vram objects.
> 
> Signed-off-by: Julia Zhang <julia.zhang@amd.com>
> ---
>  drivers/gpu/drm/virtio/virtgpu_prime.c | 40 +++++++++++++++++++++++---
>  1 file changed, 36 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/virtio/virtgpu_prime.c b/drivers/gpu/drm/virtio/virtgpu_prime.c
> index 44425f20d91a..b490a5343b06 100644
> --- a/drivers/gpu/drm/virtio/virtgpu_prime.c
> +++ b/drivers/gpu/drm/virtio/virtgpu_prime.c
> @@ -49,11 +49,26 @@ virtgpu_gem_map_dma_buf(struct dma_buf_attachment *attach,
>  {
>  	struct drm_gem_object *obj = attach->dmabuf->priv;
>  	struct virtio_gpu_object *bo = gem_to_virtio_gpu_obj(obj);
> +	struct sg_table *sgt;
> +	int ret;
>  
>  	if (virtio_gpu_is_vram(bo))
>  		return virtio_gpu_vram_map_dma_buf(bo, attach->dev, dir);
>  
> -	return drm_gem_map_dma_buf(attach, dir);
> +	sgt = drm_prime_pages_to_sg(obj->dev,
> +				    to_drm_gem_shmem_obj(obj)->pages,
> +				    obj->size >> PAGE_SHIFT);
> +	if (IS_ERR(sgt))
> +		return sgt;
> +
> +	ret = dma_map_sgtable(attach->dev, sgt, dir, DMA_ATTR_SKIP_CPU_SYNC);
> +	if (ret) {
> +		sg_free_table(sgt);
> +		kfree(sgt);
> +		return ERR_PTR(ret);
> +	}
> +
> +	return sgt;
>  }
>  
>  static void virtgpu_gem_unmap_dma_buf(struct dma_buf_attachment *attach,
> @@ -63,12 +78,29 @@ static void virtgpu_gem_unmap_dma_buf(struct dma_buf_attachment *attach,
>  	struct drm_gem_object *obj = attach->dmabuf->priv;
>  	struct virtio_gpu_object *bo = gem_to_virtio_gpu_obj(obj);
>  
> +	if (!sgt)
> +		return;
> +
>  	if (virtio_gpu_is_vram(bo)) {
>  		virtio_gpu_vram_unmap_dma_buf(attach->dev, sgt, dir);
> -		return;
> +	} else {
> +		dma_unmap_sgtable(attach->dev, sgt, dir, DMA_ATTR_SKIP_CPU_SYNC);
> +		sg_free_table(sgt);
> +		kfree(sgt);
>  	}
> +}
> +
> +static int virtgpu_gem_device_attach(struct dma_buf *dma_buf,
> +				     struct dma_buf_attachment *attach)
> +{
> +	struct drm_gem_object *obj = attach->dmabuf->priv;
> +	struct virtio_gpu_object *bo = gem_to_virtio_gpu_obj(obj);
> +	int ret = 0;
> +
> +	if (!virtio_gpu_is_vram(bo) && obj->funcs->pin)
> +		ret = obj->funcs->pin(obj);
>  
> -	drm_gem_unmap_dma_buf(attach, sgt, dir);
> +	return ret;

This doesn't look like what I've expected. There should be no need to
change the map/unmap functions, especially not for the usual gem bo case.
We should definitely keep using the exact same code for that. Instead all
I expected is roughly

virtgpu_gem_device_attach()
{
	if (virtio_gpu_is_vram(bo)) {
		if (can_access_virtio_vram_directly(attach->dev)
			return 0;
		else
			return -EBUSY;
	} else {
		return drm_gem_map_attach();
	}
}

Note that I think can_access_virtio_vram_directly() needs to be
implemented first. I'm not even sure it's possible, might be that all the
importers need to set the attachment->peer2peer flag. Which is why this
thing exists really. But that's a pile more work to do.

Frankly the more I look at the original patch that added vram export
support the more this just looks like a "pls revert, this is just too
broken".

We should definitely not open-code any functions for the gem_bo export
case, which your patch seems to do? Or maybe I'm just extremely confused.
-Sima

>  
>  static const struct virtio_dma_buf_ops virtgpu_dmabuf_ops =  {
> @@ -83,7 +115,7 @@ static const struct virtio_dma_buf_ops virtgpu_dmabuf_ops =  {
>  		.vmap = drm_gem_dmabuf_vmap,
>  		.vunmap = drm_gem_dmabuf_vunmap,
>  	},
> -	.device_attach = drm_gem_map_attach,
> +	.device_attach = virtgpu_gem_device_attach,
>  	.get_uuid = virtgpu_virtio_get_uuid,
>  };
>  
> -- 
> 2.34.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 1/1] drm/virtio: Implement device_attach
@ 2024-01-30 11:10   ` Daniel Vetter
  0 siblings, 0 replies; 15+ messages in thread
From: Daniel Vetter @ 2024-01-30 11:10 UTC (permalink / raw)
  To: Julia Zhang
  Cc: Gurchetan Singh, Chia-I Wu, David Airlie, Gerd Hoffmann,
	linux-kernel, dri-devel, amd-gfx, virtualization, Alex Deucher,
	Christian König, Daniel Vetter, David Airlie, Erik Faye-Lund,
	Marek Olšák, Pierre-Eric Pelloux-Prayer, Honglei Huang,
	Chen Jiqian, Huang Rui

On Mon, Jan 29, 2024 at 06:31:19PM +0800, Julia Zhang wrote:
> As vram objects don't have backing pages and thus can't implement
> drm_gem_object_funcs.get_sg_table callback. This removes drm dma-buf
> callbacks in virtgpu_gem_map_dma_buf()/virtgpu_gem_unmap_dma_buf()
> and implement virtgpu specific map/unmap/attach callbacks to support
> both of shmem objects and vram objects.
> 
> Signed-off-by: Julia Zhang <julia.zhang@amd.com>
> ---
>  drivers/gpu/drm/virtio/virtgpu_prime.c | 40 +++++++++++++++++++++++---
>  1 file changed, 36 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/virtio/virtgpu_prime.c b/drivers/gpu/drm/virtio/virtgpu_prime.c
> index 44425f20d91a..b490a5343b06 100644
> --- a/drivers/gpu/drm/virtio/virtgpu_prime.c
> +++ b/drivers/gpu/drm/virtio/virtgpu_prime.c
> @@ -49,11 +49,26 @@ virtgpu_gem_map_dma_buf(struct dma_buf_attachment *attach,
>  {
>  	struct drm_gem_object *obj = attach->dmabuf->priv;
>  	struct virtio_gpu_object *bo = gem_to_virtio_gpu_obj(obj);
> +	struct sg_table *sgt;
> +	int ret;
>  
>  	if (virtio_gpu_is_vram(bo))
>  		return virtio_gpu_vram_map_dma_buf(bo, attach->dev, dir);
>  
> -	return drm_gem_map_dma_buf(attach, dir);
> +	sgt = drm_prime_pages_to_sg(obj->dev,
> +				    to_drm_gem_shmem_obj(obj)->pages,
> +				    obj->size >> PAGE_SHIFT);
> +	if (IS_ERR(sgt))
> +		return sgt;
> +
> +	ret = dma_map_sgtable(attach->dev, sgt, dir, DMA_ATTR_SKIP_CPU_SYNC);
> +	if (ret) {
> +		sg_free_table(sgt);
> +		kfree(sgt);
> +		return ERR_PTR(ret);
> +	}
> +
> +	return sgt;
>  }
>  
>  static void virtgpu_gem_unmap_dma_buf(struct dma_buf_attachment *attach,
> @@ -63,12 +78,29 @@ static void virtgpu_gem_unmap_dma_buf(struct dma_buf_attachment *attach,
>  	struct drm_gem_object *obj = attach->dmabuf->priv;
>  	struct virtio_gpu_object *bo = gem_to_virtio_gpu_obj(obj);
>  
> +	if (!sgt)
> +		return;
> +
>  	if (virtio_gpu_is_vram(bo)) {
>  		virtio_gpu_vram_unmap_dma_buf(attach->dev, sgt, dir);
> -		return;
> +	} else {
> +		dma_unmap_sgtable(attach->dev, sgt, dir, DMA_ATTR_SKIP_CPU_SYNC);
> +		sg_free_table(sgt);
> +		kfree(sgt);
>  	}
> +}
> +
> +static int virtgpu_gem_device_attach(struct dma_buf *dma_buf,
> +				     struct dma_buf_attachment *attach)
> +{
> +	struct drm_gem_object *obj = attach->dmabuf->priv;
> +	struct virtio_gpu_object *bo = gem_to_virtio_gpu_obj(obj);
> +	int ret = 0;
> +
> +	if (!virtio_gpu_is_vram(bo) && obj->funcs->pin)
> +		ret = obj->funcs->pin(obj);
>  
> -	drm_gem_unmap_dma_buf(attach, sgt, dir);
> +	return ret;

This doesn't look like what I've expected. There should be no need to
change the map/unmap functions, especially not for the usual gem bo case.
We should definitely keep using the exact same code for that. Instead all
I expected is roughly

virtgpu_gem_device_attach()
{
	if (virtio_gpu_is_vram(bo)) {
		if (can_access_virtio_vram_directly(attach->dev)
			return 0;
		else
			return -EBUSY;
	} else {
		return drm_gem_map_attach();
	}
}

Note that I think can_access_virtio_vram_directly() needs to be
implemented first. I'm not even sure it's possible, might be that all the
importers need to set the attachment->peer2peer flag. Which is why this
thing exists really. But that's a pile more work to do.

Frankly the more I look at the original patch that added vram export
support the more this just looks like a "pls revert, this is just too
broken".

We should definitely not open-code any functions for the gem_bo export
case, which your patch seems to do? Or maybe I'm just extremely confused.
-Sima

>  
>  static const struct virtio_dma_buf_ops virtgpu_dmabuf_ops =  {
> @@ -83,7 +115,7 @@ static const struct virtio_dma_buf_ops virtgpu_dmabuf_ops =  {
>  		.vmap = drm_gem_dmabuf_vmap,
>  		.vunmap = drm_gem_dmabuf_vunmap,
>  	},
> -	.device_attach = drm_gem_map_attach,
> +	.device_attach = virtgpu_gem_device_attach,
>  	.get_uuid = virtgpu_virtio_get_uuid,
>  };
>  
> -- 
> 2.34.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 1/1] drm/virtio: Implement device_attach
@ 2024-01-30 11:10   ` Daniel Vetter
  0 siblings, 0 replies; 15+ messages in thread
From: Daniel Vetter @ 2024-01-30 11:10 UTC (permalink / raw)
  To: Julia Zhang
  Cc: Pierre-Eric Pelloux-Prayer, amd-gfx, Chen Jiqian,
	Christian König, Marek Olšák, Erik Faye-Lund,
	linux-kernel, dri-devel, Gurchetan Singh, Huang Rui,
	Gerd Hoffmann, Daniel Vetter, Honglei Huang, Alex Deucher,
	David Airlie, virtualization, David Airlie

On Mon, Jan 29, 2024 at 06:31:19PM +0800, Julia Zhang wrote:
> As vram objects don't have backing pages and thus can't implement
> drm_gem_object_funcs.get_sg_table callback. This removes drm dma-buf
> callbacks in virtgpu_gem_map_dma_buf()/virtgpu_gem_unmap_dma_buf()
> and implement virtgpu specific map/unmap/attach callbacks to support
> both of shmem objects and vram objects.
> 
> Signed-off-by: Julia Zhang <julia.zhang@amd.com>
> ---
>  drivers/gpu/drm/virtio/virtgpu_prime.c | 40 +++++++++++++++++++++++---
>  1 file changed, 36 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/virtio/virtgpu_prime.c b/drivers/gpu/drm/virtio/virtgpu_prime.c
> index 44425f20d91a..b490a5343b06 100644
> --- a/drivers/gpu/drm/virtio/virtgpu_prime.c
> +++ b/drivers/gpu/drm/virtio/virtgpu_prime.c
> @@ -49,11 +49,26 @@ virtgpu_gem_map_dma_buf(struct dma_buf_attachment *attach,
>  {
>  	struct drm_gem_object *obj = attach->dmabuf->priv;
>  	struct virtio_gpu_object *bo = gem_to_virtio_gpu_obj(obj);
> +	struct sg_table *sgt;
> +	int ret;
>  
>  	if (virtio_gpu_is_vram(bo))
>  		return virtio_gpu_vram_map_dma_buf(bo, attach->dev, dir);
>  
> -	return drm_gem_map_dma_buf(attach, dir);
> +	sgt = drm_prime_pages_to_sg(obj->dev,
> +				    to_drm_gem_shmem_obj(obj)->pages,
> +				    obj->size >> PAGE_SHIFT);
> +	if (IS_ERR(sgt))
> +		return sgt;
> +
> +	ret = dma_map_sgtable(attach->dev, sgt, dir, DMA_ATTR_SKIP_CPU_SYNC);
> +	if (ret) {
> +		sg_free_table(sgt);
> +		kfree(sgt);
> +		return ERR_PTR(ret);
> +	}
> +
> +	return sgt;
>  }
>  
>  static void virtgpu_gem_unmap_dma_buf(struct dma_buf_attachment *attach,
> @@ -63,12 +78,29 @@ static void virtgpu_gem_unmap_dma_buf(struct dma_buf_attachment *attach,
>  	struct drm_gem_object *obj = attach->dmabuf->priv;
>  	struct virtio_gpu_object *bo = gem_to_virtio_gpu_obj(obj);
>  
> +	if (!sgt)
> +		return;
> +
>  	if (virtio_gpu_is_vram(bo)) {
>  		virtio_gpu_vram_unmap_dma_buf(attach->dev, sgt, dir);
> -		return;
> +	} else {
> +		dma_unmap_sgtable(attach->dev, sgt, dir, DMA_ATTR_SKIP_CPU_SYNC);
> +		sg_free_table(sgt);
> +		kfree(sgt);
>  	}
> +}
> +
> +static int virtgpu_gem_device_attach(struct dma_buf *dma_buf,
> +				     struct dma_buf_attachment *attach)
> +{
> +	struct drm_gem_object *obj = attach->dmabuf->priv;
> +	struct virtio_gpu_object *bo = gem_to_virtio_gpu_obj(obj);
> +	int ret = 0;
> +
> +	if (!virtio_gpu_is_vram(bo) && obj->funcs->pin)
> +		ret = obj->funcs->pin(obj);
>  
> -	drm_gem_unmap_dma_buf(attach, sgt, dir);
> +	return ret;

This doesn't look like what I've expected. There should be no need to
change the map/unmap functions, especially not for the usual gem bo case.
We should definitely keep using the exact same code for that. Instead all
I expected is roughly

virtgpu_gem_device_attach()
{
	if (virtio_gpu_is_vram(bo)) {
		if (can_access_virtio_vram_directly(attach->dev)
			return 0;
		else
			return -EBUSY;
	} else {
		return drm_gem_map_attach();
	}
}

Note that I think can_access_virtio_vram_directly() needs to be
implemented first. I'm not even sure it's possible, might be that all the
importers need to set the attachment->peer2peer flag. Which is why this
thing exists really. But that's a pile more work to do.

Frankly the more I look at the original patch that added vram export
support the more this just looks like a "pls revert, this is just too
broken".

We should definitely not open-code any functions for the gem_bo export
case, which your patch seems to do? Or maybe I'm just extremely confused.
-Sima

>  
>  static const struct virtio_dma_buf_ops virtgpu_dmabuf_ops =  {
> @@ -83,7 +115,7 @@ static const struct virtio_dma_buf_ops virtgpu_dmabuf_ops =  {
>  		.vmap = drm_gem_dmabuf_vmap,
>  		.vunmap = drm_gem_dmabuf_vunmap,
>  	},
> -	.device_attach = drm_gem_map_attach,
> +	.device_attach = virtgpu_gem_device_attach,
>  	.get_uuid = virtgpu_virtio_get_uuid,
>  };
>  
> -- 
> 2.34.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 1/1] drm/virtio: Implement device_attach
  2024-01-30 11:10   ` Daniel Vetter
  (?)
  (?)
@ 2024-01-30 11:16   ` Daniel Vetter
  2024-01-30 14:23     ` Christian König
  -1 siblings, 1 reply; 15+ messages in thread
From: Daniel Vetter @ 2024-01-30 11:16 UTC (permalink / raw)
  To: Julia Zhang, Gurchetan Singh, Chia-I Wu, David Airlie,
	Gerd Hoffmann, linux-kernel, dri-devel, amd-gfx, virtualization,
	Alex Deucher, Christian König, David Airlie, Erik Faye-Lund,
	Marek Olšák, Pierre-Eric Pelloux-Prayer, Honglei Huang,
	Chen Jiqian, Huang Rui, David Stevens

On Tue, Jan 30, 2024 at 12:10:31PM +0100, Daniel Vetter wrote:
> On Mon, Jan 29, 2024 at 06:31:19PM +0800, Julia Zhang wrote:
> > As vram objects don't have backing pages and thus can't implement
> > drm_gem_object_funcs.get_sg_table callback. This removes drm dma-buf
> > callbacks in virtgpu_gem_map_dma_buf()/virtgpu_gem_unmap_dma_buf()
> > and implement virtgpu specific map/unmap/attach callbacks to support
> > both of shmem objects and vram objects.
> > 
> > Signed-off-by: Julia Zhang <julia.zhang@amd.com>
> > ---
> >  drivers/gpu/drm/virtio/virtgpu_prime.c | 40 +++++++++++++++++++++++---
> >  1 file changed, 36 insertions(+), 4 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/virtio/virtgpu_prime.c b/drivers/gpu/drm/virtio/virtgpu_prime.c
> > index 44425f20d91a..b490a5343b06 100644
> > --- a/drivers/gpu/drm/virtio/virtgpu_prime.c
> > +++ b/drivers/gpu/drm/virtio/virtgpu_prime.c
> > @@ -49,11 +49,26 @@ virtgpu_gem_map_dma_buf(struct dma_buf_attachment *attach,
> >  {
> >  	struct drm_gem_object *obj = attach->dmabuf->priv;
> >  	struct virtio_gpu_object *bo = gem_to_virtio_gpu_obj(obj);
> > +	struct sg_table *sgt;
> > +	int ret;
> >  
> >  	if (virtio_gpu_is_vram(bo))
> >  		return virtio_gpu_vram_map_dma_buf(bo, attach->dev, dir);
> >  
> > -	return drm_gem_map_dma_buf(attach, dir);
> > +	sgt = drm_prime_pages_to_sg(obj->dev,
> > +				    to_drm_gem_shmem_obj(obj)->pages,
> > +				    obj->size >> PAGE_SHIFT);
> > +	if (IS_ERR(sgt))
> > +		return sgt;
> > +
> > +	ret = dma_map_sgtable(attach->dev, sgt, dir, DMA_ATTR_SKIP_CPU_SYNC);
> > +	if (ret) {
> > +		sg_free_table(sgt);
> > +		kfree(sgt);
> > +		return ERR_PTR(ret);
> > +	}
> > +
> > +	return sgt;
> >  }
> >  
> >  static void virtgpu_gem_unmap_dma_buf(struct dma_buf_attachment *attach,
> > @@ -63,12 +78,29 @@ static void virtgpu_gem_unmap_dma_buf(struct dma_buf_attachment *attach,
> >  	struct drm_gem_object *obj = attach->dmabuf->priv;
> >  	struct virtio_gpu_object *bo = gem_to_virtio_gpu_obj(obj);
> >  
> > +	if (!sgt)
> > +		return;
> > +
> >  	if (virtio_gpu_is_vram(bo)) {
> >  		virtio_gpu_vram_unmap_dma_buf(attach->dev, sgt, dir);
> > -		return;
> > +	} else {
> > +		dma_unmap_sgtable(attach->dev, sgt, dir, DMA_ATTR_SKIP_CPU_SYNC);
> > +		sg_free_table(sgt);
> > +		kfree(sgt);
> >  	}
> > +}
> > +
> > +static int virtgpu_gem_device_attach(struct dma_buf *dma_buf,
> > +				     struct dma_buf_attachment *attach)
> > +{
> > +	struct drm_gem_object *obj = attach->dmabuf->priv;
> > +	struct virtio_gpu_object *bo = gem_to_virtio_gpu_obj(obj);
> > +	int ret = 0;
> > +
> > +	if (!virtio_gpu_is_vram(bo) && obj->funcs->pin)
> > +		ret = obj->funcs->pin(obj);
> >  
> > -	drm_gem_unmap_dma_buf(attach, sgt, dir);
> > +	return ret;
> 
> This doesn't look like what I've expected. There should be no need to
> change the map/unmap functions, especially not for the usual gem bo case.
> We should definitely keep using the exact same code for that. Instead all
> I expected is roughly
> 
> virtgpu_gem_device_attach()
> {
> 	if (virtio_gpu_is_vram(bo)) {
> 		if (can_access_virtio_vram_directly(attach->dev)
> 			return 0;
> 		else
> 			return -EBUSY;
> 	} else {
> 		return drm_gem_map_attach();
> 	}
> }
> 
> Note that I think can_access_virtio_vram_directly() needs to be
> implemented first. I'm not even sure it's possible, might be that all the
> importers need to set the attachment->peer2peer flag. Which is why this
> thing exists really. But that's a pile more work to do.
> 
> Frankly the more I look at the original patch that added vram export
> support the more this just looks like a "pls revert, this is just too
> broken".

The commit I mean is this one: ea5ea3d8a117 ("drm/virtio: support mapping
exported vram"). The commit message definitely needs to cite that one, and
also needs a cc: stable because not rejecting invalid imports is a pretty
big deal.

Also adding David.
-Sima

> 
> We should definitely not open-code any functions for the gem_bo export
> case, which your patch seems to do? Or maybe I'm just extremely confused.
> -Sima
> 
> >  
> >  static const struct virtio_dma_buf_ops virtgpu_dmabuf_ops =  {
> > @@ -83,7 +115,7 @@ static const struct virtio_dma_buf_ops virtgpu_dmabuf_ops =  {
> >  		.vmap = drm_gem_dmabuf_vmap,
> >  		.vunmap = drm_gem_dmabuf_vunmap,
> >  	},
> > -	.device_attach = drm_gem_map_attach,
> > +	.device_attach = virtgpu_gem_device_attach,
> >  	.get_uuid = virtgpu_virtio_get_uuid,
> >  };
> >  
> > -- 
> > 2.34.1
> > 
> 
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 1/1] drm/virtio: Implement device_attach
  2024-01-30 11:16   ` Daniel Vetter
@ 2024-01-30 14:23     ` Christian König
  2024-01-31 10:20       ` Zhang, Julia
  0 siblings, 1 reply; 15+ messages in thread
From: Christian König @ 2024-01-30 14:23 UTC (permalink / raw)
  To: Julia Zhang, Gurchetan Singh, Chia-I Wu, David Airlie,
	Gerd Hoffmann, linux-kernel, dri-devel, amd-gfx, virtualization,
	Alex Deucher, David Airlie, Erik Faye-Lund, Marek Olšák,
	Pierre-Eric Pelloux-Prayer, Honglei Huang, Chen Jiqian, Huang Rui,
	David Stevens

Am 30.01.24 um 12:16 schrieb Daniel Vetter:
> On Tue, Jan 30, 2024 at 12:10:31PM +0100, Daniel Vetter wrote:
>> On Mon, Jan 29, 2024 at 06:31:19PM +0800, Julia Zhang wrote:
>>> As vram objects don't have backing pages and thus can't implement
>>> drm_gem_object_funcs.get_sg_table callback. This removes drm dma-buf
>>> callbacks in virtgpu_gem_map_dma_buf()/virtgpu_gem_unmap_dma_buf()
>>> and implement virtgpu specific map/unmap/attach callbacks to support
>>> both of shmem objects and vram objects.
>>>
>>> Signed-off-by: Julia Zhang <julia.zhang@amd.com>
>>> ---
>>>   drivers/gpu/drm/virtio/virtgpu_prime.c | 40 +++++++++++++++++++++++---
>>>   1 file changed, 36 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/virtio/virtgpu_prime.c b/drivers/gpu/drm/virtio/virtgpu_prime.c
>>> index 44425f20d91a..b490a5343b06 100644
>>> --- a/drivers/gpu/drm/virtio/virtgpu_prime.c
>>> +++ b/drivers/gpu/drm/virtio/virtgpu_prime.c
>>> @@ -49,11 +49,26 @@ virtgpu_gem_map_dma_buf(struct dma_buf_attachment *attach,
>>>   {
>>>   	struct drm_gem_object *obj = attach->dmabuf->priv;
>>>   	struct virtio_gpu_object *bo = gem_to_virtio_gpu_obj(obj);
>>> +	struct sg_table *sgt;
>>> +	int ret;
>>>   
>>>   	if (virtio_gpu_is_vram(bo))
>>>   		return virtio_gpu_vram_map_dma_buf(bo, attach->dev, dir);
>>>   
>>> -	return drm_gem_map_dma_buf(attach, dir);
>>> +	sgt = drm_prime_pages_to_sg(obj->dev,
>>> +				    to_drm_gem_shmem_obj(obj)->pages,
>>> +				    obj->size >> PAGE_SHIFT);
>>> +	if (IS_ERR(sgt))
>>> +		return sgt;
>>> +
>>> +	ret = dma_map_sgtable(attach->dev, sgt, dir, DMA_ATTR_SKIP_CPU_SYNC);
>>> +	if (ret) {
>>> +		sg_free_table(sgt);
>>> +		kfree(sgt);
>>> +		return ERR_PTR(ret);
>>> +	}
>>> +
>>> +	return sgt;
>>>   }
>>>   
>>>   static void virtgpu_gem_unmap_dma_buf(struct dma_buf_attachment *attach,
>>> @@ -63,12 +78,29 @@ static void virtgpu_gem_unmap_dma_buf(struct dma_buf_attachment *attach,
>>>   	struct drm_gem_object *obj = attach->dmabuf->priv;
>>>   	struct virtio_gpu_object *bo = gem_to_virtio_gpu_obj(obj);
>>>   
>>> +	if (!sgt)
>>> +		return;
>>> +
>>>   	if (virtio_gpu_is_vram(bo)) {
>>>   		virtio_gpu_vram_unmap_dma_buf(attach->dev, sgt, dir);
>>> -		return;
>>> +	} else {
>>> +		dma_unmap_sgtable(attach->dev, sgt, dir, DMA_ATTR_SKIP_CPU_SYNC);
>>> +		sg_free_table(sgt);
>>> +		kfree(sgt);
>>>   	}
>>> +}
>>> +
>>> +static int virtgpu_gem_device_attach(struct dma_buf *dma_buf,
>>> +				     struct dma_buf_attachment *attach)
>>> +{
>>> +	struct drm_gem_object *obj = attach->dmabuf->priv;
>>> +	struct virtio_gpu_object *bo = gem_to_virtio_gpu_obj(obj);
>>> +	int ret = 0;
>>> +
>>> +	if (!virtio_gpu_is_vram(bo) && obj->funcs->pin)
>>> +		ret = obj->funcs->pin(obj);
>>>   
>>> -	drm_gem_unmap_dma_buf(attach, sgt, dir);
>>> +	return ret;
>> This doesn't look like what I've expected. There should be no need to
>> change the map/unmap functions, especially not for the usual gem bo case.
>> We should definitely keep using the exact same code for that. Instead all
>> I expected is roughly
>>
>> virtgpu_gem_device_attach()
>> {
>> 	if (virtio_gpu_is_vram(bo)) {
>> 		if (can_access_virtio_vram_directly(attach->dev)
>> 			return 0;
>> 		else
>> 			return -EBUSY;
>> 	} else {
>> 		return drm_gem_map_attach();
>> 	}
>> }
>>
>> Note that I think can_access_virtio_vram_directly() needs to be
>> implemented first. I'm not even sure it's possible, might be that all the
>> importers need to set the attachment->peer2peer flag. Which is why this
>> thing exists really. But that's a pile more work to do.

Yeah, that is really just speculative. All importers need to set the 
peer2peer flag just in case.

What happens under the hood is that IOMMU redirects the "VRAM" memory 
access to whatever address the DMA-buf on the host is pointing to 
(system, VRAM, doorbell, IOMMU, whatever).

I'm also not 100% sure if all the cache snooping is done correctly in 
all cases, but for now it seems to work.

>>
>> Frankly the more I look at the original patch that added vram export
>> support the more this just looks like a "pls revert, this is just too
>> broken".
> The commit I mean is this one: ea5ea3d8a117 ("drm/virtio: support mapping
> exported vram"). The commit message definitely needs to cite that one, and
> also needs a cc: stable because not rejecting invalid imports is a pretty
> big deal.

Yeah, I've pointed out that commit in an internal discussion as well. I 
was just not aware that it's that severely broken.

Regards,
Christian.

>
> Also adding David.
> -Sima
>
>> We should definitely not open-code any functions for the gem_bo export
>> case, which your patch seems to do? Or maybe I'm just extremely confused.
>> -Sima
>>
>>>   
>>>   static const struct virtio_dma_buf_ops virtgpu_dmabuf_ops =  {
>>> @@ -83,7 +115,7 @@ static const struct virtio_dma_buf_ops virtgpu_dmabuf_ops =  {
>>>   		.vmap = drm_gem_dmabuf_vmap,
>>>   		.vunmap = drm_gem_dmabuf_vunmap,
>>>   	},
>>> -	.device_attach = drm_gem_map_attach,
>>> +	.device_attach = virtgpu_gem_device_attach,
>>>   	.get_uuid = virtgpu_virtio_get_uuid,
>>>   };
>>>   
>>> -- 
>>> 2.34.1
>>>
>> -- 
>> Daniel Vetter
>> Software Engineer, Intel Corporation
>> http://blog.ffwll.ch


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 1/1] drm/virtio: Implement device_attach
  2024-01-30 14:23     ` Christian König
@ 2024-01-31 10:20       ` Zhang, Julia
  2024-01-31 14:32         ` Christian König
  0 siblings, 1 reply; 15+ messages in thread
From: Zhang, Julia @ 2024-01-31 10:20 UTC (permalink / raw)
  To: Koenig, Christian, Zhang, Julia, Gurchetan Singh, Chia-I Wu,
	David Airlie, Gerd Hoffmann, linux-kernel@vger.kernel.org,
	dri-devel@lists.freedesktop.org, amd-gfx@lists.freedesktop.org,
	virtualization@lists.linux-foundation.org, Deucher, Alexander,
	David Airlie, Erik Faye-Lund, Olsak, Marek,
	Pelloux-Prayer,  Pierre-Eric, Huang, Honglei1, Chen, Jiqian,
	Huang, Ray, David Stevens



On 2024/1/30 22:23, Christian König wrote:
> Am 30.01.24 um 12:16 schrieb Daniel Vetter:
>> On Tue, Jan 30, 2024 at 12:10:31PM +0100, Daniel Vetter wrote:
>>> On Mon, Jan 29, 2024 at 06:31:19PM +0800, Julia Zhang wrote:
>>>> As vram objects don't have backing pages and thus can't implement
>>>> drm_gem_object_funcs.get_sg_table callback. This removes drm dma-buf
>>>> callbacks in virtgpu_gem_map_dma_buf()/virtgpu_gem_unmap_dma_buf()
>>>> and implement virtgpu specific map/unmap/attach callbacks to support
>>>> both of shmem objects and vram objects.
>>>>
>>>> Signed-off-by: Julia Zhang <julia.zhang@amd.com>
>>>> ---
>>>>   drivers/gpu/drm/virtio/virtgpu_prime.c | 40 +++++++++++++++++++++++---
>>>>   1 file changed, 36 insertions(+), 4 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/virtio/virtgpu_prime.c b/drivers/gpu/drm/virtio/virtgpu_prime.c
>>>> index 44425f20d91a..b490a5343b06 100644
>>>> --- a/drivers/gpu/drm/virtio/virtgpu_prime.c
>>>> +++ b/drivers/gpu/drm/virtio/virtgpu_prime.c
>>>> @@ -49,11 +49,26 @@ virtgpu_gem_map_dma_buf(struct dma_buf_attachment *attach,
>>>>   {
>>>>       struct drm_gem_object *obj = attach->dmabuf->priv;
>>>>       struct virtio_gpu_object *bo = gem_to_virtio_gpu_obj(obj);
>>>> +    struct sg_table *sgt;
>>>> +    int ret;
>>>>         if (virtio_gpu_is_vram(bo))
>>>>           return virtio_gpu_vram_map_dma_buf(bo, attach->dev, dir);
>>>>   -    return drm_gem_map_dma_buf(attach, dir);
>>>> +    sgt = drm_prime_pages_to_sg(obj->dev,
>>>> +                    to_drm_gem_shmem_obj(obj)->pages,
>>>> +                    obj->size >> PAGE_SHIFT);
>>>> +    if (IS_ERR(sgt))
>>>> +        return sgt;
>>>> +
>>>> +    ret = dma_map_sgtable(attach->dev, sgt, dir, DMA_ATTR_SKIP_CPU_SYNC);
>>>> +    if (ret) {
>>>> +        sg_free_table(sgt);
>>>> +        kfree(sgt);
>>>> +        return ERR_PTR(ret);
>>>> +    }
>>>> +
>>>> +    return sgt;
>>>>   }
>>>>     static void virtgpu_gem_unmap_dma_buf(struct dma_buf_attachment *attach,
>>>> @@ -63,12 +78,29 @@ static void virtgpu_gem_unmap_dma_buf(struct dma_buf_attachment *attach,
>>>>       struct drm_gem_object *obj = attach->dmabuf->priv;
>>>>       struct virtio_gpu_object *bo = gem_to_virtio_gpu_obj(obj);
>>>>   +    if (!sgt)
>>>> +        return;
>>>> +
>>>>       if (virtio_gpu_is_vram(bo)) {
>>>>           virtio_gpu_vram_unmap_dma_buf(attach->dev, sgt, dir);
>>>> -        return;
>>>> +    } else {
>>>> +        dma_unmap_sgtable(attach->dev, sgt, dir, DMA_ATTR_SKIP_CPU_SYNC);
>>>> +        sg_free_table(sgt);
>>>> +        kfree(sgt);
>>>>       }
>>>> +}
>>>> +
>>>> +static int virtgpu_gem_device_attach(struct dma_buf *dma_buf,
>>>> +                     struct dma_buf_attachment *attach)
>>>> +{
>>>> +    struct drm_gem_object *obj = attach->dmabuf->priv;
>>>> +    struct virtio_gpu_object *bo = gem_to_virtio_gpu_obj(obj);
>>>> +    int ret = 0;
>>>> +
>>>> +    if (!virtio_gpu_is_vram(bo) && obj->funcs->pin)
>>>> +        ret = obj->funcs->pin(obj);
>>>>   -    drm_gem_unmap_dma_buf(attach, sgt, dir);
>>>> +    return ret;
>>> This doesn't look like what I've expected. There should be no need to
>>> change the map/unmap functions, especially not for the usual gem bo case.
>>> We should definitely keep using the exact same code for that. Instead all
>>> I expected is roughly
>>>
>>> virtgpu_gem_device_attach()
>>> {
>>>     if (virtio_gpu_is_vram(bo)) {
>>>         if (can_access_virtio_vram_directly(attach->dev)
>>>             return 0;
>>>         else
>>>             return -EBUSY;
>>>     } else {
>>>         return drm_gem_map_attach();
>>>     }
>>> }
>>>
>>> Note that I think can_access_virtio_vram_directly() needs to be
>>> implemented first. I'm not even sure it's possible, might be that all the
>>> importers need to set the attachment->peer2peer flag. Which is why this
>>> thing exists really. But that's a pile more work to do.
> 
Hi Sima, Christian,

> Yeah, that is really just speculative. All importers need to set the peer2peer flag just in case.

I see, I will modify this.

> 
> What happens under the hood is that IOMMU redirects the "VRAM" memory access to whatever address the DMA-buf on the host is pointing to (system, VRAM, doorbell, IOMMU, whatever).
> 
> I'm also not 100% sure if all the cache snooping is done correctly in all cases, but for now it seems to work.
>>>>
>>> Frankly the more I look at the original patch that added vram export
>>> support the more this just looks like a "pls revert, this is just too
>>> broken".
>> The commit I mean is this one: ea5ea3d8a117 ("drm/virtio: support mapping
>> exported vram"). The commit message definitely needs to cite that one, and
>> also needs a cc: stable because not rejecting invalid imports is a pretty
>> big deal.
> 
> Yeah, I've pointed out that commit in an internal discussion as well. I was just not aware that it's that severely broken.
> 

Yeah we have mentioned this patch before, but I don't totally understand why this is too broken. Without exporting vram objects, dGPU prime feature would not be realized.
Would you mind to explain more about it. Thanks!

Best regards,
Julia

> Regards,
> Christian.
> 
>>
>> Also adding David.
>> -Sima
>>
>>> We should definitely not open-code any functions for the gem_bo export
>>> case, which your patch seems to do? Or maybe I'm just extremely confused.
>>> -Sima
>>>
>>>>     static const struct virtio_dma_buf_ops virtgpu_dmabuf_ops =  {
>>>> @@ -83,7 +115,7 @@ static const struct virtio_dma_buf_ops virtgpu_dmabuf_ops =  {
>>>>           .vmap = drm_gem_dmabuf_vmap,
>>>>           .vunmap = drm_gem_dmabuf_vunmap,
>>>>       },
>>>> -    .device_attach = drm_gem_map_attach,
>>>> +    .device_attach = virtgpu_gem_device_attach,
>>>>       .get_uuid = virtgpu_virtio_get_uuid,
>>>>   };
>>>>   -- 
>>>> 2.34.1
>>>>
>>> -- 
>>> Daniel Vetter
>>> Software Engineer, Intel Corporation
>>> http://blog.ffwll.ch
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 1/1] drm/virtio: Implement device_attach
  2024-01-31 10:20       ` Zhang, Julia
@ 2024-01-31 14:32         ` Christian König
  2024-11-29  7:51           ` Zhang, Julia
  0 siblings, 1 reply; 15+ messages in thread
From: Christian König @ 2024-01-31 14:32 UTC (permalink / raw)
  To: Zhang, Julia, Gurchetan Singh, Chia-I Wu, David Airlie,
	Gerd Hoffmann, linux-kernel@vger.kernel.org,
	dri-devel@lists.freedesktop.org, amd-gfx@lists.freedesktop.org,
	virtualization@lists.linux-foundation.org, Deucher, Alexander,
	David Airlie, Erik Faye-Lund, Olsak, Marek,
	Pelloux-Prayer, Pierre-Eric, Huang, Honglei1, Chen, Jiqian,
	Huang, Ray, David Stevens

[-- Attachment #1: Type: text/plain, Size: 1930 bytes --]

Am 31.01.24 um 11:20 schrieb Zhang, Julia:
> On 2024/1/30 22:23, Christian König wrote:
>> Am 30.01.24 um 12:16 schrieb Daniel Vetter:
>>> On Tue, Jan 30, 2024 at 12:10:31PM +0100, Daniel Vetter wrote:
>>>> [SNIP]
> Hi Sima, Christian,
>
>> Yeah, that is really just speculative. All importers need to set the peer2peer flag just in case.
> I see, I will modify this.
>
>> What happens under the hood is that IOMMU redirects the "VRAM" memory access to whatever address the DMA-buf on the host is pointing to (system, VRAM, doorbell, IOMMU, whatever).
>>
>> I'm also not 100% sure if all the cache snooping is done correctly in all cases, but for now it seems to work.
>>>> Frankly the more I look at the original patch that added vram export
>>>> support the more this just looks like a "pls revert, this is just too
>>>> broken".
>>> The commit I mean is this one: ea5ea3d8a117 ("drm/virtio: support mapping
>>> exported vram"). The commit message definitely needs to cite that one, and
>>> also needs a cc: stable because not rejecting invalid imports is a pretty
>>> big deal.
>> Yeah, I've pointed out that commit in an internal discussion as well. I was just not aware that it's that severely broken.
>>
> Yeah we have mentioned this patch before, but I don't totally understand why this is too broken. Without exporting vram objects, dGPU prime feature would not be realized.
> Would you mind to explain more about it. Thanks!

One reason is that using sg tables without struct pages is actually a 
hack we came up with because we couldn't hope to clean up the sg table 
structure any time soon to not include struct page pointers.

Another reason is that using this with devices which don't expect a DMA 
address pointing into a virtual PCI BAR. So doing this without checking 
the peer2peer flag can most likely cause quite a bit of trouble.

Regards,
Christian.

>
> Best regards,
> Julia
>
>> Regards,
>> Christian.
>>

[-- Attachment #2: Type: text/html, Size: 3723 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 1/1] drm/virtio: Implement device_attach
  2024-01-31 14:32         ` Christian König
@ 2024-11-29  7:51           ` Zhang, Julia
  2024-12-04  3:46             ` Zhang, Julia
  0 siblings, 1 reply; 15+ messages in thread
From: Zhang, Julia @ 2024-11-29  7:51 UTC (permalink / raw)
  To: Koenig, Christian, Zhang, Julia, Gurchetan Singh, Chia-I Wu,
	David Airlie, Gerd Hoffmann, linux-kernel@vger.kernel.org,
	dri-devel@lists.freedesktop.org, amd-gfx@lists.freedesktop.org,
	virtualization@lists.linux-foundation.org, Deucher, Alexander,
	David Airlie, Erik Faye-Lund, Olsak, Marek,
	Pelloux-Prayer,  Pierre-Eric, Huang, Honglei1, Chen, Jiqian,
	Huang, Ray, David Stevens
  Cc: Huang, Ray, Zhu, Lingshan, robdclark@chromium.org

[-- Attachment #1: Type: text/plain, Size: 3508 bytes --]

Hi all,

Sorry for my late reply. I don't know if you still remember this thread, let me give a quick summary:

  1.  We want to implement the dGPU prime feature in guest VM. But we encountered this issue: virtio-gpu doesn’t have ->get_sg_table implemented which is required by drm_gem_map_attach(). This is modified by: 207395da5a97 (“drm/prime: reject DMA-BUF attach when get_sg_table is missing”).
  2.  To fix this, I override the function virtgpu_gem_device_attach() to not call drm_gem_map_attach() for vram object so drm_gem_map_attach() will not return -ENOSYS for not having ->get_sg_table.
  3.  Then you think this is incorrect and drm_gem_map_attach() requires get_sg_table to be implemented is intentional. I should either implement ->attach or ->get_sg_table for virtio-gpu.
  4.  As discussed, I implemented ->attach for virtio-gpu, but you suggested that I should check peer2peer flag first.
  5.  Now I have the implementation to get p2p_distance and check the p2p flag already, but I found that Rob Clark merged a patch to fix above patch: 207395da5a97 (“drm/prime: reject DMA-BUF attach when get_sg_table is missing”)
     *   Rob’s patch: https://patchwork.freedesktop.org/patch/584318/
  6.  With Rob’s patch, ->get_sg_table isn’t required for virtio-gpu anymore and  it seems p2p flag also doesn’t need to be checked anymore.

So I want to rediscuss if we still need to do p2p checking now?

If so, I will send out my implementation soon.

Best regards,

Julia


On 2024/1/31 22:32, Christian König wrote:
Am 31.01.24 um 11:20 schrieb Zhang, Julia:


On 2024/1/30 22:23, Christian König wrote:

Am 30.01.24 um 12:16 schrieb Daniel Vetter:

On Tue, Jan 30, 2024 at 12:10:31PM +0100, Daniel Vetter wrote:
[SNIP]

Hi Sima, Christian,



Yeah, that is really just speculative. All importers need to set the peer2peer flag just in case.

I see, I will modify this.



What happens under the hood is that IOMMU redirects the "VRAM" memory access to whatever address the DMA-buf on the host is pointing to (system, VRAM, doorbell, IOMMU, whatever).



I'm also not 100% sure if all the cache snooping is done correctly in all cases, but for now it seems to work.

Frankly the more I look at the original patch that added vram export

support the more this just looks like a "pls revert, this is just too

broken".

The commit I mean is this one: ea5ea3d8a117 ("drm/virtio: support mapping

exported vram"). The commit message definitely needs to cite that one, and

also needs a cc: stable because not rejecting invalid imports is a pretty

big deal.

Yeah, I've pointed out that commit in an internal discussion as well. I was just not aware that it's that severely broken.



Yeah we have mentioned this patch before, but I don't totally understand why this is too broken. Without exporting vram objects, dGPU prime feature would not be realized.

Would you mind to explain more about it. Thanks!

One reason is that using sg tables without struct pages is actually a hack we came up with because we couldn't hope to clean up the sg table structure any time soon to not include struct page pointers.

Another reason is that using this with devices which don't expect a DMA address pointing into a virtual PCI BAR. So doing this without checking the peer2peer flag can most likely cause quite a bit of trouble.

Regards,
Christian.



Best regards,

Julia



Regards,

Christian.



[-- Attachment #2: Type: text/html, Size: 10406 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* RE: [PATCH v2 1/1] drm/virtio: Implement device_attach
  2024-11-29  7:51           ` Zhang, Julia
@ 2024-12-04  3:46             ` Zhang, Julia
  2024-12-04  9:32               ` Christian König
  0 siblings, 1 reply; 15+ messages in thread
From: Zhang, Julia @ 2024-12-04  3:46 UTC (permalink / raw)
  To: Koenig, Christian, Gurchetan Singh, Chia-I Wu, David Airlie,
	Gerd Hoffmann, linux-kernel@vger.kernel.org,
	dri-devel@lists.freedesktop.org, amd-gfx@lists.freedesktop.org,
	virtualization@lists.linux-foundation.org, Deucher, Alexander,
	David Airlie, Erik Faye-Lund, Olsak, Marek, David Stevens,
	Daniel Vetter
  Cc: Huang, Ray, Zhu, Lingshan, robdclark@chromium.org,
	Pelloux-Prayer, Pierre-Eric, Huang, Honglei1, Chen, Jiqian

[-- Attachment #1: Type: text/plain, Size: 4691 bytes --]

[AMD Official Use Only - AMD Internal Distribution Only]

Hi Sima, Christian,

I would like to rediscuss about p2p in guest VM, can you please take a look. Thanks.

Best regards,
Julia

From: Zhang, Julia <Julia.Zhang@amd.com>
Sent: Friday, November 29, 2024 3:52 PM
To: Koenig, Christian <Christian.Koenig@amd.com>; Zhang, Julia <Julia.Zhang@amd.com>; Gurchetan Singh <gurchetansingh@chromium.org>; Chia-I Wu <olvaffe@gmail.com>; David Airlie <airlied@redhat.com>; Gerd Hoffmann <kraxel@redhat.com>; linux-kernel@vger.kernel.org; dri-devel@lists.freedesktop.org; amd-gfx@lists.freedesktop.org; virtualization@lists.linux-foundation.org; Deucher, Alexander <Alexander.Deucher@amd.com>; David Airlie <airlied@gmail.com>; Erik Faye-Lund <kusmabite@gmail.com>; Olsak, Marek <Marek.Olsak@amd.com>; Pelloux-Prayer, Pierre-Eric <Pierre-eric.Pelloux-prayer@amd.com>; Huang, Honglei1 <Honglei1.Huang@amd.com>; Chen, Jiqian <Jiqian.Chen@amd.com>; Huang, Ray <Ray.Huang@amd.com>; David Stevens <stevensd@chromium.org>
Cc: Huang, Ray <Ray.Huang@amd.com>; Zhu, Lingshan <Lingshan.Zhu@amd.com>; robdclark@chromium.org
Subject: Re: [PATCH v2 1/1] drm/virtio: Implement device_attach


Hi all,

Sorry for my late reply. I don't know if you still remember this thread, let me give a quick summary:

  1.  We want to implement the dGPU prime feature in guest VM. But we encountered this issue: virtio-gpu doesn’t have ->get_sg_table implemented which is required by drm_gem_map_attach(). This is modified by: 207395da5a97 (“drm/prime: reject DMA-BUF attach when get_sg_table is missing”).
  2.  To fix this, I override the function virtgpu_gem_device_attach() to not call drm_gem_map_attach() for vram object so drm_gem_map_attach() will not return -ENOSYS for not having ->get_sg_table.
  3.  Then you think this is incorrect and drm_gem_map_attach() requires get_sg_table to be implemented is intentional. I should either implement ->attach or ->get_sg_table for virtio-gpu.
  4.  As discussed, I implemented ->attach for virtio-gpu, but you suggested that I should check peer2peer flag first.
  5.  Now I have the implementation to get p2p_distance and check the p2p flag already, but I found that Rob Clark merged a patch to fix above patch: 207395da5a97 (“drm/prime: reject DMA-BUF attach when get_sg_table is missing”)
     *   Rob’s patch: https://patchwork.freedesktop.org/patch/584318/
  6.  With Rob’s patch, ->get_sg_table isn’t required for virtio-gpu anymore and  it seems p2p flag also doesn’t need to be checked anymore.

So I want to rediscuss if we still need to do p2p checking now?

If so, I will send out my implementation soon.

Best regards,

Julia


On 2024/1/31 22:32, Christian König wrote:
Am 31.01.24 um 11:20 schrieb Zhang, Julia:

On 2024/1/30 22:23, Christian König wrote:

Am 30.01.24 um 12:16 schrieb Daniel Vetter:

On Tue, Jan 30, 2024 at 12:10:31PM +0100, Daniel Vetter wrote:
[SNIP]

Hi Sima, Christian,



Yeah, that is really just speculative. All importers need to set the peer2peer flag just in case.

I see, I will modify this.



What happens under the hood is that IOMMU redirects the "VRAM" memory access to whatever address the DMA-buf on the host is pointing to (system, VRAM, doorbell, IOMMU, whatever).



I'm also not 100% sure if all the cache snooping is done correctly in all cases, but for now it seems to work.

Frankly the more I look at the original patch that added vram export

support the more this just looks like a "pls revert, this is just too

broken".

The commit I mean is this one: ea5ea3d8a117 ("drm/virtio: support mapping

exported vram"). The commit message definitely needs to cite that one, and

also needs a cc: stable because not rejecting invalid imports is a pretty

big deal.

Yeah, I've pointed out that commit in an internal discussion as well. I was just not aware that it's that severely broken.



Yeah we have mentioned this patch before, but I don't totally understand why this is too broken. Without exporting vram objects, dGPU prime feature would not be realized.

Would you mind to explain more about it. Thanks!

One reason is that using sg tables without struct pages is actually a hack we came up with because we couldn't hope to clean up the sg table structure any time soon to not include struct page pointers.

Another reason is that using this with devices which don't expect a DMA address pointing into a virtual PCI BAR. So doing this without checking the peer2peer flag can most likely cause quite a bit of trouble.

Regards,
Christian.


Best regards,

Julia



Regards,

Christian.



[-- Attachment #2: Type: text/html, Size: 12640 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 1/1] drm/virtio: Implement device_attach
  2024-12-04  3:46             ` Zhang, Julia
@ 2024-12-04  9:32               ` Christian König
  2024-12-06  6:43                 ` Zhang, Julia
  0 siblings, 1 reply; 15+ messages in thread
From: Christian König @ 2024-12-04  9:32 UTC (permalink / raw)
  To: Zhang, Julia, Gurchetan Singh, Chia-I Wu, David Airlie,
	Gerd Hoffmann, linux-kernel@vger.kernel.org,
	dri-devel@lists.freedesktop.org, amd-gfx@lists.freedesktop.org,
	virtualization@lists.linux-foundation.org, Deucher, Alexander,
	David Airlie, Erik Faye-Lund, Olsak, Marek, David Stevens,
	Daniel Vetter
  Cc: Huang, Ray, Zhu, Lingshan, robdclark@chromium.org,
	Pelloux-Prayer, Pierre-Eric, Huang, Honglei1, Chen, Jiqian

[-- Attachment #1: Type: text/plain, Size: 6306 bytes --]

Hi Julia,

sorry I totally missed your mail.

The basic problem for P2P is what I already described in my previous mail:

> Well the problem is the virtualized environment. pci_p2pdma_distance() 
> checks if two physical PCI devices can communicate with each other 
> (and returns how many hops are in between).
>
> But inside a VM you don't see the physical devices, you can only see 
> passed through devices plus your virtual device and a bunch of virtual 
> bridges.
>
> So what pci_p2pdma_distance() returns inside the VM is actually 
> completely meaningless. It can be that P2P works, but it can also be 
> P2P doesn't work because on the physical system you have a bridge, 
> root complex or whatever which is blacklisted and won't work for some 
> reason.

So the basic problem is that you can't figure out inside the VM if P2P 
is possible or not.

As long as you don't fix this it's irrelevant if you have get_sg_table 
implemented or not, you first need to figure out the basic and not try 
to implement some detail.

Regards,
Christian.

Am 04.12.24 um 04:46 schrieb Zhang, Julia:
>
> [AMD Official Use Only - AMD Internal Distribution Only]
>
>
> Hi Sima, Christian,
>
> I would like to rediscuss about p2p in guest VM, can you please take a 
> look. Thanks.
>
> Best regards,
>
> Julia
>
> *From:*Zhang, Julia <Julia.Zhang@amd.com>
> *Sent:* Friday, November 29, 2024 3:52 PM
> *To:* Koenig, Christian <Christian.Koenig@amd.com>; Zhang, Julia 
> <Julia.Zhang@amd.com>; Gurchetan Singh <gurchetansingh@chromium.org>; 
> Chia-I Wu <olvaffe@gmail.com>; David Airlie <airlied@redhat.com>; Gerd 
> Hoffmann <kraxel@redhat.com>; linux-kernel@vger.kernel.org; 
> dri-devel@lists.freedesktop.org; amd-gfx@lists.freedesktop.org; 
> virtualization@lists.linux-foundation.org; Deucher, Alexander 
> <Alexander.Deucher@amd.com>; David Airlie <airlied@gmail.com>; Erik 
> Faye-Lund <kusmabite@gmail.com>; Olsak, Marek <Marek.Olsak@amd.com>; 
> Pelloux-Prayer, Pierre-Eric <Pierre-eric.Pelloux-prayer@amd.com>; 
> Huang, Honglei1 <Honglei1.Huang@amd.com>; Chen, Jiqian 
> <Jiqian.Chen@amd.com>; Huang, Ray <Ray.Huang@amd.com>; David Stevens 
> <stevensd@chromium.org>
> *Cc:* Huang, Ray <Ray.Huang@amd.com>; Zhu, Lingshan 
> <Lingshan.Zhu@amd.com>; robdclark@chromium.org
> *Subject:* Re: [PATCH v2 1/1] drm/virtio: Implement device_attach
>
> Hi all,
>
> Sorry for my late reply. I don't know if you still remember this 
> thread, let me give a quick summary:
>
>  1. We want to implement the dGPU prime feature in guest VM. But we
>     encountered this issue: virtio-gpu doesn’t have ->get_sg_table
>     implemented which is required by drm_gem_map_attach(). This is
>     modified by: 207395da5a97 (“drm/prime: reject DMA-BUF attach when
>     get_sg_table is missing”).
>  2. To fix this, I override the function virtgpu_gem_device_attach()
>     to not call drm_gem_map_attach() for vram object so
>     drm_gem_map_attach() will not return -ENOSYS for not having
>     ->get_sg_table.
>  3. Then you think this is incorrect and drm_gem_map_attach() requires
>     get_sg_table to be implemented is intentional. I should either
>     implement ->attach or ->get_sg_table for virtio-gpu.
>  4. As discussed, I implemented ->attach for virtio-gpu, but you
>     suggested that I should check peer2peer flag first.
>  5. Now I have the implementation to get p2p_distance and check the
>     p2p flag already, but I found that Rob Clark merged a patch to fix
>     above patch: 207395da5a97 (“drm/prime: reject DMA-BUF attach when
>     get_sg_table is missing”)
>      1. Rob’s patch: https://patchwork.freedesktop.org/patch/584318/
>  6. With Rob’s patch, ->get_sg_table isn’t required for virtio-gpu
>     anymore and  it seems p2p flag also doesn’t need to be checked
>     anymore.
>
> So I want to rediscuss if we still need to do p2p checking now?
>
> If so, I will send out my implementation soon.
>
> Best regards,
>
> Julia
>
> On 2024/1/31 22:32, Christian König wrote:
>
>     Am 31.01.24 um 11:20 schrieb Zhang, Julia:
>
>         On 2024/1/30 22:23, Christian König wrote:
>
>             Am 30.01.24 um 12:16 schrieb Daniel Vetter:
>
>                 On Tue, Jan 30, 2024 at 12:10:31PM +0100, Daniel Vetter wrote:
>
>                     [SNIP]
>
>         Hi Sima, Christian,
>
>             Yeah, that is really just speculative. All importers need to set the peer2peer flag just in case.
>
>         I see, I will modify this.
>
>             What happens under the hood is that IOMMU redirects the "VRAM" memory access to whatever address the DMA-buf on the host is pointing to (system, VRAM, doorbell, IOMMU, whatever).
>
>             I'm also not 100% sure if all the cache snooping is done correctly in all cases, but for now it seems to work.
>
>                     Frankly the more I look at the original patch that added vram export
>
>                     support the more this just looks like a "pls revert, this is just too
>
>                     broken".
>
>                 The commit I mean is this one: ea5ea3d8a117 ("drm/virtio: support mapping
>
>                 exported vram"). The commit message definitely needs to cite that one, and
>
>                 also needs a cc: stable because not rejecting invalid imports is a pretty
>
>                 big deal.
>
>             Yeah, I've pointed out that commit in an internal discussion as well. I was just not aware that it's that severely broken.
>
>         Yeah we have mentioned this patch before, but I don't totally understand why this is too broken. Without exporting vram objects, dGPU prime feature would not be realized.
>
>         Would you mind to explain more about it. Thanks!
>
>
>     One reason is that using sg tables without struct pages is
>     actually a hack we came up with because we couldn't hope to clean
>     up the sg table structure any time soon to not include struct page
>     pointers.
>
>     Another reason is that using this with devices which don't expect
>     a DMA address pointing into a virtual PCI BAR. So doing this
>     without checking the peer2peer flag can most likely cause quite a
>     bit of trouble.
>
>     Regards,
>     Christian.
>
>         Best regards,
>
>         Julia
>
>             Regards,
>
>             Christian.
>

[-- Attachment #2: Type: text/html, Size: 16253 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 1/1] drm/virtio: Implement device_attach
  2024-12-04  9:32               ` Christian König
@ 2024-12-06  6:43                 ` Zhang, Julia
  0 siblings, 0 replies; 15+ messages in thread
From: Zhang, Julia @ 2024-12-06  6:43 UTC (permalink / raw)
  To: Koenig, Christian, Zhang, Julia, Gurchetan Singh, Chia-I Wu,
	David Airlie, Gerd Hoffmann, linux-kernel@vger.kernel.org,
	dri-devel@lists.freedesktop.org, amd-gfx@lists.freedesktop.org,
	virtualization@lists.linux-foundation.org, Deucher, Alexander,
	David Airlie, Erik Faye-Lund, Olsak, Marek, David Stevens,
	Daniel Vetter
  Cc: Huang, Ray, Zhu, Lingshan, robdclark@chromium.org,
	Pelloux-Prayer, Pierre-Eric, Huang, Honglei1, Chen, Jiqian

[-- Attachment #1: Type: text/plain, Size: 6827 bytes --]

Hi Christian,

Thanks for your reply.
I get your point.  Actually I have a design to send a command from guest with pci notations of amdgpu and virtio-gpu to host side to calculate physical p2pdma_distance. I will send out those patches soon.
Best regards,
Julia


On 2024/12/4 17:32, Christian König wrote:
Hi Julia,

sorry I totally missed your mail.

The basic problem for P2P is what I already described in my previous mail:


Well the problem is the virtualized environment. pci_p2pdma_distance() checks if two physical PCI devices can communicate with each other (and returns how many hops are in between).

But inside a VM you don't see the physical devices, you can only see passed through devices plus your virtual device and a bunch of virtual bridges.

So what pci_p2pdma_distance() returns inside the VM is actually completely meaningless. It can be that P2P works, but it can also be P2P doesn't work because on the physical system you have a bridge, root complex or whatever which is blacklisted and won't work for some reason.

So the basic problem is that you can't figure out inside the VM if P2P is possible or not.

As long as you don't fix this it's irrelevant if you have get_sg_table implemented or not, you first need to figure out the basic and not try to implement some detail.

Regards,
Christian.
Am 04.12.24 um 04:46 schrieb Zhang, Julia:

[AMD Official Use Only - AMD Internal Distribution Only]

Hi Sima, Christian,

I would like to rediscuss about p2p in guest VM, can you please take a look. Thanks.

Best regards,
Julia

From: Zhang, Julia <Julia.Zhang@amd.com><mailto:Julia.Zhang@amd.com>
Sent: Friday, November 29, 2024 3:52 PM
To: Koenig, Christian <Christian.Koenig@amd.com><mailto:Christian.Koenig@amd.com>; Zhang, Julia <Julia.Zhang@amd.com><mailto:Julia.Zhang@amd.com>; Gurchetan Singh <gurchetansingh@chromium.org><mailto:gurchetansingh@chromium.org>; Chia-I Wu <olvaffe@gmail.com><mailto:olvaffe@gmail.com>; David Airlie <airlied@redhat.com><mailto:airlied@redhat.com>; Gerd Hoffmann <kraxel@redhat.com><mailto:kraxel@redhat.com>; linux-kernel@vger.kernel.org<mailto:linux-kernel@vger.kernel.org>; dri-devel@lists.freedesktop.org<mailto:dri-devel@lists.freedesktop.org>; amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org>; virtualization@lists.linux-foundation.org<mailto:virtualization@lists.linux-foundation.org>; Deucher, Alexander <Alexander.Deucher@amd.com><mailto:Alexander.Deucher@amd.com>; David Airlie <airlied@gmail.com><mailto:airlied@gmail.com>; Erik Faye-Lund <kusmabite@gmail.com><mailto:kusmabite@gmail.com>; Olsak, Marek <Marek.Olsak@amd.com><mailto:Marek.Olsak@amd.com>; Pelloux-Prayer, Pierre-Eric <Pierre-eric.Pelloux-prayer@amd.com><mailto:Pierre-eric.Pelloux-prayer@amd.com>; Huang, Honglei1 <Honglei1.Huang@amd.com><mailto:Honglei1.Huang@amd.com>; Chen, Jiqian <Jiqian.Chen@amd.com><mailto:Jiqian.Chen@amd.com>; Huang, Ray <Ray.Huang@amd.com><mailto:Ray.Huang@amd.com>; David Stevens <stevensd@chromium.org><mailto:stevensd@chromium.org>
Cc: Huang, Ray <Ray.Huang@amd.com><mailto:Ray.Huang@amd.com>; Zhu, Lingshan <Lingshan.Zhu@amd.com><mailto:Lingshan.Zhu@amd.com>; robdclark@chromium.org<mailto:robdclark@chromium.org>
Subject: Re: [PATCH v2 1/1] drm/virtio: Implement device_attach


Hi all,

Sorry for my late reply. I don't know if you still remember this thread, let me give a quick summary:

  1.  We want to implement the dGPU prime feature in guest VM. But we encountered this issue: virtio-gpu doesn’t have ->get_sg_table implemented which is required by drm_gem_map_attach(). This is modified by: 207395da5a97 (“drm/prime: reject DMA-BUF attach when get_sg_table is missing”).
  2.  To fix this, I override the function virtgpu_gem_device_attach() to not call drm_gem_map_attach() for vram object so drm_gem_map_attach() will not return -ENOSYS for not having ->get_sg_table.
  3.  Then you think this is incorrect and drm_gem_map_attach() requires get_sg_table to be implemented is intentional. I should either implement ->attach or ->get_sg_table for virtio-gpu.
  4.  As discussed, I implemented ->attach for virtio-gpu, but you suggested that I should check peer2peer flag first.
  5.  Now I have the implementation to get p2p_distance and check the p2p flag already, but I found that Rob Clark merged a patch to fix above patch: 207395da5a97 (“drm/prime: reject DMA-BUF attach when get_sg_table is missing”)

     *   Rob’s patch: https://patchwork.freedesktop.org/patch/584318/

  1.  With Rob’s patch, ->get_sg_table isn’t required for virtio-gpu anymore and  it seems p2p flag also doesn’t need to be checked anymore.

So I want to rediscuss if we still need to do p2p checking now?

If so, I will send out my implementation soon.

Best regards,

Julia


On 2024/1/31 22:32, Christian König wrote:
Am 31.01.24 um 11:20 schrieb Zhang, Julia:

On 2024/1/30 22:23, Christian König wrote:

Am 30.01.24 um 12:16 schrieb Daniel Vetter:

On Tue, Jan 30, 2024 at 12:10:31PM +0100, Daniel Vetter wrote:
[SNIP]

Hi Sima, Christian,



Yeah, that is really just speculative. All importers need to set the peer2peer flag just in case.

I see, I will modify this.



What happens under the hood is that IOMMU redirects the "VRAM" memory access to whatever address the DMA-buf on the host is pointing to (system, VRAM, doorbell, IOMMU, whatever).



I'm also not 100% sure if all the cache snooping is done correctly in all cases, but for now it seems to work.

Frankly the more I look at the original patch that added vram export

support the more this just looks like a "pls revert, this is just too

broken".

The commit I mean is this one: ea5ea3d8a117 ("drm/virtio: support mapping

exported vram"). The commit message definitely needs to cite that one, and

also needs a cc: stable because not rejecting invalid imports is a pretty

big deal.

Yeah, I've pointed out that commit in an internal discussion as well. I was just not aware that it's that severely broken.



Yeah we have mentioned this patch before, but I don't totally understand why this is too broken. Without exporting vram objects, dGPU prime feature would not be realized.

Would you mind to explain more about it. Thanks!

One reason is that using sg tables without struct pages is actually a hack we came up with because we couldn't hope to clean up the sg table structure any time soon to not include struct page pointers.

Another reason is that using this with devices which don't expect a DMA address pointing into a virtual PCI BAR. So doing this without checking the peer2peer flag can most likely cause quite a bit of trouble.

Regards,
Christian.



Best regards,

Julia



Regards,

Christian.




[-- Attachment #2: Type: text/html, Size: 15760 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2024-12-06  6:44 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-01-29 10:31 [PATCH v2 1/1] drm/virtio: Implement device_attach Julia Zhang
2024-01-29 10:31 ` Julia Zhang
2024-01-29 13:12 ` Christian König
2024-01-29 13:12   ` Christian König
2024-01-30 11:10 ` Daniel Vetter
2024-01-30 11:10   ` Daniel Vetter
2024-01-30 11:10   ` Daniel Vetter
2024-01-30 11:16   ` Daniel Vetter
2024-01-30 14:23     ` Christian König
2024-01-31 10:20       ` Zhang, Julia
2024-01-31 14:32         ` Christian König
2024-11-29  7:51           ` Zhang, Julia
2024-12-04  3:46             ` Zhang, Julia
2024-12-04  9:32               ` Christian König
2024-12-06  6:43                 ` Zhang, Julia

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.