Linux virtualization list
 help / color / mirror / Atom feed
* Re: [PATCH 13/15] drm/vboxvideo: Convert vboxvideo driver to Simple TTM
From: Hans de Goede @ 2019-04-09  7:09 UTC (permalink / raw)
  To: Thomas Zimmermann, daniel, airlied, kraxel, christian.koenig,
	ray.huang, Jerry.Zhang, z.liuxinliang, zourongrong,
	kong.kongxinwei, puck.chen
  Cc: dri-devel, virtualization
In-Reply-To: <20190408092144.4548-14-tzimmermann@suse.de>

Hi,

On 08-04-19 11:21, Thomas Zimmermann wrote:
> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>

Patch looks good to me (although perhaps it needs a commit msg):

Reviewed-by: Hans de Goede <hdegoede@redhat.com>

Regards,

Hans



> ---
>   drivers/gpu/drm/vboxvideo/Kconfig    |   1 +
>   drivers/gpu/drm/vboxvideo/vbox_drv.h |   6 +-
>   drivers/gpu/drm/vboxvideo/vbox_ttm.c | 123 ++-------------------------
>   3 files changed, 12 insertions(+), 118 deletions(-)
> 
> diff --git a/drivers/gpu/drm/vboxvideo/Kconfig b/drivers/gpu/drm/vboxvideo/Kconfig
> index c1ca87df81df..e29df360978d 100644
> --- a/drivers/gpu/drm/vboxvideo/Kconfig
> +++ b/drivers/gpu/drm/vboxvideo/Kconfig
> @@ -4,6 +4,7 @@ config DRM_VBOXVIDEO
>   	select DRM_KMS_HELPER
>   	select DRM_TTM
>   	select DRM_GEM_TTM_HELPER
> +	select DRM_SIMPLE_TTM_HELPER
>   	select GENERIC_ALLOCATOR
>   	help
>   	  This is a KMS driver for the virtual Graphics Card used in
> diff --git a/drivers/gpu/drm/vboxvideo/vbox_drv.h b/drivers/gpu/drm/vboxvideo/vbox_drv.h
> index 7db4e961805d..d4cfcc6339ef 100644
> --- a/drivers/gpu/drm/vboxvideo/vbox_drv.h
> +++ b/drivers/gpu/drm/vboxvideo/vbox_drv.h
> @@ -20,6 +20,8 @@
>   #include <drm/drm_gem.h>
>   #include <drm/drm_gem_ttm_helper.h>
>   
> +#include <drm/drm_simple_ttm_helper.h>
> +
>   #include <drm/ttm/ttm_bo_api.h>
>   #include <drm/ttm/ttm_bo_driver.h>
>   #include <drm/ttm/ttm_placement.h>
> @@ -78,9 +80,7 @@ struct vbox_private {
>   
>   	int fb_mtrr;
>   
> -	struct {
> -		struct ttm_bo_device bdev;
> -	} ttm;
> +	struct drm_simple_ttm ttm;
>   
>   	struct mutex hw_mutex; /* protects modeset and accel/vbva accesses */
>   	struct work_struct hotplug_work;
> diff --git a/drivers/gpu/drm/vboxvideo/vbox_ttm.c b/drivers/gpu/drm/vboxvideo/vbox_ttm.c
> index a1d64e1ea90c..115ec44636ab 100644
> --- a/drivers/gpu/drm/vboxvideo/vbox_ttm.c
> +++ b/drivers/gpu/drm/vboxvideo/vbox_ttm.c
> @@ -11,128 +11,25 @@
>   #include <drm/ttm/ttm_page_alloc.h>
>   #include "vbox_drv.h"
>   
> -static inline struct vbox_private *vbox_bdev(struct ttm_bo_device *bd)
> -{
> -	return container_of(bd, struct vbox_private, ttm.bdev);
> -}
> -
> -static int
> -vbox_bo_init_mem_type(struct ttm_bo_device *bdev, u32 type,
> -		      struct ttm_mem_type_manager *man)
> -{
> -	switch (type) {
> -	case TTM_PL_SYSTEM:
> -		man->flags = TTM_MEMTYPE_FLAG_MAPPABLE;
> -		man->available_caching = TTM_PL_MASK_CACHING;
> -		man->default_caching = TTM_PL_FLAG_CACHED;
> -		break;
> -	case TTM_PL_VRAM:
> -		man->func = &ttm_bo_manager_func;
> -		man->flags = TTM_MEMTYPE_FLAG_FIXED | TTM_MEMTYPE_FLAG_MAPPABLE;
> -		man->available_caching = TTM_PL_FLAG_UNCACHED | TTM_PL_FLAG_WC;
> -		man->default_caching = TTM_PL_FLAG_WC;
> -		break;
> -	default:
> -		DRM_ERROR("Unsupported memory type %u\n", (unsigned int)type);
> -		return -EINVAL;
> -	}
> -
> -	return 0;
> -}
> -
> -static int vbox_ttm_io_mem_reserve(struct ttm_bo_device *bdev,
> -				   struct ttm_mem_reg *mem)
> -{
> -	struct ttm_mem_type_manager *man = &bdev->man[mem->mem_type];
> -	struct vbox_private *vbox = vbox_bdev(bdev);
> -
> -	mem->bus.addr = NULL;
> -	mem->bus.offset = 0;
> -	mem->bus.size = mem->num_pages << PAGE_SHIFT;
> -	mem->bus.base = 0;
> -	mem->bus.is_iomem = false;
> -	if (!(man->flags & TTM_MEMTYPE_FLAG_MAPPABLE))
> -		return -EINVAL;
> -	switch (mem->mem_type) {
> -	case TTM_PL_SYSTEM:
> -		/* system memory */
> -		return 0;
> -	case TTM_PL_VRAM:
> -		mem->bus.offset = mem->start << PAGE_SHIFT;
> -		mem->bus.base = pci_resource_start(vbox->ddev.pdev, 0);
> -		mem->bus.is_iomem = true;
> -		break;
> -	default:
> -		return -EINVAL;
> -	}
> -	return 0;
> -}
> -
> -static void vbox_ttm_io_mem_free(struct ttm_bo_device *bdev,
> -				 struct ttm_mem_reg *mem)
> -{
> -}
> -
> -static void vbox_ttm_backend_destroy(struct ttm_tt *tt)
> -{
> -	ttm_tt_fini(tt);
> -	kfree(tt);
> -}
> -
> -static struct ttm_backend_func vbox_tt_backend_func = {
> -	.destroy = &vbox_ttm_backend_destroy,
> -};
> -
> -static struct ttm_tt *vbox_ttm_tt_create(struct ttm_buffer_object *bo,
> -					 u32 page_flags)
> -{
> -	struct ttm_tt *tt;
> -
> -	tt = kzalloc(sizeof(*tt), GFP_KERNEL);
> -	if (!tt)
> -		return NULL;
> -
> -	tt->func = &vbox_tt_backend_func;
> -	if (ttm_tt_init(tt, bo, page_flags)) {
> -		kfree(tt);
> -		return NULL;
> -	}
> -
> -	return tt;
> -}
> -
> -static struct ttm_bo_driver vbox_bo_driver = {
> -	.ttm_tt_create = vbox_ttm_tt_create,
> -	.init_mem_type = vbox_bo_init_mem_type,
> -	.eviction_valuable = ttm_bo_eviction_valuable,
> +static const struct drm_simple_ttm_funcs vbox_simple_ttm_funcs = {
>   	.evict_flags = drm_gem_ttm_bo_driver_evict_flags,
> -	.verify_access = drm_gem_ttm_bo_driver_verify_access,
> -	.io_mem_reserve = &vbox_ttm_io_mem_reserve,
> -	.io_mem_free = &vbox_ttm_io_mem_free,
> +	.verify_access = drm_gem_ttm_bo_driver_verify_access
>   };
>   
>   int vbox_mm_init(struct vbox_private *vbox)
>   {
>   	int ret;
>   	struct drm_device *dev = &vbox->ddev;
> -	struct ttm_bo_device *bdev = &vbox->ttm.bdev;
>   
> -	ret = ttm_bo_device_init(&vbox->ttm.bdev,
> -				 &vbox_bo_driver,
> -				 dev->anon_inode->i_mapping,
> -				 true);
> +	ret = drm_simple_ttm_init(&vbox->ttm, dev,
> +				  pci_resource_start(dev->pdev, 0),
> +				  vbox->available_vram_size,
> +				  &vbox_simple_ttm_funcs);
>   	if (ret) {
> -		DRM_ERROR("Error initialising bo driver; %d\n", ret);
> +		DRM_ERROR("Error initializing Simple TTM; %d\n", ret);
>   		return ret;
>   	}
>   
> -	ret = ttm_bo_init_mm(bdev, TTM_PL_VRAM,
> -			     vbox->available_vram_size >> PAGE_SHIFT);
> -	if (ret) {
> -		DRM_ERROR("Failed ttm VRAM init: %d\n", ret);
> -		goto err_device_release;
> -	}
> -
>   #ifdef DRM_MTRR_WC
>   	vbox->fb_mtrr = drm_mtrr_add(pci_resource_start(dev->pdev, 0),
>   				     pci_resource_len(dev->pdev, 0),
> @@ -142,10 +39,6 @@ int vbox_mm_init(struct vbox_private *vbox)
>   					 pci_resource_len(dev->pdev, 0));
>   #endif
>   	return 0;
> -
> -err_device_release:
> -	ttm_bo_device_release(&vbox->ttm.bdev);
> -	return ret;
>   }
>   
>   void vbox_mm_fini(struct vbox_private *vbox)
> @@ -157,7 +50,7 @@ void vbox_mm_fini(struct vbox_private *vbox)
>   #else
>   	arch_phys_wc_del(vbox->fb_mtrr);
>   #endif
> -	ttm_bo_device_release(&vbox->ttm.bdev);
> +	drm_simple_ttm_cleanup(&vbox->ttm);
>   }
>   
>   int vbox_mmap(struct file *filp, struct vm_area_struct *vma)
> 

^ permalink raw reply

* Re: [PATCH 12/15] drm/vboxvideo: Convert vboxvideo driver to |struct drm_gem_ttm_object|
From: Hans de Goede @ 2019-04-09  7:09 UTC (permalink / raw)
  To: Thomas Zimmermann, daniel, airlied, kraxel, christian.koenig,
	ray.huang, Jerry.Zhang, z.liuxinliang, zourongrong,
	kong.kongxinwei, puck.chen
  Cc: dri-devel, virtualization
In-Reply-To: <20190408092144.4548-13-tzimmermann@suse.de>

Hi,

On 08-04-19 11:21, Thomas Zimmermann wrote:
> This patch replaces |struct vbox_bo| and its helpers with the generic
> implementation of |struct drm_gem_ttm_object|. The only change in
> semantics is that &ttm_bo_driver.verify_access() now does the actual
> verification.
> 
> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>

Nice cleanup, thank you, patch looks good to me:

Reviewed-by: Hans de Goede <hdegoede@redhat.com>

Regards,

Hans

> ---
>   drivers/gpu/drm/vboxvideo/Kconfig     |   1 +
>   drivers/gpu/drm/vboxvideo/vbox_drv.c  |   5 +-
>   drivers/gpu/drm/vboxvideo/vbox_drv.h  |  58 +------
>   drivers/gpu/drm/vboxvideo/vbox_fb.c   |  22 +--
>   drivers/gpu/drm/vboxvideo/vbox_main.c |  70 +-------
>   drivers/gpu/drm/vboxvideo/vbox_mode.c |  36 +++--
>   drivers/gpu/drm/vboxvideo/vbox_ttm.c  | 223 +-------------------------
>   7 files changed, 45 insertions(+), 370 deletions(-)
> 
> diff --git a/drivers/gpu/drm/vboxvideo/Kconfig b/drivers/gpu/drm/vboxvideo/Kconfig
> index 1f4182e2e980..c1ca87df81df 100644
> --- a/drivers/gpu/drm/vboxvideo/Kconfig
> +++ b/drivers/gpu/drm/vboxvideo/Kconfig
> @@ -3,6 +3,7 @@ config DRM_VBOXVIDEO
>   	depends on DRM && X86 && PCI
>   	select DRM_KMS_HELPER
>   	select DRM_TTM
> +	select DRM_GEM_TTM_HELPER
>   	select GENERIC_ALLOCATOR
>   	help
>   	  This is a KMS driver for the virtual Graphics Card used in
> diff --git a/drivers/gpu/drm/vboxvideo/vbox_drv.c b/drivers/gpu/drm/vboxvideo/vbox_drv.c
> index fb6a0f0b8167..75b165386935 100644
> --- a/drivers/gpu/drm/vboxvideo/vbox_drv.c
> +++ b/drivers/gpu/drm/vboxvideo/vbox_drv.c
> @@ -215,9 +215,10 @@ static struct drm_driver driver = {
>   	.minor = DRIVER_MINOR,
>   	.patchlevel = DRIVER_PATCHLEVEL,
>   
> -	.gem_free_object_unlocked = vbox_gem_free_object,
> +	.gem_free_object_unlocked =
> +		drm_gem_ttm_driver_gem_free_object_unlocked,
>   	.dumb_create = vbox_dumb_create,
> -	.dumb_map_offset = vbox_dumb_mmap_offset,
> +	.dumb_map_offset = drm_gem_ttm_driver_dumb_mmap_offset,
>   	.prime_handle_to_fd = drm_gem_prime_handle_to_fd,
>   	.prime_fd_to_handle = drm_gem_prime_fd_to_handle,
>   	.gem_prime_export = drm_gem_prime_export,
> diff --git a/drivers/gpu/drm/vboxvideo/vbox_drv.h b/drivers/gpu/drm/vboxvideo/vbox_drv.h
> index ece31f395540..7db4e961805d 100644
> --- a/drivers/gpu/drm/vboxvideo/vbox_drv.h
> +++ b/drivers/gpu/drm/vboxvideo/vbox_drv.h
> @@ -18,6 +18,7 @@
>   #include <drm/drm_encoder.h>
>   #include <drm/drm_fb_helper.h>
>   #include <drm/drm_gem.h>
> +#include <drm/drm_gem_ttm_helper.h>
>   
>   #include <drm/ttm/ttm_bo_api.h>
>   #include <drm/ttm/ttm_bo_driver.h>
> @@ -170,73 +171,16 @@ int vboxfb_create(struct drm_fb_helper *helper,
>   		  struct drm_fb_helper_surface_size *sizes);
>   void vbox_fbdev_fini(struct vbox_private *vbox);
>   
> -struct vbox_bo {
> -	struct ttm_buffer_object bo;
> -	struct ttm_placement placement;
> -	struct ttm_bo_kmap_obj kmap;
> -	struct drm_gem_object gem;
> -	struct ttm_place placements[3];
> -	int pin_count;
> -};
> -
> -#define gem_to_vbox_bo(gobj) container_of((gobj), struct vbox_bo, gem)
> -
> -static inline struct vbox_bo *vbox_bo(struct ttm_buffer_object *bo)
> -{
> -	return container_of(bo, struct vbox_bo, bo);
> -}
> -
> -#define to_vbox_obj(x) container_of(x, struct vbox_gem_object, base)
> -
> -static inline u64 vbox_bo_gpu_offset(struct vbox_bo *bo)
> -{
> -	return bo->bo.offset;
> -}
> -
>   int vbox_dumb_create(struct drm_file *file,
>   		     struct drm_device *dev,
>   		     struct drm_mode_create_dumb *args);
>   
> -void vbox_gem_free_object(struct drm_gem_object *obj);
> -int vbox_dumb_mmap_offset(struct drm_file *file,
> -			  struct drm_device *dev,
> -			  u32 handle, u64 *offset);
> -
>   int vbox_mm_init(struct vbox_private *vbox);
>   void vbox_mm_fini(struct vbox_private *vbox);
>   
> -int vbox_bo_create(struct vbox_private *vbox, int size, int align,
> -		   u32 flags, struct vbox_bo **pvboxbo);
> -
>   int vbox_gem_create(struct vbox_private *vbox,
>   		    u32 size, bool iskernel, struct drm_gem_object **obj);
> -
> -int vbox_bo_pin(struct vbox_bo *bo, u32 pl_flag);
> -int vbox_bo_unpin(struct vbox_bo *bo);
> -
> -static inline int vbox_bo_reserve(struct vbox_bo *bo, bool no_wait)
> -{
> -	int ret;
> -
> -	ret = ttm_bo_reserve(&bo->bo, true, no_wait, NULL);
> -	if (ret) {
> -		if (ret != -ERESTARTSYS && ret != -EBUSY)
> -			DRM_ERROR("reserve failed %p\n", bo);
> -		return ret;
> -	}
> -	return 0;
> -}
> -
> -static inline void vbox_bo_unreserve(struct vbox_bo *bo)
> -{
> -	ttm_bo_unreserve(&bo->bo);
> -}
> -
> -void vbox_ttm_placement(struct vbox_bo *bo, int domain);
> -int vbox_bo_push_sysram(struct vbox_bo *bo);
>   int vbox_mmap(struct file *filp, struct vm_area_struct *vma);
> -void *vbox_bo_kmap(struct vbox_bo *bo);
> -void vbox_bo_kunmap(struct vbox_bo *bo);
>   
>   /* vbox_prime.c */
>   int vbox_gem_prime_pin(struct drm_gem_object *obj);
> diff --git a/drivers/gpu/drm/vboxvideo/vbox_fb.c b/drivers/gpu/drm/vboxvideo/vbox_fb.c
> index b724fe7c0c30..1cf0c6bd58b9 100644
> --- a/drivers/gpu/drm/vboxvideo/vbox_fb.c
> +++ b/drivers/gpu/drm/vboxvideo/vbox_fb.c
> @@ -51,9 +51,9 @@ int vboxfb_create(struct drm_fb_helper *helper,
>   	struct drm_framebuffer *fb;
>   	struct fb_info *info;
>   	struct drm_gem_object *gobj;
> -	struct vbox_bo *bo;
> +	struct drm_gem_ttm_object *gbo;
>   	int size, ret;
> -	u64 gpu_addr;
> +	s64 gpu_addr;
>   	u32 pitch;
>   
>   	mode_cmd.width = sizes->surface_width;
> @@ -75,9 +75,9 @@ int vboxfb_create(struct drm_fb_helper *helper,
>   	if (ret)
>   		return ret;
>   
> -	bo = gem_to_vbox_bo(gobj);
> +	gbo = drm_gem_ttm_of_gem(gobj);
>   
> -	ret = vbox_bo_pin(bo, TTM_PL_FLAG_VRAM);
> +	ret = drm_gem_ttm_pin(gbo, TTM_PL_FLAG_VRAM);
>   	if (ret)
>   		return ret;
>   
> @@ -86,7 +86,7 @@ int vboxfb_create(struct drm_fb_helper *helper,
>   		return PTR_ERR(info);
>   
>   	info->screen_size = size;
> -	info->screen_base = (char __iomem *)vbox_bo_kmap(bo);
> +	info->screen_base = (char __iomem *)drm_gem_ttm_kmap(gbo, true);
>   	if (IS_ERR(info->screen_base))
>   		return PTR_ERR(info->screen_base);
>   
> @@ -104,7 +104,9 @@ int vboxfb_create(struct drm_fb_helper *helper,
>   
>   	drm_fb_helper_fill_info(info, helper, sizes);
>   
> -	gpu_addr = vbox_bo_gpu_offset(bo);
> +	gpu_addr = drm_gem_ttm_vram_offset(gbo);
> +	if (gpu_addr < 0)
> +		return (int)gpu_addr;
>   	info->fix.smem_start = info->apertures->ranges[0].base + gpu_addr;
>   	info->fix.smem_len = vbox->available_vram_size - gpu_addr;
>   
> @@ -132,12 +134,10 @@ void vbox_fbdev_fini(struct vbox_private *vbox)
>   	drm_fb_helper_unregister_fbi(&vbox->fb_helper);
>   
>   	if (afb->obj) {
> -		struct vbox_bo *bo = gem_to_vbox_bo(afb->obj);
> +		struct drm_gem_ttm_object *gbo = drm_gem_ttm_of_gem(afb->obj);
>   
> -		vbox_bo_kunmap(bo);
> -
> -		if (bo->pin_count)
> -			vbox_bo_unpin(bo);
> +		drm_gem_ttm_kunmap(gbo);
> +		drm_gem_ttm_unpin(gbo);
>   
>   		drm_gem_object_put_unlocked(afb->obj);
>   		afb->obj = NULL;
> diff --git a/drivers/gpu/drm/vboxvideo/vbox_main.c b/drivers/gpu/drm/vboxvideo/vbox_main.c
> index f4d02de5518a..0c3ede058f2b 100644
> --- a/drivers/gpu/drm/vboxvideo/vbox_main.c
> +++ b/drivers/gpu/drm/vboxvideo/vbox_main.c
> @@ -274,7 +274,7 @@ void vbox_hw_fini(struct vbox_private *vbox)
>   int vbox_gem_create(struct vbox_private *vbox,
>   		    u32 size, bool iskernel, struct drm_gem_object **obj)
>   {
> -	struct vbox_bo *vboxbo;
> +	struct drm_gem_ttm_object *gbo;
>   	int ret;
>   
>   	*obj = NULL;
> @@ -283,14 +283,15 @@ int vbox_gem_create(struct vbox_private *vbox,
>   	if (size == 0)
>   		return -EINVAL;
>   
> -	ret = vbox_bo_create(vbox, size, 0, 0, &vboxbo);
> -	if (ret) {
> +	gbo = drm_gem_ttm_create(&vbox->ddev, &vbox->ttm.bdev, size, 0, false);
> +	if (IS_ERR(gbo)) {
> +		ret = PTR_ERR(gbo);
>   		if (ret != -ERESTARTSYS)
>   			DRM_ERROR("failed to allocate GEM object\n");
>   		return ret;
>   	}
>   
> -	*obj = &vboxbo->gem;
> +	*obj = &gbo->gem;
>   
>   	return 0;
>   }
> @@ -298,64 +299,9 @@ int vbox_gem_create(struct vbox_private *vbox,
>   int vbox_dumb_create(struct drm_file *file,
>   		     struct drm_device *dev, struct drm_mode_create_dumb *args)
>   {
> -	struct vbox_private *vbox =
> -		container_of(dev, struct vbox_private, ddev);
> -	struct drm_gem_object *gobj;
> -	u32 handle;
> -	int ret;
> -
> -	args->pitch = args->width * ((args->bpp + 7) / 8);
> -	args->size = args->pitch * args->height;
> -
> -	ret = vbox_gem_create(vbox, args->size, false, &gobj);
> -	if (ret)
> -		return ret;
> -
> -	ret = drm_gem_handle_create(file, gobj, &handle);
> -	drm_gem_object_put_unlocked(gobj);
> -	if (ret)
> -		return ret;
> +	struct vbox_private *vbox = dev->dev_private;
>   
> -	args->handle = handle;
> -
> -	return 0;
> -}
> -
> -void vbox_gem_free_object(struct drm_gem_object *obj)
> -{
> -	struct vbox_bo *vbox_bo = gem_to_vbox_bo(obj);
> +	return drm_gem_ttm_fill_create_dumb(file, dev, &vbox->ttm.bdev, 0,
> +					    false, args);
>   
> -	ttm_bo_put(&vbox_bo->bo);
> -}
> -
> -static inline u64 vbox_bo_mmap_offset(struct vbox_bo *bo)
> -{
> -	return drm_vma_node_offset_addr(&bo->bo.vma_node);
> -}
> -
> -int
> -vbox_dumb_mmap_offset(struct drm_file *file,
> -		      struct drm_device *dev,
> -		      u32 handle, u64 *offset)
> -{
> -	struct drm_gem_object *obj;
> -	int ret;
> -	struct vbox_bo *bo;
> -
> -	mutex_lock(&dev->struct_mutex);
> -	obj = drm_gem_object_lookup(file, handle);
> -	if (!obj) {
> -		ret = -ENOENT;
> -		goto out_unlock;
> -	}
> -
> -	bo = gem_to_vbox_bo(obj);
> -	*offset = vbox_bo_mmap_offset(bo);
> -
> -	drm_gem_object_put(obj);
> -	ret = 0;
> -
> -out_unlock:
> -	mutex_unlock(&dev->struct_mutex);
> -	return ret;
>   }
> diff --git a/drivers/gpu/drm/vboxvideo/vbox_mode.c b/drivers/gpu/drm/vboxvideo/vbox_mode.c
> index 620a6e38f71f..faabf2801739 100644
> --- a/drivers/gpu/drm/vboxvideo/vbox_mode.c
> +++ b/drivers/gpu/drm/vboxvideo/vbox_mode.c
> @@ -173,7 +173,8 @@ static void vbox_crtc_set_base_and_mode(struct drm_crtc *crtc,
>   					struct drm_framebuffer *fb,
>   					int x, int y)
>   {
> -	struct vbox_bo *bo = gem_to_vbox_bo(to_vbox_framebuffer(fb)->obj);
> +	struct drm_gem_ttm_object *gbo =
> +		drm_gem_ttm_of_gem(to_vbox_framebuffer(fb)->obj);
>   	struct vbox_private *vbox = crtc->dev->dev_private;
>   	struct vbox_crtc *vbox_crtc = to_vbox_crtc(crtc);
>   	bool needs_modeset = drm_atomic_crtc_needs_modeset(crtc->state);
> @@ -187,7 +188,7 @@ static void vbox_crtc_set_base_and_mode(struct drm_crtc *crtc,
>   
>   	vbox_crtc->x = x;
>   	vbox_crtc->y = y;
> -	vbox_crtc->fb_offset = vbox_bo_gpu_offset(bo);
> +	vbox_crtc->fb_offset = drm_gem_ttm_vram_offset(gbo);
>   
>   	/* vbox_do_modeset() checks vbox->single_framebuffer so update it now */
>   	if (needs_modeset && vbox_set_up_input_mapping(vbox)) {
> @@ -303,14 +304,14 @@ static void vbox_primary_atomic_disable(struct drm_plane *plane,
>   static int vbox_primary_prepare_fb(struct drm_plane *plane,
>   				   struct drm_plane_state *new_state)
>   {
> -	struct vbox_bo *bo;
> +	struct drm_gem_ttm_object *gbo;
>   	int ret;
>   
>   	if (!new_state->fb)
>   		return 0;
>   
> -	bo = gem_to_vbox_bo(to_vbox_framebuffer(new_state->fb)->obj);
> -	ret = vbox_bo_pin(bo, TTM_PL_FLAG_VRAM);
> +	gbo = drm_gem_ttm_of_gem(to_vbox_framebuffer(new_state->fb)->obj);
> +	ret = drm_gem_ttm_pin(gbo, TTM_PL_FLAG_VRAM);
>   	if (ret)
>   		DRM_WARN("Error %d pinning new fb, out of video mem?\n", ret);
>   
> @@ -320,13 +321,13 @@ static int vbox_primary_prepare_fb(struct drm_plane *plane,
>   static void vbox_primary_cleanup_fb(struct drm_plane *plane,
>   				    struct drm_plane_state *old_state)
>   {
> -	struct vbox_bo *bo;
> +	struct drm_gem_ttm_object *gbo;
>   
>   	if (!old_state->fb)
>   		return;
>   
> -	bo = gem_to_vbox_bo(to_vbox_framebuffer(old_state->fb)->obj);
> -	vbox_bo_unpin(bo);
> +	gbo = drm_gem_ttm_of_gem(to_vbox_framebuffer(old_state->fb)->obj);
> +	drm_gem_ttm_unpin(gbo);
>   }
>   
>   static int vbox_cursor_atomic_check(struct drm_plane *plane,
> @@ -386,7 +387,8 @@ static void vbox_cursor_atomic_update(struct drm_plane *plane,
>   		container_of(plane->dev, struct vbox_private, ddev);
>   	struct vbox_crtc *vbox_crtc = to_vbox_crtc(plane->state->crtc);
>   	struct drm_framebuffer *fb = plane->state->fb;
> -	struct vbox_bo *bo = gem_to_vbox_bo(to_vbox_framebuffer(fb)->obj);
> +	struct drm_gem_ttm_object *gbo =
> +		drm_gem_ttm_of_gem(to_vbox_framebuffer(fb)->obj);
>   	u32 width = plane->state->crtc_w;
>   	u32 height = plane->state->crtc_h;
>   	size_t data_size, mask_size;
> @@ -405,7 +407,7 @@ static void vbox_cursor_atomic_update(struct drm_plane *plane,
>   	vbox_crtc->cursor_enabled = true;
>   
>   	/* pinning is done in prepare/cleanup framebuffer */
> -	src = vbox_bo_kmap(bo);
> +	src = drm_gem_ttm_kmap(gbo, true);
>   	if (IS_ERR(src)) {
>   		mutex_unlock(&vbox->hw_mutex);
>   		DRM_WARN("Could not kmap cursor bo, skipping update\n");
> @@ -421,7 +423,7 @@ static void vbox_cursor_atomic_update(struct drm_plane *plane,
>   	data_size = width * height * 4 + mask_size;
>   
>   	copy_cursor_image(src, vbox->cursor_data, width, height, mask_size);
> -	vbox_bo_kunmap(bo);
> +	drm_gem_ttm_kunmap(gbo);
>   
>   	flags = VBOX_MOUSE_POINTER_VISIBLE | VBOX_MOUSE_POINTER_SHAPE |
>   		VBOX_MOUSE_POINTER_ALPHA;
> @@ -461,25 +463,25 @@ static void vbox_cursor_atomic_disable(struct drm_plane *plane,
>   static int vbox_cursor_prepare_fb(struct drm_plane *plane,
>   				  struct drm_plane_state *new_state)
>   {
> -	struct vbox_bo *bo;
> +	struct drm_gem_ttm_object *gbo;
>   
>   	if (!new_state->fb)
>   		return 0;
>   
> -	bo = gem_to_vbox_bo(to_vbox_framebuffer(new_state->fb)->obj);
> -	return vbox_bo_pin(bo, TTM_PL_FLAG_SYSTEM);
> +	gbo = drm_gem_ttm_of_gem(to_vbox_framebuffer(new_state->fb)->obj);
> +	return drm_gem_ttm_pin(gbo, TTM_PL_FLAG_SYSTEM);
>   }
>   
>   static void vbox_cursor_cleanup_fb(struct drm_plane *plane,
>   				   struct drm_plane_state *old_state)
>   {
> -	struct vbox_bo *bo;
> +	struct drm_gem_ttm_object *gbo;
>   
>   	if (!plane->state->fb)
>   		return;
>   
> -	bo = gem_to_vbox_bo(to_vbox_framebuffer(plane->state->fb)->obj);
> -	vbox_bo_unpin(bo);
> +	gbo = drm_gem_ttm_of_gem(to_vbox_framebuffer(plane->state->fb)->obj);
> +	drm_gem_ttm_unpin(gbo);
>   }
>   
>   static const u32 vbox_cursor_plane_formats[] = {
> diff --git a/drivers/gpu/drm/vboxvideo/vbox_ttm.c b/drivers/gpu/drm/vboxvideo/vbox_ttm.c
> index 9d78438c2877..a1d64e1ea90c 100644
> --- a/drivers/gpu/drm/vboxvideo/vbox_ttm.c
> +++ b/drivers/gpu/drm/vboxvideo/vbox_ttm.c
> @@ -16,24 +16,6 @@ static inline struct vbox_private *vbox_bdev(struct ttm_bo_device *bd)
>   	return container_of(bd, struct vbox_private, ttm.bdev);
>   }
>   
> -static void vbox_bo_ttm_destroy(struct ttm_buffer_object *tbo)
> -{
> -	struct vbox_bo *bo;
> -
> -	bo = container_of(tbo, struct vbox_bo, bo);
> -
> -	drm_gem_object_release(&bo->gem);
> -	kfree(bo);
> -}
> -
> -static bool vbox_ttm_bo_is_vbox_bo(struct ttm_buffer_object *bo)
> -{
> -	if (bo->destroy == &vbox_bo_ttm_destroy)
> -		return true;
> -
> -	return false;
> -}
> -
>   static int
>   vbox_bo_init_mem_type(struct ttm_bo_device *bdev, u32 type,
>   		      struct ttm_mem_type_manager *man)
> @@ -58,24 +40,6 @@ vbox_bo_init_mem_type(struct ttm_bo_device *bdev, u32 type,
>   	return 0;
>   }
>   
> -static void
> -vbox_bo_evict_flags(struct ttm_buffer_object *bo, struct ttm_placement *pl)
> -{
> -	struct vbox_bo *vboxbo = vbox_bo(bo);
> -
> -	if (!vbox_ttm_bo_is_vbox_bo(bo))
> -		return;
> -
> -	vbox_ttm_placement(vboxbo, TTM_PL_FLAG_SYSTEM);
> -	*pl = vboxbo->placement;
> -}
> -
> -static int vbox_bo_verify_access(struct ttm_buffer_object *bo,
> -				 struct file *filp)
> -{
> -	return 0;
> -}
> -
>   static int vbox_ttm_io_mem_reserve(struct ttm_bo_device *bdev,
>   				   struct ttm_mem_reg *mem)
>   {
> @@ -141,8 +105,8 @@ static struct ttm_bo_driver vbox_bo_driver = {
>   	.ttm_tt_create = vbox_ttm_tt_create,
>   	.init_mem_type = vbox_bo_init_mem_type,
>   	.eviction_valuable = ttm_bo_eviction_valuable,
> -	.evict_flags = vbox_bo_evict_flags,
> -	.verify_access = vbox_bo_verify_access,
> +	.evict_flags = drm_gem_ttm_bo_driver_evict_flags,
> +	.verify_access = drm_gem_ttm_bo_driver_verify_access,
>   	.io_mem_reserve = &vbox_ttm_io_mem_reserve,
>   	.io_mem_free = &vbox_ttm_io_mem_free,
>   };
> @@ -196,165 +160,6 @@ void vbox_mm_fini(struct vbox_private *vbox)
>   	ttm_bo_device_release(&vbox->ttm.bdev);
>   }
>   
> -void vbox_ttm_placement(struct vbox_bo *bo, int domain)
> -{
> -	unsigned int i;
> -	u32 c = 0;
> -
> -	bo->placement.placement = bo->placements;
> -	bo->placement.busy_placement = bo->placements;
> -
> -	if (domain & TTM_PL_FLAG_VRAM)
> -		bo->placements[c++].flags =
> -		    TTM_PL_FLAG_WC | TTM_PL_FLAG_UNCACHED | TTM_PL_FLAG_VRAM;
> -	if (domain & TTM_PL_FLAG_SYSTEM)
> -		bo->placements[c++].flags =
> -		    TTM_PL_MASK_CACHING | TTM_PL_FLAG_SYSTEM;
> -	if (!c)
> -		bo->placements[c++].flags =
> -		    TTM_PL_MASK_CACHING | TTM_PL_FLAG_SYSTEM;
> -
> -	bo->placement.num_placement = c;
> -	bo->placement.num_busy_placement = c;
> -
> -	for (i = 0; i < c; ++i) {
> -		bo->placements[i].fpfn = 0;
> -		bo->placements[i].lpfn = 0;
> -	}
> -}
> -
> -int vbox_bo_create(struct vbox_private *vbox, int size, int align,
> -		   u32 flags, struct vbox_bo **pvboxbo)
> -{
> -	struct vbox_bo *vboxbo;
> -	size_t acc_size;
> -	int ret;
> -
> -	vboxbo = kzalloc(sizeof(*vboxbo), GFP_KERNEL);
> -	if (!vboxbo)
> -		return -ENOMEM;
> -
> -	ret = drm_gem_object_init(&vbox->ddev, &vboxbo->gem, size);
> -	if (ret)
> -		goto err_free_vboxbo;
> -
> -	vboxbo->bo.bdev = &vbox->ttm.bdev;
> -
> -	vbox_ttm_placement(vboxbo, TTM_PL_FLAG_VRAM | TTM_PL_FLAG_SYSTEM);
> -
> -	acc_size = ttm_bo_dma_acc_size(&vbox->ttm.bdev, size,
> -				       sizeof(struct vbox_bo));
> -
> -	ret = ttm_bo_init(&vbox->ttm.bdev, &vboxbo->bo, size,
> -			  ttm_bo_type_device, &vboxbo->placement,
> -			  align >> PAGE_SHIFT, false, acc_size,
> -			  NULL, NULL, vbox_bo_ttm_destroy);
> -	if (ret)
> -		goto err_free_vboxbo;
> -
> -	*pvboxbo = vboxbo;
> -
> -	return 0;
> -
> -err_free_vboxbo:
> -	kfree(vboxbo);
> -	return ret;
> -}
> -
> -int vbox_bo_pin(struct vbox_bo *bo, u32 pl_flag)
> -{
> -	struct ttm_operation_ctx ctx = { false, false };
> -	int i, ret;
> -
> -	if (bo->pin_count) {
> -		bo->pin_count++;
> -		return 0;
> -	}
> -
> -	ret = vbox_bo_reserve(bo, false);
> -	if (ret)
> -		return ret;
> -
> -	vbox_ttm_placement(bo, pl_flag);
> -
> -	for (i = 0; i < bo->placement.num_placement; i++)
> -		bo->placements[i].flags |= TTM_PL_FLAG_NO_EVICT;
> -
> -	ret = ttm_bo_validate(&bo->bo, &bo->placement, &ctx);
> -	if (ret == 0)
> -		bo->pin_count = 1;
> -
> -	vbox_bo_unreserve(bo);
> -
> -	return ret;
> -}
> -
> -int vbox_bo_unpin(struct vbox_bo *bo)
> -{
> -	struct ttm_operation_ctx ctx = { false, false };
> -	int i, ret;
> -
> -	if (!bo->pin_count) {
> -		DRM_ERROR("unpin bad %p\n", bo);
> -		return 0;
> -	}
> -	bo->pin_count--;
> -	if (bo->pin_count)
> -		return 0;
> -
> -	ret = vbox_bo_reserve(bo, false);
> -	if (ret) {
> -		DRM_ERROR("Error %d reserving bo, leaving it pinned\n", ret);
> -		return ret;
> -	}
> -
> -	for (i = 0; i < bo->placement.num_placement; i++)
> -		bo->placements[i].flags &= ~TTM_PL_FLAG_NO_EVICT;
> -
> -	ret = ttm_bo_validate(&bo->bo, &bo->placement, &ctx);
> -
> -	vbox_bo_unreserve(bo);
> -
> -	return ret;
> -}
> -
> -/*
> - * Move a vbox-owned buffer object to system memory if no one else has it
> - * pinned.  The caller must have pinned it previously, and this call will
> - * release the caller's pin.
> - */
> -int vbox_bo_push_sysram(struct vbox_bo *bo)
> -{
> -	struct ttm_operation_ctx ctx = { false, false };
> -	int i, ret;
> -
> -	if (!bo->pin_count) {
> -		DRM_ERROR("unpin bad %p\n", bo);
> -		return 0;
> -	}
> -	bo->pin_count--;
> -	if (bo->pin_count)
> -		return 0;
> -
> -	if (bo->kmap.virtual) {
> -		ttm_bo_kunmap(&bo->kmap);
> -		bo->kmap.virtual = NULL;
> -	}
> -
> -	vbox_ttm_placement(bo, TTM_PL_FLAG_SYSTEM);
> -
> -	for (i = 0; i < bo->placement.num_placement; i++)
> -		bo->placements[i].flags |= TTM_PL_FLAG_NO_EVICT;
> -
> -	ret = ttm_bo_validate(&bo->bo, &bo->placement, &ctx);
> -	if (ret) {
> -		DRM_ERROR("pushing to VRAM failed\n");
> -		return ret;
> -	}
> -
> -	return 0;
> -}
> -
>   int vbox_mmap(struct file *filp, struct vm_area_struct *vma)
>   {
>   	struct drm_file *file_priv = filp->private_data;
> @@ -362,27 +167,3 @@ int vbox_mmap(struct file *filp, struct vm_area_struct *vma)
>   
>   	return ttm_bo_mmap(filp, vma, &vbox->ttm.bdev);
>   }
> -
> -void *vbox_bo_kmap(struct vbox_bo *bo)
> -{
> -	int ret;
> -
> -	if (bo->kmap.virtual)
> -		return bo->kmap.virtual;
> -
> -	ret = ttm_bo_kmap(&bo->bo, 0, bo->bo.num_pages, &bo->kmap);
> -	if (ret) {
> -		DRM_ERROR("Error kmapping bo: %d\n", ret);
> -		return NULL;
> -	}
> -
> -	return bo->kmap.virtual;
> -}
> -
> -void vbox_bo_kunmap(struct vbox_bo *bo)
> -{
> -	if (bo->kmap.virtual) {
> -		ttm_bo_kunmap(&bo->kmap);
> -		bo->kmap.virtual = NULL;
> -	}
> -}
> 

^ permalink raw reply

* Re: [PATCH] drm/qxl: drop prime import/export callbacks
From: David Airlie @ 2019-04-09  6:06 UTC (permalink / raw)
  To: Gerd Hoffmann
  Cc: David Airlie, open list, dri-devel,
	open list:DRM DRIVER FOR QXL VIRTUAL GPU, Dave Airlie,
	open list:DRM DRIVER FOR QXL VIRTUAL GPU, Dave Airlie
In-Reply-To: <20190409060323.xrw3fwj3yvcxsx3t@sirius.home.kraxel.org>

On Tue, Apr 9, 2019 at 4:03 PM Gerd Hoffmann <kraxel@redhat.com> wrote:
>
> On Tue, Apr 09, 2019 at 02:01:33PM +1000, Dave Airlie wrote:
> > On Sat, 12 Jan 2019 at 07:13, Dave Airlie <airlied@gmail.com> wrote:
> > >
> > > On Thu, 10 Jan 2019 at 18:17, Gerd Hoffmann <kraxel@redhat.com> wrote:
> > > >
> > > > Also set prime_handle_to_fd and prime_fd_to_handle to NULL,
> > > > so drm will not advertive DRM_PRIME_CAP_{IMPORT,EXPORT} to
> > > > userspace.
> >
> > It's been pointed out to me that disables DRI3 for these devices, I'm
> > not sure that is the solution we actually wanted.
> >
> > any ideas?
>
> Well.  Lets have a look at where we stand:
>
>  * drm_gem_prime_export() works with qxl, you'll get a dma-buf handle.
>
>  * Other drivers trying to map that dma-buf (drm_gem_map_dma_buf()
>    callback) will not work, due to the ->gem_prime_get_sg_table()
>    callback not being there.
>
>  * drm_gem_prime_import() will work with buffers from the same qxl
>    device, there is a shortcut for this special case.  Otherwise it
>    will not work, due to the ->gem_prime_import_sg_table() callback
>    not being there.
>
> Bottom line: you can use prime to pass qxl object handles from one
> application to another.  But you can't actually export/import qxl
> buffer objects from/to other devices.
>
> Problem is that we have no way to signal to userspace that prime can
> be used that way.
>
> Setting DRM_PRIME_CAP_{IMPORT,EXPORT} even though the driver can't
> do that leads to other problems.  Userspace thinks it can have other
> devices (intel vgpu for example) handle the rendering, then import
> the rendered buffer into qxl for scanout.
>
> Should we add something like DRM_PRIME_CAP_SAME_DEVICE?

Yeah I expect we need some sort of same device only capability, so
that dri3 userspace can work.

If we just fail importing in these cases what happens? userspace just
gets confused, I know we used to print a backtrace if we hit the mmap
path, but if we didn't do that what happens?

Dave.

^ permalink raw reply

* Re: [PATCH] drm/qxl: drop prime import/export callbacks
From: Gerd Hoffmann @ 2019-04-09  6:03 UTC (permalink / raw)
  To: Dave Airlie
  Cc: David Airlie, open list, dri-devel,
	open list:DRM DRIVER FOR QXL VIRTUAL GPU, Dave Airlie,
	open list:DRM DRIVER FOR QXL VIRTUAL GPU
In-Reply-To: <CAPM=9tyEmn2LXhNmPSOi7JAgBc3QZwNtM+8FT8CWebz4GztaJA@mail.gmail.com>

On Tue, Apr 09, 2019 at 02:01:33PM +1000, Dave Airlie wrote:
> On Sat, 12 Jan 2019 at 07:13, Dave Airlie <airlied@gmail.com> wrote:
> >
> > On Thu, 10 Jan 2019 at 18:17, Gerd Hoffmann <kraxel@redhat.com> wrote:
> > >
> > > Also set prime_handle_to_fd and prime_fd_to_handle to NULL,
> > > so drm will not advertive DRM_PRIME_CAP_{IMPORT,EXPORT} to
> > > userspace.
> 
> It's been pointed out to me that disables DRI3 for these devices, I'm
> not sure that is the solution we actually wanted.
> 
> any ideas?

Well.  Lets have a look at where we stand:

 * drm_gem_prime_export() works with qxl, you'll get a dma-buf handle.

 * Other drivers trying to map that dma-buf (drm_gem_map_dma_buf()
   callback) will not work, due to the ->gem_prime_get_sg_table()
   callback not being there.

 * drm_gem_prime_import() will work with buffers from the same qxl
   device, there is a shortcut for this special case.  Otherwise it
   will not work, due to the ->gem_prime_import_sg_table() callback
   not being there.

Bottom line: you can use prime to pass qxl object handles from one
application to another.  But you can't actually export/import qxl
buffer objects from/to other devices.

Problem is that we have no way to signal to userspace that prime can
be used that way.

Setting DRM_PRIME_CAP_{IMPORT,EXPORT} even though the driver can't
do that leads to other problems.  Userspace thinks it can have other
devices (intel vgpu for example) handle the rendering, then import
the rendered buffer into qxl for scanout.

Should we add something like DRM_PRIME_CAP_SAME_DEVICE?

cheers,
  Gerd

^ permalink raw reply

* [PATCH net] vhost: flush dcache page when logging dirty pages
From: Jason Wang @ 2019-04-09  4:16 UTC (permalink / raw)
  To: mst, jasowang, kvm, virtualization, netdev, linux-kernel
  Cc: Christoph Hellwig, James Bottomley, Andrea Arcangeli

We set dirty bit through setting up kmaps and access them through
kernel virtual address, this may result alias in virtually tagged
caches that require a dcache flush afterwards.

Cc: Christoph Hellwig <hch@infradead.org>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Fixes: 3a4d5c94e9593 ("vhost_net: a kernel-level virtio server")
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 drivers/vhost/vhost.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 351af88231ad..34a1cedbc5ba 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -1711,6 +1711,7 @@ static int set_bit_to_user(int nr, void __user *addr)
 	base = kmap_atomic(page);
 	set_bit(bit, base);
 	kunmap_atomic(base);
+	flush_dcache_page(page);
 	set_page_dirty_lock(page);
 	put_page(page);
 	return 0;
-- 
2.19.1

^ permalink raw reply related

* [PATCH net] vhost: reject zero size iova range
From: Jason Wang @ 2019-04-09  4:10 UTC (permalink / raw)
  To: mst, jasowang, kvm, virtualization, netdev, linux-kernel

We used to accept zero size iova range which will lead a infinite loop
in translate_desc(). Fixing this by failing the request in this case.

Reported-by: syzbot+d21e6e297322a900c128@syzkaller.appspotmail.com
Fixes: 6b1e6cc7 ("vhost: new device IOTLB API")
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 drivers/vhost/vhost.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 5ace833de746..351af88231ad 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -911,8 +911,12 @@ static int vhost_new_umem_range(struct vhost_umem *umem,
 				u64 start, u64 size, u64 end,
 				u64 userspace_addr, int perm)
 {
-	struct vhost_umem_node *tmp, *node = kmalloc(sizeof(*node), GFP_ATOMIC);
+	struct vhost_umem_node *tmp, *node;
 
+	if (!size)
+		return -EFAULT;
+
+	node = kmalloc(sizeof(*node), GFP_ATOMIC);
 	if (!node)
 		return -ENOMEM;
 
-- 
2.19.1

^ permalink raw reply related

* Re: [PATCH] drm/qxl: drop prime import/export callbacks
From: Dave Airlie @ 2019-04-09  4:01 UTC (permalink / raw)
  To: Gerd Hoffmann
  Cc: David Airlie, open list, dri-devel,
	open list:DRM DRIVER FOR QXL VIRTUAL GPU, Dave Airlie,
	open list:DRM DRIVER FOR QXL VIRTUAL GPU
In-Reply-To: <CAPM=9tx8N6jxVzsT-RqtQFrFb9qLK4qUUje+d=fpTYZ6+PnYpg@mail.gmail.com>

On Sat, 12 Jan 2019 at 07:13, Dave Airlie <airlied@gmail.com> wrote:
>
> On Thu, 10 Jan 2019 at 18:17, Gerd Hoffmann <kraxel@redhat.com> wrote:
> >
> > Also set prime_handle_to_fd and prime_fd_to_handle to NULL,
> > so drm will not advertive DRM_PRIME_CAP_{IMPORT,EXPORT} to
> > userspace.

It's been pointed out to me that disables DRI3 for these devices, I'm
not sure that is the solution we actually wanted.

any ideas?
Dave.

> >
> > Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
>
> Reviewed-by: Dave Airlie <airlied@redhat.com>
> > ---
> >  drivers/gpu/drm/qxl/qxl_drv.c   |  4 ----
> >  drivers/gpu/drm/qxl/qxl_prime.c | 14 --------------
> >  2 files changed, 18 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/qxl/qxl_drv.c b/drivers/gpu/drm/qxl/qxl_drv.c
> > index 13c8a662f9..ccb090f3ab 100644
> > --- a/drivers/gpu/drm/qxl/qxl_drv.c
> > +++ b/drivers/gpu/drm/qxl/qxl_drv.c
> > @@ -250,14 +250,10 @@ static struct drm_driver qxl_driver = {
> >  #if defined(CONFIG_DEBUG_FS)
> >         .debugfs_init = qxl_debugfs_init,
> >  #endif
> > -       .prime_handle_to_fd = drm_gem_prime_handle_to_fd,
> > -       .prime_fd_to_handle = drm_gem_prime_fd_to_handle,
> >         .gem_prime_export = drm_gem_prime_export,
> >         .gem_prime_import = drm_gem_prime_import,
> >         .gem_prime_pin = qxl_gem_prime_pin,
> >         .gem_prime_unpin = qxl_gem_prime_unpin,
> > -       .gem_prime_get_sg_table = qxl_gem_prime_get_sg_table,
> > -       .gem_prime_import_sg_table = qxl_gem_prime_import_sg_table,
> >         .gem_prime_vmap = qxl_gem_prime_vmap,
> >         .gem_prime_vunmap = qxl_gem_prime_vunmap,
> >         .gem_prime_mmap = qxl_gem_prime_mmap,
> > diff --git a/drivers/gpu/drm/qxl/qxl_prime.c b/drivers/gpu/drm/qxl/qxl_prime.c
> > index a55dece118..df65d3c1a7 100644
> > --- a/drivers/gpu/drm/qxl/qxl_prime.c
> > +++ b/drivers/gpu/drm/qxl/qxl_prime.c
> > @@ -38,20 +38,6 @@ void qxl_gem_prime_unpin(struct drm_gem_object *obj)
> >         WARN_ONCE(1, "not implemented");
> >  }
> >
> > -struct sg_table *qxl_gem_prime_get_sg_table(struct drm_gem_object *obj)
> > -{
> > -       WARN_ONCE(1, "not implemented");
> > -       return ERR_PTR(-ENOSYS);
> > -}
> > -
> > -struct drm_gem_object *qxl_gem_prime_import_sg_table(
> > -       struct drm_device *dev, struct dma_buf_attachment *attach,
> > -       struct sg_table *table)
> > -{
> > -       WARN_ONCE(1, "not implemented");
> > -       return ERR_PTR(-ENOSYS);
> > -}
> > -
> >  void *qxl_gem_prime_vmap(struct drm_gem_object *obj)
> >  {
> >         WARN_ONCE(1, "not implemented");
> > --
> > 2.9.3
> >
> > _______________________________________________
> > dri-devel mailing list
> > dri-devel@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply

* Re: INFO: task hung in vhost_net_stop_vq
From: Jason Wang @ 2019-04-09  3:31 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: syzbot, weiyj.lk, KVM list, Michael S. Tsirkin, netdev,
	syzkaller-bugs, LKML, virtualization
In-Reply-To: <CACT4Y+Z1s-Lx1UXHKj88kQoOcbiD8gwyuRU3F_+cceP3pzbbrw@mail.gmail.com>


On 2019/3/26 下午6:28, Dmitry Vyukov wrote:
> On Tue, Mar 26, 2019 at 11:17 AM Jason Wang<jasowang@redhat.com>  wrote:
>> On 2019/3/25 下午10:02, Michael S. Tsirkin wrote:
>>> Looks like more iotlb locking mess?
>> Looking at the calltrace:
>>
>> [  221.743675] =============================================
>> [  221.744297] [ INFO: possible recursive locking detected ]
>> [  221.744944] 4.7.0+ #1 Not tainted
>> [  221.745326] ---------------------------------------------
>> [  221.746128] syz-executor1/6823 is trying to acquire lock:
>> [  221.746737]  (&vq->mutex){+.+...}, at: [<ffffffff84484b70>] vhost_process_iotlb_msg+0xe0/0x9e0
>> [  221.747789]
>> [  221.747789] but task is already holding lock:
>> [  221.748470]  (&vq->mutex){+.+...}, at: [<ffffffff84484b70>] vhost_process_iotlb_msg+0xe0/0x9e0
>> [  221.749535]
>> [  221.749535] other info that might help us debug this:
>> [  221.750280]  Possible unsafe locking scenario:
>> [  221.750280]
>> [  221.750946]        CPU0
>> [  221.751232]        ----
>> [  221.751523]   lock(&vq->mutex);
>> [  221.751922]   lock(&vq->mutex);
>> [  221.752339]
>> [  221.752339]  *** DEADLOCK ***
>> [  221.752339]
>>
>> I could not think of a path that can hit this. And I could not reproduce with the reproducer in the link in net-next.
> Looking at the bisection log, syzbot is able to reproduce this
> super-reliably on multiple kernel revisions. Are you sure you are
> using the right config/revision? What else can be in play? syzbot uses
> VMs. The image is available.
>
>

Yes, looks like the reason is vhost accept zero size iova range which 
lead a infinite loop when trying to translate iova. Will post a patch to 
fix this.

Thanks

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply

* Re: [PATCH net v7] failover: allow name change on IFF_UP slave interfaces
From: Michael S. Tsirkin @ 2019-04-08 21:00 UTC (permalink / raw)
  To: Si-Wei Liu
  Cc: jiri, kubakici, sridhar.samudrala, alexander.duyck,
	virtualization, liran.alon, netdev, boris.ostrovsky, davem
In-Reply-To: <1554755085-29098-1-git-send-email-si-wei.liu@oracle.com>

On Mon, Apr 08, 2019 at 04:24:45PM -0400, Si-Wei Liu wrote:
> When a netdev appears through hot plug then gets enslaved by a failover
> master that is already up and running, the slave will be opened
> right away after getting enslaved. Today there's a race that userspace
> (udev) may fail to rename the slave if the kernel (net_failover)
> opens the slave earlier than when the userspace rename happens.
> Unlike bond or team, the primary slave of failover can't be renamed by
> userspace ahead of time, since the kernel initiated auto-enslavement is
> unable to, or rather, is never meant to be synchronized with the rename
> request from userspace.
> 
> As the failover slave interfaces are not designed to be operated
> directly by userspace apps: IP configuration, filter rules with
> regard to network traffic passing and etc., should all be done on master
> interface. In general, userspace apps only care about the
> name of master interface, while slave names are less important as long
> as admin users can see reliable names that may carry
> other information describing the netdev. For e.g., they can infer that
> "ens3nsby" is a standby slave of "ens3", while for a
> name like "eth0" they can't tell which master it belongs to.
> 
> Historically the name of IFF_UP interface can't be changed because
> there might be admin script or management software that is already
> relying on such behavior and assumes that the slave name can't be
> changed once UP. But failover is special: with the in-kernel
> auto-enslavement mechanism, the userspace expectation for device
> enumeration and bring-up order is already broken. Previously initramfs
> and various userspace config tools were modified to bypass failover
> slaves because of auto-enslavement and duplicate MAC address. Similarly,
> in case that users care about seeing reliable slave name, the new type
> of failover slaves needs to be taken care of specifically in userspace
> anyway.
> 
> It's less risky to lift up the rename restriction on failover slave
> which is already UP. Although it's possible this change may potentially
> break userspace component (most likely configuration scripts or
> management software) that assumes slave name can't be changed while
> UP, it's relatively a limited and controllable set among all userspace
> components, which can be fixed specifically to listen for the rename
> and/or link down/up events on failover slaves.

This was correct with v6. With v7 link down/up events are no longer
emitted.

> Userspace component
> interacting with slaves is expected to be changed to operate on failover
> master interface instead, as the failover slave is dynamic in nature
> which may come and go at any point.  The goal is to make the role of
> failover slaves less relevant, and userspace components should only
> deal with failover master in the long run.
> 
> Fixes: 30c8bd5aa8b2 ("net: Introduce generic failover module")
> Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
> Reviewed-by: Liran Alon <liran.alon@oracle.com>
> 
> --
> v1 -> v2:
> - Drop configurable module parameter (Sridhar)
> 
> v2 -> v3:
> - Drop additional IFF_SLAVE_RENAME_OK flag (Sridhar)
> - Send down and up events around rename (Michael S. Tsirkin)
> 
> v3 -> v4:
> - Simplify notification to be sent (Stephen Hemminger)
> 
> v4 -> v5:
> - Sync up code with latest net-next (Sridhar)
> - Use proper structure initialization (Stephen, Jiri)
> 
> v5 -> v6:
> - Make the property of live name change a generic flag (Stephen)
> 
> v6 -> v7:
> - Remove NETDEV_CHANGE notification that is deemed unnecessary
>   (Stephen)
> ---

I preferred v6 better. If we emit NETDEV_CHANGE then
userspace that expects that name never changes for up
interfaces will work since it will think state changed.

>  include/linux/netdevice.h |  3 +++
>  net/core/dev.c            | 16 +++++++++++++++-
>  net/core/failover.c       |  6 +++---
>  3 files changed, 21 insertions(+), 4 deletions(-)
> 
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index 78f5ec4e..ea9a63f 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -1498,6 +1498,7 @@ struct net_device_ops {
>   * @IFF_FAILOVER: device is a failover master device
>   * @IFF_FAILOVER_SLAVE: device is lower dev of a failover master device
>   * @IFF_L3MDEV_RX_HANDLER: only invoke the rx handler of L3 master device
> + * @IFF_LIVE_RENAME_OK: rename is allowed while device is up and running
>   */
>  enum netdev_priv_flags {
>  	IFF_802_1Q_VLAN			= 1<<0,
> @@ -1530,6 +1531,7 @@ enum netdev_priv_flags {
>  	IFF_FAILOVER			= 1<<27,
>  	IFF_FAILOVER_SLAVE		= 1<<28,
>  	IFF_L3MDEV_RX_HANDLER		= 1<<29,
> +	IFF_LIVE_RENAME_OK		= 1<<30,
>  };
>  
>  #define IFF_802_1Q_VLAN			IFF_802_1Q_VLAN
> @@ -1561,6 +1563,7 @@ enum netdev_priv_flags {
>  #define IFF_FAILOVER			IFF_FAILOVER
>  #define IFF_FAILOVER_SLAVE		IFF_FAILOVER_SLAVE
>  #define IFF_L3MDEV_RX_HANDLER		IFF_L3MDEV_RX_HANDLER
> +#define IFF_LIVE_RENAME_OK		IFF_LIVE_RENAME_OK
>  
>  /**
>   *	struct net_device - The DEVICE structure.
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 9823b77..1622d88 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -1185,7 +1185,21 @@ int dev_change_name(struct net_device *dev, const char *newname)
>  	BUG_ON(!dev_net(dev));
>  
>  	net = dev_net(dev);
> -	if (dev->flags & IFF_UP)
> +
> +	/* Some auto-enslaved devices e.g. failover slaves are
> +	 * special, as userspace might rename the device after
> +	 * the interface had been brought up and running since
> +	 * the point kernel initiated auto-enslavement. Allow
> +	 * live name change even when these slave devices are
> +	 * up and running.
> +	 *
> +	 * Typically, users of these auto-enslaving devices
> +	 * don't actually care about slave name change, as
> +	 * they are supposed to operate on master interface
> +	 * directly.
> +	 */
> +	if (dev->flags & IFF_UP &&
> +	    likely(!(dev->priv_flags & IFF_LIVE_RENAME_OK)))
>  		return -EBUSY;
>  
>  	write_seqcount_begin(&devnet_rename_seq);
> diff --git a/net/core/failover.c b/net/core/failover.c
> index 4a92a98..b5cd3c7 100644
> --- a/net/core/failover.c
> +++ b/net/core/failover.c
> @@ -80,14 +80,14 @@ static int failover_slave_register(struct net_device *slave_dev)
>  		goto err_upper_link;
>  	}
>  
> -	slave_dev->priv_flags |= IFF_FAILOVER_SLAVE;
> +	slave_dev->priv_flags |= (IFF_FAILOVER_SLAVE | IFF_LIVE_RENAME_OK);
>  
>  	if (fops && fops->slave_register &&
>  	    !fops->slave_register(slave_dev, failover_dev))
>  		return NOTIFY_OK;
>  
>  	netdev_upper_dev_unlink(slave_dev, failover_dev);
> -	slave_dev->priv_flags &= ~IFF_FAILOVER_SLAVE;
> +	slave_dev->priv_flags &= ~(IFF_FAILOVER_SLAVE | IFF_LIVE_RENAME_OK);
>  err_upper_link:
>  	netdev_rx_handler_unregister(slave_dev);
>  done:
> @@ -121,7 +121,7 @@ int failover_slave_unregister(struct net_device *slave_dev)
>  
>  	netdev_rx_handler_unregister(slave_dev);
>  	netdev_upper_dev_unlink(slave_dev, failover_dev);
> -	slave_dev->priv_flags &= ~IFF_FAILOVER_SLAVE;
> +	slave_dev->priv_flags &= ~(IFF_FAILOVER_SLAVE | IFF_LIVE_RENAME_OK);
>  
>  	if (fops && fops->slave_unregister &&
>  	    !fops->slave_unregister(slave_dev, failover_dev))
> -- 
> 1.8.3.1

^ permalink raw reply

* Re: [PATCH RFC 3/4] vsock/virtio: change the maximum packet size allowed
From: Stefan Hajnoczi @ 2019-04-08 15:45 UTC (permalink / raw)
  To: Stefano Garzarella
  Cc: kvm, Michael S. Tsirkin, linux-kernel, virtualization, netdev,
	David S. Miller
In-Reply-To: <20190408151735.itsfswajk5ww3ejv@steredhat>


[-- Attachment #1.1: Type: text/plain, Size: 1461 bytes --]

On Mon, Apr 08, 2019 at 05:17:35PM +0200, Stefano Garzarella wrote:
> On Mon, Apr 08, 2019 at 10:57:44AM -0400, Michael S. Tsirkin wrote:
> > On Mon, Apr 08, 2019 at 04:55:31PM +0200, Stefano Garzarella wrote:
> > > > Anyway, any change to this behavior requires compatibility so new guest
> > > > drivers work with old vhost_vsock.ko.  Therefore we should probably just
> > > > leave the limit for now.
> > > 
> > > I understood your point of view and I completely agree with you.
> > > But, until we don't have a way to expose features/versions between guest
> > > and host,
> > 
> > Why not use the standard virtio feature negotiation mechanism for this?
> > 
> 
> Yes, I have this in my mind :), but I want to understand better if we can
> use virtio-net also for this mechanism.
> For now, I don't think limiting the packets to 64 KiB is a big issue.
> 
> What do you think if I postpone this when I have more clear if we can
> use virtio-net or not? (in order to avoid duplicated work)

Yes, I agree.  VIRTIO has feature negotiation and we can use it to
change this behavior cleanly.

However, this will require a spec change and this patch series delivers
significant performance improvements that can be merged sooner than
VIRTIO spec changes.

Let's defer the max packet size change via VIRTIO feature bits.  It can
be done separately if we decide to stick to the virtio-vsock device
design and not virtio-net.

Stefan

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 455 bytes --]

[-- Attachment #2: Type: text/plain, Size: 183 bytes --]

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply

* Re: [PATCH RFC 3/4] vsock/virtio: change the maximum packet size allowed
From: Stefano Garzarella @ 2019-04-08 15:17 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: kvm, linux-kernel, virtualization, Stefan Hajnoczi, netdev,
	David S. Miller
In-Reply-To: <20190408105619-mutt-send-email-mst@kernel.org>

On Mon, Apr 08, 2019 at 10:57:44AM -0400, Michael S. Tsirkin wrote:
> On Mon, Apr 08, 2019 at 04:55:31PM +0200, Stefano Garzarella wrote:
> > > Anyway, any change to this behavior requires compatibility so new guest
> > > drivers work with old vhost_vsock.ko.  Therefore we should probably just
> > > leave the limit for now.
> > 
> > I understood your point of view and I completely agree with you.
> > But, until we don't have a way to expose features/versions between guest
> > and host,
> 
> Why not use the standard virtio feature negotiation mechanism for this?
> 

Yes, I have this in my mind :), but I want to understand better if we can
use virtio-net also for this mechanism.
For now, I don't think limiting the packets to 64 KiB is a big issue.

What do you think if I postpone this when I have more clear if we can
use virtio-net or not? (in order to avoid duplicated work)

Thanks,
Stefano

^ permalink raw reply

* Re: [PATCH RFC 3/4] vsock/virtio: change the maximum packet size allowed
From: Michael S. Tsirkin @ 2019-04-08 14:57 UTC (permalink / raw)
  To: Stefano Garzarella
  Cc: kvm, linux-kernel, virtualization, Stefan Hajnoczi, netdev,
	David S. Miller
In-Reply-To: <20190408145531.yreyawkn5vjqj7sl@steredhat>

On Mon, Apr 08, 2019 at 04:55:31PM +0200, Stefano Garzarella wrote:
> > Anyway, any change to this behavior requires compatibility so new guest
> > drivers work with old vhost_vsock.ko.  Therefore we should probably just
> > leave the limit for now.
> 
> I understood your point of view and I completely agree with you.
> But, until we don't have a way to expose features/versions between guest
> and host,

Why not use the standard virtio feature negotiation mechanism for this?

> maybe is better to leave the limit in order to be compatible
> with old vhost_vsock.

-- 
MST

^ permalink raw reply

* Re: [PATCH RFC 3/4] vsock/virtio: change the maximum packet size allowed
From: Stefano Garzarella @ 2019-04-08 14:55 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: kvm, Michael S. Tsirkin, linux-kernel, virtualization, netdev,
	David S. Miller
In-Reply-To: <20190408093723.GP15001@stefanha-x1.localdomain>

On Mon, Apr 08, 2019 at 10:37:23AM +0100, Stefan Hajnoczi wrote:
> On Fri, Apr 05, 2019 at 12:07:47PM +0200, Stefano Garzarella wrote:
> > On Fri, Apr 05, 2019 at 09:24:47AM +0100, Stefan Hajnoczi wrote:
> > > On Thu, Apr 04, 2019 at 12:58:37PM +0200, Stefano Garzarella wrote:
> > > > Since now we are able to split packets, we can avoid limiting
> > > > their sizes to VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE.
> > > > Instead, we can use VIRTIO_VSOCK_MAX_PKT_BUF_SIZE as the max
> > > > packet size.
> > > > 
> > > > Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
> > > > ---
> > > >  net/vmw_vsock/virtio_transport_common.c | 4 ++--
> > > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > > > 
> > > > diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
> > > > index f32301d823f5..822e5d07a4ec 100644
> > > > --- a/net/vmw_vsock/virtio_transport_common.c
> > > > +++ b/net/vmw_vsock/virtio_transport_common.c
> > > > @@ -167,8 +167,8 @@ static int virtio_transport_send_pkt_info(struct vsock_sock *vsk,
> > > >  	vvs = vsk->trans;
> > > >  
> > > >  	/* we can send less than pkt_len bytes */
> > > > -	if (pkt_len > VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE)
> > > > -		pkt_len = VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE;
> > > > +	if (pkt_len > VIRTIO_VSOCK_MAX_PKT_BUF_SIZE)
> > > > +		pkt_len = VIRTIO_VSOCK_MAX_PKT_BUF_SIZE;
> > > 
> > > The next line limits pkt_len based on available credits:
> > > 
> > >   /* virtio_transport_get_credit might return less than pkt_len credit */
> > >   pkt_len = virtio_transport_get_credit(vvs, pkt_len);
> > > 
> > > I think drivers/vhost/vsock.c:vhost_transport_do_send_pkt() now works
> > > correctly even with pkt_len > VIRTIO_VSOCK_MAX_PKT_BUF_SIZE.
> > 
> > Correct.
> > 
> > > 
> > > The other ->send_pkt() callback is
> > > net/vmw_vsock/virtio_transport.c:virtio_transport_send_pkt_work() and it
> > > can already send any size packet.
> > > 
> > > Do you remember why VIRTIO_VSOCK_MAX_PKT_BUF_SIZE still needs to be the
> > > limit?  I'm wondering if we can get rid of it now and just limit packets
> > > to the available credits.
> > 
> > There are 2 reasons why I left this limit:
> > 1. When the host receives a packets, it must be <=
> >    VIRTIO_VSOCK_MAX_PKT_BUF_SIZE [drivers/vhost/vsock.c:vhost_vsock_alloc_pkt()]
> >    So in this way we can limit the packets sent from the guest.
> 
> The general intent is to prevent the guest from sending huge buffers.
> This is good.
> 
> However, the guest must already obey the credit limit advertized by the
> host.  Therefore I think we should be checking against that instead of
> an arbitrary constant limit.
> 
> So I think the limit should be the receive buffer size, not
> VIRTIO_VSOCK_MAX_PKT_BUF_SIZE.  But at this point the code doesn't know
> which connection the packet is associated with and cannot check the
> receive buffer size. :(
> 
> Anyway, any change to this behavior requires compatibility so new guest
> drivers work with old vhost_vsock.ko.  Therefore we should probably just
> leave the limit for now.

I understood your point of view and I completely agree with you.
But, until we don't have a way to expose features/versions between guest
and host, maybe is better to leave the limit in order to be compatible
with old vhost_vsock.

> 
> > 2. When the host send packets, it help us to increase the parallelism
> >    (especially if the guest has 64 KB RX buffers) because the user thread
> >    will split packets, calling multiple times transport->stream_enqueue()
> >    in net/vmw_vsock/af_vsock.c:vsock_stream_sendmsg() while the
> >    vhost_transport_send_pkt_work() send them to the guest.
> 
> Sorry, I don't understand the reasoning.  Overall this creates more
> work.  Are you saying the benefit is that
> vhost_transport_send_pkt_work() can run "early" and notify the guest of
> partial rx data before all of it has been enqueued?

Something like that. Your reasoning is more accurate.
Anyway, I'll do some tests in order to understand better the behaviour!

Thanks,
Stefano

^ permalink raw reply

* Re: [PATCH 1/1] virtio_blk: replace 0 by HCTX_TYPE_DEFAULT to index blk_mq_tag_set->map
From: Jens Axboe @ 2019-04-08 14:08 UTC (permalink / raw)
  To: Dongli Zhang, virtualization, linux-block, linux-kernel; +Cc: mst
In-Reply-To: <1552354316-20204-1-git-send-email-dongli.zhang@oracle.com>

On 3/11/19 7:31 PM, Dongli Zhang wrote:
> Use HCTX_TYPE_DEFAULT instead of 0 to avoid hardcoding.

Applied, thanks.

-- 
Jens Axboe

^ permalink raw reply

* Re: [PATCH 1/1] virtio_blk: replace 0 by HCTX_TYPE_DEFAULT to index blk_mq_tag_set->map
From: Dongli Zhang @ 2019-04-08 14:00 UTC (permalink / raw)
  To: virtualization, linux-block, linux-kernel; +Cc: axboe, mst
In-Reply-To: <1552354316-20204-1-git-send-email-dongli.zhang@oracle.com>

ping?

The similar patchset has been queued by linux-scsi 5.2/scsi-queue.

virtio-blk is the last place where HCTX_TYPE_DEFAULT is hardcoded.

It would be more friendly to cscope if the hardcoding is avoided.

Thank you very much!

Dongli Zhang

On 03/12/2019 09:31 AM, Dongli Zhang wrote:
> Use HCTX_TYPE_DEFAULT instead of 0 to avoid hardcoding.
> 
> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
> ---
> This is the only place to cleanup under drivers/block/*. Another patch set
> is sent to scsi and then all of below are cleaned up:
> - block/*
> - drivers/*
> 
>  drivers/block/virtio_blk.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
> index 4bc083b..bed6035 100644
> --- a/drivers/block/virtio_blk.c
> +++ b/drivers/block/virtio_blk.c
> @@ -691,7 +691,8 @@ static int virtblk_map_queues(struct blk_mq_tag_set *set)
>  {
>  	struct virtio_blk *vblk = set->driver_data;
>  
> -	return blk_mq_virtio_map_queues(&set->map[0], vblk->vdev, 0);
> +	return blk_mq_virtio_map_queues(&set->map[HCTX_TYPE_DEFAULT],
> +					vblk->vdev, 0);
>  }
>  
>  #ifdef CONFIG_VIRTIO_BLK_SCSI
> 

^ permalink raw reply

* Re: [PATCH 0/2] Limit number of hw queues by nr_cpu_ids for virtio-blk and virtio-scsi
From: Dongli Zhang @ 2019-04-08 13:57 UTC (permalink / raw)
  To: linux-scsi, virtualization, linux-block
  Cc: axboe, martin.petersen, mst, cohuck, linux-kernel, jejb
In-Reply-To: <1553682995-5682-1-git-send-email-dongli.zhang@oracle.com>

ping?

Thank you very much!

Dongli Zhang

On 03/27/2019 06:36 PM, Dongli Zhang wrote:
> When tag_set->nr_maps is 1, the block layer limits the number of hw queues
> by nr_cpu_ids. No matter how many hw queues are use by
> virtio-blk/virtio-scsi, as they both have (tag_set->nr_maps == 1), they
> can use at most nr_cpu_ids hw queues.
> 
> In addition, specifically for pci scenario, when the 'num-queues' specified
> by qemu is more than maxcpus, virtio-blk/virtio-scsi would not be able to
> allocate more than maxcpus vectors in order to have a vector for each
> queue. As a result, they fall back into MSI-X with one vector for config
> and one shared for queues.
> 
> Considering above reasons, this patch set limits the number of hw queues
> used by nr_cpu_ids for both virtio-blk and virtio-scsi.
> 
> -------------------------------------------------------------
> 
> Here is test result of virtio-scsi:
> 
> qemu cmdline:
> 
> -smp 2,maxcpus=4, \
> -device virtio-scsi-pci,id=scsi0,num_queues=8, \
> -device scsi-hd,drive=drive0,bus=scsi0.0,channel=0,scsi-id=0,lun=0, \
> -drive file=test.img,if=none,id=drive0
> 
> Although maxcpus=4 and num_queues=8, 4 queues are used while 2 interrupts
> are allocated.
> 
> # cat /proc/interrupts
> ... ...
>  24:          0          0   PCI-MSI 65536-edge      virtio0-config
>  25:          0        369   PCI-MSI 65537-edge      virtio0-virtqueues
> ... ...
> 
> # /sys/block/sda/mq/
> 0  1  2  3   ------> 4 queues although qemu sets num_queues=8
> 
> 
> With the patch set, there is per-queue interrupt.
> 
> # cat /proc/interrupts
>  24:          0          0   PCI-MSI 65536-edge      virtio0-config
>  25:          0          0   PCI-MSI 65537-edge      virtio0-control
>  26:          0          0   PCI-MSI 65538-edge      virtio0-event
>  27:        296          0   PCI-MSI 65539-edge      virtio0-request
>  28:          0        139   PCI-MSI 65540-edge      virtio0-request
>  29:          0          0   PCI-MSI 65541-edge      virtio0-request
>  30:          0          0   PCI-MSI 65542-edge      virtio0-request
> 
> # ls /sys/block/sda/mq
> 0  1  2  3
> 
> -------------------------------------------------------------
> 
> Here is test result of virtio-blk:
> 
> qemu cmdline:
> 
> -smp 2,maxcpus=4,
> -device virtio-blk-pci,drive=drive-virtio-disk0,id=virtio-disk0,num-queues=8
> -drive test.img,format=raw,if=none,id=drive-virtio-disk0
> 
> Although maxcpus=4 and num-queues=8, 4 queues are used while 2 interrupts
> are allocated.
> 
> # cat /proc/interrupts
> ... ...
>  24:          0          0   PCI-MSI 65536-edge      virtio0-config
>  25:          0         65   PCI-MSI 65537-edge      virtio0-virtqueues
> ... ...
> 
> # ls /sys/block/vda/mq
> 0  1  2  3    -------> 4 queues although qemu sets num_queues=8
> 
> 
> With the patch set, there is per-queue interrupt.
> 
> # cat /proc/interrupts
>  24:          0          0   PCI-MSI 65536-edge      virtio0-config
>  25:         64          0   PCI-MSI 65537-edge      virtio0-req.0
>  26:          0      10290   PCI-MSI 65538-edge      virtio0-req.1
>  27:          0          0   PCI-MSI 65539-edge      virtio0-req.2
>  28:          0          0   PCI-MSI 65540-edge      virtio0-req.3
> 
> # ls /sys/block/vda/mq/
> 0  1  2  3
> 
> 
> Reference: https://lore.kernel.org/lkml/e4afe4c5-0262-4500-aeec-60f30734b4fc@default/
> 
> Thank you very much!
> 
> Dongli Zhang
> 

^ permalink raw reply

* Re: [PATCH] virtio: Honour 'may_reduce_num' in vring_create_virtqueue
From: Jens Freimann @ 2019-04-08 13:32 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: kvm, Michael S . Tsirkin, linux-kernel, stable, virtualization,
	Halil Pasic
In-Reply-To: <20190408123322.24086-1-cohuck@redhat.com>

On Mon, Apr 08, 2019 at 02:33:22PM +0200, Cornelia Huck wrote:
>vring_create_virtqueue() allows the caller to specify via the
>may_reduce_num parameter whether the vring code is allowed to
>allocate a smaller ring than specified.
>
>However, the split ring allocation code tries to allocate a
>smaller ring on allocation failure regardless of what the
>caller specified. This may cause trouble for e.g. virtio-pci
>in legacy mode, which does not support ring resizing. (The
>packed ring code does not resize in any case.)
>
>Let's fix this by bailing out immediately in the split ring code
>if the requested size cannot be allocated and may_reduce_num has
>not been specified.
>
>While at it, fix a typo in the usage instructions.
>
>Fixes: 2a2d1382fe9d ("virtio: Add improved queue allocation API")
>Cc: stable@vger.kernel.org # v4.6+
>Signed-off-by: Cornelia Huck <cohuck@redhat.com>
>---
> drivers/virtio/virtio_ring.c | 2 ++
> include/linux/virtio_ring.h  | 2 +-
> 2 files changed, 3 insertions(+), 1 deletion(-)
>

Reviewed-by: Jens Freimann <jfreimann@redhat.com>

^ permalink raw reply

* Re: [RFC PATCH 01/12] virtio/s390: use vring_create_virtqueue
From: Michael S. Tsirkin @ 2019-04-08 12:37 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Vasily Gorbik, linux-s390, Eric Farman, Claudio Imbrenda, kvm,
	Sebastian Ott, Farhan Ali, virtualization, Halil Pasic,
	Martin Schwidefsky, Viktor Mihajlovski, Janosch Frank
In-Reply-To: <20190408130128.7859febe.cohuck@redhat.com>

On Mon, Apr 08, 2019 at 01:01:28PM +0200, Cornelia Huck wrote:
> On Fri,  5 Apr 2019 01:16:11 +0200
> Halil Pasic <pasic@linux.ibm.com> wrote:
> 
> > The commit 2a2d1382fe9d ("virtio: Add improved queue allocation API")
> > establishes a new way of allocating virtqueues (as a part of the effort
> > that taught DMA to virtio rings).
> > 
> > In the future we will want virtio-ccw to use the DMA API as well.
> > 
> > Let us switch from the legacy method of allocating virtqueues to
> > vring_create_virtqueue() as the first step into that direction.
> > 
> > Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> > ---
> >  drivers/s390/virtio/virtio_ccw.c | 27 ++++++++-------------------
> >  1 file changed, 8 insertions(+), 19 deletions(-)
> > 
> > diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
> > index 74c328321889..edf4afe2d688 100644
> > --- a/drivers/s390/virtio/virtio_ccw.c
> > +++ b/drivers/s390/virtio/virtio_ccw.c
> 
> > @@ -516,17 +512,10 @@ static struct virtqueue *virtio_ccw_setup_vq(struct virtio_device *vdev,
> >  		err = info->num;
> >  		goto out_err;
> >  	}
> > -	size = PAGE_ALIGN(vring_size(info->num, KVM_VIRTIO_CCW_RING_ALIGN));
> > -	info->queue = alloc_pages_exact(size, GFP_KERNEL | __GFP_ZERO);
> > -	if (info->queue == NULL) {
> > -		dev_warn(&vcdev->cdev->dev, "no queue\n");
> > -		err = -ENOMEM;
> > -		goto out_err;
> > -	}
> > +	vq = vring_create_virtqueue(i, info->num, KVM_VIRTIO_CCW_RING_ALIGN,
> > +				    vdev, true, true, ctx,
> 
> This second true means 'may_reduce_num'. Looking at the vring code, it
> seems that this parameter is never checked; the code will try to
> allocate a smaller queue if it can't get the requested size in any
> case... this will probably be a problem for legacy virtio-pci, which
> explicitly sets may_reduce_num to false. (I can try to come up with a
> patch to fix that.)

Yes, pls do. Not too late for a bugfix to go into the current linux.

> > +				    virtio_ccw_kvm_notify, callback, name);
> >  
> > -	vq = vring_new_virtqueue(i, info->num, KVM_VIRTIO_CCW_RING_ALIGN, vdev,
> > -				 true, ctx, info->queue, virtio_ccw_kvm_notify,
> > -				 callback, name);
> >  	if (!vq) {
> >  		/* For now, we fail if we can't get the requested size. */
> >  		dev_warn(&vcdev->cdev->dev, "no vq\n");
> > @@ -534,15 +523,17 @@ static struct virtqueue *virtio_ccw_setup_vq(struct virtio_device *vdev,
> >  		goto out_err;
> >  	}
> >  
> > +
> 
> Extra blank line :)
> 
> >  	/* Register it with the host. */
> > +	queue = virtqueue_get_desc_addr(vq);
> >  	if (vcdev->revision == 0) {
> > -		info->info_block->l.queue = (__u64)info->queue;
> > +		info->info_block->l.queue = queue;
> >  		info->info_block->l.align = KVM_VIRTIO_CCW_RING_ALIGN;
> >  		info->info_block->l.index = i;
> >  		info->info_block->l.num = info->num;
> 
> You always fill in the size requested by the host, but the actual size
> may be smaller (see above). I don't think that is allowed for revision
> 0 (which implies !virtio-1). You probably need to call
> vring_create_virtqueue with may_reduce_num=false for revision 0 (and
> wait for the generic vring code to be fixed...)
> 
> >  		ccw->count = sizeof(info->info_block->l);
> >  	} else {
> > -		info->info_block->s.desc = (__u64)info->queue;
> > +		info->info_block->s.desc = queue;
> >  		info->info_block->s.index = i;
> >  		info->info_block->s.num = info->num;
> 
> Here, you need to obtain the actual number via
> virtqueue_get_vring_size().
> 
> >  		info->info_block->s.avail = (__u64)virtqueue_get_avail(vq);

^ permalink raw reply

* [PATCH] virtio: Honour 'may_reduce_num' in vring_create_virtqueue
From: Cornelia Huck @ 2019-04-08 12:33 UTC (permalink / raw)
  To: Michael S . Tsirkin, Jason Wang
  Cc: kvm, Cornelia Huck, linux-kernel, stable, virtualization,
	Halil Pasic

vring_create_virtqueue() allows the caller to specify via the
may_reduce_num parameter whether the vring code is allowed to
allocate a smaller ring than specified.

However, the split ring allocation code tries to allocate a
smaller ring on allocation failure regardless of what the
caller specified. This may cause trouble for e.g. virtio-pci
in legacy mode, which does not support ring resizing. (The
packed ring code does not resize in any case.)

Let's fix this by bailing out immediately in the split ring code
if the requested size cannot be allocated and may_reduce_num has
not been specified.

While at it, fix a typo in the usage instructions.

Fixes: 2a2d1382fe9d ("virtio: Add improved queue allocation API")
Cc: stable@vger.kernel.org # v4.6+
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
---
 drivers/virtio/virtio_ring.c | 2 ++
 include/linux/virtio_ring.h  | 2 +-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 18846afb39da..5df92c308286 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -882,6 +882,8 @@ static struct virtqueue *vring_create_virtqueue_split(
 					  GFP_KERNEL|__GFP_NOWARN|__GFP_ZERO);
 		if (queue)
 			break;
+		if (!may_reduce_num)
+			return NULL;
 	}
 
 	if (!num)
diff --git a/include/linux/virtio_ring.h b/include/linux/virtio_ring.h
index fab02133a919..3dc70adfe5f5 100644
--- a/include/linux/virtio_ring.h
+++ b/include/linux/virtio_ring.h
@@ -63,7 +63,7 @@ struct virtqueue;
 /*
  * Creates a virtqueue and allocates the descriptor ring.  If
  * may_reduce_num is set, then this may allocate a smaller ring than
- * expected.  The caller should query virtqueue_get_ring_size to learn
+ * expected.  The caller should query virtqueue_get_vring_size to learn
  * the actual size of the ring.
  */
 struct virtqueue *vring_create_virtqueue(unsigned int index,
-- 
2.17.2

^ permalink raw reply related

* Re: [PATCH 00/15] Share TTM code among framebuffer drivers
From: Koenig, Christian @ 2019-04-08 11:10 UTC (permalink / raw)
  To: Thomas Zimmermann, daniel@ffwll.ch, airlied@linux.ie,
	kraxel@redhat.com, Huang, Ray, Zhang, Jerry, hdegoede@redhat.com,
	z.liuxinliang@hisilicon.com, zourongrong@gmail.com,
	kong.kongxinwei@hisilicon.com, puck.chen@hisilicon.com
  Cc: dri-devel@lists.freedesktop.org,
	virtualization@lists.linux-foundation.org
In-Reply-To: <20190408092144.4548-1-tzimmermann@suse.de>

Well first problem is I'm not sure if that is a good idea. Essentially 
we want to get rid of TTM in the long run.

On the other hand this work might aid with that goal, so it might be 
worth a try.

Second is that this might actually not work of hand. The problem is here:
> +	/* TODO: This test used to be performed by drivers, but can
> +	 * this actually happen? If so, should we put the check into
> +	 * drm_gem_ttm_of_gem()? */
> +	if (!drm_is_gem_ttm(bo))
> +		return;

Yeah, this test is mandatory because TTM on itself can present buffer 
object which doesn't belong to the driver called.

E.g. we sometimes have BOs which don't belong to the current drivers on 
a driver specific LRU. A totally brain dead  design if you ask me, but 
that's how it is.

Not 100% sure, but by converting all drivers to use a common GEM_TTM 
backend you might actually break that test.

I'm not sure if that is actually a problem in the real world, it most 
likely isn't. But I still won't bet on it without being able to test this.

Regards,
Christian.

Am 08.04.19 um 11:21 schrieb Thomas Zimmermann:
> Several simple framebuffer drivers copy most of the TTM code from each
> other. The implementation is always the same; except for the name of
> some data structures.
>
> As recently discussed, this patch set provides generic TTM memory-
> management code for framebuffers with dedicated video memory. It further
> converts the respective drivers to the generic code. The shared code
> is basically the same implementation as the one copied among individual
> drivers.
>
> The patch set contains two major changes: first, it introduces
> |struct drm_gem_ttm_object| and helpers. It's a GEM object that is
> backed by TTM-managed memory. The type's purpose is somewhat similar
> to |struct drm_gem_{cma, shmem}_object|. Second, the patch set
> introduces |struct drm_simple_ttm| (for the lack of a better name) and
> helpers. It's an implementation of a basic TTM-based memory manager.
>
> Both, GEM TTM and Simple TTM, support VRAM and SYSTEM placements. Support
> for TT could probably be added if necessary. Both can be used independedly
> from each other if desired by the DRM driver.
>
> Currently ast, bochs, mgag200, vboxvideo and hisilicon/hibmc can use
> these helpers. Cirrus would also be a candidate, but as it's being
> rewrtten from scratch, I didn't bother doing the conversion.
>
> Future directions: with these changes, the respective drivers can also
> share some of their mode-setting or fbdev code. GEM TTM could implement
> PRIME helpers, which would allow for using the generic fbcon.
>
> The patch set is against a recent drm-tip.
>
> Thomas Zimmermann (15):
>    drm: Add |struct drm_gem_ttm_object| and helpers
>    drm: Add |struct drm_gem_ttm_object| callbacks for |struct
>      ttm_bo_driver|
>    drm: Add |struct drm_gem_ttm_object| callbacks for |struct drm_driver|
>    drm: Add drm_gem_ttm_fill_create_dumb() to create dumb buffers
>    drm: Add Simple TTM, a memory manager for dedicated VRAM
>    drm/ast: Convert AST driver to |struct drm_gem_ttm_object|
>    drm/ast: Convert AST driver to Simple TTM
>    drm/bochs: Convert Bochs driver to |struct drm_gem_ttm_object|
>    drm/bochs: Convert Bochs driver to Simple TTM
>    drm/mgag200: Convert mgag200 driver to |struct drm_gem_ttm_object|
>    drm/mgag200: Convert mgag200 driver to Simple TTM
>    drm/vboxvideo: Convert vboxvideo driver to |struct drm_gem_ttm_object|
>    drm/vboxvideo: Convert vboxvideo driver to Simple TTM
>    drm/hisilicon: Convert hibmc-drm driver to |struct drm_gem_ttm_object|
>    drm/hisilicon: Convert hibmc-drm driver to Simple TTM
>
>   Documentation/gpu/drm-mm.rst                  |  23 +
>   drivers/gpu/drm/Kconfig                       |  20 +
>   drivers/gpu/drm/Makefile                      |   5 +
>   drivers/gpu/drm/ast/Kconfig                   |   3 +-
>   drivers/gpu/drm/ast/ast_drv.c                 |   4 +-
>   drivers/gpu/drm/ast/ast_drv.h                 |  58 +-
>   drivers/gpu/drm/ast/ast_fb.c                  |  18 +-
>   drivers/gpu/drm/ast/ast_main.c                |  74 +--
>   drivers/gpu/drm/ast/ast_mode.c                |  78 +--
>   drivers/gpu/drm/ast/ast_ttm.c                 | 290 +---------
>   drivers/gpu/drm/bochs/Kconfig                 |   2 +
>   drivers/gpu/drm/bochs/bochs.h                 |  42 +-
>   drivers/gpu/drm/bochs/bochs_drv.c             |   4 +-
>   drivers/gpu/drm/bochs/bochs_kms.c             |  18 +-
>   drivers/gpu/drm/bochs/bochs_mm.c              | 392 +-------------
>   drivers/gpu/drm/drm_gem_ttm_helper.c          | 507 ++++++++++++++++++
>   drivers/gpu/drm/drm_simple_ttm_helper.c       | 191 +++++++
>   drivers/gpu/drm/drm_ttm_helper_common.c       |   6 +
>   drivers/gpu/drm/hisilicon/hibmc/Kconfig       |   2 +
>   .../gpu/drm/hisilicon/hibmc/hibmc_drm_de.c    |  19 +-
>   .../gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c   |   4 +-
>   .../gpu/drm/hisilicon/hibmc/hibmc_drm_drv.h   |  28 +-
>   .../gpu/drm/hisilicon/hibmc/hibmc_drm_fbdev.c |  30 +-
>   drivers/gpu/drm/hisilicon/hibmc/hibmc_ttm.c   | 328 +----------
>   drivers/gpu/drm/mgag200/Kconfig               |   2 +
>   drivers/gpu/drm/mgag200/mgag200_cursor.c      |  61 ++-
>   drivers/gpu/drm/mgag200/mgag200_drv.c         |   4 +-
>   drivers/gpu/drm/mgag200/mgag200_drv.h         |  68 +--
>   drivers/gpu/drm/mgag200/mgag200_fb.c          |  18 +-
>   drivers/gpu/drm/mgag200/mgag200_main.c        |  84 +--
>   drivers/gpu/drm/mgag200/mgag200_mode.c        |  45 +-
>   drivers/gpu/drm/mgag200/mgag200_ttm.c         | 290 +---------
>   drivers/gpu/drm/vboxvideo/Kconfig             |   2 +
>   drivers/gpu/drm/vboxvideo/vbox_drv.c          |   5 +-
>   drivers/gpu/drm/vboxvideo/vbox_drv.h          |  64 +--
>   drivers/gpu/drm/vboxvideo/vbox_fb.c           |  22 +-
>   drivers/gpu/drm/vboxvideo/vbox_main.c         |  70 +--
>   drivers/gpu/drm/vboxvideo/vbox_mode.c         |  36 +-
>   drivers/gpu/drm/vboxvideo/vbox_ttm.c          | 344 +-----------
>   include/drm/drm_gem_ttm_helper.h              | 119 ++++
>   include/drm/drm_simple_ttm_helper.h           |  74 +++
>   41 files changed, 1292 insertions(+), 2162 deletions(-)
>   create mode 100644 drivers/gpu/drm/drm_gem_ttm_helper.c
>   create mode 100644 drivers/gpu/drm/drm_simple_ttm_helper.c
>   create mode 100644 drivers/gpu/drm/drm_ttm_helper_common.c
>   create mode 100644 include/drm/drm_gem_ttm_helper.h
>   create mode 100644 include/drm/drm_simple_ttm_helper.h
>
> --
> 2.21.0
>

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply

* Re: [RFC PATCH 01/12] virtio/s390: use vring_create_virtqueue
From: Cornelia Huck @ 2019-04-08 11:01 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Vasily Gorbik, linux-s390, Eric Farman, Claudio Imbrenda, kvm,
	Sebastian Ott, Michael S . Tsirkin, Farhan Ali, virtualization,
	Martin Schwidefsky, Viktor Mihajlovski, Janosch Frank
In-Reply-To: <20190404231622.52531-2-pasic@linux.ibm.com>

On Fri,  5 Apr 2019 01:16:11 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> The commit 2a2d1382fe9d ("virtio: Add improved queue allocation API")
> establishes a new way of allocating virtqueues (as a part of the effort
> that taught DMA to virtio rings).
> 
> In the future we will want virtio-ccw to use the DMA API as well.
> 
> Let us switch from the legacy method of allocating virtqueues to
> vring_create_virtqueue() as the first step into that direction.
> 
> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> ---
>  drivers/s390/virtio/virtio_ccw.c | 27 ++++++++-------------------
>  1 file changed, 8 insertions(+), 19 deletions(-)
> 
> diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
> index 74c328321889..edf4afe2d688 100644
> --- a/drivers/s390/virtio/virtio_ccw.c
> +++ b/drivers/s390/virtio/virtio_ccw.c

> @@ -516,17 +512,10 @@ static struct virtqueue *virtio_ccw_setup_vq(struct virtio_device *vdev,
>  		err = info->num;
>  		goto out_err;
>  	}
> -	size = PAGE_ALIGN(vring_size(info->num, KVM_VIRTIO_CCW_RING_ALIGN));
> -	info->queue = alloc_pages_exact(size, GFP_KERNEL | __GFP_ZERO);
> -	if (info->queue == NULL) {
> -		dev_warn(&vcdev->cdev->dev, "no queue\n");
> -		err = -ENOMEM;
> -		goto out_err;
> -	}
> +	vq = vring_create_virtqueue(i, info->num, KVM_VIRTIO_CCW_RING_ALIGN,
> +				    vdev, true, true, ctx,

This second true means 'may_reduce_num'. Looking at the vring code, it
seems that this parameter is never checked; the code will try to
allocate a smaller queue if it can't get the requested size in any
case... this will probably be a problem for legacy virtio-pci, which
explicitly sets may_reduce_num to false. (I can try to come up with a
patch to fix that.)

> +				    virtio_ccw_kvm_notify, callback, name);
>  
> -	vq = vring_new_virtqueue(i, info->num, KVM_VIRTIO_CCW_RING_ALIGN, vdev,
> -				 true, ctx, info->queue, virtio_ccw_kvm_notify,
> -				 callback, name);
>  	if (!vq) {
>  		/* For now, we fail if we can't get the requested size. */
>  		dev_warn(&vcdev->cdev->dev, "no vq\n");
> @@ -534,15 +523,17 @@ static struct virtqueue *virtio_ccw_setup_vq(struct virtio_device *vdev,
>  		goto out_err;
>  	}
>  
> +

Extra blank line :)

>  	/* Register it with the host. */
> +	queue = virtqueue_get_desc_addr(vq);
>  	if (vcdev->revision == 0) {
> -		info->info_block->l.queue = (__u64)info->queue;
> +		info->info_block->l.queue = queue;
>  		info->info_block->l.align = KVM_VIRTIO_CCW_RING_ALIGN;
>  		info->info_block->l.index = i;
>  		info->info_block->l.num = info->num;

You always fill in the size requested by the host, but the actual size
may be smaller (see above). I don't think that is allowed for revision
0 (which implies !virtio-1). You probably need to call
vring_create_virtqueue with may_reduce_num=false for revision 0 (and
wait for the generic vring code to be fixed...)

>  		ccw->count = sizeof(info->info_block->l);
>  	} else {
> -		info->info_block->s.desc = (__u64)info->queue;
> +		info->info_block->s.desc = queue;
>  		info->info_block->s.index = i;
>  		info->info_block->s.num = info->num;

Here, you need to obtain the actual number via
virtqueue_get_vring_size().

>  		info->info_block->s.avail = (__u64)virtqueue_get_avail(vq);

^ permalink raw reply

* Re: [PATCH RFC 0/4] vsock/virtio: optimizations to increase the throughput
From: Stefan Hajnoczi @ 2019-04-08  9:44 UTC (permalink / raw)
  To: Jason Wang
  Cc: kvm, Michael S. Tsirkin, netdev, linux-kernel, virtualization,
	David S. Miller, Stefano Garzarella
In-Reply-To: <a98d194d-e257-41ff-82e7-de2d16d2f4d4@redhat.com>


[-- Attachment #1.1: Type: text/plain, Size: 383 bytes --]

On Mon, Apr 08, 2019 at 02:43:28PM +0800, Jason Wang wrote:
> Another thing that may help is to implement sendpage(), which will greatly
> improve the performance.

I can't find documentation for ->sendpage().  Is the idea that you get a
struct page for the payload and can do zero-copy tx?  (And can userspace
still write to the page, invalidating checksums in the header?)

Stefan

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 455 bytes --]

[-- Attachment #2: Type: text/plain, Size: 183 bytes --]

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply

* Re: [PATCH RFC 3/4] vsock/virtio: change the maximum packet size allowed
From: Stefan Hajnoczi @ 2019-04-08  9:37 UTC (permalink / raw)
  To: Stefano Garzarella
  Cc: kvm, Michael S. Tsirkin, linux-kernel, virtualization, netdev,
	David S. Miller
In-Reply-To: <20190405100747.dbwi3sjaudp3d2wa@steredhat>


[-- Attachment #1.1: Type: text/plain, Size: 3545 bytes --]

On Fri, Apr 05, 2019 at 12:07:47PM +0200, Stefano Garzarella wrote:
> On Fri, Apr 05, 2019 at 09:24:47AM +0100, Stefan Hajnoczi wrote:
> > On Thu, Apr 04, 2019 at 12:58:37PM +0200, Stefano Garzarella wrote:
> > > Since now we are able to split packets, we can avoid limiting
> > > their sizes to VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE.
> > > Instead, we can use VIRTIO_VSOCK_MAX_PKT_BUF_SIZE as the max
> > > packet size.
> > > 
> > > Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
> > > ---
> > >  net/vmw_vsock/virtio_transport_common.c | 4 ++--
> > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
> > > index f32301d823f5..822e5d07a4ec 100644
> > > --- a/net/vmw_vsock/virtio_transport_common.c
> > > +++ b/net/vmw_vsock/virtio_transport_common.c
> > > @@ -167,8 +167,8 @@ static int virtio_transport_send_pkt_info(struct vsock_sock *vsk,
> > >  	vvs = vsk->trans;
> > >  
> > >  	/* we can send less than pkt_len bytes */
> > > -	if (pkt_len > VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE)
> > > -		pkt_len = VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE;
> > > +	if (pkt_len > VIRTIO_VSOCK_MAX_PKT_BUF_SIZE)
> > > +		pkt_len = VIRTIO_VSOCK_MAX_PKT_BUF_SIZE;
> > 
> > The next line limits pkt_len based on available credits:
> > 
> >   /* virtio_transport_get_credit might return less than pkt_len credit */
> >   pkt_len = virtio_transport_get_credit(vvs, pkt_len);
> > 
> > I think drivers/vhost/vsock.c:vhost_transport_do_send_pkt() now works
> > correctly even with pkt_len > VIRTIO_VSOCK_MAX_PKT_BUF_SIZE.
> 
> Correct.
> 
> > 
> > The other ->send_pkt() callback is
> > net/vmw_vsock/virtio_transport.c:virtio_transport_send_pkt_work() and it
> > can already send any size packet.
> > 
> > Do you remember why VIRTIO_VSOCK_MAX_PKT_BUF_SIZE still needs to be the
> > limit?  I'm wondering if we can get rid of it now and just limit packets
> > to the available credits.
> 
> There are 2 reasons why I left this limit:
> 1. When the host receives a packets, it must be <=
>    VIRTIO_VSOCK_MAX_PKT_BUF_SIZE [drivers/vhost/vsock.c:vhost_vsock_alloc_pkt()]
>    So in this way we can limit the packets sent from the guest.

The general intent is to prevent the guest from sending huge buffers.
This is good.

However, the guest must already obey the credit limit advertized by the
host.  Therefore I think we should be checking against that instead of
an arbitrary constant limit.

So I think the limit should be the receive buffer size, not
VIRTIO_VSOCK_MAX_PKT_BUF_SIZE.  But at this point the code doesn't know
which connection the packet is associated with and cannot check the
receive buffer size. :(

Anyway, any change to this behavior requires compatibility so new guest
drivers work with old vhost_vsock.ko.  Therefore we should probably just
leave the limit for now.

> 2. When the host send packets, it help us to increase the parallelism
>    (especially if the guest has 64 KB RX buffers) because the user thread
>    will split packets, calling multiple times transport->stream_enqueue()
>    in net/vmw_vsock/af_vsock.c:vsock_stream_sendmsg() while the
>    vhost_transport_send_pkt_work() send them to the guest.

Sorry, I don't understand the reasoning.  Overall this creates more
work.  Are you saying the benefit is that
vhost_transport_send_pkt_work() can run "early" and notify the guest of
partial rx data before all of it has been enqueued?

Stefan

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 455 bytes --]

[-- Attachment #2: Type: text/plain, Size: 183 bytes --]

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply

* Re: [PATCH RFC 2/4] vhost/vsock: split packets to send using multiple buffers
From: Stefan Hajnoczi @ 2019-04-08  9:28 UTC (permalink / raw)
  To: Stefano Garzarella
  Cc: kvm, Michael S. Tsirkin, linux-kernel, virtualization, netdev,
	David S. Miller
In-Reply-To: <20190405093608.f2zsyxnjxcba5v6r@steredhat>


[-- Attachment #1.1: Type: text/plain, Size: 1367 bytes --]

On Fri, Apr 05, 2019 at 11:36:08AM +0200, Stefano Garzarella wrote:
> On Fri, Apr 05, 2019 at 09:13:56AM +0100, Stefan Hajnoczi wrote:
> > On Thu, Apr 04, 2019 at 12:58:36PM +0200, Stefano Garzarella wrote:
> > > -		vhost_add_used(vq, head, sizeof(pkt->hdr) + pkt->len);
> > > +		vhost_add_used(vq, head, sizeof(pkt->hdr) + payload_len);
> > >  		added = true;
> > >  
> > > +		pkt->off += payload_len;
> > > +
> > > +		/* If we didn't send all the payload we can requeue the packet
> > > +		 * to send it with the next available buffer.
> > > +		 */
> > > +		if (pkt->off < pkt->len) {
> > > +			spin_lock_bh(&vsock->send_pkt_list_lock);
> > > +			list_add(&pkt->list, &vsock->send_pkt_list);
> > > +			spin_unlock_bh(&vsock->send_pkt_list_lock);
> > > +			continue;
> > 
> > The virtio_transport_deliver_tap_pkt() call is skipped.  Packet capture
> > should see the exact packets that are delivered.  I think this patch
> > will present one large packet instead of several smaller packets that
> > were actually delivered.
> 
> I'll modify virtio_transport_build_skb() to take care of pkt->off
> and reading the payload size from the virtio_vsock_hdr.
> Otherwise, should I introduce another field in virtio_vsock_pkt to store
> the payload size?

I don't remember the details but I trust you'll pick a good way of doing
it.

Stefan

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 455 bytes --]

[-- Attachment #2: Type: text/plain, Size: 183 bytes --]

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply

* Re: [PATCH RFC 1/4] vsock/virtio: reduce credit update messages
From: Stefan Hajnoczi @ 2019-04-08  9:25 UTC (permalink / raw)
  To: Stefano Garzarella
  Cc: kvm, Michael S. Tsirkin, linux-kernel, virtualization, netdev,
	David S. Miller
In-Reply-To: <20190405081648.2zflr7gxknk4q3a2@steredhat>


[-- Attachment #1.1: Type: text/plain, Size: 1321 bytes --]

On Fri, Apr 05, 2019 at 10:16:48AM +0200, Stefano Garzarella wrote:
> On Thu, Apr 04, 2019 at 08:15:39PM +0100, Stefan Hajnoczi wrote:
> > On Thu, Apr 04, 2019 at 12:58:35PM +0200, Stefano Garzarella wrote:
> > >  	int err = -EFAULT;
> > >  
> > >  	spin_lock_bh(&vvs->rx_lock);
> > > @@ -288,9 +290,15 @@ virtio_transport_stream_do_dequeue(struct vsock_sock *vsk,
> > >  	}
> > >  	spin_unlock_bh(&vvs->rx_lock);
> > >  
> > > -	/* Send a credit pkt to peer */
> > > -	virtio_transport_send_credit_update(vsk, VIRTIO_VSOCK_TYPE_STREAM,
> > > -					    NULL);
> > > +	/* We send a credit update only when the space available seen
> > > +	 * by the transmitter is less than VIRTIO_VSOCK_MAX_PKT_BUF_SIZE
> > > +	 */
> > > +	free_space = vvs->buf_alloc - (vvs->fwd_cnt - vvs->last_fwd_cnt);
> > 
> > Locking?  These fields should be accessed under tx_lock.
> > 
> 
> Yes, we need a lock, but looking in the code, vvs->fwd_cnd is written
> taking rx_lock (virtio_transport_dec_rx_pkt) and it is read with the
> tx_lock (virtio_transport_inc_tx_pkt).
> 
> Maybe we should use another spin_lock shared between RX and TX for those
> fields or use atomic variables.
> 
> What do you suggest?

Or make vvs->fwd_cnt atomic if it's the only field that needs to be
accessed in this manner.

Stefan

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 455 bytes --]

[-- Attachment #2: Type: text/plain, Size: 183 bytes --]

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox