Linux virtualization list

Linux virtualization list
 help / color / mirror / Atom feed

* [PATCH 2/3] drm/qxl: use QXL_GEM_DOMAIN_SURFACE for shadow bo.
From: Gerd Hoffmann @ 2018-12-06 10:46 UTC (permalink / raw)
  To: dri-devel, David Airlie
  Cc: David Airlie, open list:DRM DRIVER FOR QXL VIRTUAL GPU, open list,
	open list:DRM DRIVER FOR QXL VIRTUAL GPU
In-Reply-To: <20181206104638.23330-1-kraxel@redhat.com>

The shadow bo is used as qxl surface, so allocate it as
QXL_GEM_DOMAIN_SURFACE.  Should usually be allocated in
PRIV ttm domain then, so this reduces VRAM memory pressure.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
---
 drivers/gpu/drm/qxl/qxl_display.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/qxl/qxl_display.c b/drivers/gpu/drm/qxl/qxl_display.c
index ce0b9c40fc..1dde3b2ecb 100644
--- a/drivers/gpu/drm/qxl/qxl_display.c
+++ b/drivers/gpu/drm/qxl/qxl_display.c
@@ -758,7 +758,7 @@ static int qxl_plane_prepare_fb(struct drm_plane *plane,
 			user_bo->shadow = old_bo->shadow;
 		} else {
 			qxl_bo_create(qdev, user_bo->gem_base.size,
-				      true, true, QXL_GEM_DOMAIN_VRAM, NULL,
+				      true, true, QXL_GEM_DOMAIN_SURFACE, NULL,
 				      &user_bo->shadow);
 		}
 	}
-- 
2.9.3

^ permalink raw reply related

* [PATCH 3/3] drm/qxl: use QXL_GEM_DOMAIN_SURFACE for dumb gem objects
From: Gerd Hoffmann @ 2018-12-06 10:46 UTC (permalink / raw)
  To: dri-devel, David Airlie
  Cc: David Airlie, open list:DRM DRIVER FOR QXL VIRTUAL GPU, open list,
	open list:DRM DRIVER FOR QXL VIRTUAL GPU
In-Reply-To: <20181206104638.23330-1-kraxel@redhat.com>

dumb buffers are used as qxl surfaces, so allocate them as
QXL_GEM_DOMAIN_SURFACE.  Should usually be allocated in
PRIV ttm domain then, so this reduces VRAM memory pressure.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
---
 drivers/gpu/drm/qxl/qxl_dumb.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/qxl/qxl_dumb.c b/drivers/gpu/drm/qxl/qxl_dumb.c
index e3765739c3..272d19b677 100644
--- a/drivers/gpu/drm/qxl/qxl_dumb.c
+++ b/drivers/gpu/drm/qxl/qxl_dumb.c
@@ -59,7 +59,7 @@ int qxl_mode_dumb_create(struct drm_file *file_priv,
 	surf.stride = pitch;
 	surf.format = format;
 	r = qxl_gem_object_create_with_handle(qdev, file_priv,
-					      QXL_GEM_DOMAIN_VRAM,
+					      QXL_GEM_DOMAIN_SURFACE,
 					      args->size, &surf, &qobj,
 					      &handle);
 	if (r)
-- 
2.9.3

^ permalink raw reply related

* Re: [Spice-devel] [PATCH 1/3] drm/qxl: allow both PRIV and VRAM placement for QXL_GEM_DOMAIN_SURFACE
From: Gerd Hoffmann @ 2018-12-06 11:34 UTC (permalink / raw)
  To: Frediano Ziglio
  Cc: David Airlie, open list, dri-devel,
	open list:DRM DRIVER FOR QXL VIRTUAL GPU,
	open list:DRM DRIVER FOR QXL VIRTUAL GPU, David Airlie
In-Reply-To: <1897496108.48579647.1544093758850.JavaMail.zimbra@redhat.com>

On Thu, Dec 06, 2018 at 05:55:58AM -0500, Frediano Ziglio wrote:
> 
> > qxl surfaces (used for framebuffers and gem objects) can live in both
> > VRAM and PRIV ttm domains.  Update placement setup to include both.  Put
> > PRIV first in the list so it is preferred, so VRAM will have more room
> > for objects which must be allocated there.
> > 
> > Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
> 
> I remember these kind of changes in the past made migration
> fails. I proposed similar patches years ago and they were rejected
> for these reasons.

Pointer?

> Why now they are safe?

Well, you have to be careful what object you are allocating.  Surfaces
can be in both PRIV and VRAM.  Everything else (qxl commands, monitors
config, ...) must be in VRAM.

> Looks like we are improving QXL, so that means we are actively working
> on it.

Well, I'm just trying make qxl behave better with wayland.

> Should we not then thinking about moving feature in the proper
> places (like spice-server for atomic mode setting instead of implementin
> work around) ??

Main advantage is that it doesn't need qxl device changes, so it works
on old hosts too.  But, yes, we can consider to also modernize spice
protocol and qxl device.

cheers,
  Gerd

^ permalink raw reply

* Re: [Spice-devel] [PATCH] drm/qxl: use qxl_num_crtc directly
From: Gerd Hoffmann @ 2018-12-06 11:42 UTC (permalink / raw)
  To: Frediano Ziglio
  Cc: David Airlie, open list, dri-devel,
	open list:DRM DRIVER FOR QXL VIRTUAL GPU,
	open list:DRM DRIVER FOR QXL VIRTUAL GPU, David Airlie
In-Reply-To: <207905511.48580418.1544093965441.JavaMail.zimbra@redhat.com>

On Thu, Dec 06, 2018 at 05:59:25AM -0500, Frediano Ziglio wrote:
> > 
> > Just use qxl_num_crtc directly everywhere instead of using
> > qdev->monitors_config->max_allowed.  Drops pointless indirection
> > and also is less confusing.
> > 
> 
> To me is MORE confusing, why comparing number of something with
> another number? Previously code was comparing number of monitors
> with number of monitors, not number of CRTs with number of
> monitors.

Yes, spice/qxl and drm/kms use slightly different terminology.

drm crtc == qxl monitor.
drm framebuffer == qxl surface.

You need to know that anyway when looking at the qxl ksm code.  We
have function names like qxl_crtc_update_monitors_config().  I fail
to see why that is a problem ...

cheers,
  Gerd

^ permalink raw reply

* Re: [PULL 0/2] virtio-ccw fixes
From: Cornelia Huck @ 2018-12-06 12:18 UTC (permalink / raw)
  To: Michael S . Tsirkin; +Cc: Halil Pasic, linux-s390, kvm, virtualization
In-Reply-To: <20181129155648.440743bb.cohuck@redhat.com>

On Thu, 29 Nov 2018 15:56:48 +0100
Cornelia Huck <cohuck@redhat.com> wrote:

> On Wed, 26 Sep 2018 18:48:28 +0200
> Cornelia Huck <cohuck@redhat.com> wrote:
> 
> > The following changes since commit 8c0f9f5b309d627182d5da72a69246f58bde1026:
> > 
> >   Revert "uapi/linux/keyctl.h: don't use C++ reserved keyword as a struct member name" (2018-09-25 13:28:58 +0200)
> > 
> > are available in the Git repository at:
> > 
> >   git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux.git tags/virtio-ccw-20180926
> > 
> > for you to fetch changes up to 1fd7ccf82385b3b14e4b3f009a82ada17ddc1e6f:
> > 
> >   virtio/s390: fix race in ccw_io_helper() (2018-09-25 15:29:10 +0200)
> > 
> > ----------------------------------------------------------------
> > Two patches fixing races in the virtio-ccw driver.
> > 
> > ----------------------------------------------------------------
> > 
> > Halil Pasic (2):
> >   virtio/s390: avoid race on vcdev->config
> >   virtio/s390: fix race in ccw_io_helper()
> > 
> >  drivers/s390/virtio/virtio_ccw.c | 17 ++++++++++++++---
> >  1 file changed, 14 insertions(+), 3 deletions(-)
> >   
> 
> Ping. I don't see these patches in master :(

Ping^2. Any chance to get this into the next version?

^ permalink raw reply

* Re: [Spice-devel] [PATCH] drm/qxl: use qxl_num_crtc directly
From: Gerd Hoffmann @ 2018-12-06 13:49 UTC (permalink / raw)
  To: Frediano Ziglio
  Cc: David Airlie, open list, dri-devel,
	open list:DRM DRIVER FOR QXL VIRTUAL GPU,
	open list:DRM DRIVER FOR QXL VIRTUAL GPU, David Airlie
In-Reply-To: <1902655248.48590444.1544100790071.JavaMail.zimbra@redhat.com>

On Thu, Dec 06, 2018 at 07:53:10AM -0500, Frediano Ziglio wrote:
> > 
> > On Thu, Dec 06, 2018 at 05:59:25AM -0500, Frediano Ziglio wrote:
> > > > 
> > > > Just use qxl_num_crtc directly everywhere instead of using
> > > > qdev->monitors_config->max_allowed.  Drops pointless indirection
> > > > and also is less confusing.
> > > > 
> > > 
> > > To me is MORE confusing, why comparing number of something with
> > > another number? Previously code was comparing number of monitors
> > > with number of monitors, not number of CRTs with number of
> > > monitors.
> > 
> > Yes, spice/qxl and drm/kms use slightly different terminology.
> > 
> > drm crtc == qxl monitor.
> > drm framebuffer == qxl surface.
> > 
> > You need to know that anyway when looking at the qxl ksm code.  We
> > have function names like qxl_crtc_update_monitors_config().  I fail
> > to see why that is a problem ...
> > 
> > cheers,
> >   Gerd
> 
> I don't see any problem too but you are explaining to me
> why your rationale "and also is less confusing" does not
> stand.

Well, it's less confusing because it takes away an indirection (not
because of the naming).

qdev->monitors_config->max_allowed is effectively set by a module
parameter.  So using the module parameter variable qxl_num_crtc
directly is better IMO.  The kernel doesn't need to dereference pointers
each time it needs the value, and when reading the code you don't have
to trace where and why qdev->monitors_config->max_allowed is set.

cheers,
  Gerd

^ permalink raw reply

* Re: [PULL 0/2] virtio-ccw fixes
From: Michael S. Tsirkin @ 2018-12-06 13:53 UTC (permalink / raw)
  To: Cornelia Huck; +Cc: Halil Pasic, linux-s390, kvm, virtualization
In-Reply-To: <20181206131828.186fcf96.cohuck@redhat.com>

On Thu, Dec 06, 2018 at 01:18:28PM +0100, Cornelia Huck wrote:
> On Thu, 29 Nov 2018 15:56:48 +0100
> Cornelia Huck <cohuck@redhat.com> wrote:
> 
> > On Wed, 26 Sep 2018 18:48:28 +0200
> > Cornelia Huck <cohuck@redhat.com> wrote:
> > 
> > > The following changes since commit 8c0f9f5b309d627182d5da72a69246f58bde1026:
> > > 
> > >   Revert "uapi/linux/keyctl.h: don't use C++ reserved keyword as a struct member name" (2018-09-25 13:28:58 +0200)
> > > 
> > > are available in the Git repository at:
> > > 
> > >   git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux.git tags/virtio-ccw-20180926
> > > 
> > > for you to fetch changes up to 1fd7ccf82385b3b14e4b3f009a82ada17ddc1e6f:
> > > 
> > >   virtio/s390: fix race in ccw_io_helper() (2018-09-25 15:29:10 +0200)
> > > 
> > > ----------------------------------------------------------------
> > > Two patches fixing races in the virtio-ccw driver.
> > > 
> > > ----------------------------------------------------------------
> > > 
> > > Halil Pasic (2):
> > >   virtio/s390: avoid race on vcdev->config
> > >   virtio/s390: fix race in ccw_io_helper()
> > > 
> > >  drivers/s390/virtio/virtio_ccw.c | 17 ++++++++++++++---
> > >  1 file changed, 14 insertions(+), 3 deletions(-)
> > >   
> > 
> > Ping. I don't see these patches in master :(
> 
> Ping^2. Any chance to get this into the next version?

Pulled, thanks!

^ permalink raw reply

* Re: [Spice-devel] [PATCH] drm/qxl: use qxl_num_crtc directly
From: Gerd Hoffmann @ 2018-12-06 14:21 UTC (permalink / raw)
  To: Frediano Ziglio
  Cc: David Airlie, open list, dri-devel,
	open list:DRM DRIVER FOR QXL VIRTUAL GPU,
	open list:DRM DRIVER FOR QXL VIRTUAL GPU, David Airlie
In-Reply-To: <621894308.48603466.1544105415949.JavaMail.zimbra@redhat.com>

> > qdev->monitors_config->max_allowed is effectively set by a module
> > parameter.  So using the module parameter variable qxl_num_crtc
> > directly is better IMO.  The kernel doesn't need to dereference pointers
> > each time it needs the value, and when reading the code you don't have
> > to trace where and why qdev->monitors_config->max_allowed is set.
> 
> That should go to the commit message!
> With that the patch is fine for me.

ok, I'll improve the message.

> Maybe there's no much point on reusing the same structure used
> inside QXLRom/QXLRam but this is OT for this patch.

It's part of the qxl device ABI, the same structure is used for both
host->guest (config hint from spice-client) and guest->host (actual
configuration).

cheers,
  Gerd

^ permalink raw reply

* [PATCH v2] drm/qxl: use qxl_num_crtc directly
From: Gerd Hoffmann @ 2018-12-06 14:27 UTC (permalink / raw)
  To: dri-devel, David Airlie
  Cc: David Airlie, open list:DRM DRIVER FOR QXL VIRTUAL GPU, open list,
	open list:DRM DRIVER FOR QXL VIRTUAL GPU

qdev->monitors_config->max_allowed is effectively set by the
qxl.num_heads module parameter, stored in the qxl_num_crtc variable.
Lets get rid of the indirection and use the variable qxl_num_crtc
directly.  The kernel doesn't need to dereference pointers each time it
needs the value, and when reading the code you don't have to trace where
and why qdev->monitors_config->max_allowed is set.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
---
 drivers/gpu/drm/qxl/qxl_display.c | 21 +++++++++------------
 1 file changed, 9 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/qxl/qxl_display.c b/drivers/gpu/drm/qxl/qxl_display.c
index ce0b9c40fc..6437369a31 100644
--- a/drivers/gpu/drm/qxl/qxl_display.c
+++ b/drivers/gpu/drm/qxl/qxl_display.c
@@ -80,10 +80,10 @@ static int qxl_display_copy_rom_client_monitors_config(struct qxl_device *qdev)
 		DRM_DEBUG_KMS("no client monitors configured\n");
 		return status;
 	}
-	if (num_monitors > qdev->monitors_config->max_allowed) {
+	if (num_monitors > qxl_num_crtc) {
 		DRM_DEBUG_KMS("client monitors list will be truncated: %d < %d\n",
-			      qdev->monitors_config->max_allowed, num_monitors);
-		num_monitors = qdev->monitors_config->max_allowed;
+			      qxl_num_crtc, num_monitors);
+		num_monitors = qxl_num_crtc;
 	} else {
 		num_monitors = qdev->rom->client_monitors_config.count;
 	}
@@ -96,8 +96,7 @@ static int qxl_display_copy_rom_client_monitors_config(struct qxl_device *qdev)
 		return status;
 	}
 	/* we copy max from the client but it isn't used */
-	qdev->client_monitors_config->max_allowed =
-				qdev->monitors_config->max_allowed;
+	qdev->client_monitors_config->max_allowed = qxl_num_crtc;
 	for (i = 0 ; i < qdev->client_monitors_config->count ; ++i) {
 		struct qxl_urect *c_rect =
 			&qdev->rom->client_monitors_config.heads[i];
@@ -204,7 +203,7 @@ static int qxl_add_monitors_config_modes(struct drm_connector *connector,
 
 	if (!qdev->monitors_config)
 		return 0;
-	if (h >= qdev->monitors_config->max_allowed)
+	if (h >= qxl_num_crtc)
 		return 0;
 	if (!qdev->client_monitors_config)
 		return 0;
@@ -307,8 +306,7 @@ static void qxl_crtc_update_monitors_config(struct drm_crtc *crtc,
 		return;
 	}
 
-	if (!qdev->monitors_config ||
-	    qdev->monitors_config->max_allowed <= i)
+	if (!qdev->monitors_config || qxl_num_crtc <= i)
 		return;
 
 	head.id = i;
@@ -348,9 +346,10 @@ static void qxl_crtc_update_monitors_config(struct drm_crtc *crtc,
 	if (oldcount != qdev->monitors_config->count)
 		DRM_DEBUG_KMS("active heads %d -> %d (%d total)\n",
 			      oldcount, qdev->monitors_config->count,
-			      qdev->monitors_config->max_allowed);
+			      qxl_num_crtc);
 
 	qdev->monitors_config->heads[i] = head;
+	qdev->monitors_config->max_allowed = qxl_num_crtc;
 	qxl_send_monitors_config(qdev);
 }
 
@@ -1097,9 +1096,8 @@ int qxl_create_monitors_object(struct qxl_device *qdev)
 {
 	int ret;
 	struct drm_gem_object *gobj;
-	int max_allowed = qxl_num_crtc;
 	int monitors_config_size = sizeof(struct qxl_monitors_config) +
-		max_allowed * sizeof(struct qxl_head);
+		qxl_num_crtc * sizeof(struct qxl_head);
 
 	ret = qxl_gem_object_create(qdev, monitors_config_size, 0,
 				    QXL_GEM_DOMAIN_VRAM,
@@ -1121,7 +1119,6 @@ int qxl_create_monitors_object(struct qxl_device *qdev)
 		qxl_bo_physical_address(qdev, qdev->monitors_config_bo, 0);
 
 	memset(qdev->monitors_config, 0, monitors_config_size);
-	qdev->monitors_config->max_allowed = max_allowed;
 	return 0;
 }
 
-- 
2.9.3

^ permalink raw reply related

* [PATCH 1/1] virtio: remove deprecated VIRTIO_PCI_CONFIG()
From: Dongli Zhang @ 2018-12-07  7:34 UTC (permalink / raw)
  To: virtualization, linux-kernel; +Cc: mst

VIRTIO_PCI_CONFIG() is deprecated. Use VIRTIO_PCI_CONFIG_OFF() instead.

Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
---
 drivers/virtio/virtio_pci_legacy.c | 6 ++++--
 include/uapi/linux/virtio_pci.h    | 2 --
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/virtio/virtio_pci_legacy.c b/drivers/virtio/virtio_pci_legacy.c
index de062fb..eff9ddc 100644
--- a/drivers/virtio/virtio_pci_legacy.c
+++ b/drivers/virtio/virtio_pci_legacy.c
@@ -52,7 +52,8 @@ static void vp_get(struct virtio_device *vdev, unsigned offset,
 {
 	struct virtio_pci_device *vp_dev = to_vp_device(vdev);
 	void __iomem *ioaddr = vp_dev->ioaddr +
-				VIRTIO_PCI_CONFIG(vp_dev) + offset;
+			VIRTIO_PCI_CONFIG_OFF(vp_dev->msix_enabled) +
+			offset;
 	u8 *ptr = buf;
 	int i;
 
@@ -67,7 +68,8 @@ static void vp_set(struct virtio_device *vdev, unsigned offset,
 {
 	struct virtio_pci_device *vp_dev = to_vp_device(vdev);
 	void __iomem *ioaddr = vp_dev->ioaddr +
-				VIRTIO_PCI_CONFIG(vp_dev) + offset;
+			VIRTIO_PCI_CONFIG_OFF(vp_dev->msix_enabled) +
+			offset;
 	const u8 *ptr = buf;
 	int i;
 
diff --git a/include/uapi/linux/virtio_pci.h b/include/uapi/linux/virtio_pci.h
index 90007a1..2070232 100644
--- a/include/uapi/linux/virtio_pci.h
+++ b/include/uapi/linux/virtio_pci.h
@@ -78,8 +78,6 @@
 /* The remaining space is defined by each driver as the per-driver
  * configuration space */
 #define VIRTIO_PCI_CONFIG_OFF(msix_enabled)	((msix_enabled) ? 24 : 20)
-/* Deprecated: please use VIRTIO_PCI_CONFIG_OFF instead */
-#define VIRTIO_PCI_CONFIG(dev)	VIRTIO_PCI_CONFIG_OFF((dev)->msix_enabled)
 
 /* Virtio ABI version, this must match exactly */
 #define VIRTIO_PCI_ABI_VERSION		0
-- 
2.7.4

^ permalink raw reply related

* Re: [Spice-devel] [PATCH 1/3] drm/qxl: allow both PRIV and VRAM placement for QXL_GEM_DOMAIN_SURFACE
From: Gerd Hoffmann @ 2018-12-07  8:26 UTC (permalink / raw)
  To: Frediano Ziglio
  Cc: David Airlie, open list, dri-devel,
	open list:DRM DRIVER FOR QXL VIRTUAL GPU,
	open list:DRM DRIVER FOR QXL VIRTUAL GPU, David Airlie
In-Reply-To: <885772449.48610143.1544107194097.JavaMail.zimbra@redhat.com>

On Thu, Dec 06, 2018 at 09:39:54AM -0500, Frediano Ziglio wrote:
> > 
> > On Thu, Dec 06, 2018 at 05:55:58AM -0500, Frediano Ziglio wrote:
> > > 
> > > > qxl surfaces (used for framebuffers and gem objects) can live in both
> > > > VRAM and PRIV ttm domains.  Update placement setup to include both.  Put
> > > > PRIV first in the list so it is preferred, so VRAM will have more room
> > > > for objects which must be allocated there.
> > > > 
> > > > Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
> > > 
> > > I remember these kind of changes in the past made migration
> > > fails. I proposed similar patches years ago and they were rejected
> > > for these reasons.
> > 
> > Pointer?
> 
> I think is this:
> https://patchwork.kernel.org/patch/7374931/

Ok, problem is you are doing it for both QXL_GEM_DOMAIN_VRAM and
QXL_GEM_DOMAIN_SURFACE.  Surfaces can be in both VRAM and PRIV ttm
domains, so for the later this is fine.  Most other allocations must be
in VRAM ttm domain though, so allowing PRIV for QXL_GEM_DOMAIN_VRAM is
*not* ok.

Noticed I need patch 1/2 of that series, otherwise things will break
when we run out of space in PRIV domain and surfaces are allocated in
VRAM.

> > Well, you have to be careful what object you are allocating.  Surfaces
> > can be in both PRIV and VRAM.  Everything else (qxl commands, monitors
> > config, ...) must be in VRAM.
> > 
> > > Looks like we are improving QXL, so that means we are actively working
> > > on it.
> > 
> > Well, I'm just trying make qxl behave better with wayland.
> > 
> 
> As far as I remember the Linux kernel driver simulates the frame buffer
> swap with drawings which cause more memory copies.

Yes.  wayland renders into a dumb gem bo (backed by a qxl surface),
Actual primary surface is a shadow bo though.  Display updates are done
by sending a draw command for the primary surface, with the pixel data
coming from the dumb gem bo.

We could maybe optimize that, by having the image chunk point directly
to the dumb gem bo instead of allocating memory and memcpy'ing the pixel
data.

I don't feel like putting too much effort into qxl performance
optimization though.  The time is better spent on virtio-gpu I think.

BTW: should I send virtio-gpu kernel patches to spice-devel too?

> Not also sure if this workaround make the server consume more network
> bandwidth.

Doesn't make much of a difference I think.  In case we improve qxl/spice
to support (a) one surface per monitor and (b) pageflips so we don't
need the shadow bo, then we would still need to send the pixel data.

> Are we supporting multiple monitors for Wayland?

I have patches for that.  Which basically extend the shadow logic, so
the shadow bo used as primary surface will be big enough that all
heads/monitors/crtcs fit in.  qxl/spice has a single surface then, while
userspace (i.e. wayland) can use one dumb gem bo per head/monitor/crtc.

Guess I should re-send them with spice-devel cc'ed,

> > Main advantage is that it doesn't need qxl device changes, so it works
> > on old hosts too.  But, yes, we can consider to also modernize spice
> > protocol and qxl device.
> 
> On the other hand we faced some bugs due these workarounds so we end up
> with a solution that is less optimal than before and potentially
> with more bugs to fix.

Ahem, well, not really.

Adding support for atomic modesetting to qxl caused some regressions
indeed.  Those should be fixed meanwhile.

Anything which was done in the qxl driver for wayland (most notably the
shadow logic) improved the situation for wayland.  wayland is pretty
much unusable without shadowing.  I'm not aware of any xorg regressions
caused by dumb bo shadowing.  xorg doesn't use dumb bo's in the first
place, and it composes its own qxl drawing commands instead of expecting
the qxl kms driver handle the update on pageflip.

cheers,
  Gerd

^ permalink raw reply

* [RFC 0/3] virtio_ring: define flags as shifts consistently
From: Tiwei Bie @ 2018-12-07  8:48 UTC (permalink / raw)
  To: mst, jasowang, virtualization, linux-kernel, netdev, virtio-dev
  Cc: maxime.coquelin, wexu

This is a follow up of the discussion in this thread:
https://patchwork.ozlabs.org/patch/1001015/#2042353

Tiwei Bie (3):
  virtio_ring: define flags as shifts consistently
  virtio_ring: add VIRTIO_RING_NO_LEGACY
  virtio_ring: use new vring flags

 drivers/virtio/virtio_ring.c     | 100 ++++++++++++++++++-------------
 include/uapi/linux/virtio_ring.h |  61 +++++++++++++------
 2 files changed, 103 insertions(+), 58 deletions(-)

-- 
2.17.1

^ permalink raw reply

* [RFC 1/3] virtio_ring: define flags as shifts consistently
From: Tiwei Bie @ 2018-12-07  8:48 UTC (permalink / raw)
  To: mst, jasowang, virtualization, linux-kernel, netdev, virtio-dev
  Cc: maxime.coquelin, wexu
In-Reply-To: <20181207084842.13133-1-tiwei.bie@intel.com>

Introduce _SPLIT_ and/or _PACKED_ variants for VRING_DESC_F_*,
VRING_AVAIL_F_NO_INTERRUPT and VRING_USED_F_NO_NOTIFY. These
variants are defined as shifts instead of shifted values for
consistency with other _F_ flags.

Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
---
 include/uapi/linux/virtio_ring.h | 57 ++++++++++++++++++++++----------
 1 file changed, 40 insertions(+), 17 deletions(-)

diff --git a/include/uapi/linux/virtio_ring.h b/include/uapi/linux/virtio_ring.h
index 2414f8af26b3..9b0c0d92ab62 100644
--- a/include/uapi/linux/virtio_ring.h
+++ b/include/uapi/linux/virtio_ring.h
@@ -37,29 +37,52 @@
 #include <linux/types.h>
 #include <linux/virtio_types.h>
 
-/* This marks a buffer as continuing via the next field. */
+/*
+ * Notice: unlike other _F_ flags, below flags are defined as shifted
+ * values instead of shifts for compatibility.
+ */
+/* Same as VRING_SPLIT_DESC_F_NEXT. */
 #define VRING_DESC_F_NEXT	1
-/* This marks a buffer as write-only (otherwise read-only). */
+/* Same as VRING_SPLIT_DESC_F_WRITE. */
 #define VRING_DESC_F_WRITE	2
-/* This means the buffer contains a list of buffer descriptors. */
+/* Same as VRING_SPLIT_DESC_F_INDIRECT. */
 #define VRING_DESC_F_INDIRECT	4
-
-/*
- * Mark a descriptor as available or used in packed ring.
- * Notice: they are defined as shifts instead of shifted values.
- */
-#define VRING_PACKED_DESC_F_AVAIL	7
-#define VRING_PACKED_DESC_F_USED	15
-
-/* The Host uses this in used->flags to advise the Guest: don't kick me when
- * you add a buffer.  It's unreliable, so it's simply an optimization.  Guest
- * will still kick if it's out of buffers. */
+/* Same as VRING_SPLIT_USED_F_NO_NOTIFY. */
 #define VRING_USED_F_NO_NOTIFY	1
-/* The Guest uses this in avail->flags to advise the Host: don't interrupt me
- * when you consume a buffer.  It's unreliable, so it's simply an
- * optimization.  */
+/* Same as VRING_SPLIT_AVAIL_F_NO_INTERRUPT. */
 #define VRING_AVAIL_F_NO_INTERRUPT	1
 
+/* Mark a buffer as continuing via the next field in split ring. */
+#define VRING_SPLIT_DESC_F_NEXT		0
+/* Mark a buffer as write-only (otherwise read-only) in split ring. */
+#define VRING_SPLIT_DESC_F_WRITE	1
+/* Mean the buffer contains a list of buffer descriptors in split ring. */
+#define VRING_SPLIT_DESC_F_INDIRECT	2
+
+/*
+ * The Host uses this in used->flags in split ring to advise the Guest:
+ * don't kick me when you add a buffer.  It's unreliable, so it's simply
+ * an optimization.  Guest will still kick if it's out of buffers.
+ */
+#define VRING_SPLIT_USED_F_NO_NOTIFY		0
+/*
+ * The Guest uses this in avail->flags in split ring to advise the Host:
+ * don't interrupt me when you consume a buffer.  It's unreliable, so it's
+ * simply an optimization.
+ */
+#define VRING_SPLIT_AVAIL_F_NO_INTERRUPT	0
+
+/* Mark a buffer as continuing via the next field in packed ring. */
+#define VRING_PACKED_DESC_F_NEXT	0
+/* Mark a buffer as write-only (otherwise read-only) in packed ring. */
+#define VRING_PACKED_DESC_F_WRITE	1
+/* Mean the buffer contains a list of buffer descriptors in packed ring. */
+#define VRING_PACKED_DESC_F_INDIRECT	2
+
+/* Mark a descriptor as available or used in packed ring. */
+#define VRING_PACKED_DESC_F_AVAIL	7
+#define VRING_PACKED_DESC_F_USED	15
+
 /* Enable events in packed ring. */
 #define VRING_PACKED_EVENT_FLAG_ENABLE	0x0
 /* Disable events in packed ring. */
-- 
2.17.1

^ permalink raw reply related

* [RFC 2/3] virtio_ring: add VIRTIO_RING_NO_LEGACY
From: Tiwei Bie @ 2018-12-07  8:48 UTC (permalink / raw)
  To: mst, jasowang, virtualization, linux-kernel, netdev, virtio-dev
  Cc: maxime.coquelin, wexu
In-Reply-To: <20181207084842.13133-1-tiwei.bie@intel.com>

Introduce VIRTIO_RING_NO_LEGACY to support disabling legacy
macros and layout definitions.

Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
---
VRING_AVAIL_ALIGN_SIZE, VRING_USED_ALIGN_SIZE and VRING_DESC_ALIGN_SIZE
are not pre-virtio 1.0, but can also be disabled by VIRTIO_RING_NO_LEGACY
in this patch, because their names are not consistent with other names.
Not sure whether this is a good idea. If we want this, we may also want
to define _SPLIT_ version for them.

 include/uapi/linux/virtio_ring.h | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/include/uapi/linux/virtio_ring.h b/include/uapi/linux/virtio_ring.h
index 9b0c0d92ab62..192573827850 100644
--- a/include/uapi/linux/virtio_ring.h
+++ b/include/uapi/linux/virtio_ring.h
@@ -37,6 +37,7 @@
 #include <linux/types.h>
 #include <linux/virtio_types.h>
 
+#ifndef VIRTIO_RING_NO_LEGACY
 /*
  * Notice: unlike other _F_ flags, below flags are defined as shifted
  * values instead of shifts for compatibility.
@@ -51,6 +52,7 @@
 #define VRING_USED_F_NO_NOTIFY	1
 /* Same as VRING_SPLIT_AVAIL_F_NO_INTERRUPT. */
 #define VRING_AVAIL_F_NO_INTERRUPT	1
+#endif /* VIRTIO_RING_NO_LEGACY */
 
 /* Mark a buffer as continuing via the next field in split ring. */
 #define VRING_SPLIT_DESC_F_NEXT		0
@@ -151,6 +153,7 @@ struct vring {
 	struct vring_used *used;
 };
 
+#ifndef VIRTIO_RING_NO_LEGACY
 /* Alignment requirements for vring elements.
  * When using pre-virtio 1.0 layout, these fall out naturally.
  */
@@ -203,6 +206,7 @@ static inline unsigned vring_size(unsigned int num, unsigned long align)
 		 + align - 1) & ~(align - 1))
 		+ sizeof(__virtio16) * 3 + sizeof(struct vring_used_elem) * num;
 }
+#endif /* VIRTIO_RING_NO_LEGACY */
 
 /* The following is used with USED_EVENT_IDX and AVAIL_EVENT_IDX */
 /* Assuming a given event_idx value from the other side, if
-- 
2.17.1

^ permalink raw reply related

* [RFC 3/3] virtio_ring: use new vring flags
From: Tiwei Bie @ 2018-12-07  8:48 UTC (permalink / raw)
  To: mst, jasowang, virtualization, linux-kernel, netdev, virtio-dev
  Cc: maxime.coquelin, wexu
In-Reply-To: <20181207084842.13133-1-tiwei.bie@intel.com>

Switch to using the _SPLIT_ and _PACKED_ variants of vring flags
in split ring and packed ring respectively.

Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
---
 drivers/virtio/virtio_ring.c | 100 +++++++++++++++++++++--------------
 1 file changed, 59 insertions(+), 41 deletions(-)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index cd7e755484e3..2806f69c6c9f 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -371,17 +371,17 @@ static void vring_unmap_one_split(const struct vring_virtqueue *vq,
 
 	flags = virtio16_to_cpu(vq->vq.vdev, desc->flags);
 
-	if (flags & VRING_DESC_F_INDIRECT) {
+	if (flags & BIT(VRING_SPLIT_DESC_F_INDIRECT)) {
 		dma_unmap_single(vring_dma_dev(vq),
 				 virtio64_to_cpu(vq->vq.vdev, desc->addr),
 				 virtio32_to_cpu(vq->vq.vdev, desc->len),
-				 (flags & VRING_DESC_F_WRITE) ?
+				 (flags & BIT(VRING_SPLIT_DESC_F_WRITE)) ?
 				 DMA_FROM_DEVICE : DMA_TO_DEVICE);
 	} else {
 		dma_unmap_page(vring_dma_dev(vq),
 			       virtio64_to_cpu(vq->vq.vdev, desc->addr),
 			       virtio32_to_cpu(vq->vq.vdev, desc->len),
-			       (flags & VRING_DESC_F_WRITE) ?
+			       (flags & BIT(VRING_SPLIT_DESC_F_WRITE)) ?
 			       DMA_FROM_DEVICE : DMA_TO_DEVICE);
 	}
 }
@@ -481,7 +481,8 @@ static inline int virtqueue_add_split(struct virtqueue *_vq,
 			if (vring_mapping_error(vq, addr))
 				goto unmap_release;
 
-			desc[i].flags = cpu_to_virtio16(_vq->vdev, VRING_DESC_F_NEXT);
+			desc[i].flags = cpu_to_virtio16(_vq->vdev,
+						BIT(VRING_SPLIT_DESC_F_NEXT));
 			desc[i].addr = cpu_to_virtio64(_vq->vdev, addr);
 			desc[i].len = cpu_to_virtio32(_vq->vdev, sg->length);
 			prev = i;
@@ -494,7 +495,9 @@ static inline int virtqueue_add_split(struct virtqueue *_vq,
 			if (vring_mapping_error(vq, addr))
 				goto unmap_release;
 
-			desc[i].flags = cpu_to_virtio16(_vq->vdev, VRING_DESC_F_NEXT | VRING_DESC_F_WRITE);
+			desc[i].flags = cpu_to_virtio16(_vq->vdev,
+						BIT(VRING_SPLIT_DESC_F_NEXT) |
+						BIT(VRING_SPLIT_DESC_F_WRITE));
 			desc[i].addr = cpu_to_virtio64(_vq->vdev, addr);
 			desc[i].len = cpu_to_virtio32(_vq->vdev, sg->length);
 			prev = i;
@@ -502,7 +505,8 @@ static inline int virtqueue_add_split(struct virtqueue *_vq,
 		}
 	}
 	/* Last one doesn't continue. */
-	desc[prev].flags &= cpu_to_virtio16(_vq->vdev, ~VRING_DESC_F_NEXT);
+	desc[prev].flags &= cpu_to_virtio16(_vq->vdev,
+				(u16)~BIT(VRING_SPLIT_DESC_F_NEXT));
 
 	if (indirect) {
 		/* Now that the indirect table is filled in, map it. */
@@ -513,7 +517,7 @@ static inline int virtqueue_add_split(struct virtqueue *_vq,
 			goto unmap_release;
 
 		vq->split.vring.desc[head].flags = cpu_to_virtio16(_vq->vdev,
-				VRING_DESC_F_INDIRECT);
+				BIT(VRING_SPLIT_DESC_F_INDIRECT));
 		vq->split.vring.desc[head].addr = cpu_to_virtio64(_vq->vdev,
 				addr);
 
@@ -603,8 +607,8 @@ static bool virtqueue_kick_prepare_split(struct virtqueue *_vq)
 					      new, old);
 	} else {
 		needs_kick = !(vq->split.vring.used->flags &
-					cpu_to_virtio16(_vq->vdev,
-						VRING_USED_F_NO_NOTIFY));
+				cpu_to_virtio16(_vq->vdev,
+					BIT(VRING_SPLIT_USED_F_NO_NOTIFY)));
 	}
 	END_USE(vq);
 	return needs_kick;
@@ -614,7 +618,8 @@ static void detach_buf_split(struct vring_virtqueue *vq, unsigned int head,
 			     void **ctx)
 {
 	unsigned int i, j;
-	__virtio16 nextflag = cpu_to_virtio16(vq->vq.vdev, VRING_DESC_F_NEXT);
+	__virtio16 nextflag = cpu_to_virtio16(vq->vq.vdev,
+				BIT(VRING_SPLIT_DESC_F_NEXT));
 
 	/* Clear data ptr. */
 	vq->split.desc_state[head].data = NULL;
@@ -649,7 +654,8 @@ static void detach_buf_split(struct vring_virtqueue *vq, unsigned int head,
 				vq->split.vring.desc[head].len);
 
 		BUG_ON(!(vq->split.vring.desc[head].flags &
-			 cpu_to_virtio16(vq->vq.vdev, VRING_DESC_F_INDIRECT)));
+			 cpu_to_virtio16(vq->vq.vdev,
+				 BIT(VRING_SPLIT_DESC_F_INDIRECT))));
 		BUG_ON(len == 0 || len % sizeof(struct vring_desc));
 
 		for (j = 0; j < len / sizeof(struct vring_desc); j++)
@@ -715,7 +721,8 @@ static void *virtqueue_get_buf_ctx_split(struct virtqueue *_vq,
 	/* If we expect an interrupt for the next entry, tell host
 	 * by writing event index and flush out the write before
 	 * the read in the next get_buf call. */
-	if (!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT))
+	if (!(vq->split.avail_flags_shadow &
+			BIT(VRING_SPLIT_AVAIL_F_NO_INTERRUPT)))
 		virtio_store_mb(vq->weak_barriers,
 				&vring_used_event(&vq->split.vring),
 				cpu_to_virtio16(_vq->vdev, vq->last_used_idx));
@@ -730,8 +737,10 @@ static void virtqueue_disable_cb_split(struct virtqueue *_vq)
 {
 	struct vring_virtqueue *vq = to_vvq(_vq);
 
-	if (!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT)) {
-		vq->split.avail_flags_shadow |= VRING_AVAIL_F_NO_INTERRUPT;
+	if (!(vq->split.avail_flags_shadow &
+			BIT(VRING_SPLIT_AVAIL_F_NO_INTERRUPT))) {
+		vq->split.avail_flags_shadow |=
+			BIT(VRING_SPLIT_AVAIL_F_NO_INTERRUPT);
 		if (!vq->event)
 			vq->split.vring.avail->flags =
 				cpu_to_virtio16(_vq->vdev,
@@ -751,8 +760,10 @@ static unsigned virtqueue_enable_cb_prepare_split(struct virtqueue *_vq)
 	/* Depending on the VIRTIO_RING_F_EVENT_IDX feature, we need to
 	 * either clear the flags bit or point the event index at the next
 	 * entry. Always do both to keep code simple. */
-	if (vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT) {
-		vq->split.avail_flags_shadow &= ~VRING_AVAIL_F_NO_INTERRUPT;
+	if (vq->split.avail_flags_shadow &
+			BIT(VRING_SPLIT_AVAIL_F_NO_INTERRUPT)) {
+		vq->split.avail_flags_shadow &=
+			~BIT(VRING_SPLIT_AVAIL_F_NO_INTERRUPT);
 		if (!vq->event)
 			vq->split.vring.avail->flags =
 				cpu_to_virtio16(_vq->vdev,
@@ -784,8 +795,10 @@ static bool virtqueue_enable_cb_delayed_split(struct virtqueue *_vq)
 	/* Depending on the VIRTIO_RING_F_USED_EVENT_IDX feature, we need to
 	 * either clear the flags bit or point the event index at the next
 	 * entry. Always update the event index to keep code simple. */
-	if (vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT) {
-		vq->split.avail_flags_shadow &= ~VRING_AVAIL_F_NO_INTERRUPT;
+	if (vq->split.avail_flags_shadow &
+			BIT(VRING_SPLIT_AVAIL_F_NO_INTERRUPT)) {
+		vq->split.avail_flags_shadow &=
+			~BIT(VRING_SPLIT_AVAIL_F_NO_INTERRUPT);
 		if (!vq->event)
 			vq->split.vring.avail->flags =
 				cpu_to_virtio16(_vq->vdev,
@@ -912,15 +925,15 @@ static void vring_unmap_state_packed(const struct vring_virtqueue *vq,
 
 	flags = state->flags;
 
-	if (flags & VRING_DESC_F_INDIRECT) {
+	if (flags & BIT(VRING_PACKED_DESC_F_INDIRECT)) {
 		dma_unmap_single(vring_dma_dev(vq),
 				 state->addr, state->len,
-				 (flags & VRING_DESC_F_WRITE) ?
+				 (flags & BIT(VRING_PACKED_DESC_F_WRITE)) ?
 				 DMA_FROM_DEVICE : DMA_TO_DEVICE);
 	} else {
 		dma_unmap_page(vring_dma_dev(vq),
 			       state->addr, state->len,
-			       (flags & VRING_DESC_F_WRITE) ?
+			       (flags & BIT(VRING_PACKED_DESC_F_WRITE)) ?
 			       DMA_FROM_DEVICE : DMA_TO_DEVICE);
 	}
 }
@@ -935,17 +948,17 @@ static void vring_unmap_desc_packed(const struct vring_virtqueue *vq,
 
 	flags = le16_to_cpu(desc->flags);
 
-	if (flags & VRING_DESC_F_INDIRECT) {
+	if (flags & BIT(VRING_PACKED_DESC_F_INDIRECT)) {
 		dma_unmap_single(vring_dma_dev(vq),
 				 le64_to_cpu(desc->addr),
 				 le32_to_cpu(desc->len),
-				 (flags & VRING_DESC_F_WRITE) ?
+				 (flags & BIT(VRING_PACKED_DESC_F_WRITE)) ?
 				 DMA_FROM_DEVICE : DMA_TO_DEVICE);
 	} else {
 		dma_unmap_page(vring_dma_dev(vq),
 			       le64_to_cpu(desc->addr),
 			       le32_to_cpu(desc->len),
-			       (flags & VRING_DESC_F_WRITE) ?
+			       (flags & BIT(VRING_PACKED_DESC_F_WRITE)) ?
 			       DMA_FROM_DEVICE : DMA_TO_DEVICE);
 	}
 }
@@ -1002,7 +1015,7 @@ static int virtqueue_add_indirect_packed(struct vring_virtqueue *vq,
 				goto unmap_release;
 
 			desc[i].flags = cpu_to_le16(n < out_sgs ?
-						0 : VRING_DESC_F_WRITE);
+					0 : BIT(VRING_PACKED_DESC_F_WRITE));
 			desc[i].addr = cpu_to_le64(addr);
 			desc[i].len = cpu_to_le32(sg->length);
 			i++;
@@ -1025,8 +1038,9 @@ static int virtqueue_add_indirect_packed(struct vring_virtqueue *vq,
 		vq->packed.desc_extra[id].addr = addr;
 		vq->packed.desc_extra[id].len = total_sg *
 				sizeof(struct vring_packed_desc);
-		vq->packed.desc_extra[id].flags = VRING_DESC_F_INDIRECT |
-						  vq->packed.avail_used_flags;
+		vq->packed.desc_extra[id].flags =
+				BIT(VRING_PACKED_DESC_F_INDIRECT) |
+				vq->packed.avail_used_flags;
 	}
 
 	/*
@@ -1035,8 +1049,9 @@ static int virtqueue_add_indirect_packed(struct vring_virtqueue *vq,
 	 * the list are made available.
 	 */
 	virtio_wmb(vq->weak_barriers);
-	vq->packed.vring.desc[head].flags = cpu_to_le16(VRING_DESC_F_INDIRECT |
-						vq->packed.avail_used_flags);
+	vq->packed.vring.desc[head].flags =
+		cpu_to_le16(BIT(VRING_PACKED_DESC_F_INDIRECT) |
+			    vq->packed.avail_used_flags);
 
 	/* We're using some buffers from the free list. */
 	vq->vq.num_free -= 1;
@@ -1047,8 +1062,8 @@ static int virtqueue_add_indirect_packed(struct vring_virtqueue *vq,
 		n = 0;
 		vq->packed.avail_wrap_counter ^= 1;
 		vq->packed.avail_used_flags ^=
-				1 << VRING_PACKED_DESC_F_AVAIL |
-				1 << VRING_PACKED_DESC_F_USED;
+				BIT(VRING_PACKED_DESC_F_AVAIL) |
+				BIT(VRING_PACKED_DESC_F_USED);
 	}
 	vq->packed.next_avail_idx = n;
 	vq->free_head = vq->packed.desc_state[id].next;
@@ -1141,8 +1156,10 @@ static inline int virtqueue_add_packed(struct virtqueue *_vq,
 				goto unmap_release;
 
 			flags = cpu_to_le16(vq->packed.avail_used_flags |
-				    (++c == total_sg ? 0 : VRING_DESC_F_NEXT) |
-				    (n < out_sgs ? 0 : VRING_DESC_F_WRITE));
+				    (++c == total_sg ? 0 :
+					BIT(VRING_PACKED_DESC_F_NEXT)) |
+				    (n < out_sgs ? 0 :
+					BIT(VRING_PACKED_DESC_F_WRITE)));
 			if (i == head)
 				head_flags = flags;
 			else
@@ -1164,8 +1181,8 @@ static inline int virtqueue_add_packed(struct virtqueue *_vq,
 			if ((unlikely(++i >= vq->packed.vring.num))) {
 				i = 0;
 				vq->packed.avail_used_flags ^=
-					1 << VRING_PACKED_DESC_F_AVAIL |
-					1 << VRING_PACKED_DESC_F_USED;
+					BIT(VRING_PACKED_DESC_F_AVAIL) |
+					BIT(VRING_PACKED_DESC_F_USED);
 			}
 		}
 	}
@@ -1258,7 +1275,7 @@ static bool virtqueue_kick_prepare_packed(struct virtqueue *_vq)
 	off_wrap = le16_to_cpu(snapshot.off_wrap);
 
 	wrap_counter = off_wrap >> VRING_PACKED_EVENT_F_WRAP_CTR;
-	event_idx = off_wrap & ~(1 << VRING_PACKED_EVENT_F_WRAP_CTR);
+	event_idx = off_wrap & ~BIT(VRING_PACKED_EVENT_F_WRAP_CTR);
 	if (wrap_counter != vq->packed.avail_wrap_counter)
 		event_idx -= vq->packed.vring.num;
 
@@ -1321,8 +1338,8 @@ static inline bool is_used_desc_packed(const struct vring_virtqueue *vq,
 	u16 flags;
 
 	flags = le16_to_cpu(vq->packed.vring.desc[idx].flags);
-	avail = !!(flags & (1 << VRING_PACKED_DESC_F_AVAIL));
-	used = !!(flags & (1 << VRING_PACKED_DESC_F_USED));
+	avail = !!(flags & BIT(VRING_PACKED_DESC_F_AVAIL));
+	used = !!(flags & BIT(VRING_PACKED_DESC_F_USED));
 
 	return avail == used && used == used_wrap_counter;
 }
@@ -1452,7 +1469,7 @@ static bool virtqueue_poll_packed(struct virtqueue *_vq, u16 off_wrap)
 	u16 used_idx;
 
 	wrap_counter = off_wrap >> VRING_PACKED_EVENT_F_WRAP_CTR;
-	used_idx = off_wrap & ~(1 << VRING_PACKED_EVENT_F_WRAP_CTR);
+	used_idx = off_wrap & ~BIT(VRING_PACKED_EVENT_F_WRAP_CTR);
 
 	return is_used_desc_packed(vq, used_idx, wrap_counter);
 }
@@ -1625,7 +1642,7 @@ static struct virtqueue *vring_create_virtqueue_packed(
 	vq->packed.avail_wrap_counter = 1;
 	vq->packed.used_wrap_counter = 1;
 	vq->packed.event_flags_shadow = 0;
-	vq->packed.avail_used_flags = 1 << VRING_PACKED_DESC_F_AVAIL;
+	vq->packed.avail_used_flags = BIT(VRING_PACKED_DESC_F_AVAIL);
 
 	vq->packed.desc_state = kmalloc_array(num,
 			sizeof(struct vring_desc_state_packed),
@@ -2088,7 +2105,8 @@ struct virtqueue *__vring_new_virtqueue(unsigned int index,
 
 	/* No callback?  Tell other side not to bother us. */
 	if (!callback) {
-		vq->split.avail_flags_shadow |= VRING_AVAIL_F_NO_INTERRUPT;
+		vq->split.avail_flags_shadow |=
+			BIT(VRING_SPLIT_AVAIL_F_NO_INTERRUPT);
 		if (!vq->event)
 			vq->split.vring.avail->flags = cpu_to_virtio16(vdev,
 					vq->split.avail_flags_shadow);
-- 
2.17.1

^ permalink raw reply related

* Re: [PATCH 1/1] virtio: remove deprecated VIRTIO_PCI_CONFIG()
From: Michael S. Tsirkin @ 2018-12-07 18:01 UTC (permalink / raw)
  To: Dongli Zhang; +Cc: linux-kernel, virtualization
In-Reply-To: <1544168081-13205-1-git-send-email-dongli.zhang@oracle.com>

On Fri, Dec 07, 2018 at 03:34:41PM +0800, Dongli Zhang wrote:
> VIRTIO_PCI_CONFIG() is deprecated. Use VIRTIO_PCI_CONFIG_OFF() instead.
> 
> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
> ---
>  drivers/virtio/virtio_pci_legacy.c | 6 ++++--
>  include/uapi/linux/virtio_pci.h    | 2 --
>  2 files changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/virtio/virtio_pci_legacy.c b/drivers/virtio/virtio_pci_legacy.c
> index de062fb..eff9ddc 100644
> --- a/drivers/virtio/virtio_pci_legacy.c
> +++ b/drivers/virtio/virtio_pci_legacy.c
> @@ -52,7 +52,8 @@ static void vp_get(struct virtio_device *vdev, unsigned offset,
>  {
>  	struct virtio_pci_device *vp_dev = to_vp_device(vdev);
>  	void __iomem *ioaddr = vp_dev->ioaddr +
> -				VIRTIO_PCI_CONFIG(vp_dev) + offset;
> +			VIRTIO_PCI_CONFIG_OFF(vp_dev->msix_enabled) +
> +			offset;
>  	u8 *ptr = buf;
>  	int i;
>  
> @@ -67,7 +68,8 @@ static void vp_set(struct virtio_device *vdev, unsigned offset,
>  {
>  	struct virtio_pci_device *vp_dev = to_vp_device(vdev);
>  	void __iomem *ioaddr = vp_dev->ioaddr +
> -				VIRTIO_PCI_CONFIG(vp_dev) + offset;
> +			VIRTIO_PCI_CONFIG_OFF(vp_dev->msix_enabled) +
> +			offset;
>  	const u8 *ptr = buf;
>  	int i;
>

I agree that VIRTIO_PCI_CONFIG_OFF is a better interface. So above looks
fine.
  
> diff --git a/include/uapi/linux/virtio_pci.h b/include/uapi/linux/virtio_pci.h
> index 90007a1..2070232 100644
> --- a/include/uapi/linux/virtio_pci.h
> +++ b/include/uapi/linux/virtio_pci.h
> @@ -78,8 +78,6 @@
>  /* The remaining space is defined by each driver as the per-driver
>   * configuration space */
>  #define VIRTIO_PCI_CONFIG_OFF(msix_enabled)	((msix_enabled) ? 24 : 20)
> -/* Deprecated: please use VIRTIO_PCI_CONFIG_OFF instead */
> -#define VIRTIO_PCI_CONFIG(dev)	VIRTIO_PCI_CONFIG_OFF((dev)->msix_enabled)
>  
>  /* Virtio ABI version, this must match exactly */
>  #define VIRTIO_PCI_ABI_VERSION		0

This might break some userspace builds, I don't see why we should bother
removing it. Any reason?


> -- 
> 2.7.4

^ permalink raw reply

* Re: [RFC 2/3] virtio_ring: add VIRTIO_RING_NO_LEGACY
From: Michael S. Tsirkin @ 2018-12-07 18:05 UTC (permalink / raw)
  To: Tiwei Bie
  Cc: virtio-dev, netdev, linux-kernel, virtualization, maxime.coquelin,
	wexu
In-Reply-To: <20181207084842.13133-3-tiwei.bie@intel.com>

On Fri, Dec 07, 2018 at 04:48:41PM +0800, Tiwei Bie wrote:
> Introduce VIRTIO_RING_NO_LEGACY to support disabling legacy
> macros and layout definitions.
> 
> Suggested-by: Michael S. Tsirkin <mst@redhat.com>
> Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
> ---
> VRING_AVAIL_ALIGN_SIZE, VRING_USED_ALIGN_SIZE and VRING_DESC_ALIGN_SIZE
> are not pre-virtio 1.0, but can also be disabled by VIRTIO_RING_NO_LEGACY
> in this patch, because their names are not consistent with other names.
> Not sure whether this is a good idea. If we want this, we may also want
> to define _SPLIT_ version for them.

I don't think it's a good idea to have alignment in there - the point of
NO_LEGACY is to help catch bugs not to sanitize coding style IMHO.

And spec calls "legacy" the 0.X interfaces, let's not muddy the waters.

> 
>  include/uapi/linux/virtio_ring.h | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/include/uapi/linux/virtio_ring.h b/include/uapi/linux/virtio_ring.h
> index 9b0c0d92ab62..192573827850 100644
> --- a/include/uapi/linux/virtio_ring.h
> +++ b/include/uapi/linux/virtio_ring.h
> @@ -37,6 +37,7 @@
>  #include <linux/types.h>
>  #include <linux/virtio_types.h>
>  
> +#ifndef VIRTIO_RING_NO_LEGACY
>  /*
>   * Notice: unlike other _F_ flags, below flags are defined as shifted
>   * values instead of shifts for compatibility.
> @@ -51,6 +52,7 @@
>  #define VRING_USED_F_NO_NOTIFY	1
>  /* Same as VRING_SPLIT_AVAIL_F_NO_INTERRUPT. */
>  #define VRING_AVAIL_F_NO_INTERRUPT	1
> +#endif /* VIRTIO_RING_NO_LEGACY */
>  
>  /* Mark a buffer as continuing via the next field in split ring. */
>  #define VRING_SPLIT_DESC_F_NEXT		0
> @@ -151,6 +153,7 @@ struct vring {
>  	struct vring_used *used;
>  };
>  
> +#ifndef VIRTIO_RING_NO_LEGACY
>  /* Alignment requirements for vring elements.
>   * When using pre-virtio 1.0 layout, these fall out naturally.
>   */
> @@ -203,6 +206,7 @@ static inline unsigned vring_size(unsigned int num, unsigned long align)
>  		 + align - 1) & ~(align - 1))
>  		+ sizeof(__virtio16) * 3 + sizeof(struct vring_used_elem) * num;
>  }
> +#endif /* VIRTIO_RING_NO_LEGACY */
>  
>  /* The following is used with USED_EVENT_IDX and AVAIL_EVENT_IDX */
>  /* Assuming a given event_idx value from the other side, if
> -- 
> 2.17.1

^ permalink raw reply

* Re: [RFC 3/3] virtio_ring: use new vring flags
From: Michael S. Tsirkin @ 2018-12-07 18:10 UTC (permalink / raw)
  To: Tiwei Bie
  Cc: virtio-dev, netdev, linux-kernel, virtualization, maxime.coquelin,
	wexu
In-Reply-To: <20181207084842.13133-4-tiwei.bie@intel.com>

On Fri, Dec 07, 2018 at 04:48:42PM +0800, Tiwei Bie wrote:
> Switch to using the _SPLIT_ and _PACKED_ variants of vring flags
> in split ring and packed ring respectively.
> 
> Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
> ---
> @@ -502,7 +505,8 @@ static inline int virtqueue_add_split(struct virtqueue *_vq,
>  		}
>  	}
>  	/* Last one doesn't continue. */
> -	desc[prev].flags &= cpu_to_virtio16(_vq->vdev, ~VRING_DESC_F_NEXT);
> +	desc[prev].flags &= cpu_to_virtio16(_vq->vdev,
> +				(u16)~BIT(VRING_SPLIT_DESC_F_NEXT));
>  
>  	if (indirect) {
>  		/* Now that the indirect table is filled in, map it. */

I kind of dislike it that this forces use of a cast here.
Kind of makes it more fragile. Let's use a temporary instead?

> -- 
> 2.17.1

^ permalink raw reply

* Re: [RFC 0/3] virtio_ring: define flags as shifts consistently
From: Michael S. Tsirkin @ 2018-12-07 18:11 UTC (permalink / raw)
  To: Tiwei Bie
  Cc: virtio-dev, netdev, linux-kernel, virtualization, maxime.coquelin,
	wexu
In-Reply-To: <20181207084842.13133-1-tiwei.bie@intel.com>

On Fri, Dec 07, 2018 at 04:48:39PM +0800, Tiwei Bie wrote:
> This is a follow up of the discussion in this thread:
> https://patchwork.ozlabs.org/patch/1001015/#2042353

How was this tested? I'd suggest building virtio
before and after the changes, stripped binary
should be exactly the same.


> Tiwei Bie (3):
>   virtio_ring: define flags as shifts consistently
>   virtio_ring: add VIRTIO_RING_NO_LEGACY
>   virtio_ring: use new vring flags
> 
>  drivers/virtio/virtio_ring.c     | 100 ++++++++++++++++++-------------
>  include/uapi/linux/virtio_ring.h |  61 +++++++++++++------
>  2 files changed, 103 insertions(+), 58 deletions(-)
> 
> -- 
> 2.17.1

^ permalink raw reply

* Re: [virtio-dev] Re: [PATCH v5 5/7] iommu: Add virtio-iommu driver
From: Jean-Philippe Brucker @ 2018-12-07 18:52 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: mark.rutland, virtio-dev, lorenzo.pieralisi, tnowicki, devicetree,
	marc.zyngier, linux-pci, joro, will.deacon, virtualization,
	eric.auger, iommu, robh+dt, bhelgaas, robin.murphy, kvmarm
In-Reply-To: <20181127130527-mutt-send-email-mst@kernel.org>

Sorry for the delay, I wanted to do a little more performance analysis
before continuing.

On 27/11/2018 18:10, Michael S. Tsirkin wrote:
> On Tue, Nov 27, 2018 at 05:55:20PM +0000, Jean-Philippe Brucker wrote:
>>>> +	if (!virtio_has_feature(vdev, VIRTIO_F_VERSION_1) ||
>>>> +	    !virtio_has_feature(vdev, VIRTIO_IOMMU_F_MAP_UNMAP))
>>>
>>> Why bother with a feature bit for this then btw?
>>
>> We'll need a new feature bit for sharing page tables with the hardware,
>> because they require different requests (attach_table/invalidate instead
>> of map/unmap.) A future device supporting page table sharing won't
>> necessarily need to support map/unmap.
>>
> I don't see virtio iommu being extended to support ARM specific
> requests. This just won't scale, too many different
> descriptor formats out there.

They aren't really ARM specific requests. The two new requests are
ATTACH_TABLE and INVALIDATE, which would be used by x86 IOMMUs as well.

Sharing CPU address space with the HW IOMMU (SVM) has been in the scope
of virtio-iommu since the first RFC, and I've been working with that
extension in mind since the beginning. As an example you can have a look
at my current draft for this [1], which is inspired from the VFIO work
we've been doing with Intel.

The negotiation phase inevitably requires vendor-specific fields in the
descriptors - host tells which formats are supported, guest chooses a
format and attaches page tables. But invalidation and fault reporting
descriptors are fairly generic.

> If you want to go that way down the road, you should avoid
> virtio iommu, instead emulate and share code with the ARM SMMU (probably
> with a different vendor id so you can implement the
> report on map for devices without PRI).

vSMMU has to stay in userspace though. The main reason we're proposing
virtio-iommu is that emulating every possible vIOMMU model in the kernel
would be unmaintainable. With virtio-iommu we can process the fast path
in the host kernel, through vhost-iommu, and do the heavy lifting in
userspace. As said above, I'm trying to keep the fast path for
virtio-iommu generic.

More notes on what I consider to be the fast path, and comparison with
vSMMU:

(1) The primary use-case we have in mind for vIOMMU is something like
DPDK in the guest, assigning a hardware device to guest userspace. DPDK
maps a large amount of memory statically, to be used by a pass-through
device. For this case I don't think we care about vIOMMU performance.
Setup and teardown need to be reasonably fast, sure, but the MAP/UNMAP
requests don't have to be optimal.

(2) If the assigned device is owned by the guest kernel, then mappings
are dynamic and require dma_map/unmap() to be fast, but there generally
is no need for a vIOMMU, since device and drivers are trusted by the
guest kernel. Even when the user does enable a vIOMMU for this case
(allowing to over-commit guest memory, which needs to be pinned
otherwise), we generally play tricks like lazy TLBI (non-strict mode) to
make it faster. Here device and drivers are trusted, therefore the
vulnerability window of lazy mode isn't a concern.

If the reason to enable the vIOMMU is over-comitting guest memory
however, you can't use nested translation because it requires pinning
the second-level tables. For this case performance matters a bit,
because your invalidate-on-map needs to be fast, even if you enable lazy
mode and only receive inval-on-unmap every 10ms. It won't ever be as
fast as nested translation, though. For this case I think vSMMU+Caching
Mode and userspace virtio-iommu with MAP/UNMAP would perform similarly
(given page-sized payloads), because the pagetable walk doesn't add a
lot of overhead compared to the context switch. But given the results
below, vhost-iommu would be faster than vSMMU+CM.

(3) Then there is SVM. For SVM, any destructive change to the process
address space requires a synchronous invalidation command to the
hardware (at least when using PCI ATS). Given that SVM is based on page
faults, fault reporting from host to guest also needs to be fast, as
well as fault response from guest to host.

I think this is where performance matters the most. To get a feel of the
advantage we get with virtio-iommu, I compared the vSMMU page-table
sharing implementation [2] and vhost-iommu + VFIO with page table
sharing (based on Tomasz Nowicki's vhost-iommu prototype). That's on a
ThunderX2 with a 10Gb NIC assigned to the guest kernel, which
corresponds to case (2) above, with nesting page tables and without the
lazy mode. The host's only job is forwarding invalidation to the HW SMMU.

vhost-iommu performed on average 1.8x and 5.5x better than vSMMU on
netperf TCP_STREAM and TCP_MAERTS respectively (~200 samples). I think
this can be further optimized (that was still polling under the vq
lock), and unlike vSMMU, virtio-iommu offers the possibility of
multi-queue for improved scalability. In addition, the guest will need
to send both TLB and ATC invalidations with vSMMU, but virtio-iommu
allows to multiplex those, and to invalidate ranges. Similarly for fault
injection, having the ability to report page faults to the guest from
the host kernel should be significantly faster than having to go to
userspace and back to the kernel.

(4) Virtio and vhost endpoints weren't really a priority for the base
virtio-iommu device, we were looking mainly at device pass-through. I
have optimizations in mind for this, although a lot of them are based on
page tables, not MAP/UNMAP requests. But just getting the vIOMMU closer
to vhost devices, avoiding the trip to userspace through vhost-tlb,
should already improve things.

The important difference when DMA is done by software is that you don't
need to mirror all mappings into the HW IOMMU - you don't need
inval-on-map. The endpoint can ask the vIOMMU for mappings when it needs
them, like vhost-iotlb does for example. So the MAP/UNMAP interface of
virtio-iommu performs poorly for emulated/PV endpoints compared to an
emulated IOMMU, since it requires three context switches for DMA
(MAP/DMA/UNMAP) between host and guest, rather than two (DMA/INVAL).
There is a feature I call "posted MAP", that avoids the kick on MAP and
instead lets the device fetch the MAP request on TLB miss, but I haven't
spent enough time experimenting with this.

> Others on the TC might feel differently.
> 
> If someone's looking into adding virtio iommu support in hardware,
> that's a different matter. Which is it?

I'm not aware of anything like that, and suspect that no one would
consider it until virtio-iommu is more widely adopted.

Thanks,
Jean

[1] Diff between current spec and page table sharing draft
    (Very rough, missing page fault support and I'd like to rework the
     PASID model a bit, but table descriptors p.24-26 for both Arm
     SMMUv2 and SMMUv3.)

http://jpbrucker.net/virtio-iommu/spec-table/diffs/virtio-iommu-pdf-diff-v0.9-v0.10.dev03.pdf

[2] [RFC v2 00/28] vSMMUv3/pSMMUv3 2 stage VFIO integration
    https://www.mail-archive.com/qemu-devel@nongnu.org/msg562369.html

^ permalink raw reply

* [PULL] virtio/vhost: last minute fixes
From: Michael S. Tsirkin @ 2018-12-07 21:53 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: graham.whaley, kvm, mst, riemensberger, netdev, cohuck,
	linux-kernel, stable, virtualization, pasic,
	syzbot+d5a0a170c5069658b141, stefanha,
	syzbot+bd391451452fb0b93039, colin.king,
	syzbot+e3e074963495f92a89ed

The following changes since commit 2595646791c319cadfdbf271563aac97d0843dc7:

  Linux 4.20-rc5 (2018-12-02 15:07:55 -0800)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git tags/for_linus

for you to fetch changes up to 834e772c8db0c6a275d75315d90aba4ebbb1e249:

  vhost/vsock: fix use-after-free in network stack callers (2018-12-06 14:28:38 -0500)

----------------------------------------------------------------
virtio fixes

A couple of last-minute fixes.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

----------------------------------------------------------------
Halil Pasic (2):
      virtio/s390: avoid race on vcdev->config
      virtio/s390: fix race in ccw_io_helper()

Stefan Hajnoczi (2):
      vhost/vsock: fix reset orphans race with close timeout
      vhost/vsock: fix use-after-free in network stack callers

 drivers/s390/virtio/virtio_ccw.c | 17 +++++++--
 drivers/vhost/vsock.c            | 79 ++++++++++++++++++++++++----------------
 2 files changed, 62 insertions(+), 34 deletions(-)

^ permalink raw reply

* Re: [PATCH 1/1] virtio: remove deprecated VIRTIO_PCI_CONFIG()
From: Dongli Zhang @ 2018-12-08  0:56 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: linux-kernel, virtualization
In-Reply-To: <20181207125951-mutt-send-email-mst@kernel.org>



On 12/08/2018 02:01 AM, Michael S. Tsirkin wrote:
> On Fri, Dec 07, 2018 at 03:34:41PM +0800, Dongli Zhang wrote:
>> VIRTIO_PCI_CONFIG() is deprecated. Use VIRTIO_PCI_CONFIG_OFF() instead.
>>
>> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
>> ---
>>  drivers/virtio/virtio_pci_legacy.c | 6 ++++--
>>  include/uapi/linux/virtio_pci.h    | 2 --
>>  2 files changed, 4 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/virtio/virtio_pci_legacy.c b/drivers/virtio/virtio_pci_legacy.c
>> index de062fb..eff9ddc 100644
>> --- a/drivers/virtio/virtio_pci_legacy.c
>> +++ b/drivers/virtio/virtio_pci_legacy.c
>> @@ -52,7 +52,8 @@ static void vp_get(struct virtio_device *vdev, unsigned offset,
>>  {
>>  	struct virtio_pci_device *vp_dev = to_vp_device(vdev);
>>  	void __iomem *ioaddr = vp_dev->ioaddr +
>> -				VIRTIO_PCI_CONFIG(vp_dev) + offset;
>> +			VIRTIO_PCI_CONFIG_OFF(vp_dev->msix_enabled) +
>> +			offset;
>>  	u8 *ptr = buf;
>>  	int i;
>>  
>> @@ -67,7 +68,8 @@ static void vp_set(struct virtio_device *vdev, unsigned offset,
>>  {
>>  	struct virtio_pci_device *vp_dev = to_vp_device(vdev);
>>  	void __iomem *ioaddr = vp_dev->ioaddr +
>> -				VIRTIO_PCI_CONFIG(vp_dev) + offset;
>> +			VIRTIO_PCI_CONFIG_OFF(vp_dev->msix_enabled) +
>> +			offset;
>>  	const u8 *ptr = buf;
>>  	int i;
>>
> 
> I agree that VIRTIO_PCI_CONFIG_OFF is a better interface. So above looks
> fine.
>   
>> diff --git a/include/uapi/linux/virtio_pci.h b/include/uapi/linux/virtio_pci.h
>> index 90007a1..2070232 100644
>> --- a/include/uapi/linux/virtio_pci.h
>> +++ b/include/uapi/linux/virtio_pci.h
>> @@ -78,8 +78,6 @@
>>  /* The remaining space is defined by each driver as the per-driver
>>   * configuration space */
>>  #define VIRTIO_PCI_CONFIG_OFF(msix_enabled)	((msix_enabled) ? 24 : 20)
>> -/* Deprecated: please use VIRTIO_PCI_CONFIG_OFF instead */
>> -#define VIRTIO_PCI_CONFIG(dev)	VIRTIO_PCI_CONFIG_OFF((dev)->msix_enabled)
>>  
>>  /* Virtio ABI version, this must match exactly */
>>  #define VIRTIO_PCI_ABI_VERSION		0
> 
> This might break some userspace builds, I don't see why we should bother
> removing it. Any reason?

Apologies.

I thought about some compatibility issue for building third-party kernel module
at userspace, but did not realize it will break other userspace software builds.

I will keep the definition of VIRTIO_PCI_CONFIG() and resend again.

Thank you very much!

Dongli Zhang

> 
> 
>> -- 
>> 2.7.4

^ permalink raw reply

* Re: [RFC 0/3] virtio_ring: define flags as shifts consistently
From: Tiwei Bie @ 2018-12-08 13:26 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-dev, netdev, linux-kernel, virtualization, maxime.coquelin,
	wexu
In-Reply-To: <20181207131058-mutt-send-email-mst@kernel.org>

On Fri, Dec 07, 2018 at 01:11:42PM -0500, Michael S. Tsirkin wrote:
> On Fri, Dec 07, 2018 at 04:48:39PM +0800, Tiwei Bie wrote:
> > This is a follow up of the discussion in this thread:
> > https://patchwork.ozlabs.org/patch/1001015/#2042353
> 
> How was this tested? I'd suggest building virtio
> before and after the changes, stripped binary
> should be exactly the same.

Sure, I will do the test with scripts/objdiff.

> 
> 
> > Tiwei Bie (3):
> >   virtio_ring: define flags as shifts consistently
> >   virtio_ring: add VIRTIO_RING_NO_LEGACY
> >   virtio_ring: use new vring flags
> > 
> >  drivers/virtio/virtio_ring.c     | 100 ++++++++++++++++++-------------
> >  include/uapi/linux/virtio_ring.h |  61 +++++++++++++------
> >  2 files changed, 103 insertions(+), 58 deletions(-)
> > 
> > -- 
> > 2.17.1

^ permalink raw reply

* Re: [RFC 2/3] virtio_ring: add VIRTIO_RING_NO_LEGACY
From: Tiwei Bie @ 2018-12-08 13:33 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-dev, netdev, linux-kernel, virtualization, maxime.coquelin,
	wexu
In-Reply-To: <20181207130336-mutt-send-email-mst@kernel.org>

On Fri, Dec 07, 2018 at 01:05:35PM -0500, Michael S. Tsirkin wrote:
> On Fri, Dec 07, 2018 at 04:48:41PM +0800, Tiwei Bie wrote:
> > Introduce VIRTIO_RING_NO_LEGACY to support disabling legacy
> > macros and layout definitions.
> > 
> > Suggested-by: Michael S. Tsirkin <mst@redhat.com>
> > Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
> > ---
> > VRING_AVAIL_ALIGN_SIZE, VRING_USED_ALIGN_SIZE and VRING_DESC_ALIGN_SIZE
> > are not pre-virtio 1.0, but can also be disabled by VIRTIO_RING_NO_LEGACY
> > in this patch, because their names are not consistent with other names.
> > Not sure whether this is a good idea. If we want this, we may also want
> > to define _SPLIT_ version for them.
> 
> I don't think it's a good idea to have alignment in there - the point of
> NO_LEGACY is to help catch bugs not to sanitize coding style IMHO.
> 
> And spec calls "legacy" the 0.X interfaces, let's not muddy the waters.

Make sense. Thanks!

> 
> > 
> >  include/uapi/linux/virtio_ring.h | 4 ++++
> >  1 file changed, 4 insertions(+)
> > 
> > diff --git a/include/uapi/linux/virtio_ring.h b/include/uapi/linux/virtio_ring.h
> > index 9b0c0d92ab62..192573827850 100644
> > --- a/include/uapi/linux/virtio_ring.h
> > +++ b/include/uapi/linux/virtio_ring.h
> > @@ -37,6 +37,7 @@
> >  #include <linux/types.h>
> >  #include <linux/virtio_types.h>
> >  
> > +#ifndef VIRTIO_RING_NO_LEGACY
> >  /*
> >   * Notice: unlike other _F_ flags, below flags are defined as shifted
> >   * values instead of shifts for compatibility.
> > @@ -51,6 +52,7 @@
> >  #define VRING_USED_F_NO_NOTIFY	1
> >  /* Same as VRING_SPLIT_AVAIL_F_NO_INTERRUPT. */
> >  #define VRING_AVAIL_F_NO_INTERRUPT	1
> > +#endif /* VIRTIO_RING_NO_LEGACY */
> >  
> >  /* Mark a buffer as continuing via the next field in split ring. */
> >  #define VRING_SPLIT_DESC_F_NEXT		0
> > @@ -151,6 +153,7 @@ struct vring {
> >  	struct vring_used *used;
> >  };
> >  
> > +#ifndef VIRTIO_RING_NO_LEGACY
> >  /* Alignment requirements for vring elements.
> >   * When using pre-virtio 1.0 layout, these fall out naturally.
> >   */
> > @@ -203,6 +206,7 @@ static inline unsigned vring_size(unsigned int num, unsigned long align)
> >  		 + align - 1) & ~(align - 1))
> >  		+ sizeof(__virtio16) * 3 + sizeof(struct vring_used_elem) * num;
> >  }
> > +#endif /* VIRTIO_RING_NO_LEGACY */
> >  
> >  /* The following is used with USED_EVENT_IDX and AVAIL_EVENT_IDX */
> >  /* Assuming a given event_idx value from the other side, if
> > -- 
> > 2.17.1

^ permalink raw reply

* Re: [RFC 3/3] virtio_ring: use new vring flags
From: Tiwei Bie @ 2018-12-08 13:47 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-dev, netdev, linux-kernel, virtualization, maxime.coquelin,
	wexu
In-Reply-To: <20181207130822-mutt-send-email-mst@kernel.org>

On Fri, Dec 07, 2018 at 01:10:48PM -0500, Michael S. Tsirkin wrote:
> On Fri, Dec 07, 2018 at 04:48:42PM +0800, Tiwei Bie wrote:
> > Switch to using the _SPLIT_ and _PACKED_ variants of vring flags
> > in split ring and packed ring respectively.
> > 
> > Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
> > ---
> > @@ -502,7 +505,8 @@ static inline int virtqueue_add_split(struct virtqueue *_vq,
> >  		}
> >  	}
> >  	/* Last one doesn't continue. */
> > -	desc[prev].flags &= cpu_to_virtio16(_vq->vdev, ~VRING_DESC_F_NEXT);
> > +	desc[prev].flags &= cpu_to_virtio16(_vq->vdev,
> > +				(u16)~BIT(VRING_SPLIT_DESC_F_NEXT));
> >  
> >  	if (indirect) {
> >  		/* Now that the indirect table is filled in, map it. */
> 
> I kind of dislike it that this forces use of a cast here.
> Kind of makes it more fragile. Let's use a temporary instead?

I tried something like this:

u16 mask = ~BIT(VRING_SPLIT_DESC_F_NEXT);

And it will still cause the warning:

warning: large integer implicitly truncated to unsigned type [-Woverflow]
  u16 mask = ~BIT(VRING_SPLIT_DESC_F_NEXT);

If the cast isn't wanted, maybe use ~(1 << VRING_SPLIT_DESC_F_NEXT)
directly?

> 
> > -- 
> > 2.17.1

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox