Re: [RFC PATCH 1/1] virtio: write back features before verify

* Re: [RFC PATCH 1/1] virtio: write back features before verify
       [not found]             ` <20211004042323.730c6a5e.pasic@linux.ibm.com>
@ 2021-10-04  9:07               ` Michael S. Tsirkin
  2021-10-05 10:06                 ` Cornelia Huck
  2021-10-05 10:43                 ` Halil Pasic
  0 siblings, 2 replies; 20+ messages in thread
From: Michael S. Tsirkin @ 2021-10-04  9:07 UTC (permalink / raw)
  To: Halil Pasic
  Cc: linux-s390, markver, Christian Borntraeger, qemu-devel,
	Jason Wang, Cornelia Huck, linux-kernel, virtualization,
	Xie Yongji, stefanha, Raphael Norwitz

On Mon, Oct 04, 2021 at 04:23:23AM +0200, Halil Pasic wrote:
> On Sat, 2 Oct 2021 14:13:37 -0400
> "Michael S. Tsirkin" <mst@redhat.com> wrote:
> 
> > > Anyone else have an idea? This is a nasty regression; we could revert the
> > > patch, which would remove the symptoms and give us some time, but that
> > > doesn't really feel right, I'd do that only as a last resort.  
> > 
> > Well we have Halil's hack (except I would limit it
> > to only apply to BE, only do devices with validate,
> > and only in modern mode), and we will fix QEMU to be spec compliant.
> > Between these why do we need any conditional compiles?
> 
> We don't. As I stated before, this hack is flawed because it
> effectively breaks fencing features by the driver with QEMU. Some
> features can not be unset after once set, because we tend to try to
> enable the corresponding functionality whenever we see a write
> features operation with the feature bit set, and we don't disable, if a
> subsequent features write operation stores the feature bit as not set.

Something to fix in QEMU too, I think.

> But it looks like VIRTIO_1 is fine to get cleared afterwards.

We'd never clear it though - why would we?

> So my hack
> should actually look like posted below, modulo conditions.

Looking at it some more, I see that vhost-user actually
does not send features to the backend until FEATURES_OK.
However, the code in contrib for vhost-user-blk at least seems
broken wrt endian-ness ATM. What about other backends though?
Hard to be sure right?
Cc Raphael and Stefan so they can take a look.
And I guess it's time we CC'd qemu-devel too.

For now I am beginning to think we should either revert or just limit
validation to LE and think about all this some more. And I am inclining
to do a revert. These are all hypervisors that shipped for a long time.
Do we need a flag for early config space access then?

> 
> Regarding the conditions I guess checking that driver_features has
> F_VERSION_1 already satisfies "only modern mode", or?

Right.

> For now
> I've deliberately omitted the has verify and the is big endian
> conditions so we have a better chance to see if something breaks
> (i.e. the approach does not work). I can add in those extra conditions
> later.

Or maybe if we will go down that road just the verify check (for
performance). I'm a bit unhappy we have the extra exit but consistency
seems more important.

> 
> --------------------------8<---------------------
> 
> From: Halil Pasic <pasic@linux.ibm.com>
> Date: Thu, 30 Sep 2021 02:38:47 +0200
> Subject: [PATCH] virtio: write back feature VERSION_1 before verify
> 
> This patch fixes a regression introduced by commit 82e89ea077b9
> ("virtio-blk: Add validation for block size in config space") and
> enables similar checks in verify() on big endian platforms.
> 
> The problem with checking multi-byte config fields in the verify
> callback, on big endian platforms, and with a possibly transitional
> device is the following. The verify() callback is called between
> config->get_features() and virtio_finalize_features(). That we have a
> device that offered F_VERSION_1 then we have the following options
> either the device is transitional, and then it has to present the legacy
> interface, i.e. a big endian config space until F_VERSION_1 is
> negotiated, or we have a non-transitional device, which makes
> F_VERSION_1 mandatory, and only implements the non-legacy interface and
> thus presents a little endian config space. Because at this point we
> can't know if the device is transitional or non-transitional, we can't
> know do we need to byte swap or not.

Well we established that we can know. Here's an alternative explanation:

	The virtio specification virtio-v1.1-cs01 states:

	Transitional devices MUST detect Legacy drivers by detecting that
	VIRTIO_F_VERSION_1 has not been acknowledged by the driver.
	This is exactly what QEMU as of 6.1 has done relying solely
	on VIRTIO_F_VERSION_1 for detecting that.

	However, the specification also says:
	driver MAY read (but MUST NOT write) the device-specific
	configuration fields to check that it can support the device before
	accepting it.

	In that case, any device relying solely on VIRTIO_F_VERSION_1
	for detecting legacy drivers will return data in legacy format.
	In particular, this implies that it is in big endian format
	for big endian guests. This naturally confuses the driver
	which expects little endian in the modern mode.

	It is probably a good idea to amend the spec to clarify that
	VIRTIO_F_VERSION_1 can only be relied on after the feature negotiation
	is complete. However, we already have regression so let's
	try to address it.

> 
> The virtio spec explicitly states that the driver MAY read config
> between reading and writing the features so saying that first accessing
> the config before feature negotiation is done is not an option. The
> specification ain't clear about setting the features multiple times
> before FEATURES_OK, so I guess that should be fine to set F_VERSION_1
> since at this point we already know that we are about to negotiate
> F_VERSION_1.
> 
> I don't consider this patch super clean, but frankly I don't think we
> have a ton of options. Another option that may or man not be cleaner,
> but is also IMHO much uglier is to figure out whether the device is
> transitional by rejecting _F_VERSION_1, then resetting it and proceeding
> according tho what we have figured out, hoping that the characteristics
> of the device didn't change.

An empty line before tags.

> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> Fixes: 82e89ea077b9 ("virtio-blk: Add validation for block size in config space")
> Reported-by: markver@us.ibm.com

Let's add more commits that are affected. E.g. virtio-net with MTU
feature bit set is affected too.

So let's add Fixes tag for:
commit 14de9d114a82a564b94388c95af79a701dc93134
Author: Aaron Conole <aconole@redhat.com>
Date:   Fri Jun 3 16:57:12 2016 -0400

    virtio-net: Add initial MTU advice feature

I think that's all, but pls double check me.

> ---
>  drivers/virtio/virtio.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
> index 0a5b54034d4b..2b9358f2e22a 100644
> --- a/drivers/virtio/virtio.c
> +++ b/drivers/virtio/virtio.c
> @@ -239,6 +239,12 @@ static int virtio_dev_probe(struct device *_d)
>  		driver_features_legacy = driver_features;
>  	}
>  
> +	/* Write F_VERSION_1 feature to pin down endianness */
> +	if (device_features & (1ULL << VIRTIO_F_VERSION_1) & driver_features) {
> +		dev->features = (1ULL << VIRTIO_F_VERSION_1);
> +		dev->config->finalize_features(dev);
> +	}
> +
>  	if (device_features & (1ULL << VIRTIO_F_VERSION_1))
>  		dev->features = driver_features & device_features;
>  	else
> -- 
> 2.31.1
> 
> 
> 
> 
> 
>  

^ permalink raw reply	[flat|nested] 20+ messages in thread