virtualization.lists.linux-foundation.org archive mirror
From: "Michael S. Tsirkin" <mst@redhat.com>
To: Halil Pasic <pasic@linux.ibm.com>
Cc: linux-s390@vger.kernel.org, markver@us.ibm.com,
	Christian Borntraeger <borntraeger@de.ibm.com>,
	Cornelia Huck <cohuck@redhat.com>,
	linux-kernel@vger.kernel.org,
	virtualization@lists.linux-foundation.org,
	Xie Yongji <xieyongji@bytedance.com>
Subject: Re: [RFC PATCH 1/1] virtio: write back features before verify
Date: Sat, 2 Oct 2021 06:21:32 -0400
Message-ID: <20211002055605-mutt-send-email-mst@kernel.org>
In-Reply-To: <20211001092125.64fef348.pasic@linux.ibm.com>

On Fri, Oct 01, 2021 at 09:21:25AM +0200, Halil Pasic wrote:
> On Thu, 30 Sep 2021 07:12:21 -0400
> "Michael S. Tsirkin" <mst@redhat.com> wrote:
> 
> > On Thu, Sep 30, 2021 at 03:20:49AM +0200, Halil Pasic wrote:
> > > This patch fixes a regression introduced by commit 82e89ea077b9
> > > ("virtio-blk: Add validation for block size in config space") and
> > > enables similar checks in verify() on big endian platforms.
> > > 
> > > The problem with checking multi-byte config fields in the verify
> > > callback, on big endian platforms, and with a possibly transitional
> > > device is the following. The verify() callback is called between
> > > config->get_features() and virtio_finalize_features(). If we have a
> > > device that offered F_VERSION_1, then we have the following options:
> > > either the device is transitional, and then it has to present the legacy
> > > interface, i.e. a big endian config space until F_VERSION_1 is
> > > negotiated, or we have a non-transitional device, which makes
> > > F_VERSION_1 mandatory, and only implements the non-legacy interface and
> > > thus presents a little endian config space. Because at this point we
> > > can't know if the device is transitional or non-transitional, we can't
> > > know whether we need to byte swap or not.  
> > 
> > Hmm which transport does this refer to?
> 
> It is the same with virtio-ccw and virtio-pci. I see the same problem
> with both on s390x. I didn't try with virtio-blk-pci-non-transitional
> yet (have to figure out how to do that with libvirt); for pci I used
> virtio-blk-pci.
> 
> > Distinguishing between legacy and modern drivers is transport
> > specific.  PCI presents
> > legacy and modern at separate addresses so distinguishing
> > between these two should be no trouble.
> 
> You mean the device id? Yes, that is bolted down in the spec, but
> currently we don't exploit that information. Furthermore there
> is a good chance that with QEMU even the allegedly non-transitional
> devices only present a little endian config space after VERSION_1
> was negotiated. Namely get_config for virtio-blk is implemented in
> virtio_blk_update_config() which does virtio_stl_p(vdev,
> &blkcfg.blk_size, blk_size) and in there we don't care
> about transitional or not:
> 
> static inline bool virtio_access_is_big_endian(VirtIODevice *vdev)
> {
> #if defined(LEGACY_VIRTIO_IS_BIENDIAN)
>     return virtio_is_big_endian(vdev);
> #elif defined(TARGET_WORDS_BIGENDIAN)
>     if (virtio_vdev_has_feature(vdev, VIRTIO_F_VERSION_1)) {
>         /* Devices conforming to VIRTIO 1.0 or later are always LE. */
>         return false;
>     }
>     return true;
> #else
>     return false;
> #endif
> }
> 

OK, so that's a QEMU bug. Any device compatible with virtio 1.0
and up must use LE.
It can also present a legacy config space where the
endianness depends on the guest.

> > Channel i/o has versioning so same thing?
> >
> 
> Don't think so. Both a transitional and a non-transitional device
> would have to accept revisions higher than 0 if the driver tried to
> negotiate those (and we do in our case).

Yes, the modern driver does. And that one is known to be LE.
The legacy driver doesn't.

> > > The virtio spec explicitly states that the driver MAY read config
> > > between reading and writing the features so saying that first accessing
> > > the config before feature negotiation is done is not an option. The
> > > specification ain't clear about setting the features multiple times
> > > before FEATURES_OK, so I guess that should be fine.
> > > 
> > > I don't consider this patch super clean, but frankly I don't think we
> > > have a ton of options. Another option that may or may not be cleaner,
> > > but is also IMHO much uglier is to figure out whether the device is
> > > transitional by rejecting _F_VERSION_1, then resetting it and proceeding
> > > according to what we have figured out, hoping that the characteristics
> > > of the device didn't change.  
> > 
> > I am confused here. So is the problem at the device or at the driver level?
> 
> We have a driver regression. Since the 82e89ea077b9 ("virtio-blk: Add
> validation for block size in config space") virtio-blk is broken on
> s390.

Because of a qemu bug. I agree. It's worth working around in the driver
since the qemu bug has been around for a very long time.


> The deeper problem is in the spec. We stated that the driver may read
> config space before the feature negotiation is finalized, but we didn't
> think enough about what happens when native endianness is not little
> endian in the different cases.

Because the spec is very clear that endian-ness is LE.
I don't see a spec issue yet here, just an implementation issue.

> I believe, for non-transitional devices we have a problem in the host as
> well (i.e. in QEMU).

Because QEMU ignores the spec and instead relies on the feature
negotiation.

> 
> > I suspect it's actually the host that has the issue, not
> > the guest?
> 
> I tend to say we have a problem both in the host and in the guest. I'm
> more concerned about the problem in the guest, because that is a really
> nasty regression.

The problem is in the guest. The bug is in the host ;)

> For the host, I think for legacy we don't have a
> problem, because both sides would operate on the assumption no
> _F_VERSION_1. IMHO the implementation for the transitional devices is
> correct.

Well no, the point of transitional is really to be 1.0 compliant
*and* also expose a legacy interface.

> For the non-transitional flavor, it depends on the device. For
> example virtio-net and virtio-blk are broken, because we use primitives
> like virtio_stl_p() and those don't do the right thing before feature
> negotiation is completed. On the other hand virtio-crypto.c as a truly
> non-transitional device uses stl_le_p() and IMHO does the right thing.
> 
> Thanks for your comments! I hope I managed to answer your questions. I
> need some guidance on how we want to move forward on this.
> 
> Regards,
> Halil

OK, so I don't have a problem with the patch itself,
assuming it's enough to work around all buggy hosts.
I am especially worried about things like vhost/vhost-user;
I suspect they might have a bug like this too, and
I am not sure whether your workaround is enough for these.
Can you check please?

If not, we'll have to move all validate code to after FEATURES_OK
is set.

We do however want to document that this API can be called
multiple times since that was not the case previously.

Also, I would limit this to when
- the validate callback exists
- the guest endian-ness is not LE

We also want to document the QEMU bug in a comment here,
e.g. 

/*
 * QEMU before version 6.2 incorrectly uses driver features with guest
 * endian-ness to set endian-ness for config space instead of just using
 * LE for the modern interface as per spec.
 * This breaks reading config in the validate callback.
 * To work around that, when device is 1.0 (so supposed to be LE)
 * but guest is not LE, then send the features to device one extra
 * time before validation.
 */
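
The two conditions above, plus the VERSION_1 check from the patch,
could be combined roughly as follows. This is a standalone sketch, not
the actual virtio_dev_probe() change: needs_early_feature_write() and
its boolean parameters are hypothetical stand-ins for drv->validate
being set and for the guest's endianness. Only the VIRTIO_F_VERSION_1
bit number (32) is taken from the spec.

```c
#include <stdbool.h>
#include <stdint.h>

#define VIRTIO_F_VERSION_1 32 /* feature bit number from the virtio spec */

/*
 * Hypothetical predicate: should the driver core send the features to
 * the device one extra time before calling the validate callback?
 */
static bool needs_early_feature_write(uint64_t device_features,
                                      bool has_validate,
                                      bool guest_is_le)
{
    /* Nothing reads config early if there is no validate callback. */
    if (!has_validate)
        return false;
    /* On an LE guest, legacy and modern layouts agree on byte order. */
    if (guest_is_le)
        return false;
    /* Only devices offering VERSION_1 are supposed to be LE. */
    return (device_features & (1ULL << VIRTIO_F_VERSION_1)) != 0;
}
```

In the probe path this would guard the extra finalize_features() call,
so LE guests and drivers without a validate callback keep the old
single-write behaviour.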

Finally I'd like to see the QEMU bug fix before I merge this one,
since it will be harder to test with a fix.




> > 
> > 
> > > Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> > > Fixes: 82e89ea077b9 ("virtio-blk: Add validation for block size in config space")
> > > Reported-by: markver@us.ibm.com
> > > ---
> > >  drivers/virtio/virtio.c | 4 ++++
> > >  1 file changed, 4 insertions(+)
> > > 
> > > diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
> > > index 0a5b54034d4b..9dc3cfa17b1c 100644
> > > --- a/drivers/virtio/virtio.c
> > > +++ b/drivers/virtio/virtio.c
> > > @@ -249,6 +249,10 @@ static int virtio_dev_probe(struct device *_d)
> > >  		if (device_features & (1ULL << i))
> > >  			__virtio_set_bit(dev, i);
> > >  
> > > +	/* Write back features before validate to know endianness */
> > > +	if (device_features & (1ULL << VIRTIO_F_VERSION_1))
> > > +		dev->config->finalize_features(dev);
> > > +
> > >  	if (drv->validate) {
> > >  		err = drv->validate(dev);
> > >  		if (err)
> > > 
> > > base-commit: 02d5e016800d082058b3d3b7c3ede136cdc6ddcb
> > > -- 
> > > 2.25.1  
> > 

