From: "Michael S. Tsirkin" <mst@redhat.com>
To: Jason Wang <jasowang@redhat.com>
Cc: linux-s390@vger.kernel.org, kvm <kvm@vger.kernel.org>,
Vasily Gorbik <gor@linux.ibm.com>,
Heiko Carstens <hca@linux.ibm.com>,
Cornelia Huck <cohuck@redhat.com>,
linux-kernel <linux-kernel@vger.kernel.org>,
virtualization <virtualization@lists.linux-foundation.org>,
Halil Pasic <pasic@linux.ibm.com>,
Christian Borntraeger <borntraeger@de.ibm.com>,
Alexander Gordeev <agordeev@linux.ibm.com>,
Ben Hutchings <ben@decadent.org.uk>
Subject: Re: [PATCH V3] virtio: disable notification hardening by default
Date: Tue, 28 Jun 2022 00:59:49 -0400 [thread overview]
Message-ID: <20220628004614-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <CACGkMEvcs+9_SHmO1s3nyzgU7oq7jhU2gircVVR3KDsGDikh5Q@mail.gmail.com>
On Tue, Jun 28, 2022 at 11:49:12AM +0800, Jason Wang wrote:
> > Heh. Yea sure. But things work fine for people. What is the chance
> > your review found and fixed all driver bugs?
>
> I don't/can't audit all bugs but the race between open/close against
> ready/reset. It looks to me a good chance to fix them all but if you
> think differently, let me know
>
> > After two attempts
> > I don't feel like hoping audit will fix all bugs.
>
> I've started the auditing and have 15+ patches in the queue. (only
> covers bluetooth, console, pmem, virtio-net and caif). Spotting the
> issue is not hard but the testing, It would take at least the time of
> one release to finalize I guess.
Absolutely. So I am looking for a way to implement hardening that does
not break existing drivers.
> >
> >
> > > >
> > > > The reason config was kind of easy is that config interrupt is rarely
> > > > vital for device function so arbitrarily deferring that does not lead to
> > > > deadlocks - what you are trying to do with VQ interrupts is
> > > > fundamentally different. Things are especially bad if we just drop
> > > > an interrupt but deferring can lead to problems too.
> > >
> > > I'm not sure I see the difference, disable_irq() stuffs also delay the
> > > interrupt processing until enable_irq().
> >
> >
> > Absolutely. I am not at all sure disable_irq fixes all problems.
> >
> > > >
> > > > Consider as an example
> > > > virtio-net: fix race between ndo_open() and virtio_device_ready()
> > > > if you just defer vq interrupts you get deadlocks.
> > > >
> > > >
> > >
> > > I don't see a deadlock here, maybe you can show more detail on this?
> >
> > What I mean is this: if we revert the above commit, things still
> > work (out of spec, but still). If we revert and defer interrupts until
> > device ready then ndo_open that triggers before device ready deadlocks.
>
> Ok, I guess you meant on a hypervisor that is strictly written with spec.
I mean on hypervisor that starts processing queues after getting a kick
even without DRIVER_OK.
> >
> >
> > > >
> > > > So, thinking about all this, how about a simple per vq flag meaning
> > > > "this vq was kicked since reset"?
> > >
> > > And ignore the notification if vq is not kicked? It sounds like the
> > > callback needs to be synchronized with the kick.
> >
> > Note we only need to synchronize it when it changes, which is
> > only during initialization and reset.
>
> Yes.
>
> >
> >
> > > >
> > > > If driver does not kick then it's not ready to get callbacks, right?
> > > >
> > > > Sounds quite clean, but we need to think through memory ordering
> > > > concerns - I guess it's only when we change the value so
> > > > if (!vq->kicked) {
> > > > vq->kicked = true;
> > > > mb();
> > > > }
> > > >
> > > > will do the trick, right?
> > >
> > > There's no much difference with the existing approach:
> > >
> > > 1) your proposal implicitly makes callbacks ready in virtqueue_kick()
> > > 2) my proposal explicitly makes callbacks ready via virtio_device_ready()
> > >
> > > Both require careful auditing of all the existing drivers to make sure
> > > no kick before DRIVER_OK.
> >
> > Jason, kick before DRIVER_OK is out of spec, sure. But it is unrelated
> > to hardening
>
> Yes but with your proposal, it seems to couple kick with DRIVER_OK somehow.
I don't see how - my proposal ignores DRIVER_OK issues.
> > and in absence of config interrupts is generally easily
> > fixed just by sticking virtio_device_ready early in initialization.
>
> So if the kick is done before the subsystem registration, there's
> still a window in the middle (assuming we stick virtio_device_ready()
> early):
>
> virtio_device_ready()
> virtqueue_kick()
> /* the window */
> subsystem_registration()
Absolutely, however, I do not think we really have many such drivers
since this has been known as a wrong thing to do since the beginning.
Want to try to find any? I couldn't ... except maybe bluetooth
but that's just maintainer nacking fixes saying he'll fix it
his way ...
> And during remove(), we get another window:
>
> subsysrem_unregistration()
> /* the window */
> virtio_device_reset()
Same here.
> if we do reset before, we may end up with other issues like deadlock.
>
> So I think it's probably very hard to fix issues only at the virtio
> core since we need subsystem specific knowledge to make sure
> everything is synchronized.
How many drivers do you see with the issue above?
As compared to yours which has 16 patches just in your queue.
> > With the current approach one has to *also* not do virtio_device_ready
> > too early - and it's really tricky.
>
> Not sure how much we differ here, during the probe driver can just
> place the virtio_device_ready() after the kick.
Not so easy. For example, consider virtio net without your
locking change. kick is part of a command vq operation
which does not directly have anything to do with probe.
Same for many other devices - once you register,
linux can send commands immediately.
Yes we can invent per device flags and locking rules to
try and suppress commands until device_ready. I am not
all that happy with this idea to be frank ...
> >
> > With the proposal I think that we don't need to fix all drivers and
> > in my eyes that is a huge advantage because frankly I'm fine with
> > more work on strict spec compliance taking more than expected
> > but I would like the hardening work to finally be done.
>
> Ok, but what I meant is, with your proposal if a buggy drive kicks
> before DRIVER_OK, it suppresses the effect of hardening?
I don't see how. DRIVER_OK is for device's benefit, guest does not need
it.
> > I am not sure the amount of effort expended on the hardening here is
> > proportionate to the benefit it provides.
>
> Probably, but we received those bug reports from the confidential
> computing guys.
>
> Or can we choose to go another way, let the kconfig option enabled for
> TDX/SEV, and then fix the bugs only if it is reported?
Nah that's really fragmenting out userbase.
> >
> >
> >
> > > >
> > > > need to think about the reset path - it already synchronizes callbacks
> > > > and already can lose interrupts so we just need to clear vq->kicked
> > > > before that, right?
> > >
> > > Probably.
> > >
> > > >
> > > >
> > > > > 2) seems require more work in the IRQ core and it can not work for all
> > > > > transports (e.g vDPA would be kind of difficult)
> > > > >
> > > > > Thanks
> > > >
> > > > Hmm I don't really get why would it be difficult.
> > > > VDPA is mostly PCI isn't it? With PCI both level INT#x and edge MSI
> > > > have interrupt masking support.
> > >
> > > Yes, but consider the case of mlx5_vdpa, PCI stuff was hidden under
> > > the auxiliary bus. And that is the way another vendor will go.
> > >
> > > Thanks
> >
> > A bunch of callbacks will do it I guess.
>
> Possible but looks like a layer violation, I think auxiliary stuff
> wants to hide the underlayer architecture. That is why I tend to do it
> in the virtio core. And actually, transport is freed to implement
> another layer of those synchronization if it wants.
>
> Thanks
As usual a level of indirection will help with layering.
> >
> > > >
> > > >
> > > > > >
> > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > > ---
> > > > > > > > > Changes since V2:
> > > > > > > > > - Tweak the Kconfig help
> > > > > > > > > - Add comment for the read_lock() pairing in virtio_ccw
> > > > > > > > > ---
> > > > > > > > > drivers/s390/virtio/virtio_ccw.c | 9 ++++++++-
> > > > > > > > > drivers/virtio/Kconfig | 13 +++++++++++++
> > > > > > > > > drivers/virtio/virtio.c | 2 ++
> > > > > > > > > drivers/virtio/virtio_ring.c | 12 ++++++++++++
> > > > > > > > > include/linux/virtio_config.h | 2 ++
> > > > > > > > > 5 files changed, 37 insertions(+), 1 deletion(-)
> > > > > > > > >
> > > > > > > > > diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
> > > > > > > > > index 97e51c34e6cf..1f6a358f65f0 100644
> > > > > > > > > --- a/drivers/s390/virtio/virtio_ccw.c
> > > > > > > > > +++ b/drivers/s390/virtio/virtio_ccw.c
> > > > > > > > > @@ -1136,8 +1136,13 @@ static void virtio_ccw_int_handler(struct ccw_device *cdev,
> > > > > > > > > vcdev->err = -EIO;
> > > > > > > > > }
> > > > > > > > > virtio_ccw_check_activity(vcdev, activity);
> > > > > > > > > - /* Interrupts are disabled here */
> > > > > > > > > +#ifdef CONFIG_VIRTIO_HARDEN_NOTIFICATION
> > > > > > > > > + /*
> > > > > > > > > + * Paried with virtio_ccw_synchronize_cbs() and interrupts are
> > > > > > > > > + * disabled here.
> > > > > > > > > + */
> > > > > > > > > read_lock(&vcdev->irq_lock);
> > > > > > > > > +#endif
> > > > > > > > > for_each_set_bit(i, indicators(vcdev),
> > > > > > > > > sizeof(*indicators(vcdev)) * BITS_PER_BYTE) {
> > > > > > > > > /* The bit clear must happen before the vring kick. */
> > > > > > > > > @@ -1146,7 +1151,9 @@ static void virtio_ccw_int_handler(struct ccw_device *cdev,
> > > > > > > > > vq = virtio_ccw_vq_by_ind(vcdev, i);
> > > > > > > > > vring_interrupt(0, vq);
> > > > > > > > > }
> > > > > > > > > +#ifdef CONFIG_VIRTIO_HARDEN_NOTIFICATION
> > > > > > > > > read_unlock(&vcdev->irq_lock);
> > > > > > > > > +#endif
> > > > > > > > > if (test_bit(0, indicators2(vcdev))) {
> > > > > > > > > virtio_config_changed(&vcdev->vdev);
> > > > > > > > > clear_bit(0, indicators2(vcdev));
> > > > > > > > > diff --git a/drivers/virtio/Kconfig b/drivers/virtio/Kconfig
> > > > > > > > > index b5adf6abd241..c04f370a1e5c 100644
> > > > > > > > > --- a/drivers/virtio/Kconfig
> > > > > > > > > +++ b/drivers/virtio/Kconfig
> > > > > > > > > @@ -35,6 +35,19 @@ menuconfig VIRTIO_MENU
> > > > > > > > >
> > > > > > > > > if VIRTIO_MENU
> > > > > > > > >
> > > > > > > > > +config VIRTIO_HARDEN_NOTIFICATION
> > > > > > > > > + bool "Harden virtio notification"
> > > > > > > > > + help
> > > > > > > > > + Enable this to harden the device notifications and suppress
> > > > > > > > > + those that happen at a time where notifications are illegal.
> > > > > > > > > +
> > > > > > > > > + Experimental: Note that several drivers still have bugs that
> > > > > > > > > + may cause crashes or hangs when correct handling of
> > > > > > > > > + notifications is enforced; depending on the subset of
> > > > > > > > > + drivers and devices you use, this may or may not work.
> > > > > > > > > +
> > > > > > > > > + If unsure, say N.
> > > > > > > > > +
> > > > > > > > > config VIRTIO_PCI
> > > > > > > > > tristate "PCI driver for virtio devices"
> > > > > > > > > depends on PCI
> > > > > > > > > diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
> > > > > > > > > index ef04a96942bf..21dc08d2f32d 100644
> > > > > > > > > --- a/drivers/virtio/virtio.c
> > > > > > > > > +++ b/drivers/virtio/virtio.c
> > > > > > > > > @@ -220,6 +220,7 @@ static int virtio_features_ok(struct virtio_device *dev)
> > > > > > > > > * */
> > > > > > > > > void virtio_reset_device(struct virtio_device *dev)
> > > > > > > > > {
> > > > > > > > > +#ifdef CONFIG_VIRTIO_HARDEN_NOTIFICATION
> > > > > > > > > /*
> > > > > > > > > * The below virtio_synchronize_cbs() guarantees that any
> > > > > > > > > * interrupt for this line arriving after
> > > > > > > > > @@ -228,6 +229,7 @@ void virtio_reset_device(struct virtio_device *dev)
> > > > > > > > > */
> > > > > > > > > virtio_break_device(dev);
> > > > > > > > > virtio_synchronize_cbs(dev);
> > > > > > > > > +#endif
> > > > > > > > >
> > > > > > > > > dev->config->reset(dev);
> > > > > > > > > }
> > > > > > > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > > > > > > > index 13a7348cedff..d9d3b6e201fb 100644
> > > > > > > > > --- a/drivers/virtio/virtio_ring.c
> > > > > > > > > +++ b/drivers/virtio/virtio_ring.c
> > > > > > > > > @@ -1688,7 +1688,11 @@ static struct virtqueue *vring_create_virtqueue_packed(
> > > > > > > > > vq->we_own_ring = true;
> > > > > > > > > vq->notify = notify;
> > > > > > > > > vq->weak_barriers = weak_barriers;
> > > > > > > > > +#ifdef CONFIG_VIRTIO_HARDEN_NOTIFICATION
> > > > > > > > > vq->broken = true;
> > > > > > > > > +#else
> > > > > > > > > + vq->broken = false;
> > > > > > > > > +#endif
> > > > > > > > > vq->last_used_idx = 0;
> > > > > > > > > vq->event_triggered = false;
> > > > > > > > > vq->num_added = 0;
> > > > > > > > > @@ -2135,9 +2139,13 @@ irqreturn_t vring_interrupt(int irq, void *_vq)
> > > > > > > > > }
> > > > > > > > >
> > > > > > > > > if (unlikely(vq->broken)) {
> > > > > > > > > +#ifdef CONFIG_VIRTIO_HARDEN_NOTIFICATION
> > > > > > > > > dev_warn_once(&vq->vq.vdev->dev,
> > > > > > > > > "virtio vring IRQ raised before DRIVER_OK");
> > > > > > > > > return IRQ_NONE;
> > > > > > > > > +#else
> > > > > > > > > + return IRQ_HANDLED;
> > > > > > > > > +#endif
> > > > > > > > > }
> > > > > > > > >
> > > > > > > > > /* Just a hint for performance: so it's ok that this can be racy! */
> > > > > > > > > @@ -2180,7 +2188,11 @@ struct virtqueue *__vring_new_virtqueue(unsigned int index,
> > > > > > > > > vq->we_own_ring = false;
> > > > > > > > > vq->notify = notify;
> > > > > > > > > vq->weak_barriers = weak_barriers;
> > > > > > > > > +#ifdef CONFIG_VIRTIO_HARDEN_NOTIFICATION
> > > > > > > > > vq->broken = true;
> > > > > > > > > +#else
> > > > > > > > > + vq->broken = false;
> > > > > > > > > +#endif
> > > > > > > > > vq->last_used_idx = 0;
> > > > > > > > > vq->event_triggered = false;
> > > > > > > > > vq->num_added = 0;
> > > > > > > > > diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
> > > > > > > > > index 9a36051ceb76..d15c3cdda2d2 100644
> > > > > > > > > --- a/include/linux/virtio_config.h
> > > > > > > > > +++ b/include/linux/virtio_config.h
> > > > > > > > > @@ -257,6 +257,7 @@ void virtio_device_ready(struct virtio_device *dev)
> > > > > > > > >
> > > > > > > > > WARN_ON(status & VIRTIO_CONFIG_S_DRIVER_OK);
> > > > > > > > >
> > > > > > > > > +#ifdef CONFIG_VIRTIO_HARDEN_NOTIFICATION
> > > > > > > > > /*
> > > > > > > > > * The virtio_synchronize_cbs() makes sure vring_interrupt()
> > > > > > > > > * will see the driver specific setup if it sees vq->broken
> > > > > > > > > @@ -264,6 +265,7 @@ void virtio_device_ready(struct virtio_device *dev)
> > > > > > > > > */
> > > > > > > > > virtio_synchronize_cbs(dev);
> > > > > > > > > __virtio_unbreak_device(dev);
> > > > > > > > > +#endif
> > > > > > > > > /*
> > > > > > > > > * The transport should ensure the visibility of vq->broken
> > > > > > > > > * before setting DRIVER_OK. See the comments for the transport
> > > > > > > > > --
> > > > > > > > > 2.25.1
> > > > > > > >
> > > > > >
> > > >
> >
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization
next prev parent reply other threads:[~2022-06-28 5:00 UTC|newest]
Thread overview: 29+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-06-22 1:29 [PATCH V3] virtio: disable notification hardening by default Jason Wang
2022-06-22 7:02 ` Michael S. Tsirkin
2022-06-22 7:09 ` Jason Wang
2022-06-24 6:31 ` Michael S. Tsirkin
2022-06-27 2:50 ` Jason Wang
2022-06-27 7:33 ` Michael S. Tsirkin
2022-06-27 8:11 ` Jason Wang
2022-06-27 9:52 ` Michael S. Tsirkin
2022-06-28 3:49 ` Jason Wang
2022-06-28 4:59 ` Michael S. Tsirkin [this message]
2022-06-28 6:17 ` Jason Wang
2022-06-28 6:24 ` Michael S. Tsirkin
2022-06-28 6:32 ` Jason Wang
2022-06-28 6:44 ` Michael S. Tsirkin
2022-06-29 4:07 ` Jason Wang
2022-06-29 6:31 ` Michael S. Tsirkin
2022-06-29 7:02 ` Jason Wang
2022-06-29 7:15 ` Michael S. Tsirkin
2022-06-29 8:34 ` Jason Wang
2022-06-29 8:52 ` Michael S. Tsirkin
2022-06-30 2:01 ` Jason Wang
2022-06-30 8:38 ` Michael S. Tsirkin
2022-07-04 4:23 ` Jason Wang
2022-07-04 6:22 ` Michael S. Tsirkin
2022-07-04 6:40 ` Jason Wang
2022-07-04 7:00 ` Michael S. Tsirkin
2022-06-22 8:32 ` Cornelia Huck
2022-06-24 8:41 ` Xuan Zhuo
2022-06-24 9:05 ` Michael S. Tsirkin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20220628004614-mutt-send-email-mst@kernel.org \
--to=mst@redhat.com \
--cc=agordeev@linux.ibm.com \
--cc=ben@decadent.org.uk \
--cc=borntraeger@de.ibm.com \
--cc=cohuck@redhat.com \
--cc=gor@linux.ibm.com \
--cc=hca@linux.ibm.com \
--cc=jasowang@redhat.com \
--cc=kvm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-s390@vger.kernel.org \
--cc=pasic@linux.ibm.com \
--cc=virtualization@lists.linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).