From: Greg Kurz <groug@kaod.org>
To: Markus Armbruster <armbru@redhat.com>
Cc: Cornelia Huck <cornelia.huck@de.ibm.com>,
"Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>,
qemu-devel@nongnu.org, "Michael S. Tsirkin" <mst@redhat.com>
Subject: Re: [Qemu-devel] [PATCH 1/2] virtio-9p: print error message and exit instead of BUG_ON()
Date: Fri, 9 Sep 2016 11:54:30 +0200
Message-ID: <20160909115430.73567e70@bahia>
In-Reply-To: <87vay5samv.fsf@dusky.pond.sub.org>
On Fri, 09 Sep 2016 11:08:56 +0200
Markus Armbruster <armbru@redhat.com> wrote:
> Greg Kurz <groug@kaod.org> writes:
>
> > On Fri, 09 Sep 2016 08:38:13 +0200
> > Markus Armbruster <armbru@redhat.com> wrote:
> >
> >> Greg Kurz <groug@kaod.org> writes:
> >>
> >> > On Thu, 8 Sep 2016 18:19:27 +0300
> >> > "Michael S. Tsirkin" <mst@redhat.com> wrote:
> >> >
> >> >> On Thu, Sep 08, 2016 at 05:04:47PM +0200, Cornelia Huck wrote:
> >> >> > On Thu, 8 Sep 2016 18:00:28 +0300
> >> >> > "Michael S. Tsirkin" <mst@redhat.com> wrote:
> >> >> >
> >> >> > > On Thu, Sep 08, 2016 at 11:12:16AM +0200, Greg Kurz wrote:
> >> >> > > > On Thu, 8 Sep 2016 10:59:26 +0200
> >> >> > > > Cornelia Huck <cornelia.huck@de.ibm.com> wrote:
> >> >> > > >
> >> >> > > > > On Wed, 07 Sep 2016 19:19:24 +0200
> >> >> > > > > Greg Kurz <groug@kaod.org> wrote:
> >> >> > > > >
> >> >> > > > > > Calling assert() really makes sense when hitting a genuine bug, which calls
> >> >> > > > > > for a fix in QEMU. However, when something goes wrong because the guest
> >> >> > > > > > sends a malformed message, it is better to write down a more meaningful
> >> >> > > > > > error message and exit.
> >> >> > > > > >
> >> >> > > > > > Signed-off-by: Greg Kurz <groug@kaod.org>
> >> >> > > > > > ---
> >> >> > > > > > hw/9pfs/virtio-9p-device.c | 20 ++++++++++++++++++--
> >> >> > > > > > 1 file changed, 18 insertions(+), 2 deletions(-)
> >> >> > > > >
> >> >> > > > > While this is an improvement over the current state, I don't think the
> >> >> > > > > guest should be able to kill qemu just by doing something stupid.
> >> >> > > > >
> >> >> > > >
> >> >> > > > Hi Connie,
> >> >> > > >
> >> >> > > > I'm glad you're pointing this out... this was also my impression, but
> >> >> > > > since there are a bunch of sanity checks in the virtio code that cause
> >> >> > > > QEMU to exit (even recently added like 1e7aed70144b), I did not dare
> >> >> > > > stand up :)
> >> >> > >
> >> >> > > It's true that it's broken in many places but we should just
> >> >> > > fix them all.
> >> >> > >
> >> >> > >
> >> >> > > A separate question is how to log such hardware/guest bugs generally.
> >> >> > > People already complained about disk filling up because of us printing
> >> >> > > errors on each such bug. Maybe print each message only N times, and
> >> >> > > then set a flag to skip the log until management tells us to restart
> >> >> > > logging again.
> >> >> >
> >> >> > I'd expect to get the message just once per device if we set the device
> >> >> > to broken (unless the guest continuously resets it again...)
> >> >>
> >> >> Which it can do, so we should limit that anyway.
> >> >>
> >> >> > Do we have
> >> >> > a generic print/log ratelimit infrastructure in qemu?
> >> >>
> >> >> There are actually two kinds of errors
> >> >> host side ones and ones triggered by guests.
> >> >>
> >> >> We should distinguish between them API-wise, then
> >> >> we will be able to limit the logging of those
> >> >> that guest can trigger.
> >> >>
> >> >
> >> > FWIW it makes sense to use error_report() if QEMU exits.
> >>
> >> exit(STATUS) with STATUS != 0 without printing a message is always
> >> wrong.
> >>
> >
> > I fully agree.
> >
> >> > If it continues
> >> > execution, this means we're expecting the guest or the host to do something
> >> > to fix the error condition. This requires QEMU to emit an event of some
> >> > sort, but not necessarily to log an error message in a file. I guess this
> >> > depends if QEMU is run by some tooling, or by a human.
> >>
> >> error_report() normally goes to stderr. Tooling or humans can of course
> >> make it go to a file instead.
> >>
> >> error_report() is indeed a sub-par way to send an "attention" signal to
> >> the host, because recognizing such a signal reliably is unnecessarily hard
> >> for management applications. QMP events are much easier.
> >>
> >
> > My wording was poor but yes, that was my point. :)
> >
> >> Both are useless when the signal needs to go to the guest. Signalling
> >> the guest is a device model job.
> >>
> >
> > I also agree with that. In the case of virtio, this is explained in section
> > 2.1.2 of the spec.
> >
> >> error_report() without exit() has its uses. Error conditions in need of
> >> fixing aren't the only reason to call error_report(). But when you add
> >> a call, ask yourself whether management application or guest would like
> >> to respond to it.
> >
> > In the case of the present patch, we currently have BUG_ON() which generates
> > a cryptic and unusable message.
> >
> > It turns out that the first one (elem->out_num == 0 || elem->in_num == 0) is
> > correct since it is now [1] impossible to hit this according to the code (see
> > virtqueue_pop() and virtqueue_map_desc()).
> >
> > The second one (len != sizeof out) though matches a potential guest originated
> > error. If I do as suggested by Connie, then the error_report() isn't needed
> > anymore.
>
> I can't dive into the details of your analysis right now, only make high-level
> recommendations:
>
> * Issues common to all virtio devices should be addressed in the virtio
> core. If that's not feasible, they should be addressed in all devices
> consistently.
>
Agreed.
> * Guest misbehavior should put the device in a guest-observable error
> state. It should not crash QEMU, it should not spam stderr. Code
> handling it in other ways should be marked FIXME.
>
Agreed. FWIW a bunch of FIXMEs are missing in the virtio code then :)
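To make the recommendation above concrete, here is a minimal standalone sketch of the behaviour it describes: flag the device instead of exiting, and cap the messages a guest can trigger. It is not QEMU code. SketchDevice and all sketch_* helpers are hypothetical names invented for illustration; the only value taken from the virtio 1.0 spec is 0x40 for the DEVICE_NEEDS_RESET status bit. The size check is loosely modelled on the "len != sizeof out" case discussed earlier.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* DEVICE_NEEDS_RESET status bit value from the virtio 1.0 spec. */
#define SKETCH_STATUS_NEEDS_RESET 0x40

/* Hypothetical, stripped-down device state; not QEMU's VirtIODevice. */
typedef struct SketchDevice {
    unsigned char status;    /* guest-visible device status field */
    bool broken;             /* refuse requests until the guest resets */
    unsigned int log_budget; /* guest-triggered messages still allowed */
} SketchDevice;

/*
 * Report an error caused by the guest: put the device in an error state
 * the guest can observe, and print only while the budget lasts.
 * Returns true if a message was actually printed.
 */
static bool sketch_guest_error(SketchDevice *dev, const char *fmt, ...)
{
    bool printed = false;

    assert(dev != NULL); /* a NULL device would be a genuine host-side bug */

    dev->broken = true;
    dev->status |= SKETCH_STATUS_NEEDS_RESET;

    if (dev->log_budget > 0) {
        va_list ap;

        dev->log_budget--;
        va_start(ap, fmt);
        vfprintf(stderr, fmt, ap);
        va_end(ap);
        fputc('\n', stderr);
        printed = true;
    }
    return printed;
}

/* A guest reset clears the error state but deliberately not the budget. */
static void sketch_reset(SketchDevice *dev)
{
    assert(dev != NULL);
    dev->broken = false;
    dev->status = 0;
}

/* Management re-arms logging explicitly. */
static void sketch_rearm_logging(SketchDevice *dev, unsigned int budget)
{
    assert(dev != NULL);
    dev->log_budget = budget;
}

/* Returns true if the request may be processed. */
static bool sketch_handle_request(SketchDevice *dev, size_t len,
                                  size_t expected)
{
    if (dev->broken) {
        return false; /* ignore the queue until the guest resets */
    }
    if (len != expected) {
        sketch_guest_error(dev, "sketch: bad header size %zu, expected %zu",
                           len, expected);
        return false;
    }
    return true;
}
```

Because a reset does not refill the budget, a guest that keeps resetting the device cannot make the host log grow without bound; only sketch_rearm_logging(), standing in for whatever management interface would be used, turns messages back on.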
> * Nobody expects you to get things perfectly right in one step. Just
> try to move towards the goal.
>
Sure! I'm now reading through Stefan's series to address the issue:

https://lists.nongnu.org/archive/html/qemu-devel/2016-04/msg01978.html

Cheers.

--
Greg
> >
> > Cheers.
> >
> > --
> > Greg
> >
> > [1] sending an empty buffer was sufficient before commit 1e7aed70144b4 as said
> > in my previous answer