From: Greg Kurz <groug@kaod.org>
To: Markus Armbruster <armbru@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>,
	Cornelia Huck <cornelia.huck@de.ibm.com>,
	qemu-devel@nongnu.org,
	"Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Subject: Re: [Qemu-devel] [PATCH 1/2] virtio-9p: print error message and exit instead of BUG_ON()
Date: Fri, 9 Sep 2016 09:30:21 +0200
Message-ID: <20160909093021.421a23fc@bahia>
In-Reply-To: <87fup9wpbe.fsf@dusky.pond.sub.org>

On Fri, 09 Sep 2016 08:38:13 +0200
Markus Armbruster <armbru@redhat.com> wrote:

> Greg Kurz <groug@kaod.org> writes:
> 
> > On Thu, 8 Sep 2016 18:19:27 +0300
> > "Michael S. Tsirkin" <mst@redhat.com> wrote:
> >  
> >> On Thu, Sep 08, 2016 at 05:04:47PM +0200, Cornelia Huck wrote:  
> >> > On Thu, 8 Sep 2016 18:00:28 +0300
> >> > "Michael S. Tsirkin" <mst@redhat.com> wrote:
> >> >     
> >> > > On Thu, Sep 08, 2016 at 11:12:16AM +0200, Greg Kurz wrote:    
> >> > > > On Thu, 8 Sep 2016 10:59:26 +0200
> >> > > > Cornelia Huck <cornelia.huck@de.ibm.com> wrote:
> >> > > >     
> >> > > > > On Wed, 07 Sep 2016 19:19:24 +0200
> >> > > > > Greg Kurz <groug@kaod.org> wrote:
> >> > > > >     
> >> > > > > > Calling assert() really makes sense when hitting a genuine bug, which calls
> >> > > > > > for a fix in QEMU. However, when something goes wrong because the guest
> >> > > > > > sends a malformed message, it is better to write down a more meaningful
> >> > > > > > error message and exit.
> >> > > > > > 
> >> > > > > > Signed-off-by: Greg Kurz <groug@kaod.org>
> >> > > > > > ---
> >> > > > > >  hw/9pfs/virtio-9p-device.c |   20 ++++++++++++++++++--
> >> > > > > >  1 file changed, 18 insertions(+), 2 deletions(-)      
> >> > > > > 
> >> > > > > While this is an improvement over the current state, I don't think the
> >> > > > > guest should be able to kill qemu just by doing something stupid.
> >> > > > >     
> >> > > > 
> >> > > > Hi Connie,
> >> > > > 
> >> > > > I'm glad you're pointing this out... this was also my impression, but
> >> > > > since there are a bunch of sanity checks in the virtio code that cause
> >> > > > QEMU to exit (even recently added like 1e7aed70144b), I did not dare
> >> > > > stand up :)    
> >> > > 
> >> > > It's true that it's broken in many places but we should just
> >> > > fix them all.
> >> > > 
> >> > > 
> >> > > A separate question is how to log such hardware/guest bugs generally.
> >> > > People already complained about disk filling up because of us printing
> >> > > errors on each such bug.  Maybe print each message only N times, and
> >> > > then set a flag to skip the log until management tells us to restart
> >> > > logging again.    
> >> > 
> >> > I'd expect to get the message just once per device if we set the device
> >> > to broken (unless the guest continuously resets it again...)    
> >> 
> >> Which it can do, so we should limit that anyway.
> >>   
> >> > Do we have
> >> > a generic print/log ratelimit infrastructure in qemu?    
> >> 
> >> There are actually two kinds of errors
> >> host side ones and ones triggered by guests.
> >> 
> >> We should distinguish between them API-wise, then
> >> we will be able to limit the logging of those
> >> that guest can trigger.
> >>   
> >
> > FWIW it makes sense to use error_report() if QEMU exits.  
> 
> exit(STATUS) with STATUS != 0 without printing a message is always
> wrong.
> 

I fully agree.

> >                                                          If it continues
> > execution, this means we're expecting the guest or the host to do something
> > to fix the error condition. This requires QEMU to emit an event of some
> > sort, but not necessarily to log an error message in a file. I guess this
> > depends if QEMU is run by some tooling, or by a human.  
> 
> error_report() normally goes to stderr.  Tooling or humans can of course
> make it go to a file instead.
> 
> error_report() is indeed a sub-par way to send an "attention" signal to
> the host, because recognizing such a signal reliably is unnecessarily hard
> for management applications.  QMP events are much easier.
> 

My wording was poor but yes, that was my point. :)

> Both are useless when the signal needs to go to the guest.  Signalling
> the guest is a device model job.
> 

I also agree with that. In the case of virtio, this is explained in section
2.1.2 of the spec.

> error_report() without exit() has its uses.  Error conditions in need of
> fixing aren't the only reason to call error_report().  But when you add
> a call, ask yourself whether management application or guest would like
> to respond to it.

In the case of the present patch, we currently have BUG_ON() which generates
a cryptic and unusable message.

It turns out that the first one (elem->out_num == 0 || elem->in_num == 0) is
a legitimate assertion, since the code now [1] makes it impossible to hit (see
virtqueue_pop() and virtqueue_map_desc()).

The second one (len != sizeof out), though, matches a potential guest-originated
error. If I do as Connie suggested and set the device to broken, then the
error_report() isn't needed anymore.
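
To illustrate the second check, here is a minimal standalone sketch of that
direction. It is not QEMU code: struct toy_dev, toy_dev_mark_broken(),
toy_handle_request() and toy_dev_reset() are hypothetical names, and the
notification counter only stands in for a real configuration change
notification. The status bit values are the DRIVER_OK and DEVICE_NEEDS_RESET
bits from the virtio spec. assert() is kept for a genuine programming bug,
while a short header coming from the guest marks the device broken and is
reported once until the next reset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Device status bits as defined by the virtio spec. */
#define STATUS_DRIVER_OK    0x04
#define STATUS_NEEDS_RESET  0x40

/* A 9P message header is size[4] type[1] tag[2]. */
#define HDR_SIZE 7

/* Hypothetical toy device model, not QEMU's VirtIODevice. */
struct toy_dev {
    uint8_t status;
    bool broken;
    unsigned config_notifications; /* stand-in for config change notifications */
};

/* Put the device in an error state; report and notify only on the transition. */
static void toy_dev_mark_broken(struct toy_dev *d, const char *why)
{
    assert(d != NULL);              /* a NULL device would be our own bug */
    if (d->broken) {
        return;                     /* already reported for this device */
    }
    d->broken = true;
    d->status |= STATUS_NEEDS_RESET;
    if (d->status & STATUS_DRIVER_OK) {
        d->config_notifications++;  /* tell the driver something changed */
    }
    fprintf(stderr, "toy-9p: %s, device marked broken\n", why);
}

/*
 * 'copied' is the number of header bytes obtained from the guest buffer.
 * Returns true if the request may be processed, false if it is dropped.
 */
static bool toy_handle_request(struct toy_dev *d, size_t copied)
{
    assert(d != NULL);
    if (d->broken) {
        return false;               /* ignore requests until the guest resets */
    }
    if (copied != HDR_SIZE) {
        toy_dev_mark_broken(d, "guest buffer too short for the request header");
        return false;
    }
    return true;
}

/* A guest-initiated reset clears the error state. */
static void toy_dev_reset(struct toy_dev *d)
{
    assert(d != NULL);
    d->status = 0;
    d->broken = false;
}
```

With a device whose status has DRIVER_OK set, a call with copied == 3 marks it
broken, prints one message and bumps the counter once; later calls are dropped
silently until toy_dev_reset() is called. A guest that resets in a loop would
still get one message per reset, so some limit on top is still needed.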

Cheers.

--
Greg

[1] sending an empty buffer was sufficient before commit 1e7aed70144b4 as said
    in my previous answer
