From: Halil Pasic <pasic@linux.vnet.ibm.com>
To: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: qemu-devel@nongnu.org, "Michael S. Tsirkin" <mst@redhat.com>
Subject: Re: [Qemu-devel] [PATCH 1/1] virtio: fail device if set_event_notifier fails
Date: Fri, 3 Mar 2017 14:08:37 +0100 [thread overview]
Message-ID: <d00802cc-150a-2263-604d-9c2eaa5c19d1@linux.vnet.ibm.com> (raw)
In-Reply-To: <20170303135012.17edb640.cornelia.huck@de.ibm.com>
On 03/03/2017 01:50 PM, Cornelia Huck wrote:
> On Fri, 3 Mar 2017 13:43:32 +0100
> Halil Pasic <pasic@linux.vnet.ibm.com> wrote:
>
>> On 03/03/2017 01:21 PM, Cornelia Huck wrote:
>>> On Thu, 2 Mar 2017 19:59:42 +0100
>>> Halil Pasic <pasic@linux.vnet.ibm.com> wrote:
>>>
>>>> The function virtio_notify_irqfd used to ignore the return code of
>>>> event_notifier_set. Let's fail the device should this occur.
>>>
>>> I'm wondering if there are reasons for event_notifier_set() to fail
>>> beyond "we've hit an internal race and should make an effort to fix
>>> that one, or else we have completely messed up in qemu". Marking the
>>> device broken tells the guest that there's something wrong with the
>>> device, but I think we want qemu bug reports when there's something
>>> broken with the irqfd.
>>>
>>
>> That's why the error is logged. I understand virtio_error like something
>> suitable for indicating bugs.
>>
>> What do you suggest? Forcing a dump? I would rather leave it to the
>> user to figure out how important is the state sitting in the machine
>> and the device, and how much effort does (s)he want to put into recovering
>> from the failure.
>
> How likely are those logged messages being brought to attention of the
> admin? Does any management software flag machines with such error
> messages? (that's more of a general question)
>
I admit, I did not investigate this thoroughly, also because the patch
is flawed regarding multi-thread anyway. After a quick investigation
it seems the linux guest won't auto-reset the device so the guest should
end up with a not working device. I think it's pretty likely that the
admin will check the logs if the device was important.
I agree fully that it's a very general question, and I do not feel
competent for answering it.
> I'd like to have some kind of trigger that rings an alarm bell so that
> the admin might consider reporting this, but I don't have a good idea
> on how to do that either...
>
There are tools for aggregating and processing logs, and triggering
alarm bells too (for example ELK= logstash + Kibana + Elasticsearch).
AFAIK logs are the most common way to deal with such stuff. But I'm far
form being an expert. Of course logs are only as good as the messages
landing in them...
Halil
next prev parent reply other threads:[~2017-03-03 13:09 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-03-02 18:59 [Qemu-devel] [PATCH 1/1] virtio: fail device if set_event_notifier fails Halil Pasic
2017-03-03 12:21 ` Cornelia Huck
2017-03-03 12:43 ` Halil Pasic
2017-03-03 12:50 ` Cornelia Huck
2017-03-03 13:08 ` Halil Pasic [this message]
2017-03-06 14:56 ` Cornelia Huck
2017-03-06 15:21 ` Halil Pasic
2017-03-06 16:04 ` Cornelia Huck
2017-03-23 17:09 ` Michael S. Tsirkin
2017-03-29 11:12 ` Halil Pasic
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=d00802cc-150a-2263-604d-9c2eaa5c19d1@linux.vnet.ibm.com \
--to=pasic@linux.vnet.ibm.com \
--cc=cornelia.huck@de.ibm.com \
--cc=mst@redhat.com \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).