From: Etienne Martineau <etmartin101@gmail.com>
To: Alex Williamson <alex.williamson@redhat.com>
Cc: Jan Kiszka <jan.kiszka@web.de>,
Marcelo Tosatti <mtosatti@redhat.com>, kvm <kvm@vger.kernel.org>
Subject: Re: pci-assign terminates the guest upon pread() / pwrite() error?
Date: Fri, 21 Sep 2012 13:07:44 -0400
Message-ID: <505C9EE0.8010107@gmail.com>
In-Reply-To: <1348242578.2320.42.camel@ul30vt.home>

On 09/21/2012 11:49 AM, Alex Williamson wrote:
> On Fri, 2012-09-21 at 11:17 -0400, Etienne Martineau wrote:
>> On 09/20/2012 05:13 PM, Alex Williamson wrote:
>>> On Thu, 2012-09-20 at 16:36 -0400, Etienne Martineau wrote:
>>>> On 09/20/2012 03:37 PM, Alex Williamson wrote:
>>>>> On Thu, 2012-09-20 at 15:08 -0400, Etienne Martineau wrote:
>>>>>> On 09/20/2012 02:16 PM, Alex Williamson wrote:
>>>>>>> On Thu, 2012-09-20 at 13:27 -0400, Etienne Martineau wrote:
>>>>>>>> In hw/kvm/pci-assign.c a pread() error part of assigned_dev_pci_read()
>>>>>>>> result in a hw_error(). Similarly a pwrite() error part of
>>>>>>>> assigned_dev_pci_write() also result in a hw_error().
>>>>>>>>
>>>>>>>> Would there be a way to avoid terminating the guest for those cases? How
>>>>>>>> about we deassign the device upon error?
>>>>>>>
>>>>>>> By terminating the guest we contain the error vs allowing the guest to
>>>>>>> continue running with invalid data. De-assigning the device is
>>>>>>> asynchronous and relies on guest involvement, so damage is potentially
>>>>>>> already done. Is this a theoretical problem or do you actually have
>>>>>>> hardware that hits this? Thanks,
>>>>>>>
>>>>>>> Alex
>>>>>>>
>>>>>>
>>>>>> This problem is in the context of a Hot-pluggable device assigned to the
>>>>>> guest. If the guest rd/wr the config space at the same time than the
>>>>>> device is physically taken out then the guest will terminate with
>>>>>> hw_error().
>>>>>>
>>>>>> Because this limits the availability of the guest I think we should try
>>>>>> to recover instead. I don't see what other damage can happen since
>>>>>> guest's MMIO access to the stale device will go nowhere?
>>>>>
>>>>> So you're looking at implementing surprise device removal? There's not
>>>>> just config space, there's slow bar access and mmap'd spaces to worry
>>>>> about too. What does going nowhere mean? If it means reads return -1
>>>>> and the guest is trying to read the data portion of a packet from the
>>>>> network or an hba, we've now passed bad data to the guest. Thanks,
>>>>>
>>>>> Alex
>>>>>
>>>>>
>>>>>
>>>>
>>>> Thanks for your answer;
>>>>
>>>> Yes we are doing 'surprise device removal' for assigned device. Note
>>>> that the problem also exist with standard 'attention button' device removal.
>>>>
>>>> The problem is all about fault isolation. Ideally, only the
>>>> corresponding driver should be affected by this 'surprise device
>>>> removal'. I think that taking down the guest is too coarse. Think about
>>>> a 'surprise device removal' on the host. In that case the host is not
>>>> taken down so why not do the same with the guest?
>>>
>>> It depends on the host hardware. Some x86 hardware will try to isolate
>>> the fault with an NMI other architectures such as ia64 would pull a
>>> machine check on a driver access to unresponsive devices.
>>>
>>>> Yes some badness will be latched into the guest but really this not any
>>>> different that having a mis-behaving device.
>>>
>>> ... which is a bad thing, but often undetectable. This is detectable.
>>> Thanks,
>>>
>>> Alex
>>>
>>
>> Our hardware is throwing a surprise link down PCIe AER and we are acting
>> on it. I agree that for the generalized case NMI can be an issue.
>>
>> Let me ask you that question. What would be the best way to support
>> device removal (surprise or not) for guest assigned device then? How
>> about signaling the guest from vfio_pci_remove()?
>
> Thanks for using vfio! :)
>
> The 440fx chipset is really not designed to deal with these kinds of
> problems. Generally the best answer to "how should we expose foo to the
> guest" is to do it exactly like it is on the host. That means sending a
> surprise link down aer to the guest. That should be possible with q35.

We are using q35 at this time for those reasons, but the original qemu
problem still exists. By the time the surprise link down (SPLD) AER
reaches the guest, the device is physically gone on the host. Any
transient guest MMIO or PCI config access to the stale assigned device
can be fatal (hw_error()).

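To make the alternative concrete, here is a minimal sketch of what a
non-fatal config read could look like. It is not the pci-assign.c code:
struct cfg_dev, the gone flag and cfg_read_or_all_ones() are names made
up for illustration, and returning all-ones assumes the guest driver
treats that value as an absent device.

```c
#define _XOPEN_SOURCE 700   /* for pread() and fileno() */
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical per-device state; not a QEMU structure. */
struct cfg_dev {
    int  fd;    /* descriptor of the host config space file */
    bool gone;  /* latched once an access has failed for good */
};

/*
 * Read 'len' bytes (1, 2 or 4) of config space at offset 'pos'.
 * On a hard failure, latch dev->gone and return all-ones instead of
 * aborting; all-ones is what a read from an absent function returns.
 */
static uint32_t cfg_read_or_all_ones(struct cfg_dev *dev, off_t pos, int len)
{
    assert(len == 1 || len == 2 || len == 4);

    uint32_t all_ones = (len == 4) ? 0xffffffffu : ((1u << (len * 8)) - 1);
    uint8_t buf[4] = { 0 };
    uint32_t val = 0;
    ssize_t ret;

    if (dev->gone) {
        return all_ones;            /* do not touch the stale device again */
    }

    do {
        ret = pread(dev->fd, buf, (size_t)len, pos);
    } while (ret < 0 && (errno == EINTR || errno == EAGAIN));

    if (ret != len) {               /* error or short read */
        fprintf(stderr, "config read at 0x%lx failed (ret=%ld)\n",
                (long)pos, (long)ret);
        dev->gone = true;
        return all_ones;
    }

    /* Config space is little-endian; assemble the value byte by byte. */
    for (int i = 0; i < len; i++) {
        val |= (uint32_t)buf[i] << (8 * i);
    }
    return val;
}
```

The same latch could short-circuit writes. This only covers config
space; slow BAR and mmap'd accesses would need their own handling.
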
> We could potentially signal that in vfio_pci_remove, but we probably
> want to figure out how to relay the aer event to the guest and inject it
> into the emulated chipset.

We tried that, but there were some problems, such as mangling the TLP to
match the guest PCI topology, and the propagation latency caused by the
chipset emulation layer during AER delivery. Right now we do a straight
lookup in the guest and fire the AER directly into the driver's
pci_error callback. We do that to minimize the guest's exposure to the
stale assigned device.

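Roughly, the direct lookup has the following shape. This is only a
sketch under assumptions: struct assigned_slot, struct drv_ops,
PCI_BDF(), fire_link_down() and the demo callback are hypothetical
names, not the Linux or qemu API, and the real code has locking and
driver binding to deal with.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result codes for the error callback. */
enum err_result { ERR_NONE, ERR_DISCONNECT };

/* Hypothetical driver ops; only the error callback matters here. */
struct drv_ops {
    enum err_result (*pci_error)(void *priv);
};

/* One entry per assigned device as seen by the guest. */
struct assigned_slot {
    uint16_t bdf;               /* guest bus/device/function */
    const struct drv_ops *ops;  /* bound driver, or NULL */
    void *priv;                 /* driver private data */
};

/* Standard 16-bit encoding: bus[15:8], device[7:3], function[2:0]. */
#define PCI_BDF(bus, dev, fn) \
    ((uint16_t)((((bus) & 0xff) << 8) | (((dev) & 0x1f) << 3) | ((fn) & 0x7)))

/*
 * Find the slot matching 'bdf' and call its driver's error callback
 * directly, bypassing the emulated chipset. Returns ERR_NONE when no
 * bound driver is found.
 */
static enum err_result fire_link_down(struct assigned_slot *tbl, size_t n,
                                      uint16_t bdf)
{
    for (size_t i = 0; i < n; i++) {
        if (tbl[i].bdf == bdf && tbl[i].ops && tbl[i].ops->pci_error) {
            return tbl[i].ops->pci_error(tbl[i].priv);
        }
    }
    return ERR_NONE;
}

/* Example callback: mark the device dead so the driver stops touching it. */
static enum err_result demo_pci_error(void *priv)
{
    *(int *)priv = 1;
    return ERR_DISCONNECT;
}

static const struct drv_ops demo_ops = { .pci_error = demo_pci_error };
```

The point of the shortcut is that the driver learns about the link down
with one table walk, rather than after the event has traversed the
emulated root port.
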
thanks,
Etienne