From: Chen Fan <fan.chen@easystack.cn>
To: Alex Williamson <alex.williamson@redhat.com>,
Zhou Jie <zhoujie2011@cn.fujitsu.com>
Cc: mst@redhat.com, qemu-devel@nongnu.org, caoj.fnst@cn.fujitsu.com,
Chen Fan <chen.fan.fnst@cn.fujitsu.com>,
izumi.taku@jp.fujitsu.com
Subject: Re: [Qemu-devel] [PATCH v8 11/12] vfio: register aer resume notification handler for aer resume
Date: Tue, 21 Jun 2016 20:41:32 +0800
Message-ID: <576935FC.1080503@easystack.cn>
In-Reply-To: <20160620211306.66a6b249@t450s.home>
On 2016-06-21 11:13, Alex Williamson wrote:
> On Tue, 21 Jun 2016 10:16:25 +0800
> Zhou Jie <zhoujie2011@cn.fujitsu.com> wrote:
>
>> Hi, Alex
>>
>>> I was really hoping to hear your opinion, or at least some further
>>> discussion of pros and cons rather than simply parroting back my idea.
>> I understand.
>>
>>> My current thinking is that a resume notifier to userspace is poorly
>>> defined, it's not clear what the user can and cannot do between an
>>> error notification and the resume notification.
>> Yes, doing nothing during that time is better.
>>
>>> One approach to solve
>>> that might be that the kernel internally handles the resume
>>> notifications. Maybe that means blocking the ioctl (interruptible
>>> timeout) until the internal resume occurs, or maybe that means
>>> returning -EAGAIN.
>> I don't think it is a good idea.
>> The kernel gives the error and resume notifications; that is enough.
>> It is up to the user how to use them.
> Well that's exactly why it's poorly defined. What does a resume
> notification signal a user that they're allowed to do? What can they
> not do between error and resume notification? Clearly you had issues
> attempting to perform a reset during this time period since it was
> racing with the kernel reset, so is a user allowed to do a hot reset
> between error and resume? Where do we define it? Do we prevent it if
> they try? Why? What about the reset ioctl? How and why is that
> different from a hot reset? (hint, they can be the same) Do we define
> that resets are not allowed between error and resume, but other
> operations like read/write or interrupt setup ioctls are allowed? Why?
> Clearly we can't do anything that manipulates the device between error
> and resume since it might be lost or ineffective, but where do we
> define it and do we need to actively enforce those rules? I'm arguing
> that it's poorly defined, so "it's up to the user how to use them"
> does not give me any additional confidence in that approach. We
> can't trust the user to be polite, we can't even trust the user not to
> be malicious.
Hi Alex,

On the kernel side, I think that if we can't trust the user's behavior,
we should disable access to the vfio-pci interface as soon as the
vfio-pci driver gets error_detected. We should block all access to the
vfio fd regardless of whether the vfio-pci device is assigned to a VM.
We can also return an EAGAIN error if the user tries to access it
during the reset period, until the host reset has finished.
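
A minimal sketch of the kernel-side gate described above, written as a
plain userspace C model rather than actual vfio-pci code. The struct
and every model_* name are hypothetical and exist only for
illustration; the only point is that each user-facing entry point
checks one flag and returns -EAGAIN while the host recovery is in
progress.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-device state; not the real vfio-pci structure. */
struct model_dev {
    bool in_recovery; /* set at error_detected, cleared after host reset */
    int  reg;         /* stand-in for some piece of device state */
};

/* Called when the host reports an error on the device. */
static void model_error_detected(struct model_dev *d)
{
    assert(d != NULL);
    d->in_recovery = true;
}

/* Called once the host-side reset has completed. */
static void model_reset_done(struct model_dev *d)
{
    assert(d != NULL);
    d->reg = 0;
    d->in_recovery = false;
}

/* Stand-in for a write through the vfio fd. */
static int model_write(struct model_dev *d, int val)
{
    assert(d != NULL);
    if (d->in_recovery)
        return -EAGAIN; /* refuse access during the reset period */
    d->reg = val;
    return 0;
}

/* Stand-in for a user-requested hot reset. */
static int model_hot_reset(struct model_dev *d)
{
    assert(d != NULL);
    if (d->in_recovery)
        return -EAGAIN; /* do not race with the host reset */
    d->reg = 0;
    return 0;
}
```

With this shape the rule does not depend on the user being polite: a
write or reset attempted between model_error_detected() and
model_reset_done() fails with -EAGAIN and leaves the state untouched.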

On the QEMU side, when we get an error_detected notification, we pass
the AER error through to the guest directly and ignore all access to
vfio-pci during this time. When QEMU needs to do a hot reset, it can
retry the get info ioctl until the result shows that the vfio-pci host
reset has finished, and then issue the hot_reset ioctl if needed. The
kernel should ensure the ioctl becomes accessible once the host reset
has completed.

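
The retry flow on the QEMU side could look roughly like the sketch
below. It is a self-contained C model, not QEMU code: fake_host,
fake_get_info() and fake_hot_reset() are hypothetical stand-ins for the
host and for the two ioctls, and the bounded poll count is an added
assumption so the loop cannot spin forever.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical host model: polls remaining until the host reset finishes. */
struct fake_host {
    int polls_until_ready;
};

/* Stand-in for the "get info" query: -EAGAIN while the host is resetting. */
static int fake_get_info(struct fake_host *h)
{
    assert(h != NULL);
    if (h->polls_until_ready > 0) {
        h->polls_until_ready--;
        return -EAGAIN;
    }
    return 0;
}

/* Stand-in for the hot reset request; only valid after the host reset. */
static int fake_hot_reset(struct fake_host *h)
{
    assert(h != NULL);
    return h->polls_until_ready > 0 ? -EAGAIN : 0;
}

/*
 * Poll until the host reports that its reset has finished, then issue
 * the hot reset. Returns 0 on success, -ETIMEDOUT if the host is still
 * resetting after max_polls attempts, or another negative error code.
 */
static int wait_then_hot_reset(struct fake_host *h, int max_polls)
{
    for (int i = 0; i < max_polls; i++) {
        int ret = fake_get_info(h);

        if (ret == 0)
            return fake_hot_reset(h);
        if (ret != -EAGAIN)
            return ret;
        /* A real caller would sleep or wait for an event here. */
    }
    return -ETIMEDOUT;
}
```

On -ETIMEDOUT the caller would have to give up on the guest-directed
reset; what to do with the device at that point is a policy decision
outside this sketch.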
Thanks,
Chen
>
>>> Probably implementations of each need to be worked
>>> through to determine which is better. We don't want to add complexity
>>> to the kernel simply to make things easier for userspace, but we also
>>> don't want a poorly specified interface that is difficult for
>>> userspace to use correctly. Thanks,
>> In QEMU, the AER recovery process would be:
>> 1. Detect support for resume notification.
>>    If the host vfio driver does not support resume notification,
>>    fail to boot a VM that has aer enabled.
>> 2. Immediately notify the VM when an error is detected.
>> 3. Disable the device.
>>    Unmap the config space and BAR regions.
>> 4. Delay the guest-directed bus reset.
>> 5. Wait for the resume notification.
>>    If we don't get the resume notification from the host after
>>    some timeout, abort the guest-directed bus reset altogether
>>    and unplug the device to prevent it from further
>>    interacting with the VM.
>> 6. After getting the resume notification, reset the bus and
>>    enable the device.
>>
>> I think we only need to make sure the disabled device
>> does not interact with the VM.
> Should interrupt irqfds then also be disabled so they trap into QEMU
> and we can prevent that interaction? Also, QEMU can be polite, but as
> above, QEMU is just one user, the API is open to anyone and QEMU might
> be exploited to not be so polite. So if there are points where the
> user can interfere with the kernel or exploit the knowledge that the
> device is going through a reset, the kernel can't rely on a friendly
> user. Thanks,
>
> Alex
>
--
Sincerely,
Chen Fan