Re: [Qemu-devel] VFIO based vGPU(was Re: [Announcement] 2015-Q3 release of XenGT - a Mediated ...)

From: Jike Song <jike.song@intel.com>
To: Alex Williamson <alex.williamson@redhat.com>
Cc: Yang Zhang <yang.zhang.wz@gmail.com>,
	"Ruan, Shuai" <shuai.ruan@intel.com>,
	"Tian, Kevin" <kevin.tian@intel.com>, Neo Jia <cjia@nvidia.com>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"igvt-g@lists.01.org" <igvt-g@ml01.01.org>,
	qemu-devel <qemu-devel@nongnu.org>,
	Gerd Hoffmann <kraxel@redhat.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	"Lv, Zhiyuan" <zhiyuan.lv@intel.com>
Subject: Re: [Qemu-devel] VFIO based vGPU(was Re: [Announcement] 2015-Q3 release of XenGT - a Mediated ...)
Date: Wed, 27 Jan 2016 09:47:25 +0800
Message-ID: <56A821AD.5090606@intel.com>
In-Reply-To: <1453848975.18049.7.camel@redhat.com>

On 01/27/2016 06:56 AM, Alex Williamson wrote:
> On Tue, 2016-01-26 at 22:39 +0000, Tian, Kevin wrote:
>>> From: Alex Williamson [mailto:alex.williamson@redhat.com]
>>> Sent: Wednesday, January 27, 2016 6:27 AM
>>>  
>>> On Tue, 2016-01-26 at 22:15 +0000, Tian, Kevin wrote:
>>>>> From: Alex Williamson [mailto:alex.williamson@redhat.com]
>>>>> Sent: Wednesday, January 27, 2016 6:08 AM
>>>>>  
>>>>>>>>>  
>>>>>>>>  
>>>>>>>> Today KVMGT (not using VFIO yet) registers I/O emulation callbacks to
>>>>>>>> KVM, so VM MMIO access will be forwarded to KVMGT directly for
>>>>>>>> emulation in kernel. If we reuse above R/W flags, the whole emulation
>>>>>>>> path would be unnecessarily long with obvious performance impact. We
>>>>>>>> either need a new flag here to indicate in-kernel emulation (bias from
>>>>>>>> passthrough support), or alternatively just hide the region (let KVMGT
>>>>>>>> handle I/O emulation itself, as it does today).
>>>>>>>  
>>>>>>> That sounds like a future optimization TBH.  There's very strict
>>>>>>> layering between vfio and kvm.  Physical device assignment could make
>>>>>>> use of it as well, avoiding a round trip through userspace when an
>>>>>>> ioread/write would do.  Userspace also needs to orchestrate those kinds
>>>>>>> of accelerators, there might be cases where userspace wants to see those
>>>>>>> transactions for debugging or manipulating the device.  We can't simply
>>>>>>> take shortcuts to provide such direct access.  Thanks,
>>>>>>>  
>>>>>>  
>>>>>> But we have to balance such debugging flexibility and acceptable performance.
>>>>>> To me the latter one is more important otherwise there'd be no real usage
>>>>>> around this technique, while for debugging there are other alternatives (e.g.
>>>>>> ftrace). Consider some extreme case with 100k traps/second and then see
>>>>>> how much impact a 2-3x longer emulation path can bring...
>>>>>  
>>>>> Are you jumping to the conclusion that it cannot be done with proper
>>>>> layering in place?  Performance is important, but it's not an excuse to
>>>>> abandon designing interfaces between independent components.  Thanks,
>>>>>  
>>>>  
>>>> The two are not in conflict. My point is to avoid an unnecessarily long
>>>> trip wherever possible. On second thought, yes, we can reuse the existing
>>>> read/write flags:
>>>>  	- KVMGT will expose a private control variable whether in-kernel
>>>> delivery is required;
>>>  
>>> But in-kernel delivery is never *required*.  Wouldn't userspace want to
>>> deliver in-kernel any time it possibly could?
>>>  
>>>>  	- when the variable is true, KVMGT will register in-kernel MMIO
>>>> emulation callbacks then VM MMIO request will be delivered to KVMGT
>>>> directly;
>>>>  	- when the variable is false, KVMGT will not register anything.
>>>> VM MMIO request will then be delivered to Qemu and then ioread/write
>>>> will be used to finally reach KVMGT emulation logic;
>>>  
>>> No, that means the interface is entirely dependent on a backdoor through
>>> KVM.  Why can't userspace (QEMU) do something like register an MMIO
>>> region with KVM handled via a provided file descriptor and offset,
>>> couldn't KVM then call the file ops without a kernel exit?  Thanks,
>>>  
>>  
>> Could you elaborate on this thought? If it can achieve the purpose w/o
>> a kernel exit, we can definitely adapt to it. :-)
> 
> I only thought of it when replying to the last email and have been doing
> some research, but we already do quite a bit of synchronization through
> file descriptors.  The kvm-vfio pseudo device uses a group file
> descriptor to ensure a user has access to a group, allowing some degree
> of interaction between modules.  Eventfds and irqfds already make use of
> f_ops on file descriptors to poke data.  So, if KVM had information that
> an MMIO region was backed by a file descriptor for which it already has
> a reference via fdget() (and verified access rights and whatnot), then
> it ought to be a simple matter to get to f_ops->read/write knowing the
> base offset of that MMIO region.  Perhaps it could even simply use
> __vfs_read/write().  Then we've got a proper reference to the file
> descriptor for ownership purposes and we've transparently jumped across
> modules without any implicit knowledge of the other end.  Could it work?

This is OK for KVMGT; getting from the fops to the vGPU device model would
always be simple. The only question is: how is the KVM hypervisor supposed to
get the fd on a VM exit?

Here is a copy of the current implementation of vcpu_mmio_write(). It seems
that nothing but the GPA and len (plus the data) are provided:

	static int vcpu_mmio_write(struct kvm_vcpu *vcpu, gpa_t addr, int len,
				   const void *v)
	{
		int handled = 0;
		int n;

		do {
			n = min(len, 8);
			if (!(vcpu->arch.apic &&
			      !kvm_iodevice_write(vcpu, &vcpu->arch.apic->dev, addr, n, v))
			    && kvm_io_bus_write(vcpu, KVM_MMIO_BUS, addr, n, v))
				break;
			handled += n;
			addr += n;
			len -= n;
			v += n;
		} while (len);

		return handled;
	}

If we back a GPA range with an fd, won't this also be a 'backdoor'?
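
To make the question concrete, here is a small userspace model of what
"backing a GPA range with an fd" might amount to. It is only a sketch, not KVM
code: struct fd_mmio_region, fd_mmio_find(), fd_mmio_write() and
fd_mmio_read() are hypothetical names invented for illustration, and
pread()/pwrite() stand in for the f_ops->read/write calls discussed above. The
assumption is that the fd and offset are recorded once, when the range is
registered, so that the exit path only needs the GPA and len it already has:

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical record: one guest-physical MMIO range backed by a file. */
struct fd_mmio_region {
	uint64_t gpa_base;	/* first guest-physical address of the range */
	uint64_t len;		/* size of the range in bytes */
	int fd;			/* backing file descriptor, taken at registration */
	off_t fd_offset;	/* file offset that corresponds to gpa_base */
};

/* Find the region that fully contains [gpa, gpa + len), or NULL if none. */
const struct fd_mmio_region *
fd_mmio_find(const struct fd_mmio_region *tbl, size_t n,
	     uint64_t gpa, uint64_t len)
{
	for (size_t i = 0; i < n; i++) {
		const struct fd_mmio_region *r = &tbl[i];

		/* Written this way to avoid overflow in gpa + len. */
		if (gpa >= r->gpa_base && len <= r->len &&
		    gpa - r->gpa_base <= r->len - len)
			return r;
	}
	return NULL;
}

/* Forward a guest write to the backing file; -1 if no region matches. */
ssize_t fd_mmio_write(const struct fd_mmio_region *tbl, size_t n,
		      uint64_t gpa, const void *v, size_t len)
{
	const struct fd_mmio_region *r = fd_mmio_find(tbl, n, gpa, len);

	if (!r)
		return -1;
	return pwrite(r->fd, v, len, r->fd_offset + (off_t)(gpa - r->gpa_base));
}

/* Forward a guest read to the backing file; -1 if no region matches. */
ssize_t fd_mmio_read(const struct fd_mmio_region *tbl, size_t n,
		     uint64_t gpa, void *v, size_t len)
{
	const struct fd_mmio_region *r = fd_mmio_find(tbl, n, gpa, len);

	if (!r)
		return -1;
	return pread(r->fd, v, len, r->fd_offset + (off_t)(gpa - r->gpa_base));
}
```

In this model the caller passes only the GPA, data and length, just like
vcpu_mmio_write(); the fd comes from the table entry. Whether an equivalent
table can live behind kvm_io_bus_write() without becoming a backdoor is
exactly the open question.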


> Thanks,
> 
> Alex
> 

--
Thanks,
Jike
