From: Dor Laor
Subject: Re: [RFC] virtio-blk PCI backend
Date: Fri, 09 Nov 2007 02:13:41 +0200
Message-ID: <4733A635.1080004@qumranet.com>
In-Reply-To: <473337B9.8040503-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
To: Anthony Liguori
Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org, Avi Kivity
List-Id: kvm.vger.kernel.org

Anthony Liguori wrote:
> Avi Kivity wrote:
>>> There's no reason that the PIO operations couldn't be handled in the
>>> kernel.  You'll already need some level of cooperation in userspace
>>> unless you plan on implementing the PCI bus in kernel space too.
>>> It's easy enough in the pci_map function in QEMU to just notify the
>>> kernel that it should listen on a particular PIO range.
>>
>> This is a config space write, right?  If so, the range is the regular
>> 0xcf8-0xcff and it has to be very specially handled.
> This is a per-device IO slot, and as best as I can tell, the PCI device
> advertises the size of the region and the OS then identifies a range of
> PIO space to use and tells the PCI device about it.  So we would just
> need to implement a generic userspace virtio PCI device in QEMU that did
> an ioctl to the kernel when this happened, to tell the kernel what region
> to listen on for a particular device.
>
>>> vmcalls will certainly get faster, but I doubt that the cost
>>> difference between vmcall and pio will ever be greater than a few
>>> hundred cycles.  The only performance-sensitive operation here would
>>> be the kick, and I don't think a few hundred cycles in the kick path
>>> is ever going to be that significant for overall performance.
>>
>> Why do you think the difference will be a few hundred cycles?
>
> The only difference in hardware between a PIO exit and a vmcall is that
> you don't have to write out an exit reason in the VMC[SB].  So the
> performance difference between PIO and vmcall shouldn't be that great
> (and if it were, the difference would probably be obvious today).  That's
> different from, say, a PF exit, because with a PF you also have to
> attempt to resolve it by walking the guest page table before determining
> that you do in fact need to exit.
>
>> And if you have a large number of devices, searching the list
>> becomes expensive too.
>
> The PIO address space is relatively small.  You could do a radix tree or
> even a direct array lookup if you are concerned about performance.
>
>>> So why introduce the extra complexity?
>>
>> Overall I think it reduces complexity if we have in-kernel devices.
>> Anyway, we can add additional signalling methods later.
>
> In-kernel virtio backends add quite a lot of complexity.  Just the
> mechanism to set up the device is complicated enough.  I suspect that
> it'll be necessary down the road for performance, but I certainly don't
> think it's a simplification.
I believe that the network interface will quickly move into the kernel,
since copying takes most of the cpu time and qemu does not support
scatter-gather dma at the moment.  Nevertheless, using pio seems good
enough for now, and Anthony's suggestion of notifying the kernel using
ioctls is logical.  If we run into trouble further on, we can add a
hypercall capability and, if it exists, use hypercalls instead of pios.

> Regards,
>
> Anthony Liguori
>
> -------------------------------------------------------------------------
> This SF.net email is sponsored by: Splunk Inc.
> Still grepping through log files to find problems?  Stop.
> Now Search log events and configuration files using AJAX and a browser.
> Download your FREE copy of Splunk now >> http://get.splunk.com/
> _______________________________________________
> kvm-devel mailing list
> kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org
> https://lists.sourceforge.net/lists/listinfo/kvm-devel