From: Dor Laor
Subject: Re: [RFC] virtio-blk PCI backend
Date: Fri, 09 Nov 2007 02:13:41 +0200
Message-ID: <4733A635.1080004@qumranet.com>
In-Reply-To: <473337B9.8040503-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
To: Anthony Liguori
Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org, Avi Kivity
List-Id: kvm.vger.kernel.org

Anthony Liguori wrote:
> Avi Kivity wrote:
>>> There's no reason that the PIO operations couldn't be handled in the
>>> kernel.  You'll already need some level of cooperation in userspace
>>> unless you plan on implementing the PCI bus in kernel space too.
>>> It's easy enough in the pci_map function in QEMU to just notify the
>>> kernel that it should listen on a particular PIO range.
>>
>> This is a config space write, right?  If so, the range is the regular
>> 0xcf8-0xcff and it has to be very specially handled.
> This is a per-device IO slot, and as best as I can tell, the PCI device
> advertises the size of the region and the OS then identifies a range of
> PIO space to use and tells the PCI device about it.  So we would just
> need to implement a generic userspace virtio PCI device in QEMU that did
> an ioctl to the kernel when this happened, to tell the kernel what region
> to listen on for a particular device.
>
>>> vmcalls will certainly get faster, but I doubt that the cost
>>> difference between vmcall and pio will ever be greater than a few
>>> hundred cycles.  The only performance-sensitive operation here would
>>> be the kick, and I don't think a few hundred cycles in the kick path
>>> is ever going to be that significant for overall performance.
>>
>> Why do you think the difference will be a few hundred cycles?
>
> The only difference in hardware between a PIO exit and a vmcall is that
> you don't have to write out an exit reason in the VMC[SB].  So the
> performance difference between PIO and vmcall shouldn't be that great
> (and if it were, the difference would probably be obvious today).  That's
> different from, say, a PF exit, because with a PF you also have to
> attempt to resolve it by walking the guest page table before determining
> that you do in fact need to exit.
>
>> And if you have a large number of devices, searching the list
>> becomes expensive too.
>
> The PIO address space is relatively small.  You could do a radix tree or
> even a direct array lookup if you are concerned about performance.
>
>>> So why introduce the extra complexity?
>>
>> Overall I think it reduces complexity if we have in-kernel devices.
>> Anyway, we can add additional signalling methods later.
>
> In-kernel virtio backends add quite a lot of complexity.  Just the
> mechanism to set up the device is complicated enough.  I suspect that
> it'll be necessary down the road for performance, but I certainly don't
> think it's a simplification.
I believe that the network interface will quickly move into the kernel,
since copying takes most of the cpu time and qemu does not support
scatter-gather dma at the moment.  Nevertheless, using pio seems good
enough for now, and Anthony's suggestion of notifying the kernel using
ioctls is logical.  If we run into trouble further on, we can add a
hypercall capability and, if it exists, use hypercalls instead of pios.

> Regards,
>
> Anthony Liguori
>
> -------------------------------------------------------------------------
> This SF.net email is sponsored by: Splunk Inc.
> Still grepping through log files to find problems?  Stop.
> Now Search log events and configuration files using AJAX and a browser.
> Download your FREE copy of Splunk now >> http://get.splunk.com/
> _______________________________________________
> kvm-devel mailing list
> kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org
> https://lists.sourceforge.net/lists/listinfo/kvm-devel