From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43)
	id 1NmWbg-0001jH-1r
	for qemu-devel@nongnu.org; Tue, 02 Mar 2010 13:14:08 -0500
Received: from [199.232.76.173] (port=53243 helo=monty-python.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43) id 1NmWbf-0001j7-JJ
	for qemu-devel@nongnu.org; Tue, 02 Mar 2010 13:14:07 -0500
Received: from Debian-exim by monty-python.gnu.org with spam-scanned (Exim
	4.60) (envelope-from <anthony@codemonkey.ws>) id 1NmWbd-0002Cl-SG
	for qemu-devel@nongnu.org; Tue, 02 Mar 2010 13:14:07 -0500
Received: from mail-ew0-f219.google.com ([209.85.219.219]:64881)
	by monty-python.gnu.org with esmtp (Exim 4.60)
	(envelope-from <anthony@codemonkey.ws>) id 1NmWbd-0002Cb-9O
	for qemu-devel@nongnu.org; Tue, 02 Mar 2010 13:14:05 -0500
Received: by ewy19 with SMTP id 19so490372ewy.2
	for <qemu-devel@nongnu.org>; Tue, 02 Mar 2010 10:14:04 -0800 (PST)
Message-ID: <4B8D5566.6030308@codemonkey.ws>
Date: Tue, 02 Mar 2010 12:13:58 -0600
From: Anthony Liguori <anthony@codemonkey.ws>
MIME-Version: 1.0
Subject: Re: [Qemu-devel] Re: [PATCHv2 10/12] tap: add vhost/vhostfd options
References: <886ef6ffeb6748f6dc4fe5431f71cb12bb74edc9.1267122331.git.mst@redhat.com>
	<4B86D3CF.4020601@codemonkey.ws>
	<20100226145155.GC23359@redhat.com>
	<4B87E755.9000707@codemonkey.ws>
	<20100227194418.GB26389@redhat.com>
	<4B8A94FA.5020000@codemonkey.ws>
	<20100228171920.GE28921@redhat.com>
	<4B8AD8D4.7070002@codemonkey.ws> <20100302161247.GA25371@amt.cnet>
	<4B8D4350.6040506@codemonkey.ws> <20100302180025.GA27792@amt.cnet>
In-Reply-To: <20100302180025.GA27792@amt.cnet>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
List-Id: qemu-devel.nongnu.org
List-Unsubscribe: <http://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.gnu.org/pipermail/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <http://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Marcelo Tosatti <mtosatti@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>, quintela@redhat.com, qemu-devel@nongnu.org, Paul Brook <paul@codesourcery.com>, amit.shah@redhat.com, kraxel@redhat.com

On 03/02/2010 12:00 PM, Marcelo Tosatti wrote:
>> - it always returns a transient mapping
>> - it may (transparently) bounce
>> - it may fail to bounce, caller must deal
>>
>> The new function I'm proposing has the following semantics:
>>      
> What exactly are the purposes of the new function?
>    

We need an API that can be used to obtain a persistent mapping of a 
given guest physical address.  We don't have that today.  We can 
potentially change cpu_physical_memory_map() to also accommodate this 
use case but that's a second step.

>> - it always returns a persistent mapping
>>      
> For hotplug? What exactly you mean persistent?
>    

Hotplug cannot happen as long as a persistent mapping exists for an 
address within that region.  This is okay.  You cannot have an active 
device driver DMA'ing to a DIMM and then hot unplug it.  The guest is 
responsible for making sure this doesn't happen.
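
As a rough sketch of that rule (not existing QEMU code; the struct and
the region_* names below are hypothetical), a per-region pin count is
enough to express "no hot unplug while a persistent mapping exists":

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bookkeeping for one hot-pluggable RAM region (a DIMM). */
typedef struct RamRegionSketch {
    unsigned pin_count;   /* number of live persistent mappings */
    bool present;         /* false once the region has been unplugged */
} RamRegionSketch;

/* Take a persistent mapping; only possible while the region is present. */
static bool region_pin(RamRegionSketch *r)
{
    if (!r->present) {
        return false;
    }
    r->pin_count++;
    return true;
}

/* Drop a persistent mapping previously taken with region_pin(). */
static void region_unpin(RamRegionSketch *r)
{
    assert(r->pin_count > 0);
    r->pin_count--;
}

/* Hot unplug is refused while any persistent mapping exists. */
static bool region_try_unplug(RamRegionSketch *r)
{
    if (r->pin_count > 0) {
        return false;
    }
    r->present = false;
    return true;
}
```

With this, a device that pins the region at ring setup and unpins it at
reset makes region_try_unplug() fail for as long as the ring is live.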

>> - it never bounces
>> - it will only fail if the mapping isn't ram
>>
>> A caller can use the new function to implement an optimization to
>> force the device to only work with real ram.
>>      
> To bypass the address translation in exec.c?
>    

No, the ultimate goal is to convert virtio ring accesses from:

static inline uint32_t vring_desc_len(target_phys_addr_t desc_pa, int i)
{
     target_phys_addr_t pa;
     pa = desc_pa + sizeof(VRingDesc) * i + offsetof(VRingDesc, len);
     return ldl_phys(pa);
}

len = vring_desc_len(vring.desc_pa, i);

To:

len = ldl_w(vring->desc[i].len);

When the host and target byte order match, ldl_w is a no-op.  Otherwise, 
it's just a byte swap.  ldl_phys() today turns into 
cpu_physical_memory_read(), which is very slow.
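
To make the "nop or byte swap" point concrete, here is a minimal
sketch.  It assumes the ring fields are little-endian in guest memory
(an assumption for illustration only), and every name ending in
_sketch is made up here, not an existing helper:

```c
#include <assert.h>
#include <stdint.h>

/* Reverse the byte order of a 32-bit value. */
static inline uint32_t bswap32_sketch(uint32_t v)
{
    return ((v & 0x000000ffu) << 24) | ((v & 0x0000ff00u) << 8) |
           ((v & 0x00ff0000u) >> 8)  | ((v & 0xff000000u) >> 24);
}

/* Runtime probe: is the lowest-addressed byte the least significant? */
static inline int host_is_little_endian_sketch(void)
{
    const uint16_t probe = 1;
    return *(const uint8_t *)&probe == 1;
}

/*
 * Convert a 32-bit value read straight from a mapped, little-endian
 * ring into host byte order: identity on a little-endian host, one
 * byte swap on a big-endian host.
 */
static inline uint32_t ldl_w_sketch(uint32_t raw)
{
    return host_is_little_endian_sketch() ? raw : bswap32_sketch(raw);
}

/* Small self-check that doubles as a usage example. */
static void ldl_w_sketch_selfcheck(void)
{
    union { uint8_t b[4]; uint32_t w; } u = { { 0x78, 0x56, 0x34, 0x12 } };
    assert(ldl_w_sketch(u.w) == 0x12345678u);
}
```

In a real build the endianness test would be resolved at compile time,
so the matching case costs nothing beyond the load itself.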

To support this, we must enforce that when a guest passes us a physical 
address, we can safely obtain a persistent mapping to it.  This is true 
for any ram address.  It's not true for MMIO memory.  We have no way to 
do this with cpu_physical_memory_map().
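
Roughly, the semantics I have in mind look like the sketch below.  The
region table, the types, and ram_map_sketch() are all hypothetical and
only illustrate "never bounce, fail unless the range is RAM"; this is
not a proposed signature:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t phys_addr_sketch;

typedef enum { REGION_RAM, REGION_MMIO } RegionKindSketch;

/* Hypothetical description of one guest physical address range. */
typedef struct {
    phys_addr_sketch base;
    uint64_t size;
    RegionKindSketch kind;
    uint8_t *host;          /* host backing for RAM, NULL for MMIO */
} RegionSketch;

/*
 * Return a host pointer covering [addr, addr + len), or NULL if the
 * range is not entirely inside a single RAM region.  There is no
 * bounce buffer: the caller either gets real RAM or gets a failure.
 */
static void *ram_map_sketch(const RegionSketch *regions, size_t n,
                            phys_addr_sketch addr, uint64_t len)
{
    assert(regions != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        const RegionSketch *r = &regions[i];
        if (addr < r->base || addr - r->base >= r->size) {
            continue;               /* addr is not in this region */
        }
        if (len > r->size - (addr - r->base)) {
            return NULL;            /* range runs past the region end */
        }
        if (r->kind != REGION_RAM) {
            return NULL;            /* MMIO cannot be mapped this way */
        }
        return r->host + (addr - r->base);
    }
    return NULL;                    /* unassigned address */
}
```

A virtio device would call something like this once when the guest
programs the ring address, and refuse to start if it gets NULL back.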

>>    IOW, this is something we can use in virtio, but very little else.
>> cpu_physical_memory_map can be used in more circumstances.
>>      
> Does not make much sense to me. The qdev<->  memory map mapping seems
> more important. Following your security enhancement drive, you can for
> example check whether the region can actually be mapped by the device
> and deny otherwise, or do whatever host-side memory protection tricks
> you'd like.
>    

These are two independent things.  Part of what makes virtio so 
complicated to convert to proper bus accessors is its use of 
ldl_phys/stl_phys/etc.  No other device uses those functions.  If we 
reduce virtio to just use a map() function, it simplifies the bus 
accessor conversion.

> And "cpu_ram_map" seems like the memory is accessed through cpu context,
> while it is really always device context.
>    

Yes, but that's a separate effort.  In fact, see 
http://wiki.qemu.org/Features/RamAPI vs. 
http://wiki.qemu.org/Features/PCIMemoryAPI

Regards,

Anthony Liguori