From: Anthony Liguori
Date: Tue, 02 Mar 2010 12:13:58 -0600
Subject: Re: [Qemu-devel] Re: [PATCHv2 10/12] tap: add vhost/vhostfd options
Message-ID: <4B8D5566.6030308@codemonkey.ws>
In-Reply-To: <20100302180025.GA27792@amt.cnet>
References: <886ef6ffeb6748f6dc4fe5431f71cb12bb74edc9.1267122331.git.mst@redhat.com>
 <4B86D3CF.4020601@codemonkey.ws> <20100226145155.GC23359@redhat.com>
 <4B87E755.9000707@codemonkey.ws> <20100227194418.GB26389@redhat.com>
 <4B8A94FA.5020000@codemonkey.ws> <20100228171920.GE28921@redhat.com>
 <4B8AD8D4.7070002@codemonkey.ws> <20100302161247.GA25371@amt.cnet>
 <4B8D4350.6040506@codemonkey.ws> <20100302180025.GA27792@amt.cnet>
List-Id: qemu-devel.nongnu.org
To: Marcelo Tosatti
Cc: "Michael S. Tsirkin", quintela@redhat.com, qemu-devel@nongnu.org,
 Paul Brook, amit.shah@redhat.com, kraxel@redhat.com

On 03/02/2010 12:00 PM, Marcelo Tosatti wrote:
>> - it always returns a transient mapping
>> - it may (transparently) bounce
>> - it may fail to bounce, caller must deal
>>
>> The new function I'm proposing has the following semantics:
>>
> What exactly are the purposes of the new function?
>

We need an API that can be used to obtain a persistent mapping of a
given guest physical address.  We don't have that today.  We can
potentially change cpu_physical_memory_map() to also accommodate this
use case but that's a second step.

>> - it always returns a persistent mapping
>>
> For hotplug? What exactly you mean persistent?
>

Hotplug cannot happen as long as a persistent mapping exists for an
address within that region.  This is okay.  You cannot have an active
device driver DMA'ing to a DIMM and then hot unplug it.  The guest is
responsible for making sure this doesn't happen.

>> - it never bounces
>> - it will only fail if the mapping isn't ram
>>
>> A caller can use the new function to implement an optimization to
>> force the device to only work with real ram.
>>
> To bypass the address translation in exec.c?
>

No, the ultimate goal is to convert virtio ring accesses from:

static inline uint32_t vring_desc_len(target_phys_addr_t desc_pa, int i)
{
    target_phys_addr_t pa;
    pa = desc_pa + sizeof(VRingDesc) * i + offsetof(VRingDesc, len);
    return ldl_phys(pa);
}

len = vring_desc_len(vring.desc_pa, i)

To:

len = ldl_w(vring->desc[i].len);

When host == arch, ldl_w is a nop.  Otherwise, it's just a byte swap.
ldl_phys() today turns into cpu_physical_memory_read() which is very
slow.

To support this, we must enforce that when a guest passes us a physical
address, we can safely obtain a persistent mapping to it.  This is true
for any ram address.  It's not true for MMIO memory.  We have no way to
do this with cpu_physical_memory_map().

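To make the before/after concrete, here is a minimal standalone sketch of
the two access patterns.  It is not QEMU code: guest_ram, guest_pa_t,
slow_ldl_phys() and ram_map_persistent() are hypothetical stand-ins
invented for illustration, the guest is assumed to be little-endian, and
ldl_w() is only modelled the way it is described above (a nop when byte
orders match, a swap otherwise, using the GCC/Clang __BYTE_ORDER__ macros
and __builtin_bswap32).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Descriptor layout; field names follow VRingDesc. */
typedef struct VRingDesc {
    uint64_t addr;
    uint32_t len;
    uint16_t flags;
    uint16_t next;
} VRingDesc;

#define GUEST_RAM_SIZE 4096

/* Hypothetical stand-in for guest RAM; the union member only forces
 * an alignment suitable for VRingDesc. */
static union {
    uint8_t   bytes[GUEST_RAM_SIZE];
    VRingDesc align_as_desc;
} guest_ram;

typedef uint64_t guest_pa_t;    /* stand-in for target_phys_addr_t */

/* Modelled on the description of ldl_w: nop when host and guest byte
 * order match, byte swap otherwise.  Assumes a little-endian guest. */
uint32_t ldl_w(uint32_t v)
{
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
    return __builtin_bswap32(v);
#else
    return v;
#endif
}

/* Stand-in for ldl_phys(): each access is bounds-checked and copied. */
uint32_t slow_ldl_phys(guest_pa_t pa)
{
    uint32_t v;
    assert(pa <= GUEST_RAM_SIZE - sizeof(v));
    memcpy(&v, &guest_ram.bytes[pa], sizeof(v));
    return ldl_w(v);
}

/* Hypothetical persistent map: never bounces, and returns NULL only
 * when the requested range is not entirely inside RAM. */
void *ram_map_persistent(guest_pa_t pa, size_t len)
{
    if (pa > GUEST_RAM_SIZE || len > GUEST_RAM_SIZE - pa) {
        return NULL;
    }
    return &guest_ram.bytes[pa];
}

/* Before: recompute the physical address on every access. */
uint32_t vring_desc_len_slow(guest_pa_t desc_pa, int i)
{
    guest_pa_t pa = desc_pa + sizeof(VRingDesc) * i
                    + offsetof(VRingDesc, len);
    return slow_ldl_phys(pa);
}

/* After: map the ring once, then index it like a normal array. */
uint32_t vring_desc_len_fast(const VRingDesc *desc, int i)
{
    return ldl_w(desc[i].len);
}
```

The caller would map the descriptor table once when the guest sets up the
ring, keep the returned pointer for the lifetime of the device, and refuse
the ring if the map returns NULL (i.e. the address is not ram).
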
>> IOW, this is something we can use in virtio, but very little else.
>> cpu_physical_memory_map can be used in more circumstances.
>>
> Does not make much sense to me. The qdev<-> memory map mapping seems
> more important. Following your security enhancement drive, you can for
> example check whether the region can actually be mapped by the device
> and deny otherwise, or do whatever host-side memory protection tricks
> you'd like.
>

It's two independent things.  Part of what makes virtio so complicated
to convert to proper bus accessors is its use of
ldl_phys/stl_phys/etc.  No other device uses those functions.  If we
reduce virtio to just use a map() function, it simplifies the bus
accessor conversion.

> And "cpu_ram_map" seems like the memory is accessed through cpu context,
> while it is really always device context.
>

Yes, but that's a separate effort.  In fact, see
http://wiki.qemu.org/Features/RamAPI vs.
http://wiki.qemu.org/Features/PCIMemoryAPI

Regards,

Anthony Liguori