From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:43346)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1WskzK-0007h8-Oq
	for qemu-devel@nongnu.org; Thu, 05 Jun 2014 23:38:48 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from )
	id 1WskzE-0004Fz-C2
	for qemu-devel@nongnu.org; Thu, 05 Jun 2014 23:38:42 -0400
Message-ID: <1402025888.3247.178.camel@pasglop>
From: Benjamin Herrenschmidt
Date: Fri, 06 Jun 2014 13:38:08 +1000
In-Reply-To: <5390FEF3.4080108@suse.de>
References: <1401947401-21329-1-git-send-email-aik@ozlabs.ru>
	<1401947401-21329-2-git-send-email-aik@ozlabs.ru>
	<5390119D.8040201@ozlabs.ru> <53906B56.3080007@suse.de>
	<53906C50.50308@ozlabs.ru> <53906D54.4030105@suse.de>
	<5390718C.4020005@ozlabs.ru> <53907267.1090000@suse.de>
	<53907FBA.8060604@ozlabs.ru> <5390A01D.7020004@suse.de>
	<5390FA95.2090509@ozlabs.ru> <5390FEF3.4080108@suse.de>
Content-Type: text/plain; charset="UTF-8"
Mime-Version: 1.0
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH v7 1/4] spapr_iommu: Make in-kernel TCE
	table optional
To: Alexander Graf
Cc: Alexey Kardashevskiy, Alex Williamson, qemu-ppc@nongnu.org,
	qemu-devel@nongnu.org, Gavin Shan

On Fri, 2014-06-06 at 01:36 +0200, Alexander Graf wrote:
>
> It would be nicer if the guest had full control over the virtual
> address range of a PCI device.

It does ... within a HW window which can be different between P7 and P8.

On P7 all PEs on a PHB share a single DMA address space that gets sliced
up. I won't get into details on what kind of slices are available;
suffice to say we provide a single smallish window in 32-bit space for
each PE and the guest controls the 4k TCEs in there.

On P8, each PE has its own DMA address space which has 2 windows, one at
0 and one at 0x0800_0000_0000_0000.
By default we configure the 0 window for 32-bit/4K TCE remapping (we set
it to a 2G window) for compatibility with existing PAPR expectations.

The high window is used in the host as a bypass: we disable TCEs and use
a direct mapping to physical memory through it instead, to allow the
host drivers that are 64-bit DMA capable to have the fastest possible
access to memory.

When we pass through a device today we disable that second window for
obvious reasons. With Alexey's patches, we'll be able to control it,
which will in turn allow us to implement the PAPR "DDW" extension, which
allows the guest to populate that second window. Typically the guest
will use it to create a full mapping of its entire address space in
64-bit space using the largest possible TCE size (which is constrained
by the page size used to back the guest memory).

Here too, within those windows, the guest has control of the mappings.

Cheers,
Ben.
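
To put rough numbers on the P8 layout described above, here is a small
illustrative sketch. It is not code from QEMU or the kernel. The window
bases, the 2G size and the 4K TCE page size come from the message; the
function names, the 8-byte TCE entry size, and the 64G guest backed by
16M pages are assumptions made only for the example.

```python
# Illustrative arithmetic only. Constants marked "from the message" are
# taken from the text above; everything else is a hypothetical example.

LOW_WINDOW_BASE = 0x0                       # from the message
LOW_WINDOW_SIZE = 2 << 30                   # 2G default window, from the message
HIGH_WINDOW_BASE = 0x0800_0000_0000_0000    # from the message
TCE_ENTRY_BYTES = 8                         # assumption: one 64-bit entry per TCE


def tce_count(window_size, page_shift):
    """Number of TCEs needed to cover window_size bytes with
    pages of 2**page_shift bytes (rounded up)."""
    page_size = 1 << page_shift
    return (window_size + page_size - 1) // page_size


def classify(dma_addr, high_window_size):
    """Return which window a DMA address falls in: "low", "high", or None.
    high_window_size is a parameter because the message does not fix it."""
    if LOW_WINDOW_BASE <= dma_addr < LOW_WINDOW_BASE + LOW_WINDOW_SIZE:
        return "low"
    if HIGH_WINDOW_BASE <= dma_addr < HIGH_WINDOW_BASE + high_window_size:
        return "high"
    return None


# Default 32-bit window: 2G covered by 4K TCEs.
low_tces = tce_count(LOW_WINDOW_SIZE, 12)

# Hypothetical DDW case: a 64G guest backed by 16M pages, fully mapped
# through the high window with 16M TCEs.
guest_size = 64 << 30
ddw_tces = tce_count(guest_size, 24)

print(low_tces, low_tces * TCE_ENTRY_BYTES)   # entries and table bytes, low window
print(ddw_tces, ddw_tces * TCE_ENTRY_BYTES)   # entries and table bytes, DDW example
```

Under these assumptions the larger TCE size is what keeps the second
window cheap: the 64G example needs far fewer entries than the 2G window
does at 4K granularity.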