From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Yu, Zhang"
Subject: Re: [RFC] Xen PV IOMMU interface draft B
Date: Wed, 17 Jun 2015 20:48:44 +0800
Message-ID: <55816CAC.7090104@linux.intel.com>
References: <557B0C35.4080907@citrix.com>
In-Reply-To: <557B0C35.4080907@citrix.com>
To: Malcolm Crossley, xen-devel, Jan Beulich, Konrad Rzeszutek Wilk,
    Andrew Cooper, Paul Durrant, Kevin Tian, "Lv, Zhiyuan", David Vrabel
List-Id: xen-devel@lists.xenproject.org

Hi Malcolm,

Thank you very much for accommodating our XenGT requirement in your design.
Following are some XenGT-related questions. :)

On 6/13/2015 12:43 AM, Malcolm Crossley wrote:
> Hi All,
>
> Here is a design for allowing guests to control the IOMMU. This
> allows for the guest GFN mapping to be programmed into the IOMMU and
> avoids using the SWIOTLB bounce buffer technique in the Linux kernel
> (except for legacy 32 bit DMA IO devices).
>
> Draft B has been expanded to include Bus Address mapping/lookup for Mediated
> pass-through emulators.
>
> The pandoc markdown format of the document is provided below to allow
> for easier inline comments:
>
> % Xen PV IOMMU interface
> % Malcolm Crossley
>   Paul Durrant
> % Draft B
>
> Introduction
> ============
>
> Revision History
> ----------------
>
> --------------------------------------------------------------------
> Version  Date         Changes
> -------  -----------  ----------------------------------------------
> Draft A  10 Apr 2014  Initial draft.
>
> Draft B  12 Jun 2015  Second draft.
> --------------------------------------------------------------------
>
> Background
> ==========
>
> Linux kernel SWIOTLB
> --------------------
>
> Xen PV guests use a Pseudophysical Frame Number (PFN) address space which is
> decoupled from the host Machine Frame Number (MFN) address space.
>
> PV guest hardware drivers are only aware of the PFN address space and
> assume that if PFN addresses are contiguous then the hardware addresses would
> be contiguous as well. The decoupling between PFN and MFN address spaces means
> PFN and MFN addresses may not be contiguous across page boundaries and thus a
> buffer allocated in PFN address space which spans a page boundary may not be
> contiguous in MFN address space.
>
> PV hardware drivers cannot tolerate this behaviour and so a special
> "bounce buffer" region is used to hide this issue from the drivers.
>
> A bounce buffer region is a special part of the PFN address space which has
> been made to be contiguous in both PFN and MFN address spaces. When a driver
> requests that a buffer which spans a page boundary be made available for
> hardware to read, the core operating system code copies the buffer into a
> temporarily reserved part of the bounce buffer region and then returns the MFN
> address of the reserved part of the bounce buffer region back to the driver
> itself. The driver then instructs the hardware to read the copy of the buffer
> in the bounce buffer. Similarly, if the driver requests that a buffer be made
> available for hardware to write to, then first a region of the bounce buffer
> is reserved and, after the hardware completes writing, the reserved region of
> the bounce buffer is copied to the originally allocated buffer.
>
> The overhead of memory copies to/from the bounce buffer region is high
> and damages performance. Furthermore, there is a risk the fixed size
> bounce buffer region will become exhausted and it will not be possible to
> return a hardware address back to the driver.
> The Linux kernel drivers do not tolerate this failure and so the kernel is
> forced to crash, as an uncorrectable error has occurred.
>
> Input/Output Memory Management Units (IOMMU) allow for an inbound address
> mapping to be created from the I/O Bus address space (typically PCI) to
> the machine frame number address space. IOMMUs typically use a page table
> mechanism to manage the mappings and therefore can create mappings of page size
> granularity or larger.
>
> The I/O Bus address space will be referred to as the Bus Frame Number (BFN)
> address space for the rest of this document.
>
>
> Mediated Pass-through Emulators
> -------------------------------
>
> Mediated Pass-through emulators allow guest domains to interact with
> hardware devices via emulator mediation. The emulator runs in a domain separate
> to the guest domain and it is used to enforce security of guest access to the
> hardware devices and isolation of different guests accessing the same hardware
> device.
>
> The emulator requires a mechanism to map guest addresses to a bus address that
> the hardware devices can access.
>
>
> Clarification of GFN and BFN fields for different guest types
> -------------------------------------------------------------
> The Guest Frame Number (GFN) definition varies depending on the guest type.
>
> The diagram below details the memory accesses originating from the CPU, per
> guest type:
>
>       HVM guest                    PV guest
>
>          (VA)                        (VA)
>           |                           |
>          MMU                         MMU
>           |                           |
>         (GFN)                         |
>           |                           | (GFN)
>   HAP a.k.a EPT/NPT                   |
>           |                           |
>         (MFN)                       (MFN)
>           |                           |
>          RAM                         RAM
>
> For PV guests GFN is equal to MFN for a single page but not for a contiguous
> range of pages.
>
> Bus Frame Numbers (BFN) refer to the address presented on the physical bus
> before being translated by the IOMMU.
>
> The diagram below details memory accesses originating from a physical device.
>
>     Physical Device
>            |
>          (BFN)
>            |
>        IOMMU-PT
>            |
>          (MFN)
>            |
>           RAM
>
>
>
> Purpose
> =======
>
> 1. Allow Xen guests to create/modify/destroy IOMMU mappings for
> hardware devices that the PV guest has access to. This enables the PV guest to
> program a bus address space mapping which matches its GFN mapping. Once a 1-1
> mapping of PFN to bus address space is created then a bounce buffer
> region is not required for the IO devices connected to the IOMMU.
>
> 2. Allow for Xen guests to lookup/create/modify/destroy IOMMU mappings for
> guest memory of domains the calling Xen guest has sufficient privilege over.
> This enables domains to provide mediated hardware acceleration to other
> guest domains.
>
>
> Xen Architecture
> ================
>
> The Xen architecture consists of a new hypercall interface and changes to the
> grant map interface.
>
> The existing IOMMU mappings set up at domain creation time will be preserved so
> that PV domains unaware of this feature will continue to function with no
> changes required.
>
> Memory ballooning will be supported by taking an additional reference on the
> MFN backing the GFN for each successful IOMMU mapping created.
>
> An M2B tracking structure will be used to ensure all references to an MFN can
> be located easily.
>
> Xen PV IOMMU hypercall interface
> --------------------------------
> A two argument hypercall interface (do_iommu_op).
>
>     ret_t do_iommu_op(XEN_GUEST_HANDLE_PARAM(void) arg, unsigned int count)
>
> First argument, guest handle pointer to array of `struct pv_iommu_op`
> Second argument, unsigned integer count of `struct pv_iommu_op` elements in array.
>
> Definition of struct pv_iommu_op:
>
>     struct pv_iommu_op {
>
>         uint16_t subop_id;
>         uint16_t flags;
>         int32_t status;
>
>         union {
>             struct {
>                 uint64_t bfn;
>                 uint64_t gfn;
>             } map_page;
>
>             struct {
>                 uint64_t bfn;
>             } unmap_page;
>
>             struct {
>                 uint64_t bfn;
>                 uint64_t gfn;
>                 uint16_t domid;
>                 ioservid_t ioserver;
>             } map_foreign_page;
>
>             struct {
>                 uint64_t bfn;
>                 uint64_t gfn;
>                 uint16_t domid;
>                 ioservid_t ioserver;
>             } lookup_foreign_page;
>
>             struct {
>                 uint64_t bfn;
>                 ioservid_t ioserver;
>             } unmap_foreign_page;
>         } u;
>     };
>
> Definition of PV IOMMU subops:
>
>     #define IOMMUOP_query_caps            1
>     #define IOMMUOP_map_page              2
>     #define IOMMUOP_unmap_page            3
>     #define IOMMUOP_map_foreign_page      4
>     #define IOMMUOP_lookup_foreign_page   5
>     #define IOMMUOP_unmap_foreign_page    6
>
>
> Design considerations for hypercall op
> -------------------------------------------
> IOMMU map/unmap operations can be slow and can involve flushing the IOMMU TLB
> to ensure the IO device uses the updated mappings.
>
> The op has been designed to take an array of operations and a count as
> parameters. This allows for easily implemented hypercall continuations to be
> used and allows for batches of IOMMU operations to be submitted before flushing
> the IOMMU TLB.
>
> The subop_id to be used for a particular element is encoded into the element
> itself. This allows for map and unmap operations to be performed in one hypercall
> and for the IOMMU TLB flushing optimisations to be still applied.
>
> The hypercall will ensure that the required IOMMU TLB flushes are applied before
> returning to guest via either hypercall completion or a hypercall continuation.
>
> IOMMUOP_query_caps
> ------------------
>
> This subop queries the runtime capabilities of the PV-IOMMU interface for the
> specific calling domain. This subop uses `struct pv_iommu_op` directly.
>
> ------------------------------------------------------------------------------
> Field          Purpose
> -----          ---------------------------------------------------------------
> `flags`        [out] This field details the IOMMUOP capabilities.
>
> `status`       [out] Status of this op, op specific values listed below
> ------------------------------------------------------------------------------
>
> Defined bits for flags field:
>
> ------------------------------------------------------------------------------
> Name                      Bit     Definition
> ----                      ------  ----------------------------------
> IOMMU_QUERY_map_cap       0       IOMMUOP_map_page or IOMMUOP_map_foreign_page
>                                   can be used for this domain
>
> IOMMU_QUERY_map_all_gfns  1       IOMMUOP_map_page subop can map any MFN
>                                   not used by Xen
>
> Reserved for future use   2-9     n/a
>
> IOMMU_page_order          10-15   Returns maximum possible page order for
>                                   all other IOMMUOP subops
> ------------------------------------------------------------------------------
>
> Defined values for query_caps subop status field:
>
> Value   Reason
> ------  ----------------------------------------------------------
> 0       subop successfully returned
>
> IOMMUOP_map_page
> ----------------------
> This subop uses `struct map_page` part of the `struct pv_iommu_op`.
>
> If IOMMU dom0-strict mode is NOT enabled then the hardware domain will be
> allowed to map all GFNs except for Xen owned MFNs; otherwise the hardware
> domain will only be allowed to map GFNs which it owns.
>
> If IOMMU dom0-strict mode is NOT enabled then the hardware domain will be
> allowed to map all GFNs without taking a reference to the MFN backing the GFN
> by setting the IOMMU_MAP_OP_no_ref_cnt flag.
>
> Every successful pv_iommu_op will result in an additional page reference being
> taken on the MFN backing the GFN except for the condition detailed above.
>
> If the map_op flags indicate a writeable mapping is required then a writeable
> page type reference will be taken; otherwise a standard page reference will be
> taken.
>
> All the following conditions are required to be true for PV IOMMU map
> subop to succeed:
>
> 1. IOMMU detected and supported by Xen
> 2. The domain has IOMMU controlled hardware allocated to it
> 3. If the domain is the hardware_domain, then the following Xen IOMMU options
>    are NOT enabled: dom0-passthrough
>
> This subop's usage of the `struct pv_iommu_op` and `struct map_page` fields
> is detailed below:
>
> ------------------------------------------------------------------------------
> Field          Purpose
> -----          ---------------------------------------------------------------
> `bfn`          [in] Bus address frame number (BFN) to be mapped to specified gfn
>                below
>
> `gfn`          [in] Guest address frame number for DOMID_SELF
>
> `flags`        [in] Flags for signalling type of IOMMU mapping to be created,
>                Flags can be combined.
>
> `status`       [out] Mapping status of this op, op specific values listed below
> ------------------------------------------------------------------------------
>
> Defined bits for flags field:
>
> Name                      Bit     Definition
> ----                      -----   ----------------------------------
> IOMMU_OP_readable         0       Create readable IOMMU mapping
> IOMMU_OP_writeable        1       Create writeable IOMMU mapping
> IOMMU_MAP_OP_no_ref_cnt   2       IOMMU mapping does not take a reference to
>                                   MFN backing BFN mapping
> Reserved for future use   3-9     n/a
> IOMMU_page_order          10-15   Page order to be used for both gfn and bfn
>
> Defined values for map_page subop status field:
>
> Value    Reason
> ------   ----------------------------------------------------------------------
> 0        subop successfully returned
> -EIO     IOMMU unit returned error when attempting to map BFN to GFN.
> -EPERM   GFN could not be mapped because the GFN belongs to Xen.
> -EPERM   Domain is not a hardware domain and GFN does not belong to domain
> -EPERM   Domain is a hardware domain, IOMMU dom0-strict mode is enabled and
>          GFN does not belong to domain
> -EACCES  BFN address conflicts with RMRR regions for devices attached to
>          DOMID_SELF
> -ENOSPC  Page order is too large for either BFN, GFN or IOMMU unit
>
> IOMMUOP_unmap_page
> ------------------
> This subop uses `struct unmap_page` part of the `struct pv_iommu_op`.
>
> The subop's usage of the `struct pv_iommu_op` and `struct unmap_page` fields
> is detailed below:
>
> --------------------------------------------------------------------
> Field          Purpose
> -----          -----------------------------------------------------
> `bfn`          [in] Bus address frame number to be unmapped in DOMID_SELF
>
> `flags`        [in] Flags for signalling page order of unmap operation
>
> `status`       [out] Mapping status of this unmap operation, 0 indicates success
> --------------------------------------------------------------------
>
> Defined bits for flags field:
>
> Name                      Bit     Definition
> ----                      -----   ----------------------------------
> Reserved for future use   0-9     n/a
> IOMMU_page_order          10-15   Page order to be used for bfn
>
>
> Defined values for unmap_page subop status field:
>
> Error code  Reason
> ----------  ------------------------------------------------------------
> 0           subop successfully returned
> -EIO        IOMMU unit returned error when attempting to unmap BFN.
> -ENOSPC     Page order is too large for either BFN address or IOMMU unit
> ------------------------------------------------------------------------
>
>
> IOMMUOP_map_foreign_page
> ----------------
> This subop uses `struct map_foreign_page` part of the `struct pv_iommu_op`.
>
> It is not valid to use domid representing the calling domain.
>
> The hypercall will only succeed if calling domain has sufficient privilege over
> the specified domid
>
> If there is no IOMMU support then the MFN is returned in the BFN field (that is
> the only valid bus address for the GFN + domid combination).
>
> If there is IOMMU support then the specified BFN is returned for the GFN + domid
> combination
>
> The M2B mechanism is a mapping from MFN to a (BFN,domid,ioserver) tuple.
>
> Each successful subop will add to the M2B if there was not an existing identical
> M2B entry.
>
> Every new M2B entry will take a reference to the MFN backing the GFN.
>
> All the following conditions are required to be true for PV IOMMU map_foreign
> subop to succeed:
>
> 1. IOMMU detected and supported by Xen
> 2. The domain has IOMMU controlled hardware allocated to it
> 3. The domain is a hardware_domain and the following Xen IOMMU options are
>    NOT enabled: dom0-passthrough

What if the IOMMU is enabled and runs in the default mode, which 1:1 maps all
memory except that owned by Xen?
>
>
> This subop's usage of the `struct pv_iommu_op` and `struct map_foreign_page`
> fields is detailed below:
>
> --------------------------------------------------------------------
> Field          Purpose
> -----          -----------------------------------------------------
> `domid`        [in] The domain ID for which the gfn field applies
>
> `ioserver`     [in] IOREQ server id associated with mapping
>
> `bfn`          [in] Bus address frame number for gfn address
>
> `gfn`          [in] Guest address frame number
>
> `flags`        [in] Details the status of the BFN mapping
>
> `status`       [out] status of this subop, 0 indicates success
> --------------------------------------------------------------------
>
> Defined bits for flags field:
>
> Name                      Bit     Definition
> ----                      -----   ----------------------------------
> IOMMUOP_readable          0       BFN IOMMU mapping is readable
> IOMMUOP_writeable         1       BFN IOMMU mapping is writeable
> IOMMUOP_swap_mfn          2       BFN IOMMU mapping can be safely
>                                   swapped to scratch page
> Reserved for future use   3-9     Reserved flag bits should be 0
> IOMMU_page_order          10-15   Returns maximum possible page order for
>                                   all other IOMMUOP subops
>
> Defined values for map_foreign_page subop status field:
>
> Error code  Reason
> ----------  ------------------------------------------------------------
> 0           subop successfully returned
> -EIO        IOMMU unit returned error when attempting to map BFN to GFN.
> -EPERM      Calling domain does not have sufficient privilege over domid
> -EPERM      GFN could not be mapped because the GFN belongs to Xen.
> -EPERM      domid maps to DOMID_SELF
> -EACCES     BFN address conflicts with RMRR regions for devices attached to
>             DOMID_SELF
> -ENODEV     Provided ioserver id is not valid
> -ENXIO      Provided domid id is not valid
> -ENXIO      Provided GFN address is not valid
> -ENOSPC     Page order is too large for either BFN, GFN or IOMMU unit
>
> IOMMUOP_lookup_foreign_page
> ----------------
> This subop uses `struct lookup_foreign_page` part of the `struct pv_iommu_op`.
>
> If the BFN is specified as an input parameter and there is no IOMMU support
> for the calling domain then an error will be returned.
>
> It is the calling domain's responsibility to ensure there are no conflicts.
>
> The hypercall will only succeed if calling domain has sufficient privilege over
> the specified domid
>
> If there is no IOMMU support then the MFN is returned in the BFN field (that is
> the only valid bus address for the GFN + domid combination).

Similarly, what if the IOMMU is enabled and runs in the default mode, which
1:1 maps all memory except that owned by Xen? Will an MFN be returned? Or
should we use the query/map ops instead of the lookup op in this situation?

>
> Each successful subop will add to the M2B if there was not an existing identical
> M2B entry.
>
> Every new M2B entry will take a reference to the MFN backing the GFN.
>
> This subop's usage of the `struct pv_iommu_op` and `struct lookup_foreign_page`
> fields is detailed below:
>
> --------------------------------------------------------------------
> Field          Purpose
> -----          -----------------------------------------------------
> `domid`        [in] The domain ID for which the gfn field applies
>
> `ioserver`     [in] IOREQ server id associated with mapping
>
> `bfn`          [out] Bus address frame number for gfn address
>
> `gfn`          [in] Guest address frame number
>
> `flags`        [out] Details the status of the BFN mapping
>
> `status`       [out] status of this subop, 0 indicates success
> --------------------------------------------------------------------
>
> Defined bits for flags field:
>
> Name                      Bit     Definition
> ----                      -----   ----------------------------------
> IOMMUOP_readable          0       Returned BFN IOMMU mapping is readable
> IOMMUOP_writeable         1       Returned BFN IOMMU mapping is writeable
> Reserved for future use   2-9     Reserved flag bits should be 0
> IOMMU_page_order          10-15   Returns maximum possible page order for
>                                   all other IOMMUOP subops
>
> Defined values for lookup_foreign_page subop status field:
>
> Error code  Reason
> ----------  ------------------------------------------------------------
> 0           subop successfully returned
> -EPERM      Calling domain does not have sufficient privilege over domid
> -ENOENT     There is no available BFN for provided GFN + domid combination
> -ENODEV     Provided ioserver id is not valid
> -ENXIO      Provided domid id is not valid
> -ENXIO      Provided GFN address is not valid
>
>
> IOMMUOP_unmap_foreign_page
> ----------------
> This subop uses `struct unmap_foreign_page` part of the `struct pv_iommu_op`.
>
> If there is no IOMMU support then the MFN is returned in the BFN field (that is
> the only valid bus address for the GFN + domid combination).
>
> If there is IOMMU support then the specified BFN is returned for the GFN + domid
> combination
>
> Each successful subop will add to the M2B if there was not an existing identical
> M2B entry.
>
> Every new M2B entry will take a reference to the MFN backing the GFN.
>
> This subop's usage of the `struct pv_iommu_op` and `struct unmap_foreign_page`
> fields is detailed below:
>
> -----------------------------------------------------------------------
> Field          Purpose
> -----          --------------------------------------------------------
> `ioserver`     [in] IOREQ server id associated with mapping
>
> `bfn`          [in] Bus address frame number for gfn address
>
> `flags`        [out] Flags for signalling page order of unmap operation
>
> `status`       [out] status of this subop, 0 indicates success
> -----------------------------------------------------------------------
>
> Defined bits for flags field:
>
> Name                      Bit     Definition
> ----                      -----   ----------------------------------
> Reserved for future use   0-9     n/a
> IOMMU_page_order          10-15   Page order to be used for bfn
>
> Defined values for unmap_foreign_page subop status field:
>
> Error code  Reason
> ----------  ------------------------------------------------------------
> 0           subop successfully returned
> -ENOENT     There is no mapped BFN + ioserver id combination to unmap
>
>
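
A minimal sketch of how a caller might fill a batch for the array-based hypercall described in the draft, with one unmap and one map in the same array so that a single TLB flush could cover both. The structure is a trimmed copy of the draft's `struct pv_iommu_op`; the helper names `iommu_flags`, `iommu_flags_order`, `build_remap_batch` and the `IOMMU_PAGE_ORDER_*` macros are assumptions for illustration only and do not appear in the draft. The hypercall itself is not invoked here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Subop ids and flag bits as listed in the draft. */
#define IOMMUOP_map_page     2
#define IOMMUOP_unmap_page   3
#define IOMMU_OP_readable    (1u << 0)
#define IOMMU_OP_writeable   (1u << 1)

/* Hypothetical names: the draft only states that bits 10-15 hold the order. */
#define IOMMU_PAGE_ORDER_SHIFT 10
#define IOMMU_PAGE_ORDER_MASK  0x3fu

/* Trimmed copy of the draft structure (foreign-page members omitted). */
struct pv_iommu_op {
    uint16_t subop_id;
    uint16_t flags;
    int32_t  status;
    union {
        struct { uint64_t bfn; uint64_t gfn; } map_page;
        struct { uint64_t bfn; } unmap_page;
    } u;
};

/* Combine access bits (bits 0-9) with a page order (bits 10-15). */
static uint16_t iommu_flags(uint16_t access, unsigned int order)
{
    assert(order <= IOMMU_PAGE_ORDER_MASK);
    return (uint16_t)(access | (order << IOMMU_PAGE_ORDER_SHIFT));
}

/* Extract the page order from a flags value. */
static unsigned int iommu_flags_order(uint16_t flags)
{
    return (flags >> IOMMU_PAGE_ORDER_SHIFT) & IOMMU_PAGE_ORDER_MASK;
}

/*
 * Fill ops[0] with an unmap of old_bfn and ops[1] with a read/write map of
 * new_bfn -> gfn. Returns the element count to pass as the second argument
 * of the hypercall.
 */
static unsigned int build_remap_batch(struct pv_iommu_op ops[2],
                                      uint64_t old_bfn, uint64_t new_bfn,
                                      uint64_t gfn, unsigned int order)
{
    memset(ops, 0, 2 * sizeof(ops[0]));

    ops[0].subop_id = IOMMUOP_unmap_page;
    ops[0].flags = iommu_flags(0, order);
    ops[0].u.unmap_page.bfn = old_bfn;

    ops[1].subop_id = IOMMUOP_map_page;
    ops[1].flags = iommu_flags(IOMMU_OP_readable | IOMMU_OP_writeable, order);
    ops[1].u.map_page.bfn = new_bfn;
    ops[1].u.map_page.gfn = gfn;

    return 2;
}
```

After the hypercall returns, each element's `status` field would be checked individually, since the draft reports results per op rather than per batch.
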
> IOMMUOP_*_foreign_page interactions with guest domain ballooning
> ================================================================
>
> Guest domains can balloon out a set of GFN mappings at any time and render the
> BFN to GFN mapping invalid.
>
> When a BFN to GFN mapping becomes invalid, Xen will issue a buffered IO request
> of type IOREQ_TYPE_INVALIDATE to the affected IOREQ servers with the now invalid
> BFN address in the data field. If the buffered IO request ring is full then a
> standard (synchronous) IO request of type IOREQ_TYPE_INVALIDATE will be issued
> to the affected IOREQ server with the just-invalidated BFN address in the data
> field.
>
> The BFN mappings cannot be simply unmapped at the point of the balloon hypercall
> otherwise a malicious guest could specifically balloon out a GFN address
> in use by an emulator and trigger IOMMU faults for the domains with BFN
> mappings.
>
> For hosts with no IOMMU support: The affected emulator(s) must specifically
> issue an IOMMUOP_unmap_foreign_page subop for the now invalid BFN address so that
> the references to the underlying MFN are removed and the MFN can be freed back
> to the Xen memory allocator.

I do not quite understand this. With no IOMMU support, these BFNs are supplied
by the hypervisor. So why not let the hypervisor do this unmap and notify the
calling domain?

>
> For hosts with IOMMU support:
> If the BFN was mapped without the IOMMUOP_swap_mfn flag set in the
> IOMMUOP_map_foreign_page then the affected emulator(s) must
> specifically issue an IOMMUOP_unmap_foreign_page subop for the now invalid BFN
> address so that the references to the underlying MFN are removed.
>
> If the BFN was mapped with the IOMMUOP_swap_mfn flag set in the
> IOMMUOP_map_foreign_page subop for all emulators with mappings of that GFN then
> the BFN mapping will be swapped to point at a scratch MFN page and all BFN
> references to the invalid MFN will be removed by Xen after the BFN mapping has
> been updated to point at the scratch MFN page.
>
> The rationale for swapping the BFN mapping to point at scratch pages is to
> enable guest domains to balloon quickly without requiring hypercall(s) from
> emulators.
>
> Not all BFN mappings can be swapped without potentially causing problems for the
> hardware itself (command rings etc.) so the IOMMUOP_swap_mfn flag is used to
> allow per BFN control of Xen ballooning behaviour.
>
>
> PV IOMMU interactions with self ballooning
> ==========================================
>
> The guest should clear any IOMMU mappings it has of its own pages before
> releasing a page back to Xen. It will need to add IOMMU mappings after
> repopulating a page with the populate_physmap hypercall.
>
> This requires that IOMMU mappings get a writeable page type reference count and
> that guests clear any IOMMU mappings before pinning page table pages.
>
>
> Security Implications of allowing domain IOMMU control
> ===============================================================
>
> Xen currently allows IO devices attached to the hardware domain to have direct
> access to all of the MFN address space (except Xen hypervisor memory regions),
> provided the Xen IOMMU option dom0-strict is not enabled.
>
> The PV IOMMU feature provides the same level of access to MFN address space
> and the feature is not enabled when the Xen IOMMU option dom0-strict is
> enabled. Therefore security is not degraded by the PV IOMMU feature.
>
> Domains with physical device(s) assigned which are not hardware domains are only
> allowed to map their own GFNs or GFNs for domain(s) they have privilege over.
>
>
> PV IOMMU interactions with grant map/unmap operations
> =====================================================
>
> Grant map operations return a Physical device accessible address (BFN) if the
> GNTMAP_device_map flag is set. This operation currently returns the MFN for PV
> guests which may conflict with the BFN address space the guest uses if PV IOMMU
> map support is available to the guest.
>
> This design proposes to allow the calling domain to control the BFN address that
> a grant map operation uses.
>
> This can be achieved by specifying that the dev_bus_addr in the
> gnttab_map_grant_ref structure is used as an input parameter instead of the
> output parameter it is currently.
>
> Only PAGE_SIZE aligned addresses are allowed for dev_bus_addr input parameter.
>
> The revised structure is shown below for convenience.
>
>     struct gnttab_map_grant_ref {
>         /* IN parameters. */
>         uint64_t host_addr;
>         uint32_t flags;               /* GNTMAP_* */
>         grant_ref_t ref;
>         domid_t  dom;
>         /* OUT parameters. */
>         int16_t  status;              /* => enum grant_status */
>         grant_handle_t handle;
>         /* IN/OUT parameters */
>         uint64_t dev_bus_addr;
>     };
>
>
> The grant map operation would then behave similarly to the IOMMUOP_map_page
> subop for the creation of the IOMMU mapping.
>
> The grant unmap operation would then behave similarly to the IOMMUOP_unmap_page
> subop for the removal of the IOMMU mapping.
>
> A new grantmap flag would be used to indicate the domain is requesting that the
> dev_bus_addr field is used as an input parameter.
>
>
>     #define _GNTMAP_request_bfn_map      (6)
>     #define GNTMAP_request_bfn_map   (1<<_GNTMAP_request_bfn_map)
>
>
>
> Linux kernel architecture
> =========================
>
> The Linux kernel will use the PV-IOMMU hypercalls to map its PFN address
> space into the IOMMU. It will map the PFNs to the IOMMU address space using
> a 1:1 mapping; it does this by programming a BFN to GFN mapping which matches
> the PFN to GFN mapping.
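
The 1:1 scheme in the paragraph above can be sketched as filling one `IOMMUOP_map_page` element per page with BFN equal to PFN. This is only an illustration under stated assumptions: `build_identity_batch`, the `pfn_to_gfn_fn` callback type and `demo_pfn_to_gfn` are hypothetical names, the structure is trimmed to the `map_page` member, and the toy translation stands in for whatever PFN to GFN lookup the kernel actually uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define IOMMUOP_map_page   2
#define IOMMU_OP_readable  (1u << 0)
#define IOMMU_OP_writeable (1u << 1)

/* Trimmed copy of the draft structure: only the map_page member is kept. */
struct pv_iommu_op {
    uint16_t subop_id;
    uint16_t flags;
    int32_t  status;
    union {
        struct { uint64_t bfn; uint64_t gfn; } map_page;
    } u;
};

/* Hypothetical callback standing in for the kernel's PFN -> GFN lookup. */
typedef uint64_t (*pfn_to_gfn_fn)(uint64_t pfn);

/* Toy translation used only to exercise the sketch; not a real P2M. */
static uint64_t demo_pfn_to_gfn(uint64_t pfn)
{
    return pfn ^ 0x5;
}

/*
 * Fill ops[] so that BFN == PFN for each page in
 * [start_pfn, start_pfn + count), capped at max_ops elements.
 * Returns the number of elements filled.
 */
static size_t build_identity_batch(struct pv_iommu_op *ops, size_t max_ops,
                                   uint64_t start_pfn, size_t count,
                                   pfn_to_gfn_fn pfn_to_gfn)
{
    assert(ops != NULL && pfn_to_gfn != NULL);

    size_t n = count < max_ops ? count : max_ops;
    for (size_t i = 0; i < n; i++) {
        uint64_t pfn = start_pfn + i;

        ops[i].subop_id = IOMMUOP_map_page;
        ops[i].flags = IOMMU_OP_readable | IOMMU_OP_writeable;
        ops[i].status = 0;
        ops[i].u.map_page.bfn = pfn;              /* bus address == PFN */
        ops[i].u.map_page.gfn = pfn_to_gfn(pfn);  /* backing frame */
    }
    return n;
}
```

With such a mapping in place a driver could hand a PFN-derived address straight to the device, which is the property that removes the need for the bounce buffer.
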
>
> The native SWIOTLB will be used to handle devices which cannot DMA to all of
> the kernel's PFN address space.
>
> An interface shall be provided for emulator usage of IOMMUOP_*_foreign_page
> subops which will allow the Linux kernel to centrally manage that domain's BFN
> resource and ensure there are no unexpected conflicts.
>
>
> Emulator usage of PV IOMMU interface
> ====================================
>
> Emulators which require bus address mapping of guest RAM must first determine if
> it's possible for the domain to control the bus addresses themselves.
>
> An IOMMUOP_query_caps subop will return the IOMMU_QUERY_map_cap flag. If this
> flag is set then the emulator may specify the BFN address it wishes guest RAM to
> be mapped to via the IOMMUOP_map_foreign_page subop. If the flag is not set
> then the emulator must use BFN addresses supplied by Xen via the
> IOMMUOP_lookup_foreign_page subop.
>
> Operating systems which use the IOMMUOP_map_page subop are expected to provide a
> common interface for emulators.

According to our previous internal discussions, my understanding of the usage
is this:
1> PV IOMMU has an interface in dom0's kernel to do the query/map/lookup all
at once, which also includes the BFN allocation algorithm.
2> When the XenGT emulator tries to construct a shadow PTE, we can just call
your interface, which returns a BFN either way.

However, the above description seems to say that the XenGT device model needs
to do the query/lookup/map by itself? Besides, could you please give more
detailed information about this 'common interface'? :)

Thanks
Yu

>
> Emulators should unmap unused GFN mappings as often as possible using
> IOMMUOP_unmap_foreign_page subops so that guest domains can balloon pages
> quickly and efficiently.
>
> Emulators should conform to the ballooning behaviour described in section
> "IOMMUOP_*_foreign_page interactions with guest domain ballooning" so that guest
> domains are able to effectively balloon out and in memory.
>
> Emulators must unmap any active BFN mappings when they shut down.
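
A minimal sketch of the emulator-side choice described in the "Emulator usage" section of the draft: test `IOMMU_QUERY_map_cap` in the flags returned by `IOMMUOP_query_caps`, then prepare either a map or a lookup element. The function `emulator_prepare_op` is a hypothetical name, `ioservid_t` is assumed to be a 16-bit id, and the structure is trimmed to the two members used here. Whether this logic lives in the emulator or behind the kernel's common interface is exactly the open question raised above; the sketch does not settle it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: ioservid_t is a 16-bit identifier. */
typedef uint16_t ioservid_t;

#define IOMMUOP_map_foreign_page    4
#define IOMMUOP_lookup_foreign_page 5

/* Bit 0 of the query_caps flags, per the draft's table. */
#define IOMMU_QUERY_map_cap  (1u << 0)

/* Flag bits for foreign mappings, per the draft's table. */
#define IOMMUOP_readable     (1u << 0)
#define IOMMUOP_writeable    (1u << 1)

/* Trimmed copy of the draft structure: only the members used below. */
struct pv_iommu_op {
    uint16_t subop_id;
    uint16_t flags;
    int32_t  status;
    union {
        struct {
            uint64_t bfn;
            uint64_t gfn;
            uint16_t domid;
            ioservid_t ioserver;
        } map_foreign_page;
        struct {
            uint64_t bfn;
            uint64_t gfn;
            uint16_t domid;
            ioservid_t ioserver;
        } lookup_foreign_page;
    } u;
};

/*
 * Prepare one op for obtaining a bus address for guest RAM.
 * If the caller may choose bus addresses, request a mapping at chosen_bfn;
 * otherwise prepare a lookup and leave bfn for Xen to fill in.
 */
static void emulator_prepare_op(struct pv_iommu_op *op, uint16_t caps,
                                uint16_t domid, ioservid_t ioserver,
                                uint64_t gfn, uint64_t chosen_bfn)
{
    assert(op != NULL);
    memset(op, 0, sizeof(*op));

    if (caps & IOMMU_QUERY_map_cap) {
        op->subop_id = IOMMUOP_map_foreign_page;
        op->flags = IOMMUOP_readable | IOMMUOP_writeable;
        op->u.map_foreign_page.bfn = chosen_bfn;
        op->u.map_foreign_page.gfn = gfn;
        op->u.map_foreign_page.domid = domid;
        op->u.map_foreign_page.ioserver = ioserver;
    } else {
        op->subop_id = IOMMUOP_lookup_foreign_page;
        /* bfn and flags are outputs for the lookup subop. */
        op->u.lookup_foreign_page.gfn = gfn;
        op->u.lookup_foreign_page.domid = domid;
        op->u.lookup_foreign_page.ioserver = ioserver;
    }
}
```
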