* [PATCH 0/6] IOMMU, vtd and iotlb flush rework (v4)
@ 2011-11-08 20:04 Jean Guyader
2011-11-08 20:04 ` [PATCH 1/6] vtd: Refactor iotlb flush code Jean Guyader
0 siblings, 1 reply; 8+ messages in thread
From: Jean Guyader @ 2011-11-08 20:04 UTC (permalink / raw)
To: xen-devel; +Cc: allen.m.kay, tim
In one of my previous emails I detailed a bug I was seeing when passing
through an Intel GPU to a guest that has more than 4G of RAM.
Allen suggested that I go for Plan B, but after a discussion with Tim
we agreed that Plan B was far too disruptive in terms of code changes.
This patch series implements Plan A.
http://xen.1045712.n5.nabble.com/VTD-Intel-iommu-IOTLB-flush-really-slow-td4952866.html
Changes between v3 and v4:
- Move the loop for gmfn_range from arch_memory_op to xenmem_add_to_physmap.
- Add a comment to explain the purpose of iommu_dont_flush_iotlb.
Changes between v2 and v3:
- Check for the presence of the iotlb_flush_all callback before calling it.
Changes between v1 and v2:
- Move .size in struct xen_add_to_physmap into the padding between .domid and .space.
- Store iommu_dont_flush per CPU.
- Change the code in hvmloader to relocate in batches of up to 64K pages; .size is now 16 bits.
tools/firmware/hvmloader/pci.c | 20 +++++--
xen/arch/x86/mm.c | 82 ++++++++++++++++++++++------
xen/arch/x86/x86_64/compat/mm.c | 4 ++
xen/drivers/passthrough/iommu.c | 25 +++++++++
xen/drivers/passthrough/vtd/iommu.c | 100 ++++++++++++++++++++---------------
xen/include/public/memory.h | 4 ++
xen/include/xen/iommu.h | 17 ++++++
7 files changed, 186 insertions(+), 66 deletions(-)
Jean Guyader (6):
vtd: Refactor iotlb flush code
iommu: Introduce iommu_flush and iommu_flush_all.
add_to_physmap: Move the code for XENMEM_add_to_physmap.
mm: New XENMEM space, XENMAPSPACE_gmfn_range
hvmloader: Change memory relocation loop when overlap with PCI hole.
Introduce per cpu flag (iommu_dont_flush_iotlb) to avoid unnecessary iotlb flush
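
To make the intent of the series concrete before the individual patches,
here is a minimal sketch of the guest-side pattern that patches 4/6 and
5/6 combine to enable: relocating a run of RAM pages above 4G with one
batched hypercall per chunk instead of one hypercall per page. The
function relocate_ram_range() and its parameters are illustrative names,
not code from the series; struct xen_add_to_physmap, XENMAPSPACE_gmfn_range
and the hvmloader hypercall wrapper are the pieces the patches provide.

/*
 * Illustrative sketch (not part of the series): relocate the pages
 * [pfn, pfn + count) to the guest frames starting at high_gpfn, in
 * chunks of at most 64K pages because .size is a 16-bit field.
 */
static void relocate_ram_range(unsigned long pfn, unsigned long count,
                               unsigned long high_gpfn)
{
    while ( count > 0 )
    {
        struct xen_add_to_physmap xatp;
        unsigned long chunk = count > 0xffffUL ? 0xffffUL : count;

        xatp.domid = DOMID_SELF;
        xatp.space = XENMAPSPACE_gmfn_range;  /* new space, patch 4/6 */
        xatp.idx   = pfn;                     /* source gmfn */
        xatp.gpfn  = high_gpfn;               /* destination gpfn */
        xatp.size  = chunk;                   /* pages in this batch */

        /* Xen loops over the range internally, with preemption checks. */
        if ( hypercall_memory_op(XENMEM_add_to_physmap, &xatp) != 0 )
            BUG();

        pfn       += chunk;
        high_gpfn += chunk;
        count     -= chunk;
    }
}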
* [PATCH 1/6] vtd: Refactor iotlb flush code
  2011-11-08 20:04 [PATCH 0/6] IOMMU, vtd and iotlb flush rework (v4) Jean Guyader
@ 2011-11-08 20:04 ` Jean Guyader
  2011-11-08 20:04   ` [PATCH 2/6] iommu: Introduce iommu_flush and iommu_flush_all Jean Guyader
  0 siblings, 1 reply; 8+ messages in thread
From: Jean Guyader @ 2011-11-08 20:04 UTC (permalink / raw)
  To: xen-devel; +Cc: allen.m.kay, tim, Jean Guyader

[-- Attachment #1: Type: text/plain, Size: 279 bytes --]

Factor the iotlb flush code from map_page and unmap_page out into its
own function.

Signed-off-by: Jean Guyader <jean.guyader@eu.citrix.com>
---
 xen/drivers/passthrough/vtd/iommu.c |   86 +++++++++++++++++-----------------
 1 files changed, 43 insertions(+), 43 deletions(-)

[-- Attachment #2: 0001-vtd-Refactor-iotlb-flush-code.patch --]
[-- Type: text/x-patch; name="0001-vtd-Refactor-iotlb-flush-code.patch", Size: 4122 bytes --]

diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index 7717ab4..5a5b6be 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -578,16 +578,53 @@ static void iommu_flush_all(void)
     }
 }
 
+static void __intel_iommu_iotlb_flush(struct domain *d, unsigned long gfn,
+                                      int dma_old_pte_present,
+                                      unsigned int page_count)
+{
+    struct hvm_iommu *hd = domain_hvm_iommu(d);
+    struct acpi_drhd_unit *drhd;
+    struct iommu *iommu;
+    int flush_dev_iotlb;
+    int iommu_domid;
+
+    /*
+     * No need for pcidevs_lock here because we flush
+     * when devices are assigned/deassigned.
+     */
+    for_each_drhd_unit ( drhd )
+    {
+        iommu = drhd->iommu;
+
+        if ( !test_bit(iommu->index, &hd->iommu_bitmap) )
+            continue;
+
+        flush_dev_iotlb = find_ats_dev_drhd(iommu) ? 1 : 0;
+        iommu_domid = domain_iommu_domid(d, iommu);
+        if ( iommu_domid == -1 )
+            continue;
+
+        if ( page_count > 1 || gfn == -1 )
+        {
+            if ( iommu_flush_iotlb_dsi(iommu, iommu_domid,
+                                       0, flush_dev_iotlb) )
+                iommu_flush_write_buffer(iommu);
+        }
+        else
+        {
+            if ( iommu_flush_iotlb_psi(iommu, iommu_domid,
+                                       (paddr_t)gfn << PAGE_SHIFT_4K, 0,
+                                       !dma_old_pte_present, flush_dev_iotlb) )
+                iommu_flush_write_buffer(iommu);
+        }
+    }
+}
+
 /* clear one page's page table */
 static void dma_pte_clear_one(struct domain *domain, u64 addr)
 {
     struct hvm_iommu *hd = domain_hvm_iommu(domain);
-    struct acpi_drhd_unit *drhd;
-    struct iommu *iommu;
     struct dma_pte *page = NULL, *pte = NULL;
     u64 pg_maddr;
-    int flush_dev_iotlb;
-    int iommu_domid;
     struct mapped_rmrr *mrmrr;
 
     spin_lock(&hd->mapping_lock);
@@ -613,21 +650,7 @@ static void dma_pte_clear_one(struct domain *domain, u64 addr)
     spin_unlock(&hd->mapping_lock);
     iommu_flush_cache_entry(pte, sizeof(struct dma_pte));
 
-    /* No need pcidevs_lock here since do that on assign/deassign device*/
-    for_each_drhd_unit ( drhd )
-    {
-        iommu = drhd->iommu;
-        if ( test_bit(iommu->index, &hd->iommu_bitmap) )
-        {
-            flush_dev_iotlb = find_ats_dev_drhd(iommu) ? 1 : 0;
-            iommu_domid = domain_iommu_domid(domain, iommu);
-            if ( iommu_domid == -1 )
-                continue;
-            if ( iommu_flush_iotlb_psi(iommu, iommu_domid, addr,
-                                       0, 0, flush_dev_iotlb) )
-                iommu_flush_write_buffer(iommu);
-        }
-    }
+    __intel_iommu_iotlb_flush(domain, addr >> PAGE_SHIFT_4K, 0, 1);
 
     unmap_vtd_domain_page(page);
 
@@ -1677,12 +1700,8 @@ static int intel_iommu_map_page(
     unsigned int flags)
 {
     struct hvm_iommu *hd = domain_hvm_iommu(d);
-    struct acpi_drhd_unit *drhd;
-    struct iommu *iommu;
     struct dma_pte *page = NULL, *pte = NULL, old, new = { 0 };
     u64 pg_maddr;
-    int flush_dev_iotlb;
-    int iommu_domid;
 
     /* Do nothing if VT-d shares EPT page table */
     if ( iommu_use_hap_pt(d) )
@@ -1724,26 +1743,7 @@ static int intel_iommu_map_page(
     spin_unlock(&hd->mapping_lock);
     unmap_vtd_domain_page(page);
 
-    /*
-     * No need for pcidevs_lock here because we flush
-     * when devices are assigned/deassigned.
-     */
-    for_each_drhd_unit ( drhd )
-    {
-        iommu = drhd->iommu;
-
-        if ( !test_bit(iommu->index, &hd->iommu_bitmap) )
-            continue;
-
-        flush_dev_iotlb = find_ats_dev_drhd(iommu) ? 1 : 0;
-        iommu_domid = domain_iommu_domid(d, iommu);
-        if ( iommu_domid == -1 )
-            continue;
-        if ( iommu_flush_iotlb_psi(iommu, iommu_domid,
-                                   (paddr_t)gfn << PAGE_SHIFT_4K, 0,
-                                   !dma_pte_present(old), flush_dev_iotlb) )
-            iommu_flush_write_buffer(iommu);
-    }
+    __intel_iommu_iotlb_flush(d, gfn, dma_pte_present(old), 1);
 
     return 0;
 }
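
One behavioral detail of the refactor worth noting: the new helper picks
the invalidation granularity from its arguments. A condensed sketch of
that decision, extracted from __intel_iommu_iotlb_flush() above for
illustration (the surrounding per-IOMMU loop is elided):

/* Condensed from __intel_iommu_iotlb_flush(); illustrative only. */
int rc;
if ( page_count > 1 || gfn == -1 )
    /* Many pages, or the whole domain: domain-selective invalidation. */
    rc = iommu_flush_iotlb_dsi(iommu, iommu_domid, 0, flush_dev_iotlb);
else
    /* Exactly one page: page-selective invalidation of that gfn. */
    rc = iommu_flush_iotlb_psi(iommu, iommu_domid,
                               (paddr_t)gfn << PAGE_SHIFT_4K, 0,
                               !dma_old_pte_present, flush_dev_iotlb);
if ( rc )
    iommu_flush_write_buffer(iommu);

Callers flushing a whole range therefore pass page_count > 1 and get a
single domain-selective flush instead of one flush per page.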
* [PATCH 2/6] iommu: Introduce iommu_flush and iommu_flush_all.
  2011-11-08 20:04 ` [PATCH 1/6] vtd: Refactor iotlb flush code Jean Guyader
@ 2011-11-08 20:04   ` Jean Guyader
  2011-11-08 20:04     ` [PATCH 3/6] add_to_physmap: Move the code for XENMEM_add_to_physmap Jean Guyader
  0 siblings, 1 reply; 8+ messages in thread
From: Jean Guyader @ 2011-11-08 20:04 UTC (permalink / raw)
  To: xen-devel; +Cc: allen.m.kay, tim, Jean Guyader

[-- Attachment #1: Type: text/plain, Size: 286 bytes --]

Signed-off-by: Jean Guyader <jean.guyader@eu.citrix.com>
---
 xen/drivers/passthrough/iommu.c     |   20 ++++++++++++++++++++
 xen/drivers/passthrough/vtd/iommu.c |   12 ++++++++++++
 xen/include/xen/iommu.h             |    5 +++++
 3 files changed, 37 insertions(+), 0 deletions(-)

[-- Attachment #2: 0002-iommu-Introduce-iommu_flush-and-iommu_flush_all.patch --]
[-- Type: text/x-patch; name="0002-iommu-Introduce-iommu_flush-and-iommu_flush_all.patch", Size: 2780 bytes --]

diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index cd6174d..ca7b37b 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -301,6 +301,26 @@ int iommu_unmap_page(struct domain *d, unsigned long gfn)
     return hd->platform_ops->unmap_page(d, gfn);
 }
 
+void iommu_iotlb_flush(struct domain *d, unsigned long gfn, unsigned int page_count)
+{
+    struct hvm_iommu *hd = domain_hvm_iommu(d);
+
+    if ( !iommu_enabled || !hd->platform_ops || !hd->platform_ops->iotlb_flush )
+        return;
+
+    hd->platform_ops->iotlb_flush(d, gfn, page_count);
+}
+
+void iommu_iotlb_flush_all(struct domain *d)
+{
+    struct hvm_iommu *hd = domain_hvm_iommu(d);
+
+    if ( !iommu_enabled || !hd->platform_ops || !hd->platform_ops->iotlb_flush_all )
+        return;
+
+    hd->platform_ops->iotlb_flush_all(d);
+}
+
 /* caller should hold the pcidevs_lock */
 int deassign_device(struct domain *d, u16 seg, u8 bus, u8 devfn)
 {
diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index 5a5b6be..7ec9541 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -619,6 +619,16 @@ static void __intel_iommu_iotlb_flush(struct domain *d, unsigned long gfn,
     }
 }
 
+static void intel_iommu_iotlb_flush(struct domain *d, unsigned long gfn, unsigned int page_count)
+{
+    __intel_iommu_iotlb_flush(d, gfn, 1, page_count);
+}
+
+static void intel_iommu_iotlb_flush_all(struct domain *d)
+{
+    __intel_iommu_iotlb_flush(d, 0, 0, 0);
+}
+
 /* clear one page's page table */
 static void dma_pte_clear_one(struct domain *domain, u64 addr)
 {
@@ -2329,6 +2339,8 @@ const struct iommu_ops intel_iommu_ops = {
     .resume = vtd_resume,
     .share_p2m = iommu_set_pgd,
     .crash_shutdown = vtd_crash_shutdown,
+    .iotlb_flush = intel_iommu_iotlb_flush,
+    .iotlb_flush_all = intel_iommu_iotlb_flush_all,
 };
 
 /*
diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index 837e60d..a1034df 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -139,6 +139,8 @@ struct iommu_ops {
     void (*resume)(void);
     void (*share_p2m)(struct domain *d);
     void (*crash_shutdown)(void);
+    void (*iotlb_flush)(struct domain *d, unsigned long gfn, unsigned int page_count);
+    void (*iotlb_flush_all)(struct domain *d);
 };
 
 void iommu_update_ire_from_apic(unsigned int apic, unsigned int reg, unsigned int value);
@@ -155,4 +157,7 @@ void iommu_share_p2m_table(struct domain *d);
 
 int iommu_do_domctl(struct xen_domctl *, XEN_GUEST_HANDLE(xen_domctl_t));
 
+void iommu_iotlb_flush(struct domain *d, unsigned long gfn, unsigned int page_count);
+void iommu_iotlb_flush_all(struct domain *d);
+
 #endif /* _IOMMU_H_ */
* [PATCH 3/6] add_to_physmap: Move the code for XENMEM_add_to_physmap.
  2011-11-08 20:04   ` [PATCH 2/6] iommu: Introduce iommu_flush and iommu_flush_all Jean Guyader
@ 2011-11-08 20:04     ` Jean Guyader
  2011-11-08 20:04       ` [PATCH 4/6] mm: New XENMEM space, XENMAPSPACE_gmfn_range Jean Guyader
  0 siblings, 1 reply; 8+ messages in thread
From: Jean Guyader @ 2011-11-08 20:04 UTC (permalink / raw)
  To: xen-devel; +Cc: allen.m.kay, tim, Jean Guyader

[-- Attachment #1: Type: text/plain, Size: 293 bytes --]

Move the code for the XENMEM_add_to_physmap case into its own
function (xenmem_add_to_physmap).

Signed-off-by: Jean Guyader <jean.guyader@eu.citrix.com>
---
 xen/arch/x86/mm.c |  188 ++++++++++++++++++++++++++++-------------------------
 1 files changed, 99 insertions(+), 89 deletions(-)

[-- Attachment #2: 0003-add_to_physmap-Move-the-code-for-XENMEM_add_to_physm.patch --]
[-- Type: text/x-patch; name="0003-add_to_physmap-Move-the-code-for-XENMEM_add_to_physm.patch", Size: 7065 bytes --]

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index acc1f34..f75011e 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -4592,119 +4592,129 @@ static int handle_iomem_range(unsigned long s, unsigned long e, void *p)
     return 0;
 }
 
-long arch_memory_op(int op, XEN_GUEST_HANDLE(void) arg)
+static int xenmem_add_to_physmap(struct domain *d, struct xen_add_to_physmap xatp)
 {
     struct page_info *page = NULL;
+    unsigned long prev_mfn, mfn = 0, gpfn;
     int rc;
 
-    switch ( op )
-    {
-    case XENMEM_add_to_physmap:
+    switch ( xatp.space )
     {
-        struct xen_add_to_physmap xatp;
-        unsigned long prev_mfn, mfn = 0, gpfn;
-        struct domain *d;
-
-        if ( copy_from_guest(&xatp, arg, 1) )
-            return -EFAULT;
+    case XENMAPSPACE_shared_info:
+        if ( xatp.idx == 0 )
+            mfn = virt_to_mfn(d->shared_info);
+        break;
+    case XENMAPSPACE_grant_table:
+        spin_lock(&d->grant_table->lock);
 
-        rc = rcu_lock_target_domain_by_id(xatp.domid, &d);
-        if ( rc != 0 )
-            return rc;
+        if ( d->grant_table->gt_version == 0 )
+            d->grant_table->gt_version = 1;
 
-        if ( xsm_add_to_physmap(current->domain, d) )
+        if ( d->grant_table->gt_version == 2 &&
+             (xatp.idx & XENMAPIDX_grant_table_status) )
         {
-            rcu_unlock_domain(d);
-            return -EPERM;
+            xatp.idx &= ~XENMAPIDX_grant_table_status;
+            if ( xatp.idx < nr_status_frames(d->grant_table) )
+                mfn = virt_to_mfn(d->grant_table->status[xatp.idx]);
+        }
+        else
+        {
+            if ( (xatp.idx >= nr_grant_frames(d->grant_table)) &&
+                 (xatp.idx < max_nr_grant_frames) )
+                gnttab_grow_table(d, xatp.idx + 1);
+
+            if ( xatp.idx < nr_grant_frames(d->grant_table) )
+                mfn = virt_to_mfn(d->grant_table->shared_raw[xatp.idx]);
         }
 
-        switch ( xatp.space )
+        spin_unlock(&d->grant_table->lock);
+        break;
+    case XENMAPSPACE_gmfn:
+    {
+        p2m_type_t p2mt;
+
+        xatp.idx = mfn_x(gfn_to_mfn_unshare(d, xatp.idx, &p2mt));
+        /* If the page is still shared, exit early */
+        if ( p2m_is_shared(p2mt) )
         {
-        case XENMAPSPACE_shared_info:
-            if ( xatp.idx == 0 )
-                mfn = virt_to_mfn(d->shared_info);
+            rcu_unlock_domain(d);
+            return -ENOMEM;
+        }
+        if ( !get_page_from_pagenr(xatp.idx, d) )
             break;
-        case XENMAPSPACE_grant_table:
-            spin_lock(&d->grant_table->lock);
+        mfn = xatp.idx;
+        page = mfn_to_page(mfn);
+        break;
+    }
+    default:
+        break;
+    }
 
-            if ( d->grant_table->gt_version == 0 )
-                d->grant_table->gt_version = 1;
+    if ( !paging_mode_translate(d) || (mfn == 0) )
+    {
+        if ( page )
+            put_page(page);
+        rcu_unlock_domain(d);
+        return -EINVAL;
+    }
 
-            if ( d->grant_table->gt_version == 2 &&
-                 (xatp.idx & XENMAPIDX_grant_table_status) )
-            {
-                xatp.idx &= ~XENMAPIDX_grant_table_status;
-                if ( xatp.idx < nr_status_frames(d->grant_table) )
-                    mfn = virt_to_mfn(d->grant_table->status[xatp.idx]);
-            }
-            else
-            {
-                if ( (xatp.idx >= nr_grant_frames(d->grant_table)) &&
-                     (xatp.idx < max_nr_grant_frames) )
-                    gnttab_grow_table(d, xatp.idx + 1);
+    domain_lock(d);
 
-                if ( xatp.idx < nr_grant_frames(d->grant_table) )
-                    mfn = virt_to_mfn(d->grant_table->shared_raw[xatp.idx]);
-            }
+    if ( page )
+        put_page(page);
 
-            spin_unlock(&d->grant_table->lock);
-            break;
-        case XENMAPSPACE_gmfn:
-        {
-            p2m_type_t p2mt;
+    /* Remove previously mapped page if it was present. */
+    prev_mfn = gmfn_to_mfn(d, xatp.gpfn);
+    if ( mfn_valid(prev_mfn) )
+    {
+        if ( is_xen_heap_mfn(prev_mfn) )
+            /* Xen heap frames are simply unhooked from this phys slot. */
+            guest_physmap_remove_page(d, xatp.gpfn, prev_mfn, 0);
+        else
+            /* Normal domain memory is freed, to avoid leaking memory. */
+            guest_remove_page(d, xatp.gpfn);
+    }
 
-            xatp.idx = mfn_x(gfn_to_mfn_unshare(d, xatp.idx, &p2mt));
-            /* If the page is still shared, exit early */
-            if ( p2m_is_shared(p2mt) )
-            {
-                rcu_unlock_domain(d);
-                return -ENOMEM;
-            }
-            if ( !get_page_from_pagenr(xatp.idx, d) )
-                break;
-            mfn = xatp.idx;
-            page = mfn_to_page(mfn);
-            break;
-        }
-        default:
-            break;
-        }
+    /* Unmap from old location, if any. */
+    gpfn = get_gpfn_from_mfn(mfn);
+    ASSERT( gpfn != SHARED_M2P_ENTRY );
+    if ( gpfn != INVALID_M2P_ENTRY )
+        guest_physmap_remove_page(d, gpfn, mfn, 0);
 
-        if ( !paging_mode_translate(d) || (mfn == 0) )
-        {
-            if ( page )
-                put_page(page);
-            rcu_unlock_domain(d);
-            return -EINVAL;
-        }
+    /* Map at new location. */
+    rc = guest_physmap_add_page(d, xatp.gpfn, mfn, 0);
 
-        domain_lock(d);
+    domain_unlock(d);
 
-        if ( page )
-            put_page(page);
+    return rc;
+}
 
-        /* Remove previously mapped page if it was present. */
-        prev_mfn = gmfn_to_mfn(d, xatp.gpfn);
-        if ( mfn_valid(prev_mfn) )
-        {
-            if ( is_xen_heap_mfn(prev_mfn) )
-                /* Xen heap frames are simply unhooked from this phys slot. */
-                guest_physmap_remove_page(d, xatp.gpfn, prev_mfn, 0);
-            else
-                /* Normal domain memory is freed, to avoid leaking memory. */
-                guest_remove_page(d, xatp.gpfn);
-        }
-        /* Unmap from old location, if any. */
-        gpfn = get_gpfn_from_mfn(mfn);
-        ASSERT( gpfn != SHARED_M2P_ENTRY );
-        if ( gpfn != INVALID_M2P_ENTRY )
-            guest_physmap_remove_page(d, gpfn, mfn, 0);
+long arch_memory_op(int op, XEN_GUEST_HANDLE(void) arg)
+{
+    int rc;
+
+    switch ( op )
+    {
+    case XENMEM_add_to_physmap:
+    {
+        struct xen_add_to_physmap xatp;
+        struct domain *d;
 
-        /* Map at new location. */
-        rc = guest_physmap_add_page(d, xatp.gpfn, mfn, 0);
+        if ( copy_from_guest(&xatp, arg, 1) )
+            return -EFAULT;
+
+        rc = rcu_lock_target_domain_by_id(xatp.domid, &d);
+        if ( rc != 0 )
+            return rc;
+
+        if ( xsm_add_to_physmap(current->domain, d) )
+        {
+            rcu_unlock_domain(d);
+            return -EPERM;
+        }
 
-        domain_unlock(d);
+        xenmem_add_to_physmap(d, xatp);
 
         rcu_unlock_domain(d);
* [PATCH 4/6] mm: New XENMEM space, XENMAPSPACE_gmfn_range
  2011-11-08 20:04     ` [PATCH 3/6] add_to_physmap: Move the code for XENMEM_add_to_physmap Jean Guyader
@ 2011-11-08 20:04       ` Jean Guyader
  2011-11-08 20:04         ` [PATCH 5/6] hvmloader: Change memory relocation loop when overlap with PCI hole Jean Guyader
  2011-11-09 10:09         ` [PATCH 4/6] mm: New XENMEM space, XENMAPSPACE_gmfn_range Tim Deegan
  0 siblings, 2 replies; 8+ messages in thread
From: Jean Guyader @ 2011-11-08 20:04 UTC (permalink / raw)
  To: xen-devel; +Cc: allen.m.kay, tim, Jean Guyader

[-- Attachment #1: Type: text/plain, Size: 562 bytes --]

XENMAPSPACE_gmfn_range is like XENMAPSPACE_gmfn, but it operates on a
range of pages. The size of the range is defined in a new field, .size,
which is located in the 16-bit padding between .domid and .space in
struct xen_add_to_physmap so the structure stays compatible with older
versions.

Signed-off-by: Jean Guyader <jean.guyader@eu.citrix.com>
---
 xen/arch/x86/mm.c               |  164 ++++++++++++++++++++++-----------------
 xen/arch/x86/x86_64/compat/mm.c |    4 +
 xen/include/public/memory.h     |    4 +
 3 files changed, 102 insertions(+), 70 deletions(-)

[-- Attachment #2: 0004-mm-New-XENMEM-space-XENMAPSPACE_gmfn_range.patch --]
[-- Type: text/x-patch; name="0004-mm-New-XENMEM-space-XENMAPSPACE_gmfn_range.patch", Size: 7674 bytes --]

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index f75011e..2417fe9 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -4598,93 +4598,112 @@ static int xenmem_add_to_physmap(struct domain *d, struct xen_add_to_physmap xat
     unsigned long prev_mfn, mfn = 0, gpfn;
     int rc;
 
-    switch ( xatp.space )
-    {
-    case XENMAPSPACE_shared_info:
-        if ( xatp.idx == 0 )
-            mfn = virt_to_mfn(d->shared_info);
-        break;
-    case XENMAPSPACE_grant_table:
-        spin_lock(&d->grant_table->lock);
+    if ( xatp.space != XENMAPSPACE_gmfn_range )
+        xatp.size = 1;
 
-        if ( d->grant_table->gt_version == 0 )
-            d->grant_table->gt_version = 1;
-
-        if ( d->grant_table->gt_version == 2 &&
-             (xatp.idx & XENMAPIDX_grant_table_status) )
+    while ( xatp.size > 0 )
+    {
+        if ( hypercall_preempt_check() )
         {
-            xatp.idx &= ~XENMAPIDX_grant_table_status;
-            if ( xatp.idx < nr_status_frames(d->grant_table) )
-                mfn = virt_to_mfn(d->grant_table->status[xatp.idx]);
+            rc = -EAGAIN;
+            break;
         }
-        else
+
+        switch ( xatp.space )
         {
-            if ( (xatp.idx >= nr_grant_frames(d->grant_table)) &&
-                 (xatp.idx < max_nr_grant_frames) )
-                gnttab_grow_table(d, xatp.idx + 1);
+        case XENMAPSPACE_shared_info:
+            if ( xatp.idx == 0 )
+                mfn = virt_to_mfn(d->shared_info);
+            break;
+        case XENMAPSPACE_grant_table:
+            spin_lock(&d->grant_table->lock);
 
-            if ( xatp.idx < nr_grant_frames(d->grant_table) )
-                mfn = virt_to_mfn(d->grant_table->shared_raw[xatp.idx]);
-        }
+            if ( d->grant_table->gt_version == 0 )
+                d->grant_table->gt_version = 1;
 
-        spin_unlock(&d->grant_table->lock);
-        break;
-    case XENMAPSPACE_gmfn:
-    {
-        p2m_type_t p2mt;
+            if ( d->grant_table->gt_version == 2 &&
+                 (xatp.idx & XENMAPIDX_grant_table_status) )
+            {
+                xatp.idx &= ~XENMAPIDX_grant_table_status;
+                if ( xatp.idx < nr_status_frames(d->grant_table) )
+                    mfn = virt_to_mfn(d->grant_table->status[xatp.idx]);
+            }
+            else
+            {
+                if ( (xatp.idx >= nr_grant_frames(d->grant_table)) &&
+                     (xatp.idx < max_nr_grant_frames) )
+                    gnttab_grow_table(d, xatp.idx + 1);
+
+                if ( xatp.idx < nr_grant_frames(d->grant_table) )
+                    mfn = virt_to_mfn(d->grant_table->shared_raw[xatp.idx]);
+            }
 
-        xatp.idx = mfn_x(gfn_to_mfn_unshare(d, xatp.idx, &p2mt));
-        /* If the page is still shared, exit early */
-        if ( p2m_is_shared(p2mt) )
+            spin_unlock(&d->grant_table->lock);
+            break;
+        case XENMAPSPACE_gmfn:
         {
-            rcu_unlock_domain(d);
-            return -ENOMEM;
+            p2m_type_t p2mt;
+
+            xatp.idx = mfn_x(gfn_to_mfn_unshare(d, xatp.idx, &p2mt));
+            /* If the page is still shared, exit early */
+            if ( p2m_is_shared(p2mt) )
+            {
+                rcu_unlock_domain(d);
+                return -ENOMEM;
+            }
+            if ( !get_page_from_pagenr(xatp.idx, d) )
+                break;
+            mfn = xatp.idx;
+            page = mfn_to_page(mfn);
+            break;
         }
-        if ( !get_page_from_pagenr(xatp.idx, d) )
+        default:
             break;
-        mfn = xatp.idx;
-        page = mfn_to_page(mfn);
-        break;
-    }
-    default:
-        break;
-    }
+        }
+
+        if ( !paging_mode_translate(d) || (mfn == 0) )
+        {
+            if ( page )
+                put_page(page);
+            rcu_unlock_domain(d);
+            return -EINVAL;
+        }
+
+        domain_lock(d);
 
-    if ( !paging_mode_translate(d) || (mfn == 0) )
-    {
         if ( page )
             put_page(page);
-        rcu_unlock_domain(d);
-        return -EINVAL;
-    }
 
-    domain_lock(d);
+        /* Remove previously mapped page if it was present. */
+        prev_mfn = gmfn_to_mfn(d, xatp.gpfn);
+        if ( mfn_valid(prev_mfn) )
+        {
+            if ( is_xen_heap_mfn(prev_mfn) )
+                /* Xen heap frames are simply unhooked from this phys slot. */
+                guest_physmap_remove_page(d, xatp.gpfn, prev_mfn, 0);
+            else
+                /* Normal domain memory is freed, to avoid leaking memory. */
+                guest_remove_page(d, xatp.gpfn);
+        }
 
-    if ( page )
-        put_page(page);
+        /* Unmap from old location, if any. */
+        gpfn = get_gpfn_from_mfn(mfn);
+        ASSERT( gpfn != SHARED_M2P_ENTRY );
+        if ( gpfn != INVALID_M2P_ENTRY )
+            guest_physmap_remove_page(d, gpfn, mfn, 0);
 
-    /* Remove previously mapped page if it was present. */
-    prev_mfn = gmfn_to_mfn(d, xatp.gpfn);
-    if ( mfn_valid(prev_mfn) )
-    {
-        if ( is_xen_heap_mfn(prev_mfn) )
-            /* Xen heap frames are simply unhooked from this phys slot. */
-            guest_physmap_remove_page(d, xatp.gpfn, prev_mfn, 0);
-        else
-            /* Normal domain memory is freed, to avoid leaking memory. */
-            guest_remove_page(d, xatp.gpfn);
-    }
+        /* Map at new location. */
+        rc = guest_physmap_add_page(d, xatp.gpfn, mfn, 0);
 
-    /* Unmap from old location, if any. */
-    gpfn = get_gpfn_from_mfn(mfn);
-    ASSERT( gpfn != SHARED_M2P_ENTRY );
-    if ( gpfn != INVALID_M2P_ENTRY )
-        guest_physmap_remove_page(d, gpfn, mfn, 0);
+        domain_unlock(d);
 
-    /* Map at new location. */
-    rc = guest_physmap_add_page(d, xatp.gpfn, mfn, 0);
+        xatp.idx++;
+        xatp.gpfn++;
+        xatp.size--;
 
-    domain_unlock(d);
+        if ( rc != 0 )
+            break;
+    }
 
     return rc;
 }
@@ -4714,10 +4733,15 @@ long arch_memory_op(int op, XEN_GUEST_HANDLE(void) arg)
             return -EPERM;
         }
 
-        xenmem_add_to_physmap(d, xatp);
-
+        rc = xenmem_add_to_physmap(d, xatp);
         rcu_unlock_domain(d);
 
+        if ( rc == -EAGAIN )
+        {
+            rc = hypercall_create_continuation(
+                __HYPERVISOR_memory_op, "ih", op, arg);
+        }
+
         return rc;
 
diff --git a/xen/arch/x86/x86_64/compat/mm.c b/xen/arch/x86/x86_64/compat/mm.c
index 2c05099..05c3098 100644
--- a/xen/arch/x86/x86_64/compat/mm.c
+++ b/xen/arch/x86/x86_64/compat/mm.c
@@ -63,6 +63,10 @@ int compat_arch_memory_op(int op, XEN_GUEST_HANDLE(void) arg)
         XLAT_add_to_physmap(nat, &cmp);
         rc = arch_memory_op(op, guest_handle_from_ptr(nat, void));
 
+        XLAT_add_to_physmap(&cmp, nat);
+
+        if ( copy_to_guest(arg, &cmp, 1) )
+            return -EFAULT;
         break;
     }
 
diff --git a/xen/include/public/memory.h b/xen/include/public/memory.h
index 08355e3..c5b78a8 100644
--- a/xen/include/public/memory.h
+++ b/xen/include/public/memory.h
@@ -208,10 +208,14 @@ struct xen_add_to_physmap {
     /* Which domain to change the mapping for. */
     domid_t domid;
 
+    /* Number of pages to go through for gmfn_range */
+    uint16_t size;
+
     /* Source mapping space. */
 #define XENMAPSPACE_shared_info 0 /* shared info page */
 #define XENMAPSPACE_grant_table 1 /* grant table page */
 #define XENMAPSPACE_gmfn        2 /* GMFN */
+#define XENMAPSPACE_gmfn_range  3 /* GMFN range */
     unsigned int space;
 
 #define XENMAPIDX_grant_table_status 0x80000000
* [PATCH 5/6] hvmloader: Change memory relocation loop when overlap with PCI hole.
  2011-11-08 20:04       ` [PATCH 4/6] mm: New XENMEM space, XENMAPSPACE_gmfn_range Jean Guyader
@ 2011-11-08 20:04         ` Jean Guyader
  2011-11-08 20:04           ` [PATCH 6/6] Introduce per cpu flag (iommu_dont_flush_iotlb) to avoid unnecessary iotlb flush Jean Guyader
  2011-11-09 10:09           ` [PATCH 4/6] mm: New XENMEM space, XENMAPSPACE_gmfn_range Tim Deegan
  1 sibling, 1 reply; 8+ messages in thread
From: Jean Guyader @ 2011-11-08 20:04 UTC (permalink / raw)
  To: xen-devel; +Cc: allen.m.kay, tim, Jean Guyader

[-- Attachment #1: Type: text/plain, Size: 715 bytes --]

Change the way we relocate memory pages when they overlap with the PCI
hole: use the new map space (XENMAPSPACE_gmfn_range) to move the loop
into Xen.

This code is usually triggered when a device is passed through to a
guest and the PCI hole has to be extended to have enough room to map
the device BARs. The PCI hole then starts lower and may overlap with
some of the RAM that has been allocated for the guest, which usually
happens when the guest has more than 4G of RAM. We have to relocate
those pages to high memory, otherwise they won't be accessible.

Signed-off-by: Jean Guyader <jean.guyader@eu.citrix.com>
---
 tools/firmware/hvmloader/pci.c |   20 ++++++++++++++------
 1 files changed, 14 insertions(+), 6 deletions(-)

[-- Attachment #2: 0005-hvmloader-Change-memory-relocation-loop-when-overlap.patch --]
[-- Type: text/x-patch; name="0005-hvmloader-Change-memory-relocation-loop-when-overlap.patch", Size: 1719 bytes --]

diff --git a/tools/firmware/hvmloader/pci.c b/tools/firmware/hvmloader/pci.c
index 29ec011..3bd6ac5 100644
--- a/tools/firmware/hvmloader/pci.c
+++ b/tools/firmware/hvmloader/pci.c
@@ -50,6 +50,7 @@ void pci_setup(void)
         uint32_t devfn, bar_reg, bar_sz;
     } *bars = (struct bars *)scratch_start;
     unsigned int i, nr_bars = 0;
+    unsigned long pci_mem_start_pg;
 
     /* Program PCI-ISA bridge with appropriate link routes. */
     isa_irq = 0;
@@ -185,17 +186,24 @@ void pci_setup(void)
             ((pci_mem_start << 1) != 0) )
         pci_mem_start <<= 1;
 
-    while ( (pci_mem_start >> PAGE_SHIFT) < hvm_info->low_mem_pgend )
+    /* Relocate RAM that overlaps the PCI hole (in 64K-page chunks). */
+    pci_mem_start_pg = (pci_mem_start >> PAGE_SHIFT);
+    while ( pci_mem_start_pg < hvm_info->low_mem_pgend )
     {
         struct xen_add_to_physmap xatp;
-        if ( hvm_info->high_mem_pgend == 0 )
-            hvm_info->high_mem_pgend = 1ull << (32 - PAGE_SHIFT);
+        unsigned int size = hvm_info->low_mem_pgend - pci_mem_start_pg;
+
         xatp.domid = DOMID_SELF;
-        xatp.space = XENMAPSPACE_gmfn;
-        xatp.idx   = --hvm_info->low_mem_pgend;
-        xatp.gpfn  = hvm_info->high_mem_pgend++;
+        xatp.space = XENMAPSPACE_gmfn_range;
+        xatp.idx   = pci_mem_start_pg;
+        xatp.gpfn  = hvm_info->high_mem_pgend;
+        size = size > ((1 << 16) - 1) ? ((1 << 16) - 1) : size;
+        xatp.size = size;
+
         if ( hypercall_memory_op(XENMEM_add_to_physmap, &xatp) != 0 )
             BUG();
+        pci_mem_start_pg += size;
+        hvm_info->high_mem_pgend += size;
+        hvm_info->low_mem_pgend = pci_mem_start_pg;
     }
 
     mem_resource.base = pci_mem_start;
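
A quick sizing note on the batching above (my arithmetic, not text from
the patch): .size is 16 bits, so one full batch is 0xffff pages, and with
4KiB pages that is just under 256MiB of RAM per hypercall. Even a PCI
hole overlapping several hundred megabytes of RAM is therefore relocated
in a handful of preemptible hypercalls, instead of tens of thousands of
single-page XENMAPSPACE_gmfn calls as before.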
* [PATCH 6/6] Introduce per cpu flag (iommu_dont_flush_iotlb) to avoid unnecessary iotlb flush
  2011-11-08 20:04         ` [PATCH 5/6] hvmloader: Change memory relocation loop when overlap with PCI hole Jean Guyader
@ 2011-11-08 20:04           ` Jean Guyader
  0 siblings, 0 replies; 8+ messages in thread
From: Jean Guyader @ 2011-11-08 20:04 UTC (permalink / raw)
  To: xen-devel; +Cc: allen.m.kay, tim, Jean Guyader

[-- Attachment #1: Type: text/plain, Size: 463 bytes --]

Add a per-CPU flag that is checked by the low-level IOMMU code to skip
iotlb flushes; iommu_iotlb_flush must then be called explicitly by the
caller.

Signed-off-by: Jean Guyader <jean.guyader@eu.citrix.com>
---
 xen/arch/x86/mm.c                   |   14 ++++++++++++++
 xen/drivers/passthrough/iommu.c     |    5 +++++
 xen/drivers/passthrough/vtd/iommu.c |    6 ++++--
 xen/include/xen/iommu.h             |   12 ++++++++++++
 4 files changed, 35 insertions(+), 2 deletions(-)

[-- Attachment #2: 0006-Introduce-per-cpu-flag-iommu_dont_flush_iotlb-to-avo.patch --]
[-- Type: text/x-patch; name="0006-Introduce-per-cpu-flag-iommu_dont_flush_iotlb-to-avo.patch", Size: 3959 bytes --]

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 2417fe9..17a8504 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -4594,13 +4594,20 @@ static int handle_iomem_range(unsigned long s, unsigned long e, void *p)
 
 static int xenmem_add_to_physmap(struct domain *d, struct xen_add_to_physmap xatp)
 {
+    struct xen_add_to_physmap start_xatp;
     struct page_info *page = NULL;
     unsigned long prev_mfn, mfn = 0, gpfn;
     int rc;
 
     if ( xatp.space != XENMAPSPACE_gmfn_range )
         xatp.size = 1;
+    else
+    {
+        if ( need_iommu(d) )
+            this_cpu(iommu_dont_flush_iotlb) = 1;
+    }
 
+    start_xatp = xatp;
     while ( xatp.size > 0 )
     {
         if ( hypercall_preempt_check() )
@@ -4705,6 +4712,13 @@ static int xenmem_add_to_physmap(struct domain *d, struct xen_add_to_physmap xat
             break;
     }
 
+    if ( xatp.space == XENMAPSPACE_gmfn_range && need_iommu(d) )
+    {
+        this_cpu(iommu_dont_flush_iotlb) = 0;
+        iommu_iotlb_flush(d, start_xatp.idx, start_xatp.size - xatp.size);
+        iommu_iotlb_flush(d, start_xatp.gpfn, start_xatp.size - xatp.size);
+    }
+
     return rc;
 }
 
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index ca7b37b..bacca11 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -52,6 +52,8 @@ bool_t __read_mostly iommu_hap_pt_share = 1;
 bool_t __read_mostly iommu_debug;
 bool_t __read_mostly amd_iommu_perdev_intremap;
 
+DEFINE_PER_CPU(bool_t, iommu_dont_flush_iotlb);
+
 static void __init parse_iommu_param(char *s)
 {
     char *ss;
@@ -227,6 +229,7 @@ static int iommu_populate_page_table(struct domain *d)
 
     spin_lock(&d->page_alloc_lock);
 
+    this_cpu(iommu_dont_flush_iotlb) = 1;
     page_list_for_each ( page, &d->page_list )
     {
         if ( is_hvm_domain(d) ||
@@ -244,6 +247,8 @@ static int iommu_populate_page_table(struct domain *d)
             }
         }
     }
+    this_cpu(iommu_dont_flush_iotlb) = 0;
+    iommu_iotlb_flush_all(d);
     spin_unlock(&d->page_alloc_lock);
     return 0;
 }
diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index 7ec9541..a3dd018 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -660,7 +660,8 @@ static void dma_pte_clear_one(struct domain *domain, u64 addr)
     spin_unlock(&hd->mapping_lock);
     iommu_flush_cache_entry(pte, sizeof(struct dma_pte));
 
-    __intel_iommu_iotlb_flush(domain, addr >> PAGE_SHIFT_4K, 0, 1);
+    if ( !this_cpu(iommu_dont_flush_iotlb) )
+        __intel_iommu_iotlb_flush(domain, addr >> PAGE_SHIFT_4K, 0, 1);
 
     unmap_vtd_domain_page(page);
 
@@ -1753,7 +1754,8 @@ static int intel_iommu_map_page(
     spin_unlock(&hd->mapping_lock);
     unmap_vtd_domain_page(page);
 
-    __intel_iommu_iotlb_flush(d, gfn, dma_pte_present(old), 1);
+    if ( !this_cpu(iommu_dont_flush_iotlb) )
+        __intel_iommu_iotlb_flush(d, gfn, dma_pte_present(old), 1);
 
     return 0;
 }
diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index a1034df..6f7fbf7 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -160,4 +160,16 @@ int iommu_do_domctl(struct xen_domctl *, XEN_GUEST_HANDLE(xen_domctl_t));
 void iommu_iotlb_flush(struct domain *d, unsigned long gfn, unsigned int page_count);
 void iommu_iotlb_flush_all(struct domain *d);
 
+/*
+ * The purpose of the iommu_dont_flush_iotlb optional cpu flag is to
+ * avoid unnecessary iotlb flushes in the low-level IOMMU code.
+ *
+ * iommu_map_page/iommu_unmap_page must flush the iotlb, but sometimes
+ * this operation can be really expensive. This flag will be set by the
+ * caller to notify the low-level IOMMU code to avoid the iotlb flushes.
+ * iommu_iotlb_flush/iommu_iotlb_flush_all will be explicitly called by
+ * the caller.
+ */
+DECLARE_PER_CPU(bool_t, iommu_dont_flush_iotlb);
+
 #endif /* _IOMMU_H_ */
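
The intended usage pattern of the flag, as this patch wires it into
xenmem_add_to_physmap() and iommu_populate_page_table(), can be
summarized in a short sketch. This is an illustration, not code from the
patch: do_one_mapping() is a hypothetical stand-in for the real per-page
map/unmap work, while need_iommu(), this_cpu(), iommu_dont_flush_iotlb
and iommu_iotlb_flush() are the pieces the series provides.

/* Sketch of the deferred-flush pattern (do_one_mapping() is made up). */
if ( need_iommu(d) )
    this_cpu(iommu_dont_flush_iotlb) = 1;    /* suppress per-page flushes */

for ( i = 0; i < nr_pages; i++ )
    do_one_mapping(d, idx + i, gpfn + i);    /* maps/unmaps, no flushing */

if ( need_iommu(d) )
{
    this_cpu(iommu_dont_flush_iotlb) = 0;
    /* One flush per range touched, instead of ~2 flushes per page. */
    iommu_iotlb_flush(d, idx, nr_pages);
    iommu_iotlb_flush(d, gpfn, nr_pages);
}

Because the flag is per-CPU and the whole map/flush sequence runs within
one hypercall context, mappings made on other CPUs keep their normal
flush-on-update behaviour.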
* Re: [PATCH 4/6] mm: New XENMEM space, XENMAPSPACE_gmfn_range
  2011-11-08 20:04       ` [PATCH 4/6] mm: New XENMEM space, XENMAPSPACE_gmfn_range Jean Guyader
  2011-11-08 20:04         ` [PATCH 5/6] hvmloader: Change memory relocation loop when overlap with PCI hole Jean Guyader
@ 2011-11-09 10:09         ` Tim Deegan
  1 sibling, 0 replies; 8+ messages in thread
From: Tim Deegan @ 2011-11-09 10:09 UTC (permalink / raw)
  To: Jean Guyader; +Cc: xen-devel, allen.m.kay

Hi,

> diff --git a/xen/arch/x86/x86_64/compat/mm.c b/xen/arch/x86/x86_64/compat/mm.c
> index 2c05099..05c3098 100644
> --- a/xen/arch/x86/x86_64/compat/mm.c
> +++ b/xen/arch/x86/x86_64/compat/mm.c
> @@ -63,6 +63,10 @@ int compat_arch_memory_op(int op, XEN_GUEST_HANDLE(void) arg)
>          XLAT_add_to_physmap(nat, &cmp);
>          rc = arch_memory_op(op, guest_handle_from_ptr(nat, void));
>
> +        XLAT_add_to_physmap(&cmp, nat);
> +
> +        if ( copy_to_guest(arg, &cmp, 1) )
> +            return -EFAULT;
>          break;
>      }

I think you might need to do a bit more than this - have a look at how
XENMEM_get_pod_target is handled (in the same file), including
re-translating the hypercall continuation.

Please explicitly test the case where the hypercall gets interrupted
and the caller is a 32-bit compat guest, and make sure it does what
you want.

(Apart from that the x86/mm parts of the series are now fine by me)

Cheers,

Tim.