From mboxrd@z Thu Jan 1 00:00:00 1970 From: Wei Wang2 Subject: Re: [PATCH] AMD IOMMU: Use global interrupt remapping table by default Date: Wed, 20 Jul 2011 17:56:44 +0200 Message-ID: <201107201756.45241.wei.wang2@amd.com> References: <200910281732.07420.wei.wang2@amd.com> <201107201434.58081.wei.wang2@amd.com> <1311166895.20648.200.camel@zakaz.uk.xensource.com> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="Boundary-00=_9qvJOyYFy1V9vRo" Return-path: In-Reply-To: <1311166895.20648.200.camel@zakaz.uk.xensource.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xensource.com Errors-To: xen-devel-bounces@lists.xensource.com To: Ian Campbell Cc: George Dunlap , "xen-devel@lists.xensource.com" List-Id: xen-devel@lists.xenproject.org --Boundary-00=_9qvJOyYFy1V9vRo Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Content-Disposition: inline George & Ian, Patch attached. This patch removes global interrupt remapping table and use= s=20 per-device table instead. This should work with per-cpu IDTs. We are safe = to=20 remove global table since SATA device id issue dose not appear in recent=20 production BIOS. Thanks, Wei Signed-off-by: Wei Wang =2D- Advanced Micro Devices GmbH Sitz: Dornach, Gemeinde Aschheim,=20 Landkreis M=C3=BCnchen Registergericht M=C3=BCnchen,=20 HRB Nr. 43632 WEEE Registrierungsnummer 129 19551 Gesch=C3=A4ftsf=C3=BChrer: Alberto BozzoOn=20 Wednesday 20 July 2011 15:01:35 Ian Campbell wrote: > On Wed, 2011-07-20 at 13:34 +0100, Wei Wang2 wrote: > > Wednesday 20 July 2011 12:50:25 Ian Campbell wrote: > > > On Wed, 2011-07-20 at 10:52 +0100, Wei Wang2 wrote: > > > > On Tuesday 19 July 2011 16:14:31 George Dunlap wrote: > > > > > Wei, > > > > > > > > > > Can you be more specific about which BIOSes behave poorly with > > > > > per-device intremap tables, and why? > > > > > > > > We found that, in some case, SATA device uses different device ids > > > > for dma remapping and interrupt remapping. Some early BIOSes cannot > > > > handle this situation correctly, so if SATA uses device id for DMA = to > > > > lookup device table entry for intremap table and if intremap table = is > > > > per-device configured, SATA device won't get the right table. > > > > > > Was this issue present in production BIOSes or do you mean early as in > > > pre-production? IOW can we drop the support non-share remapping table > > > altogether or do we need to fix things in this mode to force the IDT = to > > > be identical across CPUs (either by resharing the IDT in that case, > > > ick, or by enforcing that the contents are the same for devices with > > > this property)? > > > > > > OOI was the issue a confusion between the SATA PCI device and the > > > legacy PCI IDE facet of the same device? > > > > Yes, using shared intremap table is the workaround for this issue. > > Ideally, BIOS should create 2 IVRS entries for SATA devices in IDE > > combined mode, one for DMA the other for interrupt. But this setup is n= ot > > strict compatible with iommu specification. So recent BIOS should have > > IDE combined mode disabled in this case. So I believe that remove the > > global table is safe from now on. I could send patches. > > Please do. > > Ian. > > > Thanks, > > Wei > > > > > > > The problem with a global intremap table is that, AFAICT, it's not > > > > > fundamentally compatible with per-cpu IDTs. With per-cpu IDTs, > > > > > different devices may end up with interrupts mapped to different > > > > > cpus but the same vector (i.e., device A mapped to cpu 9 vector 6= 7, > > > > > cpu B mapped to cpu 12 vector 67). This is by design; the whole > > > > > point of the per-cpu IDTs is to avoid restricting the number of > > > > > IRQs to the number of vectors. But it seems that the intremap > > > > > table only maps vectors, not destination IDs; so in the example > > > > > above, both devices' interrupts would end up being remapped to the > > > > > same place, causing one device driver to get both sets of > > > > > interrupts, and the other to get none. > > > > > > > > Yes, obviously a problem...Using shared intremap table, devices uses > > > > the same vector and delivery mode will end up to the same remapping > > > > entry. Is per-cpu IDTs enable by default in Xen? > > > > > > I didn't think it was even optional these days, but I didn't check. > > > > > > > > Do I understand correctly? If so, it seems like we should switch > > > > > to per-device intremap tables by default; and if we're using a > > > > > global intremap table, we need to somehow make sure that vectors > > > > > are not shared across cpus. > > > > > > > > I agree to use per-device table by default, since BIOS issue has be= en > > > > fixed and per-device table also has some security advantages. > > > > > > > > Thanks, > > > > Wei > > > > > > > > > -George > > > > > > > > > > On Wed, Oct 28, 2009 at 4:32 PM, Wei Wang2 =20 wrote: > > > > > > Using a global interrupt remapping table shared by all devices > > > > > > has better compatibility with certain old BIOSes. Per-device > > > > > > interrupt remapping table can still be enabled by using a new > > > > > > parameter "amd-iommu-perdev-intremap". Thanks, > > > > > > Wei > > > > > > > > > > > > Signed-off-by: Wei Wang > > > > > > -- > > > > > > AMD GmbH, Germany > > > > > > Operating System Research Center > > > > > > > > > > > > Legal Information: > > > > > > Advanced Micro Devices GmbH > > > > > > Karl-Hammerschmidt-Str. 34 > > > > > > 85609 Dornach b. M=C3=BCnchen > > > > > > > > > > > > Gesch=C3=A4ftsf=C3=BChrer: Andrew Bowd, Thomas M. McCoy, Giulia= no Meroni > > > > > > Sitz: Dornach, Gemeinde Aschheim, Landkreis M=C3=BCnchen > > > > > > Registergericht M=C3=BCnchen, HRB Nr. 43632 > > > > > > > > > > > > _______________________________________________ > > > > > > Xen-devel mailing list > > > > > > Xen-devel@lists.xensource.com > > > > > > http://lists.xensource.com/xen-devel > > > > > > > > _______________________________________________ > > > > Xen-devel mailing list > > > > Xen-devel@lists.xensource.com > > > > http://lists.xensource.com/xen-devel --Boundary-00=_9qvJOyYFy1V9vRo Content-Type: text/x-diff; charset="utf-8"; name="rm_glb_intremap.patch" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="rm_glb_intremap.patch" Content-Description: rm_glb_intremap.patch diff -r 548b2826293e xen/drivers/passthrough/amd/iommu_acpi.c --- a/xen/drivers/passthrough/amd/iommu_acpi.c Tue Jul 19 16:02:36 2011 +0100 +++ b/xen/drivers/passthrough/amd/iommu_acpi.c Wed Jul 20 17:43:22 2011 +0200 @@ -66,15 +66,8 @@ static void __init add_ivrs_mapping_entr if (ivrs_mappings[alias_id].intremap_table == NULL ) { /* allocate per-device interrupt remapping table */ - if ( amd_iommu_perdev_intremap ) - ivrs_mappings[alias_id].intremap_table = + ivrs_mappings[alias_id].intremap_table = amd_iommu_alloc_intremap_table(); - else - { - if ( shared_intremap_table == NULL ) - shared_intremap_table = amd_iommu_alloc_intremap_table(); - ivrs_mappings[alias_id].intremap_table = shared_intremap_table; - } } /* assgin iommu hardware */ ivrs_mappings[bdf].iommu = iommu; diff -r 548b2826293e xen/drivers/passthrough/amd/iommu_init.c --- a/xen/drivers/passthrough/amd/iommu_init.c Tue Jul 19 16:02:36 2011 +0100 +++ b/xen/drivers/passthrough/amd/iommu_init.c Wed Jul 20 17:43:22 2011 +0200 @@ -760,8 +760,7 @@ static int __init init_ivrs_mapping(void ivrs_mappings[bdf].dte_ext_int_pass = IOMMU_CONTROL_DISABLED; ivrs_mappings[bdf].dte_init_pass = IOMMU_CONTROL_DISABLED; - if ( amd_iommu_perdev_intremap ) - spin_lock_init(&ivrs_mappings[bdf].intremap_lock); + spin_lock_init(&ivrs_mappings[bdf].intremap_lock); } return 0; } diff -r 548b2826293e xen/drivers/passthrough/amd/iommu_intr.c --- a/xen/drivers/passthrough/amd/iommu_intr.c Tue Jul 19 16:02:36 2011 +0100 +++ b/xen/drivers/passthrough/amd/iommu_intr.c Wed Jul 20 17:43:22 2011 +0200 @@ -28,20 +28,10 @@ #define INTREMAP_ENTRIES (1 << INTREMAP_LENGTH) int ioapic_bdf[MAX_IO_APICS]; -void *shared_intremap_table; -static DEFINE_SPINLOCK(shared_intremap_lock); static spinlock_t* get_intremap_lock(int req_id) { - return (amd_iommu_perdev_intremap ? - &ivrs_mappings[req_id].intremap_lock: - &shared_intremap_lock); -} - -static int get_intremap_requestor_id(int bdf) -{ - ASSERT( bdf < ivrs_bdf_entries ); - return ivrs_mappings[bdf].dte_requestor_id; + return &ivrs_mappings[req_id].intremap_lock; } static int get_intremap_offset(u8 vector, u8 dm) @@ -125,7 +115,7 @@ static void update_intremap_entry_from_i spinlock_t *lock; int offset; - req_id = get_intremap_requestor_id(bdf); + req_id = get_requestor_id(bdf); lock = get_intremap_lock(req_id); delivery_mode = rte->delivery_mode; @@ -183,7 +173,7 @@ int __init amd_iommu_setup_ioapic_remapp continue; } - req_id = get_intremap_requestor_id(bdf); + req_id = get_requestor_id(bdf); lock = get_intremap_lock(req_id); delivery_mode = rte.delivery_mode; @@ -283,14 +273,13 @@ static void update_intremap_entry_from_m { unsigned long flags; u32* entry; - u16 bdf, req_id, alias_id; + u16 bdf, req_id; u8 delivery_mode, dest, vector, dest_mode; spinlock_t *lock; int offset; bdf = (pdev->bus << 8) | pdev->devfn; - req_id = get_dma_requestor_id(bdf); - alias_id = get_intremap_requestor_id(bdf); + req_id = get_requestor_id(bdf); if ( msg == NULL ) { @@ -299,14 +288,6 @@ static void update_intremap_entry_from_m free_intremap_entry(req_id, msi_desc->remap_index); spin_unlock_irqrestore(lock, flags); - if ( ( req_id != alias_id ) && - ivrs_mappings[alias_id].intremap_table != NULL ) - { - lock = get_intremap_lock(alias_id); - spin_lock_irqsave(lock, flags); - free_intremap_entry(alias_id, msi_desc->remap_index); - spin_unlock_irqrestore(lock, flags); - } goto done; } @@ -324,30 +305,11 @@ static void update_intremap_entry_from_m update_intremap_entry(entry, vector, delivery_mode, dest_mode, dest); spin_unlock_irqrestore(lock, flags); - /* - * In some special cases, a pci-e device(e.g SATA controller in IDE mode) - * will use alias id to index interrupt remapping table. - * We have to setup a secondary interrupt remapping entry to satisfy those - * devices. - */ - - lock = get_intremap_lock(alias_id); - if ( ( req_id != alias_id ) && - ivrs_mappings[alias_id].intremap_table != NULL ) - { - spin_lock_irqsave(lock, flags); - entry = (u32*)get_intremap_entry(alias_id, offset); - update_intremap_entry(entry, vector, delivery_mode, dest_mode, dest); - spin_unlock_irqrestore(lock, flags); - } - done: if ( iommu->enabled ) { spin_lock_irqsave(&iommu->lock, flags); invalidate_interrupt_table(iommu, req_id); - if ( alias_id != req_id ) - invalidate_interrupt_table(iommu, alias_id); flush_command_buffer(iommu); spin_unlock_irqrestore(&iommu->lock, flags); } diff -r 548b2826293e xen/drivers/passthrough/amd/iommu_map.c --- a/xen/drivers/passthrough/amd/iommu_map.c Tue Jul 19 16:02:36 2011 +0100 +++ b/xen/drivers/passthrough/amd/iommu_map.c Wed Jul 20 17:43:22 2011 +0200 @@ -543,7 +543,7 @@ static int update_paging_mode(struct dom for_each_pdev( d, pdev ) { bdf = (pdev->bus << 8) | pdev->devfn; - req_id = get_dma_requestor_id(bdf); + req_id = get_requestor_id(bdf); iommu = find_iommu_for_device(bdf); if ( !iommu ) { diff -r 548b2826293e xen/drivers/passthrough/amd/pci_amd_iommu.c --- a/xen/drivers/passthrough/amd/pci_amd_iommu.c Tue Jul 19 16:02:36 2011 +0100 +++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c Wed Jul 20 17:43:22 2011 +0200 @@ -31,25 +31,10 @@ struct amd_iommu *find_iommu_for_device( return ivrs_mappings[bdf].iommu; } -/* - * Some devices will use alias id and original device id to index interrupt - * table and I/O page table respectively. Such devices will have - * both alias entry and select entry in IVRS structure. - * - * Return original device id, if device has valid interrupt remapping - * table setup for both select entry and alias entry. - */ -int get_dma_requestor_id(u16 bdf) -{ - int req_id; - +int get_requestor_id(u16 bdf) +{ BUG_ON ( bdf >= ivrs_bdf_entries ); - req_id = ivrs_mappings[bdf].dte_requestor_id; - if ( (ivrs_mappings[bdf].intremap_table != NULL) && - (ivrs_mappings[req_id].intremap_table != NULL) ) - req_id = bdf; - - return req_id; + return ivrs_mappings[bdf].dte_requestor_id; } static int is_translation_valid(u32 *entry) @@ -91,7 +76,7 @@ static void amd_iommu_setup_domain_devic valid = 0; /* get device-table entry */ - req_id = get_dma_requestor_id(bdf); + req_id = get_requestor_id(bdf); dte = iommu->dev_table.buffer + (req_id * IOMMU_DEV_TABLE_ENTRY_SIZE); spin_lock_irqsave(&iommu->lock, flags); @@ -252,7 +237,7 @@ static void amd_iommu_disable_domain_dev int req_id; BUG_ON ( iommu->dev_table.buffer == NULL ); - req_id = get_dma_requestor_id(bdf); + req_id = get_requestor_id(bdf); dte = iommu->dev_table.buffer + (req_id * IOMMU_DEV_TABLE_ENTRY_SIZE); spin_lock_irqsave(&iommu->lock, flags); @@ -314,7 +299,7 @@ static int amd_iommu_assign_device(struc static int amd_iommu_assign_device(struct domain *d, u8 bus, u8 devfn) { int bdf = (bus << 8) | devfn; - int req_id = get_dma_requestor_id(bdf); + int req_id = get_requestor_id(bdf); if ( ivrs_mappings[req_id].unity_map_enable ) { @@ -433,7 +418,7 @@ static int amd_iommu_group_id(u8 bus, u8 int rt; int bdf = (bus << 8) | devfn; rt = ( bdf < ivrs_bdf_entries ) ? - get_dma_requestor_id(bdf) : + get_requestor_id(bdf) : bdf; return rt; } diff -r 548b2826293e xen/drivers/passthrough/iommu.c --- a/xen/drivers/passthrough/iommu.c Tue Jul 19 16:02:36 2011 +0100 +++ b/xen/drivers/passthrough/iommu.c Wed Jul 20 17:43:22 2011 +0200 @@ -49,7 +49,6 @@ bool_t __read_mostly iommu_intremap = 1; bool_t __read_mostly iommu_intremap = 1; bool_t __read_mostly iommu_hap_pt_share; bool_t __read_mostly iommu_debug; -bool_t __read_mostly amd_iommu_perdev_intremap; static void __init parse_iommu_param(char *s) { @@ -76,8 +75,6 @@ static void __init parse_iommu_param(cha iommu_intremap = 0; else if ( !strcmp(s, "debug") ) iommu_debug = 1; - else if ( !strcmp(s, "amd-iommu-perdev-intremap") ) - amd_iommu_perdev_intremap = 1; else if ( !strcmp(s, "dom0-passthrough") ) iommu_passthrough = 1; else if ( !strcmp(s, "dom0-strict") ) diff -r 548b2826293e xen/include/asm-x86/hvm/svm/amd-iommu-proto.h --- a/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h Tue Jul 19 16:02:36 2011 +0100 +++ b/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h Wed Jul 20 17:43:22 2011 +0200 @@ -65,7 +65,7 @@ void amd_iommu_share_p2m(struct domain * void amd_iommu_share_p2m(struct domain *d); /* device table functions */ -int get_dma_requestor_id(u16 bdf); +int get_requestor_id(u16 bdf); void amd_iommu_add_dev_table_entry( u32 *dte, u8 sys_mgt, u8 dev_ex, u8 lint1_pass, u8 lint0_pass, u8 nmi_pass, u8 ext_int_pass, u8 init_pass); @@ -97,7 +97,6 @@ unsigned int amd_iommu_read_ioapic_from_ unsigned int apic, unsigned int reg); extern int ioapic_bdf[MAX_IO_APICS]; -extern void *shared_intremap_table; /* power management support */ void amd_iommu_resume(void); diff -r 548b2826293e xen/include/xen/iommu.h --- a/xen/include/xen/iommu.h Tue Jul 19 16:02:36 2011 +0100 +++ b/xen/include/xen/iommu.h Wed Jul 20 17:43:22 2011 +0200 @@ -32,7 +32,6 @@ extern bool_t iommu_snoop, iommu_qinval, extern bool_t iommu_snoop, iommu_qinval, iommu_intremap; extern bool_t iommu_hap_pt_share; extern bool_t iommu_debug; -extern bool_t amd_iommu_perdev_intremap; extern struct rangeset *mmio_ro_ranges; --Boundary-00=_9qvJOyYFy1V9vRo Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel --Boundary-00=_9qvJOyYFy1V9vRo--