From: Wei Wang2 <wei.wang2@amd.com>
To: xen-devel@lists.xensource.com
Cc: "Ostrovsky, Boris" <Boris.Ostrovsky@amd.com>,
"Huang2, Wei" <Wei.Huang2@amd.com>,
Jan Beulich <JBeulich@novell.com>
Subject: [PATCH] AMD IOMMU: Fix an interrupt remapping issue (v2)
Date: Fri, 8 Apr 2011 18:52:20 +0200
Message-ID: <201104081852.20738.wei.wang2@amd.com>
In-Reply-To: <201104081706.16445.wei.wang2@amd.com>
[-- Attachment #1: Type: text/plain, Size: 4009 bytes --]
Jan, how does this one look to you?
Thanks,
Wei
Signed-off-by: Wei Wang <wei.wang2@amd.com>
--
Advanced Micro Devices GmbH
Registered office: Dornach, municipality of Aschheim,
district of München; Register court: München,
HRB No. 43632
WEEE Reg. No.: DE 12919551
Managing directors:
Alberto Bozzo, Andrew Bowd
On Friday 08 April 2011 17:06:16 Wei Wang2 wrote:
> On Friday 08 April 2011 16:39:13 Jan Beulich wrote:
> > >>> On 08.04.11 at 16:26, Wei Wang2 <wei.wang2@amd.com> wrote:
> > >
> > > On Friday 08 April 2011 15:43:57 Jan Beulich wrote:
> > >> >>> On 08.04.11 at 13:35, Wei Wang2 <wei.wang2@amd.com> wrote:
> > >> >
> > >> > Some devices could generate bogus interrupts if an IO-APIC RTE and an
> > >> > IOMMU interrupt remapping entry are not consistent during two adjacent
> > >> > 64-bit IO-APIC RTE updates. For example, if the second operation
> > >> > updates the destination bits in the RTE for a SATA device and unmasks
> > >> > it, in some cases the SATA device will assert the IO-APIC pin to
> > >> > generate an interrupt immediately using the new destination, but the
> > >> > IOMMU could still translate it into the old destination, and dom0
> > >> > would be confused. To fix that, we sync the interrupt remapping entry
> > >> > with the IO-APIC RTE on every 32-bit operation and forward IO-APIC RTE
> > >> > updates only after the interrupt remapping table has been changed.
> > >>
> > >> I don't think this is correct: Without the patch, the filling of
> > >> ioapic_rte takes into account the value already written. Now that you
> > >> only write the value at the end of the function, you should overwrite
> > >> the affected half with "value" immediately before calling
> > >> update_intremap_entry_from_ioapic().
> > >
> > > Sorry, I don't quite understand your point. My thought is that, no
> > > matter whether dom0 tries to update the lower half or the upper half of
> > > the RTE, we always update the interrupt table from the lower half. This
> > > will keep the IOMMU table strictly identical to the RTE. The old code
> > > assumes that the lower and upper halves of the RTE are always updated
> > > together, but this might not always be true. If, by accident, dom0 only
> > > updates the upper half and we don't sync the IOMMU with it, then the
> > > destination in the RTE and in the IOMMU table will be different.
> >
> > No, that's not my point. The problem I'm seeing is that you pass the
> > old value (as read from the IO-APIC) to
> > update_intremap_entry_from_ioapic(), but the function certainly
> > should use the to-be-written one. Previously this was implicit because
> > the IO-APIC register write happened first.
>
> OK, got it. That is definitely problematic. Will fix it.
>
> > >> Eliminating the double write if reg == rte_lo would also seem
> > >> desirable (and in no case should you write back the old value after
> > >> having called update_intremap_entry_from_ioapic()).
> > >
> > > It is not a write back; it just finishes the IO-APIC RTE write. After
> > > updating the interrupt remapping table we still have to update the RTE.
> > > It is just a copy of __io_apic_write (maybe I should just call it). The
> > > old code updates the IO-APIC earlier than the interrupt remapping table,
> > > and the SATA device might generate an interrupt right after this, which
> > > is not expected.
> >
> > No. If reg == rte_lo, you write that IO-APIC register twice, which is
> > pointless. With the other problem unaddressed, you actually first write
> > back the old value (with the mask bit restored), which gets IO-APIC
> > and remapping tables out of sync for a brief period of time (which is
> > a problem by itself), then write the new value. With the other problem
> > addressed, you would simply write the new value twice, which is
> > wasteful given that these writes are uncached.
>
> True. I will rework the patch to try to eliminate this.
> Thanks
> Wei
>
> > Jan
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xensource.com
> http://lists.xensource.com/xen-devel
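The point raised in the review above (the remapping update should see the to-be-written RTE, not the value read back from the IO-APIC) can be sketched in isolation. This is only a minimal illustration under stated assumptions: struct rte_image, rte_lo_index() and compose_new_rte() are hypothetical names that do not exist in Xen, and the RTE is modelled as two plain 32-bit words instead of real IO-APIC registers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for one 64-bit IO-APIC RTE as two 32-bit halves. */
struct rte_image {
    uint32_t lo;  /* vector, delivery mode, dest mode, mask, ... */
    uint32_t hi;  /* destination bits */
};

/* Register index of the low half, given the index of either half
 * (same arithmetic as rte_lo in the patch). */
static unsigned int rte_lo_index(unsigned int reg)
{
    return (reg & 1) ? reg - 1 : reg;
}

/* Build the RTE as it will look once the pending 32-bit write lands. */
static struct rte_image compose_new_rte(struct rte_image current,
                                        unsigned int reg, uint32_t value)
{
    struct rte_image next = current;

    assert(reg >= 0x10);  /* RTE registers start at index 0x10 */
    if ( reg == rte_lo_index(reg) )
        next.lo = value;  /* write targets the low half */
    else
        next.hi = value;  /* write targets the high half */
    return next;
}
```

Whichever half is written, the half that is not being written is taken from the current contents, so a remapping entry derived from the result would describe the RTE as it will be after the write.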
[-- Attachment #2: fix_intremap_v2.patch --]
[-- Type: text/x-diff, Size: 5541 bytes --]
diff -r e5a750d1bf9b xen/drivers/passthrough/amd/iommu_intr.c
--- a/xen/drivers/passthrough/amd/iommu_intr.c Thu Apr 07 11:12:55 2011 +0100
+++ b/xen/drivers/passthrough/amd/iommu_intr.c Fri Apr 08 18:49:18 2011 +0200
@@ -117,8 +117,7 @@ static void update_intremap_entry_from_i
 static void update_intremap_entry_from_ioapic(
     int bdf,
     struct amd_iommu *iommu,
-    struct IO_APIC_route_entry *ioapic_rte,
-    unsigned int rte_upper, unsigned int value)
+    struct IO_APIC_route_entry *ioapic_rte)
 {
     unsigned long flags;
     u32* entry;
@@ -130,28 +129,26 @@ static void update_intremap_entry_from_i
     req_id = get_intremap_requestor_id(bdf);
     lock = get_intremap_lock(req_id);
-    /* only remap interrupt vector when lower 32 bits in ioapic ire changed */
-    if ( likely(!rte_upper) )
-    {
-        delivery_mode = rte->delivery_mode;
-        vector = rte->vector;
-        dest_mode = rte->dest_mode;
-        dest = rte->dest.logical.logical_dest;
-
-        spin_lock_irqsave(lock, flags);
-        offset = get_intremap_offset(vector, delivery_mode);
-        entry = (u32*)get_intremap_entry(req_id, offset);
-
-        update_intremap_entry(entry, vector, delivery_mode, dest_mode, dest);
-        spin_unlock_irqrestore(lock, flags);
-
-        if ( iommu->enabled )
-        {
-            spin_lock_irqsave(&iommu->lock, flags);
-            invalidate_interrupt_table(iommu, req_id);
-            flush_command_buffer(iommu);
-            spin_unlock_irqrestore(&iommu->lock, flags);
-        }
+
+    delivery_mode = rte->delivery_mode;
+    vector = rte->vector;
+    dest_mode = rte->dest_mode;
+    dest = rte->dest.logical.logical_dest;
+
+    spin_lock_irqsave(lock, flags);
+
+    offset = get_intremap_offset(vector, delivery_mode);
+    entry = (u32*)get_intremap_entry(req_id, offset);
+    update_intremap_entry(entry, vector, delivery_mode, dest_mode, dest);
+
+    spin_unlock_irqrestore(lock, flags);
+
+    if ( iommu->enabled )
+    {
+        spin_lock_irqsave(&iommu->lock, flags);
+        invalidate_interrupt_table(iommu, req_id);
+        flush_command_buffer(iommu);
+        spin_unlock_irqrestore(&iommu->lock, flags);
     }
 }
@@ -199,7 +196,8 @@ int __init amd_iommu_setup_ioapic_remapp
             spin_lock_irqsave(lock, flags);
             offset = get_intremap_offset(vector, delivery_mode);
             entry = (u32*)get_intremap_entry(req_id, offset);
-            update_intremap_entry(entry, vector, delivery_mode, dest_mode, dest);
+            update_intremap_entry(entry, vector,
+                                  delivery_mode, dest_mode, dest);
             spin_unlock_irqrestore(lock, flags);
             if ( iommu->enabled )
@@ -217,16 +215,14 @@ void amd_iommu_ioapic_update_ire(
 void amd_iommu_ioapic_update_ire(
     unsigned int apic, unsigned int reg, unsigned int value)
 {
-    struct IO_APIC_route_entry ioapic_rte = { 0 };
-    unsigned int rte_upper = (reg & 1) ? 1 : 0;
+    struct IO_APIC_route_entry old_rte = { 0 };
+    struct IO_APIC_route_entry new_rte = { 0 };
+    unsigned int rte_lo = (reg & 1) ? reg - 1 : reg;
     int saved_mask, bdf;
     struct amd_iommu *iommu;
-    *IO_APIC_BASE(apic) = reg;
-    *(IO_APIC_BASE(apic)+4) = value;
-
     if ( !iommu_intremap )
-        return;
+        goto done;
     /* get device id of ioapic devices */
     bdf = ioapic_bdf[IO_APIC_ID(apic)];
@@ -235,30 +231,47 @@ void amd_iommu_ioapic_update_ire(
     {
         AMD_IOMMU_DEBUG("Fail to find iommu for ioapic device id = 0x%x\n",
                         bdf);
-        return;
-    }
-    if ( rte_upper )
-        return;
-
-    /* read both lower and upper 32-bits of rte entry */
-    *IO_APIC_BASE(apic) = reg;
-    *(((u32 *)&ioapic_rte) + 0) = *(IO_APIC_BASE(apic)+4);
-    *IO_APIC_BASE(apic) = reg + 1;
-    *(((u32 *)&ioapic_rte) + 1) = *(IO_APIC_BASE(apic)+4);
+        goto done;
+    }
+
+    /* Save io-apic rte lower 32 bits */
+    *IO_APIC_BASE(apic) = rte_lo;
+    *((u32 *)&old_rte) = *(IO_APIC_BASE(apic) + 4);
+    saved_mask = old_rte.mask;
+
+    if ( reg == rte_lo )
+    {
+        *((u32 *)&new_rte) = value;
+        /* read upper 32 bits from io-apic rte */
+        *IO_APIC_BASE(apic) = reg + 1;
+        *(((u32 *)&new_rte) + 1) = *(IO_APIC_BASE(apic) + 4);
+    }
+    else
+    {
+        *((u32 *)&new_rte) = *((u32 *)&old_rte);
+        *(((u32 *)&new_rte) + 1) = value;
+    }
     /* mask the interrupt while we change the intremap table */
-    saved_mask = ioapic_rte.mask;
-    ioapic_rte.mask = 1;
-    *IO_APIC_BASE(apic) = reg;
-    *(IO_APIC_BASE(apic)+4) = *(((int *)&ioapic_rte)+0);
-    ioapic_rte.mask = saved_mask;
-
-    update_intremap_entry_from_ioapic(
-        bdf, iommu, &ioapic_rte, rte_upper, value);
+    old_rte.mask = 1;
+    *IO_APIC_BASE(apic) = rte_lo;
+    *(IO_APIC_BASE(apic) + 4) = *((u32 *)&old_rte);
+
+    /* Update interrupt remapping entry */
+    update_intremap_entry_from_ioapic(bdf, iommu, &new_rte);
+
+    /* Update IO-APIC directly to avoid double writes */
+    if ( reg == rte_lo )
+        goto done;
     /* unmask the interrupt after we have updated the intremap table */
-    *IO_APIC_BASE(apic) = reg;
-    *(IO_APIC_BASE(apic)+4) = *(((u32 *)&ioapic_rte)+0);
+    old_rte.mask = saved_mask;
+    *IO_APIC_BASE(apic) = rte_lo;
+    *(IO_APIC_BASE(apic) + 4) = *((u32 *)&old_rte);
+
+done:
+    /* Forward write access to IO-APIC */
+    __io_apic_write(apic, reg, value);
 }
 static void update_intremap_entry_from_msi_msg(
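The ordering argument in the patch description (forward the IO-APIC write only after the remapping table has been changed, with the pin masked in between) can be checked on a toy model. The following is a rough sketch under assumptions, not the patch's exact sequence: struct sim, observe(), write_then_sync() and sync_then_write() are hypothetical names, the IO-APIC and the remapping table are plain arrays, and in the upper-half case this sketch restores the mask bit only after the new value has been written.

```c
#include <assert.h>
#include <stdint.h>

#define RTE_MASK_BIT (1u << 16)  /* mask bit lives in the low RTE half */

struct sim {
    uint32_t rte[2];    /* simulated IO-APIC RTE: [0] low half, [1] high half */
    uint32_t remap[2];  /* RTE image the remapping entry was last built from */
    int bad_windows;    /* times an unmasked RTE disagreed with the table */
};

/* Count a window in which an interrupt could be translated with stale data. */
static void observe(struct sim *s)
{
    int masked = (s->rte[0] & RTE_MASK_BIT) != 0;
    int same = ((s->rte[0] ^ s->remap[0]) & ~RTE_MASK_BIT) == 0 &&
               s->rte[1] == s->remap[1];

    if ( !masked && !same )
        s->bad_windows++;
}

/* Old ordering: the register write lands first, the table is synced later. */
static void write_then_sync(struct sim *s, unsigned int half, uint32_t value)
{
    assert(half < 2);
    s->rte[half] = value;
    observe(s);
    s->remap[0] = s->rte[0];
    s->remap[1] = s->rte[1];
    observe(s);
}

/* Sketched ordering: mask, sync the table with the new image, then write. */
static void sync_then_write(struct sim *s, unsigned int half, uint32_t value)
{
    uint32_t saved_mask;

    assert(half < 2);
    saved_mask = s->rte[0] & RTE_MASK_BIT;

    s->rte[0] |= RTE_MASK_BIT;               /* mask while the table changes */
    observe(s);

    s->remap[0] = half ? s->rte[0] : value;  /* table sees the new image */
    s->remap[1] = half ? value : s->rte[1];
    observe(s);

    s->rte[half] = value;                    /* forward the write */
    if ( half == 1 )                         /* low half unchanged: restore mask */
        s->rte[0] = (s->rte[0] & ~RTE_MASK_BIT) | saved_mask;
    observe(s);
}
```

In this model an upper-half write through write_then_sync() passes through one window where the unmasked RTE and the table disagree, while sync_then_write() passes through none. A low-half write needs no separate unmask step here because the written value carries its own mask bit.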
Thread overview: 12+ messages
2011-04-08 11:35 [PATCH] AMD IOMMU: Fix an interrupt remapping issue Wei Wang2
2011-04-08 13:43 ` Jan Beulich
2011-04-08 14:26 ` Wei Wang2
2011-04-08 14:39 ` Jan Beulich
2011-04-08 15:06 ` Wei Wang2
2011-04-08 16:52 ` Wei Wang2 [this message]
2011-04-11 7:23 ` [PATCH] AMD IOMMU: Fix an interrupt remapping issue (v2) Jan Beulich
2011-04-11 7:39 ` Wei Wang2
2011-04-11 10:31 ` [PATCH V3] AMD IOMMU: Fix an interrupt remapping issue Wei Wang2
2011-04-11 11:35 ` Jan Beulich
2011-07-19 9:37 ` George Dunlap
2011-07-19 9:59 ` George Dunlap