From mboxrd@z Thu Jan 1 00:00:00 1970 From: "Jan Beulich" Subject: Re: [PATCH] AMD IOMMU: Fix an interrupt remapping issue Date: Fri, 08 Apr 2011 15:39:13 +0100 Message-ID: <4D9F3A31020000780003A9FA@vpn.id2.novell.com> References: <201104081335.36718.wei.wang2@amd.com> <4D9F2D3D020000780003A9AD@vpn.id2.novell.com> <201104081626.44096.wei.wang2@amd.com> Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: quoted-printable Return-path: In-Reply-To: <201104081626.44096.wei.wang2@amd.com> Content-Disposition: inline List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xensource.com Errors-To: xen-devel-bounces@lists.xensource.com To: Wei Wang2 Cc: Boris Ostrovsky , Wei Huang2 , "xen-devel@lists.xensource.com" List-Id: xen-devel@lists.xenproject.org >>> On 08.04.11 at 16:26, Wei Wang2 wrote: > On Friday 08 April 2011 15:43:57 Jan Beulich wrote: >> >>> On 08.04.11 at 13:35, Wei Wang2 wrote: >> > >> > Some device could generate bogus interrupts if an IO-APIC RTE and an >> > iommu interrupt remapping entry are not consistent during 2 adjacent >> > 64bits IO-APIC RTE updates. For example, if the 2nd operation updates >> > destination bits in RTE for SATA device and unmask it, in some case, = SATA >> > device will assert ioapic pin to generate interrupt immediately using = new >> > destination but iommu could still translate it into the old destinatio= n, >> > then dom0 would be confused. To fix that, we sync up interrupt = remapping >> > entry with IO-APIC IRE on every 32 bits operation and foward IOAPIC = RTE >> > updates after interrupt remapping table has been changed. >> >> I don't think this is correct: Without the patch, the filling of = ioapic_rte >> takes into account the value already written. Now that you only write >> the value at the end of the function, you should overwrite the >> affected half with "value" immediately before calling >> update_intremap_entry_from_ioapic().=20 > Sorry, not quite understand your point. My thought is, no matter dom0 = tried=20 > to=20 > updates lower half or upper half of RTE, we always updates interrupt = table=20 > from the lower half. This will keep iommu table strictly identically to = RTE.=20 > The old code has an assumption that both lower half and upper of RTE = should=20 > be updated together. But this might not be always true. If by incident, = dom0=20 > only updates the upper half and we don't sync iommu with it, then the=20 > destination in RTE and iommu table will be different. No, that's not my point. The problem I'm seeing is that you pass the old value (as read from the IO-APIC) to update_intremap_entry_from_ioapic(), but the function certainly should use the to-be-written one. Previously this was implicit because the IO-APIC register write happened first. >> Eliminating the double write if reg =3D=3D rte_lo would also seem = desirable >> (and in no case should you write back the old value after having called >> update_intremap_entry_from_ioapic()). >=20 > It not a write back, It just finishes IO-APIC RTE writes. After = updating=20 > interrupt remapping table we still have to update RTE. It is just a copy = of=20 > __io_apic_write (maybe I should just call it). Old code updates = ioapic=20 > earlier than interrupt remapping table and sata device might generate=20 > interrupt right after this, which is not expected. No. If reg =3D=3D ret_lo, you write that IO-APIC register twice, which is pointless. With the other problem unaddressed, you actually first write back the old value (with the mask bit restored), which gets IO-APIC and remapping tables out of sync for a brief period of time (which is a problem by itself), then write the new value. With the other problem addressed, you would simply write the new value twice, which is wasteful given that these writes are uncached. Jan