From: Suravee Suthikulanit <suravee.suthikulpanit@amd.com>
To: Jan Beulich <JBeulich@suse.com>
Cc: Tim Deegan <tim@xen.org>, xen-devel@lists.xen.org
Subject: Re: [PATCH 2/2 V2] iommu/amd: Workaround for erratum 787
Date: Mon, 10 Jun 2013 10:11:57 -0500 [thread overview]
Message-ID: <51B5ECBD.4070508@amd.com> (raw)
In-Reply-To: <51B5F7CA02000078000DCB26@nat28.tlf.novell.com>
On 6/10/2013 8:59 AM, Jan Beulich wrote:
>>>> On 10.06.13 at 15:53, Suravee Suthikulanit <suravee.suthikulpanit@amd.com>
> wrote:
>> On 6/10/2013 5:53 AM, Jan Beulich wrote:
>>>>>> On 10.06.13 at 12:40, Tim Deegan <tim@xen.org> wrote:
>>>> At 10:47 +0100 on 10 Jun (1370861263), Jan Beulich wrote:
>>>>>>>> On 10.06.13 at 11:35, Tim Deegan <tim@xen.org> wrote:
>>>>>> At 00:05 -0500 on 10 Jun (1370822751), suravee.suthikulpanit@amd.com wrote:
>>>>>>> From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
>>>>>>>
>>>>>>> The IOMMU interrupt handling in bottom half must clear the PPR log
>>>> interrupt
>>>>>>> and event log interrupt bits to re-enable the interrupt. This is done by
>>>>>>> writing 1 to the memory mapped register to clear the bit. Due to hardware
>>>>>> bug,
>>>>>>> if the driver tries to clear this bit while the IOMMU hardware also setting
>>>>>>> this bit, the conflict will result with the bit being set. If the interrupt
>>>>>>> handling code does not make sure to clear this bit, subsequent changes in
>>>>>> the
>>>>>>> event/PPR logs will no longer generating interrupts, and would result if
>>>>>>> buffer overflow. After clearing the bits, the driver must read back
>>>>>>> the register to verify.
>>>>>> Is there a risk of livelock here? That is, if some device is causing a
>>>>>> lot of IOMMU faults, a CPU could get stuck in this loop re-enabling
>>>>>> interrupts as fast as they are raised.
>>>>>>
>>>>>> The solution suggested in the erratum seems better: if the bit is set
>>>>>> after clearing, process the interrupts again (i.e. run/schedule the
>>>>>> top-half handler). That way the bottom-half handler will definitely
>>>>>> terminate and the system can make some progress.
>>>>> That's what's being done really: The actual interrupt handler disables
>>>>> the interrupt sources, and the tasklet re-enables them (or at least is
>>>>> supposed to do so - patch 1 isn't really correct in the respect).
>>>> Oh I see, the idea is that we use the control register to mask
>>>> interrupts instead of relying on the status register? That seems
>>>> better. But doesn't this IOMMU already mask further interrupts when the
>>>> pending bit in the status register is set? I can't see any wording
>>>> about that in the IOMMU doc but the erratum implies it. Suravee, do you
>>>> know if this is the case?
>>> Actually, the documentation has a subtle but perhaps important
>>> difference int the wording here: For EventLogInt and ComWaitInt
>>> is read "An interrupt is generated when <status bit> = 1b and MMIO
>>> Offset 0018h[<control bit>] = 1b", where as for PPRInt and GAInt
>>> it says "An interrupt is generated when <status bit> changes from 0b
>>> to 1b and MMIO Offset 0018h[<control bit>] = 1b".
>>>
>>> So I'd like to be one the safe side and assume further interrupts can
>>> be generated in all cases - see also the emulation behavior in
>>> iommu_guest.c which - afaict - always raises an interrupt, not just on
>>> a 0 -> 1 transition.
>> The behavior should be that the interrupt will be generated if the "xxxInt"
>> bit is 0. Once generated, it will be set to "1" by the hardware. If this
>> bit is 1, events will be added to the log but interrupt will not be
>> generated.
> "Should be" isn't enough here, even more so given the mentioned
> wording differences in your documentation. We need to know how
> actual (past, current, and future) hardware behaves.
>
> Jan
>
I am sure this is what the behavior of the hardware. Besides, only the
hardware can set this bit.
I have also tested by not clearing the bit, and basically I did not see
additional interrupts.
Suravee
next prev parent reply other threads:[~2013-06-10 15:11 UTC|newest]
Thread overview: 50+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-06-10 5:05 [PATCH 1/2 V2] iommu/amd: Fix logic for clearing the IOMMU interrupt bits suravee.suthikulpanit
2013-06-10 5:05 ` [PATCH 2/2 V2] iommu/amd: Workaround for erratum 787 suravee.suthikulpanit
2013-06-10 9:35 ` Tim Deegan
2013-06-10 9:47 ` Jan Beulich
2013-06-10 10:40 ` Tim Deegan
2013-06-10 10:53 ` Jan Beulich
2013-06-10 12:43 ` Tim Deegan
2013-06-10 13:23 ` Jan Beulich
2013-06-10 13:41 ` Jan Beulich
2013-06-10 13:56 ` Tim Deegan
2013-06-10 13:55 ` Jan Beulich
2013-06-10 15:03 ` Jan Beulich
2013-06-10 16:31 ` Tim Deegan
2013-06-10 23:13 ` Suravee Suthikulanit
2013-06-11 6:45 ` Jan Beulich
2013-06-11 6:40 ` Jan Beulich
2013-06-11 8:53 ` Tim Deegan
2013-06-10 13:53 ` Suravee Suthikulanit
2013-06-10 13:59 ` Jan Beulich
2013-06-10 15:11 ` Suravee Suthikulanit [this message]
2013-06-10 15:21 ` Jan Beulich
2013-06-10 10:59 ` [PATCH 2/2 v3] " Jan Beulich
2013-06-11 6:47 ` [PATCH 2/2 v5] " Jan Beulich
2013-06-17 18:57 ` Suravee Suthikulanit
2013-06-10 10:05 ` [PATCH 1/2 V2] iommu/amd: Fix logic for clearing the IOMMU interrupt bits Jan Beulich
2013-06-10 10:56 ` [PATCH 1/2 v3] " Jan Beulich
2013-06-10 11:02 ` Jan Beulich
2013-06-10 12:18 ` Tim Deegan
2013-06-10 12:31 ` Jan Beulich
2013-06-10 13:58 ` Suravee Suthikulanit
2013-06-10 12:41 ` [PATCH 1/2 v4] " Jan Beulich
2013-06-10 12:46 ` Tim Deegan
2013-06-10 13:49 ` George Dunlap
2013-06-10 14:08 ` Jan Beulich
2013-06-11 6:47 ` [PATCH 1/2 v5] " Jan Beulich
2013-06-11 23:03 ` Suravee Suthikulanit
2013-06-12 6:24 ` Jan Beulich
2013-06-12 22:37 ` Suravee Suthikulpanit
2013-06-13 1:44 ` Suravee Suthikulpanit
2013-06-13 7:54 ` Jan Beulich
2013-06-13 13:48 ` Suravee Suthikulpanit
2013-06-13 14:20 ` George Dunlap
2013-06-13 14:30 ` Processed: " xen
2013-06-13 15:58 ` Jan Beulich
2013-06-13 16:34 ` Suravee Suthikulanit
2013-06-14 6:27 ` Jan Beulich
2013-06-14 6:40 ` Jan Beulich
2013-06-14 7:14 ` [PATCH] AMD IOMMU: make interrupt work again Jan Beulich
2013-06-14 16:10 ` Suravee Suthikulanit
2013-06-17 18:59 ` [PATCH 1/2 v5] iommu/amd: Fix logic for clearing the IOMMU interrupt bits Suravee Suthikulanit
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=51B5ECBD.4070508@amd.com \
--to=suravee.suthikulpanit@amd.com \
--cc=JBeulich@suse.com \
--cc=tim@xen.org \
--cc=xen-devel@lists.xen.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).