From: Ben Guthro <ben@guthro.net>
To: Jan Beulich <JBeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>,
xiantao.zhang@intel.com, xen-devel <xen-devel@lists.xen.org>
Subject: Re: S3 crash with VTD Queue Invalidation enabled
Date: Wed, 5 Jun 2013 09:54:19 -0400
Message-ID: <CAOvdn6XWGAoedf+sCAuc8NyRS7Jw8W--nG5N6+71UtzJeHDphg@mail.gmail.com>
In-Reply-To: <51AF11F102000078000DB589@nat28.tlf.novell.com>

On Wed, Jun 5, 2013 at 4:24 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>> On 04.06.13 at 23:09, Ben Guthro <ben@guthro.net> wrote:
>> On Tue, Jun 4, 2013 at 3:49 PM, Ben Guthro <ben@guthro.net> wrote:
>>> On Tue, Jun 4, 2013 at 3:20 PM, Ben Guthro <ben@guthro.net> wrote:
>>>> On Tue, Jun 4, 2013 at 10:01 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>>>>>> On 04.06.13 at 14:25, Ben Guthro <ben@guthro.net> wrote:
>>>>>> On Tue, Jun 4, 2013 at 4:54 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>>>>> Is this perhaps having some similarity with
>>>>>>> http://lists.xen.org/archives/html/xen-devel/2013-04/msg00343.html?
>>>>>>> We're clearly running single-CPU only here and there...
>>>>>>
>>>>>> We certainly should be, as we have gone through the
>>>>>> disable_nonboot_cpus() by this point - and I can verify that from the
>>>>>> logs.
>>>>>
>>>>> I'm much more tending towards the connection here, noting that
>>>>> Andrew's original thread didn't really lead anywhere (i.e. we still
>>>>> don't know what the panic he saw was actually caused by).
>>>>>
>>>>
>>>> I'm starting to think you're on to something here.
>>>
>>> hmm - maybe not.
>>> I get the same crash with "maxcpus=1"
>>>
>>>
>>>
>>>> I've put a bunch of trace throughout the functions in qinval.c
>>>>
>>>> It seems that everything is functioning properly, up until we go
>>>> through the disable_nonboot_cpus() path.
>>>> Prior to this, I see the qinval.c functions being executed on all
>>>> cpus, and both drhd units
>>>> Afterward, it gets stuck in queue_invalidate_wait on the first drhd
>>>> unit.. and eventually panics.
>>>>
>>>> I'm not exactly sure what to make of this yet.
>>
>> querying status of the hardware all seems to be working correctly...
>> it just doesn't work with querying the QINVAL_STAT_DONE state, as far
>> as I can tell.
>>
>> Other register state is:
>>
>> (XEN) VER = 10
>> (XEN) CAP = c0000020e60262
>> (XEN) n_fault_reg = 1
>> (XEN) fault_recording_offset = 200
>> (XEN) fault_recording_reg_l = 0
>> (XEN) fault_recording_reg_h = 0
>> (XEN) ECAP = f0101a
>> (XEN) GCMD = 0
>> (XEN) GSTS = c7000000
>> (XEN) RTADDR = 137a31000
>> (XEN) CCMD = 800000000000000
>> (XEN) FSTS = 0
>> (XEN) FECTL = 0
>> (XEN) FEDATA = 4128
>> (XEN) FEADDR = fee0000c
>> (XEN) FEUADDR = 0
>>
>> (with code lifted from print_iommu_regs() )
>>
>>
>> None of this looks suspicious to my untrained eye - but I'm including
>> it here in case someone else sees something I don't.
>
> Xiantao, you certainly will want to give some advice here. I won't
> be able to look into this more deeply right away.

Thanks Jan. Xiantao - I'd appreciate any insight you may have.

One curious thing I have found, which seems buggy to me, is that
{dis,en}able_qinval() is called before the platform quirks are
executed. The calls appear to come through
iommu_{en,dis}able_x2apic_IR().

However, when I put a BUG() or dump_execution_state() in that code,
no stack is dumped.

I was going to add a platform quirk to detect this platform and
disable qinval on it, but it seems that may be too late in the
process.
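
For anyone who wants to read the quoted GSTS and ECAP values without
the spec open, here is a minimal sketch of a decoder. It is not code
from the Xen tree: the struct, table and function names are made up
for illustration, and the bit positions are taken from my reading of
the VT-d specification (GSTS bits 31..23, and only the ECAP bits
relevant to queued invalidation and interrupt remapping).

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper type: one named bit of a VT-d register. */
struct bit_name {
    unsigned int bit;
    const char *name;
};

/* Global Status Register bits, per the VT-d spec (assumed layout). */
static const struct bit_name gsts_bits[] = {
    { 31, "TES" },  { 30, "RTPS" }, { 29, "FLS" },
    { 28, "AFLS" }, { 27, "WBFS" }, { 26, "QIES" },
    { 25, "IRES" }, { 24, "IRTPS" }, { 23, "CFIS" },
};

/* Subset of Extended Capability bits: QI, IR and EIM (x2APIC) support. */
static const struct bit_name ecap_bits[] = {
    { 1, "QI" }, { 3, "IR" }, { 4, "EIM" },
};

/* Write the names of all set bits into buf, space separated.
 * Returns the number of characters written (excluding the NUL). */
size_t decode_bits(uint64_t val, const struct bit_name *tbl, size_t n,
                   char *buf, size_t len)
{
    size_t used = 0;

    if (len == 0)
        return 0;
    buf[0] = '\0';

    for (size_t i = 0; i < n; i++) {
        int w;

        if (!(val & (UINT64_C(1) << tbl[i].bit)))
            continue;

        w = snprintf(buf + used, len - used, "%s%s",
                     used ? " " : "", tbl[i].name);
        if (w < 0 || (size_t)w >= len - used) {
            buf[used] = '\0';   /* drop a partially written name */
            break;
        }
        used += (size_t)w;
    }
    return used;
}

size_t decode_gsts(uint32_t gsts, char *buf, size_t len)
{
    return decode_bits(gsts, gsts_bits,
                       sizeof(gsts_bits) / sizeof(gsts_bits[0]), buf, len);
}

size_t decode_ecap(uint64_t ecap, char *buf, size_t len)
{
    return decode_bits(ecap, ecap_bits,
                       sizeof(ecap_bits) / sizeof(ecap_bits[0]), buf, len);
}
```

Calling decode_gsts(0xc7000000, ...) yields "TES RTPS QIES IRES IRTPS",
and decode_ecap(0xf0101a, ...) yields "QI IR EIM". If the assumed bit
layout is right, that means the unit reported queued invalidation and
interrupt remapping as enabled when the registers were dumped, and it
advertises EIM, which would fit qinval being turned on via the x2apic
path.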