From: Andrew Cooper <andrew.cooper3@citrix.com>
To: Marek Marczykowski <marmarek@invisiblethingslab.com>
Cc: "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>,
Jan Beulich <JBeulich@suse.com>,
Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Subject: Re: High CPU temp, suspend problem - xen 4.1.5-pre, linux 3.7.x
Date: Thu, 28 Mar 2013 17:41:15 +0000
Message-ID: <515480BB.6070309@citrix.com>
In-Reply-To: <51532927.8040908@invisiblethingslab.com>
On 27/03/2013 17:15, Marek Marczykowski wrote:
> On 27.03.2013 17:56, Andrew Cooper wrote:
>> On 27/03/2013 15:47, Konrad Rzeszutek Wilk wrote:
>>> On Wed, Mar 27, 2013 at 02:52:14PM +0000, Andrew Cooper wrote:
>>>> On 27/03/2013 14:46, Andrew Cooper wrote:
>>>>> On 27/03/2013 14:31, Marek Marczykowski wrote:
>>>>>> On 27.03.2013 09:52, Jan Beulich wrote:
>>>>>>>>>> On 26.03.13 at 19:50, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>>>>>>>> So vector e9 doesn't appear to be programmed in anywhere.
>>>>>>> Quite obviously, as it's the 8259A vector for IRQ 9. The question
>>>>>>> really is why an IRQ appears on that vector in the first place. The
>>>>>>> 8259A resume code _should_ leave all IRQs masked on a fully
>>>>>>> IO-APIC system (see my question raised yesterday).
>>>>>>>
>>>>>>> And that's also why I suggested, for an experiment, to fiddle with
>>>>>>> the loop exit condition to exclude legacy vectors (which wouldn't
>>>>>>> be a final solution, but would at least tell us whether the direction
>>>>>>> is the right one). In the end, besides understanding why an
>>>>>>> interrupt on vector E9 gets raised at all, we may also need to
>>>>>>> tweak the IRQ migration logic to not do anything on legacy IRQs,
>>>>>>> but that would need to happen earlier than in
>>>>>>> smp_irq_move_cleanup_interrupt(). Considering that 4.3
>>>>>>> apparently doesn't have this problem, we may need to go hunt for
>>>>>>> a change that isn't directly connected to this, yet deals with the
>>>>>>> problem as a side effect (at least I don't recall any particular fix
>>>>>>> since 4.2). One aspect here is the double mapping of legacy IRQs
>>>>>>> (once to their IO-APIC vector, and once to their legacy vector,
>>>>>>> i.e. vector_irq[] having two entries pointing to the same IRQ).
>>>>>> So tried change loop condition to LAST_DYNAMIC_VECTOR and it doesn't hit that
>>>>>> BUG/ASSERT. But still it doesn't work - only CPU0 used by scheduler, also some
>>>>>> errors from dom0 kernel, and errors about PCI devices used by domU(1).
>>>>>>
>>>>>> Messages from resume (different tries):
>>>>>> http://duch.mimuw.edu.pl/~marmarek/qubes/xen-4.1-last-dynamic-vector.log
>>>>>> http://duch.mimuw.edu.pl/~marmarek/qubes/xen-4.1-last-dynamic-vector2.log
>>>>>>
>>>>>> Also one time I've got fatal page fault error, earlier in resume (it isn't
>>>>>> deterministic):
>>>>>> http://duch.mimuw.edu.pl/~marmarek/qubes/xen-4.1-resume-page-fault.log
>>>>>>
>>>>> This pagefault is a Null structure pointer dereference, likely the
>>>>> scheduling data. At a first glance, it looks related to the assertion
>>>>> failures I have been seeing sporadically in testing, but unable to
>>>>> reproduce reliably. There seems to be something quite dodgy with
>>>>> interaction of vcpu_wake and scheduling loops.
>>>>>
>>>>> The other logs indicate that dom0 appears to have a domain id of 1,
>>>>> which is sure to cause problems.
>>>> Actually - ignore this
>>>>
>>>> From the log,
>>>>
>>>> (XEN) physdev.c:153: dom0: can't create irq for msi!
>>>> [ 113.637037] xhci_hcd 0000:03:00.0: xen map irq failed -22 for 32752 domain
>>>> (XEN) physdev.c:153: dom0: can't create irq for msi!
>>>> [ 113.657911] xhci_hcd 0000:03:00.0: xen map irq failed -22 for 32752 domain
>>>>
>>>> and later
>>>>
>>>> (XEN) physdev.c:153: dom1: can't create irq for msi!
>>>> [ 121.909814] pciback 0000:00:19.0: xen map irq failed -22 for 1 domain
>>>> [ 121.954080] error enable msi for guest 1 status ffffffea
>>>> (XEN) physdev.c:153: dom1: can't create irq for msi!
>>>> [ 122.035355] pciback 0000:00:19.0: xen map irq failed -22 for 1 domain
>>>> [ 122.044421] error enable msi for guest 1 status ffffffea
>>>>
>>>> I think that there is a separate bug where mapped irqs are not unmapped
>>>> on the suspend path.
>>> You thinking this is a Linux (xen irq machinery) issue? Meaning it should
>>> end up calling PHYSDEV_unmap_pirq as part of the suspend process?
>> I am not sure. Without looking at the code, I am only speculating.
>>
>> Beyond that, the main question is about the expected behaviour. Do we
>> expect dom0/U to unmap its irqs and remap them after resume? What do we
>> expect from domains which are unaware of the host sleep action?
> BTW this is the case: domain 1 isn't fully aware of sleep. It have some PCI
> devices assigned. The only action taken there before suspend is shutdown
> network interfaces (without this system hanged during suspend).
>
What do you mean here by shutting down the network interfaces? Are the
devices being assigned back to dom0? If so, is dom0 assigning them back
to domU before the domU driver tries to set itself up?
~Andrew
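
A note on the log excerpts quoted above: the "-22" in the "xen map irq
failed" lines and the "status ffffffea" in the pciback lines appear to be
the same value, assuming the status word is a 32-bit two's-complement
errno-style return code. The Python sketch below illustrates that reading;
decode_status is a hypothetical helper written for this note, not a
function from Xen or Linux.

```python
import errno


def decode_status(raw):
    """Interpret a 32-bit status word as a signed, errno-style return value.

    Assumption: negative values are negated errno numbers, as is the
    usual convention for Linux kernel return codes.
    """
    value = raw & 0xFFFFFFFF
    # Reinterpret the unsigned 32-bit pattern as two's-complement signed.
    signed = value - (1 << 32) if value & 0x80000000 else value
    if signed < 0:
        name = errno.errorcode.get(-signed, "unknown")
    else:
        name = "no error"
    return signed, name


# "status ffffffea" from the pciback messages
print(decode_status(0xFFFFFFEA))
```

Under that assumption both messages report EINVAL, which is consistent
with the hypervisor rejecting the MSI mapping request ("can't create irq
for msi!") rather than with two unrelated failures.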