From: Andrew Cooper <andrew.cooper3@citrix.com>
To: Marek Marczykowski <marmarek@invisiblethingslab.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
	Jan Beulich <JBeulich@suse.com>,
	"xen-devel@lists.xen.org" <xen-devel@lists.xen.org>
Subject: Re: High CPU temp, suspend problem - xen 4.1.5-pre, linux 3.7.x
Date: Wed, 27 Mar 2013 18:56:29 +0000
Message-ID: <515340DD.7020102@citrix.com>
In-Reply-To: <51533771.3080808@invisiblethingslab.com>

On 27/03/2013 18:16, Marek Marczykowski wrote:
> On 27.03.2013 17:27, Andrew Cooper wrote:
>> On 27/03/2013 15:51, Marek Marczykowski wrote:
>>> On 27.03.2013 15:49, Marek Marczykowski wrote:
>>>> On 27.03.2013 15:46, Andrew Cooper wrote:
>>>>> As for locating the cause of the legacy vectors, it might be a good idea
>>>>> to stick a printk at the top of do_IRQ() which indicates an interrupt
>>>>> with vector between 0xe0 and 0xef.  This might at least indicate whether
>>>>> legacy vectors are genuinely being delivered, or whether we have some
>>>>> memory corruption causing these effects.
>>>> Ok, will try something like this.
>>> Nothing interesting here...
>>> Only vector 0xf1 for irq 4 and 0xf0 for irq 0 (which match irq dump information).
>>>
>> Even in the case where we hit the original assertion?
> Yes, even then.
>
>> If so, then all I can think is that the move_pending flag for that
>> specific GSI has been corrupted in memory somehow.
> I guess this isn't the case, see below.
>
>> I wonder if hexdumping irq_desc[9] after setup, before sleep, on resume
>> and in the case of the assertion failure might give some hints.
> I've tried something like this. Detailed log here:
> http://duch.mimuw.edu.pl/~marmarek/qubes/xen-4.1-suspend-irq9-dump.log

This is concerning, unless I am getting utterly confused.  Jan: would you
mind double-checking my reasoning?

IRQs 0 through 15 should be the PIC IRQs, set up in init_IRQ() in
arch/x86/i8259.c

IRQ 9 should be the IRQ for the PIC vector which is set up as 0xe9, and
its vector should never change.

Could you add extra sanity checks on per_cpu(vector_irq, cpu) for
vectors 0xe0 through 0xef?
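
A minimal standalone sketch of such a check follows. It is not Xen code:
a plain array stands in for one CPU's per_cpu(vector_irq, cpu), and the
helper names, the -1 "unassigned" marker and the assumption that vector
0xe0 + n should map back to irq n are all illustrative.

```c
#include <assert.h>
#include <stdio.h>

/* Assumed values for illustration; the thread only states that the
 * legacy vectors span 0xe0..0xef and that irq 9 sits at 0xe9. */
#define NR_VECTORS           256
#define FIRST_LEGACY_VECTOR  0xe0
#define LAST_LEGACY_VECTOR   0xef

/* Hypothetical stand-in for one CPU's vector -> irq table.
 * -1 marks an unassigned vector. */
void init_vector_table(int table[NR_VECTORS])
{
    assert(table != NULL);
    for (int v = 0; v < NR_VECTORS; v++)
        table[v] = -1;
    /* Assumption: legacy vector 0xe0 + n belongs to PIC irq n. */
    for (int v = FIRST_LEGACY_VECTOR; v <= LAST_LEGACY_VECTOR; v++)
        table[v] = v - FIRST_LEGACY_VECTOR;
}

/* Print every legacy vector whose entry is not the expected PIC irq
 * and return the number of mismatches found. */
int count_legacy_mismatches(const int table[NR_VECTORS], int cpu)
{
    int bad = 0;

    assert(table != NULL);
    for (int v = FIRST_LEGACY_VECTOR; v <= LAST_LEGACY_VECTOR; v++) {
        int expected = v - FIRST_LEGACY_VECTOR;

        if (table[v] != expected) {
            printf("cpu%d: vector 0x%02x -> irq %d, expected %d\n",
                   cpu, v, table[v], expected);
            bad++;
        }
    }
    return bad;
}
```

In the hypervisor the equivalent loop would use printk() and the real
per-CPU table, and could be called after setup, before sleep and on
resume, matching the points at which irq_desc[9] was dumped.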

>
> Some interesting parts:
> after system startup:
> (XEN) irq_cfg of IRQ 9:
> (XEN)   vector: 138
> (XEN)   move_cleanup_count: 0x0
> (XEN)   move_in_progress: 0x0
> (XEN) irq_desc of IRQ 9:
> (XEN)   status: 80 (IRQ_GUEST | IRQ_PENDING)
>
> Isn't this wrong (status vs move_in_progress)?

This here looks fine.  What do you think is wrong about it?

>
> Then I ran pm-suspend, intentionally failing it at the end to prevent the
> actual suspend while still running all its hooks. After that:
> (XEN) irq_cfg of IRQ 9:
> (XEN)   vector: 181
> (XEN)   move_cleanup_count: 0x0
> (XEN)   move_in_progress: 0x1
> (XEN) irq_desc of IRQ 9:
> (XEN)   status: 80
>
> So now move_in_progress is consistent with status.
> I waited a few seconds, and move_in_progress was still 0x1. Isn't it
> supposed to be only a temporary state?

move_in_progress gets set by __assign_irq_vector() when the scheduler
decides to move the IRQ.  It can stay set for a long time.

On the next interrupt from this source, the move_in_progress bit being
set causes the IRQ source to be reprogrammed to the new destination.
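
The following toy model sketches why the flag can stay set for as long as
the source is quiet. It is a simplification under stated assumptions, not
the Xen implementation: struct toy_irq_cfg and both helpers are
hypothetical names, and "completing the move" is reduced to clearing the
flag when an interrupt arrives on the new vector.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, cut-down stand-in for the irq_cfg fields dumped above. */
struct toy_irq_cfg {
    int  vector;            /* vector currently assigned */
    int  old_vector;        /* vector being moved away from, or -1 */
    bool move_in_progress;  /* set at assignment, cleared on completion */
};

/* Models the effect attributed to __assign_irq_vector(): choose a new
 * vector and mark the move as pending. Nothing is cleaned up here. */
void toy_assign_vector(struct toy_irq_cfg *cfg, int new_vector)
{
    assert(cfg != NULL);
    if (new_vector == cfg->vector)
        return;                     /* nothing to move */
    cfg->old_vector = cfg->vector;
    cfg->vector = new_vector;
    cfg->move_in_progress = true;
}

/* Models interrupt arrival. Only an interrupt on the new vector
 * completes the move. Returns true if this call completed it. */
bool toy_handle_interrupt(struct toy_irq_cfg *cfg, int vector)
{
    assert(cfg != NULL);
    if (!cfg->move_in_progress || vector != cfg->vector)
        return false;
    cfg->move_in_progress = false;
    cfg->old_vector = -1;
    return true;
}
```

With this model, calling toy_assign_vector() to move from 138 to 181 and
then waiting without any interrupt leaves move_in_progress set
indefinitely, which is consistent with the dump taken a few seconds after
pm-suspend.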

>
> Then I suspended, and at resume hit that bug. The state was:
> (XEN) irq_cfg of IRQ 9:
> (XEN)   vector: 60
> (XEN)   move_cleanup_count: 0x0
> (XEN)   move_in_progress: 0x0
> (XEN) irq_desc of IRQ 9:
> (XEN)   status: 16
>
> move_in_progress==0, ok. But move_cleanup_count==0, even though
> move_in_progress was 1 at least once. Isn't that wrong?
>

move_cleanup_count is only set in send_cleanup_vector(), for the specific
vector which is being cleaned up.

However, as the IPI handler cleans up all vectors which are outstanding,
move_cleanup_count can be 0 for most of the vectors which actually get
cleaned up.

This is an attempt to reduce the number of IPIs required to clean up
all moving irqs.  As the scheduler currently has a habit of moving vcpus
at every scheduling opportunity, irqs are constantly moving.

~Andrew
