From: "Cédric Le Goater" <clg@kaod.org>
To: Frederic Barrat <fbarrat@linux.ibm.com>, <danielhb413@gmail.com>,
	<qemu-ppc@nongnu.org>, <qemu-devel@nongnu.org>
Subject: Re: [PATCH] target/ppc: cpu_init: Clean up stop state on cpu reset
Date: Wed, 15 Jun 2022 11:40:55 +0200
Message-ID: <20992f15-dd7f-f089-57e0-e934fb121d4a@kaod.org>
In-Reply-To: <3da1094b-b200-49ad-3a7c-dae31a7e7658@linux.ibm.com>

On 6/15/22 09:17, Frederic Barrat wrote:
> 
> 
> On 15/06/2022 07:23, Cédric Le Goater wrote:
>> On 6/14/22 10:29, Frederic Barrat wrote:
>>> The 'resume_as_sreset' attribute of a cpu can be set when a thread is
>>> entering a stop state on ppc books. It causes the thread to be
>>> re-routed to vector 0x100 when woken up by an exception. So it must be
>>> cleaned on reset or a thread might be re-routed unexpectedly after a
>>> reset, when it was not in a stop state and/or when the appropriate
>>> exception handler isn't set up yet.
>>
>> What is the test scenario, and what are the symptoms?
> 
> 
> I was hitting it because of another bug in skiboot: if you have many chips, we spend way too much time in add_opal_interrupts(), especially on powernv10 (I'm working on a separate patch in skiboot to fix that). It takes long enough that the watchdog timer resets the system. When that happens, all the secondary threads are in a stop state and only the main thread is running. That's how I was reproducing it.
> 
> What happens after the reset can vary a bit due to timing, but the most likely scenario is that we go through another primary thread election in skiboot. If the primary thread is the same as before, then there's no problem. If it's a different primary, then it will enter main_cpu_entry() while the other threads wait as secondaries. At some point, the primary thread (which still carries the stale resume_as_sreset value from before the reset) will enable the decrementer interrupt. The vector for the decrementer exception, 0x900, is defined, so that shouldn't be a problem. However, because of the stale resume_as_sreset value, the exception is re-routed to vector 0x100, which still points to the default boot-time handler, i.e. the entry point for BML. So the thread restarts from scratch, but this time it is elected as a secondary. We end up with all threads waiting as secondaries and a stuck system. All of that happens before we've initialized the UART, so there's not a single trace on the console.
> Fun :-)

Great analysis!

I think this deserves a v2, just to put what you just wrote
in the commit log :)
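
For readers following the thread, the re-routing described above can be
modeled roughly as below. This is only a toy sketch under stated
assumptions, not QEMU code: ToyCPU, toy_enter_stop(), toy_deliver() and
toy_reset() are hypothetical names, and the model assumes the flag is
consumed by the first exception that wakes the thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified model; these are not QEMU's data structures. */
enum { VEC_SRESET = 0x100, VEC_DECR = 0x900 };

typedef struct ToyCPU {
    bool resume_as_sreset;   /* set when the thread enters a stop state */
} ToyCPU;

/* Entering a stop state arms the "wake up through 0x100" behaviour. */
static void toy_enter_stop(ToyCPU *cpu)
{
    cpu->resume_as_sreset = true;
}

/* Returns the vector actually taken when an exception is delivered. */
static uint32_t toy_deliver(ToyCPU *cpu, uint32_t vector)
{
    if (cpu->resume_as_sreset) {
        /* Assumption: the flag is consumed on the first wakeup. */
        cpu->resume_as_sreset = false;
        return VEC_SRESET;
    }
    return vector;
}

/* clear_stop_state == false models a reset that leaves the flag stale. */
static void toy_reset(ToyCPU *cpu, bool clear_stop_state)
{
    if (clear_stop_state) {
        cpu->resume_as_sreset = false;
    }
}
```

With toy_reset(cpu, false), a thread that was stopped before the reset
takes its first decrementer exception at 0x100 instead of 0x900, which
matches the symptom described above. With toy_reset(cpu, true), the
exception goes to 0x900 as expected.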

Thanks,

C.
