From: David Gibson <david@gibson.dropbear.id.au>
To: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Cc: qemu-ppc@nongnu.org, qemu-devel@nongnu.org, clg@kaod.org,
bharata@linux.vnet.ibm.com, benh@kernel.crashing.org
Subject: Re: [Qemu-devel] [PATCH v3] spapr: disable decrementer during reset
Date: Wed, 19 Jul 2017 14:32:50 +1000
Message-ID: <20170719043250.GW3140@umbus.fritz.box>
In-Reply-To: <87o9shm6eb.fsf@abhimanyu.i-did-not-set--mail-host-address--so-tickle-me>
On Wed, Jul 19, 2017 at 09:20:52AM +0530, Nikunj A Dadhania wrote:
> David Gibson <david@gibson.dropbear.id.au> writes:
>
> > On Tue, Jul 18, 2017 at 10:53:01AM +0530, Nikunj A Dadhania wrote:
> >> David Gibson <david@gibson.dropbear.id.au> writes:
> >>
> >> > On Mon, Jul 17, 2017 at 09:46:39AM +0530, Nikunj A Dadhania wrote:
> >> >> Rebooting a SMP TCG guest is broken for both single/multi threaded TCG.
> >> >>
> >> >> When reset happens, all the CPUs are in halted state. First CPU is brought out
> >> >> of reset and secondary CPUs would be initialized by the guest kernel using a
> >> >> rtas call start-cpu.
> >> >>
> >> >> However, in case of TCG, decrementer interrupts keep on coming and waking the
> >> >> secondary CPUs up.
> >> >>
> >> >> These secondary CPUs would see the decrementer interrupt pending, which makes
> >> >> cpu::has_work() to bring them out of wait loop and start executing
> >> >> tcg_exec_cpu().
> >> >>
> >> >> The problem with this is all the CPUs wake up and start booting SLOF image,
> >> >> causing the following exception(4 CPUs TCG VM):
> >> >
> >> > Ok, I'm still trying to understand why the behaviour on reboot is
> >> > different from the first boot.
> >>
> >> During first boot, the cpu is in the stopped state, so
> >> cpus.c:cpu_thread_is_idle returns true and CPU remains in halted state
> >> until rtas start-cpu. Therefore, we never check the cpu_has_work()
> >>
> >> In case of reboot, all CPUs are resumed after reboot. So we check the
> >> next condition cpu_has_work() in cpu_thread_is_idle(), where we see a
> >> DECR interrupt and remove the CPU from halted state as the CPU has
> >> work.
> >
> > Ok, so it sounds like we should set stopped on all the secondary CPUs
> > on reset as well. What's causing them to be resumed after the reset
> > at the moment?
>
> That is part of the main loop in vl.c, when reset is requested. All the
> vcpus are paused (stopped == true) then system reset is issued, and all
> cpus are resumed (stopped == false). Which is correct.
Is it? It seems we have a different value of 'stopped' on the first boot
compared to reboots, which doesn't seem right.
> static bool main_loop_should_exit(void)
> {
> [...]
>     request = qemu_reset_requested();
>     if (request) {
>         pause_all_vcpus();
>         qemu_system_reset(request);
>         resume_all_vcpus();
>         if (!runstate_check(RUN_STATE_RUNNING) &&
>                 !runstate_check(RUN_STATE_INMIGRATE)) {
>             runstate_set(RUN_STATE_PRELAUNCH);
>         }
>     }
> [...]
> }
>
> The CPUs are in halted state, i.e. cpu::halted = 1. We have set
> cpu::halted = 0 for the primary CPU, aka first_cpu, in
> spapr.c::ppc_spapr_reset()
>
> In case of TCG, we have a decr callback (per CPU) scheduled once the
> machine is started which keeps on running and interrupting irrespective
> of the state of the machine.
Right. The thing is "halted" means waiting-for-interrupt; it's mostly
used for things like x86 hlt instruction, H_CEDE, short-term nap modes
and so forth. We're abusing it a little for keeping the secondary
CPUs offline until they're explicitly started.
Trouble is, there isn't a clearly better option - stopped is
automatically turned off after reset as above, so it can't be used for
"stopped under firmware / hypervisor authority".
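To make the first-boot versus reboot difference concrete, here is a toy
model of the wake-up path described in this thread. It is only a sketch,
not QEMU code: struct toy_cpu and every toy_* function are hypothetical
names invented for illustration, and the idle check is deliberately
reduced to the three flags under discussion (stopped, halted, and a
pending decrementer interrupt). The real cpu_thread_is_idle() and
cpu_handle_halt() consider more state than this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for per-CPU state; NOT QEMU's CPUState. */
struct toy_cpu {
    bool stopped;       /* paused by the main loop; cleared again on resume */
    bool halted;        /* waiting for an interrupt */
    bool decr_pending;  /* decrementer interrupt queued by the timer */
};

/* Assumption: the only "work" modelled here is a pending DECR interrupt. */
static bool toy_cpu_has_work(const struct toy_cpu *cpu)
{
    assert(cpu != NULL);
    return cpu->decr_pending;
}

/* Simplified idle check: 'stopped' is tested before has_work is consulted. */
static bool toy_cpu_thread_is_idle(const struct toy_cpu *cpu)
{
    assert(cpu != NULL);
    if (cpu->stopped) {
        return true;
    }
    if (!cpu->halted || toy_cpu_has_work(cpu)) {
        return false;
    }
    return true;
}

/* Simplified halt handling: pending work clears 'halted'.
 * Returns true if the CPU remains halted. */
static bool toy_cpu_handle_halt(struct toy_cpu *cpu)
{
    assert(cpu != NULL);
    if (cpu->halted) {
        if (!toy_cpu_has_work(cpu)) {
            return true;
        }
        cpu->halted = false;
    }
    return false;
}

/* One pass of the CPU thread loop: does this CPU begin executing code? */
static bool toy_secondary_starts_executing(struct toy_cpu *cpu)
{
    if (toy_cpu_thread_is_idle(cpu)) {
        return false;   /* stays parked in the wait loop */
    }
    return !toy_cpu_handle_halt(cpu);
}
```

With { stopped = true, halted = true, decr_pending = true }, the first-boot
case, the secondary stays parked because the stopped check short-circuits
everything else. With stopped = false and the other two flags unchanged,
the post-reset case, the same pending decrementer clears halted and the
CPU starts running. That is why halted alone cannot stand in for "held
offline until start-cpu".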
> tb_env->decr_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, &cpu_ppc_decr_cb, cpu);
>
> Which keeps on queueing the interrupt to the CPUs. All the other CPUs
> are in a tight loop checking cpu_thread_is_idle(), which returns false
> when it sees the decrementer interrupt, the cpu starts executing:
>
> cpu_exec()
> -> cpu_handle_halt()
> -> sees cpu_has_work() and sets cpu->halted = 0
>
> And the execution resumes, when it shouldn't have. Ideally, for secondary
> CPUs, cpu::halted flag is cleared in rtas start-cpu call.
>
> Initially, I thought we should not have interrupt during reset state.
> That was the reason of ignoring decr interrupt when msr_ee was disabled
> in my previous patch. BenH suggested that it is wrong from HW
> perspective.
>
> Regards,
> Nikunj
>
--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson