qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
To: David Gibson <david@gibson.dropbear.id.au>
Cc: qemu-ppc@nongnu.org, qemu-devel@nongnu.org, clg@kaod.org,
	bharata@linux.vnet.ibm.com, benh@kernel.crashing.org
Subject: Re: [Qemu-devel] [PATCH v3] spapr: disable decrementer during reset
Date: Wed, 19 Jul 2017 09:20:52 +0530	[thread overview]
Message-ID: <87o9shm6eb.fsf@abhimanyu.i-did-not-set--mail-host-address--so-tickle-me> (raw)
In-Reply-To: <20170718065055.GG3140@umbus.fritz.box>

David Gibson <david@gibson.dropbear.id.au> writes:

> On Tue, Jul 18, 2017 at 10:53:01AM +0530, Nikunj A Dadhania wrote:
>> David Gibson <david@gibson.dropbear.id.au> writes:
>> 
>> > On Mon, Jul 17, 2017 at 09:46:39AM +0530, Nikunj A Dadhania wrote:
>> >> Rebooting a SMP TCG guest is broken for both single/multi threaded TCG.
>> >> 
>> >> When reset happens, all the CPUs are in halted state. First CPU is brought out
>> >> of reset and secondary CPUs would be initialized by the guest kernel using a
>> >> rtas call start-cpu.
>> >> 
>> >> However, in case of TCG, decrementer interrupts keep on coming and waking the
>> >> secondary CPUs up.
>> >> 
>> >> These secondary CPUs would see the decrementer interrupt pending, which makes
>> >> cpu::has_work() to bring them out of wait loop and start executing
>> >> tcg_exec_cpu().
>> >> 
>> >> The problem with this is all the CPUs wake up and start booting SLOF image,
>> >> causing the following exception(4 CPUs TCG VM):
>> >
>> > Ok, I'm still trying to understand why the behaviour on reboot is
>> > different from the first boot.
>> 
>> During first boot, the cpu is in the stopped state, so
>> cpus.c:cpu_thread_is_idle returns true and CPU remains in halted state
>> until rtas start-cpu. Therefore, we never check the cpu_has_work()
>> 
>> In case of reboot, all CPUs are resumed after reboot. So we check the
>> next condition cpu_has_work() in cpu_thread_is_idle(), where we see a
>> DECR interrupt and remove the CPU from halted state as the CPU has
>> work.
>
> Ok, so it sounds like we should set stopped on all the secondary CPUs
> on reset as well.  What's causing them to be resumed after the reset
> at the moment?

That is part of the main loop in vl.c, when reset is requested. All the
vcpus are paused (stopped == true) then system reset is issued, and all
cpus are resumed (stopped == false). Which is correct.

static bool main_loop_should_exit(void)
{
[...]
    request = qemu_reset_requested();
    if (request) {
        pause_all_vcpus();
        qemu_system_reset(request);
        resume_all_vcpus();
        if (!runstate_check(RUN_STATE_RUNNING) &&
                !runstate_check(RUN_STATE_INMIGRATE)) {
            runstate_set(RUN_STATE_PRELAUNCH);
        }
    }
[...]
}

The CPUs are in halted state, i.e. cpu::halted = 1. We have set
cpu::halted = 0 for the primary CPU, aka first_cpu, in
spapr.c::ppc_spapr_reset()

In case of TCG, we have a decr callback (per CPU) scheduled once the
machine is started which keeps on running and interrupting irrespective
of the state of the machine.

tb_env->decr_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, &cpu_ppc_decr_cb, cpu);

Which keeps on queueing the interrupt to the CPUs. All the other CPUs
are in a tight loop checking cpu_thread_is_idle(), which returns false
when it sees the decrementer interrupt, the cpu starts executing:

cpu_exec()
  -> cpu_handle_halt()
     -> sees cpu_has_work() and sets cpu->halted = 0

And the execution resumes, when it shouldnt have. Ideally, for secondary
CPUs, cpu::halted flag is cleared in rtas start-cpu call.

Initially, I thought we should not have interrupt during reset state.
That was the reason of ignoring decr interrupt when msr_ee was disabled
in my previous patch. BenH suggested that it is wrong from HW
perspective.

Regards,
Nikunj

  reply	other threads:[~2017-07-19  3:51 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-07-17  4:16 [Qemu-devel] [PATCH v3] spapr: disable decrementer during reset Nikunj A Dadhania
2017-07-18  4:46 ` David Gibson
2017-07-18  5:17   ` Nikunj A Dadhania
2017-07-18  5:23   ` Nikunj A Dadhania
2017-07-18  6:50     ` David Gibson
2017-07-19  3:50       ` Nikunj A Dadhania [this message]
2017-07-19  4:32         ` David Gibson
2017-09-14  6:02           ` Nikunj A Dadhania
2017-07-18  5:26   ` Nikunj A Dadhania
2017-07-18 11:01     ` Benjamin Herrenschmidt
2017-09-27 20:36 ` Cédric Le Goater

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87o9shm6eb.fsf@abhimanyu.i-did-not-set--mail-host-address--so-tickle-me \
    --to=nikunj@linux.vnet.ibm.com \
    --cc=benh@kernel.crashing.org \
    --cc=bharata@linux.vnet.ibm.com \
    --cc=clg@kaod.org \
    --cc=david@gibson.dropbear.id.au \
    --cc=qemu-devel@nongnu.org \
    --cc=qemu-ppc@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).