From: Thomas Gleixner <tglx@linutronix.de>
To: Pingfan Liu <kernelfans@gmail.com>, linux-kernel@vger.kernel.org
Cc: Pingfan Liu <kernelfans@gmail.com>,
Eric Biederman <ebiederm@xmission.com>,
Peter Zijlstra <peterz@infradead.org>,
Valentin Schneider <valentin.schneider@arm.com>,
Vincent Donnefort <vincent.donnefort@arm.com>,
Ingo Molnar <mingo@kernel.org>,
Mark Rutland <mark.rutland@arm.com>,
YueHaibing <yuehaibing@huawei.com>,
Baokun Li <libaokun1@huawei.com>,
Randy Dunlap <rdunlap@infradead.org>, Baoquan He <bhe@redhat.com>,
kexec@lists.infradead.org
Subject: Re: [PATCHv3 1/2] cpu/hotplug: Keep cpu hotplug disabled until the rebooting cpu is stable
Date: Mon, 09 May 2022 12:55:21 +0200 [thread overview]
Message-ID: <87ee13rn52.ffs@tglx> (raw)
In-Reply-To: <20220509041305.15056-2-kernelfans@gmail.com>
On Mon, May 09 2022 at 12:13, Pingfan Liu wrote:
> The following code chunk repeats in both
> migrate_to_reboot_cpu() and smp_shutdown_nonboot_cpus():
>
> if (!cpu_online(primary_cpu))
> primary_cpu = cpumask_first(cpu_online_mask);
>
> This is due to a breakage like the following:
I don't see what's broken here.
> kernel_kexec()
> migrate_to_reboot_cpu();
> cpu_hotplug_enable();
> -----------> comes a cpu_down(this_cpu) on other cpu
> machine_shutdown();
> smp_shutdown_nonboot_cpus(); // re-check "if (!cpu_online(primary_cpu))" to protect against the former breakin
>
> Although the kexec-reboot task can get through a cpu_down() on its cpu,
> this code looks a little confusing.
Confusing != broken.
> +/* primary_cpu keeps unchanged after migrate_to_reboot_cpu() */
This comment makes no sense.
> void smp_shutdown_nonboot_cpus(unsigned int primary_cpu)
> {
> unsigned int cpu;
> int error;
>
> + /*
> + * Block other cpu hotplug event, so primary_cpu is always online if
> + * it is not touched by us
> + */
> cpu_maps_update_begin();
> -
> /*
> - * Make certain the cpu I'm about to reboot on is online.
> - *
> - * This is inline to what migrate_to_reboot_cpu() already do.
> + * migrate_to_reboot_cpu() disables CPU hotplug assuming that
> + * no further code needs to use CPU hotplug (which is true in
> + * the reboot case). However, the kexec path depends on using
> + * CPU hotplug again; so re-enable it here.
You want to reduce confusion, but in reality this is even more confusing
than before.
> */
> - if (!cpu_online(primary_cpu))
> - primary_cpu = cpumask_first(cpu_online_mask);
> + __cpu_hotplug_enable();
How is this decrement solving anything? At the end of this function, the
counter is incremented again. So what's the point of this exercise?
> for_each_online_cpu(cpu) {
> if (cpu == primary_cpu)
> diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
> index 68480f731192..db4fa6b174e3 100644
> --- a/kernel/kexec_core.c
> +++ b/kernel/kexec_core.c
> @@ -1168,14 +1168,12 @@ int kernel_kexec(void)
> kexec_in_progress = true;
> kernel_restart_prepare("kexec reboot");
> migrate_to_reboot_cpu();
> -
> /*
> - * migrate_to_reboot_cpu() disables CPU hotplug assuming that
> - * no further code needs to use CPU hotplug (which is true in
> - * the reboot case). However, the kexec path depends on using
> - * CPU hotplug again; so re-enable it here.
> + * migrate_to_reboot_cpu() disables CPU hotplug. If an arch
> + * relies on the cpu teardown to achieve reboot, it needs to
> + * re-enable CPU hotplug there.
What does that for arch/powerpc/kernel/kexec_machine64.c now?
Nothing, as far as I can tell. Which means you basically reverted
011e4b02f1da ("powerpc, kexec: Fix "Processor X is stuck" issue during
kexec from ST mode") unless I'm completely confused.
> */
> - cpu_hotplug_enable();
This is tinkering at best. Can we please sit down and rethink this whole
machinery instead of applying random duct tape to it?
Thanks,
tglx
next prev parent reply other threads:[~2022-05-09 10:55 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-05-09 4:13 [PATCHv3 0/2] cpu/hotplug: Keep cpu hotplug disabled until the rebooting cpu is stable Pingfan Liu
2022-05-09 4:13 ` [PATCHv3 1/2] " Pingfan Liu
2022-05-09 10:55 ` Thomas Gleixner [this message]
2022-05-09 10:57 ` Thomas Gleixner
2022-05-10 3:38 ` Pingfan Liu
2022-05-10 8:28 ` Thomas Gleixner
2022-05-11 9:09 ` Pingfan Liu
2022-05-09 4:13 ` [PATCHv3 2/2] arm/arm64/ia64: kexec: fix the primary cpu passed to smp_shutdown_nonboot_cpus() Pingfan Liu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87ee13rn52.ffs@tglx \
--to=tglx@linutronix.de \
--cc=bhe@redhat.com \
--cc=ebiederm@xmission.com \
--cc=kernelfans@gmail.com \
--cc=kexec@lists.infradead.org \
--cc=libaokun1@huawei.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mark.rutland@arm.com \
--cc=mingo@kernel.org \
--cc=peterz@infradead.org \
--cc=rdunlap@infradead.org \
--cc=valentin.schneider@arm.com \
--cc=vincent.donnefort@arm.com \
--cc=yuehaibing@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox