All of lore.kernel.org
 help / color / mirror / Atom feed
From: Dengcheng Zhu <dzhu@wavecomp.com>
To: Paul Burton <paul.burton@mips.com>
Cc: pburton@wavecomp.com, ralf@linux-mips.org,
	linux-mips@linux-mips.org, rachel.mozes@intel.com
Subject: Re: [PATCH v3 1/6] MIPS: Make play_dead() work for kexec
Date: Mon, 30 Jul 2018 11:50:38 -0700	[thread overview]
Message-ID: <5B5F5DFE.9090702@wavecomp.com> (raw)
In-Reply-To: <20180724232355.z6j2wvs6srigr7kx@pburton-laptop>

Hi Paul,


Thanks for reviewing. Please see my comments below.


On 07/24/2018 04:23 PM, Paul Burton wrote:
> Hi Dengcheng,
>
> On Mon, Jul 23, 2018 at 07:48:14AM -0700, Dengcheng Zhu wrote:
>> Extract play_dead() from CONFIG_HOTPLUG_CPU and share with CONFIG_KEXEC.
>> Also, add one parameter to it to avoid doing unnecessary things in the case
>> of kexec.
> I'd prefer that we use a separate function to play_dead() for this, for
> example we could provide an implementation of crash_smp_send_stop() much
> like ARM's which invokes a machine_crash_nonpanic_core() function on all
> CPUs other than the crash CPU.

ARM's crash_smp_send_stop() that calls machine_crash_nonpanic_core() is called
in machine_crash_shutdown(), not in machine_kexec(). It's similar in the MIPS
case - default_machine_crash_shutdown().

>
> This would prevent the kexec/kdump functionality from depending on the
> board/platform specific play_dead(), and wouldn't need these changes to
> all of the implementations of play_dead().

The revised play_dead() is JUST to make sure we are turning off CPUs cleanly.
This function itself already hides board/platform details. So it seems a good
candidate for turning off CPUs for the target platform.

This function is called only in the newly created kexec_smp_reboot(), before
which cpu states have been saved.

>
> We should also be calling crash_save_cpu() on each CPU, which is a
> further difference from play_dead().

crash_save_cpu() is already called in machine_crash_shutdown(), which is
prior to machine_kexec().

>
>> This is needed to correctly support SMP new kernel in kexec. Before doing
>> this, all non-crashing CPUs are waiting for the reboot signal
>> (kexec_ready_to_reboot) to jump to kexec_start_address in kexec_smp_wait(),
>> which creates some problems like incorrect CPU topology upon booting from
>> new UP kernel,
> Do you know how that happens? I'd expect detecting topology not to
> depend upon what state CPUs are in. That should certainly be the case
> for smp-cps/CONFIG_MIPS_CPS which detects topology just by reading
> CM/CPC/GIC registers.

I didn't debug the topology issue. But it's related to the attempt of *all*
CPUs using the same code from kexec_start_address, thinking they are
dominating the system.

The cleanest way IMO is simple and what this patch is trying to do - turn off
CPUs and do a fresh SMP boot from c0v0.


>
>> sluggish performance in MT environment during and after reboot,
> The function running on non-crash CPUs would just need to execute a loop
> of wait instructions to avoid this.

Same as the topology problem, I didn't look into this performance issue. But
I did try using 'wait' on non-crash CPUs - it didn't work well. In the clean
way, this problem disappears as expected.

>
>> new SMP kernel not able to bring up secondary CPU etc.
> If the SMP implementation can reset CPUs then that ought not to happen,
> since no matter what the CPU was doing Linux should be able to cause it
> to reset & run some known piece of code. I'm not sure the current Octeon
> SMP code can do that, but there are patches in patchwork that look like
> they might (& patches to remove Octeon's current kexec/kdump code which
> suggests nobody cares much about it).
>
> I'd suggest we could perhaps add a boolean to struct plat_smp_ops to
> indicate whether kexec is supported, and start by setting it to true for
> cps_smp_ops. Then we can have machine_kexec_prepare() return an error if
> it finds !mp_ops->kexec_supported, and deal with enabling kexec per
> platform. I think this would be better than Kconfig because there are
> systems where we may use one of multiple SMP implementations - for
> example Malta might use smp-cps (which would be OK for kexec) or smp-cmp
> (which wouldn't). If we get to a point where all our SMP implementations
> can deal with kexec we could remove the field later.

This problem is also related to the original kexec boot mechanism. No
matter what SMP implementation it is, it should be good if we correctly turn
off CPUs and reboot the (SMP) system from the 1st CPU in its own
implementation.


Regards,

Dengcheng

---------------------------------------------------------------------------

*From:* Paul Burton <mailto:paul.burton@mips.com>
*Sent:* Tuesday, July 24, 2018 4:23PM
*To:* Dengcheng Zhu <mailto:dzhu@wavecomp.com>
*Cc:* Pburton <mailto:pburton@wavecomp.com>, Ralf 
<mailto:ralf@linux-mips.org>, Linux-mips 
<mailto:linux-mips@linux-mips.org>, Rachel.mozes 
<mailto:rachel.mozes@intel.com>
*Subject:* Re: [PATCH v3 1/6] MIPS: Make play_dead() work for kexec

Hi Dengcheng,

On Mon, Jul 23, 2018 at 07:48:14AM -0700, Dengcheng Zhu wrote:

> Extract play_dead() from CONFIG_HOTPLUG_CPU and share with CONFIG_KEXEC.
> Also, add one parameter to it to avoid doing unnecessary things in the case
> of kexec.

I'd prefer that we use a separate function to play_dead() for this, for
example we could provide an implementation of crash_smp_send_stop() much
like ARM's which invokes a machine_crash_nonpanic_core() function on all
CPUs other than the crash CPU.

This would prevent the kexec/kdump functionality from depending on the
board/platform specific play_dead(), and wouldn't need these changes to
all of the implementations of play_dead().

We should also be calling crash_save_cpu() on each CPU, which is a
further difference from play_dead().

> This is needed to correctly support SMP new kernel in kexec. Before doing
> this, all non-crashing CPUs are waiting for the reboot signal
> (kexec_ready_to_reboot) to jump to kexec_start_address in kexec_smp_wait(),
> which creates some problems like incorrect CPU topology upon booting from
> new UP kernel,

Do you know how that happens? I'd expect detecting topology not to
depend upon what state CPUs are in. That should certainly be the case
for smp-cps/CONFIG_MIPS_CPS which detects topology just by reading
CM/CPC/GIC registers.

> sluggish performance in MT environment during and after reboot,

The function running on non-crash CPUs would just need to execute a loop
of wait instructions to avoid this.

> new SMP kernel not able to bring up secondary CPU etc.

If the SMP implementation can reset CPUs then that ought not to happen,
since no matter what the CPU was doing Linux should be able to cause it
to reset & run some known piece of code. I'm not sure the current Octeon
SMP code can do that, but there are patches in patchwork that look like
they might (& patches to remove Octeon's current kexec/kdump code which
suggests nobody cares much about it).

I'd suggest we could perhaps add a boolean to struct plat_smp_ops to
indicate whether kexec is supported, and start by setting it to true for
cps_smp_ops. Then we can have machine_kexec_prepare() return an error if
it finds !mp_ops->kexec_supported, and deal with enabling kexec per
platform. I think this would be better than Kconfig because there are
systems where we may use one of multiple SMP implementations - for
example Malta might use smp-cps (which would be OK for kexec) or smp-cmp
(which wouldn't). If we get to a point where all our SMP implementations
can deal with kexec we could remove the field later.

Thanks,
     Paul

  reply	other threads:[~2018-07-30 18:50 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-07-23 14:48 [PATCH v3 0/6] MIPS: kexec/kdump: Fix smp reboot and other issues Dengcheng Zhu
2018-07-23 14:48 ` [PATCH v3 1/6] MIPS: Make play_dead() work for kexec Dengcheng Zhu
2018-07-24 23:23   ` Paul Burton
2018-07-30 18:50     ` Dengcheng Zhu [this message]
2018-08-30 23:32       ` Paul Burton
2018-08-31 17:04         ` Dengcheng Zhu
2018-07-23 14:48 ` [PATCH v3 2/6] MIPS: kexec: Let the new kernel handle all CPUs Dengcheng Zhu
2018-07-23 14:48 ` [PATCH v3 3/6] MIPS: kexec: Deprecate (relocated_)kexec_smp_wait Dengcheng Zhu
2018-07-23 14:48 ` [PATCH v3 4/6] MIPS: kexec: Do not flush system wide caches in machine_kexec() Dengcheng Zhu
2018-07-23 14:48 ` [PATCH v3 5/6] MIPS: kexec: Relax memory restriction Dengcheng Zhu
2018-07-23 14:48 ` [PATCH v3 6/6] MIPS: kexec: Use prepare method from generic platform as default option Dengcheng Zhu
2018-07-24 23:36   ` Paul Burton
2018-07-30 20:57     ` Dengcheng Zhu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5B5F5DFE.9090702@wavecomp.com \
    --to=dzhu@wavecomp.com \
    --cc=linux-mips@linux-mips.org \
    --cc=paul.burton@mips.com \
    --cc=pburton@wavecomp.com \
    --cc=rachel.mozes@intel.com \
    --cc=ralf@linux-mips.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.