public inbox for linux-arm-kernel@lists.infradead.org
From: sudeep.holla@arm.com (Sudeep Holla)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH v3 5/7] arm64: smp: remove cpu and numa topology information when hotplugging out CPU
Date: Tue, 17 Jul 2018 15:05:36 +0100	[thread overview]
Message-ID: <20180717140536.GA27323@e107155-lin> (raw)
In-Reply-To: <CAMuHMdXw9My+P1UCJ19Qdjq4O9gY_0QVFZaw6C5Ytmpyac59Vw@mail.gmail.com>

On Tue, Jul 17, 2018 at 02:58:14PM +0200, Geert Uytterhoeven wrote:
> Hi Sudeep,
>
> On Fri, Jul 6, 2018 at 1:04 PM Sudeep Holla <sudeep.holla@arm.com> wrote:
> > We already repopulate the information on CPU hotplug-in, so we can safely
> > remove the CPU topology and NUMA cpumap information during CPU hotplug
> > out operation. This will help to provide the correct cpumask for
> > scheduler domains.
> >
> > Cc: Catalin Marinas <catalin.marinas@arm.com>
> > Cc: Will Deacon <will.deacon@arm.com>
> > Tested-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com>
> > Tested-by: Hanjun Guo <hanjun.guo@linaro.org>
> > Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
>
> This is now commit 7f9545aa1a91a9a4 ("arm64: smp: remove cpu and numa
> topology information when hotplugging out CPU") in arm64/for-next/core, to
> which I bisected a PSCI checker regression on systems with two CPU clusters.
>
> Dmesg on R-Car H3 (4xCA57+4xCA53) before/after:
>
>      psci_checker: PSCI checker started using 8 CPUs
>
> 8 CPU cores detected.
>
>      psci_checker: Starting hotplug tests
>      psci_checker: Trying to turn off and on again all CPUs
>      CPU1: shutdown
>      psci: CPU1 killed.
>      CPU2: shutdown
>      psci: CPU2 killed.
>     -NOHZ: local_softirq_pending 55
>      CPU3: shutdown
>      psci: CPU3 killed.
>     -NOHZ: local_softirq_pending 51
>      CPU4: shutdown
>      psci: CPU4 killed.
>      NOHZ: local_softirq_pending 55
>      CPU5: shutdown
>      psci: CPU5 killed.
>      NOHZ: local_softirq_pending 55
>      CPU6: shutdown
>      psci: CPU6 killed.
>      NOHZ: local_softirq_pending 55
>      CPU7: shutdown
>      psci: CPU7 killed.
>      Detected PIPT I-cache on CPU1
>      CPU1: Booted secondary processor 0x0000000001 [0x411fd073]
>      Detected PIPT I-cache on CPU2
>      CPU2: Booted secondary processor 0x0000000002 [0x411fd073]
>      Detected PIPT I-cache on CPU3
>      CPU3: Booted secondary processor 0x0000000003 [0x411fd073]
>      Detected VIPT I-cache on CPU4
>      CPU4: Booted secondary processor 0x0000000100 [0x410fd034]
>      cpufreq: cpufreq_online: CPU4: Running at unlisted freq: 1198080 KHz
>      cpufreq: cpufreq_online: CPU4: Unlisted initial frequency changed to: 1200000 KHz
>      Detected VIPT I-cache on CPU5
>      CPU5: Booted secondary processor 0x0000000101 [0x410fd034]
>      Detected VIPT I-cache on CPU6
>      CPU6: Booted secondary processor 0x0000000102 [0x410fd034]
>      Detected VIPT I-cache on CPU7
>      CPU7: Booted secondary processor 0x0000000103 [0x410fd034]
>
> All but CPU0 tested, as expected.
>

OK, does the firmware on this system not allow CPU0 to be hotplugged out?

>     psci_checker: Trying to turn off and on again group 0 (CPUs 0-3)
>
> 4 big CPU cores detected.
>
>      CPU1: shutdown
>      psci: CPU1 killed.
>     -NOHZ: local_softirq_pending 55
>     +NOHZ: local_softirq_pending 51
>      CPU2: shutdown
>      psci: CPU2 killed.
>      NOHZ: local_softirq_pending 51
>      CPU3: shutdown
>      psci: CPU3 killed.
>      Detected PIPT I-cache on CPU1
>      CPU1: Booted secondary processor 0x0000000001 [0x411fd073]
>      Detected PIPT I-cache on CPU2
>      CPU2: Booted secondary processor 0x0000000002 [0x411fd073]
>      Detected PIPT I-cache on CPU3
>      CPU3: Booted secondary processor 0x0000000003 [0x411fd073]
>
> All but CPU0 tested, as expected.
>
>     psci_checker: Trying to turn off and on again group 1 (CPUs 4-7)
>
> 4 LITTLE CPU cores detected.
>
>      CPU4: shutdown
>      psci: CPU4 killed.
>      NOHZ: local_softirq_pending 55
>     -CPU5: shutdown
>     -psci: CPU5 killed.
>     -NOHZ: local_softirq_pending 55
>     -CPU6: shutdown
>     -psci: CPU6 killed.
>     -NOHZ: local_softirq_pending 55
>     -CPU7: shutdown
>     -psci: CPU7 killed.
>      Detected VIPT I-cache on CPU4
>      CPU4: Booted secondary processor 0x0000000100 [0x410fd034]
>     -cpufreq: cpufreq_online: CPU4: Running at unlisted freq: 1198080 KHz
>     -cpufreq: cpufreq_online: CPU4: Unlisted initial frequency changed to: 1200000 KHz
>     -Detected VIPT I-cache on CPU5
>     -CPU5: Booted secondary processor 0x0000000101 [0x410fd034]
>     -Detected VIPT I-cache on CPU6
>     -CPU6: Booted secondary processor 0x0000000102 [0x410fd034]
>     -Detected VIPT I-cache on CPU7
>     -CPU7: Booted secondary processor 0x0000000103 [0x410fd034]
>
> Whoops, CPU5-7 are not tested.
>

I don't understand what you mean by that. From the logs, it looks fine.
What do you mean by "CPU5-7 are not tested"?

>     psci_checker: Hotplug tests passed OK
>
>
> > --- a/arch/arm64/kernel/smp.c
> > +++ b/arch/arm64/kernel/smp.c
> > @@ -279,6 +279,9 @@ int __cpu_disable(void)
> >         if (ret)
> >                 return ret;
> >
> > +       remove_cpu_topology(cpu);
> > +       numa_remove_cpu(cpu);
> > +
> >         /*
> >          * Take this CPU offline.  Once we clear this, we can't return,
> >          * and we must not schedule until we're ready to give up the cpu.
>
> A simple revert is not sufficient, as that causes
>
>     watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [cpuhp/2:21]
>
> Do you have an idea how to fix this?

Sorry, but I am finding it hard to understand the issue from the log.
Also, it would be good to know your config. Is it defconfig - NUMA as before?

--
Regards,
Sudeep


Thread overview: 17+ messages
2018-07-06 11:02 [PATCH v3 0/7] arm64: numa/topology/smp: update the cpumasks for CPU hotplug Sudeep Holla
2018-07-06 11:02 ` [PATCH v3 1/7] arm64: topology: refactor reset_cpu_topology to add support for removing topology Sudeep Holla
2018-07-06 11:02 ` [PATCH v3 2/7] arm64: numa: separate out updates to percpu nodeid and NUMA node cpumap Sudeep Holla
2018-07-06 11:02 ` [PATCH v3 3/7] arm64: topology: add support to remove cpu topology sibling masks Sudeep Holla
2018-07-06 11:02 ` [PATCH v3 4/7] arm64: topology: restrict updating siblings_masks to online cpus only Sudeep Holla
2018-07-06 11:02 ` [PATCH v3 5/7] arm64: smp: remove cpu and numa topology information when hotplugging out CPU Sudeep Holla
2018-07-17 12:58   ` Geert Uytterhoeven
2018-07-17 14:05     ` Sudeep Holla [this message]
2018-07-17 15:06       ` Geert Uytterhoeven
2018-07-17 15:34         ` Sudeep Holla
2018-07-17 15:55           ` Sudeep Holla
2018-07-17 17:01             ` Sudeep Holla
2018-07-18  7:15               ` Geert Uytterhoeven
2018-07-18 10:33                 ` Sudeep Holla
2018-07-06 11:02 ` [PATCH v3 6/7] arm64: topology: rename llc_siblings to align with other struct members Sudeep Holla
2018-07-06 11:02 ` [PATCH v3 7/7] arm64: topology: re-introduce numa mask check for scheduler MC selection Sudeep Holla
2018-07-10 21:51 ` [PATCH v3 0/7] arm64: numa/topology/smp: update the cpumasks for CPU hotplug Jeremy Linton
