From: Marc Zyngier <maz@kernel.org>
To: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
Cc: <linuxarm@huawei.com>, Thomas Gleixner <tglx@linutronix.de>,
Peter Zijlstra <peterz@infradead.org>, <linux-pm@vger.kernel.org>,
<loongarch@lists.linux.dev>, <linux-acpi@vger.kernel.org>,
<linux-arch@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
<linux-arm-kernel@lists.infradead.org>, <kvmarm@lists.linux.dev>,
<x86@kernel.org>, Russell King <linux@armlinux.org.uk>,
"Rafael J . Wysocki" <rafael@kernel.org>,
Miguel Luis <miguel.luis@oracle.com>,
"James Morse" <james.morse@arm.com>,
Salil Mehta <salil.mehta@huawei.com>,
Jean-Philippe Brucker <jean-philippe@linaro.org>,
Catalin Marinas <catalin.marinas@arm.com>,
Will Deacon <will@kernel.org>, Ingo Molnar <mingo@redhat.com>,
Borislav Petkov <bp@alien8.de>,
Dave Hansen <dave.hansen@linux.intel.com>, <justin.he@arm.com>,
<jianyong.wu@arm.com>
Subject: Re: [PATCH v7 11/16] irqchip/gic-v3: Add support for ACPI's disabled but 'online capable' CPUs
Date: Fri, 26 Apr 2024 13:41:27 +0100 [thread overview]
Message-ID: <86bk5wqoso.wl-maz@kernel.org> (raw)
In-Reply-To: <20240425175502.00007def@huawei.com>
On Thu, 25 Apr 2024 17:55:27 +0100,
Jonathan Cameron <Jonathan.Cameron@Huawei.com> wrote:
>
> On Thu, 25 Apr 2024 16:00:17 +0100
> Jonathan Cameron <Jonathan.Cameron@Huawei.com> wrote:
>
> > On Thu, 25 Apr 2024 13:31:50 +0100
> > Jonathan Cameron <Jonathan.Cameron@Huawei.com> wrote:
> >
> > > On Wed, 24 Apr 2024 16:33:22 +0100
> > > Marc Zyngier <maz@kernel.org> wrote:
[...]
> > >
> > > > I'd expect the handling side to look like this (will not compile, but
> > > > you'll get the idea):
> > > Hi Marc,
> > >
> > > In general this looks good - but...
> > >
> > > I haven't gotten to the bottom of why yet (and it might be a side
> > > effect of how I hacked the test by lying in minimal fashion and
> > > just frigging the MADT read functions) but the hotplug flow is only getting
> > > as far as calling __cpu_up() before it seems to enter an infinite loop.
> > > That is it never gets far enough to fail this test.
> > >
> > > Getting stuck in a psci cpu_on call. I'm guessing something that
> > > we didn't get to in the earlier gicv3 calls before bailing out is blocking that?
> > > Looks like it gets to
> > > SMCCC smc
> > > and is never seen again.
> > >
> > > Any ideas on where to look? The one advantage so far of the higher level
> > > approach is we never tried the hotplug callbacks at all so avoided hitting
> > > that call. One (little bit horrible) solution that might avoid this would
> > > be to add another cpuhp state very early on and fail at that stage.
> > > I'm not keen on doing that without a better explanation than I have so far!
> >
> > Whilst it still doesn't work I suspect I'm loosing ability to print to the console
> > between that point and somewhat later and real problem is
> > elsewhere.
Sorry, travelling at the moment, so only spotted this now.
>
> Hi again,
>
> Found it I think. cpuhp calls between cpu:bringup and ap:online
> arm made from notify_cpu_starting() are clearly marked as nofail with a comment.
> STARTING must not fail!
>
> https://elixir.bootlin.com/linux/latest/source/kernel/cpu.c#L1642
Ah, now that rings a bell! ;-)
>
> Whilst I have no immediate idea why that comment is there it is pretty strong
> argument against trying to have the CPUHP_AP_IRQ_GIC_STARTING callback fail
> and expecting it to carry on working :(
> There would have been a nice print message, but given I don't appear to have
> a working console after that stage I never see it.
>
> So the best I have yet come up with for this is the option of a new callback registered
> in gic_smp_init()
>
> cpuhp_setup_state_nocalls(CPUHP_BP_PREPARE_DYN,
> "irqchip/arm/gicv3:checkrdist",
> gic_broken_rdist, NULL);
>
> with callback being simply
>
> static int gic_broken_rdist(unsigned int cpu)
> {
> if (cpumask_test_cpu(cpu, &broken_rdists))
> return -EINVAL;
>
> return 0;
> }
>
> That gets called cpuhp_up_callbacks() and is allows to fail and roll back the steps.
>
> Not particularly satisfying but keeps the logic confined to the gicv3 driver.
>
> What do you think?
Good enough for me. Cc me on the resulting patch when you repost it so
that I can eyeball it, but this is IMO the right direction.
Thanks,
M.
--
Without deviation from the norm, progress is not possible.
WARNING: multiple messages have this Message-ID (diff)
From: Marc Zyngier <maz@kernel.org>
To: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
Cc: <linuxarm@huawei.com>, Thomas Gleixner <tglx@linutronix.de>,
Peter Zijlstra <peterz@infradead.org>, <linux-pm@vger.kernel.org>,
<loongarch@lists.linux.dev>, <linux-acpi@vger.kernel.org>,
<linux-arch@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
<linux-arm-kernel@lists.infradead.org>, <kvmarm@lists.linux.dev>,
<x86@kernel.org>, Russell King <linux@armlinux.org.uk>,
"Rafael J . Wysocki" <rafael@kernel.org>,
Miguel Luis <miguel.luis@oracle.com>,
"James Morse" <james.morse@arm.com>,
Salil Mehta <salil.mehta@huawei.com>,
Jean-Philippe Brucker <jean-philippe@linaro.org>,
Catalin Marinas <catalin.marinas@arm.com>,
Will Deacon <will@kernel.org>, Ingo Molnar <mingo@redhat.com>,
Borislav Petkov <bp@alien8.de>,
Dave Hansen <dave.hansen@linux.intel.com>, <justin.he@arm.com>,
<jianyong.wu@arm.com>
Subject: Re: [PATCH v7 11/16] irqchip/gic-v3: Add support for ACPI's disabled but 'online capable' CPUs
Date: Fri, 26 Apr 2024 13:41:27 +0100 [thread overview]
Message-ID: <86bk5wqoso.wl-maz@kernel.org> (raw)
In-Reply-To: <20240425175502.00007def@huawei.com>
On Thu, 25 Apr 2024 17:55:27 +0100,
Jonathan Cameron <Jonathan.Cameron@Huawei.com> wrote:
>
> On Thu, 25 Apr 2024 16:00:17 +0100
> Jonathan Cameron <Jonathan.Cameron@Huawei.com> wrote:
>
> > On Thu, 25 Apr 2024 13:31:50 +0100
> > Jonathan Cameron <Jonathan.Cameron@Huawei.com> wrote:
> >
> > > On Wed, 24 Apr 2024 16:33:22 +0100
> > > Marc Zyngier <maz@kernel.org> wrote:
[...]
> > >
> > > > I'd expect the handling side to look like this (will not compile, but
> > > > you'll get the idea):
> > > Hi Marc,
> > >
> > > In general this looks good - but...
> > >
> > > I haven't gotten to the bottom of why yet (and it might be a side
> > > effect of how I hacked the test by lying in minimal fashion and
> > > just frigging the MADT read functions) but the hotplug flow is only getting
> > > as far as calling __cpu_up() before it seems to enter an infinite loop.
> > > That is it never gets far enough to fail this test.
> > >
> > > Getting stuck in a psci cpu_on call. I'm guessing something that
> > > we didn't get to in the earlier gicv3 calls before bailing out is blocking that?
> > > Looks like it gets to
> > > SMCCC smc
> > > and is never seen again.
> > >
> > > Any ideas on where to look? The one advantage so far of the higher level
> > > approach is we never tried the hotplug callbacks at all so avoided hitting
> > > that call. One (little bit horrible) solution that might avoid this would
> > > be to add another cpuhp state very early on and fail at that stage.
> > > I'm not keen on doing that without a better explanation than I have so far!
> >
> > Whilst it still doesn't work I suspect I'm loosing ability to print to the console
> > between that point and somewhat later and real problem is
> > elsewhere.
Sorry, travelling at the moment, so only spotted this now.
>
> Hi again,
>
> Found it I think. cpuhp calls between cpu:bringup and ap:online
> arm made from notify_cpu_starting() are clearly marked as nofail with a comment.
> STARTING must not fail!
>
> https://elixir.bootlin.com/linux/latest/source/kernel/cpu.c#L1642
Ah, now that rings a bell! ;-)
>
> Whilst I have no immediate idea why that comment is there it is pretty strong
> argument against trying to have the CPUHP_AP_IRQ_GIC_STARTING callback fail
> and expecting it to carry on working :(
> There would have been a nice print message, but given I don't appear to have
> a working console after that stage I never see it.
>
> So the best I have yet come up with for this is the option of a new callback registered
> in gic_smp_init()
>
> cpuhp_setup_state_nocalls(CPUHP_BP_PREPARE_DYN,
> "irqchip/arm/gicv3:checkrdist",
> gic_broken_rdist, NULL);
>
> with callback being simply
>
> static int gic_broken_rdist(unsigned int cpu)
> {
> if (cpumask_test_cpu(cpu, &broken_rdists))
> return -EINVAL;
>
> return 0;
> }
>
> That gets called cpuhp_up_callbacks() and is allows to fail and roll back the steps.
>
> Not particularly satisfying but keeps the logic confined to the gicv3 driver.
>
> What do you think?
Good enough for me. Cc me on the resulting patch when you repost it so
that I can eyeball it, but this is IMO the right direction.
Thanks,
M.
--
Without deviation from the norm, progress is not possible.
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
next prev parent reply other threads:[~2024-04-26 12:41 UTC|newest]
Thread overview: 134+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-04-18 13:53 [PATCH v7 00/16] ACPI/arm64: add support for virtual cpu hotplug Jonathan Cameron
2024-04-18 13:53 ` Jonathan Cameron
2024-04-18 13:53 ` [PATCH v7 01/16] ACPI: processor: Simplify initial onlining to use same path for cold and hotplug Jonathan Cameron
2024-04-18 13:53 ` Jonathan Cameron
2024-04-22 18:46 ` Rafael J. Wysocki
2024-04-22 18:46 ` Rafael J. Wysocki
2024-04-23 6:18 ` Hanjun Guo
2024-04-23 6:18 ` Hanjun Guo
2024-04-26 9:23 ` Gavin Shan
2024-04-26 9:23 ` Gavin Shan
2024-04-18 13:53 ` [PATCH v7 02/16] cpu: Do not warn on arch_register_cpu() returning -EPROBE_DEFER Jonathan Cameron
2024-04-18 13:53 ` Jonathan Cameron
2024-04-23 6:22 ` Hanjun Guo
2024-04-23 6:22 ` Hanjun Guo
2024-04-26 9:20 ` Gavin Shan
2024-04-26 9:20 ` Gavin Shan
2024-04-18 13:53 ` [PATCH v7 03/16] ACPI: processor: Drop duplicated check on _STA (enabled + present) Jonathan Cameron
2024-04-18 13:53 ` Jonathan Cameron
2024-04-22 18:48 ` Rafael J. Wysocki
2024-04-22 18:48 ` Rafael J. Wysocki
2024-04-23 6:49 ` Hanjun Guo
2024-04-23 6:49 ` Hanjun Guo
2024-04-23 9:31 ` Rafael J. Wysocki
2024-04-23 9:31 ` Rafael J. Wysocki
2024-04-23 11:13 ` Hanjun Guo
2024-04-23 11:13 ` Hanjun Guo
2024-04-26 9:24 ` Gavin Shan
2024-04-26 9:24 ` Gavin Shan
2024-04-18 13:54 ` [PATCH v7 04/16] ACPI: processor: Move checks and availability of acpi_processor earlier Jonathan Cameron
2024-04-18 13:54 ` Jonathan Cameron
2024-04-22 18:56 ` Rafael J. Wysocki
2024-04-22 18:56 ` Rafael J. Wysocki
2024-04-24 16:53 ` Jonathan Cameron
2024-04-24 16:53 ` Jonathan Cameron
2024-04-23 11:53 ` Hanjun Guo
2024-04-23 11:53 ` Hanjun Guo
2024-04-24 17:18 ` Jonathan Cameron
2024-04-24 17:18 ` Jonathan Cameron
2024-04-25 1:20 ` Hanjun Guo
2024-04-25 1:20 ` Hanjun Guo
2024-04-18 13:54 ` [PATCH v7 05/16] ACPI: processor: Add acpi_get_processor_handle() helper Jonathan Cameron
2024-04-18 13:54 ` Jonathan Cameron
2024-04-22 18:59 ` Rafael J. Wysocki
2024-04-22 18:59 ` Rafael J. Wysocki
2024-04-26 9:15 ` Gavin Shan
2024-04-26 9:15 ` Gavin Shan
2024-04-18 13:54 ` [PATCH v7 06/16] ACPI: processor: Register deferred CPUs from acpi_processor_get_info() Jonathan Cameron
2024-04-18 13:54 ` Jonathan Cameron
2024-04-22 19:02 ` Rafael J. Wysocki
2024-04-22 19:02 ` Rafael J. Wysocki
2024-04-23 11:58 ` Hanjun Guo
2024-04-23 11:58 ` Hanjun Guo
2024-04-26 9:18 ` Gavin Shan
2024-04-26 9:18 ` Gavin Shan
2024-04-18 13:54 ` [PATCH v7 07/16] ACPI: scan: switch to flags for acpi_scan_check_and_detach() Jonathan Cameron
2024-04-18 13:54 ` Jonathan Cameron
2024-04-22 19:05 ` Rafael J. Wysocki
2024-04-22 19:05 ` Rafael J. Wysocki
2024-04-23 12:02 ` Hanjun Guo
2024-04-23 12:02 ` Hanjun Guo
2024-04-26 9:25 ` Gavin Shan
2024-04-26 9:25 ` Gavin Shan
2024-04-18 13:54 ` [PATCH v7 08/16] ACPI: Add post_eject to struct acpi_scan_handler for cpu hotplug Jonathan Cameron
2024-04-18 13:54 ` Jonathan Cameron
2024-04-22 19:10 ` Rafael J. Wysocki
2024-04-22 19:10 ` Rafael J. Wysocki
2024-04-23 12:06 ` Hanjun Guo
2024-04-23 12:06 ` Hanjun Guo
2024-04-26 11:48 ` Jonathan Cameron
2024-04-26 11:48 ` Jonathan Cameron
2024-04-18 13:54 ` [PATCH v7 09/16] arm64: acpi: Move get_cpu_for_acpi_id() to a header Jonathan Cameron
2024-04-18 13:54 ` Jonathan Cameron
2024-04-22 10:46 ` Jonathan Cameron
2024-04-22 10:46 ` Jonathan Cameron
2024-04-23 12:10 ` Hanjun Guo
2024-04-23 12:10 ` Hanjun Guo
2024-04-18 13:54 ` [PATCH v7 10/16] irqchip/gic-v3: Don't return errors from gic_acpi_match_gicc() Jonathan Cameron
2024-04-18 13:54 ` Jonathan Cameron
2024-04-22 10:39 ` Jonathan Cameron
2024-04-22 10:39 ` Jonathan Cameron
2024-04-18 13:54 ` [PATCH v7 11/16] irqchip/gic-v3: Add support for ACPI's disabled but 'online capable' CPUs Jonathan Cameron
2024-04-18 13:54 ` Jonathan Cameron
2024-04-22 10:40 ` Jonathan Cameron
2024-04-22 10:40 ` Jonathan Cameron
2024-04-23 12:01 ` Marc Zyngier
2024-04-23 12:01 ` Marc Zyngier
2024-04-24 12:54 ` Jonathan Cameron
2024-04-24 12:54 ` Jonathan Cameron
2024-04-24 15:33 ` Marc Zyngier
2024-04-24 15:33 ` Marc Zyngier
2024-04-24 16:35 ` Salil Mehta
2024-04-24 16:35 ` Salil Mehta
2024-04-24 17:08 ` Jonathan Cameron
2024-04-24 17:08 ` Jonathan Cameron
2024-04-25 10:23 ` Jonathan Cameron
2024-04-25 10:23 ` Jonathan Cameron
2024-04-25 12:31 ` Jonathan Cameron
2024-04-25 12:31 ` Jonathan Cameron
2024-04-25 15:00 ` Jonathan Cameron
2024-04-25 15:00 ` Jonathan Cameron
2024-04-25 16:55 ` Jonathan Cameron
2024-04-25 16:55 ` Jonathan Cameron
2024-04-26 12:41 ` Marc Zyngier [this message]
2024-04-26 12:41 ` Marc Zyngier
2024-04-25 9:28 ` Jonathan Cameron
2024-04-25 9:28 ` Jonathan Cameron
2024-04-25 9:56 ` Jonathan Cameron
2024-04-25 9:56 ` Jonathan Cameron
2024-04-25 10:13 ` Jonathan Cameron
2024-04-25 10:13 ` Jonathan Cameron
2024-04-18 13:54 ` [PATCH v7 12/16] arm64: psci: Ignore DENIED CPUs Jonathan Cameron
2024-04-18 13:54 ` Jonathan Cameron
2024-04-22 10:44 ` Jonathan Cameron
2024-04-22 10:44 ` Jonathan Cameron
2024-04-26 9:36 ` Gavin Shan
2024-04-26 9:36 ` Gavin Shan
2024-04-26 9:57 ` Jonathan Cameron
2024-04-26 9:57 ` Jonathan Cameron
2024-04-18 13:54 ` [PATCH v7 13/16] arm64: arch_register_cpu() variant to check if an ACPI handle is now available Jonathan Cameron
2024-04-18 13:54 ` Jonathan Cameron
2024-04-18 13:54 ` [PATCH v7 14/16] arm64: Kconfig: Enable hotplug CPU on arm64 if ACPI_PROCESSOR is enabled Jonathan Cameron
2024-04-18 13:54 ` Jonathan Cameron
2024-04-24 17:24 ` Jonathan Cameron
2024-04-24 17:24 ` Jonathan Cameron
2024-04-18 13:54 ` [PATCH v7 15/16] arm64: document virtual CPU hotplug's expectations Jonathan Cameron
2024-04-18 13:54 ` Jonathan Cameron
2024-04-18 13:54 ` [PATCH v7 16/16] cpumask: Add enabled cpumask for present CPUs that can be brought online Jonathan Cameron
2024-04-18 13:54 ` Jonathan Cameron
2024-04-18 19:50 ` [PATCH v7 00/16] ACPI/arm64: add support for virtual cpu hotplug Rafael J. Wysocki
2024-04-18 19:50 ` Rafael J. Wysocki
2024-04-22 19:16 ` Rafael J. Wysocki
2024-04-22 19:16 ` Rafael J. Wysocki
2024-04-19 15:39 ` Miguel Luis
2024-04-19 15:39 ` Miguel Luis
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=86bk5wqoso.wl-maz@kernel.org \
--to=maz@kernel.org \
--cc=Jonathan.Cameron@Huawei.com \
--cc=bp@alien8.de \
--cc=catalin.marinas@arm.com \
--cc=dave.hansen@linux.intel.com \
--cc=james.morse@arm.com \
--cc=jean-philippe@linaro.org \
--cc=jianyong.wu@arm.com \
--cc=justin.he@arm.com \
--cc=kvmarm@lists.linux.dev \
--cc=linux-acpi@vger.kernel.org \
--cc=linux-arch@vger.kernel.org \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-pm@vger.kernel.org \
--cc=linux@armlinux.org.uk \
--cc=linuxarm@huawei.com \
--cc=loongarch@lists.linux.dev \
--cc=miguel.luis@oracle.com \
--cc=mingo@redhat.com \
--cc=peterz@infradead.org \
--cc=rafael@kernel.org \
--cc=salil.mehta@huawei.com \
--cc=tglx@linutronix.de \
--cc=will@kernel.org \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.