linux-pm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] Revert "intel_idle: Rescan "dead" SMT siblings during, initialization"
@ 2025-08-28 14:25 Christian Loehle
  2025-08-28 14:44 ` Rafael J. Wysocki
  0 siblings, 1 reply; 6+ messages in thread
From: Christian Loehle @ 2025-08-28 14:25 UTC (permalink / raw)
  To: Rafael J. Wysocki, linux-pm, linux-kernel@vger.kernel.org,
	Artem Bityutskiy

This reverts commit a430c11f401589a0f4f57fd398271a5d85142c7a.

Calling arch_cpu_rescan_dead_smt_siblings() in intel_idle_init with
boot parameter nosmt and maxcpus active hotplugged boot-offline CPUs
in (and leave them online) which weren't supposed to be online.

With the revert and nosmt and maxcpus=12 on a raptor lake:
cpu	online	capacity
cpu0	1	1009
cpu1	0	-
cpu2	1	1009
cpu3	0	-
cpu4	1	1009
cpu5	0	-
cpu6	1	1009
cpu7	0	-
cpu8	1	1024
cpu9	0	-
cpu10	1	1024
cpu11	0	-
cpu12	1	1009
cpu13	0	-
cpu14	1	1009
cpu15	0	-
cpu16	1	623
cpu17	1	623
cpu18	1	623
cpu19	1	623
cpu20	0	-
cpu21	0	-
cpu22	0	-
cpu23	0	-

Previously:
cpu	online	capacity
cpu0	1	1009
cpu1	0	-
cpu2	1	1009
cpu3	0	-
cpu4	1	1009
cpu5	0	-
cpu6	1	1009
cpu7	0	-
cpu8	1	1024
cpu9	0	-
cpu10	1	1024
cpu11	0	-
cpu12	1	1009
cpu13	0	-
cpu14	1	1009
cpu15	0	-
cpu16	1	623
cpu17	1	623
cpu18	1	623
cpu19	1	623
cpu20	1	623
cpu21	1	623
cpu22	1	623
cpu23	1	623

Signed-off-by: Christian Loehle <christian.loehle@arm.com>
---
Rafael, I don't immediately see how to fix this properly so I won't
try to, feel free to treat this as a bug report.


 drivers/idle/intel_idle.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
index 91a7b7e7c0c8..a9c58395a425 100644
--- a/drivers/idle/intel_idle.c
+++ b/drivers/idle/intel_idle.c
@@ -2507,8 +2507,6 @@ static int __init intel_idle_init(void)
 	pr_debug("Local APIC timer is reliable in %s\n",
 		 boot_cpu_has(X86_FEATURE_ARAT) ? "all C-states" : "C1");
 
-	arch_cpu_rescan_dead_smt_siblings();
-
 	return 0;
 
 hp_setup_fail:
-- 
2.34.1

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] Revert "intel_idle: Rescan "dead" SMT siblings during, initialization"
  2025-08-28 14:25 [PATCH] Revert "intel_idle: Rescan "dead" SMT siblings during, initialization" Christian Loehle
@ 2025-08-28 14:44 ` Rafael J. Wysocki
  2025-08-28 16:13   ` Rafael J. Wysocki
  0 siblings, 1 reply; 6+ messages in thread
From: Rafael J. Wysocki @ 2025-08-28 14:44 UTC (permalink / raw)
  To: Christian Loehle
  Cc: Rafael J. Wysocki, linux-pm, linux-kernel@vger.kernel.org,
	Artem Bityutskiy

On Thu, Aug 28, 2025 at 4:26 PM Christian Loehle
<christian.loehle@arm.com> wrote:
>
> This reverts commit a430c11f401589a0f4f57fd398271a5d85142c7a.
>
> Calling arch_cpu_rescan_dead_smt_siblings() in intel_idle_init with
> boot parameter nosmt and maxcpus active hotplugged boot-offline CPUs
> in (and leave them online) which weren't supposed to be online.
>
> With the revert and nosmt and maxcpus=12 on a raptor lake:
> cpu     online  capacity
> cpu0    1       1009
> cpu1    0       -
> cpu2    1       1009
> cpu3    0       -
> cpu4    1       1009
> cpu5    0       -
> cpu6    1       1009
> cpu7    0       -
> cpu8    1       1024
> cpu9    0       -
> cpu10   1       1024
> cpu11   0       -
> cpu12   1       1009
> cpu13   0       -
> cpu14   1       1009
> cpu15   0       -
> cpu16   1       623
> cpu17   1       623
> cpu18   1       623
> cpu19   1       623
> cpu20   0       -
> cpu21   0       -
> cpu22   0       -
> cpu23   0       -
>
> Previously:
> cpu     online  capacity
> cpu0    1       1009
> cpu1    0       -
> cpu2    1       1009
> cpu3    0       -
> cpu4    1       1009
> cpu5    0       -
> cpu6    1       1009
> cpu7    0       -
> cpu8    1       1024
> cpu9    0       -
> cpu10   1       1024
> cpu11   0       -
> cpu12   1       1009
> cpu13   0       -
> cpu14   1       1009
> cpu15   0       -
> cpu16   1       623
> cpu17   1       623
> cpu18   1       623
> cpu19   1       623
> cpu20   1       623
> cpu21   1       623
> cpu22   1       623
> cpu23   1       623
>
> Signed-off-by: Christian Loehle <christian.loehle@arm.com>
> ---
> Rafael, I don't immediately see how to fix this properly so I won't
> try to, feel free to treat this as a bug report.

Sure, thanks for reporting this!

Well, I think that cpuhp_smt_enable() is missing a check.  It looks to
me like it should do the topology_is_primary_thread(cpu) check like
cpuhp_smt_disable().

I'll cut a test patch for this later.

>  drivers/idle/intel_idle.c | 2 --
>  1 file changed, 2 deletions(-)
>
> diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
> index 91a7b7e7c0c8..a9c58395a425 100644
> --- a/drivers/idle/intel_idle.c
> +++ b/drivers/idle/intel_idle.c
> @@ -2507,8 +2507,6 @@ static int __init intel_idle_init(void)
>         pr_debug("Local APIC timer is reliable in %s\n",
>                  boot_cpu_has(X86_FEATURE_ARAT) ? "all C-states" : "C1");
>
> -       arch_cpu_rescan_dead_smt_siblings();
> -
>         return 0;
>
>  hp_setup_fail:
> --

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] Revert "intel_idle: Rescan "dead" SMT siblings during, initialization"
  2025-08-28 14:44 ` Rafael J. Wysocki
@ 2025-08-28 16:13   ` Rafael J. Wysocki
  2025-08-28 19:06     ` Rafael J. Wysocki
  0 siblings, 1 reply; 6+ messages in thread
From: Rafael J. Wysocki @ 2025-08-28 16:13 UTC (permalink / raw)
  To: Christian Loehle; +Cc: linux-pm, linux-kernel@vger.kernel.org, Artem Bityutskiy

[-- Attachment #1: Type: text/plain, Size: 2350 bytes --]

On Thu, Aug 28, 2025 at 4:44 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
>
> On Thu, Aug 28, 2025 at 4:26 PM Christian Loehle
> <christian.loehle@arm.com> wrote:
> >
> > This reverts commit a430c11f401589a0f4f57fd398271a5d85142c7a.
> >
> > Calling arch_cpu_rescan_dead_smt_siblings() in intel_idle_init with
> > boot parameter nosmt and maxcpus active hotplugged boot-offline CPUs
> > in (and leave them online) which weren't supposed to be online.
> >
> > With the revert and nosmt and maxcpus=12 on a raptor lake:
> > cpu     online  capacity
> > cpu0    1       1009
> > cpu1    0       -
> > cpu2    1       1009
> > cpu3    0       -
> > cpu4    1       1009
> > cpu5    0       -
> > cpu6    1       1009
> > cpu7    0       -
> > cpu8    1       1024
> > cpu9    0       -
> > cpu10   1       1024
> > cpu11   0       -
> > cpu12   1       1009
> > cpu13   0       -
> > cpu14   1       1009
> > cpu15   0       -
> > cpu16   1       623
> > cpu17   1       623
> > cpu18   1       623
> > cpu19   1       623
> > cpu20   0       -
> > cpu21   0       -
> > cpu22   0       -
> > cpu23   0       -
> >
> > Previously:
> > cpu     online  capacity
> > cpu0    1       1009
> > cpu1    0       -
> > cpu2    1       1009
> > cpu3    0       -
> > cpu4    1       1009
> > cpu5    0       -
> > cpu6    1       1009
> > cpu7    0       -
> > cpu8    1       1024
> > cpu9    0       -
> > cpu10   1       1024
> > cpu11   0       -
> > cpu12   1       1009
> > cpu13   0       -
> > cpu14   1       1009
> > cpu15   0       -
> > cpu16   1       623
> > cpu17   1       623
> > cpu18   1       623
> > cpu19   1       623
> > cpu20   1       623
> > cpu21   1       623
> > cpu22   1       623
> > cpu23   1       623
> >
> > Signed-off-by: Christian Loehle <christian.loehle@arm.com>
> > ---
> > Rafael, I don't immediately see how to fix this properly so I won't
> > try to, feel free to treat this as a bug report.
>
> Sure, thanks for reporting this!
>
> Well, I think that cpuhp_smt_enable() is missing a check.  It looks to
> me like it should do the topology_is_primary_thread(cpu) check like
> cpuhp_smt_disable().
>
> I'll cut a test patch for this later.

Something like the attached one, perhaps.  I haven't tested it yet,
but I'll do that later.

[-- Attachment #2: kernel-cpu-enable-smt.patch --]
[-- Type: text/x-patch, Size: 505 bytes --]

---
 kernel/cpu.c |    7 +++++++
 1 file changed, 7 insertions(+)

--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -2710,6 +2710,13 @@
 	cpu_maps_update_begin();
 	cpu_smt_control = CPU_SMT_ENABLED;
 	for_each_present_cpu(cpu) {
+		/*
+		 * Avoid accidentally onlining primary thread CPUs that have
+		 * been taken offline.
+		 */
+		if (topology_is_primary_thread(cpu))
+			continue;
+
 		/* Skip online CPUs and CPUs on offline nodes */
 		if (cpu_online(cpu) || !node_online(cpu_to_node(cpu)))
 			continue;

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] Revert "intel_idle: Rescan "dead" SMT siblings during, initialization"
  2025-08-28 16:13   ` Rafael J. Wysocki
@ 2025-08-28 19:06     ` Rafael J. Wysocki
  2025-08-28 21:44       ` Christian Loehle
  0 siblings, 1 reply; 6+ messages in thread
From: Rafael J. Wysocki @ 2025-08-28 19:06 UTC (permalink / raw)
  To: Christian Loehle; +Cc: linux-pm, linux-kernel@vger.kernel.org, Artem Bityutskiy

On Thu, Aug 28, 2025 at 6:13 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
>
> On Thu, Aug 28, 2025 at 4:44 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
> >
> > On Thu, Aug 28, 2025 at 4:26 PM Christian Loehle
> > <christian.loehle@arm.com> wrote:
> > >
> > > This reverts commit a430c11f401589a0f4f57fd398271a5d85142c7a.
> > >
> > > Calling arch_cpu_rescan_dead_smt_siblings() in intel_idle_init with
> > > boot parameter nosmt and maxcpus active hotplugged boot-offline CPUs
> > > in (and leave them online) which weren't supposed to be online.
> > >
> > > With the revert and nosmt and maxcpus=12 on a raptor lake:
> > > cpu     online  capacity
> > > cpu0    1       1009
> > > cpu1    0       -
> > > cpu2    1       1009
> > > cpu3    0       -
> > > cpu4    1       1009
> > > cpu5    0       -
> > > cpu6    1       1009
> > > cpu7    0       -
> > > cpu8    1       1024
> > > cpu9    0       -
> > > cpu10   1       1024
> > > cpu11   0       -
> > > cpu12   1       1009
> > > cpu13   0       -
> > > cpu14   1       1009
> > > cpu15   0       -
> > > cpu16   1       623
> > > cpu17   1       623
> > > cpu18   1       623
> > > cpu19   1       623
> > > cpu20   0       -
> > > cpu21   0       -
> > > cpu22   0       -
> > > cpu23   0       -
> > >
> > > Previously:
> > > cpu     online  capacity
> > > cpu0    1       1009
> > > cpu1    0       -
> > > cpu2    1       1009
> > > cpu3    0       -
> > > cpu4    1       1009
> > > cpu5    0       -
> > > cpu6    1       1009
> > > cpu7    0       -
> > > cpu8    1       1024
> > > cpu9    0       -
> > > cpu10   1       1024
> > > cpu11   0       -
> > > cpu12   1       1009
> > > cpu13   0       -
> > > cpu14   1       1009
> > > cpu15   0       -
> > > cpu16   1       623
> > > cpu17   1       623
> > > cpu18   1       623
> > > cpu19   1       623
> > > cpu20   1       623
> > > cpu21   1       623
> > > cpu22   1       623
> > > cpu23   1       623
> > >
> > > Signed-off-by: Christian Loehle <christian.loehle@arm.com>
> > > ---
> > > Rafael, I don't immediately see how to fix this properly so I won't
> > > try to, feel free to treat this as a bug report.
> >
> > Sure, thanks for reporting this!
> >
> > Well, I think that cpuhp_smt_enable() is missing a check.  It looks to
> > me like it should do the topology_is_primary_thread(cpu) check like
> > cpuhp_smt_disable().
> >
> > I'll cut a test patch for this later.
>
> Something like the attached one, perhaps.  I haven't tested it yet,
> but I'll do that later.

Works here AFAICS, but my test system is not hybrid.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] Revert "intel_idle: Rescan "dead" SMT siblings during, initialization"
  2025-08-28 19:06     ` Rafael J. Wysocki
@ 2025-08-28 21:44       ` Christian Loehle
  2025-08-29 17:49         ` Rafael J. Wysocki
  0 siblings, 1 reply; 6+ messages in thread
From: Christian Loehle @ 2025-08-28 21:44 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: linux-pm, linux-kernel@vger.kernel.org, Artem Bityutskiy

On 8/28/25 20:06, Rafael J. Wysocki wrote:
> On Thu, Aug 28, 2025 at 6:13 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
>>
>> On Thu, Aug 28, 2025 at 4:44 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
>>>
>>> On Thu, Aug 28, 2025 at 4:26 PM Christian Loehle
>>> <christian.loehle@arm.com> wrote:
>>>>
>>>> This reverts commit a430c11f401589a0f4f57fd398271a5d85142c7a.
>>>>
>>>> Calling arch_cpu_rescan_dead_smt_siblings() in intel_idle_init with
>>>> boot parameter nosmt and maxcpus active hotplugged boot-offline CPUs
>>>> in (and leave them online) which weren't supposed to be online.
>>>>
>>>> With the revert and nosmt and maxcpus=12 on a raptor lake:
>>>> cpu     online  capacity
>>>> cpu0    1       1009
>>>> cpu1    0       -
>>>> cpu2    1       1009
>>>> cpu3    0       -
>>>> cpu4    1       1009
>>>> cpu5    0       -
>>>> cpu6    1       1009
>>>> cpu7    0       -
>>>> cpu8    1       1024
>>>> cpu9    0       -
>>>> cpu10   1       1024
>>>> cpu11   0       -
>>>> cpu12   1       1009
>>>> cpu13   0       -
>>>> cpu14   1       1009
>>>> cpu15   0       -
>>>> cpu16   1       623
>>>> cpu17   1       623
>>>> cpu18   1       623
>>>> cpu19   1       623
>>>> cpu20   0       -
>>>> cpu21   0       -
>>>> cpu22   0       -
>>>> cpu23   0       -
>>>>
>>>> Previously:
>>>> cpu     online  capacity
>>>> cpu0    1       1009
>>>> cpu1    0       -
>>>> cpu2    1       1009
>>>> cpu3    0       -
>>>> cpu4    1       1009
>>>> cpu5    0       -
>>>> cpu6    1       1009
>>>> cpu7    0       -
>>>> cpu8    1       1024
>>>> cpu9    0       -
>>>> cpu10   1       1024
>>>> cpu11   0       -
>>>> cpu12   1       1009
>>>> cpu13   0       -
>>>> cpu14   1       1009
>>>> cpu15   0       -
>>>> cpu16   1       623
>>>> cpu17   1       623
>>>> cpu18   1       623
>>>> cpu19   1       623
>>>> cpu20   1       623
>>>> cpu21   1       623
>>>> cpu22   1       623
>>>> cpu23   1       623
>>>>
>>>> Signed-off-by: Christian Loehle <christian.loehle@arm.com>
>>>> ---
>>>> Rafael, I don't immediately see how to fix this properly so I won't
>>>> try to, feel free to treat this as a bug report.
>>>
>>> Sure, thanks for reporting this!
>>>
>>> Well, I think that cpuhp_smt_enable() is missing a check.  It looks to
>>> me like it should do the topology_is_primary_thread(cpu) check like
>>> cpuhp_smt_disable().
>>>
>>> I'll cut a test patch for this later.
>>
>> Something like the attached one, perhaps.  I haven't tested it yet,
>> but I'll do that later.
> 
> Works here AFAICS, but my test system is not hybrid.

Yep, on my end as well, thanks!
Tested-by: Christian Loehle <christian.loehle@arm.com>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] Revert "intel_idle: Rescan "dead" SMT siblings during, initialization"
  2025-08-28 21:44       ` Christian Loehle
@ 2025-08-29 17:49         ` Rafael J. Wysocki
  0 siblings, 0 replies; 6+ messages in thread
From: Rafael J. Wysocki @ 2025-08-29 17:49 UTC (permalink / raw)
  To: Christian Loehle
  Cc: Rafael J. Wysocki, linux-pm, linux-kernel@vger.kernel.org,
	Artem Bityutskiy

On Thu, Aug 28, 2025 at 11:44 PM Christian Loehle
<christian.loehle@arm.com> wrote:
>
> On 8/28/25 20:06, Rafael J. Wysocki wrote:
> > On Thu, Aug 28, 2025 at 6:13 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
> >>
> >> On Thu, Aug 28, 2025 at 4:44 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
> >>>
> >>> On Thu, Aug 28, 2025 at 4:26 PM Christian Loehle
> >>> <christian.loehle@arm.com> wrote:
> >>>>
> >>>> This reverts commit a430c11f401589a0f4f57fd398271a5d85142c7a.
> >>>>
> >>>> Calling arch_cpu_rescan_dead_smt_siblings() in intel_idle_init with
> >>>> boot parameter nosmt and maxcpus active hotplugged boot-offline CPUs
> >>>> in (and leave them online) which weren't supposed to be online.
> >>>>
> >>>> With the revert and nosmt and maxcpus=12 on a raptor lake:
> >>>> cpu     online  capacity
> >>>> cpu0    1       1009
> >>>> cpu1    0       -
> >>>> cpu2    1       1009
> >>>> cpu3    0       -
> >>>> cpu4    1       1009
> >>>> cpu5    0       -
> >>>> cpu6    1       1009
> >>>> cpu7    0       -
> >>>> cpu8    1       1024
> >>>> cpu9    0       -
> >>>> cpu10   1       1024
> >>>> cpu11   0       -
> >>>> cpu12   1       1009
> >>>> cpu13   0       -
> >>>> cpu14   1       1009
> >>>> cpu15   0       -
> >>>> cpu16   1       623
> >>>> cpu17   1       623
> >>>> cpu18   1       623
> >>>> cpu19   1       623
> >>>> cpu20   0       -
> >>>> cpu21   0       -
> >>>> cpu22   0       -
> >>>> cpu23   0       -
> >>>>
> >>>> Previously:
> >>>> cpu     online  capacity
> >>>> cpu0    1       1009
> >>>> cpu1    0       -
> >>>> cpu2    1       1009
> >>>> cpu3    0       -
> >>>> cpu4    1       1009
> >>>> cpu5    0       -
> >>>> cpu6    1       1009
> >>>> cpu7    0       -
> >>>> cpu8    1       1024
> >>>> cpu9    0       -
> >>>> cpu10   1       1024
> >>>> cpu11   0       -
> >>>> cpu12   1       1009
> >>>> cpu13   0       -
> >>>> cpu14   1       1009
> >>>> cpu15   0       -
> >>>> cpu16   1       623
> >>>> cpu17   1       623
> >>>> cpu18   1       623
> >>>> cpu19   1       623
> >>>> cpu20   1       623
> >>>> cpu21   1       623
> >>>> cpu22   1       623
> >>>> cpu23   1       623
> >>>>
> >>>> Signed-off-by: Christian Loehle <christian.loehle@arm.com>
> >>>> ---
> >>>> Rafael, I don't immediately see how to fix this properly so I won't
> >>>> try to, feel free to treat this as a bug report.
> >>>
> >>> Sure, thanks for reporting this!
> >>>
> >>> Well, I think that cpuhp_smt_enable() is missing a check.  It looks to
> >>> me like it should do the topology_is_primary_thread(cpu) check like
> >>> cpuhp_smt_disable().
> >>>
> >>> I'll cut a test patch for this later.
> >>
> >> Something like the attached one, perhaps.  I haven't tested it yet,
> >> but I'll do that later.
> >
> > Works here AFAICS, but my test system is not hybrid.
>
> Yep, on my end as well, thanks!
> Tested-by: Christian Loehle <christian.loehle@arm.com>

Thanks for testing and let me submit a proper patch.

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2025-08-29 17:50 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-08-28 14:25 [PATCH] Revert "intel_idle: Rescan "dead" SMT siblings during, initialization" Christian Loehle
2025-08-28 14:44 ` Rafael J. Wysocki
2025-08-28 16:13   ` Rafael J. Wysocki
2025-08-28 19:06     ` Rafael J. Wysocki
2025-08-28 21:44       ` Christian Loehle
2025-08-29 17:49         ` Rafael J. Wysocki

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).