* Re: PROBLEM: Only one CPU active on Ultra 60 since ~4.8 (regression)
From: Nick Bowler @ 2023-01-20 3:15 UTC
To: sparclinux; +Cc: linux-kernel

Hi,

I'm resending this report CC'd to linux-kernel as there was no response
on the sparclinux list.

I tried 6.2-rc4 and there is no change in behaviour. Reverting the
indicated commit still works to fix the problem.

On 2022-07-12, Nick Bowler <nbowler@draconx.ca> wrote:
> When using newer kernels on my Ultra 60 with dual 450MHz UltraSPARC-II
> CPUs, I noticed that only CPU 0 comes up, while older kernels (including
> 4.7) are working fine with both CPUs.
>
> I bisected the failure to this commit:
>
>   9b2f753ec23710aa32c0d837d2499db92fe9115b is the first bad commit
>   commit 9b2f753ec23710aa32c0d837d2499db92fe9115b
>   Author: Atish Patra <atish.patra@oracle.com>
>   Date:   Thu Sep 15 14:54:40 2016 -0600
>
>       sparc64: Fix cpu_possible_mask if nr_cpus is set
>
> This is a small change that reverts very easily on top of 5.18: there is
> just one trivial conflict. Once reverted, both CPUs work again.
>
> Maybe this is related to the fact that the CPUs on this system are
> numbered CPU0 and CPU2 (there is no CPU1)?
>
> Here is /proc/cpuinfo on a working kernel:
>
>     % cat /proc/cpuinfo
>     cpu           : TI UltraSparc II (BlackBird)
>     fpu           : UltraSparc II integrated FPU
>     pmu           : ultra12
>     prom          : OBP 3.23.1 1999/07/16 12:08
>     type          : sun4u
>     ncpus probed  : 2
>     ncpus active  : 2
>     D$ parity tl1 : 0
>     I$ parity tl1 : 0
>     cpucaps       : flush,stbar,swap,muldiv,v9,mul32,div32,v8plus,vis
>     Cpu0ClkTck    : 000000001ad31b4f
>     Cpu2ClkTck    : 000000001ad31b4f
>     MMU Type      : Spitfire
>     MMU PGSZs     : 8K,64K,512K,4MB
>     State:
>     CPU0: online
>     CPU2: online
>
> And on a broken kernel:
>
>     % cat /proc/cpuinfo
>     cpu           : TI UltraSparc II (BlackBird)
>     fpu           : UltraSparc II integrated FPU
>     pmu           : ultra12
>     prom          : OBP 3.23.1 1999/07/16 12:08
>     type          : sun4u
>     ncpus probed  : 2
>     ncpus active  : 1
>     D$ parity tl1 : 0
>     I$ parity tl1 : 0
>     cpucaps       : flush,stbar,swap,muldiv,v9,mul32,div32,v8plus,vis
>     Cpu0ClkTck    : 000000001ad31861
>     MMU Type      : Spitfire
>     MMU PGSZs     : 8K,64K,512K,4MB
>     State:
>     CPU0: online
>
> Let me know if you need any more info.
>
> Thanks,
>   Nick

* Re: PROBLEM: Only one CPU active on Ultra 60 since ~4.8 (regression)
From: Linux kernel regression tracking (Thorsten Leemhuis) @ 2023-01-21 13:31 UTC
To: Nick Bowler, sparclinux
Cc: linux-kernel, David S. Miller, Linux kernel regressions list

CCing the sparc maintainer. Also CCing the regression list, as it should
be in the loop for regressions:
https://docs.kernel.org/admin-guide/reporting-regressions.html

The mail address of the culprit's author bounces. There is another
Atish Patra still active; does anyone know if those two are the same
person?

Anyway, that's it from my side.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

On 20.01.23 04:15, Nick Bowler wrote:
> Hi,
>
> I'm resending this report CC'd to linux-kernel as there was no response
> on the sparclinux list.
>
> I tried 6.2-rc4 and there is no change in behaviour. Reverting the
> indicated commit still works to fix the problem.
>
> [...]

* Re: PROBLEM: Only one CPU active on Ultra 60 since ~4.8 (regression)
From: Nick Bowler @ 2024-03-22 4:57 UTC
To: Linux regressions mailing list; +Cc: linux-kernel, David S. Miller, sparclinux

Hi,

Just a friendly reminder that this issue still happens on Linux 6.8 and
reverting commit 9b2f753ec237 as indicated below is still sufficient to
resolve the problem.

On 2023-01-21 08:31, Linux kernel regression tracking (Thorsten Leemhuis) wrote:
> CCing the sparc maintainer. Also CCing the regression list, as it should
> be in the loop for regressions:
> https://docs.kernel.org/admin-guide/reporting-regressions.html
>
> The mail address of the culprit's author bounces. There is another
> Atish Patra still active; does anyone know if those two are the same
> person?
>
> Anyway, that's it from my side.
[...]
> On 20.01.23 04:15, Nick Bowler wrote:
>> Hi,
>>
>> I'm resending this report CC'd to linux-kernel as there was no response
>> on the sparclinux list.
>>
>> I tried 6.2-rc4 and there is no change in behaviour. Reverting the
>> indicated commit still works to fix the problem.
>>
>> On 2022-07-12, Nick Bowler <nbowler@draconx.ca> wrote:
>>> When using newer kernels on my Ultra 60 with dual 450MHz UltraSPARC-II
>>> CPUs, I noticed that only CPU 0 comes up, while older kernels (including
>>> 4.7) are working fine with both CPUs.
>>>
>>> I bisected the failure to this commit:
>>>
>>>   9b2f753ec23710aa32c0d837d2499db92fe9115b is the first bad commit
>>>   commit 9b2f753ec23710aa32c0d837d2499db92fe9115b
>>>   Author: Atish Patra <atish.patra@oracle.com>
>>>   Date:   Thu Sep 15 14:54:40 2016 -0600
>>>
>>>       sparc64: Fix cpu_possible_mask if nr_cpus is set
>>>
>>> This is a small change that reverts very easily on top of 5.18: there is
>>> just one trivial conflict. Once reverted, both CPUs work again.
>>>
>>> Maybe this is related to the fact that the CPUs on this system are
>>> numbered CPU0 and CPU2 (there is no CPU1)?
>>>
>>> [...]

* Re: PROBLEM: Only one CPU active on Ultra 60 since ~4.8 (regression)
From: Linux regression tracking (Thorsten Leemhuis) @ 2024-03-28 19:36 UTC
To: Nick Bowler, Linux regressions mailing list
Cc: linux-kernel, David S. Miller, sparclinux, Linus Torvalds

[CCing Linus, in case I say something to his disliking]

On 22.03.24 05:57, Nick Bowler wrote:
>
> Just a friendly reminder that this issue still happens on Linux 6.8 and
> reverting commit 9b2f753ec237 as indicated below is still sufficient to
> resolve the problem.

FWIW, that commit 9b2f753ec23710 ("sparc64: Fix cpu_possible_mask if
nr_cpus is set") is from v4.8. Reverting it after all that time might
easily lead to even bigger trouble.

That's why it might be better to handle this like a bug and not like a
regression. At least unless we find someone to judge how likely such an
outcome is. But it seems nobody really cared so far, so unless this mail
makes someone act you might be out of luck. :-/

I wish it was different, but in the end we (including the maintainers)
are all just volunteers here which you can only motivate or compel (up
to some point) to look into some issue, but can not force to do so.

Ciao, Thorsten

* Re: PROBLEM: Only one CPU active on Ultra 60 since ~4.8 (regression)
From: Linus Torvalds @ 2024-03-28 20:09 UTC
To: Linux regressions mailing list, Andreas Larsson
Cc: Nick Bowler, linux-kernel, David S. Miller, sparclinux

On Thu, 28 Mar 2024 at 12:36, Linux regression tracking (Thorsten
Leemhuis) <regressions@leemhuis.info> wrote:
>
> [CCing Linus, in case I say something to his disliking]
>
> On 22.03.24 05:57, Nick Bowler wrote:
> >
> > Just a friendly reminder that this issue still happens on Linux 6.8 and
> > reverting commit 9b2f753ec237 as indicated below is still sufficient to
> > resolve the problem.
>
> FWIW, that commit 9b2f753ec23710 ("sparc64: Fix cpu_possible_mask if
> nr_cpus is set") is from v4.8. Reverting it after all that time might
> easily lead to even bigger trouble.

I'm definitely not reverting a patch from almost a decade ago as a regression.

If it took that long to find, it can't be that critical of a regression.

So yes, let's treat it as a regular bug. And let's bring in Andreas to
the discussion too (although presumably he has seen it on the
sparclinux mailing list).

Andreas, if not, here's the link to lore for the beginning of the thread:

  https://lore.kernel.org/all/CADyTPEwt=ZNams+1bpMB1F9w_vUdPsGCt92DBQxxq_VtaLoTdw@mail.gmail.com/

And from a quick look I do think that commit is buggy, and yes, the
fix probably is just to revert it.

As the original report makes clear, that commit 9b2f753ec23710 is
clearly confused about the difference between "number of CPU's", and
"index of CPU numbers".

When that smp_fill_in_cpu_possible_map() does

        int possible_cpus = num_possible_cpus();

and then uses that to fill in &__cpu_possible_mask, that's completely
nonsensical. Because we literally have

  #define cpu_possible_mask ((const struct cpumask *)&__cpu_possible_mask)
  #define num_possible_cpus() cpumask_weight(cpu_possible_mask)

so it's reading cpu_possible_mask to figure out how many cpus it might
have, and then using that number to set possibly *different* bits in
the same bitmap that is just used to judge what the max number is.

So I do think a revert is called for, but I'm not going to treat this
as a regression, I'm going to just treat it as "sparc bug" and hope
that the sparc people try to figure out why that crazy code was
written. And maybe it made more sense back a decade ago than it does
now.

Andreas?

             Linus

* Re: PROBLEM: Only one CPU active on Ultra 60 since ~4.8 (regression)
From: Nick Bowler @ 2024-03-28 21:08 UTC
To: Linus Torvalds
Cc: linux-kernel, David S. Miller, sparclinux, Andreas Larsson, Linux regressions mailing list

On 2024-03-28 16:09, Linus Torvalds wrote:
> On Thu, 28 Mar 2024 at 12:36, Linux regression tracking (Thorsten
> Leemhuis) <regressions@leemhuis.info> wrote:
>>
>> [CCing Linus, in case I say something to his disliking]
>>
>> On 22.03.24 05:57, Nick Bowler wrote:
>>>
>>> Just a friendly reminder that this issue still happens on Linux 6.8 and
>>> reverting commit 9b2f753ec237 as indicated below is still sufficient to
>>> resolve the problem.
>>
>> FWIW, that commit 9b2f753ec23710 ("sparc64: Fix cpu_possible_mask if
>> nr_cpus is set") is from v4.8. Reverting it after all that time might
>> easily lead to even bigger trouble.
>
> I'm definitely not reverting a patch from almost a decade ago as a regression.
>
> If it took that long to find, it can't be that critical of a regression.

FWIW I'm not the first person to notice this problem. Searching the sparclinux
archive for "ultra 60" turns up this very similar report[1] from two years
prior to mine which also went nowhere (sadly, this reporter did not perform a
bisection to find the problematic commit, perhaps because nobody asked).

[1] https://lore.kernel.org/sparclinux/20201009161924.c8f031c079dd852941307870@gmx.de/

Cheers,
  Nick

* Re: PROBLEM: Only one CPU active on Ultra 60 since ~4.8 (regression)
From: Sam Ravnborg @ 2024-03-29 9:44 UTC
To: Nick Bowler
Cc: Linus Torvalds, linux-kernel, David S. Miller, sparclinux, Andreas Larsson, Linux regressions mailing list

Hi Nick,

On Thu, Mar 28, 2024 at 05:08:50PM -0400, Nick Bowler wrote:
> [...]
> FWIW I'm not the first person to notice this problem. Searching the sparclinux
> archive for "ultra 60" turns up this very similar report[1] from two years
> prior to mine which also went nowhere (sadly, this reporter did not perform a
> bisection to find the problematic commit, perhaps because nobody asked).
>
> [1] https://lore.kernel.org/sparclinux/20201009161924.c8f031c079dd852941307870@gmx.de/

I took a look at this and may have a fix. Could you try the following
patch? It builds, but I have not tested it.

	Sam

From a0fb7c6e6817849550d07b4c5a354ccc58382bc1 Mon Sep 17 00:00:00 2001
From: Sam Ravnborg <sam@ravnborg.org>
Date: Fri, 29 Mar 2024 10:34:07 +0100
Subject: [PATCH] sparc64: Fix number of online CPUs

Nick Bowler reported:
    When using newer kernels on my Ultra 60 with dual 450MHz UltraSPARC-II
    CPUs, I noticed that only CPU 0 comes up, while older kernels (including
    4.7) are working fine with both CPUs.

      I bisected the failure to this commit:

      9b2f753ec23710aa32c0d837d2499db92fe9115b is the first bad commit
      commit 9b2f753ec23710aa32c0d837d2499db92fe9115b
      Author: Atish Patra <atish.patra@oracle.com>
      Date:   Thu Sep 15 14:54:40 2016 -0600

      sparc64: Fix cpu_possible_mask if nr_cpus is set

    This is a small change that reverts very easily on top of 5.18: there is
    just one trivial conflict. Once reverted, both CPUs work again.

    Maybe this is related to the fact that the CPUs on this system are
    numbered CPU0 and CPU2 (there is no CPU1)?

The current code that adjusts cpu_possible based on nr_cpu_ids does not
take into account that CPU IDs may not be numbered consecutively.
Move the check to the function that sets up the cpu_possible mask
so there is no need to adjust it later.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Reported-by: Nick Bowler <nbowler@draconx.ca>
Cc: Andreas Larsson <andreas@gaisler.com>
Cc: "David S. Miller" <davem@davemloft.net>
---
 arch/sparc/include/asm/smp_64.h |  2 --
 arch/sparc/kernel/prom_64.c     |  4 +++-
 arch/sparc/kernel/setup_64.c    |  1 -
 arch/sparc/kernel/smp_64.c      | 14 --------------
 4 files changed, 3 insertions(+), 18 deletions(-)

diff --git a/arch/sparc/include/asm/smp_64.h b/arch/sparc/include/asm/smp_64.h
index 505b6700805d..0964fede0b2c 100644
--- a/arch/sparc/include/asm/smp_64.h
+++ b/arch/sparc/include/asm/smp_64.h
@@ -47,7 +47,6 @@ void arch_send_call_function_ipi_mask(const struct cpumask *mask);
 int hard_smp_processor_id(void);
 #define raw_smp_processor_id() (current_thread_info()->cpu)
 
-void smp_fill_in_cpu_possible_map(void);
 void smp_fill_in_sib_core_maps(void);
 void __noreturn cpu_play_dead(void);
 
@@ -77,7 +76,6 @@ void __cpu_die(unsigned int cpu);
 #define smp_fill_in_sib_core_maps() do { } while (0)
 #define smp_fetch_global_regs() do { } while (0)
 #define smp_fetch_global_pmu() do { } while (0)
-#define smp_fill_in_cpu_possible_map() do { } while (0)
 #define smp_init_cpu_poke() do { } while (0)
 #define scheduler_poke() do { } while (0)
 
diff --git a/arch/sparc/kernel/prom_64.c b/arch/sparc/kernel/prom_64.c
index 998aa693d491..ba82884cb92a 100644
--- a/arch/sparc/kernel/prom_64.c
+++ b/arch/sparc/kernel/prom_64.c
@@ -483,7 +483,9 @@ static void *record_one_cpu(struct device_node *dp, int cpuid, int arg)
 	ncpus_probed++;
 #ifdef CONFIG_SMP
 	set_cpu_present(cpuid, true);
-	set_cpu_possible(cpuid, true);
+
+	if (num_possible_cpus() < nr_cpu_ids)
+		set_cpu_possible(cpuid, true);
 #endif
 	return NULL;
 }
diff --git a/arch/sparc/kernel/setup_64.c b/arch/sparc/kernel/setup_64.c
index 6a4797dec34b..6bbe8e394ad3 100644
--- a/arch/sparc/kernel/setup_64.c
+++ b/arch/sparc/kernel/setup_64.c
@@ -671,7 +671,6 @@ void __init setup_arch(char **cmdline_p)
 
 	paging_init();
 	init_sparc64_elf_hwcap();
-	smp_fill_in_cpu_possible_map();
 	/*
 	 * Once the OF device tree and MDESC have been setup and nr_cpus has
 	 * been parsed, we know the list of possible cpus. Therefore we can
diff --git a/arch/sparc/kernel/smp_64.c b/arch/sparc/kernel/smp_64.c
index f3969a3600db..e50c38eba2b8 100644
--- a/arch/sparc/kernel/smp_64.c
+++ b/arch/sparc/kernel/smp_64.c
@@ -1220,20 +1220,6 @@ void __init smp_setup_processor_id(void)
 		xcall_deliver_impl = hypervisor_xcall_deliver;
 }
 
-void __init smp_fill_in_cpu_possible_map(void)
-{
-	int possible_cpus = num_possible_cpus();
-	int i;
-
-	if (possible_cpus > nr_cpu_ids)
-		possible_cpus = nr_cpu_ids;
-
-	for (i = 0; i < possible_cpus; i++)
-		set_cpu_possible(i, true);
-	for (; i < NR_CPUS; i++)
-		set_cpu_possible(i, false);
-}
-
 void smp_fill_in_sib_core_maps(void)
 {
 	unsigned int i;
-- 
2.34.1

* Re: PROBLEM: Only one CPU active on Ultra 60 since ~4.8 (regression)
From: Nick Bowler @ 2024-03-29 20:11 UTC
To: Sam Ravnborg
Cc: Linus Torvalds, linux-kernel, David S. Miller, sparclinux, Andreas Larsson, Linux regressions mailing list

Hi Sam,

On 2024-03-29 05:44, Sam Ravnborg wrote:
> I took a look at this and may have a fix. Could you try the following
> patch. It builds - but I have not tested it.

With this patch applied on top of 6.9-rc1, both CPUs appear to come up:

  % cat /proc/cpuinfo
  [...]
  ncpus probed : 2
  ncpus active : 2
  [...]
  State:
  CPU0: online
  CPU2: online

Thanks,
  Nick

* Re: PROBLEM: Only one CPU active on Ultra 60 since ~4.8 (regression)
From: Sam Ravnborg @ 2024-03-30 9:16 UTC
To: Nick Bowler
Cc: Linus Torvalds, linux-kernel, David S. Miller, sparclinux, Andreas Larsson, Linux regressions mailing list

On Fri, Mar 29, 2024 at 04:11:06PM -0400, Nick Bowler wrote:
> Hi Sam,
>
> On 2024-03-29 05:44, Sam Ravnborg wrote:
> > I took a look at this and may have a fix. Could you try the following
> > patch. It builds - but I have not tested it.
>
> With this patch applied on top of 6.9-rc1, both CPUs appear to come up:
>
>   % cat /proc/cpuinfo
>   [...]
>   ncpus probed : 2
>   ncpus active : 2
>   [...]
>   State:
>   CPU0: online
>   CPU2: online

Thanks, I will add a

  Tested-by: Nick Bowler <nbowler@draconx.ca>

and submit the patch properly along with a few other sparc64 related
fixes.

	Sam

* Re: PROBLEM: Only one CPU active on Ultra 60 since ~4.8 (regression)
From: Andreas Larsson @ 2024-04-05 15:05 UTC
To: Linus Torvalds, Linux regressions mailing list
Cc: Nick Bowler, linux-kernel, David S. Miller, sparclinux, Sam Ravnborg

On 2024-03-28 21:09, Linus Torvalds wrote:
> On Thu, 28 Mar 2024 at 12:36, Linux regression tracking (Thorsten
> Leemhuis) <regressions@leemhuis.info> wrote:
>>
>> [CCing Linus, in case I say something to his disliking]
>>
>> On 22.03.24 05:57, Nick Bowler wrote:
>>>
>>> Just a friendly reminder that this issue still happens on Linux 6.8 and
>>> reverting commit 9b2f753ec237 as indicated below is still sufficient to
>>> resolve the problem.
>>
>> FWIW, that commit 9b2f753ec23710 ("sparc64: Fix cpu_possible_mask if
>> nr_cpus is set") is from v4.8. Reverting it after all that time might
>> easily lead to even bigger trouble.
>
> I'm definitely not reverting a patch from almost a decade ago as a regression.
>
> If it took that long to find, it can't be that critical of a regression.
>
> So yes, let's treat it as a regular bug. And let's bring in Andreas to
> the discussion too (although presumably he has seen it on the
> sparclinux mailing list).

Yes, I am aware and I agree we should treat it as a regular bug.

Reverting it as a regression fix would lead to followup issues like
canceling the effect of commit ebb99a4c12e4 ("sparc64: Fix irq stack
bootmem allocation.") but with misleading comments left in place.

Sam's fix looks like a good solution for me to pick up to my for-next
branch.

Thanks,
Andreas
