* ARM64: CPU Hotplug: can't enable more cpus than maxcpus value (kernel 4.5)
From: Vadim Lomovtsev @ 2016-03-03 14:17 UTC
To: linux-arm-kernel
Hi all,
While working with a system that has 48 CPUs per socket, I found that it is not possible to bring online more CPUs than the 'maxcpus' value. The kernel was booted with the maxcpus=4 argument:
[root@localhost ~]# cat /proc/cmdline
BOOT_IMAGE=/boot/Image-4.5-rc5 root=UUID=9cece803-ce48-4b84-ae29-1091591c9edc ro crashkernel=2048M@0M console=ttyAMA0,115200n8 LANG=en_US.UTF-8 earlycon=pl011,0x87e024000000 debug maxcpus=4
Check the total number of CPUs and the online CPUs:
[root@localhost ~]# cat /sys/devices/system/cpu/possible
0-47
[root@localhost ~]# cat /sys/devices/system/cpu/online
0-3
Now try to bring online a CPU beyond the initial maxcpus value:
[root@localhost ~]# echo 1 > /sys/devices/system/cpu/cpu4/online
-bash: echo: write error: Invalid argument
This happens because cpu_present_mask is set to 0xf (CPUs 0-3), which prevents running more than 'maxcpus' CPUs:
[root@localhost ~]# cat /sys/devices/system/cpu/present
0-3
For cpu0-3 it works fine:
[root@localhost ~]# echo 0 > /sys/devices/system/cpu/cpu3/online
[root@localhost ~]# echo 1 > /sys/devices/system/cpu/cpu3/online
At the same time, Documentation/cpu-hotplug.txt says:
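The sysfs checks above can also be scripted. What follows is a minimal sketch and not part of the original report: it parses the cpulist format printed by these files (for example "0-3" or "0-3,8") so that the possible and present sets can be compared. The helper names parse_cpulist, cpulist_to_mask and read_cpus are invented for illustration; only the sysfs paths come from the report.

```python
from pathlib import Path

# Directory holding the possible/present/online files shown in the report.
SYSFS_CPU = Path("/sys/devices/system/cpu")


def parse_cpulist(text):
    """Expand a cpulist string such as "0-3,8" into a set of CPU ids."""
    cpus = set()
    for chunk in text.strip().split(","):
        if not chunk:
            continue  # an empty file means an empty mask
        first, _, last = chunk.partition("-")
        cpus.update(range(int(first), int(last or first) + 1))
    return cpus


def cpulist_to_mask(text):
    """Return the bitmask equivalent of a cpulist string."""
    mask = 0
    for cpu in parse_cpulist(text):
        mask |= 1 << cpu
    return mask


def read_cpus(name):
    """Read one of the possible/present/online files (hypothetical helper)."""
    return parse_cpulist((SYSFS_CPU / name).read_text())
```

On the system described here, read_cpus("possible") - read_cpus("present") would be expected to contain CPUs 4 through 47, which are exactly the CPUs that cannot be brought online.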
> maxcpus=n Restrict boot time cpus to n. Say if you have 4 cpus, using
> maxcpus=2 will only boot 2. You can choose to bring the
> other cpus later online, read FAQ's for more info.
and
> cpu_present_mask: Bitmap of CPUs currently present in the system. Not all
> of them may be online. When physical hotplug is processed by the relevant
> subsystem (e.g ACPI) can change and new bit either be added or removed
> from the map depending on the event is hot-add/hot-remove. There are currently
> no locking rules as of now. Typical usage is to init topology during boot,
> at which time hotplug is disabled.
So according to the docs, after the system has started it should be possible to run more CPUs than the 'maxcpus' value, and cpu_present_mask should contain all CPUs currently present in the system.
However, on arm64 cpu_present_mask is explicitly set according to the 'maxcpus' value. Is this the design intent?
Vadim
* ARM64: CPU Hotplug: can't enable more cpus than maxcpus value (kernel 4.5)
From: Mark Rutland @ 2016-03-03 14:42 UTC
To: linux-arm-kernel
On Thu, Mar 03, 2016 at 06:17:52AM -0800, Vadim Lomovtsev wrote:
> Hi all,
>
> While working with a system that has 48 CPUs per socket, I found that
> it is not possible to bring online more CPUs than the 'maxcpus' value.
> The kernel was booted with the maxcpus=4 argument:
>
> [root@localhost ~]# cat /proc/cmdline
> BOOT_IMAGE=/boot/Image-4.5-rc5 root=UUID=9cece803-ce48-4b84-ae29-1091591c9edc ro crashkernel=2048M@0M console=ttyAMA0,115200n8 LANG=en_US.UTF-8 earlycon=pl011,0x87e024000000 debug maxcpus=4
>
> Check the total number of CPUs and the online CPUs:
>
> [root@localhost ~]# cat /sys/devices/system/cpu/possible
> 0-47
> [root@localhost ~]# cat /sys/devices/system/cpu/online
> 0-3
>
> Now try to bring online a CPU beyond the initial maxcpus value:
> [root@localhost ~]# echo 1 > /sys/devices/system/cpu/cpu4/online
> -bash: echo: write error: Invalid argument
>
> This happens because cpu_present_mask is set to 0xf (CPUs 0-3), which
> prevents running more than 'maxcpus' CPUs:
> [root@localhost ~]# cat /sys/devices/system/cpu/present
> 0-3
>
> For cpu0-3 it works fine:
> [root@localhost ~]# echo 0 > /sys/devices/system/cpu/cpu3/online
> [root@localhost ~]# echo 1 > /sys/devices/system/cpu/cpu3/online
>
> At the same time, Documentation/cpu-hotplug.txt says:
>
> > maxcpus=n Restrict boot time cpus to n. Say if you have 4 cpus, using
> > maxcpus=2 will only boot 2. You can choose to bring the
> > other cpus later online, read FAQ's for more info.
>
> and
>
> > cpu_present_mask: Bitmap of CPUs currently present in the system. Not all
> > of them may be online. When physical hotplug is processed by the relevant
> > subsystem (e.g ACPI) can change and new bit either be added or removed
> > from the map depending on the event is hot-add/hot-remove. There are currently
> > no locking rules as of now. Typical usage is to init topology during boot,
> > at which time hotplug is disabled.
>
> So according to the docs, after the system has started it should be
> possible to run more CPUs than the 'maxcpus' value, and cpu_present_mask
> should contain all CPUs currently present in the system.
>
> However, on arm64 cpu_present_mask is explicitly set according to the
> 'maxcpus' value. Is this the design intent?
To some extent, yes.
Due to the possibility of a heterogeneous system, we must bring all CPUs
online at boot time, and cannot defer this.
This is necessary to detect the common subset of supported features, and
also to detect the full set of CPUs in the system to correctly apply
errata workarounds which require kernel text patching. Other things like
CPU-affine device probing (e.g. PMU) also require specific CPUs to be
online.
I don't think we can reliably support maxcpus for the above case.
Thanks,
Mark.
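To illustrate the point about the common subset of features, here is a toy model, not the arm64 implementation. It treats each CPU's capabilities as a set of names and intersects them. The function name common_features, the CPU numbering and the feature names are all invented for illustration; the kernel works on CPU ID register fields, not on sets of strings.

```python
from functools import reduce


def common_features(per_cpu_features):
    """Return the features supported by every CPU in the mapping."""
    feature_sets = [frozenset(f) for f in per_cpu_features.values()]
    if not feature_sets:
        return frozenset()
    # The system-wide set is the intersection of all per-CPU sets.
    return reduce(lambda a, b: a & b, feature_sets)


# Hypothetical heterogeneous system: feature names are illustrative only.
booted = {0: {"fp", "asimd", "crc32"}, 1: {"fp", "asimd", "crc32"}}
all_cpus = {**booted, 4: {"fp", "asimd"}}
```

With only CPUs 0 and 1 booted, "crc32" appears to be supported system-wide; once CPU 4 is taken into account it is not. A kernel that has already committed to the first answer cannot easily retract it when CPU 4 arrives later.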
* ARM64: CPU Hotplug: can't enable more cpus than maxcpus value (kernel 4.5)
From: Suzuki K. Poulose @ 2016-03-03 14:59 UTC
To: linux-arm-kernel
On 03/03/16 14:42, Mark Rutland wrote:
>> However, on arm64 cpu_present_mask is explicitly set according to the
>> 'maxcpus' value. Is this the design intent?
>
> To some extent, yes.
>
> Due to the possibility of a heterogeneous system, we must bring all CPUs
> online at boot time, and cannot defer this.
>
> This is necessary to detect the common subset of supported features, and
> also to detect the full set of CPUs in the system to correctly apply
> errata workarounds which require kernel text patching.
We don't have this limitation anymore, as we can check if the booting CPU
has any conflicting/missing features w.r.t the established set and fail the
booting if it does.
Cheers
Suzuki
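The check described in this reply can be modelled roughly as follows. This is a sketch under the same toy assumption that features are named flags; verify_late_cpu and try_online are hypothetical names, not kernel functions.

```python
def verify_late_cpu(system_features, cpu_features):
    """Return the established features the late CPU lacks (empty means OK)."""
    return frozenset(system_features) - frozenset(cpu_features)


def try_online(system_features, cpu_features):
    """Allow the CPU online only if it has every established feature."""
    return not verify_late_cpu(system_features, cpu_features)
```

In this model a late CPU with extra features is accepted, since the extras are simply left unused, while a CPU missing an established feature is refused.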
* ARM64: CPU Hotplug: can't enable more cpus than maxcpus value (kernel 4.5)
From: Suzuki K. Poulose @ 2016-03-03 15:01 UTC
To: linux-arm-kernel
On 03/03/16 14:59, Suzuki K. Poulose wrote:
> On 03/03/16 14:42, Mark Rutland wrote:
>
>>> However, on arm64 cpu_present_mask is explicitly set according to the
>>> 'maxcpus' value. Is this the design intent?
>>
>> To some extent, yes.
>>
>> Due to the possibility of a heterogeneous system, we must bring all CPUs
>> online at boot time, and cannot defer this.
>>
>> This is necessary to detect the common subset of supported features, and
>> also to detect the full set of CPUs in the system to correctly apply
>> errata workarounds which require kernel text patching.
>
> We don't have this limitation anymore, as we can check if the booting CPU
s/we can/we do/
for the hotplugged CPUs.
Cheers
Suzuki
* ARM64: CPU Hotplug: can't enable more cpus than maxcpus value (kernel 4.5)
From: Mark Rutland @ 2016-03-03 15:12 UTC
To: linux-arm-kernel
On Thu, Mar 03, 2016 at 02:59:29PM +0000, Suzuki K. Poulose wrote:
> On 03/03/16 14:42, Mark Rutland wrote:
>
> >>However, on arm64 cpu_present_mask is explicitly set according to the
> >>'maxcpus' value. Is this the design intent?
> >
> >To some extent, yes.
> >
> >Due to the possibility of a heterogeneous system, we must bring all CPUs
> >online at boot time, and cannot defer this.
> >
> >This is necessary to detect the common subset of supported features, and
> >also to detect the full set of CPUs in the system to correctly apply
> >errata workarounds which require kernel text patching.
>
> We don't have this limitation anymore, as we can check if the booting CPU
> has any conflicting/missing features w.r.t the established set and fail the
> booting if it does.
While we do this, that's more of a last-ditch effort as opposed to a
general solution, and I'm not sure it's complete.
What happens when we online a CPU that we determine needs a new erratum
workaround applied? I didn't think we prohibited onlining in that case.
I guess maxcpus is effectively the same as physical CPU hotplug, and the
same caveats apply to both -- we can't reliably support either in the
general case.
Thanks,
Mark.
* ARM64: CPU Hotplug: can't enable more cpus than maxcpus value (kernel 4.5)
From: Suzuki K. Poulose @ 2016-03-03 15:41 UTC
To: linux-arm-kernel
On 03/03/16 15:12, Mark Rutland wrote:
> On Thu, Mar 03, 2016 at 02:59:29PM +0000, Suzuki K. Poulose wrote:
>> On 03/03/16 14:42, Mark Rutland wrote:
>> We don't have this limitation anymore, as we can check if the booting CPU
>> has any conflicting/missing features w.r.t the established set and fail the
>> booting if it does.
>
> While we do this, that's more of a last-ditch effort as opposed to a
> general solution, and I'm not sure it's complete.
>
> What happens when we online a CPU that we determine needs a new erratum
> workaround applied? I didn't think we prohibited onlining in that case.
Right, the errata workarounds can't be applied, as we would have freed them already.
Cheers
Suzuki
* ARM64: CPU Hotplug: can't enable more cpus than maxcpus value (kernel 4.5)
From: Suzuki K. Poulose @ 2016-03-07 17:22 UTC
To: linux-arm-kernel
On 03/03/16 15:41, Suzuki K. Poulose wrote:
> On 03/03/16 15:12, Mark Rutland wrote:
>> On Thu, Mar 03, 2016 at 02:59:29PM +0000, Suzuki K. Poulose wrote:
>>> On 03/03/16 14:42, Mark Rutland wrote:
>>> We don't have this limitation anymore, as we can check if the booting CPU
>>> has any conflicting/missing features w.r.t the established set and fail the
>>> booting if it does.
>>
>> While we do this, that's more of a last-ditch effort as opposed to a
>> general solution, and I'm not sure it's complete.
>>
>> What happens when we online a CPU that we determine needs a new erratum
>> workaround applied? I didn't think we prohibited onlining in that case.
>
> Right, the errata workarounds can't be applied, as we would have freed them already.
We could do what we now do for the CPU features though, i.e., fail any CPU
which has an erratum whose workaround hasn't been applied in the kernel at boot time.
Cheers
Suzuki
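A rough model of that idea, assuming each erratum can be identified by a name: a late CPU is refused if it needs a workaround that was not applied at boot. The function name and the erratum identifiers are placeholders invented for illustration.

```python
def errata_allow_online(applied_at_boot, needed_by_cpu):
    """True if every workaround the late CPU needs was applied at boot."""
    return set(needed_by_cpu) <= set(applied_at_boot)


# Placeholder identifiers, not real erratum numbers.
applied = {"ERRATUM_A", "ERRATUM_B"}
```

A CPU needing only ERRATUM_A would be allowed online in this model, while one needing a hypothetical ERRATUM_C would be rejected, because the text patching for it can no longer be done.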
* ARM64: CPU Hotplug: can't enable more cpus than maxcpus value (kernel 4.5)
From: Catalin Marinas @ 2016-03-07 17:18 UTC
To: linux-arm-kernel
On Thu, Mar 03, 2016 at 03:12:01PM +0000, Mark Rutland wrote:
> On Thu, Mar 03, 2016 at 02:59:29PM +0000, Suzuki K. Poulose wrote:
> > On 03/03/16 14:42, Mark Rutland wrote:
> >
> > >>However, on arm64 cpu_present_mask is explicitly set according to the
> > >>'maxcpus' value. Is this the design intent?
> > >
> > >To some extent, yes.
> > >
> > >Due to the possibility of a heterogeneous system, we must bring all CPUs
> > >online at boot time, and cannot defer this.
> > >
> > >This is necessary to detect the common subset of supported features, and
> > >also to detect the full set of CPUs in the system to correctly apply
> > >errata workarounds which require kernel text patching.
> >
> > We don't have this limitation anymore, as we can check if the booting CPU
> > has any conflicting/missing features w.r.t the established set and fail the
> > booting if it does.
>
> While we do this, that's more of a last-ditch effort as opposed to a
> general solution, and I'm not sure it's complete.
>
> What happens when we online a CPU that we determine needs a new erratum
> workaround applied? I didn't think we prohibited onlining in that case.
>
> I guess maxcpus is effectively the same as physical CPU hotplug, and the
> same caveats apply to both -- we can't reliably support either in the
> general case.
While I agree that there are still issues to address, I would rather
have such a feature in the kernel. It makes sense for some systems
with lots of (homogeneous) CPUs and we shouldn't penalise them just
because there are some big.LITTLE configurations out there.
My proposal is to block late hotplug of any CPU that the kernel was not
aware of during boot (e.g. a new MIDR). We could even simplify it and
check late CPUs against the MIDR of CPU0. I really don't care about
heterogeneous CPU systems wanting to do late hotplug (at least until
someone comes up with a real use-case).
--
Catalin
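The two variants of this proposal can be sketched as simple predicates. This is only an illustration of the policy, not kernel code: the function names are hypothetical and the MIDR values are arbitrary placeholders, not real part numbers.

```python
def may_hotplug_late(cpu_midr, boot_midrs):
    """General variant: allow only CPUs whose MIDR was seen during boot."""
    return cpu_midr in boot_midrs


def may_hotplug_late_simple(cpu_midr, cpu0_midr):
    """Simplified variant: allow only CPUs matching the MIDR of CPU0."""
    return cpu_midr == cpu0_midr


# Arbitrary placeholder MIDR values for two different CPU types.
MIDR_X = 0x1000
MIDR_Y = 0x2000
```

On a homogeneous system such as the 48-CPU machine in the report, every late CPU would share the MIDR of CPU0, so both variants would permit onlining CPUs 4 through 47. The simplified variant would refuse a late CPU of a different type even if a CPU of that type had been booted.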