* [PATCH 1/9] cpuidle: rename ARCH_HAS_CPU_RELAX to ARCH_HAS_OPTIMIZED_POLL
2024-04-30 18:37 [PATCH 0/9] Enable haltpoll for arm64 Ankur Arora
@ 2024-04-30 18:37 ` Ankur Arora
2024-05-02 1:33 ` Christoph Lameter (Ampere)
2024-04-30 18:37 ` [PATCH 2/9] Kconfig: move ARCH_HAS_OPTIMIZED_POLL to arch/Kconfig Ankur Arora
` (8 subsequent siblings)
9 siblings, 1 reply; 25+ messages in thread
From: Ankur Arora @ 2024-04-30 18:37 UTC (permalink / raw)
To: linux-pm, kvm, linux-arm-kernel, linux-kernel
Cc: catalin.marinas, will, tglx, mingo, bp, x86, hpa, pbonzini,
wanpengli, vkuznets, rafael, daniel.lezcano, peterz, arnd, lenb,
mark.rutland, harisokn, joao.m.martins, boris.ostrovsky,
konrad.wilk, ankur.a.arora
ARCH_HAS_CPU_RELAX is a bit of a misnomer since all architectures
define cpu_relax(). Not all, however, have a performant version, with
some only implementing it as a compiler barrier.
In the contexts where this config option is used, it is expected to provide
an architectural primitive that can be used as part of a polling
mechanism -- one that would be cheaper than spinning in a tight loop.
Advertise the availability of such a primitive by renaming to
ARCH_HAS_OPTIMIZED_POLL. And, while at it, explicitly condition
cpuidle-haltpoll and intel-idle, both of which depend on a polling
state, on it.
Suggested-by: Will Deacon <will@kernel.org>
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---
arch/x86/Kconfig | 2 +-
drivers/acpi/processor_idle.c | 4 ++--
drivers/cpuidle/Kconfig | 2 +-
drivers/cpuidle/Makefile | 2 +-
drivers/idle/Kconfig | 1 +
include/linux/cpuidle.h | 2 +-
6 files changed, 7 insertions(+), 6 deletions(-)
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 4474bf32d0a4..b238c874875a 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -368,7 +368,7 @@ config ARCH_MAY_HAVE_PC_FDC
config GENERIC_CALIBRATE_DELAY
def_bool y
-config ARCH_HAS_CPU_RELAX
+config ARCH_HAS_OPTIMIZED_POLL
def_bool y
config ARCH_HIBERNATION_POSSIBLE
diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
index bd6a7857ce05..ccef38410950 100644
--- a/drivers/acpi/processor_idle.c
+++ b/drivers/acpi/processor_idle.c
@@ -36,7 +36,7 @@
#include <asm/cpu.h>
#endif
-#define ACPI_IDLE_STATE_START (IS_ENABLED(CONFIG_ARCH_HAS_CPU_RELAX) ? 1 : 0)
+#define ACPI_IDLE_STATE_START (IS_ENABLED(CONFIG_ARCH_HAS_OPTIMIZED_POLL) ? 1 : 0)
static unsigned int max_cstate __read_mostly = ACPI_PROCESSOR_MAX_POWER;
module_param(max_cstate, uint, 0400);
@@ -787,7 +787,7 @@ static int acpi_processor_setup_cstates(struct acpi_processor *pr)
if (max_cstate == 0)
max_cstate = 1;
- if (IS_ENABLED(CONFIG_ARCH_HAS_CPU_RELAX)) {
+ if (IS_ENABLED(CONFIG_ARCH_HAS_OPTIMIZED_POLL)) {
cpuidle_poll_state_init(drv);
count = 1;
} else {
diff --git a/drivers/cpuidle/Kconfig b/drivers/cpuidle/Kconfig
index cac5997dca50..75f6e176bbc8 100644
--- a/drivers/cpuidle/Kconfig
+++ b/drivers/cpuidle/Kconfig
@@ -73,7 +73,7 @@ endmenu
config HALTPOLL_CPUIDLE
tristate "Halt poll cpuidle driver"
- depends on X86 && KVM_GUEST
+ depends on X86 && KVM_GUEST && ARCH_HAS_OPTIMIZED_POLL
select CPU_IDLE_GOV_HALTPOLL
default y
help
diff --git a/drivers/cpuidle/Makefile b/drivers/cpuidle/Makefile
index d103342b7cfc..f29dfd1525b0 100644
--- a/drivers/cpuidle/Makefile
+++ b/drivers/cpuidle/Makefile
@@ -7,7 +7,7 @@ obj-y += cpuidle.o driver.o governor.o sysfs.o governors/
obj-$(CONFIG_ARCH_NEEDS_CPU_IDLE_COUPLED) += coupled.o
obj-$(CONFIG_DT_IDLE_STATES) += dt_idle_states.o
obj-$(CONFIG_DT_IDLE_GENPD) += dt_idle_genpd.o
-obj-$(CONFIG_ARCH_HAS_CPU_RELAX) += poll_state.o
+obj-$(CONFIG_ARCH_HAS_OPTIMIZED_POLL) += poll_state.o
obj-$(CONFIG_HALTPOLL_CPUIDLE) += cpuidle-haltpoll.o
##################################################################################
diff --git a/drivers/idle/Kconfig b/drivers/idle/Kconfig
index 6707d2539fc4..6f9b1d48fede 100644
--- a/drivers/idle/Kconfig
+++ b/drivers/idle/Kconfig
@@ -4,6 +4,7 @@ config INTEL_IDLE
depends on CPU_IDLE
depends on X86
depends on CPU_SUP_INTEL
+ depends on ARCH_HAS_OPTIMIZED_POLL
help
Enable intel_idle, a cpuidle driver that includes knowledge of
native Intel hardware idle features. The acpi_idle driver
diff --git a/include/linux/cpuidle.h b/include/linux/cpuidle.h
index 3183aeb7f5b4..7e7e58a17b07 100644
--- a/include/linux/cpuidle.h
+++ b/include/linux/cpuidle.h
@@ -275,7 +275,7 @@ static inline void cpuidle_coupled_parallel_barrier(struct cpuidle_device *dev,
}
#endif
-#if defined(CONFIG_CPU_IDLE) && defined(CONFIG_ARCH_HAS_CPU_RELAX)
+#if defined(CONFIG_CPU_IDLE) && defined(CONFIG_ARCH_HAS_OPTIMIZED_POLL)
void cpuidle_poll_state_init(struct cpuidle_driver *drv);
#else
static inline void cpuidle_poll_state_init(struct cpuidle_driver *drv) {}
--
2.39.3
* Re: [PATCH 1/9] cpuidle: rename ARCH_HAS_CPU_RELAX to ARCH_HAS_OPTIMIZED_POLL
2024-04-30 18:37 ` [PATCH 1/9] cpuidle: rename ARCH_HAS_CPU_RELAX to ARCH_HAS_OPTIMIZED_POLL Ankur Arora
@ 2024-05-02 1:33 ` Christoph Lameter (Ampere)
2024-05-03 4:13 ` Ankur Arora
0 siblings, 1 reply; 25+ messages in thread
From: Christoph Lameter (Ampere) @ 2024-05-02 1:33 UTC (permalink / raw)
To: Ankur Arora
Cc: linux-pm, kvm, linux-arm-kernel, linux-kernel, catalin.marinas,
will, tglx, mingo, bp, x86, hpa, pbonzini, wanpengli, vkuznets,
rafael, daniel.lezcano, peterz, arnd, lenb, mark.rutland,
harisokn, joao.m.martins, boris.ostrovsky, konrad.wilk
On Tue, 30 Apr 2024, Ankur Arora wrote:
> ARCH_HAS_CPU_RELAX is a bit of a misnomer since all architectures
> define cpu_relax(). Not all, however, have a performant version, with
> some only implementing it as a compiler barrier.
>
> In the contexts where this config option is used, it is expected to provide
> an architectural primitive that can be used as part of a polling
> mechanism -- one that would be cheaper than spinning in a tight loop.
The intent of cpu_relax() is not a polling mechanism. Initially, AFAICT, it
was introduced on x86 as the REP NOP instruction, aka PAUSE, and it was
part of a spin loop. So there was no connection to polling anything.
The intent was to make the processor aware that we are in a spin loop.
Various processors have different actions that they take upon encountering
such a cpu relax operation.
The polling (WFE/WFI) available on ARM (and potentially other platforms)
is a different mechanism that is actually intended to reduce the power
requirement of the processor until a certain condition is met and that
check is done in hardware.
These are not the same and I think we need both config options.
The issues that you have with WFET later in the patchset arise from not
making this distinction.
The polling (waiting for an event) could be implemented for a
processor not supporting that in hardware by using a loop that
checks for the condition and then does a cpu_relax().
With that you could f.e. support the existing cpu_relax() and also have
some form of cpu_poll() interface.
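A minimal sketch of such a software fallback (the cpu_poll() name and
signature here are hypothetical, not an existing kernel interface):
static inline void cpu_poll(volatile unsigned long *ptr, unsigned long val)
{
        /*
         * No hardware wait-for-event: check the condition in software and
         * hint the pipeline on every pass via cpu_relax().
         */
        while (READ_ONCE(*ptr) == val)
                cpu_relax();
}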
* Re: [PATCH 1/9] cpuidle: rename ARCH_HAS_CPU_RELAX to ARCH_HAS_OPTIMIZED_POLL
2024-05-02 1:33 ` Christoph Lameter (Ampere)
@ 2024-05-03 4:13 ` Ankur Arora
2024-05-03 17:07 ` Christoph Lameter (Ampere)
0 siblings, 1 reply; 25+ messages in thread
From: Ankur Arora @ 2024-05-03 4:13 UTC (permalink / raw)
To: Christoph Lameter (Ampere)
Cc: Ankur Arora, linux-pm, kvm, linux-arm-kernel, linux-kernel,
catalin.marinas, will, tglx, mingo, bp, x86, hpa, pbonzini,
wanpengli, vkuznets, rafael, daniel.lezcano, peterz, arnd, lenb,
mark.rutland, harisokn, joao.m.martins, boris.ostrovsky,
konrad.wilk
Christoph Lameter (Ampere) <cl@gentwo.org> writes:
> On Tue, 30 Apr 2024, Ankur Arora wrote:
>
>> ARCH_HAS_CPU_RELAX is a bit of a misnomer since all architectures
>> define cpu_relax(). Not all, however, have a performant version, with
>> some only implementing it as a compiler barrier.
>>
>> In the contexts where this config option is used, it is expected to provide
>> an architectural primitive that can be used as part of a polling
>> mechanism -- one that would be cheaper than spinning in a tight loop.
>
> The intent of cpu_relax() is not a polling mechanism. Initially, AFAICT, it was
> introduced on x86 as the REP NOP instruction, aka PAUSE, and it was part of a
> spin loop. So there was no connection to polling anything.
Agreed, cpu_relax() is just a mechanism to tell the pipeline that
we are in a spin-loop.
> The intent was to make the processor aware that we are in a spin loop. Various
> processors have different actions that they take upon encountering such a cpu
> relax operation.
Sure, though most processors don't have a nice mechanism to do that.
x86 clearly has the REP; NOP thing. arm64 only has a YIELD which from my
measurements is basically a NOP when executed on a system without
hardware threads.
And that's why only x86 defines ARCH_HAS_CPU_RELAX.
> The polling (WFE/WFI) available on ARM (and potentially other platforms) is a
> different mechanism that is actually intended to reduce the power requirement of
> the processor until a certain condition is met and that check is done in
> hardware.
Sure. Which almost exactly fits the bill for the poll-idle loop -- except for the
timeout part.
> These are not the same and I think we need both config options.
My main concern is that poll_idle() conflates polling in idle with
ARCH_HAS_CPU_RELAX, when they aren't really related.
So, poll_idle(), and its users should depend on ARCH_HAS_OPTIMIZED_POLL
which, if defined by some architecture, means that poll_idle() would
be better than a spin-wait loop.
Beyond that I'm okay to keep ARCH_HAS_CPU_RELAX around.
That said, do you see a use for ARCH_HAS_CPU_RELAX? The only current
user is the poll-idle path.
> The issues that you have with WFET later in the patchset arise from not making
> this distinction.
Did you mean the issue with WFE? I'm not using WFET in this patchset at all.
With WFE, sure there's a problem in that you depend on an interrupt or
the event-stream to get out of the wait. And, so sometimes you would
overshoot the target poll timeout.
> The polling (waiting for an event) could be implemented for a processor not
> supporting that in hardware by using a loop that checks for the condition and
> then does a cpu_relax().
Yeah. That's exactly what patch-6 does. smp_cond_load_relaxed() uses
cpu_relax() internally in its spin-loop variant (non arm64).
On arm64, this would use LDXR; WFE. Or are you suggesting implementing
the arm64 loop via cpu_relax() (and thus YIELD?)
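For reference, the two variants look roughly like this (simplified from
include/asm-generic/barrier.h and arch/arm64/include/asm/barrier.h):
/* Generic variant: spin, re-reading the condition, with cpu_relax(). */
#define smp_cond_load_relaxed(ptr, cond_expr) ({                \
        typeof(ptr) __PTR = (ptr);                              \
        __unqual_scalar_typeof(*ptr) VAL;                       \
        for (;;) {                                              \
                VAL = READ_ONCE(*__PTR);                        \
                if (cond_expr)                                  \
                        break;                                  \
                cpu_relax();                                    \
        }                                                       \
        (typeof(*ptr))VAL;                                      \
})
/* arm64 variant: wait for a store to the cacheline via LDXR; WFE. */
#define smp_cond_load_relaxed(ptr, cond_expr) ({                \
        typeof(ptr) __PTR = (ptr);                              \
        __unqual_scalar_typeof(*ptr) VAL;                       \
        for (;;) {                                              \
                VAL = READ_ONCE(*__PTR);                        \
                if (cond_expr)                                  \
                        break;                                  \
                __cmpwait_relaxed(__PTR, VAL);                  \
        }                                                       \
        (typeof(*ptr))VAL;                                      \
})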
Ankur
> With that you could f.e. support the existing cpu_relax() and also have some
> form of cpu_poll() interface.
* Re: [PATCH 1/9] cpuidle: rename ARCH_HAS_CPU_RELAX to ARCH_HAS_OPTIMIZED_POLL
2024-05-03 4:13 ` Ankur Arora
@ 2024-05-03 17:07 ` Christoph Lameter (Ampere)
2024-05-06 21:27 ` Ankur Arora
0 siblings, 1 reply; 25+ messages in thread
From: Christoph Lameter (Ampere) @ 2024-05-03 17:07 UTC (permalink / raw)
To: Ankur Arora
Cc: linux-pm, kvm, linux-arm-kernel, linux-kernel, catalin.marinas,
will, tglx, mingo, bp, x86, hpa, pbonzini, wanpengli, vkuznets,
rafael, daniel.lezcano, peterz, arnd, lenb, mark.rutland,
harisokn, joao.m.martins, boris.ostrovsky, konrad.wilk
On Thu, 2 May 2024, Ankur Arora wrote:
>> The intent was to make the processor aware that we are in a spin loop. Various
>> processors have different actions that they take upon encountering such a cpu
>> relax operation.
>
> Sure, though most processors don't have a nice mechanism to do that.
> x86 clearly has the REP; NOP thing. arm64 only has a YIELD which from my
> measurements is basically a NOP when executed on a system without
> hardware threads.
>
> And that's why only x86 defines ARCH_HAS_CPU_RELAX.
My impression is that the use of arm YIELD has led cpu architects to
implement mechanisms similar to x86's PAUSE. This is not part of the spec
but it has been there for a long time. So I would rather leave it as is.
>> These are not the same and I think we need both config options.
>
> My main concern is that poll_idle() conflates polling in idle with
> ARCH_HAS_CPU_RELAX, when they aren't really related.
>
> So, poll_idle(), and its users should depend on ARCH_HAS_OPTIMIZED_POLL
> which, if defined by some architecture, means that poll_idle() would
> be better than a spin-wait loop.
>
> Beyond that I'm okay to keep ARCH_HAS_CPU_RELAX around.
>
> That said, do you see a use for ARCH_HAS_CPU_RELAX? The only current
> user is the poll-idle path.
I would think that we need a generic cpu_poll() mechanism that can fall
back to cpu_relax() on processors that do not offer such a thing (x86?)
and, if not even that is there, fall back.
We already have something like that in the smp_cond_acquire mechanism (a
bit weird to put that in barrier.h).
So what if we had
void cpu_wait(unsigned flags, unsigned long timeout, void *cacheline);
With
#define CPU_POLL_INTERRUPT (1 << 0)
#define CPU_POLL_EVENT (1 << 1)
#define CPU_POLL_CACHELINE (1 << 2)
#define CPU_POLL_TIMEOUT (1 << 3)
#define CPU_POLL_BROADCAST_EVENT (1 << 4)
#define CPU_POLL_LOCAL_EVENT (1 << 5)
The cpu_poll() function could be generically defined in asm-generic and
then arches could provide their own implementation optimizing the hardware
polling mechanisms.
Any number of flags could be specified simultaneously. On ARM this would
map then to SEVL SEV and WFI/WFE WFIT/WFET
So f.e.
cpu_wait(CPU_POLL_INTERRUPT|CPU_POLL_EVENT|CPU_POLL_TIMEOUT|CPU_POLL_CACHELINE,
timeout, &mylock);
to wait on a change in a cacheline with a timeout.
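A rough sketch of what the asm-generic fallback could look like (everything
here is hypothetical, following the proposed interface above; the timeout is
assumed to be a duration in jiffies):
/* Hypothetical asm-generic fallback: poll the cacheline and/or the clock. */
static inline void cpu_wait(unsigned flags, unsigned long timeout,
                            void *cacheline)
{
        unsigned long deadline = jiffies + timeout;
        unsigned long old = cacheline ?
                        READ_ONCE(*(unsigned long *)cacheline) : 0;

        for (;;) {
                if ((flags & CPU_POLL_CACHELINE) &&
                    READ_ONCE(*(unsigned long *)cacheline) != old)
                        break;
                if ((flags & CPU_POLL_TIMEOUT) &&
                    time_after(jiffies, deadline))
                        break;
                /*
                 * Without hardware support there is nothing to hook the
                 * interrupt/event flags into, so every variant degenerates
                 * to a cpu_relax() spin.
                 */
                cpu_relax();
        }
}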
In addition, we could then think about making effective use of the
signaling mechanism provided by SEV in core logic of the kernel. Maybe
that is more effective than waiting for a cacheline in some situations.
> With WFE, sure there's a problem in that you depend on an interrupt or
> the event-stream to get out of the wait. And, so sometimes you would
> overshoot the target poll timeout.
Right. The dependence on the event stream makes this approach a bit
strange. Having some sort of generic cpu_wait() feature with timeout spec
could avoid that.
* Re: [PATCH 1/9] cpuidle: rename ARCH_HAS_CPU_RELAX to ARCH_HAS_OPTIMIZED_POLL
2024-05-03 17:07 ` Christoph Lameter (Ampere)
@ 2024-05-06 21:27 ` Ankur Arora
0 siblings, 0 replies; 25+ messages in thread
From: Ankur Arora @ 2024-05-06 21:27 UTC (permalink / raw)
To: Christoph Lameter (Ampere)
Cc: Ankur Arora, linux-pm, kvm, linux-arm-kernel, linux-kernel,
catalin.marinas, will, tglx, mingo, bp, x86, hpa, pbonzini,
wanpengli, vkuznets, rafael, daniel.lezcano, peterz, arnd, lenb,
mark.rutland, harisokn, joao.m.martins, boris.ostrovsky,
konrad.wilk
Christoph Lameter (Ampere) <cl@gentwo.org> writes:
> On Thu, 2 May 2024, Ankur Arora wrote:
>
>>> The intent was to make the processor aware that we are in a spin loop. Various
>>> processors have different actions that they take upon encountering such a cpu
>>> relax operation.
>>
>> Sure, though most processors don't have a nice mechanism to do that.
>> x86 clearly has the REP; NOP thing. arm64 only has a YIELD which from my
>> measurements is basically a NOP when executed on a system without
>> hardware threads.
>>
>> And that's why only x86 defines ARCH_HAS_CPU_RELAX.
>
> My impression is that the use of arm YIELD has led cpu architects to implement
> mechanisms similar to x86's PAUSE. This is not part of the spec but it has been
> there for a long time. So I would rather leave it as is.
>
>
>>> These are not the same and I think we need both config options.
>>
>> My main concern is that poll_idle() conflates polling in idle with
>> ARCH_HAS_CPU_RELAX, when they aren't really related.
>>
>> So, poll_idle(), and its users should depend on ARCH_HAS_OPTIMIZED_POLL
>> which, if defined by some architecture, means that poll_idle() would
>> be better than a spin-wait loop.
>>
>> Beyond that I'm okay to keep ARCH_HAS_CPU_RELAX around.
>>
>> That said, do you see a use for ARCH_HAS_CPU_RELAX? The only current
>> user is the poll-idle path.
>
> I would think that we need a generic cpu_poll() mechanism that can fall back to
> cpu_relax() on processors that do not offer such a thing (x86?) and, if not even
> that is there, fall back.
>
> We already have something like that in the smp_cond_acquire mechanism (a bit
> weird to put that in barrier.h).
>
> So what if we had
>
> void cpu_wait(unsigned flags, unsigned long timeout, void *cacheline);
>
> With
>
> #define CPU_POLL_INTERRUPT (1 << 0)
> #define CPU_POLL_EVENT (1 << 1)
> #define CPU_POLL_CACHELINE (1 << 2)
> #define CPU_POLL_TIMEOUT (1 << 3)
> #define CPU_POLL_BROADCAST_EVENT (1 << 4)
> #define CPU_POLL_LOCAL_EVENT (1 << 5)
>
>
> The cpu_poll() function could be generically defined in asm-generic and then
> arches could provide their own implementation optimizing the hardware polling
> mechanisms.
>
> Any number of flags could be specified simultaneously. On ARM this would map
> then to SEVL SEV and WFI/WFE WFIT/WFET
>
> So f.e.
>
> cpu_wait(CPU_POLL_INTERRUPT|CPU_POLL_EVENT|CPU_POLL_TIMEOUT|CPU_POLL_CACHELINE,
> timeout, &mylock);
>
> to wait on a change in a cacheline with a timeout.
>
> In addition, we could then think about making effective use of the signaling
> mechanism provided by SEV in core logic of the kernel. Maybe that is more
> effective than waiting for a cacheline in some situations.
>
>
>> With WFE, sure there's a problem in that you depend on an interrupt or
>> the event-stream to get out of the wait. And, so sometimes you would
>> overshoot the target poll timeout.
>
> Right. The dependence on the event stream makes this approach a bit strange.
> Having some sort of generic cpu_wait() feature with timeout spec could avoid
> that.
Thanks for the detailed comments. Helped me think through some of the details.
So, there are three issues that you bring up. Let me address each in turn.
1) A generic cpu_poll() mechanism that can fall back to cpu_relax().
> I would think that we need a generic cpu_poll() mechanism that can fall back to
> cpu_relax() on processors that do not offer such a thing (x86?) and, if not even
> that is there, fall back.
>
> We already have something like that in the smp_cond_acquire mechanism (a bit
> weird to put that in barrier.h).
Isn't that exactly what this series does?
If you see patch-6, that gets rid of direct use of cpu_relax(), instead
using smp_cond_load_relaxed().
And smp_cond_load_relaxed(), in its generic variant (used everywhere but
arm64) uses cpu_relax() implicitly. Any architecture that overrides
this -- as arm64 does -- gets its own optimizations.
(Maybe this patch would be clearer if it was sequenced after patch-6?)
2) That brings me back to your second point, about having a different
interface which allows for different optimizations.
> void cpu_wait(unsigned flags, unsigned long timeout, void *cacheline);
>
> With
>
> #define CPU_POLL_INTERRUPT (1 << 0)
> #define CPU_POLL_EVENT (1 << 1)
> #define CPU_POLL_CACHELINE (1 << 2)
> #define CPU_POLL_TIMEOUT (1 << 3)
> #define CPU_POLL_BROADCAST_EVENT (1 << 4)
> #define CPU_POLL_LOCAL_EVENT (1 << 5)
I agree with you that the polling logic does need to handle timeouts
but I don't think that we need a special interface.
Given that we are only concerned about poll_idle() here and that needs
to work with the scheduler's set_nr_if_polling() machinery to elide IPIs,
the polling on a cacheline is needed anyway. That also means poll_idle()
doesn't need to handle interrupts.
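(The IPI elision works roughly like this, simplified from kernel/sched/core.c:
the waker tries to set TIF_NEED_RESCHED directly and skips the resched IPI when
the target advertises TIF_POLLING_NRFLAG, relying on the poller to notice the
store to its thread_info flags.)
static bool set_nr_if_polling(struct task_struct *p)
{
        struct thread_info *ti = task_thread_info(p);
        typeof(ti->flags) val = READ_ONCE(ti->flags);

        do {
                /* Not polling: the caller has to send the IPI instead. */
                if (!(val & _TIF_POLLING_NRFLAG))
                        return false;
                if (val & _TIF_NEED_RESCHED)
                        return true;
        } while (!try_cmpxchg(&ti->flags, &val, val | _TIF_NEED_RESCHED));

        return true;
}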
For the rest, the architecture could internally choose whichever
variation they perform best at -- so long as they can either spin-wait
or have some kind of event driven mechanism (WFE/WFET, MONITOR/MWAIT[X])
The timeout is something that I plan to address separately. I think we can
straightforwardly extend this to a smp_cond_load_relaxed_timeout() to do that,
at least for WFET-supporting platforms, while others depend on the event-stream.
#define smp_cond_load_relaxed_timeout(ptr, cond_expr, time_check_expr, \
                                      time_limit, timeout) ({          \
        typeof(ptr) __PTR = (ptr);                                      \
        __unqual_scalar_typeof(*ptr) VAL;                               \
        unsigned int __count = 0;                                       \
        for (;;) {                                                      \
                VAL = READ_ONCE(*__PTR);                                \
                if (cond_expr)                                          \
                        break;                                          \
                cpu_relax();                                            \
                if (__count++ < smp_cond_time_check_count)              \
                        continue;                                       \
                                                                        \
                if ((time_check_expr) > time_limit)                     \
                        goto timeout;                                   \
                                                                        \
                __count = 0;                                            \
        }                                                               \
        (typeof(*ptr))VAL;                                              \
})
arm64, for instance, can use alternatives to implement whichever variant
the processor supports (untested, also needs massaging).
static inline void __cmpwait_case_##sz(volatile void *ptr,                      \
                                       unsigned long val,                       \
                                       unsigned long etime)                     \
{                                                                               \
        unsigned long tmp;                                                      \
                                                                                \
        const unsigned long ecycles = xloops_to_cycles(nsecs_to_xloops(etime)); \
        asm volatile(                                                           \
        "       sevl\n"                                                         \
        "       wfe\n"                                                          \
        "       ldxr" #sfx "\t%" #w "[tmp], %[v]\n"                             \
        "       eor     %" #w "[tmp], %" #w "[tmp], %" #w "[val]\n"             \
        "       cbnz    %" #w "[tmp], 1f\n"                                     \
        ALTERNATIVE("wfe\n",                                                    \
                    "msr s0_3_c1_c0_0, %[ecycles]\n",                           \
                    ARM64_HAS_WFXT)                                             \
        "1:"                                                                    \
        : [tmp] "=&r" (tmp), [v] "+Q" (*(u##sz *)ptr)                           \
        : [val] "r" (val), [ecycles] "r" (ecycles));                            \
}
And then poll_idle() need only be:
static int __cpuidle poll_idle(struct cpuidle_device *dev,
                               struct cpuidle_driver *drv, int index)
{
        u64 time_start;
        time_start = local_clock_noinstr();
        dev->poll_time_limit = false;
        raw_local_irq_enable();
        if (!current_set_polling_and_test()) {
                u64 time_limit;
                time_limit = cpuidle_poll_time(drv, dev);
                smp_cond_load_relaxed_timeout(&current_thread_info()->flags,
                                              VAL & _TIF_NEED_RESCHED,
                                              local_clock_noinstr(),
                                              time_start + time_limit,
                                              timed_out);
        }
        ...
On x86, this generates code pretty similar to the current version and
on arm64 similar to the WFE version.
3) Dependence on the event-stream for the WFE variants
> Right. The dependence on the event stream makes this approach a bit strange.
> Having some sort of generic cpu_wait() feature with timeout spec could avoid
> that.
Not sure I agree with that. Seems to me, the event-stream is present for
exactly that -- so we don't wait in a WFE or WFI forever. The spec
says (section D12.2.3):
"An implementation that includes the Generic Timer can use the system
counter to generate one or more event streams, to generate periodic
wakeup events as part of the mechanism described in Wait for Event.
An event stream might be used:
- To impose a time-out on a Wait For Event polling loop."
The overshoot is a problem, but I don't think it is a huge one. Most
of the time that haltpoll is in effect, it should wake up with work
to do in the guest_halt_poll_ns duration. The times it doesn't, it
would exit poll_idle() and exit to the hypervisor.
So, this is only an issue in the last iteration.
And, I'm not sure how a generic cpu_wait() would work? Either the
generic cpu_wait() spins in cpu_relax()/YIELD which is suboptimal
on arm64 or it uses WFE but sets a timer for 50us or whatever. Seems
like unnecessary overhead when the overshoot is relatively uncommon.
Or is there another mechanism you are thinking of for enforcing a
timeout?
Thanks
--
ankur
* [PATCH 2/9] Kconfig: move ARCH_HAS_OPTIMIZED_POLL to arch/Kconfig
2024-04-30 18:37 [PATCH 0/9] Enable haltpoll for arm64 Ankur Arora
2024-04-30 18:37 ` [PATCH 1/9] cpuidle: rename ARCH_HAS_CPU_RELAX to ARCH_HAS_OPTIMIZED_POLL Ankur Arora
@ 2024-04-30 18:37 ` Ankur Arora
2024-04-30 18:37 ` [PATCH 3/9] cpuidle-haltpoll: condition on ARCH_CPUIDLE_HALTPOLL Ankur Arora
` (7 subsequent siblings)
9 siblings, 0 replies; 25+ messages in thread
From: Ankur Arora @ 2024-04-30 18:37 UTC (permalink / raw)
To: linux-pm, kvm, linux-arm-kernel, linux-kernel
Cc: catalin.marinas, will, tglx, mingo, bp, x86, hpa, pbonzini,
wanpengli, vkuznets, rafael, daniel.lezcano, peterz, arnd, lenb,
mark.rutland, harisokn, joao.m.martins, boris.ostrovsky,
konrad.wilk, ankur.a.arora
From: Joao Martins <joao.m.martins@oracle.com>
ARCH_HAS_OPTIMIZED_POLL gates selection of polling while idle in
poll_idle(). Move the configuration option to arch/Kconfig to allow
non-x86 architectures to select it.
Note that ARCH_HAS_OPTIMIZED_POLL should probably be exclusive with
GENERIC_IDLE_POLL_SETUP (which controls the generic polling logic in
cpu_idle_poll()). However, that would remove boot options
(hlt=, nohlt=). So, leave it untouched for now.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Mihai Carabas <mihai.carabas@oracle.com>
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---
arch/Kconfig | 3 +++
arch/x86/Kconfig | 4 +---
2 files changed, 4 insertions(+), 3 deletions(-)
diff --git a/arch/Kconfig b/arch/Kconfig
index 65afb1de48b3..6d918c19a099 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -256,6 +256,9 @@ config HAVE_ARCH_TRACEHOOK
config HAVE_DMA_CONTIGUOUS
bool
+config ARCH_HAS_OPTIMIZED_POLL
+ bool
+
config GENERIC_SMP_IDLE_THREAD
bool
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index b238c874875a..670ec5d5d923 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -131,6 +131,7 @@ config X86
select ARCH_WANTS_NO_INSTR
select ARCH_WANT_GENERAL_HUGETLB
select ARCH_WANT_HUGE_PMD_SHARE
+ select ARCH_HAS_OPTIMIZED_POLL
select ARCH_WANT_LD_ORPHAN_WARN
select ARCH_WANT_OPTIMIZE_DAX_VMEMMAP if X86_64
select ARCH_WANT_OPTIMIZE_HUGETLB_VMEMMAP if X86_64
@@ -368,9 +369,6 @@ config ARCH_MAY_HAVE_PC_FDC
config GENERIC_CALIBRATE_DELAY
def_bool y
-config ARCH_HAS_OPTIMIZED_POLL
- def_bool y
-
config ARCH_HIBERNATION_POSSIBLE
def_bool y
--
2.39.3
* [PATCH 3/9] cpuidle-haltpoll: condition on ARCH_CPUIDLE_HALTPOLL
2024-04-30 18:37 [PATCH 0/9] Enable haltpoll for arm64 Ankur Arora
2024-04-30 18:37 ` [PATCH 1/9] cpuidle: rename ARCH_HAS_CPU_RELAX to ARCH_HAS_OPTIMIZED_POLL Ankur Arora
2024-04-30 18:37 ` [PATCH 2/9] Kconfig: move ARCH_HAS_OPTIMIZED_POLL to arch/Kconfig Ankur Arora
@ 2024-04-30 18:37 ` Ankur Arora
2024-05-22 16:09 ` Joao Martins
2024-04-30 18:37 ` [PATCH 4/9] cpuidle-haltpoll: define arch_haltpoll_supported() Ankur Arora
` (6 subsequent siblings)
9 siblings, 1 reply; 25+ messages in thread
From: Ankur Arora @ 2024-04-30 18:37 UTC (permalink / raw)
To: linux-pm, kvm, linux-arm-kernel, linux-kernel
Cc: catalin.marinas, will, tglx, mingo, bp, x86, hpa, pbonzini,
wanpengli, vkuznets, rafael, daniel.lezcano, peterz, arnd, lenb,
mark.rutland, harisokn, joao.m.martins, boris.ostrovsky,
konrad.wilk, ankur.a.arora
The cpuidle-haltpoll driver and its namesake governor are selected
under KVM_GUEST on X86. In addition, KVM_GUEST also selects
ARCH_CPUIDLE_HALTPOLL and defines the requisite
arch_haltpoll_{enable,disable}() functions.
So remove the explicit dependence on KVM_GUEST, and instead use
ARCH_CPUIDLE_HALTPOLL as proxy for architectural support for
haltpoll.
While at it, change "halt poll" to "haltpoll" in one of the summary
clauses, since the second form is used everywhere else.
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---
drivers/cpuidle/Kconfig | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/drivers/cpuidle/Kconfig b/drivers/cpuidle/Kconfig
index 75f6e176bbc8..c1bebadf22bc 100644
--- a/drivers/cpuidle/Kconfig
+++ b/drivers/cpuidle/Kconfig
@@ -35,7 +35,6 @@ config CPU_IDLE_GOV_TEO
config CPU_IDLE_GOV_HALTPOLL
bool "Haltpoll governor (for virtualized systems)"
- depends on KVM_GUEST
help
This governor implements haltpoll idle state selection, to be
used in conjunction with the haltpoll cpuidle driver, allowing
@@ -72,8 +71,8 @@ source "drivers/cpuidle/Kconfig.riscv"
endmenu
config HALTPOLL_CPUIDLE
- tristate "Halt poll cpuidle driver"
- depends on X86 && KVM_GUEST && ARCH_HAS_OPTIMIZED_POLL
+ tristate "Haltpoll cpuidle driver"
+ depends on ARCH_CPUIDLE_HALTPOLL && ARCH_HAS_OPTIMIZED_POLL
select CPU_IDLE_GOV_HALTPOLL
default y
help
--
2.39.3
* Re: [PATCH 3/9] cpuidle-haltpoll: condition on ARCH_CPUIDLE_HALTPOLL
2024-04-30 18:37 ` [PATCH 3/9] cpuidle-haltpoll: condition on ARCH_CPUIDLE_HALTPOLL Ankur Arora
@ 2024-05-22 16:09 ` Joao Martins
0 siblings, 0 replies; 25+ messages in thread
From: Joao Martins @ 2024-05-22 16:09 UTC (permalink / raw)
To: Ankur Arora
Cc: catalin.marinas, will, tglx, mingo, bp, x86, hpa, pbonzini,
wanpengli, vkuznets, rafael, daniel.lezcano, peterz, linux-pm,
arnd, lenb, mark.rutland, harisokn, boris.ostrovsky, konrad.wilk,
kvm, linux-arm-kernel, linux-kernel
On 30/04/2024 19:37, Ankur Arora wrote:
> diff --git a/drivers/cpuidle/Kconfig b/drivers/cpuidle/Kconfig
> index 75f6e176bbc8..c1bebadf22bc 100644
> --- a/drivers/cpuidle/Kconfig
> +++ b/drivers/cpuidle/Kconfig
> @@ -72,8 +71,8 @@ source "drivers/cpuidle/Kconfig.riscv"
> endmenu
>
> config HALTPOLL_CPUIDLE
> - tristate "Halt poll cpuidle driver"
> - depends on X86 && KVM_GUEST && ARCH_HAS_OPTIMIZED_POLL
> + tristate "Haltpoll cpuidle driver"
> + depends on ARCH_CPUIDLE_HALTPOLL && ARCH_HAS_OPTIMIZED_POLL
> select CPU_IDLE_GOV_HALTPOLL
> default y
> help
I suspect the drop of KVM_GUEST is what's causing the kbuild robot failure, as
it's arch/x86/kernel/kvm.c that ends up including the arch haltpoll definitions.
* [PATCH 4/9] cpuidle-haltpoll: define arch_haltpoll_supported()
2024-04-30 18:37 [PATCH 0/9] Enable haltpoll for arm64 Ankur Arora
` (2 preceding siblings ...)
2024-04-30 18:37 ` [PATCH 3/9] cpuidle-haltpoll: condition on ARCH_CPUIDLE_HALTPOLL Ankur Arora
@ 2024-04-30 18:37 ` Ankur Arora
2024-05-01 11:48 ` kernel test robot
2024-05-22 16:09 ` Joao Martins
2024-04-30 18:37 ` [PATCH 5/9] governors/haltpoll: drop kvm_para_available() check Ankur Arora
` (5 subsequent siblings)
9 siblings, 2 replies; 25+ messages in thread
From: Ankur Arora @ 2024-04-30 18:37 UTC (permalink / raw)
To: linux-pm, kvm, linux-arm-kernel, linux-kernel
Cc: catalin.marinas, will, tglx, mingo, bp, x86, hpa, pbonzini,
wanpengli, vkuznets, rafael, daniel.lezcano, peterz, arnd, lenb,
mark.rutland, harisokn, joao.m.martins, boris.ostrovsky,
konrad.wilk, ankur.a.arora
From: Joao Martins <joao.m.martins@oracle.com>
Right now kvm_para_has_hint(KVM_HINTS_REALTIME) is x86 only. In
pursuit of making cpuidle-haltpoll architecture independent, define
arch_haltpoll_supported() which handles the architectural check for
enabling haltpoll.
Move the (kvm_para_available() && kvm_para_has_hint(KVM_HINTS_REALTIME))
check to the x86 specific arch_haltpoll_supported().
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Mihai Carabas <mihai.carabas@oracle.com>
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---
Changelog:
- s/arch_haltpoll_want/arch_haltpoll_supported/
- change the check in haltpoll_want() from:
(kvm_para_available() && arch_haltpoll_want()) || force;
to
arch_haltpoll_supported() || force;
Dropped Rafael's acked-by due to these changes.
---
arch/x86/include/asm/cpuidle_haltpoll.h | 1 +
arch/x86/kernel/kvm.c | 10 ++++++++++
drivers/cpuidle/cpuidle-haltpoll.c | 9 ++-------
include/linux/cpuidle_haltpoll.h | 5 +++++
4 files changed, 18 insertions(+), 7 deletions(-)
diff --git a/arch/x86/include/asm/cpuidle_haltpoll.h b/arch/x86/include/asm/cpuidle_haltpoll.h
index c8b39c6716ff..43ce79b88662 100644
--- a/arch/x86/include/asm/cpuidle_haltpoll.h
+++ b/arch/x86/include/asm/cpuidle_haltpoll.h
@@ -4,5 +4,6 @@
void arch_haltpoll_enable(unsigned int cpu);
void arch_haltpoll_disable(unsigned int cpu);
+bool arch_haltpoll_supported(void);
#endif
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 7f0732bc0ccd..e4dcbe9acc07 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -1151,4 +1151,14 @@ void arch_haltpoll_disable(unsigned int cpu)
smp_call_function_single(cpu, kvm_enable_host_haltpoll, NULL, 1);
}
EXPORT_SYMBOL_GPL(arch_haltpoll_disable);
+
+bool arch_haltpoll_supported(void)
+{
+ /* Do not load haltpoll if idle= is passed */
+ if (boot_option_idle_override != IDLE_NO_OVERRIDE)
+ return false;
+
+ return kvm_para_available() && kvm_para_has_hint(KVM_HINTS_REALTIME);
+}
+EXPORT_SYMBOL_GPL(arch_haltpoll_supported);
#endif
diff --git a/drivers/cpuidle/cpuidle-haltpoll.c b/drivers/cpuidle/cpuidle-haltpoll.c
index d8515d5c0853..70f585383171 100644
--- a/drivers/cpuidle/cpuidle-haltpoll.c
+++ b/drivers/cpuidle/cpuidle-haltpoll.c
@@ -15,7 +15,6 @@
#include <linux/cpuidle.h>
#include <linux/module.h>
#include <linux/sched/idle.h>
-#include <linux/kvm_para.h>
#include <linux/cpuidle_haltpoll.h>
static bool force __read_mostly;
@@ -95,7 +94,7 @@ static void haltpoll_uninit(void)
static bool haltpoll_want(void)
{
- return kvm_para_has_hint(KVM_HINTS_REALTIME) || force;
+ return arch_haltpoll_supported() || force;
}
static int __init haltpoll_init(void)
@@ -103,11 +102,7 @@ static int __init haltpoll_init(void)
int ret;
struct cpuidle_driver *drv = &haltpoll_driver;
- /* Do not load haltpoll if idle= is passed */
- if (boot_option_idle_override != IDLE_NO_OVERRIDE)
- return -ENODEV;
-
- if (!kvm_para_available() || !haltpoll_want())
+ if (!haltpoll_want())
return -ENODEV;
cpuidle_poll_state_init(drv);
diff --git a/include/linux/cpuidle_haltpoll.h b/include/linux/cpuidle_haltpoll.h
index d50c1e0411a2..a3caf01d3f0e 100644
--- a/include/linux/cpuidle_haltpoll.h
+++ b/include/linux/cpuidle_haltpoll.h
@@ -12,5 +12,10 @@ static inline void arch_haltpoll_enable(unsigned int cpu)
static inline void arch_haltpoll_disable(unsigned int cpu)
{
}
+
+static inline bool arch_haltpoll_supported(void)
+{
+ return false;
+}
#endif
#endif
--
2.39.3
* Re: [PATCH 4/9] cpuidle-haltpoll: define arch_haltpoll_supported()
2024-04-30 18:37 ` [PATCH 4/9] cpuidle-haltpoll: define arch_haltpoll_supported() Ankur Arora
@ 2024-05-01 11:48 ` kernel test robot
2024-05-22 16:09 ` Joao Martins
1 sibling, 0 replies; 25+ messages in thread
From: kernel test robot @ 2024-05-01 11:48 UTC (permalink / raw)
To: Ankur Arora, linux-pm, kvm, linux-arm-kernel, linux-kernel
Cc: llvm, oe-kbuild-all, catalin.marinas, will, tglx, mingo, bp, x86,
hpa, pbonzini, wanpengli, vkuznets, rafael, daniel.lezcano,
peterz, arnd, lenb, mark.rutland, harisokn, joao.m.martins,
boris.ostrovsky, konrad.wilk, ankur.a.arora
Hi Ankur,
kernel test robot noticed the following build errors:
[auto build test ERROR on rafael-pm/linux-next]
[also build test ERROR on rafael-pm/bleeding-edge tip/x86/core arm64/for-next/core linus/master v6.9-rc6 next-20240430]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Ankur-Arora/cpuidle-rename-ARCH_HAS_CPU_RELAX-to-ARCH_HAS_OPTIMIZED_POLL/20240501-024252
base: https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git linux-next
patch link: https://lore.kernel.org/r/20240430183730.561960-5-ankur.a.arora%40oracle.com
patch subject: [PATCH 4/9] cpuidle-haltpoll: define arch_haltpoll_supported()
config: i386-buildonly-randconfig-001-20240501 (https://download.01.org/0day-ci/archive/20240501/202405011942.NBEU9bJO-lkp@intel.com/config)
compiler: clang version 18.1.4 (https://github.com/llvm/llvm-project e6c3289804a67ea0bb6a86fadbe454dd93b8d855)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240501/202405011942.NBEU9bJO-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202405011942.NBEU9bJO-lkp@intel.com/
All errors (new ones prefixed by >>):
>> ld.lld: error: undefined symbol: arch_haltpoll_enable
>>> referenced by cpuidle-haltpoll.c
>>> drivers/cpuidle/cpuidle-haltpoll.o:(haltpoll_cpu_online) in archive vmlinux.a
--
>> ld.lld: error: undefined symbol: arch_haltpoll_disable
>>> referenced by cpuidle-haltpoll.c
>>> drivers/cpuidle/cpuidle-haltpoll.o:(haltpoll_cpu_offline) in archive vmlinux.a
--
>> ld.lld: error: undefined symbol: arch_haltpoll_supported
>>> referenced by cpuidle-haltpoll.c
>>> drivers/cpuidle/cpuidle-haltpoll.o:(haltpoll_init) in archive vmlinux.a
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
* Re: [PATCH 4/9] cpuidle-haltpoll: define arch_haltpoll_supported()
2024-04-30 18:37 ` [PATCH 4/9] cpuidle-haltpoll: define arch_haltpoll_supported() Ankur Arora
2024-05-01 11:48 ` kernel test robot
@ 2024-05-22 16:09 ` Joao Martins
2024-06-05 5:47 ` Ankur Arora
1 sibling, 1 reply; 25+ messages in thread
From: Joao Martins @ 2024-05-22 16:09 UTC (permalink / raw)
To: Ankur Arora
Cc: catalin.marinas, will, tglx, mingo, bp, x86, hpa, pbonzini,
wanpengli, vkuznets, rafael, daniel.lezcano, peterz, arnd, lenb,
mark.rutland, harisokn, boris.ostrovsky, konrad.wilk,
linux-kernel, linux-pm, kvm, linux-arm-kernel
On 30/04/2024 19:37, Ankur Arora wrote:
> From: Joao Martins <joao.m.martins@oracle.com>
>
> Right now kvm_para_has_hint(KVM_HINTS_REALTIME) is x86 only. In
> pursuit of making cpuidle-haltpoll architecture independent, define
> arch_haltpoll_supported() which handles the architectural check for
> enabling haltpoll.
>
> Move the (kvm_para_available() && kvm_para_has_hint(KVM_HINTS_REALTIME))
> check to the x86 specific arch_haltpoll_supported().
>
> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
> Signed-off-by: Mihai Carabas <mihai.carabas@oracle.com>
> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
>
> ---
> Changelog:
>
> - s/arch_haltpoll_want/arch_haltpoll_supported/
I am not sure it's correct to call it supported() considering that it's supposed to
always be supported (via WFE or cpu_relax()), and that's not exactly what it is doing.
The function you were changing is more about whether it's enabled by default or
not. So I think the old name in v4 is more appropriate, i.e. arch_haltpoll_want().
Alternatively you could have it called arch_haltpoll_default_enabled() though
it's longer/verbose.
Though if you want a true supported() arch helper, *I think* you need to make a
bigger change: introduce arch_haltpoll_supported() separate from
arch_haltpoll_want(), where the former would ignore the .force=y modparam and
never allow loading if a given feature wasn't present, e.g. make arm64
haltpoll loading conditional on arch_timer_evtstrm_available() being present.
Though I don't think that you want this AIUI
Joao
* Re: [PATCH 4/9] cpuidle-haltpoll: define arch_haltpoll_supported()
2024-05-22 16:09 ` Joao Martins
@ 2024-06-05 5:47 ` Ankur Arora
0 siblings, 0 replies; 25+ messages in thread
From: Ankur Arora @ 2024-06-05 5:47 UTC (permalink / raw)
To: Joao Martins
Cc: Ankur Arora, catalin.marinas, will, tglx, mingo, bp, x86, hpa,
pbonzini, wanpengli, vkuznets, rafael, daniel.lezcano, peterz,
arnd, lenb, mark.rutland, harisokn, boris.ostrovsky, konrad.wilk,
linux-kernel, linux-pm, kvm, linux-arm-kernel
Joao Martins <joao.m.martins@oracle.com> writes:
> On 30/04/2024 19:37, Ankur Arora wrote:
>> From: Joao Martins <joao.m.martins@oracle.com>
>>
>> Right now kvm_para_has_hint(KVM_HINTS_REALTIME) is x86 only. In
>> pursuit of making cpuidle-haltpoll architecture independent, define
>> arch_haltpoll_supported() which handles the architectural check for
>> enabling haltpoll.
>>
>> Move the (kvm_para_available() && kvm_para_has_hint(KVM_HINTS_REALTIME))
>> check to the x86 specific arch_haltpoll_supported().
>>
>> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
>> Signed-off-by: Mihai Carabas <mihai.carabas@oracle.com>
>> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
>>
>> ---
>> Changelog:
>>
>> - s/arch_haltpoll_want/arch_haltpoll_supported/
>
>
> I am not sure it's correct to call supported() considering that it's supposed to
> always supported (via WFE or cpu_relax()) and it's not exactly what it is doing.
> The function you were changing is more about whether it's default enabled or
> not. So I think the old name in v4 is more appropriate i.e. arch_haltpoll_want()
>
> Alternatively you could have it called arch_haltpoll_default_enabled() though
> it's longer/verbose.
So, I thought about it some and the driver loading decision tree
should be:
1. bail out based on the value of boot_option_idle_override.
2. if arch_haltpoll_supported(), enable haltpoll
3. if cpuidle-haltpoll.force=1, enable haltpoll.
Note: in the posted versions, cpuidle-haltpoll.force is allowed to
override boot_option_idle_override, which is wrong. With that fixed
the x86 check should be:
bool arch_haltpoll_supported(void)
{
        return kvm_para_available() && kvm_para_has_hint(KVM_HINTS_REALTIME);
}
and arm64:
static inline bool arch_haltpoll_supported(void)
{
        /*
         * Ensure the event stream is available to provide a terminating
         * condition to the WFE in the poll loop.
         */
        return arch_timer_evtstrm_available();
}
Now, both of these fit reasonably well with arch_haltpoll_supported().
My personal preference for that is because it seems to me that the
architecture code should just deal with mechanism and not policy.
However, as you imply arch_haltpoll_supported() is a more loaded name
and given that the KVM side of arm64 haltpoll is not done yet, it's
best to have a more neutral label like arch_haltpoll_want() or
arch_haltpoll_do_enable().
> Though if you want a true supported() arch helper *I think* you need to make a
> bigger change into introducing arch_haltpoll_supported() separate from
> arch_haltpoll_want() where the former would ignore the .force=y modparam and
> never be able to load if a given feature wasn't present e.g. prevent arm64
> haltpoll loading be conditioned to arch_timer_evtstrm_available() being present.
>
> Though I don't think that you want this AIUI
Yeah, I don't. I think cpuidle-haltpoll.force=1 should be allowed to
override arch_haltpoll_supported(), so long as smp_cond_load_relaxed()
is well defined (as it is here).
It shouldn't, however, override the user's choice of boot_option_idle_override.
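A sketch of the corresponding driver-side ordering, under those assumptions:
static bool haltpoll_want(void)
{
        return arch_haltpoll_supported() || force;
}

static int __init haltpoll_init(void)
{
        /* The user's idle= choice wins over everything, including force=1. */
        if (boot_option_idle_override != IDLE_NO_OVERRIDE)
                return -ENODEV;

        /* Otherwise load if the arch opts in, or if the user forced it. */
        if (!haltpoll_want())
                return -ENODEV;

        /*
         * The rest of init (poll state setup, driver registration) stays
         * as in the existing driver and is elided here.
         */
        return 0;
}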
--
ankur
* [PATCH 5/9] governors/haltpoll: drop kvm_para_available() check
2024-04-30 18:37 [PATCH 0/9] Enable haltpoll for arm64 Ankur Arora
` (3 preceding siblings ...)
2024-04-30 18:37 ` [PATCH 4/9] cpuidle-haltpoll: define arch_haltpoll_supported() Ankur Arora
@ 2024-04-30 18:37 ` Ankur Arora
2024-04-30 18:37 ` [PATCH 6/9] cpuidle/poll_state: poll via smp_cond_load_relaxed() Ankur Arora
` (4 subsequent siblings)
9 siblings, 0 replies; 25+ messages in thread
From: Ankur Arora @ 2024-04-30 18:37 UTC (permalink / raw)
To: linux-pm, kvm, linux-arm-kernel, linux-kernel
Cc: catalin.marinas, will, tglx, mingo, bp, x86, hpa, pbonzini,
wanpengli, vkuznets, rafael, daniel.lezcano, peterz, arnd, lenb,
mark.rutland, harisokn, joao.m.martins, boris.ostrovsky,
konrad.wilk, ankur.a.arora
From: Joao Martins <joao.m.martins@oracle.com>
The haltpoll governor is selected either by the cpuidle-haltpoll
driver, or explicitly by the user.
In particular, it is never selected by default since it has the lowest
rating of all governors (menu=20, teo=19, ladder=10/25, haltpoll=9).
So, we can safely forgo the kvm_para_available() check. This also
allows cpuidle-haltpoll to be tested on baremetal.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Mihai Carabas <mihai.carabas@oracle.com>
Acked-by: "Rafael J. Wysocki" <rafael@kernel.org>
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---
drivers/cpuidle/governors/haltpoll.c | 6 +-----
1 file changed, 1 insertion(+), 5 deletions(-)
diff --git a/drivers/cpuidle/governors/haltpoll.c b/drivers/cpuidle/governors/haltpoll.c
index 663b7f164d20..c8752f793e61 100644
--- a/drivers/cpuidle/governors/haltpoll.c
+++ b/drivers/cpuidle/governors/haltpoll.c
@@ -18,7 +18,6 @@
#include <linux/tick.h>
#include <linux/sched.h>
#include <linux/module.h>
-#include <linux/kvm_para.h>
#include <trace/events/power.h>
static unsigned int guest_halt_poll_ns __read_mostly = 200000;
@@ -148,10 +147,7 @@ static struct cpuidle_governor haltpoll_governor = {
static int __init init_haltpoll(void)
{
- if (kvm_para_available())
- return cpuidle_register_governor(&haltpoll_governor);
-
- return 0;
+ return cpuidle_register_governor(&haltpoll_governor);
}
postcore_initcall(init_haltpoll);
--
2.39.3
* [PATCH 6/9] cpuidle/poll_state: poll via smp_cond_load_relaxed()
2024-04-30 18:37 [PATCH 0/9] Enable haltpoll for arm64 Ankur Arora
` (4 preceding siblings ...)
2024-04-30 18:37 ` [PATCH 5/9] governors/haltpoll: drop kvm_para_available() check Ankur Arora
@ 2024-04-30 18:37 ` Ankur Arora
2024-04-30 18:37 ` [PATCH 7/9] arm64: define TIF_POLLING_NRFLAG Ankur Arora
` (3 subsequent siblings)
9 siblings, 0 replies; 25+ messages in thread
From: Ankur Arora @ 2024-04-30 18:37 UTC (permalink / raw)
To: linux-pm, kvm, linux-arm-kernel, linux-kernel
Cc: catalin.marinas, will, tglx, mingo, bp, x86, hpa, pbonzini,
wanpengli, vkuznets, rafael, daniel.lezcano, peterz, arnd, lenb,
mark.rutland, harisokn, joao.m.martins, boris.ostrovsky,
konrad.wilk, ankur.a.arora
From: Mihai Carabas <mihai.carabas@oracle.com>
The inner loop in poll_idle() polls up to POLL_IDLE_RELAX_COUNT times,
checking to see if the thread has the TIF_NEED_RESCHED bit set. The
loop exits once the condition is met, or if the poll time limit has
been exceeded.
The time check is done only infrequently (once in POLL_IDLE_RELAX_COUNT
iterations) so as to minimize the number of instructions executed in
each iteration. In addition, each loop iteration executes cpu_relax()
which on certain platforms provides a hint to the pipeline that the
loop is busy-waiting, thus allowing the processor to reduce power
consumption.
However, cpu_relax() is not defined optimally everywhere. In particular,
on arm64, it is implemented as a YIELD which merely serves as a hint to
prefer a different hardware thread if one is available.
arm64 exposes a better mechanism via smp_cond_load_relaxed() which uses
LDXR, WFE where the LDXR loads a memory region in exclusive state and
the WFE waits for any stores to the region.
So restructure the loop and fold both checks in smp_cond_load_relaxed().
Also, move the time check to the head of the loop so, once
TIF_NEED_RESCHED is set, we exit straight-away without doing an
unnecessary time check.
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Mihai Carabas <mihai.carabas@oracle.com>
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---
Changelog:
- reorganized the loop to keep the original poll_idle() structure.
---
drivers/cpuidle/poll_state.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/drivers/cpuidle/poll_state.c b/drivers/cpuidle/poll_state.c
index 9b6d90a72601..532e4ed19e0f 100644
--- a/drivers/cpuidle/poll_state.c
+++ b/drivers/cpuidle/poll_state.c
@@ -21,21 +21,21 @@ static int __cpuidle poll_idle(struct cpuidle_device *dev,
raw_local_irq_enable();
if (!current_set_polling_and_test()) {
- unsigned int loop_count = 0;
+ unsigned int loop_count;
u64 limit;
limit = cpuidle_poll_time(drv, dev);
while (!need_resched()) {
- cpu_relax();
- if (loop_count++ < POLL_IDLE_RELAX_COUNT)
- continue;
-
loop_count = 0;
if (local_clock_noinstr() - time_start > limit) {
dev->poll_time_limit = true;
break;
}
+
+ smp_cond_load_relaxed(&current_thread_info()->flags,
+ VAL & _TIF_NEED_RESCHED ||
+ loop_count++ >= POLL_IDLE_RELAX_COUNT);
}
}
raw_local_irq_disable();
--
2.39.3
* [PATCH 7/9] arm64: define TIF_POLLING_NRFLAG
2024-04-30 18:37 [PATCH 0/9] Enable haltpoll for arm64 Ankur Arora
` (5 preceding siblings ...)
2024-04-30 18:37 ` [PATCH 6/9] cpuidle/poll_state: poll via smp_cond_load_relaxed() Ankur Arora
@ 2024-04-30 18:37 ` Ankur Arora
2024-04-30 18:37 ` [PATCH 8/9] arm64: support cpuidle-haltpoll Ankur Arora
` (2 subsequent siblings)
9 siblings, 0 replies; 25+ messages in thread
From: Ankur Arora @ 2024-04-30 18:37 UTC (permalink / raw)
To: linux-pm, kvm, linux-arm-kernel, linux-kernel
Cc: catalin.marinas, will, tglx, mingo, bp, x86, hpa, pbonzini,
wanpengli, vkuznets, rafael, daniel.lezcano, peterz, arnd, lenb,
mark.rutland, harisokn, joao.m.martins, boris.ostrovsky,
konrad.wilk, ankur.a.arora
From: Joao Martins <joao.m.martins@oracle.com>
Commit 842514849a61 ("arm64: Remove TIF_POLLING_NRFLAG") had removed
TIF_POLLING_NRFLAG because arm64 only supported non-polled idling via
cpu_do_idle().
To add support for polling via cpuidle-haltpoll, we want to use the
standard poll_idle() interface, which sets TIF_POLLING_NRFLAG while
polling.
Reuse the same bit to define TIF_POLLING_NRFLAG.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Mihai Carabas <mihai.carabas@oracle.com>
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---
arch/arm64/include/asm/thread_info.h | 2 ++
1 file changed, 2 insertions(+)
diff --git a/arch/arm64/include/asm/thread_info.h b/arch/arm64/include/asm/thread_info.h
index e72a3bf9e563..23ff72168e48 100644
--- a/arch/arm64/include/asm/thread_info.h
+++ b/arch/arm64/include/asm/thread_info.h
@@ -69,6 +69,7 @@ void arch_setup_new_exec(void);
#define TIF_SYSCALL_TRACEPOINT 10 /* syscall tracepoint for ftrace */
#define TIF_SECCOMP 11 /* syscall secure computing */
#define TIF_SYSCALL_EMU 12 /* syscall emulation active */
+#define TIF_POLLING_NRFLAG 16 /* set while polling in poll_idle() */
#define TIF_MEMDIE 18 /* is terminating due to OOM killer */
#define TIF_FREEZE 19
#define TIF_RESTORE_SIGMASK 20
@@ -91,6 +92,7 @@ void arch_setup_new_exec(void);
#define _TIF_SYSCALL_TRACEPOINT (1 << TIF_SYSCALL_TRACEPOINT)
#define _TIF_SECCOMP (1 << TIF_SECCOMP)
#define _TIF_SYSCALL_EMU (1 << TIF_SYSCALL_EMU)
+#define _TIF_POLLING_NRFLAG (1 << TIF_POLLING_NRFLAG)
#define _TIF_UPROBE (1 << TIF_UPROBE)
#define _TIF_SINGLESTEP (1 << TIF_SINGLESTEP)
#define _TIF_32BIT (1 << TIF_32BIT)
--
2.39.3
* [PATCH 8/9] arm64: support cpuidle-haltpoll
2024-04-30 18:37 [PATCH 0/9] Enable haltpoll for arm64 Ankur Arora
` (6 preceding siblings ...)
2024-04-30 18:37 ` [PATCH 7/9] arm64: define TIF_POLLING_NRFLAG Ankur Arora
@ 2024-04-30 18:37 ` Ankur Arora
2024-05-30 23:07 ` Okanovic, Haris
2024-06-19 12:17 ` Sudeep Holla
2024-04-30 18:37 ` [PATCH 9/9] cpuidle/poll_state: limit POLL_IDLE_RELAX_COUNT on arm64 Ankur Arora
2024-04-30 18:56 ` [PATCH 0/9] Enable haltpoll for arm64 Ankur Arora
9 siblings, 2 replies; 25+ messages in thread
From: Ankur Arora @ 2024-04-30 18:37 UTC (permalink / raw)
To: linux-pm, kvm, linux-arm-kernel, linux-kernel
Cc: catalin.marinas, will, tglx, mingo, bp, x86, hpa, pbonzini,
wanpengli, vkuznets, rafael, daniel.lezcano, peterz, arnd, lenb,
mark.rutland, harisokn, joao.m.martins, boris.ostrovsky,
konrad.wilk, ankur.a.arora
Add architectural support for the cpuidle-haltpoll driver by defining
arch_haltpoll_*(). Also select ARCH_HAS_OPTIMIZED_POLL since we have
an optimized polling mechanism via smp_cond_load*().
Add the configuration option, ARCH_CPUIDLE_HALTPOLL to allow
cpuidle-haltpoll to be selected.
Note that we limit cpuidle-haltpoll support to when the event-stream is
available. This is necessary because polling via smp_cond_load_relaxed()
uses WFE to wait for a store which might not happen for a prolonged
period of time. So, ensure the event-stream is around to provide a
terminating condition.
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---
arch/arm64/Kconfig | 10 ++++++++++
arch/arm64/include/asm/cpuidle_haltpoll.h | 21 +++++++++++++++++++++
2 files changed, 31 insertions(+)
create mode 100644 arch/arm64/include/asm/cpuidle_haltpoll.h
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 7b11c98b3e84..6f2df162b10e 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -34,6 +34,7 @@ config ARM64
select ARCH_HAS_MEMBARRIER_SYNC_CORE
select ARCH_HAS_NMI_SAFE_THIS_CPU_OPS
select ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
+ select ARCH_HAS_OPTIMIZED_POLL
select ARCH_HAS_PTE_DEVMAP
select ARCH_HAS_PTE_SPECIAL
select ARCH_HAS_HW_PTE_YOUNG
@@ -2331,6 +2332,15 @@ config ARCH_HIBERNATION_HEADER
config ARCH_SUSPEND_POSSIBLE
def_bool y
+config ARCH_CPUIDLE_HALTPOLL
+ bool "Enable selection of the cpuidle-haltpoll driver"
+ default n
+ help
+ cpuidle-haltpoll allows for adaptive polling based on
+ current load before entering the idle state.
+
+ Some virtualized workloads benefit from using it.
+
endmenu # "Power management options"
menu "CPU Power Management"
diff --git a/arch/arm64/include/asm/cpuidle_haltpoll.h b/arch/arm64/include/asm/cpuidle_haltpoll.h
new file mode 100644
index 000000000000..a79bdec7f516
--- /dev/null
+++ b/arch/arm64/include/asm/cpuidle_haltpoll.h
@@ -0,0 +1,21 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_HALTPOLL_H
+#define _ASM_HALTPOLL_H
+
+static inline void arch_haltpoll_enable(unsigned int cpu)
+{
+}
+
+static inline void arch_haltpoll_disable(unsigned int cpu)
+{
+}
+
+static inline bool arch_haltpoll_supported(void)
+{
+ /*
+ * Ensure the event stream is available to provide a terminating
+ * condition to the WFE in the poll loop.
+ */
+ return arch_timer_evtstrm_available();
+}
+#endif
--
2.39.3
* Re: [PATCH 8/9] arm64: support cpuidle-haltpoll
2024-04-30 18:37 ` [PATCH 8/9] arm64: support cpuidle-haltpoll Ankur Arora
@ 2024-05-30 23:07 ` Okanovic, Haris
2024-06-04 23:09 ` Ankur Arora
2024-06-19 12:17 ` Sudeep Holla
1 sibling, 1 reply; 25+ messages in thread
From: Okanovic, Haris @ 2024-05-30 23:07 UTC (permalink / raw)
To: linux-arm-kernel@lists.infradead.org, kvm@vger.kernel.org,
linux-pm@vger.kernel.org, linux-kernel@vger.kernel.org,
ankur.a.arora@oracle.com
Cc: joao.m.martins@oracle.com, boris.ostrovsky@oracle.com,
konrad.wilk@oracle.com, wanpengli@tencent.com, mingo@redhat.com,
catalin.marinas@arm.com, pbonzini@redhat.com, tglx@linutronix.de,
daniel.lezcano@linaro.org, lenb@kernel.org, arnd@arndb.de,
will@kernel.org, hpa@zytor.com, peterz@infradead.org,
vkuznets@redhat.com, bp@alien8.de, Okanovic, Haris,
rafael@kernel.org, x86@kernel.org, mark.rutland@arm.com
On Tue, 2024-04-30 at 11:37 -0700, Ankur Arora wrote:
> Add architectural support for the cpuidle-haltpoll driver by defining
> arch_haltpoll_*(). Also select ARCH_HAS_OPTIMIZED_POLL since we have
> an optimized polling mechanism via smp_cond_load*().
>
> Add the configuration option, ARCH_CPUIDLE_HALTPOLL to allow
> cpuidle-haltpoll to be selected.
>
> Note that we limit cpuidle-haltpoll support to when the event-stream is
> available. This is necessary because polling via smp_cond_load_relaxed()
> uses WFE to wait for a store which might not happen for an prolonged
> period of time. So, ensure the event-stream is around to provide a
> terminating condition.
>
> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
> ---
> arch/arm64/Kconfig | 10 ++++++++++
> arch/arm64/include/asm/cpuidle_haltpoll.h | 21 +++++++++++++++++++++
> 2 files changed, 31 insertions(+)
> create mode 100644 arch/arm64/include/asm/cpuidle_haltpoll.h
>
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 7b11c98b3e84..6f2df162b10e 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -34,6 +34,7 @@ config ARM64
> select ARCH_HAS_MEMBARRIER_SYNC_CORE
> select ARCH_HAS_NMI_SAFE_THIS_CPU_OPS
> select ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
> + select ARCH_HAS_OPTIMIZED_POLL
> select ARCH_HAS_PTE_DEVMAP
> select ARCH_HAS_PTE_SPECIAL
> select ARCH_HAS_HW_PTE_YOUNG
> @@ -2331,6 +2332,15 @@ config ARCH_HIBERNATION_HEADER
> config ARCH_SUSPEND_POSSIBLE
> def_bool y
>
> +config ARCH_CPUIDLE_HALTPOLL
> + bool "Enable selection of the cpuidle-haltpoll driver"
> + default n
> + help
> + cpuidle-haltpoll allows for adaptive polling based on
> + current load before entering the idle state.
> +
> + Some virtualized workloads benefit from using it.
> +
> endmenu # "Power management options"
>
> menu "CPU Power Management"
> diff --git a/arch/arm64/include/asm/cpuidle_haltpoll.h b/arch/arm64/include/asm/cpuidle_haltpoll.h
> new file mode 100644
> index 000000000000..a79bdec7f516
> --- /dev/null
> +++ b/arch/arm64/include/asm/cpuidle_haltpoll.h
> @@ -0,0 +1,21 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef _ASM_HALTPOLL_H
> +#define _ASM_HALTPOLL_H
> +
> +static inline void arch_haltpoll_enable(unsigned int cpu)
> +{
> +}
> +
> +static inline void arch_haltpoll_disable(unsigned int cpu)
> +{
> +}
> +
> +static inline bool arch_haltpoll_supported(void)
> +{
> + /*
> + * Ensure the event stream is available to provide a terminating
> + * condition to the WFE in the poll loop.
> + */
> + return arch_timer_evtstrm_available();
Note this fails to build when CONFIG_HALTPOLL_CPUIDLE=m (module):
ERROR: modpost: "arch_cpu_idle" [drivers/cpuidle/cpuidle-haltpoll.ko]
undefined!
ERROR: modpost: "arch_timer_evtstrm_available"
[drivers/cpuidle/cpuidle-haltpoll.ko] undefined!
make[2]: *** [scripts/Makefile.modpost:145: Module.symvers] Error 1
make[1]: *** [/home/ubuntu/linux/Makefile:1886: modpost] Error 2
make: *** [Makefile:240: __sub-make] Error 2
You could add EXPORT_SYMBOL_*()'s for the above helpers, or restrict the
HALTPOLL_CPUIDLE module to built-in only (drop the "tristate" from its Kconfig).
Otherwise, everything worked for me when built-in (=y) atop 6.10.0
(4a4be1a). I see similar performance gains in `perf bench` on AWS
Graviton3 c7g.16xlarge.
Regards,
Haris Okanovic
> +}
> +#endif
> --
> 2.39.3
>
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 8/9] arm64: support cpuidle-haltpoll
2024-05-30 23:07 ` Okanovic, Haris
@ 2024-06-04 23:09 ` Ankur Arora
0 siblings, 0 replies; 25+ messages in thread
From: Ankur Arora @ 2024-06-04 23:09 UTC (permalink / raw)
To: Okanovic, Haris
Cc: linux-arm-kernel@lists.infradead.org, kvm@vger.kernel.org,
linux-pm@vger.kernel.org, linux-kernel@vger.kernel.org,
ankur.a.arora@oracle.com, joao.m.martins@oracle.com,
boris.ostrovsky@oracle.com, konrad.wilk@oracle.com,
wanpengli@tencent.com, mingo@redhat.com, catalin.marinas@arm.com,
pbonzini@redhat.com, tglx@linutronix.de,
daniel.lezcano@linaro.org, lenb@kernel.org, arnd@arndb.de,
will@kernel.org, hpa@zytor.com, peterz@infradead.org,
vkuznets@redhat.com, bp@alien8.de, rafael@kernel.org,
x86@kernel.org, mark.rutland@arm.com
Okanovic, Haris <harisokn@amazon.com> writes:
> On Tue, 2024-04-30 at 11:37 -0700, Ankur Arora wrote:
>> Add architectural support for the cpuidle-haltpoll driver by defining
>> arch_haltpoll_*(). Also select ARCH_HAS_OPTIMIZED_POLL since we have
>> an optimized polling mechanism via smp_cond_load*().
>>
>> Add the configuration option, ARCH_CPUIDLE_HALTPOLL to allow
>> cpuidle-haltpoll to be selected.
>>
>> Note that we limit cpuidle-haltpoll support to when the event-stream is
>> available. This is necessary because polling via smp_cond_load_relaxed()
>> uses WFE to wait for a store which might not happen for an prolonged
>> period of time. So, ensure the event-stream is around to provide a
>> terminating condition.
>>
>> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
>> ---
>> arch/arm64/Kconfig | 10 ++++++++++
>> arch/arm64/include/asm/cpuidle_haltpoll.h | 21 +++++++++++++++++++++
>> 2 files changed, 31 insertions(+)
>> create mode 100644 arch/arm64/include/asm/cpuidle_haltpoll.h
>>
>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
>> index 7b11c98b3e84..6f2df162b10e 100644
>> --- a/arch/arm64/Kconfig
>> +++ b/arch/arm64/Kconfig
>> @@ -34,6 +34,7 @@ config ARM64
>> select ARCH_HAS_MEMBARRIER_SYNC_CORE
>> select ARCH_HAS_NMI_SAFE_THIS_CPU_OPS
>> select ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
>> + select ARCH_HAS_OPTIMIZED_POLL
>> select ARCH_HAS_PTE_DEVMAP
>> select ARCH_HAS_PTE_SPECIAL
>> select ARCH_HAS_HW_PTE_YOUNG
>> @@ -2331,6 +2332,15 @@ config ARCH_HIBERNATION_HEADER
>> config ARCH_SUSPEND_POSSIBLE
>> def_bool y
>>
>> +config ARCH_CPUIDLE_HALTPOLL
>> + bool "Enable selection of the cpuidle-haltpoll driver"
>> + default n
>> + help
>> + cpuidle-haltpoll allows for adaptive polling based on
>> + current load before entering the idle state.
>> +
>> + Some virtualized workloads benefit from using it.
>> +
>> endmenu # "Power management options"
>>
>> menu "CPU Power Management"
>> diff --git a/arch/arm64/include/asm/cpuidle_haltpoll.h b/arch/arm64/include/asm/cpuidle_haltpoll.h
>> new file mode 100644
>> index 000000000000..a79bdec7f516
>> --- /dev/null
>> +++ b/arch/arm64/include/asm/cpuidle_haltpoll.h
>> @@ -0,0 +1,21 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +#ifndef _ASM_HALTPOLL_H
>> +#define _ASM_HALTPOLL_H
>> +
>> +static inline void arch_haltpoll_enable(unsigned int cpu)
>> +{
>> +}
>> +
>> +static inline void arch_haltpoll_disable(unsigned int cpu)
>> +{
>> +}
>> +
>> +static inline bool arch_haltpoll_supported(void)
>> +{
>> + /*
>> + * Ensure the event stream is available to provide a terminating
>> + * condition to the WFE in the poll loop.
>> + */
>> + return arch_timer_evtstrm_available();
>
> Note this fails build when CONFIG_HALTPOLL_CPUIDLE=m (module):
>
> ERROR: modpost: "arch_cpu_idle" [drivers/cpuidle/cpuidle-haltpoll.ko]
> undefined!
> ERROR: modpost: "arch_timer_evtstrm_available"
> [drivers/cpuidle/cpuidle-haltpoll.ko] undefined!
> make[2]: *** [scripts/Makefile.modpost:145: Module.symvers] Error 1
> make[1]: *** [/home/ubuntu/linux/Makefile:1886: modpost] Error 2
> make: *** [Makefile:240: __sub-make] Error 2
Thanks for trying it out. Missed that.
> You could add EXPORT_SYMBOL_*()'s on the above helpers or restrict
> HALTPOLL_CPUIDLE module to built-in (remove "tristate" Kconfig).
Yeah, AFAICT this is the only cpuidle driver providing the module
option. Unfortunately we can't remove the tristate; people might
already be using it as a module on x86.
I think arch_cpu_idle() makes sense to export. As for
arch_timer_evtstrm_available(), the arch_haltpoll_*() helpers need to
move out of the header file anyway. I'll just do that now.
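Roughly along these lines (untested sketch; the file name and exact
placement are placeholders, with the header keeping only the
declarations):

/* Hypothetical arch/arm64/kernel/cpuidle_haltpoll.c */
#include <linux/export.h>
#include <linux/types.h>
#include <clocksource/arm_arch_timer.h>
#include <asm/cpuidle_haltpoll.h>

bool arch_haltpoll_supported(void)
{
	/* The event stream terminates the WFE in the poll loop. */
	return arch_timer_evtstrm_available();
}
EXPORT_SYMBOL_GPL(arch_haltpoll_supported);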
> Otherwise, everything worked for me when built-in (=y) atop 6.10.0
> (4a4be1a). I see similar performance gains in `perf bench` on AWS
> Graviton3 c7g.16xlarge.
Excellent. Thanks for checking.
--
ankur
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 8/9] arm64: support cpuidle-haltpoll
2024-04-30 18:37 ` [PATCH 8/9] arm64: support cpuidle-haltpoll Ankur Arora
2024-05-30 23:07 ` Okanovic, Haris
@ 2024-06-19 12:17 ` Sudeep Holla
2024-06-21 23:59 ` Ankur Arora
1 sibling, 1 reply; 25+ messages in thread
From: Sudeep Holla @ 2024-06-19 12:17 UTC (permalink / raw)
To: Ankur Arora
Cc: linux-pm, kvm, Sudeep Holla, linux-arm-kernel, linux-kernel,
catalin.marinas, will, tglx, mingo, bp, x86, hpa, pbonzini,
wanpengli, vkuznets, rafael, daniel.lezcano, peterz, arnd, lenb,
mark.rutland, harisokn, joao.m.martins, boris.ostrovsky,
konrad.wilk
On Tue, Apr 30, 2024 at 11:37:29AM -0700, Ankur Arora wrote:
> Add architectural support for the cpuidle-haltpoll driver by defining
> arch_haltpoll_*(). Also select ARCH_HAS_OPTIMIZED_POLL since we have
> an optimized polling mechanism via smp_cond_load*().
>
> Add the configuration option, ARCH_CPUIDLE_HALTPOLL to allow
> cpuidle-haltpoll to be selected.
>
> Note that we limit cpuidle-haltpoll support to when the event-stream is
> available. This is necessary because polling via smp_cond_load_relaxed()
> uses WFE to wait for a store which might not happen for an prolonged
> period of time. So, ensure the event-stream is around to provide a
> terminating condition.
>
Currently the event stream is configured at 10kHz (1 signal per 100us, IIRC),
but the exit latency and residency information in the cpuidle states is
set to 0 (as per drivers/cpuidle/poll_state.c). Will this not cause any
performance issues?
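For reference, IIRC the period comes from these constants in
include/clocksource/arm_arch_timer.h:

#define ARCH_TIMER_EVT_STREAM_PERIOD_US	100
#define ARCH_TIMER_EVT_STREAM_FREQ	(USEC_PER_SEC / ARCH_TIMER_EVT_STREAM_PERIOD_US)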
--
Regards,
Sudeep
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 8/9] arm64: support cpuidle-haltpoll
2024-06-19 12:17 ` Sudeep Holla
@ 2024-06-21 23:59 ` Ankur Arora
2024-06-24 10:54 ` Sudeep Holla
0 siblings, 1 reply; 25+ messages in thread
From: Ankur Arora @ 2024-06-21 23:59 UTC (permalink / raw)
To: Sudeep Holla
Cc: Ankur Arora, linux-pm, kvm, linux-arm-kernel, linux-kernel,
catalin.marinas, will, tglx, mingo, bp, x86, hpa, pbonzini,
wanpengli, vkuznets, rafael, daniel.lezcano, peterz, arnd, lenb,
mark.rutland, harisokn, joao.m.martins, boris.ostrovsky,
konrad.wilk
Sudeep Holla <sudeep.holla@arm.com> writes:
> On Tue, Apr 30, 2024 at 11:37:29AM -0700, Ankur Arora wrote:
>> Add architectural support for the cpuidle-haltpoll driver by defining
>> arch_haltpoll_*(). Also select ARCH_HAS_OPTIMIZED_POLL since we have
>> an optimized polling mechanism via smp_cond_load*().
>>
>> Add the configuration option, ARCH_CPUIDLE_HALTPOLL to allow
>> cpuidle-haltpoll to be selected.
>>
>> Note that we limit cpuidle-haltpoll support to when the event-stream is
>> available. This is necessary because polling via smp_cond_load_relaxed()
>> uses WFE to wait for a store which might not happen for an prolonged
>> period of time. So, ensure the event-stream is around to provide a
>> terminating condition.
>>
>
> Currently the event stream is configured 10kHz(1 signal per 100uS IIRC).
> But the information in the cpuidle states for exit latency and residency
> is set to 0(as per drivers/cpuidle/poll_state.c). Will this not cause any
> performance issues ?
No I don't think there's any performance issue.
When the core is waiting in WFE for &thread_info->flags to
change, and set_nr_if_polling() happens, the CPU will come out
of the wait quickly.
So, the exit latency and residency can reasonably be set to 0.
If, however, there is no store to &thread_info->flags, then the event
stream is what would cause us to come out of the WFE and check if
the poll timeout has been exceeded.
In that case, there was no work to be done, so there was nothing
to wake up from.
So, in either circumstance there's no performance loss.
However, when we are polling under the haltpoll governor, this might
mean that we spend more time polling than the limit determined from
guest_halt_poll_ns. But that would only happen in the last polling
iteration.
So, I'd say, at worst no performance loss. But, we would sometimes
poll for longer than necessary before exiting to the host.
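(To put a rough number on it, assuming the usual ~100us event-stream
period: the overshoot in that final iteration is bounded by about one
event-stream period, i.e. we might poll for up to roughly
guest_halt_poll_ns + 100us before exiting to the host.)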
--
ankur
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 8/9] arm64: support cpuidle-haltpoll
2024-06-21 23:59 ` Ankur Arora
@ 2024-06-24 10:54 ` Sudeep Holla
2024-06-25 1:17 ` Ankur Arora
0 siblings, 1 reply; 25+ messages in thread
From: Sudeep Holla @ 2024-06-24 10:54 UTC (permalink / raw)
To: Ankur Arora
Cc: linux-pm, kvm, Sudeep Holla, linux-arm-kernel, linux-kernel,
catalin.marinas, will, tglx, mingo, bp, x86, hpa, pbonzini,
wanpengli, vkuznets, rafael, daniel.lezcano, peterz, arnd, lenb,
mark.rutland, harisokn, joao.m.martins, boris.ostrovsky,
konrad.wilk
On Fri, Jun 21, 2024 at 04:59:22PM -0700, Ankur Arora wrote:
>
> Sudeep Holla <sudeep.holla@arm.com> writes:
>
> > On Tue, Apr 30, 2024 at 11:37:29AM -0700, Ankur Arora wrote:
> >> Add architectural support for the cpuidle-haltpoll driver by defining
> >> arch_haltpoll_*(). Also select ARCH_HAS_OPTIMIZED_POLL since we have
> >> an optimized polling mechanism via smp_cond_load*().
> >>
> >> Add the configuration option, ARCH_CPUIDLE_HALTPOLL to allow
> >> cpuidle-haltpoll to be selected.
> >>
> >> Note that we limit cpuidle-haltpoll support to when the event-stream is
> >> available. This is necessary because polling via smp_cond_load_relaxed()
> >> uses WFE to wait for a store which might not happen for an prolonged
> >> period of time. So, ensure the event-stream is around to provide a
> >> terminating condition.
> >>
> >
> > Currently the event stream is configured 10kHz(1 signal per 100uS IIRC).
> > But the information in the cpuidle states for exit latency and residency
> > is set to 0(as per drivers/cpuidle/poll_state.c). Will this not cause any
> > performance issues ?
>
> No I don't think there's any performance issue.
>
Thanks for the confirmation, that was my assumption as well.
> When the core is waiting in WFE for &thread_info->flags to
> change, and set_nr_if_polling() happens, the CPU will come out
> of the wait quickly.
> So, the exit latency, residency can be reasonably set to 0.
>
Sure
> If, however, there is no store to &thread_info->flags, then the event
> stream is what would cause us to come out of the WFE and check if
> the poll timeout has been exceeded.
> In that case, there was no work to be done, so there was nothing
> to wake up from.
>
This is exactly what I was referring to when I asked about performance, but
it looks like it is not a concern for the reasons specified above.
> So, in either circumstance there's no performance loss.
>
> However, when we are polling under the haltpoll governor, this might
> mean that we spend more time polling than determined based on the
> guest_halt_poll_ns. But, that would only happen in the last polling
> iteration.
>
> So, I'd say, at worst no performance loss. But, we would sometimes
> poll for longer than necessary before exiting to the host.
>
Does it make sense to add a comment that briefly captures what we
have discussed here? Mainly why exit latency and target residency values
of 0 are fine, and how the worst-case WFE wakeup doesn't impact performance.
--
Regards,
Sudeep
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 8/9] arm64: support cpuidle-haltpoll
2024-06-24 10:54 ` Sudeep Holla
@ 2024-06-25 1:17 ` Ankur Arora
0 siblings, 0 replies; 25+ messages in thread
From: Ankur Arora @ 2024-06-25 1:17 UTC (permalink / raw)
To: Sudeep Holla
Cc: Ankur Arora, linux-pm, kvm, linux-arm-kernel, linux-kernel,
catalin.marinas, will, tglx, mingo, bp, x86, hpa, pbonzini,
wanpengli, vkuznets, rafael, daniel.lezcano, peterz, arnd, lenb,
mark.rutland, harisokn, joao.m.martins, boris.ostrovsky,
konrad.wilk
Sudeep Holla <sudeep.holla@arm.com> writes:
> On Fri, Jun 21, 2024 at 04:59:22PM -0700, Ankur Arora wrote:
>>
>> Sudeep Holla <sudeep.holla@arm.com> writes:
>>
>> > On Tue, Apr 30, 2024 at 11:37:29AM -0700, Ankur Arora wrote:
>> >> Add architectural support for the cpuidle-haltpoll driver by defining
>> >> arch_haltpoll_*(). Also select ARCH_HAS_OPTIMIZED_POLL since we have
>> >> an optimized polling mechanism via smp_cond_load*().
>> >>
>> >> Add the configuration option, ARCH_CPUIDLE_HALTPOLL to allow
>> >> cpuidle-haltpoll to be selected.
>> >>
>> >> Note that we limit cpuidle-haltpoll support to when the event-stream is
>> >> available. This is necessary because polling via smp_cond_load_relaxed()
>> >> uses WFE to wait for a store which might not happen for an prolonged
>> >> period of time. So, ensure the event-stream is around to provide a
>> >> terminating condition.
>> >>
>> >
>> > Currently the event stream is configured 10kHz(1 signal per 100uS IIRC).
>> > But the information in the cpuidle states for exit latency and residency
>> > is set to 0(as per drivers/cpuidle/poll_state.c). Will this not cause any
>> > performance issues ?
>>
>> No I don't think there's any performance issue.
>>
>
> Thanks for the confirmation, that was my assumption as well.
>
>> When the core is waiting in WFE for &thread_info->flags to
>> change, and set_nr_if_polling() happens, the CPU will come out
>> of the wait quickly.
>> So, the exit latency, residency can be reasonably set to 0.
>>
>
> Sure
>
>> If, however, there is no store to &thread_info->flags, then the event
>> stream is what would cause us to come out of the WFE and check if
>> the poll timeout has been exceeded.
>> In that case, there was no work to be done, so there was nothing
>> to wake up from.
>>
>
> This is exactly what I was referring when I asked about performance, but
> it looks like it is not a concern for the reason specified about.
>
>> So, in either circumstance there's no performance loss.
>>
>> However, when we are polling under the haltpoll governor, this might
>> mean that we spend more time polling than determined based on the
>> guest_halt_poll_ns. But, that would only happen in the last polling
>> iteration.
>>
>> So, I'd say, at worst no performance loss. But, we would sometimes
>> poll for longer than necessary before exiting to the host.
>>
>
> Does it make sense to add some comment that implies briefly what we
> have discussed here ? Mainly why 0 exit and target residency values
> are fine and how worst case WFE wakeup doesn't impact the performance.
Yeah, let me flesh out the commit message for this patch a bit more.
Thanks for the review!
--
ankur
^ permalink raw reply [flat|nested] 25+ messages in thread
* [PATCH 9/9] cpuidle/poll_state: limit POLL_IDLE_RELAX_COUNT on arm64
2024-04-30 18:37 [PATCH 0/9] Enable haltpoll for arm64 Ankur Arora
` (7 preceding siblings ...)
2024-04-30 18:37 ` [PATCH 8/9] arm64: support cpuidle-haltpoll Ankur Arora
@ 2024-04-30 18:37 ` Ankur Arora
2024-04-30 18:56 ` [PATCH 0/9] Enable haltpoll for arm64 Ankur Arora
9 siblings, 0 replies; 25+ messages in thread
From: Ankur Arora @ 2024-04-30 18:37 UTC (permalink / raw)
To: linux-pm, kvm, linux-arm-kernel, linux-kernel
Cc: catalin.marinas, will, tglx, mingo, bp, x86, hpa, pbonzini,
wanpengli, vkuznets, rafael, daniel.lezcano, peterz, arnd, lenb,
mark.rutland, harisokn, joao.m.martins, boris.ostrovsky,
konrad.wilk, ankur.a.arora
smp_cond_load_relaxed(), in its generic polling variant, spins on the
loop condition, waiting for it to change and eventually exiting the loop
once the time limit has been exceeded.
To limit the frequency of the time check, it is done only once every
POLL_IDLE_RELAX_COUNT iterations.
arm64, however, uses an event-based mechanism: instead of polling, we
wait for a store to the watched address.
Limit POLL_IDLE_RELAX_COUNT to 1 for that case.
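For reference, a simplified sketch of the loop shape being tuned here
(details elided; the real loop lives in drivers/cpuidle/poll_state.c):

/* Simplified sketch of the generic poll_idle() inner loop. */
u64 time_start = local_clock();
u64 limit = cpuidle_poll_time(drv, dev);
unsigned int loop_count = 0;

while (!need_resched()) {
	cpu_relax();	/* becomes smp_cond_load_relaxed() in this series */
	if (loop_count++ < POLL_IDLE_RELAX_COUNT)
		continue;

	loop_count = 0;
	/* Time check only once every POLL_IDLE_RELAX_COUNT iterations. */
	if (local_clock() - time_start > limit) {
		dev->poll_time_limit = true;
		break;
	}
}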
Suggested-by: Haris Okanovic <harisokn@amazon.com>
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---
drivers/cpuidle/poll_state.c | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/drivers/cpuidle/poll_state.c b/drivers/cpuidle/poll_state.c
index 532e4ed19e0f..b69fe7b67cb4 100644
--- a/drivers/cpuidle/poll_state.c
+++ b/drivers/cpuidle/poll_state.c
@@ -8,7 +8,18 @@
#include <linux/sched/clock.h>
#include <linux/sched/idle.h>
+#ifdef CONFIG_ARM64
+/*
+ * POLL_IDLE_RELAX_COUNT determines how often we check for timeout
+ * while polling for TIF_NEED_RESCHED in thread_info->flags.
+ *
+ * Set this to a low value since arm64, instead of polling, uses an
+ * event-based mechanism.
+ */
+#define POLL_IDLE_RELAX_COUNT 1
+#else
#define POLL_IDLE_RELAX_COUNT 200
+#endif
static int __cpuidle poll_idle(struct cpuidle_device *dev,
struct cpuidle_driver *drv, int index)
--
2.39.3
^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [PATCH 0/9] Enable haltpoll for arm64
2024-04-30 18:37 [PATCH 0/9] Enable haltpoll for arm64 Ankur Arora
` (8 preceding siblings ...)
2024-04-30 18:37 ` [PATCH 9/9] cpuidle/poll_state: limit POLL_IDLE_RELAX_COUNT on arm64 Ankur Arora
@ 2024-04-30 18:56 ` Ankur Arora
9 siblings, 0 replies; 25+ messages in thread
From: Ankur Arora @ 2024-04-30 18:56 UTC (permalink / raw)
To: Ankur Arora
Cc: linux-pm, kvm, linux-arm-kernel, linux-kernel, catalin.marinas,
will, tglx, mingo, bp, x86, hpa, pbonzini, wanpengli, vkuznets,
rafael, daniel.lezcano, peterz, arnd, lenb, mark.rutland,
harisokn, joao.m.martins, boris.ostrovsky, konrad.wilk
> Subject: Re: [PATCH 0/9] Enable haltpoll for arm64
A correction: please read the subject for the series as [PATCH v5] ...
Missed the version number while sending it out.
Thanks
Ankur
Ankur Arora <ankur.a.arora@oracle.com> writes:
> This patchset enables the cpuidle-haltpoll driver and its namesake
> governor on arm64. This is specifically interesting for KVM guests,
> where it reduces IPC latencies.
>
> Comparing idle switching latencies on an arm64 KVM guest with
> perf bench sched pipe:
>
> usecs/op %stdev
>
> no haltpoll (baseline) 13.48 +- 5.19%
> with haltpoll 6.84 +- 22.07%
>
>
> No change in performance for a similar test on x86:
>
> usecs/op %stdev
>
> haltpoll w/ cpu_relax() (baseline) 4.75 +- 1.76%
> haltpoll w/ smp_cond_load_relaxed() 4.78 +- 2.31%
>
> Both sets of tests were on otherwise idle systems with guest VCPUs
> pinned to specific PCPUs. One reason for the higher stdev on arm64
> is that trapping of the WFE instruction by the host KVM is contingent
> on the number of tasks on the runqueue.
>
>
> The patch series is organized in four parts:
> - patches 1, 2 mangle the config option ARCH_HAS_CPU_RELAX, renaming
> and moving it from x86 to common architectural code.
> - next, patches 3-5, reorganize the haltpoll selection and init logic
> to allow architecture code to select it.
> - patch 6, reorganizes the poll_idle() loop, switching from using
> cpu_relax() directly to smp_cond_load_relaxed().
> - and finally, patches 7-9, add the bits for arm64 support.
>
> What is still missing: this series largely completes the haltpoll side
> of functionality for arm64. There are, however, a few related areas
> that still need to be threshed out:
>
> - WFET support: WFE on arm64 does not guarantee that poll_idle()
> would terminate in halt_poll_ns. Using WFET would address this.
> - KVM_NO_POLL support on arm64
> - KVM TWED support on arm64: allow the host to limit time spent in
> WFE.
>
>
> Changelog:
>
> v5:
> - rework the poll_idle() loop around smp_cond_load_relaxed() (review
> comment from Tomohiro Misono.)
> - also rework selection of cpuidle-haltpoll. Now selected based
> on the architectural selection of ARCH_CPUIDLE_HALTPOLL.
> - arch_haltpoll_supported() (renamed from arch_haltpoll_want()) on
> arm64 now depends on the event-stream being enabled.
> - limit POLL_IDLE_RELAX_COUNT on arm64 (review comment from Haris Okanovic)
> - ARCH_HAS_CPU_RELAX is now renamed to ARCH_HAS_OPTIMIZED_POLL.
>
> v4 changes from v3:
> - change 7/8 per Rafael input: drop the parens and use ret for the final check
> - add 8/8 which renames the guard for building poll_state
>
> v3 changes from v2:
> - fix 1/7 per Petr Mladek - remove ARCH_HAS_CPU_RELAX from arch/x86/Kconfig
> - add Ack-by from Rafael Wysocki on 2/7
>
> v2 changes from v1:
> - added patch 7 where we change cpu_relax with smp_cond_load_relaxed per PeterZ
> (this improves by 50% at least the CPU cycles consumed in the tests above:
> 10,716,881,137 now vs 14,503,014,257 before)
> - removed the ifdef from patch 1 per RafaelW
>
> Ankur Arora (4):
> cpuidle: rename ARCH_HAS_CPU_RELAX to ARCH_HAS_OPTIMIZED_POLL
> cpuidle-haltpoll: condition on ARCH_CPUIDLE_HALTPOLL
> arm64: support cpuidle-haltpoll
> cpuidle/poll_state: limit POLL_IDLE_RELAX_COUNT on arm64
>
> Joao Martins (4):
> Kconfig: move ARCH_HAS_OPTIMIZED_POLL to arch/Kconfig
> cpuidle-haltpoll: define arch_haltpoll_supported()
> governors/haltpoll: drop kvm_para_available() check
> arm64: define TIF_POLLING_NRFLAG
>
> Mihai Carabas (1):
> cpuidle/poll_state: poll via smp_cond_load_relaxed()
>
> arch/Kconfig | 3 +++
> arch/arm64/Kconfig | 10 ++++++++++
> arch/arm64/include/asm/cpuidle_haltpoll.h | 21 +++++++++++++++++++++
> arch/arm64/include/asm/thread_info.h | 2 ++
> arch/x86/Kconfig | 4 +---
> arch/x86/include/asm/cpuidle_haltpoll.h | 1 +
> arch/x86/kernel/kvm.c | 10 ++++++++++
> drivers/acpi/processor_idle.c | 4 ++--
> drivers/cpuidle/Kconfig | 5 ++---
> drivers/cpuidle/Makefile | 2 +-
> drivers/cpuidle/cpuidle-haltpoll.c | 9 ++-------
> drivers/cpuidle/governors/haltpoll.c | 6 +-----
> drivers/cpuidle/poll_state.c | 21 ++++++++++++++++-----
> drivers/idle/Kconfig | 1 +
> include/linux/cpuidle.h | 2 +-
> include/linux/cpuidle_haltpoll.h | 5 +++++
> 16 files changed, 79 insertions(+), 27 deletions(-)
> create mode 100644 arch/arm64/include/asm/cpuidle_haltpoll.h
--
ankur
^ permalink raw reply [flat|nested] 25+ messages in thread