From: Catalin Marinas <catalin.marinas@arm.com>
To: Ankur Arora <ankur.a.arora@oracle.com>
Cc: "Okanovic, Haris" <harisokn@amazon.com>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"rafael@kernel.org" <rafael@kernel.org>,
	"sudeep.holla@arm.com" <sudeep.holla@arm.com>,
	"joao.m.martins@oracle.com" <joao.m.martins@oracle.com>,
	"dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>,
	"konrad.wilk@oracle.com" <konrad.wilk@oracle.com>,
	"wanpengli@tencent.com" <wanpengli@tencent.com>,
	"cl@gentwo.org" <cl@gentwo.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"mingo@redhat.com" <mingo@redhat.com>,
	"maobibo@loongson.cn" <maobibo@loongson.cn>,
	"pbonzini@redhat.com" <pbonzini@redhat.com>,
	"tglx@linutronix.de" <tglx@linutronix.de>,
	"misono.tomohiro@fujitsu.com" <misono.tomohiro@fujitsu.com>,
	"daniel.lezcano@linaro.org" <daniel.lezcano@linaro.org>,
	"arnd@arndb.de" <arnd@arndb.de>,
	"lenb@kernel.org" <lenb@kernel.org>,
	"will@kernel.org" <will@kernel.org>,
	"hpa@zytor.com" <hpa@zytor.com>,
	"peterz@infradead.org" <peterz@infradead.org>,
	"boris.ostrovsky@oracle.com" <boris.ostrovsky@oracle.com>,
	"vkuznets@redhat.com" <vkuznets@redhat.com>,
	"linux-arm-kernel@lists.infradead.org"
	<linux-arm-kernel@lists.infradead.org>,
	"linux-pm@vger.kernel.org" <linux-pm@vger.kernel.org>,
	"bp@alien8.de" <bp@alien8.de>,
	"mtosatti@redhat.com" <mtosatti@redhat.com>,
	"x86@kernel.org" <x86@kernel.org>,
	"mark.rutland@arm.com" <mark.rutland@arm.com>
Subject: Re: [PATCH v8 01/11] cpuidle/poll_state: poll via smp_cond_load_relaxed()
Date: Fri, 18 Oct 2024 12:05:38 +0100
Message-ID: <ZxJBAubok8pc5ek7@arm.com>
In-Reply-To: <87h69amjng.fsf@oracle.com>

On Thu, Oct 17, 2024 at 03:47:31PM -0700, Ankur Arora wrote:
> Catalin Marinas <catalin.marinas@arm.com> writes:
> > On Wed, Oct 16, 2024 at 03:13:33PM +0000, Okanovic, Haris wrote:
> >> On Tue, 2024-10-15 at 13:04 +0100, Catalin Marinas wrote:
> >> > On Wed, Sep 25, 2024 at 04:24:15PM -0700, Ankur Arora wrote:
> >> > > +                     smp_cond_load_relaxed(&current_thread_info()->flags,
> >> > > +                                           VAL & _TIF_NEED_RESCHED ||
> >> > > +                                           loop_count++ >= POLL_IDLE_RELAX_COUNT);
> >> >
> >> > The above is not guaranteed to make progress if _TIF_NEED_RESCHED is
> >> > never set. With the event stream enabled on arm64, the WFE will
> >> > eventually be woken up, loop_count incremented and the condition would
> >> > become true. However, the smp_cond_load_relaxed() semantics require that
> >> > a different agent updates the variable being waited on, not the waiting
> >> > CPU updating it itself. Also note that the event stream can be disabled
> >> > on arm64 on the kernel command line.
> >>
> >> Alternately could we condition arch_haltpoll_want() on
> >> arch_timer_evtstrm_available(), like v7?
> >
> > No. The problem is about the smp_cond_load_relaxed() semantics - it
> > can't wait on a variable that's only updated in its exit condition. We
> > need a new API for this, especially since we are changing generic code
> > here (even it was arm64 code only, I'd still object to such
> > smp_cond_load_*() constructs).
> 
> Right. The problem is that smp_cond_load_relaxed() used in this context
> depends on the event-stream side effect when the interface does not
> encode those semantics anywhere.
> 
> So, a smp_cond_load_timeout() like in [1] that continues to depend on
> the event-stream is better because it explicitly accounts for the side
> effect from the timeout.
> 
> This would cover both the WFxT and the event-stream case.

Indeed.
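
As a rough illustration of the idea (not code from this series), a
user-space analogue of a cond-load with an explicit timeout might look
like the sketch below. The names demo_cond_load_timeout(), demo_now_ns()
and DEMO_NEED_RESCHED are hypothetical, and C11 atomics plus
clock_gettime() stand in for the kernel primitives. The point is that
the timeout is part of the interface, so the exit condition no longer
relies on the waiter modifying state that the load itself cannot see.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical flag bit, standing in for _TIF_NEED_RESCHED. */
#define DEMO_NEED_RESCHED 0x1u

/* Monotonic time in nanoseconds. */
static uint64_t demo_now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/*
 * Hypothetical sketch: poll *flags until (value & mask) is non-zero or
 * timeout_ns has elapsed. Returns the last value loaded, so the caller
 * can tell a wake-up (mask bit set) from a timeout (mask bit clear).
 */
static unsigned int demo_cond_load_timeout(atomic_uint *flags,
					   unsigned int mask,
					   uint64_t timeout_ns)
{
	uint64_t deadline;
	unsigned int val;

	assert(flags != NULL);
	deadline = demo_now_ns() + timeout_ns;

	for (;;) {
		/* Only another agent is expected to change *flags. */
		val = atomic_load_explicit(flags, memory_order_relaxed);
		if (val & mask)
			break;
		/* The timeout is checked explicitly, not via the waited-on variable. */
		if (demo_now_ns() >= deadline)
			break;
		/* A kernel version would cpu_relax() or wait in WFE/WFET here. */
	}

	return val;
}
```

A caller would test the returned value against the mask to decide
whether it was woken by a flag update or simply ran out of time.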

> The part I'm a little less sure about is the case where WFxT and the
> event-stream are absent.
> 
> As you said earlier, for that case on arm64, we use either short
> __delay() calls or spin in cpu_relax(), both of which are essentially
> the same thing.

Something derived from __delay(), not exactly this function. We can't
use it directly as we also want it to wake up if an event is generated
as a result of a memory write (like the current smp_cond_load()).

> Now on x86 cpu_relax() is quite optimal. The spec explicitly recommends
> it and from my measurement a loop doing "while (!cond) cpu_relax()" gets
> an IPC of something like 0.1 or similar.
> 
> On my arm64 systems however the same loop gets an IPC of 2.  Now this
> likely varies greatly but seems like it would run pretty hot some of
> the time.

For the cpu_relax() fall-back, it wouldn't be any worse than the current
poll_idle() code, though I guess in this instance we'd not enable idle
polling.

I expect the event stream to be on in all production deployments. The
reason we have a way to disable it is for testing. We've had hardware
errata in the past where the event on spin_unlock doesn't cross the
cluster boundary. We'd not notice because of the event stream.

> So maybe the right thing to do would be to keep smp_cond_load_timeout()
> but only allow polling if WFxT or event-stream is enabled. And enhance
> cpuidle_poll_state_init() to fail if the above condition is not met.

We could do this as well. Maybe hide this behind another function like
arch_has_efficient_smp_cond_load_timeout() (well, some shorter name),
checked somewhere in or on the path to cpuidle_poll_state_init(). Well,
it might be simpler to do this in haltpoll_want(), backed by an
arch_haltpoll_want() function.
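
A minimal sketch of such a gate, assuming a hypothetical capability
structure (struct demo_cpu_caps and demo_arch_haltpoll_want() are
illustrative names, not existing kernel interfaces): polling is only
allowed when the waiting CPU is guaranteed to wake up without another
agent writing to the polled variable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of what the CPU offers for bounded waits. */
struct demo_cpu_caps {
	bool has_wfxt;        /* WFE/WFI with timeout (WFxT) is available */
	bool evtstrm_enabled; /* the timer event stream is enabled */
};

/*
 * Hypothetical stand-in for an arch_haltpoll_want()-style check: allow
 * idle polling only if either mechanism bounds the time spent waiting.
 */
static bool demo_arch_haltpoll_want(const struct demo_cpu_caps *caps)
{
	assert(caps != NULL);
	return caps->has_wfxt || caps->evtstrm_enabled;
}
```

With neither capability present the check returns false, and the poll
state would not be enabled on that system.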

I assume we want poll_idle() to wake up as soon as a task becomes
available. Otherwise we could have just used udelay() for some fraction
of cpuidle_poll_time() instead of cpu_relax().

-- 
Catalin
