From: Mathieu Desnoyers <mathieu.desnoyers-vg+e7yoeK/dWk0Htik3J/w@public.gmane.org>
To: Andi Kleen <andi-Vw/NltI1exuRpAAqCnN02g@public.gmane.org>
Cc: Linus Torvalds
	<torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>,
	Peter Zijlstra <peterz-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>,
	Florian Weimer <fweimer-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
	"H. Peter Anvin" <hpa-YMNOUZJC4hwAvxtiuMwx3w@public.gmane.org>,
	Andrew Morton
	<akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>,
	Russell King <linux-lFZ/pmaqli7XmaaqVzeoHQ@public.gmane.org>,
	Thomas Gleixner <tglx-hfZtesqFncYOwBW4kG4KsQ@public.gmane.org>,
	Ingo Molnar <mingo-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
	Linux Kernel Mailing List
	<linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	linux-api <linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	Paul Turner <pjt-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
	Andrew Hunter <ahh-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
	Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org>,
	Dave Watson <davejwatson-b10kYP2dOMg@public.gmane.org>,
	Chris Lameter <cl-vYTEC60ixJUAvxtiuMwx3w@public.gmane.org>,
	Ben Maurer <bmaurer-b10kYP2dOMg@public.gmane.org>,
	rostedt <rostedt-nx8X9YLhiw1AfugRpC6u6w@public.gmane.org>,
	"Paul E. McKenney"
	<paulmck-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>,
	Josh Triplett <josh-iaAMLnmF4UmaiuxdJuQwMA@public.gmane.org>,
	Catalin Marinas <catalin.marinas-5wv7dgnIgG8@public.gmane.org>,
	Will Deacon <will.deacon-5wv7dgnIgG8@public.gmane.org>,
	Michael Kerrisk
	<mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
	Boqun Feng <boqun.feng-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Subject: Re: [RFC PATCH v6 1/5] Thread-local ABI system call: cache CPU number of running thread
Date: Thu, 7 Apr 2016 20:55:41 +0000 (UTC)
Message-ID: <1353194988.49705.1460062541205.JavaMail.zimbra@efficios.com>
In-Reply-To: <20160407202232.GF9407-1g7Xle2YJi4/4alezvVtWx2eb7JE58TQ@public.gmane.org>

----- On Apr 7, 2016, at 4:22 PM, Andi Kleen andi-Vw/NltI1exuRpAAqCnN02g@public.gmane.org wrote:

>> One basic use of cpu id cache is to speed up the sched_getcpu(3)
>> implementation in glibc. This is why I'm proposing it as a stand-alone
> 
> I don't think rseq is needed for faster getcpu.

I agree that rseq is not needed for faster getcpu. This is why I was proposing
to make the "cpu_id" feature configurable separately from the rseq feature.
E.g. a kernel configuration that doesn't want to take the hit of rseq handling
in signal delivery and preemption could just enable the cpu_id feature, and
would thus only need to add work in the migration code path and when returning
to userspace. Also, if a thread only registers the cpu_id feature, the kernel
can skip the rseq code quickly in signal delivery and preemption too.

> 
> User space has to be able handle stale return values anyways, as it
> has no way to lock itself to a cpu while it is using the return value.
> So it can be only a hint.
> 
> The original version of getcpu just had a jiffies based cache. The CPU
> value was valid up to a jiffie (the next time jiffie changes), and then it
> gets looked up again.
> 
> Processes are unlikely to switch CPUs more often than a jiffie, so it's
> good enough as a hint.

One example use-case where this would hurt: we use the CPU id heavily when
tracing to a ring buffer in user-space. Having one event written into the
wrong buffer once in a while is not a big deal, but tracing a whole burst
of events within a jiffy (e.g. 4ms at 250Hz) to the wrong cpu buffer
whenever the thread migrates is really an unwanted side-effect latency-wise.
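
To make the concern concrete, here is a minimal sketch of a jiffies-based
getcpu cache like the one described in the quoted text, run against a
simulated tick counter. All names (fake_jiffies, fake_current_cpu,
cached_getcpu, count_misrouted) are hypothetical stand-ins for illustration,
not an actual vdso interface.

```c
#include <stdint.h>

/* Simulated state (assumptions): a real implementation would read the
 * kernel-exported jiffies counter and perform a real CPU lookup. */
static uint64_t fake_jiffies;     /* simulated tick counter */
static int fake_current_cpu;      /* CPU the thread "really" runs on */

struct getcpu_cache {
	uint64_t stamp;           /* jiffies value when cpu was sampled */
	int cpu;                  /* cached CPU number */
	int valid;                /* 0 until the first lookup */
};

/* Return the cached CPU, refreshing it only when jiffies has changed. */
static int cached_getcpu(struct getcpu_cache *c)
{
	if (!c->valid || c->stamp != fake_jiffies) {
		c->cpu = fake_current_cpu;
		c->stamp = fake_jiffies;
		c->valid = 1;
	}
	return c->cpu;
}

/* Count events that would land in the wrong per-cpu buffer when the
 * thread migrates right after the cache was filled, and the burst of
 * events all happens before the next tick. */
static int count_misrouted(int events_in_jiffy)
{
	struct getcpu_cache c = { 0, 0, 0 };
	int wrong = 0;

	fake_jiffies = 100;
	fake_current_cpu = 0;
	(void)cached_getcpu(&c);          /* prime the cache on CPU 0 */

	fake_current_cpu = 1;             /* migration within the same jiffy */
	for (int i = 0; i < events_in_jiffy; i++) {
		if (cached_getcpu(&c) != fake_current_cpu)
			wrong++;
	}

	fake_jiffies++;                   /* next tick: the cache refreshes */
	if (cached_getcpu(&c) != fake_current_cpu)
		wrong++;
	return wrong;
}
```

In this simulation every event of the burst is misrouted until the tick
counter advances, which is the behavior I would like to avoid for tracing.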

> 
> This doesn't need any new kernel interfaces at all because jiffies is already
> exported to the vdso.

My understanding is that your assumptions about the availability of those
features in the vdso are true for x86 32/64, but do not currently apply
to ARM32.

ARM32 is my main target architecture for the CPU id cache work. x86 32/64
simply happen to benefit from that work too (see my benchmark numbers
in the changelog of patch 1/5).

> It just needs a new entry point into the vdso that handles the jiffie
> check.

This would likely require extending the ARM vdso page to expose the jiffies
counter to user-space, and updating user-space libraries to use this counter
in sched_getcpu. But it would still be slower than the cpu_id cache I propose,
due to the required function call to sched_getcpu, unless you want to open-code
the jiffies check within all applications as an ABI. It would also be bad for
fast bursts of cpu id use (e.g. per-cpu ring buffers).
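
For comparison, a rough sketch of how a per-cpu ring buffer could consume a
cpu_id cache with a single inlined load. This is only an illustration under
stated assumptions: cpu_id_cache, simulate_kernel_migration, per_cpu_ring and
NR_CPUS_SKETCH are hypothetical names, and the "kernel update" is simulated in
user-space rather than using the system call from this patch set.

```c
#include <stdint.h>

#define NR_CPUS_SKETCH 4   /* illustrative CPU count, an assumption */

/* Hypothetical per-thread cache. With the proposed feature, the kernel
 * would store the current CPU number here on migration and when
 * returning to user-space. -1 means "not yet populated". */
static _Thread_local volatile int32_t cpu_id_cache = -1;

struct ring {
	uint64_t events;   /* stand-in for a real ring buffer */
};

static struct ring per_cpu_ring[NR_CPUS_SKETCH];

/* Stand-in for the kernel updating the cache when the thread migrates. */
static void simulate_kernel_migration(int32_t new_cpu)
{
	cpu_id_cache = new_cpu;
}

/* Record one event: a single load of the cached CPU id, with no
 * function call and no jiffies comparison on the fast path. */
static inline void trace_event(void)
{
	int32_t cpu = cpu_id_cache;

	if (cpu >= 0 && cpu < NR_CPUS_SKETCH)
		per_cpu_ring[cpu].events++;
}

/* Record a burst of n events back to back. */
static void trace_burst(int n)
{
	for (int i = 0; i < n; i++)
		trace_event();
}
```

Because the cached value changes as soon as the (simulated) migration
happens, a burst issued after the migration goes to the new CPU's buffer
immediately rather than after the next tick.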

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
