linux-api.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Mathieu Desnoyers <mathieu.desnoyers-vg+e7yoeK/dWk0Htik3J/w@public.gmane.org>
To: Andrew Morton <akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
Cc: "Paul E. McKenney"
	<paulmck-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-api <linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	KOSAKI Motohiro
	<kosaki.motohiro-+CUm20s59erQFUHtdCDX3A@public.gmane.org>,
	rostedt <rostedt-nx8X9YLhiw1AfugRpC6u6w@public.gmane.org>,
	Nicholas Miell <nmiell-Wuw85uim5zDR7s880joybQ@public.gmane.org>,
	Linus Torvalds
	<torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>,
	Ingo Molnar <mingo-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
	One Thousand Gnomes
	<gnomes-qBU/x9rampVanCEyBjwyrvXRex20P6io@public.gmane.org>,
	Lai Jiangshan <laijs-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org>,
	Stephen Hemminger
	<stephen-OTpzqLSitTUnbdJkjeBofR2eb7JE58TQ@public.gmane.org>,
	Thomas Gleixner <tglx-hfZtesqFncYOwBW4kG4KsQ@public.gmane.org>,
	Peter Zijlstra <peterz-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>,
	David Howells <dhowells-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
	Pranith Kumar
	<bobby.prani-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
	Michael Kerrisk
	<mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Subject: Re: [PATCH for v4.2 v18 1/3] sys_membarrier(): system-wide memory barrier (generic, x86)
Date: Sun, 31 May 2015 12:53:05 +0000 (UTC)	[thread overview]
Message-ID: <1260710044.536.1433076785629.JavaMail.zimbra@efficios.com> (raw)
In-Reply-To: <20150529154036.9695b6c153ed37aed3343f3b-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>

----- On May 30, 2015, at 12:40 AM, Andrew Morton akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org wrote:

> On Sat, 16 May 2015 19:48:18 -0400 Mathieu Desnoyers
> <mathieu.desnoyers-vg+e7yoeK/dWk0Htik3J/w@public.gmane.org> wrote:
> 
>> Here is an implementation of a new system call, sys_membarrier(), which
>> executes a memory barrier on all threads running on the system. It is
>> implemented by calling synchronize_sched(). It can be used to distribute
>> the cost of user-space memory barriers asymmetrically by transforming
>> pairs of memory barriers into pairs consisting of sys_membarrier() and a
>> compiler barrier. For synchronization primitives that distinguish
>> between read-side and write-side (e.g. userspace RCU [1], rwlocks), the
>> read-side can be accelerated significantly by moving the bulk of the
>> memory barrier overhead to the write-side.
>>
>> ...
>>
> 
> It would be nice to hear about the real world value of this syscall to
> our users.  I'm seeing test results for a microbenchmark but so what.
> What actual applications or application classes are calling for this and
> what results can they expect to see?

AFAIK, the existing open source applications that would be improved by this
system call are as follows:

* Through Userspace RCU library (http://urcu.so)
  - DNS server (Knot DNS) https://www.knot-dns.cz/
  - Network sniffer (http://netsniff-ng.org/)
  - Distributed object storage (https://sheepdog.github.io/sheepdog/)
  - User-space tracing (http://lttng.org)
  - Network storage system (https://www.gluster.org/)

Those projects use RCU in userspace to increase read-side speed and
scalability compared to locking. Especially in the case of RCU used
by libraries, sys_membarrier can speed up the read-side by moving the
bulk of the memory barrier cost to synchronize_rcu().

* Direct users of sys_membarrier
  - core dotnet garbage collector (https://github.com/dotnet/coreclr/issues/198)

Microsoft core dotnet GC developers are planning to use the mprotect()
side-effect of issuing memory barriers through IPIs as a way to implement Windows
FlushProcessWriteBuffers() on Linux. They are referring to sys_membarrier in their
github thread, specifically stating that sys_membarrier() is what they are looking
for.

> 
>> 
>> membarrier(2) man page:
>> --------------- snip -------------------
>> MEMBARRIER(2)              Linux Programmer's Manual             MEMBARRIER(2)
>> 
>> NAME
>>        membarrier - issue memory barriers on a set of threads
>> 
>> SYNOPSIS
>>        #include <linux/membarrier.h>
>> 
>>        int membarrier(int cmd, int flags);
>> 
>> DESCRIPTION
>>        The cmd argument is one of the following:
>> 
>>        MEMBARRIER_CMD_QUERY
>>               Query  the  set  of  supported commands. It returns a bitmask of
>>               supported commands.
>> 
>>        MEMBARRIER_CMD_SHARED
>>               Execute a memory barrier on all threads running on  the  system.
>>               Upon  return from system call, the caller thread is ensured that
>>               all running threads have passed through a state where all memory
>>               accesses  to  user-space  addresses  match program order between
>>               entry to and return from the system  call  (non-running  threads
>>               are de facto in such a state). This covers threads from all pro___
>>               cesses running on the system.  This command returns 0.
>> 
>>        The flags argument needs to be 0. For future extensions.
>> 
>>        All memory accesses performed  in  program  order  from  each  targeted
>>        thread is guaranteed to be ordered with respect to sys_membarrier(). If
>>        we use the semantic "barrier()" to represent a compiler barrier forcing
>>        memory  accesses  to  be performed in program order across the barrier,
>>        and smp_mb() to represent explicit memory barriers forcing full  memory
>>        ordering  across  the barrier, we have the following ordering table for
>>        each pair of barrier(), sys_membarrier() and smp_mb():
>> 
>>        The pair ordering is detailed as (O: ordered, X: not ordered):
>> 
>>                               barrier()   smp_mb() sys_membarrier()
>>               barrier()          X           X            O
>>               smp_mb()           X           O            O
>>               sys_membarrier()   O           O            O
>> 
>> RETURN VALUE
>>        On success, these system calls return zero.  On error, -1 is  returned,
>>        and errno is set appropriately. For a given command, with flags
>>        argument set to 0, this system call is guaranteed to always return the
>>        same value until reboot.
> 
> I suggest "with flags argument set to MEMBARRIER_CMD_QUERY" here.

No, the enum is for the "cmd" argument (see above) not the flags argument. We
really mean flags = 0 (the value) here.

> 
>> 
>> ERRORS
>>        ENOSYS System call is not implemented.
>> 
>>        EINVAL Invalid arguments.
>> 
>> ...
>>
>> +SYSCALL_DEFINE2(membarrier, int, cmd, int, flags)
>> +{
>> +	if (flags)
>> +		return -EINVAL;
> 
> I'm not a huge fan of this "add a flags arg to syscalls" rule.  Is
> there any realistic expectation that we'll ever *use* this thing?  If
> not, why add it?

I can see this system call evolve in a few ways in the future, such as
having an expedited version (using IPIs), targeting the local thread
group, and targeting all threads mapping a specific shared memory mapping.
I guess that the cmd argument should be enough to cover that, but
in doubt, it might be better to keep a flags argument there for future
needs we might be overlooking right now, so we never end up needing a
sys_membarrier2 system call.

> 
> You may as well put an unlikely() in there btw.

Will do.

Thanks!

Mathieu

> 
>> +	switch (cmd) {
>> +	case MEMBARRIER_CMD_QUERY:
>> +		return MEMBARRIER_CMD_BITMASK;
>> +	case MEMBARRIER_CMD_SHARED:
>> +		if (num_online_cpus() > 1)
>> +			synchronize_sched();
>> +		return 0;
>> +	default:
>> +		return -EINVAL;
>> +	}
> > +}

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

  parent reply	other threads:[~2015-05-31 12:53 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-05-16 23:48 [PATCH for v4.2 0/3] membarrier system call Mathieu Desnoyers
     [not found] ` <1431820100-17040-1-git-send-email-mathieu.desnoyers-vg+e7yoeK/dWk0Htik3J/w@public.gmane.org>
2015-05-16 23:48   ` [PATCH for v4.2 v18 1/3] sys_membarrier(): system-wide memory barrier (generic, x86) Mathieu Desnoyers
     [not found]     ` <1431820100-17040-2-git-send-email-mathieu.desnoyers-vg+e7yoeK/dWk0Htik3J/w@public.gmane.org>
2015-05-29 22:40       ` Andrew Morton
     [not found]         ` <20150529154036.9695b6c153ed37aed3343f3b-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
2015-05-31 12:53           ` Mathieu Desnoyers [this message]
2015-05-16 23:48 ` [PATCH for v4.2 2/3] selftests: add membarrier syscall test Mathieu Desnoyers
2015-05-16 23:48 ` [PATCH for v4.2 3/3] selftests: enhance " Mathieu Desnoyers

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1260710044.536.1433076785629.JavaMail.zimbra@efficios.com \
    --to=mathieu.desnoyers-vg+e7yoek/dwk0htik3j/w@public.gmane.org \
    --cc=akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org \
    --cc=bobby.prani-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org \
    --cc=dhowells-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org \
    --cc=gnomes-qBU/x9rampVanCEyBjwyrvXRex20P6io@public.gmane.org \
    --cc=kosaki.motohiro-+CUm20s59erQFUHtdCDX3A@public.gmane.org \
    --cc=laijs-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org \
    --cc=linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=mingo-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org \
    --cc=mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org \
    --cc=nmiell-Wuw85uim5zDR7s880joybQ@public.gmane.org \
    --cc=paulmck-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org \
    --cc=peterz-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org \
    --cc=rostedt-nx8X9YLhiw1AfugRpC6u6w@public.gmane.org \
    --cc=stephen-OTpzqLSitTUnbdJkjeBofR2eb7JE58TQ@public.gmane.org \
    --cc=tglx-hfZtesqFncYOwBW4kG4KsQ@public.gmane.org \
    --cc=torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).