From: "Michael Kerrisk (man-pages)" <mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
To: Darren Hart <dvhart-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
Cc: mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org,
	Thomas Gleixner <tglx-hfZtesqFncYOwBW4kG4KsQ@public.gmane.org>,
	Torvald Riegel <triegel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
	lkml <linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	libc-alpha <libc-alpha-9JcytcrH/bA+uJoB2kUjGw@public.gmane.org>,
	linux-man <linux-man-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	Carlos O'Donell <carlos-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
	Roland McGrath <roland-/Z5OmTQCD9xF6kxbq+BtvQ@public.gmane.org>,
	Davidlohr Bueso <dave-h16yJtLeMjHk1uMJSBkQmQ@public.gmane.org>,
	Jakub Jelinek <jakub-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
	Ingo Molnar <mingo-X9Un+BFzKDI@public.gmane.org>,
	bill o gallmeister
	<bgallmeister-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
	bert hubert <bert.hubert-dxZxOz86jR8sYtaaK7K+xw@public.gmane.org>,
	Jan Kiszka <jan.kiszka-kv7WeFo6aLtBDgjK7y7TUQ@public.gmane.org>,
	Eric Dumazet <edumazet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
	Arnd Bergmann <arnd-r2nGTMty4D4@public.gmane.org>,
	Rusty Russell <rusty-8n+1lVoiYb80n/F98K4Iww@public.gmane.org>,
	Heinrich Schuchardt <xypron.glpk-Mmb7MZpHnFY@public.gmane.org>,
	Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org>,
	Daniel Wagner <wagi-kQCPcA+X3s7YtjvyW6yDsg@public.gmane.org>,
	Anton Blanchard <anton-eUNUBHrolfbYtjvyW6yDsg@public.gmane.org>,
	Steven Rostedt <rostedt-nx8X9YLhiw1AfugRpC6u6w@public.gmane.org>,
	Rich Felker <dalias-8zAoT0mYgF4@public.gmane.org>,
	Jonathan Wakely <jwakely-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Subject: Re: futex(3) man page, final draft for pre-release review
Date: Wed, 16 Dec 2015 16:54:06 +0100
Message-ID: <5671891E.404@gmail.com>
In-Reply-To: <20151215211816.GR11972-Z5kFBHtJu+EzCVHREhWfF0EOCMrvLtNR@public.gmane.org>

Hello Darren,

On 12/15/2015 10:18 PM, Darren Hart wrote:
> On Tue, Dec 15, 2015 at 02:43:50PM +0100, Michael Kerrisk (man-pages) wrote:

[...]

>>        When executing a futex operation that requests to block a thread,
>>        the kernel will block only if the futex word has the  value  that
>>        the  calling  thread  supplied  (as  one  of the arguments of the
>>        futex() call) as the expected value of the futex word.  The load‐
>>        ing  of the futex word's value, the comparison of that value with
>>        the expected value, and the actual blocking  will  happen  atomi‐
>>
>> FIXME: for next line, it would be good to have an explanation of
>> "totally ordered" somewhere around here.
>>
>>        cally  and totally ordered with respect to concurrently executing
> 
> Totally ordered with respect to futex operations refers to the semantics
> of the ACQUIRE/RELEASE operations and how they impact the ordering of memory
> reads and writes. The kernel futex operations are protected by spinlocks,
> which ensure that all operations are serialized with respect to one another.
> 
> This is a lot to attempt to define in this document. Perhaps a reference to
> linux/Documentation/memory-barriers.txt as a footnote would be sufficient? Or
> perhaps for this manual, "serialized" would be sufficient, with a footnote
> regarding "totally ordered" and a pointer to the memory-barrier documentation?

I think I'll just settle for writing "serialized" in the man page, and be
done with it :-).

>>        futex operations on the same futex word.  Thus, the futex word is
>>        used to connect the synchronization in user space with the imple‐
>>        mentation of blocking by the kernel.  Analogously  to  an  atomic
>>        compare-and-exchange  operation  that  potentially changes shared
>>        memory, blocking via a futex is an atomic compare-and-block oper‐
>>        ation.
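
The compare-and-block behaviour described above can be seen with a minimal
sketch. It assumes Linux and the raw syscall(2) interface (glibc provides no
futex() wrapper); the helper names futex_wait() and wait_refused() are
hypothetical and not part of any API. When the futex word does not hold the
expected value, the kernel does not block and the call fails with EAGAIN.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/futex.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical wrapper: block only if *uaddr still equals 'expected'.
 * The load, the comparison, and the blocking are done by the kernel
 * as one step with respect to other futex operations on this word. */
static long futex_wait(uint32_t *uaddr, uint32_t expected)
{
    return syscall(SYS_futex, uaddr, FUTEX_WAIT, expected, NULL, NULL, 0);
}

/* Returns 1 if the kernel refused to block because the futex word
 * did not contain the expected value (errno == EAGAIN), else 0.
 * Call this only with a mismatching value, otherwise it may block. */
static int wait_refused(uint32_t *uaddr, uint32_t expected)
{
    errno = 0;
    return futex_wait(uaddr, expected) == -1 && errno == EAGAIN;
}
```

For example, with a word holding 1, wait_refused(&word, 0) returns
immediately rather than blocking, because 1 is not the expected value 0.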
> 
> ...
> 
>>    Futex operations
>>        The futex_op argument consists of two parts: a command that spec‐
>>        ifies  the  operation to be performed, bit-wise ORed with zero or
>>        more options that modify the behaviour of  the  operation.    The
>>        options that may be included in futex_op are as follows:
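
As a rough illustration of the command/option split, the sketch below
separates the two parts of a futex_op value using FUTEX_CMD_MASK from
<linux/futex.h>. The helper names futex_cmd() and futex_opts() are
hypothetical. FUTEX_PRIVATE_FLAG is used here as a second option bit on the
assumption that it is one of the options elided from the quoted text.

```c
#include <linux/futex.h>

/* Hypothetical helper: the command part of futex_op
 * (for example FUTEX_WAIT_BITSET), with option bits cleared. */
static int futex_cmd(int futex_op)
{
    return futex_op & FUTEX_CMD_MASK;
}

/* Hypothetical helper: only the option bits of futex_op
 * (for example FUTEX_CLOCK_REALTIME, FUTEX_PRIVATE_FLAG). */
static int futex_opts(int futex_op)
{
    return futex_op & ~FUTEX_CMD_MASK;
}
```

Given FUTEX_WAIT_BITSET | FUTEX_CLOCK_REALTIME, futex_cmd() yields
FUTEX_WAIT_BITSET and futex_opts() yields FUTEX_CLOCK_REALTIME.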
> 
> ...
> 
>>
>>        FUTEX_CLOCK_REALTIME (since Linux 2.6.28)
>>               This   option   bit   can   be   employed  only  with  the
>>               FUTEX_WAIT_BITSET and FUTEX_WAIT_REQUEUE_PI operations.
> 
> That caught me by surprise, but it's true. We reject FUTEX_WAIT |
> FUTEX_CLOCK_REALTIME, even though FUTEX_WAIT is treated as FUTEX_WAIT_BITSET
> with val3=FUTEX_BITSET_MATCH_ANY.

You uncover all sorts of interesting stuff when you document APIs ;-).

> 
> Thomas, this looks like an oversight to me - do you recall if we intentionally
> disallow FUTEX_CLOCK_REALTIME with FUTEX_WAIT?
> 
>>               If this option is set, the kernel  treats  timeout  as  an
>>               absolute time based on CLOCK_REALTIME.
>>
>>               If  this  option  is not set, the kernel treats timeout as
>>               relative time, measured against the CLOCK_MONOTONIC clock.
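
A minimal sketch of waiting against an absolute CLOCK_REALTIME deadline,
using FUTEX_WAIT_BITSET with FUTEX_BITSET_MATCH_ANY as discussed above. It
assumes Linux and the raw syscall(2) interface; realtime_deadline_ms() and
futex_wait_until() are hypothetical helper names.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/futex.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: the current CLOCK_REALTIME time plus 'ms'
 * milliseconds, as an absolute timestamp. */
static struct timespec realtime_deadline_ms(long ms)
{
    struct timespec ts;

    clock_gettime(CLOCK_REALTIME, &ts);
    ts.tv_sec += ms / 1000;
    ts.tv_nsec += (ms % 1000) * 1000000L;
    if (ts.tv_nsec >= 1000000000L) {    /* carry into seconds */
        ts.tv_sec++;
        ts.tv_nsec -= 1000000000L;
    }
    return ts;
}

/* Hypothetical wrapper: wait while *uaddr == expected, until the
 * absolute CLOCK_REALTIME time '*deadline'. With the option bit
 * set, the timeout is absolute rather than relative. */
static long futex_wait_until(uint32_t *uaddr, uint32_t expected,
                             const struct timespec *deadline)
{
    return syscall(SYS_futex, uaddr,
                   FUTEX_WAIT_BITSET | FUTEX_CLOCK_REALTIME,
                   expected, deadline, NULL, FUTEX_BITSET_MATCH_ANY);
}
```

If nobody wakes the waiter before the deadline passes, the call is expected
to fail with ETIMEDOUT.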
> 
> ...
> 
>>    Priority-inheritance futexes
> 
> ...
> 
>>        *  If  the lock is owned and there are threads contending for the
>>           lock, then the FUTEX_WAITERS bit shall be  set  in  the  futex
>>           word's value; in other words, this value is:
>>
>>               FUTEX_WAITERS | TID
>>
>>
>>           (Note that is invalid for a PI futex word to have no owner and
> 
>                       ^ it
> 
>>           FUTEX_WAITERS set.)
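
The layout of a PI futex word can be sketched as below, using the
FUTEX_WAITERS, FUTEX_OWNER_DIED, and FUTEX_TID_MASK constants from
<linux/futex.h>. The struct pi_state type and the helpers pi_decode() and
pi_word_valid() are hypothetical names for illustration; the validity check
covers only the one rule stated in the quoted note.

```c
#include <linux/futex.h>
#include <stdint.h>

/* Hypothetical decoded view of a PI futex word. */
struct pi_state {
    uint32_t owner_tid;   /* 0 means the lock is not owned */
    int has_waiters;      /* FUTEX_WAITERS bit is set */
    int owner_died;       /* FUTEX_OWNER_DIED bit is set */
};

static struct pi_state pi_decode(uint32_t word)
{
    struct pi_state s;

    s.owner_tid = word & FUTEX_TID_MASK;
    s.has_waiters = (word & FUTEX_WAITERS) != 0;
    s.owner_died = (word & FUTEX_OWNER_DIED) != 0;
    return s;
}

/* Returns 0 for the state called out as invalid above:
 * no owner, but FUTEX_WAITERS set. Returns 1 otherwise. */
static int pi_word_valid(uint32_t word)
{
    struct pi_state s = pi_decode(word);

    return !(s.owner_tid == 0 && s.has_waiters);
}
```

For a contended lock owned by thread 1234, the word is
FUTEX_WAITERS | 1234, and pi_decode() reports owner 1234 with waiters.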
> ...
> 
>>        FUTEX_TRYLOCK_PI (since Linux 2.6.18)
>>               This operation tries to acquire the futex at uaddr.  It is
>>               invoked when a user-space atomic acquire did  not  succeed
>>               because the futex word was not 0.
>>
>>
>> FIXME(Next sentence) The wording "The trylock in kernel" below 
>> needs clarification. Suggestions?
>>
>>               The trylock in kernel might succeed because the futex word
> 
> The lock acquisition might succeed in the kernel because the futex word

I already did some rewording here, which I think makes things better.

>>               contains     stale     state     (FUTEX_WAITERS     and/or
>>               FUTEX_OWNER_DIED).   This can happen when the owner of the
>>               futex died.  User space cannot handle this condition in  a
>>               race-free  manner,  but  the  kernel  can  fix this up and
>>               acquire the futex.
>>
>>               The uaddr2, val, timeout, and val3 arguments are ignored.
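
One way the user-space fast path and the FUTEX_TRYLOCK_PI fallback could fit
together is sketched below. It assumes Linux, GCC/Clang __atomic builtins,
and the raw syscall(2) interface; pi_trylock() and pi_unlock() are
hypothetical helper names, and real lock implementations (for example in
glibc) are considerably more involved.

```c
#define _GNU_SOURCE
#include <linux/futex.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: try to take a PI futex without blocking.
 * Returns 0 on success, -1 with errno set on failure. */
static int pi_trylock(uint32_t *uaddr)
{
    uint32_t expected = 0;
    uint32_t tid = (uint32_t)syscall(SYS_gettid);

    /* Fast path: atomically change the word from 0 (unowned) to our TID. */
    if (__atomic_compare_exchange_n(uaddr, &expected, tid, 0,
                                    __ATOMIC_ACQUIRE, __ATOMIC_RELAXED))
        return 0;

    /* The word was not 0. It may hold stale state left by a dead owner,
     * which only the kernel can fix up, so let the kernel try. */
    return (int)syscall(SYS_futex, uaddr, FUTEX_TRYLOCK_PI, 0, NULL, NULL, 0);
}

/* Hypothetical helper: release a PI futex held by the calling thread. */
static int pi_unlock(uint32_t *uaddr)
{
    uint32_t tid = (uint32_t)syscall(SYS_gettid);

    /* Fast path: no waiters, so the word is exactly our TID. */
    if (__atomic_compare_exchange_n(uaddr, &tid, 0, 0,
                                    __ATOMIC_RELEASE, __ATOMIC_RELAXED))
        return 0;

    /* Waiters (or other state bits) are present: the kernel must
     * hand the lock over. */
    return (int)syscall(SYS_futex, uaddr, FUTEX_UNLOCK_PI, 0, NULL, NULL, 0);
}
```

In the uncontended case neither helper enters the kernel for the futex
operation itself: pi_trylock() on a zeroed word stores the caller's TID,
and pi_unlock() restores 0.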
> 
> ...
> 
>>    EXAMPLE
>>
>> FIXME I think it would be helpful here to say a few more words about
>>       the difference(s) between FUTEX_LOCK_PI and FUTEX_TRYLOCK_PI.
>>       Can someone propose something?
> 
> Hrm. It seems pretty straightforward to me. I guess I'm too close to it. What
> about it seems unclear and needs clarification?

On reflection, I agree that the difference is perhaps well-enough explained.

Thanks for the comments, Darren.

Cheers,

Michael


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/
