From: "Michael Kerrisk (man-pages)" <mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
To: Darren Hart <dvhart-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>,
	Thomas Gleixner <tglx-hfZtesqFncYOwBW4kG4KsQ@public.gmane.org>
Cc: mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org,
	Carlos O'Donell <carlos-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
	Ingo Molnar <mingo-X9Un+BFzKDI@public.gmane.org>,
	Jakub Jelinek <jakub-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
	"linux-man-u79uwXL29TY76Z2rM5mHXA@public.gmane.org"
	<linux-man-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	lkml <linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	Davidlohr Bueso <davidlohr.bueso-VXdhtT5mjnY@public.gmane.org>,
	Arnd Bergmann <arnd-r2nGTMty4D4@public.gmane.org>,
	Steven Rostedt <rostedt-nx8X9YLhiw1AfugRpC6u6w@public.gmane.org>,
	Peter Zijlstra <peterz-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>,
	Linux API <linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>
Subject: Re: futex(2) man page update help request
Date: Thu, 15 Jan 2015 16:12:20 +0100
Message-ID: <54B7D8D4.2070203@gmail.com>
In-Reply-To: <CF9A731D.913E6%dvhart-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>

Hello Darren,

I give you the same apology as to Thomas for the 
long-delayed response to your mail.

And I repeat my note to Thomas:
In the next day or two, I hope to send out the new version
of the futex(2) page for review. The new draft is a bit
bigger (okay -- 4 x bigger) than the current page. And there
are quite a number of FIXMEs that I've placed in the page
for various points--some minor, but a few major--that need
to be checked or fixed. Would you have some time to review
that page? 

In the meantime, I have a couple of questions, which, if 
you could answer them, I would work some changes into the 
page before sending.

1. In various places, a distinction is made between non-PI
   futexes and PI futexes. But what determines that distinction?
   From the kernel's perspective, what makes a futex one type
   or another? I presume it is to do with the types of blocking
   waiters on the futex, but it would be good to have a formal
   definition.

2. Can you say something about the pairing requirements of
   FUTEX_WAIT_REQUEUE_PI and FUTEX_CMP_REQUEUE_PI?
   What is the requirement and why do we need it?

Most of the rest of this mail is just a checklist noting
what I did with your comments. No response is needed 
in most cases, but there is one that I have marked with
"???". If you could reply to that, I'd be grateful.

On 05/15/2014 10:35 PM, Darren Hart wrote:
> On 5/15/14, 7:14, "Thomas Gleixner" <tglx-hfZtesqFncYOwBW4kG4KsQ@public.gmane.org> wrote:
> 
> Wow Thomas, I planned to do exactly this and you beat me to it. Again.
> Thanks for getting this started.
> 
> Michael, I imagine you want something more condensed, and I'll add to what
> tglx posted (inline below) to try and get you that, but if you have
> questions and need to fill in the gap, the paper I presented at RTLWS11 in
> '09 covers this particularly nasty OPCODE in detail:
> 
> http://lwn.net/images/conf/rtlws11/papers/proc/p10.pdf
> 
> I believe Michael is looking for some higher level documentation, like how
> to use these and what they are intended for. 

Yes, that would be good.

> Probably something more like
> Ulrich's Futexes are Tricky paper - but let's start with getting the op
> codes, arguments, and return codes fleshed out.

Okay.

> For all the PI opcodes, we should probably mention something about the
> futex value scheme (TID), whereas the other opcodes do not require any
> specific value scheme.
> 
> No Owner:	0
> Owner:		TID
> Waiters:	TID | FUTEX_WAITERS
> 
> This is the relevant section from the referenced paper:
> 				
> The PI futex operations diverge from the others in that they impose a
> policy describing how the futex value is to be used. If the lock is
> unowned, the futex value shall be 0. If owned, it shall be the thread
> id (tid) of the owning thread. If there are threads contending for the
> lock, then the FUTEX_WAITERS flag is set. With this policy in place,
> userspace can atomically acquire an unowned lock or release an
> uncontended lock using an atomic instruction and their own tid. A
> non-zero futex value will force waiters into the kernel to lock. The
> FUTEX_WAITERS flag forces the owner into the kernel to unlock. If the
> callers are forced into the kernel, they then deal directly with an
> underlying rt_mutex which implements the priority inheritance
> semantics. After the rt_mutex is acquired, the futex value is updated
> accordingly, before the calling thread returns to userspace.
>
> It is important to note that the kernel will update the futex value prior
> to returning to userspace. Unlike other futex op codes,
> FUTEX_CMP_REQUEUE_PI (and FUTEX_WAIT_REQUEUE_PI, FUTEX_LOCK_PI) are designed
> for the implementation of very specific IPC mechanisms.

??? Great text. May I presume that I can take this text 
and freely adapt it for the man page? (Actually, this is a 
request for forgiveness, rather than permission :-).)
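
To make the value scheme concrete, here is a rough sketch of the
user-space fast path it allows, falling back to FUTEX_LOCK_PI and
FUTEX_UNLOCK_PI when the atomic operation fails. The names pi_lock(),
pi_unlock() and self_tid() are made up for illustration, the GCC/Clang
__atomic builtins are an assumption about the toolchain, and error
handling is left out. Take it as an illustration of the policy, not as
how glibc implements it.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* for syscall() */
#endif
#include <assert.h>
#include <linux/futex.h>        /* FUTEX_LOCK_PI, FUTEX_UNLOCK_PI, FUTEX_WAITERS */
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: kernel thread ID of the caller. */
static uint32_t self_tid(void)
{
    return (uint32_t)syscall(SYS_gettid);
}

/* Hypothetical helper: acquire a PI futex. */
static int pi_lock(uint32_t *word)
{
    uint32_t expected = 0;      /* 0 means "no owner" */

    assert(word != NULL);

    /* Fast path: unowned lock, swap 0 -> TID without entering the kernel. */
    if (__atomic_compare_exchange_n(word, &expected, self_tid(), 0,
                                    __ATOMIC_ACQUIRE, __ATOMIC_RELAXED))
        return 0;

    /* Slow path: the value is non-zero, so the lock is owned. The kernel
       blocks the caller and updates the futex value before returning. */
    return (int)syscall(SYS_futex, word, FUTEX_LOCK_PI, 0, NULL, NULL, 0);
}

/* Hypothetical helper: release a PI futex owned by the caller. */
static int pi_unlock(uint32_t *word)
{
    uint32_t expected = self_tid();     /* owned by us, no waiters */

    assert(word != NULL);

    /* Fast path: uncontended lock, swap TID -> 0 in user space. */
    if (__atomic_compare_exchange_n(word, &expected, 0, 0,
                                    __ATOMIC_RELEASE, __ATOMIC_RELAXED))
        return 0;

    /* Slow path: FUTEX_WAITERS is set, so the kernel must hand the
       lock over to the next waiter. */
    return (int)syscall(SYS_futex, word, FUTEX_UNLOCK_PI, 0, NULL, NULL, 0);
}
```

In the uncontended case neither function makes a futex() call: the
word goes from 0 to the caller's TID and back to 0 purely in user space.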

>> FUTEX_CMP_REQUEUE_PI
>>
>> 	PI aware variant of FUTEX_CMP_REQUEUE. Inner futex at uaddr is
>> 	a non PI futex. Outer futex to which is requeued is a PI futex
>> 	at uaddr2.
> 
> Inner/outer terminology applies specifically to the glibc pthread
> condition variable and mutex use case, but is overly specific for the man
> page. Consider:
> 
> PI aware variant for FUTEX_CMP_REQUEUE. Requeue tasks blocked on uaddr via
> FUTEX_WAIT_REQUEUE_PI from a non-PI source futex (uaddr) to a PI target
> futex (uaddr2).

Thanks for that text. It is easier to grasp.

>>
>> 	The waiters on uaddr must wait in FUTEX_WAIT_REQUEUE_PI.
>>
>> 	The argument val contains the number of waiters on uaddr
>> 	which are immediately woken up. Must be 1 for this opcode.
> 
> Because the point is to avoid the thundering herd in the first place, and
> other nasty little races and faulting corner cases...

I added the piece about "thundering herd".

>> 	The timeout argument is abused to transport the number of
>> 	waiters which are requeued on to the futex at uaddr2. The
>> 	pointer is typecasted to u32.
> 
> 
>           val3 contains the expected value of uaddr (same as
> FUTEX_CMP_REQUEUE)

Yes. (The text now says that 'val3' has the same purpose as 
for FUTEX_CMP_REQUEUE.)
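
For reference, the argument layout discussed above could be wrapped
roughly as follows. The wrapper name futex_cmp_requeue_pi() and its
parameter names are invented for this sketch; only the raw futex()
argument order and the operation constant come from the kernel headers.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* for syscall() */
#endif
#include <assert.h>
#include <errno.h>
#include <linux/futex.h>        /* FUTEX_CMP_REQUEUE_PI */
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical wrapper. Requeues waiters blocked in FUTEX_WAIT_REQUEUE_PI
 * on the non-PI source futex (uaddr) to the PI target futex (uaddr2).
 *
 *   val      number of waiters to wake: must be 1 for this operation
 *   timeout  not a timeout here: carries the number of waiters to requeue
 *   val3     expected value of *uaddr, as for FUTEX_CMP_REQUEUE
 */
static long futex_cmp_requeue_pi(uint32_t *uaddr, uint32_t *uaddr2,
                                 uint32_t nr_requeue, uint32_t expected)
{
    /* Requeueing to the same futex is rejected by the kernel (EINVAL). */
    assert(uaddr != uaddr2);

    return syscall(SYS_futex, uaddr, FUTEX_CMP_REQUEUE_PI,
                   1,                               /* val (nr_wake) */
                   (void *)(uintptr_t)nr_requeue,   /* timeout slot */
                   uaddr2,
                   expected);                       /* val3 */
}
```

A call such as futex_cmp_requeue_pi(&cond, &mutex, INT_MAX, seen) would
correspond to a broadcast: wake one waiter and requeue the rest. Here
cond, mutex and seen are placeholder names, and INT_MAX needs <limits.h>.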

>> Darren, can you fill in the missing details?
> 
> Yup...
> 
>>
>> 	[EFAULT] Kernel was unable to access the futex value at uaddr
>> 		 or uaddr2
>>
>> 	[ENOMEM] Kernel could not allocate state
>>
>> 	[EINVAL] The supplied uaddr/uaddr2 arguments do not point to a
>> 		 valid object, i.e. pointer is not 4 byte aligned
>>
>> 	[EINVAL] uaddr equals uaddr2 (requeue to the same futex).
>>
>> 	[EINVAL] The kernel detected inconsistent state between the
>> 		 user space state at uaddr and the kernel state,
>> 		 i.e. it detected a waiter which waits in
>> 		 FUTEX_LOCK_PI on uaddr
> 
>                    instead of FUTEX_WAIT_REQUEUE_PI.

Thanks. I added that detail.

>> 	[EINVAL] The kernel detected inconsistent state between the
>> 		 user space state at uaddr and the kernel state,
>> 		 i.e. it detected a waiter which waits in
>> 		 FUTEX_WAIT[_BITSET] on uaddr
>>
>> 	[EINVAL] The kernel detected inconsistent state between the
>> 		 user space state at uaddr2 and the kernel state,
>> 		 i.e. it detected a waiter which waits in
>> 		 FUTEX_WAIT on uaddr2.
> 
>           [EINVAL] The kernel detected the FUTEX_CMP_REQUEUE_PI call is
>                    attempting to requeue a task to a futex other than that
>                    specified by the matching FUTEX_WAIT_REQUEUE_PI call for
>                    that task.

Thanks. Added.

> A number of these EINVALs can probably be combined into "Kernel detected
> bad state" as far as the C library is concerned, but we can consolidate
> later. But basically, EINVAL is returned if the non-pi to pi or op pairing
> semantics are violated.

I think the page probably needs some text to cover that point. I'll add
a FIXME for review.

>>  	[EINVAL] The supplied bitset is zero.
> 
> Bitset doesn't apply to FUTEX_CMP_REQUEUE_PI.

Thanks.

>           [EINVAL] nr_wake != 1

Thanks, I'd already spotted this, but it's good to have confirmation.

> EAGAIN == EWOULDBLOCK. We use each in the kernel, but will just refer to
> them here as EAGAIN.

Yes. And I've followed that convention now in the man page.

>> 	[EAGAIN] uaddr1 readout is not equal to the compare value in
>> 		 argument val3
>>
>> 	[EAGAIN] The futex owner TID of uaddr2 is about to exit, but
>> 		 has not yet handled the internal state cleanup. Try
>> 		 again.
>>
>> 	[EPERM]  Caller is not allowed to attach the waiter to the
>> 		 futex at uaddr2. Can be a legitimate issue or a hint
>> 		 for state corruption in user space
>>
>> 	[ESRCH]	 The TID in the user space value at uaddr2 does not exist
> 
> Hrm, I'm missing ESRCH and EPERM in my state diagrams.... but yes, we can
> get ESRCH when looking up PI state, and we can return that from
> futex_requeue.... That needs some time to review...
> 
> I'm not seeing the EPERM path, where is that coming from?

Any further insight on the above?

>> 	[EDEADLOCK] The requeuing of a waiter to the kernel representation
>> 		    of the PI futex at uaddr2 detected a deadlock scenario.
>>
>>        [ENOSYS] Not implemented on all architectures and not supported
>> 		 on some CPU variants (runtime detection)
> 
> Return value >= 0 is successful, indicating the number of tasks
> requeued or woken (3 requeued and 1 woken would return 4).

Yes. Already noted.
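
As a small illustration of how a caller might act on the return value
and the errors listed above, here is a sketch. The enum and function
names are invented for the example, and treating EAGAIN as the only
retryable error is a simplification on my part, not something the
kernel interface mandates.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical classification of a FUTEX_CMP_REQUEUE_PI result. */
enum requeue_outcome {
    REQUEUE_DONE,       /* ret is the number of tasks woken plus requeued */
    REQUEUE_RETRY,      /* value changed or owner is exiting: try again */
    REQUEUE_FAILED      /* EINVAL, EFAULT, ENOMEM, EPERM, ESRCH, ... */
};

/* ret is the futex() return value; err is errno captured right after. */
static enum requeue_outcome classify_requeue(long ret, int err)
{
    assert(ret >= -1);          /* the syscall wrapper returns -1 on error */

    if (ret >= 0)
        return REQUEUE_DONE;
    if (err == EAGAIN)          /* same value as EWOULDBLOCK on Linux */
        return REQUEUE_RETRY;
    return REQUEUE_FAILED;
}
```

With the example above, 3 tasks requeued and 1 woken gives a return
value of 4, which this helper reports as REQUEUE_DONE.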

Cheers,

Michael


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/
