public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Darren Hart <dvhltc@us.ibm.com>
To: Michal Hocko <mhocko@suse.cz>
Cc: Thomas Gleixner <tglx@linutronix.de>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	LKML <linux-kernel@vger.kernel.org>,
	Nick Piggin <npiggin@suse.de>,
	Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: futex: race in lock and unlock&exit for robust futex with PI?
Date: Fri, 25 Jun 2010 10:53:56 -0700	[thread overview]
Message-ID: <4C24ED34.9040808@us.ibm.com> (raw)
In-Reply-To: <20100625082711.GA32765@tiehlicka.suse.cz>

On 06/25/2010 01:27 AM, Michal Hocko wrote:
> On Thu 24-06-10 19:42:50, Darren Hart wrote:
>> On 06/23/2010 02:13 AM, Michal Hocko wrote:
>>> Hi,
>
> Hi,
>
>>
>> Hi Michal,
>>
>> Thanks for reporting the issue and providing a testcase.
>>
>>>
>>> attached you can find a simple test case which fails quite easily on the
>>> following glibc assert:
>>> "SharedMutexTest: pthread_mutex_lock.c:289: __pthread_mutex_lock:
>>>    Assertion `(-(e)) != 3 || !robust' failed." "
>>
>> I've run runSimple.sh in a tight loop for a couple hours (about 2k
>> iterations so far) and haven't seen anything other than "Here we go"
>> printed to the console.
>
> Maybe a higher load on CPUs would help (busy loop on other CPUs).

Must have been a build issue. I can reproduce _something_ now. Within 10 
iterations of runSimple.sh the test hangs. ps shows all the simple 
processes sitting in pause.

(gdb) bt
#0  0x0000003c0060e030 in __pause_nocancel () from /lib64/libpthread.so.0
#1  0x0000003c006085fc in __pthread_mutex_lock_full ()
    from /lib64/libpthread.so.0
#2  0x0000000000400cd6 in main (argc=1, argv=0x7fffc016e508) at simple.c:101

There is only one call to pause* in pthread_mutex_lock.c: (line ~316):

	/* ESRCH can happen only for non-robust PI mutexes where
	   the owner of the lock died.  */
	assert (INTERNAL_SYSCALL_ERRNO (e, __err) != ESRCH || !robust);

	/* Delay the thread indefinitely.  */
	while (1)
	  pause_not_cancel ();

Right now I'm thinking that NDEBUG is set in my build for whatever 
reason, but I think I'm seeing the same issue you are. I'll review the 
futex code and prepare a trace patch and see if I can reproduce with that.

Note: confirmed, the glibc rpm has -DNDEBUG=1

--
Darren

>
>>
>> I had to add -D_GNU_SOURCE to get it to build on my system (RHEL5.2
>> + 2.6.34). Perhaps this is just a difference in the toolchain.
>
> I assume that you got PTHREAD_PRIO_INHERIT undeclared error, don't you?
> I have hacked around that by #define __USE_UNIX98 which worked on Debian
> and OpenSuse. But you are right _GNU_SOURCE is definitely better
> solution.
>
>>
>>> AFAIU, this assertion says that futex syscall cannot fail with ESRCH
>>> for robust futex because it should either succeed or fail with
>>> EOWNERDEAD.
>>
>> I'll have to think on that and review the libc source. We do need to
>> confirm that the assert is even doing the right thing.
>
> Sure. I have looked through the glibc lock implementation and it makes
> quite a good sense to me. A robust lock should never return with ESRCH.
>
>>
>>>
>>> We have seen this problem on SLES11 and SLES11SP1 but I was able to
>>> reproduce it with the 2.6.34 kernel as well.
>>
>> What kind of system are you seeing this on? I've been running on a
>> 4-way x86_64 blade.
>
> * Debian (squeeze/sid) with
> - Intel(R) Core(TM)2 CPU T5600 @ 1.83GHz
> - kernel: vanilla 2.6.34
> - glibc: 2.11.1-3
> - i386
>
> * OpenSuse 11.2 with
> - Intel(R) Core(TM)2 Duo CPU E4500 @ 2.20GHz
> - kernel: distribution 2.6.31.12-0.2-desktop
> - glibc: 2.10.1-10.5.1
> - i386
>
> * SLES11SP1
> - Dual-Core AMD Opteron(tm) Processor 1218
> - kernel: distribution 2.6.32.12-0.3-default
> - glibc: 2.11.1-0.17.4
> - x86_64
>
> Each box shows a different number of asserts during 10 iterations.
>
>>
>>> The test case is quite easy.
>>>
>>> Executed with a parameter it creates a test file and initializes shared,
>>> robust pthread mutex (optionaly compile time configured with priority
>>> inheritance) backed by the mmapped test file. Without a parameter it
>>> mmaps the file and just locks, unlocks mutex and checks for EOWNERDEAD
>>> (this should never happen during the test as the process never dies with
>>> the lock held) in the loop.
>>
>> Have you found the PI parameter to be required for reproducing the
>> error? From the comments below I'm assuming so... just want to be
>> sure.
>
> Yes. If you comment out USE_PI variable in the script the problem is not
> shown at all.
>
>>
>>>
>>> If I run this application for multiple users in parallel I can see the
>>> above assertion. However, if priority inheritance is turned off then
>>> there is no problem. I am not able to reproduce also if the test case is
>>> run under a single user.
>>>
>>> I am using the attached runSimple.sh script to run the test case like
>>> this:
>>>
>>> rm test.file simple
>>> for i in `seq 10`
>>> do
>>> 	sh runSimple.sh
>>> done
>>>
>>> To disable IP just comment out USE_PI variable in the script.
>>> You need to change USER1 and USER2 variables to match you system. You
>>> will need to run the script as root if you do not set any special
>>> setting to run su on behalf of those users.
>>>
>>> I have tried to look at futex_{un}lock_pi but it is really hard to
>>> understand.
>>
>> *grin* tell me about it...
>>
>> See Documentation/pi-futex.txt if you haven't already.
>
> Will do.
>
>>
>>> I assume that lookup_pi_state is the one which sets ESRCH
>>> after it is not able to find the pid of the current owner.
>>>
>>> This would suggest that we are racing with the unlock of the current
>>> lock holder but I don't see how is this possible as both lock and unlock
>>> paths hold fshared lock for all operations over the lock value. I have
>>> noticed that the lock path drops fshared if the current holder is dying
>>> but then it retries the whole process again.
>>>
>>> Any advice would be highly appreciated.
>>
>> If I can reproduce this I should be able to get some trace points in
>> there to get a better idea of the execution path leading up to the
>> problem.
>
> Please make sure that you run the test case with two different users. I
> couldn't reproduce the issue with a single user.
>
> If you have some ideas about patches which I could try then just pass it
> to me.
>
>>
>> This would be a great time to have those futex fault injection patches...
>>
>>
>> --
>> Darren Hart
>> IBM Linux Technology Center
>> Real-Time Linux Team
>
> Thanks for looking into it.


-- 
Darren Hart
IBM Linux Technology Center
Real-Time Linux Team

  reply	other threads:[~2010-06-25 17:54 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-06-23  9:13 futex: race in lock and unlock&exit for robust futex with PI? Michal Hocko
2010-06-25  2:42 ` Darren Hart
2010-06-25  8:27   ` Michal Hocko
2010-06-25 17:53     ` Darren Hart [this message]
2010-06-25 23:35       ` Darren Hart
2010-06-28 14:42         ` Michal Hocko
2010-06-28 14:56           ` Darren Hart
2010-06-28 15:32           ` Michal Hocko
2010-06-28 15:40             ` Michal Hocko
2010-06-28 15:58             ` Michal Hocko
2010-06-28 16:39               ` Michal Hocko
2010-06-28 16:45                 ` Peter Zijlstra
2010-06-28 16:56                   ` Michal Hocko
2010-06-28 16:49                 ` Peter Zijlstra
2010-06-29  8:42                   ` [PATCH] futex: futex_find_get_task make credentials check conditional Michal Hocko
2010-06-29 14:56                     ` Darren Hart
2010-06-29 15:24                       ` Michal Hocko
2010-06-29 16:41                     ` Linus Torvalds
2010-06-29 16:58                       ` Darren Hart
2010-06-29 18:03                         ` Thomas Gleixner
2010-06-30  7:01                       ` Michal Hocko
2010-06-30  9:55                         ` [PATCH] futex: futex_find_get_task remove credentails check Michal Hocko
2010-06-30 16:43                           ` Darren Hart
2010-07-08  9:28                             ` Michal Hocko
2010-07-08  9:32                               ` Ingo Molnar
2010-07-08  9:39                                 ` Michal Hocko
2010-07-08  9:43                                   ` Peter Zijlstra
2010-07-08  9:50                                     ` Michal Hocko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4C24ED34.9040808@us.ibm.com \
    --to=dvhltc@us.ibm.com \
    --cc=a.p.zijlstra@chello.nl \
    --cc=kuznet@ms2.inr.ac.ru \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mhocko@suse.cz \
    --cc=npiggin@suse.de \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox