public inbox for linux-kernel@vger.kernel.org
* futex: race in lock and unlock&exit for robust futex with PI?
@ 2010-06-23  9:13 Michal Hocko
  2010-06-25  2:42 ` Darren Hart
From: Michal Hocko @ 2010-06-23  9:13 UTC
  To: Thomas Gleixner, Peter Zijlstra, Darren Hart
  Cc: LKML, Nick Piggin, Alexey Kuznetsov, Linus Torvalds

[-- Attachment #1: Type: text/plain, Size: 2241 bytes --]

Hi,

attached you can find a simple test case which fails quite easily on the
following glibc assertion:
"SharedMutexTest: pthread_mutex_lock.c:289: __pthread_mutex_lock:
  Assertion `(-(e)) != 3 || !robust' failed."

AFAIU, this assertion says that the futex syscall cannot fail with ESRCH
for a robust futex because it should either succeed or fail with
EOWNERDEAD.
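
For readers unfamiliar with the assertion, here is a minimal sketch of the
predicate it checks. It assumes, as on Linux, that ESRCH has the value 3 and
that `e` is the negative errno returned by the futex syscall. The function
name is hypothetical and is not part of glibc.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Models the predicate from the glibc assertion quoted above:
 *     (-(e)) != 3 || !robust
 * e      - negative errno value returned by the futex syscall (0 on success)
 * robust - true if the mutex is a robust one
 * Returns false exactly when a robust lock attempt came back with ESRCH,
 * which is the case the assertion treats as impossible.
 */
bool lock_result_is_expected(int e, bool robust)
{
	return (-e) != ESRCH || !robust;
}
```

Only the combination "robust mutex" plus "ESRCH" makes the predicate false;
success, EOWNERDEAD, or ESRCH on a non-robust mutex all pass.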

We have seen this problem on SLES11 and SLES11SP1 but I was able to
reproduce it with the 2.6.34 kernel as well.

The test case is quite simple.

Executed with a parameter, it creates a test file and initializes a
shared, robust pthread mutex (optionally configured at compile time with
priority inheritance) backed by the mmapped test file. Without a
parameter, it mmaps the file and then, in a loop, locks the mutex, checks
for EOWNERDEAD (this should never happen during the test as the process
never dies with the lock held) and unlocks the mutex.
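
For contrast, here is a minimal sketch of the situation in which EOWNERDEAD
is the expected result. It is not the attached test case. It assumes a
process-private robust mutex, a thread rather than a process as the dying
owner, and the POSIX.1-2008 names pthread_mutexattr_setrobust and
pthread_mutex_consistent instead of the _np variants used in simple.c. The
function names are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

static pthread_mutex_t demo_mutex;

/* Thread body: take the lock and terminate without releasing it. */
static void *die_holding_lock(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&demo_mutex);
	return NULL;
}

/*
 * Returns the result of the first pthread_mutex_lock() issued after the
 * owner has terminated, or -1 if the setup failed.
 */
int lock_after_owner_death(void)
{
	pthread_mutexattr_t attr;
	pthread_t owner;
	int state;

	if (pthread_mutexattr_init(&attr))
		return -1;
	if (pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST))
		return -1;
	if (pthread_mutex_init(&demo_mutex, &attr))
		return -1;
	pthread_mutexattr_destroy(&attr);

	if (pthread_create(&owner, NULL, die_holding_lock, NULL))
		return -1;
	pthread_join(owner, NULL);

	/* The owner is gone, so a robust mutex reports EOWNERDEAD here. */
	state = pthread_mutex_lock(&demo_mutex);
	if (state == EOWNERDEAD)
		pthread_mutex_consistent(&demo_mutex); /* mark state recovered */
	if (state == 0 || state == EOWNERDEAD)
		pthread_mutex_unlock(&demo_mutex);
	pthread_mutex_destroy(&demo_mutex);
	return state;
}
```

In the test case above no owner ever terminates with the lock held, so this
path should never be taken there.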

If I run this application for multiple users in parallel, I can see the
above assertion. However, if priority inheritance is turned off, there
is no problem. I am also not able to reproduce it if the test case is
run under a single user.

I am using the attached runSimple.sh script to run the test case like
this:

rm test.file simple
for i in `seq 10` 
do 
	sh runSimple.sh
done

To disable PI just comment out the USE_PI variable in the script.
You need to change the USER1 and USER2 variables to match your system.
You will need to run the script as root unless you have some special
setup that allows running su on behalf of those users.

I have tried to look at futex_{un}lock_pi but it is really hard to
understand. I assume that lookup_pi_state is the one which sets ESRCH
after it is not able to find the pid of the current owner. 
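
As background for the owner lookup, here is a small sketch of how the 32-bit
value of a PI futex is laid out. The constants mirror FUTEX_WAITERS,
FUTEX_OWNER_DIED and FUTEX_TID_MASK from the kernel's futex header; they are
redefined locally under hypothetical MODEL_ names so that the sketch does not
depend on kernel headers. The helper names are hypothetical as well.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local copies of the kernel's futex word layout (names are hypothetical). */
#define MODEL_FUTEX_WAITERS    0x80000000u /* someone is blocked on the lock */
#define MODEL_FUTEX_OWNER_DIED 0x40000000u /* previous owner died holding it */
#define MODEL_FUTEX_TID_MASK   0x3fffffffu /* TID of the current owner */

/* TID of the owner; 0 means the lock is not held. */
uint32_t futex_owner_tid(uint32_t uval)
{
	return uval & MODEL_FUTEX_TID_MASK;
}

/* True if the kernel flagged the previous owner as dead. */
bool futex_owner_died(uint32_t uval)
{
	return (uval & MODEL_FUTEX_OWNER_DIED) != 0;
}

/* True if at least one task is waiting in the kernel for this lock. */
bool futex_has_waiters(uint32_t uval)
{
	return (uval & MODEL_FUTEX_WAITERS) != 0;
}
```

The TID extracted this way is what the kernel has to resolve to a task when a
second locker arrives, and that lookup is where ESRCH can originate.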

This would suggest that we are racing with the unlock of the current
lock holder, but I don't see how this is possible as both the lock and
unlock paths hold the fshared lock for all operations over the lock
value. I have noticed that the lock path drops fshared if the current
holder is dying, but then it retries the whole process again.

Any advice would be highly appreciated.

Let me know if you need any further information.

Thanks
-- 
Michal Hocko
L3 team 
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9    
Czech Republic

[-- Attachment #2: runSimple.sh --]
[-- Type: application/x-sh, Size: 996 bytes --]

[-- Attachment #3: simple.c --]
[-- Type: text/x-csrc, Size: 2656 bytes --]

#include <unistd.h>
#include <errno.h>
#include <stdio.h>
#include <sys/file.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#define __USE_UNIX98
#include <pthread.h>

#define  TEST_FILE "test.file"

int init_mutex(pthread_mutex_t *mutex) {
	pthread_mutexattr_t mattr;
	int err;

	/* pthread functions return the error number; they do not set errno */
	if ((err = pthread_mutexattr_init(&mattr))) {
		fprintf(stderr, "pthread_mutexattr_init: %s\n", strerror(err));
		exit(1);
	}
	if ((err = pthread_mutexattr_setpshared(&mattr, PTHREAD_PROCESS_SHARED))) {
		fprintf(stderr, "pthread_mutexattr_setpshared: %s\n", strerror(err));
		exit(1);
	}
#ifdef USE_PI
	if ((err = pthread_mutexattr_setprotocol(&mattr, PTHREAD_PRIO_INHERIT))) {
		fprintf(stderr, "pthread_mutexattr_setprotocol PI: %s\n", strerror(err));
		exit(1);
	}
#endif
	if ((err = pthread_mutexattr_setrobust_np(&mattr, PTHREAD_MUTEX_ROBUST_NP))) {
		fprintf(stderr, "pthread_mutexattr_setrobust_np: %s\n", strerror(err));
		exit(1);
	}

	memset(mutex, 0, sizeof(pthread_mutex_t));
	if ((err = pthread_mutex_init(mutex, &mattr))) {
		fprintf(stderr, "mutex_init: %s\n", strerror(err));
		exit(1);
	}
	pthread_mutexattr_destroy(&mattr);
	return 0;
}

int init_test_file(const char *fname) {
	int fd = open(fname, O_RDWR|O_CREAT, S_IREAD|S_IWRITE);
	if (fd == -1) {
		perror("file open:");
		exit(1);
	}
	if (ftruncate(fd, 4096)) {
		perror("truncate: ");
		exit(1);
	}
	/* the file is reopened by get_mutex_from_file, so release this fd */
	close(fd);
	return 0;
}

pthread_mutex_t *get_mutex_from_file(const char *fname) {
	int fd = open(fname, O_RDWR, S_IREAD|S_IWRITE);
	if (fd == -1) {
		perror("file open: ");
		exit(1);
	}

	void * addr = mmap(0, 4096, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
	if (addr == MAP_FAILED) {
		perror("mmap failed: ");
		exit(1);
	}
	/* prefault the shared page */
	asm volatile ("" : : "r" (*((unsigned char *)addr)));
	return (pthread_mutex_t *)addr;
}

void check_locked_mutex(pthread_mutex_t *mutex) {
	if (!pthread_mutex_trylock(mutex)) {
		fprintf(stderr, "mutex is not held\n");
		exit(1);
	}
}

void sleep_up_to_sec(int sec) {
/*
	srandom(time(NULL));
	usleep((random()%sec) * 1000000);
*/
}

int main(int argc, char **argv) {
	if (argc > 1) {
		/* First run is just an initialization */
		init_test_file(TEST_FILE);
		pthread_mutex_t * mutex = get_mutex_from_file(TEST_FILE);
		init_mutex(mutex);
		exit(0);
	}

	pthread_mutex_t * mutex = get_mutex_from_file(TEST_FILE);
	int i;
	sleep_up_to_sec(5);
	for (i = 0; i < 1000; ++i)
	{
		int state = pthread_mutex_lock(mutex);
		if (state == EOWNERDEAD)
		{
			// We always perform check for dead process
			// Therefore may safely mark mutex as recovered
			printf("ownerdead\n");
			pthread_mutex_consistent_np(mutex);
		} else if (state) {
			/* pthread_mutex_lock returns the error; errno is not set */
			fprintf(stderr, "pthread_mutex_lock: %s\n", strerror(state));
			exit(1);
		}

		check_locked_mutex(mutex);
		sleep_up_to_sec(10);
		state = pthread_mutex_unlock(mutex);
		if (state) {
			fprintf(stderr, "pthread_mutex_unlock: %s\n", strerror(state));
			exit(1);
		}
	}

	exit(0);
}

* [PATCH] futex: futex_find_get_task remove credentails check
@ 2010-07-08 12:51 Michal Hocko
  2010-07-08 13:22 ` Ingo Molnar
From: Michal Hocko @ 2010-07-08 12:51 UTC
  To: stable; +Cc: Thomas Gleixner, Ingo Molnar, Darren Hart, LKML

Hi stable team,
could you consider including the following patch (Linus tree commit:
7a0ea09ad5352efce8fe79ed853150449903b9f5)?

The original discussion which led to this commit can be found at
http://lkml.org/lkml/2010/6/23/52.

In short:
The original PI locking implementation (since it was merged into the
kernel) contains a credential check (in futex_find_get_task) that is
performed when we want to create a PI state for an already held lock.
This check fails if the lock owner has a different (e)uid than the
process for which we want to create the state.
The lock operation then fails with ESRCH, which is the error code
returned if the process holding the lock doesn't exist.
Userspace (glibc) doesn't expect this behavior for shared robust PI
futexes and either fails with an assert or hangs the task in an endless
loop.
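
The effect of the check can be illustrated with a small userspace model.
This is only a sketch that mirrors the condition visible in the diff below.
It is not kernel code: the function names are hypothetical and the
credentials are reduced to plain integers.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Model of the owner lookup before the patch: a task that exists but has
 * foreign credentials is reported exactly like a task that does not exist.
 */
int old_owner_lookup(bool task_found, unsigned int caller_euid,
		     unsigned int owner_uid, unsigned int owner_euid)
{
	if (!task_found)
		return -ESRCH;
	if (caller_euid != owner_euid && caller_euid != owner_uid)
		return -ESRCH; /* credential mismatch, same error code */
	return 0;
}

/* Model of the owner lookup after the patch: only a missing task fails. */
int new_owner_lookup(bool task_found)
{
	return task_found ? 0 : -ESRCH;
}
```

With two users, say uid 1000 and uid 1001, the old lookup returns -ESRCH for
a perfectly alive owner, which is the case glibc does not expect for a robust
mutex.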

The test case is attached in the referenced thread.

The credential test, which is removed by this patch, doesn't look
correct and it limits the functionality without any good reason. There
are no security consequences either, because the only thing that should
matter for shared futexes is accessibility of the shared memory.

The patch applies as is on top of vanilla 2.6.32, but let me know if you
want me to base it on top of any of the stable trees.


---
From 7a0ea09ad5352efce8fe79ed853150449903b9f5 Mon Sep 17 00:00:00 2001
From: Michal Hocko <mhocko@suse.cz>
Date: Wed, 30 Jun 2010 09:51:19 +0200
Subject: [PATCH] futex: futex_find_get_task remove credentails check

futex_find_get_task is currently used (through lookup_pi_state) from two
contexts, futex_requeue and futex_lock_pi_atomic.  Neither of the paths
looks like it needs the credentials check, though.  Different (e)uids
shouldn't matter at all because the only thing that is important for
shared futex is the accessibility of the shared memory.

The credential check results in glibc assert failure or process hang (if
glibc is compiled without assert support) for shared robust pthread
mutex with priority inheritance if a process tries to lock already held
lock owned by a process with a different euid:

pthread_mutex_lock.c:312: __pthread_mutex_lock_full: Assertion `(-(e)) != 3 || !robust' failed.

The problem is that futex_lock_pi_atomic which is called when we try to
lock already held lock checks the current holder (tid is stored in the
futex value) to get the PI state.  It uses lookup_pi_state which in turn
gets task struct from futex_find_get_task.  ESRCH is returned either
when the task is not found or if credentials check fails.

futex_lock_pi_atomic simply returns if it gets ESRCH.  glibc code,
however, doesn't expect that robust lock returns with ESRCH because it
should get either success or owner died.

Signed-off-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Darren Hart <dvhltc@us.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---
 kernel/futex.c |   17 ++++-------------
 1 files changed, 4 insertions(+), 13 deletions(-)

diff --git a/kernel/futex.c b/kernel/futex.c
index e7a35f1..6a3a5fa 100644
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -429,20 +429,11 @@ static void free_pi_state(struct futex_pi_state *pi_state)
 static struct task_struct * futex_find_get_task(pid_t pid)
 {
 	struct task_struct *p;
-	const struct cred *cred = current_cred(), *pcred;
 
 	rcu_read_lock();
 	p = find_task_by_vpid(pid);
-	if (!p) {
-		p = ERR_PTR(-ESRCH);
-	} else {
-		pcred = __task_cred(p);
-		if (cred->euid != pcred->euid &&
-		    cred->euid != pcred->uid)
-			p = ERR_PTR(-ESRCH);
-		else
-			get_task_struct(p);
-	}
+	if (p)
+		get_task_struct(p);
 
 	rcu_read_unlock();
 
@@ -564,8 +555,8 @@ lookup_pi_state(u32 uval, struct futex_hash_bucket *hb,
 	if (!pid)
 		return -ESRCH;
 	p = futex_find_get_task(pid);
-	if (IS_ERR(p))
-		return PTR_ERR(p);
+	if (!p)
+		return -ESRCH;
 
 	/*
 	 * We need to look at the task state flags to figure out,
-- 
1.7.1

-- 
Michal Hocko
L3 team 
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9    
Czech Republic


Thread overview: 31+ messages
2010-06-23  9:13 futex: race in lock and unlock&exit for robust futex with PI? Michal Hocko
2010-06-25  2:42 ` Darren Hart
2010-06-25  8:27   ` Michal Hocko
2010-06-25 17:53     ` Darren Hart
2010-06-25 23:35       ` Darren Hart
2010-06-28 14:42         ` Michal Hocko
2010-06-28 14:56           ` Darren Hart
2010-06-28 15:32           ` Michal Hocko
2010-06-28 15:40             ` Michal Hocko
2010-06-28 15:58             ` Michal Hocko
2010-06-28 16:39               ` Michal Hocko
2010-06-28 16:45                 ` Peter Zijlstra
2010-06-28 16:56                   ` Michal Hocko
2010-06-28 16:49                 ` Peter Zijlstra
2010-06-29  8:42                   ` [PATCH] futex: futex_find_get_task make credentials check conditional Michal Hocko
2010-06-29 14:56                     ` Darren Hart
2010-06-29 15:24                       ` Michal Hocko
2010-06-29 16:41                     ` Linus Torvalds
2010-06-29 16:58                       ` Darren Hart
2010-06-29 18:03                         ` Thomas Gleixner
2010-06-30  7:01                       ` Michal Hocko
2010-06-30  9:55                         ` [PATCH] futex: futex_find_get_task remove credentails check Michal Hocko
2010-06-30 16:43                           ` Darren Hart
2010-07-08  9:28                             ` Michal Hocko
2010-07-08  9:32                               ` Ingo Molnar
2010-07-08  9:39                                 ` Michal Hocko
2010-07-08  9:43                                   ` Peter Zijlstra
2010-07-08  9:50                                     ` Michal Hocko
  -- strict thread matches above, loose matches on Subject: below --
2010-07-08 12:51 Michal Hocko
2010-07-08 13:22 ` Ingo Molnar
2010-07-12 10:20   ` Thomas Gleixner
