[PATCH] mutex: do not unnecessarily deal with waiters

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

* [PATCH] mutex: do not unnecessarily deal with waiters
@ 2013-05-23 23:59 Davidlohr Bueso
  2013-05-31  1:12 ` Davidlohr Bueso
  2013-06-27  9:00 ` Ingo Molnar
  0 siblings, 2 replies; 12+ messages in thread
From: Davidlohr Bueso @ 2013-05-23 23:59 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Rik van Riel, LKML, Davidlohr Bueso

From: Davidlohr Bueso <davidlohr.bueso@hp.com>

Upon entering the slowpath, we immediately attempt to acquire the lock
by checking if it is already unlocked. If we are lucky enough that this
is the case, then we don't need to deal with any waiter related logic.

Furthermore any checks for an empty wait_list are unnecessary as we
already know that count is non-negative and hence no one is waiting for
the lock.

Move the count check and xchg calls to be done before any waiters are
setup - including waiter debugging. Upon failure to acquire the lock,
the xchg sets the counter to 0, instead of -1 as it was originally.
This can be done here since we set it back to -1 right at the beginning
of the loop so other waiters are woken up when the lock is released.

When tested on a 8-socket (80 core) system against a vanilla 3.10-rc1
kernel, this patch provides some small performance benefits (+2-6%).
While it could be considered in the noise level, the average percentages
were stable across multiple runs and no performance regressions were seen.
Two big winners, for small amounts of users (10-100), were the short and
compute workloads had a +19.36% and +%15.76% in jobs per minute.
  
Signed-off-by: Davidlohr Bueso <davidlohr.bueso@hp.com>
---
 kernel/mutex.c | 26 ++++++++++++++------------
 1 file changed, 14 insertions(+), 12 deletions(-)

diff --git a/kernel/mutex.c b/kernel/mutex.c
index ad53a66..a8cd741 100644
--- a/kernel/mutex.c
+++ b/kernel/mutex.c
@@ -306,7 +306,7 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
 		owner = ACCESS_ONCE(lock->owner);
 		if (owner && !mutex_spin_on_owner(lock, owner)) {
 			mspin_unlock(MLOCK(lock), &node);
-			break;
+			goto slowpath;
 		}
 
 		if ((atomic_read(&lock->count) == 1) &&
@@ -314,8 +314,7 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
 			lock_acquired(&lock->dep_map, ip);
 			mutex_set_owner(lock);
 			mspin_unlock(MLOCK(lock), &node);
-			preempt_enable();
-			return 0;
+			goto done;
 		}
 		mspin_unlock(MLOCK(lock), &node);
 
@@ -326,7 +325,7 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
 		 * the owner complete.
 		 */
 		if (!owner && (need_resched() || rt_task(task)))
-			break;
+			goto slowpath;
 
 		/*
 		 * The cpu_relax() call is a compiler barrier which forces
@@ -340,6 +339,14 @@ slowpath:
 #endif
 	spin_lock_mutex(&lock->wait_lock, flags);
 
+	/* once more, can we acquire the lock? */
+	if (MUTEX_SHOW_NO_WAITER(lock) && (atomic_xchg(&lock->count, 0) == 1)) {
+		lock_acquired(&lock->dep_map, ip);
+		mutex_set_owner(lock);
+		spin_unlock_mutex(&lock->wait_lock, flags);
+		goto done;
+	}
+
 	debug_mutex_lock_common(lock, &waiter);
 	debug_mutex_add_waiter(lock, &waiter, task_thread_info(task));
 
@@ -347,9 +354,6 @@ slowpath:
 	list_add_tail(&waiter.list, &lock->wait_list);
 	waiter.task = task;
 
-	if (MUTEX_SHOW_NO_WAITER(lock) && (atomic_xchg(&lock->count, -1) == 1))
-		goto done;
-
 	lock_contended(&lock->dep_map, ip);
 
 	for (;;) {
@@ -363,7 +367,7 @@ slowpath:
 		 * other waiters:
 		 */
 		if (MUTEX_SHOW_NO_WAITER(lock) &&
-		   (atomic_xchg(&lock->count, -1) == 1))
+		    (atomic_xchg(&lock->count, -1) == 1))
 			break;
 
 		/*
@@ -388,9 +392,8 @@ slowpath:
 		spin_lock_mutex(&lock->wait_lock, flags);
 	}
 
-done:
+	/* got the lock - cleanup and rejoice! */
 	lock_acquired(&lock->dep_map, ip);
-	/* got the lock - rejoice! */
 	mutex_remove_waiter(lock, &waiter, current_thread_info());
 	mutex_set_owner(lock);
 
@@ -399,10 +402,9 @@ done:
 		atomic_set(&lock->count, 0);
 
 	spin_unlock_mutex(&lock->wait_lock, flags);
-
 	debug_mutex_free_waiter(&waiter);
+done:
 	preempt_enable();
-
 	return 0;
 }
 
-- 
1.7.11.7





^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH] mutex: do not unnecessarily deal with waiters
  2013-05-23 23:59 [PATCH] mutex: do not unnecessarily deal with waiters Davidlohr Bueso
@ 2013-05-31  1:12 ` Davidlohr Bueso
  2013-06-26 17:49   ` Davidlohr Bueso
  2013-06-27  9:00 ` Ingo Molnar
  1 sibling, 1 reply; 12+ messages in thread
From: Davidlohr Bueso @ 2013-05-31  1:12 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Rik van Riel, LKML

ping?

On Thu, 2013-05-23 at 16:59 -0700, Davidlohr Bueso wrote:
> From: Davidlohr Bueso <davidlohr.bueso@hp.com>
> 
> Upon entering the slowpath, we immediately attempt to acquire the lock
> by checking if it is already unlocked. If we are lucky enough that this
> is the case, then we don't need to deal with any waiter related logic.
> 
> Furthermore any checks for an empty wait_list are unnecessary as we
> already know that count is non-negative and hence no one is waiting for
> the lock.
> 
> Move the count check and xchg calls to be done before any waiters are
> setup - including waiter debugging. Upon failure to acquire the lock,
> the xchg sets the counter to 0, instead of -1 as it was originally.
> This can be done here since we set it back to -1 right at the beginning
> of the loop so other waiters are woken up when the lock is released.
> 
> When tested on a 8-socket (80 core) system against a vanilla 3.10-rc1
> kernel, this patch provides some small performance benefits (+2-6%).
> While it could be considered in the noise level, the average percentages
> were stable across multiple runs and no performance regressions were seen.
> Two big winners, for small amounts of users (10-100), were the short and
> compute workloads had a +19.36% and +%15.76% in jobs per minute.
>   
> Signed-off-by: Davidlohr Bueso <davidlohr.bueso@hp.com>
> ---
>  kernel/mutex.c | 26 ++++++++++++++------------
>  1 file changed, 14 insertions(+), 12 deletions(-)
> 
> diff --git a/kernel/mutex.c b/kernel/mutex.c
> index ad53a66..a8cd741 100644
> --- a/kernel/mutex.c
> +++ b/kernel/mutex.c
> @@ -306,7 +306,7 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
>  		owner = ACCESS_ONCE(lock->owner);
>  		if (owner && !mutex_spin_on_owner(lock, owner)) {
>  			mspin_unlock(MLOCK(lock), &node);
> -			break;
> +			goto slowpath;
>  		}
>  
>  		if ((atomic_read(&lock->count) == 1) &&
> @@ -314,8 +314,7 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
>  			lock_acquired(&lock->dep_map, ip);
>  			mutex_set_owner(lock);
>  			mspin_unlock(MLOCK(lock), &node);
> -			preempt_enable();
> -			return 0;
> +			goto done;
>  		}
>  		mspin_unlock(MLOCK(lock), &node);
>  
> @@ -326,7 +325,7 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
>  		 * the owner complete.
>  		 */
>  		if (!owner && (need_resched() || rt_task(task)))
> -			break;
> +			goto slowpath;
>  
>  		/*
>  		 * The cpu_relax() call is a compiler barrier which forces
> @@ -340,6 +339,14 @@ slowpath:
>  #endif
>  	spin_lock_mutex(&lock->wait_lock, flags);
>  
> +	/* once more, can we acquire the lock? */
> +	if (MUTEX_SHOW_NO_WAITER(lock) && (atomic_xchg(&lock->count, 0) == 1)) {
> +		lock_acquired(&lock->dep_map, ip);
> +		mutex_set_owner(lock);
> +		spin_unlock_mutex(&lock->wait_lock, flags);
> +		goto done;
> +	}
> +
>  	debug_mutex_lock_common(lock, &waiter);
>  	debug_mutex_add_waiter(lock, &waiter, task_thread_info(task));
>  
> @@ -347,9 +354,6 @@ slowpath:
>  	list_add_tail(&waiter.list, &lock->wait_list);
>  	waiter.task = task;
>  
> -	if (MUTEX_SHOW_NO_WAITER(lock) && (atomic_xchg(&lock->count, -1) == 1))
> -		goto done;
> -
>  	lock_contended(&lock->dep_map, ip);
>  
>  	for (;;) {
> @@ -363,7 +367,7 @@ slowpath:
>  		 * other waiters:
>  		 */
>  		if (MUTEX_SHOW_NO_WAITER(lock) &&
> -		   (atomic_xchg(&lock->count, -1) == 1))
> +		    (atomic_xchg(&lock->count, -1) == 1))
>  			break;
>  
>  		/*
> @@ -388,9 +392,8 @@ slowpath:
>  		spin_lock_mutex(&lock->wait_lock, flags);
>  	}
>  
> -done:
> +	/* got the lock - cleanup and rejoice! */
>  	lock_acquired(&lock->dep_map, ip);
> -	/* got the lock - rejoice! */
>  	mutex_remove_waiter(lock, &waiter, current_thread_info());
>  	mutex_set_owner(lock);
>  
> @@ -399,10 +402,9 @@ done:
>  		atomic_set(&lock->count, 0);
>  
>  	spin_unlock_mutex(&lock->wait_lock, flags);
> -
>  	debug_mutex_free_waiter(&waiter);
> +done:
>  	preempt_enable();
> -
>  	return 0;
>  }
>  



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] mutex: do not unnecessarily deal with waiters
  2013-05-31  1:12 ` Davidlohr Bueso
@ 2013-06-26 17:49   ` Davidlohr Bueso
  0 siblings, 0 replies; 12+ messages in thread
From: Davidlohr Bueso @ 2013-06-26 17:49 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Rik van Riel, LKML, Peter Zijlstra

ping, Ingo?

On Thu, 2013-05-30 at 18:12 -0700, Davidlohr Bueso wrote:
> ping?
> 
> On Thu, 2013-05-23 at 16:59 -0700, Davidlohr Bueso wrote:
> > From: Davidlohr Bueso <davidlohr.bueso@hp.com>
> > 
> > Upon entering the slowpath, we immediately attempt to acquire the lock
> > by checking if it is already unlocked. If we are lucky enough that this
> > is the case, then we don't need to deal with any waiter related logic.
> > 
> > Furthermore any checks for an empty wait_list are unnecessary as we
> > already know that count is non-negative and hence no one is waiting for
> > the lock.
> > 
> > Move the count check and xchg calls to be done before any waiters are
> > setup - including waiter debugging. Upon failure to acquire the lock,
> > the xchg sets the counter to 0, instead of -1 as it was originally.
> > This can be done here since we set it back to -1 right at the beginning
> > of the loop so other waiters are woken up when the lock is released.
> > 
> > When tested on a 8-socket (80 core) system against a vanilla 3.10-rc1
> > kernel, this patch provides some small performance benefits (+2-6%).
> > While it could be considered in the noise level, the average percentages
> > were stable across multiple runs and no performance regressions were seen.
> > Two big winners, for small amounts of users (10-100), were the short and
> > compute workloads had a +19.36% and +%15.76% in jobs per minute.
> >   
> > Signed-off-by: Davidlohr Bueso <davidlohr.bueso@hp.com>
> > ---
> >  kernel/mutex.c | 26 ++++++++++++++------------
> >  1 file changed, 14 insertions(+), 12 deletions(-)
> > 
> > diff --git a/kernel/mutex.c b/kernel/mutex.c
> > index ad53a66..a8cd741 100644
> > --- a/kernel/mutex.c
> > +++ b/kernel/mutex.c
> > @@ -306,7 +306,7 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
> >  		owner = ACCESS_ONCE(lock->owner);
> >  		if (owner && !mutex_spin_on_owner(lock, owner)) {
> >  			mspin_unlock(MLOCK(lock), &node);
> > -			break;
> > +			goto slowpath;
> >  		}
> >  
> >  		if ((atomic_read(&lock->count) == 1) &&
> > @@ -314,8 +314,7 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
> >  			lock_acquired(&lock->dep_map, ip);
> >  			mutex_set_owner(lock);
> >  			mspin_unlock(MLOCK(lock), &node);
> > -			preempt_enable();
> > -			return 0;
> > +			goto done;
> >  		}
> >  		mspin_unlock(MLOCK(lock), &node);
> >  
> > @@ -326,7 +325,7 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
> >  		 * the owner complete.
> >  		 */
> >  		if (!owner && (need_resched() || rt_task(task)))
> > -			break;
> > +			goto slowpath;
> >  
> >  		/*
> >  		 * The cpu_relax() call is a compiler barrier which forces
> > @@ -340,6 +339,14 @@ slowpath:
> >  #endif
> >  	spin_lock_mutex(&lock->wait_lock, flags);
> >  
> > +	/* once more, can we acquire the lock? */
> > +	if (MUTEX_SHOW_NO_WAITER(lock) && (atomic_xchg(&lock->count, 0) == 1)) {
> > +		lock_acquired(&lock->dep_map, ip);
> > +		mutex_set_owner(lock);
> > +		spin_unlock_mutex(&lock->wait_lock, flags);
> > +		goto done;
> > +	}
> > +
> >  	debug_mutex_lock_common(lock, &waiter);
> >  	debug_mutex_add_waiter(lock, &waiter, task_thread_info(task));
> >  
> > @@ -347,9 +354,6 @@ slowpath:
> >  	list_add_tail(&waiter.list, &lock->wait_list);
> >  	waiter.task = task;
> >  
> > -	if (MUTEX_SHOW_NO_WAITER(lock) && (atomic_xchg(&lock->count, -1) == 1))
> > -		goto done;
> > -
> >  	lock_contended(&lock->dep_map, ip);
> >  
> >  	for (;;) {
> > @@ -363,7 +367,7 @@ slowpath:
> >  		 * other waiters:
> >  		 */
> >  		if (MUTEX_SHOW_NO_WAITER(lock) &&
> > -		   (atomic_xchg(&lock->count, -1) == 1))
> > +		    (atomic_xchg(&lock->count, -1) == 1))
> >  			break;
> >  
> >  		/*
> > @@ -388,9 +392,8 @@ slowpath:
> >  		spin_lock_mutex(&lock->wait_lock, flags);
> >  	}
> >  
> > -done:
> > +	/* got the lock - cleanup and rejoice! */
> >  	lock_acquired(&lock->dep_map, ip);
> > -	/* got the lock - rejoice! */
> >  	mutex_remove_waiter(lock, &waiter, current_thread_info());
> >  	mutex_set_owner(lock);
> >  
> > @@ -399,10 +402,9 @@ done:
> >  		atomic_set(&lock->count, 0);
> >  
> >  	spin_unlock_mutex(&lock->wait_lock, flags);
> > -
> >  	debug_mutex_free_waiter(&waiter);
> > +done:
> >  	preempt_enable();
> > -
> >  	return 0;
> >  }
> >  
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] mutex: do not unnecessarily deal with waiters
  2013-05-23 23:59 [PATCH] mutex: do not unnecessarily deal with waiters Davidlohr Bueso
  2013-05-31  1:12 ` Davidlohr Bueso
@ 2013-06-27  9:00 ` Ingo Molnar
  2013-06-28  1:32   ` Davidlohr Bueso
  1 sibling, 1 reply; 12+ messages in thread
From: Ingo Molnar @ 2013-06-27  9:00 UTC (permalink / raw)
  To: Davidlohr Bueso, Maarten Lankhorst
  Cc: Rik van Riel, LKML, Peter Zijlstra, Thomas Gleixner


* Davidlohr Bueso <davidlohr.bueso@hp.com> wrote:

> From: Davidlohr Bueso <davidlohr.bueso@hp.com>
> 
> Upon entering the slowpath, we immediately attempt to acquire the lock
> by checking if it is already unlocked. If we are lucky enough that this
> is the case, then we don't need to deal with any waiter related logic.
> 
> Furthermore any checks for an empty wait_list are unnecessary as we
> already know that count is non-negative and hence no one is waiting for
> the lock.
> 
> Move the count check and xchg calls to be done before any waiters are
> setup - including waiter debugging. Upon failure to acquire the lock,
> the xchg sets the counter to 0, instead of -1 as it was originally.
> This can be done here since we set it back to -1 right at the beginning
> of the loop so other waiters are woken up when the lock is released.
> 
> When tested on a 8-socket (80 core) system against a vanilla 3.10-rc1
> kernel, this patch provides some small performance benefits (+2-6%).
> While it could be considered in the noise level, the average percentages
> were stable across multiple runs and no performance regressions were seen.
> Two big winners, for small amounts of users (10-100), were the short and
> compute workloads had a +19.36% and +%15.76% in jobs per minute.
>   
> Signed-off-by: Davidlohr Bueso <davidlohr.bueso@hp.com>
> ---
>  kernel/mutex.c | 26 ++++++++++++++------------
>  1 file changed, 14 insertions(+), 12 deletions(-)
> 
> diff --git a/kernel/mutex.c b/kernel/mutex.c
> index ad53a66..a8cd741 100644
> --- a/kernel/mutex.c
> +++ b/kernel/mutex.c
> @@ -306,7 +306,7 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
>  		owner = ACCESS_ONCE(lock->owner);
>  		if (owner && !mutex_spin_on_owner(lock, owner)) {
>  			mspin_unlock(MLOCK(lock), &node);
> -			break;
> +			goto slowpath;
>  		}
>  
>  		if ((atomic_read(&lock->count) == 1) &&
> @@ -314,8 +314,7 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
>  			lock_acquired(&lock->dep_map, ip);
>  			mutex_set_owner(lock);
>  			mspin_unlock(MLOCK(lock), &node);
> -			preempt_enable();
> -			return 0;
> +			goto done;
>  		}
>  		mspin_unlock(MLOCK(lock), &node);
>  
> @@ -326,7 +325,7 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
>  		 * the owner complete.
>  		 */
>  		if (!owner && (need_resched() || rt_task(task)))
> -			break;
> +			goto slowpath;
>  
>  		/*
>  		 * The cpu_relax() call is a compiler barrier which forces
> @@ -340,6 +339,14 @@ slowpath:
>  #endif
>  	spin_lock_mutex(&lock->wait_lock, flags);
>  
> +	/* once more, can we acquire the lock? */
> +	if (MUTEX_SHOW_NO_WAITER(lock) && (atomic_xchg(&lock->count, 0) == 1)) {
> +		lock_acquired(&lock->dep_map, ip);
> +		mutex_set_owner(lock);
> +		spin_unlock_mutex(&lock->wait_lock, flags);
> +		goto done;
> +	}
> +
>  	debug_mutex_lock_common(lock, &waiter);
>  	debug_mutex_add_waiter(lock, &waiter, task_thread_info(task));
>  
> @@ -347,9 +354,6 @@ slowpath:
>  	list_add_tail(&waiter.list, &lock->wait_list);
>  	waiter.task = task;
>  
> -	if (MUTEX_SHOW_NO_WAITER(lock) && (atomic_xchg(&lock->count, -1) == 1))
> -		goto done;
> -
>  	lock_contended(&lock->dep_map, ip);
>  
>  	for (;;) {
> @@ -363,7 +367,7 @@ slowpath:
>  		 * other waiters:
>  		 */
>  		if (MUTEX_SHOW_NO_WAITER(lock) &&
> -		   (atomic_xchg(&lock->count, -1) == 1))
> +		    (atomic_xchg(&lock->count, -1) == 1))
>  			break;
>  
>  		/*
> @@ -388,9 +392,8 @@ slowpath:
>  		spin_lock_mutex(&lock->wait_lock, flags);
>  	}
>  
> -done:
> +	/* got the lock - cleanup and rejoice! */
>  	lock_acquired(&lock->dep_map, ip);
> -	/* got the lock - rejoice! */
>  	mutex_remove_waiter(lock, &waiter, current_thread_info());
>  	mutex_set_owner(lock);
>  
> @@ -399,10 +402,9 @@ done:
>  		atomic_set(&lock->count, 0);
>  
>  	spin_unlock_mutex(&lock->wait_lock, flags);
> -
>  	debug_mutex_free_waiter(&waiter);
> +done:
>  	preempt_enable();
> -
>  	return 0;
>  }

So I tried this out yesterday, but it interacted with the Wait/Wound 
patches in tip:core/mutexes.

Maarten Lankhorst pointed out that if this patch is applied on top of the 
WW patches as-is, then we get this semantic merge conflict:

> > @@ -340,6 +339,14 @@ slowpath:
> >  #endif
> >     spin_lock_mutex(&lock->wait_lock, flags);
> >  
> > +   /* once more, can we acquire the lock? */
> > +   if (MUTEX_SHOW_NO_WAITER(lock) && (atomic_xchg(&lock->count, 0) == 1)) {
> > +           lock_acquired(&lock->dep_map, ip);
> > +           mutex_set_owner(lock);
> > +           spin_unlock_mutex(&lock->wait_lock, flags);
> > +           goto done;
> > +   }
> >
> ^^^^^^^^^^^^^^
>
> This part skips the whole if (!__builtin_constant_p(ww_ctx == NULL)) { 
> section with the wait_lock held.

Mind resolving that and merging this patch on top of the latest tip:master 
tree? Please also keep Maarten Cc:-ed.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] mutex: do not unnecessarily deal with waiters
  2013-06-27  9:00 ` Ingo Molnar
@ 2013-06-28  1:32   ` Davidlohr Bueso
  2013-06-28  5:53     ` Maarten Lankhorst
  0 siblings, 1 reply; 12+ messages in thread
From: Davidlohr Bueso @ 2013-06-28  1:32 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Maarten Lankhorst, Rik van Riel, LKML, Peter Zijlstra,
	Thomas Gleixner

On Thu, 2013-06-27 at 11:00 +0200, Ingo Molnar wrote:
[...]
> 
> So I tried this out yesterday, but it interacted with the Wait/Wound 
> patches in tip:core/mutexes.
> 
> Maarten Lankhorst pointed out that if this patch is applied on top of the 
> WW patches as-is, then we get this semantic merge conflict:
> 
> > > @@ -340,6 +339,14 @@ slowpath:
> > >  #endif
> > >     spin_lock_mutex(&lock->wait_lock, flags);
> > >  
> > > +   /* once more, can we acquire the lock? */
> > > +   if (MUTEX_SHOW_NO_WAITER(lock) && (atomic_xchg(&lock->count, 0) == 1)) {
> > > +           lock_acquired(&lock->dep_map, ip);
> > > +           mutex_set_owner(lock);
> > > +           spin_unlock_mutex(&lock->wait_lock, flags);
> > > +           goto done;
> > > +   }
> > >
> > ^^^^^^^^^^^^^^
> >
> > This part skips the whole if (!__builtin_constant_p(ww_ctx == NULL)) { 
> > section with the wait_lock held.

I see what you mean, I hadn't really looked at the W/W patches. BTW
those __builtin_constant_p() calls are pretty ugly/annoying to read,
plus why the negation of the NULL check? Couldn't we just do something
like:

#define is_ww_ctx(x) (__builtin_constant_p(x))
...
if (is_ww_ctxt(ww_ctx)) { ... }


Anyway, so going back to the actual patch, we need a few cleanups in
__mutex_lock_common() before we can rebase this patch - otherwise we're
going to end up duplicating a lot of code (and the function is already
big enough):

How about a new ww_mutex_set_context_slowpath() function that does the
w/w lock acquiring and wakes up any sleeping processes. We'd use this
function whenever we acquire the lock in the slowpath (with the
->wait_lock taken):

static __always_inline void
ww_mutex_set_context_slowpath(struct mutex *lock,
			      struct ww_acquire_ctx *ww_ctx, bool debug)
{
	if (!__builtin_constant_p(ww_ctx == NULL)) {
		struct mutex_waiter *cur;
		struct ww_mutex *ww = container_of(lock, struct ww_mutex, base);

		/*
		 * This branch gets optimized out for the common case,
		 * and is only important for ww_mutex_lock.
		 */
		ww_mutex_lock_acquired(ww, ww_ctx);
		ww->ctx = ww_ctx;

		/*
		 * Give any possible sleeping processes the chance to wake up,
		 * so they can recheck if they have to back off.
		 */
		list_for_each_entry(cur, &lock->wait_list, list) {
			if (debug)
				debug_mutex_wake_waiter(lock, cur);
			wake_up_process(cur->task);
		}
	}
}

In ww_mutex_set_context_fastpath() I'm a little confused with the
debug_mutex_wake_waiter() calls since we don't deal with debug in the
fast path (->wait_lock isn't held). So are these calls
correct/necessary?

For ww_mutex_set_context_slowpath(), the 'debug' parameter would be
necessary since with this patch we avoid doing the debug_mutex on a
quick attempt to grab the lock, otherwise we do the slowpath debug,
waiters, etc. For instance:

...
slowpath:
#endif
	spin_lock_mutex(&lock->wait_lock, flags);
	/* once more, can we acquire the lock? */
	if (MUTEX_SHOW_NO_WAITER(lock) && (atomic_xchg(&lock->count, 0) == 1)) {
        	lock_acquired(&lock->dep_map, ip);
	        mutex_set_owner(lock);
		ww_mutex_set_context_fastpath(lock, ww_ctx, false);
        	spin_unlock_mutex(&lock->wait_lock, flags);
	        goto done;
  	}
...

lock_acquired(&lock->dep_map, ip);
/* got the lock - rejoice! */
mutex_remove_waiter(lock, &waiter, current_thread_info());
mutex_set_owner(lock);
ww_mutex_set_context_slowpath(lock, ww_ctx, true);
...


Thanks,
Davidlohr


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] mutex: do not unnecessarily deal with waiters
  2013-06-28  1:32   ` Davidlohr Bueso
@ 2013-06-28  5:53     ` Maarten Lankhorst
  2013-06-28 19:29       ` Davidlohr Bueso
  2013-06-28 20:13       ` [PATCH v2] " Davidlohr Bueso
  0 siblings, 2 replies; 12+ messages in thread
From: Maarten Lankhorst @ 2013-06-28  5:53 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: Ingo Molnar, Rik van Riel, LKML, Peter Zijlstra, Thomas Gleixner

Op 28-06-13 03:32, Davidlohr Bueso schreef:
> On Thu, 2013-06-27 at 11:00 +0200, Ingo Molnar wrote:
> [...]
>> So I tried this out yesterday, but it interacted with the Wait/Wound 
>> patches in tip:core/mutexes.
>>
>> Maarten Lankhorst pointed out that if this patch is applied on top of the 
>> WW patches as-is, then we get this semantic merge conflict:
>>
>>>> @@ -340,6 +339,14 @@ slowpath:
>>>>  #endif
>>>>     spin_lock_mutex(&lock->wait_lock, flags);
>>>>  
>>>> +   /* once more, can we acquire the lock? */
>>>> +   if (MUTEX_SHOW_NO_WAITER(lock) && (atomic_xchg(&lock->count, 0) == 1)) {
>>>> +           lock_acquired(&lock->dep_map, ip);
>>>> +           mutex_set_owner(lock);
>>>> +           spin_unlock_mutex(&lock->wait_lock, flags);
>>>> +           goto done;
>>>> +   }
>>>>
>>> ^^^^^^^^^^^^^^
>>>
>>> This part skips the whole if (!__builtin_constant_p(ww_ctx == NULL)) { 
>>> section with the wait_lock held.
> I see what you mean, I hadn't really looked at the W/W patches. BTW
> those __builtin_constant_p() calls are pretty ugly/annoying to read,
> plus why the negation of the NULL check? Couldn't we just do something
> like:
It's to kill overhead.. ww_ctx == NULL is a constant only when the function is called with null as explicit parameter.

So !__builtin_constant_p(ww_ctx == NULL) means that the function was called with a variable ww_ctx.
> #define is_ww_ctx(x) (__builtin_constant_p(x))
> ...
> if (is_ww_ctxt(ww_ctx)) { ... }
>
>
> Anyway, so going back to the actual patch, we need a few cleanups in
> __mutex_lock_common() before we can rebase this patch - otherwise we're
> going to end up duplicating a lot of code (and the function is already
> big enough):
>
> How about a new ww_mutex_set_context_slowpath() function that does the
> w/w lock acquiring and wakes up any sleeping processes. We'd use this
> function whenever we acquire the lock in the slowpath (with the
> ->wait_lock taken):
>
> static __always_inline void
> ww_mutex_set_context_slowpath(struct mutex *lock,
> 			      struct ww_acquire_ctx *ww_ctx, bool debug)
> {
> 	if (!__builtin_constant_p(ww_ctx == NULL)) {
> 		struct mutex_waiter *cur;
> 		struct ww_mutex *ww = container_of(lock, struct ww_mutex, base);
>
> 		/*
> 		 * This branch gets optimized out for the common case,
> 		 * and is only important for ww_mutex_lock.
> 		 */
> 		ww_mutex_lock_acquired(ww, ww_ctx);
> 		ww->ctx = ww_ctx;
>
> 		/*
> 		 * Give any possible sleeping processes the chance to wake up,
> 		 * so they can recheck if they have to back off.
> 		 */
> 		list_for_each_entry(cur, &lock->wait_list, list) {
> 			if (debug)
> 				debug_mutex_wake_waiter(lock, cur);
> 			wake_up_process(cur->task);
> 		}
> 	}
> }
>
> In ww_mutex_set_context_fastpath() I'm a little confused with the
> debug_mutex_wake_waiter() calls since we don't deal with debug in the
> fast path (->wait_lock isn't held). So are these calls
> correct/necessary?
Well spotted, but in that case the !debug case mutex_wake_waiter gets optimized out anyway,
so please don't add a conditional like that.
> For ww_mutex_set_context_slowpath(), the 'debug' parameter would be
> necessary since with this patch we avoid doing the debug_mutex on a
> quick attempt to grab the lock, otherwise we do the slowpath debug,
> waiters, etc. For instance:
>
> ...
> slowpath:
> #endif
> 	spin_lock_mutex(&lock->wait_lock, flags);
> 	/* once more, can we acquire the lock? */
> 	if (MUTEX_SHOW_NO_WAITER(lock) && (atomic_xchg(&lock->count, 0) == 1)) {
>         	lock_acquired(&lock->dep_map, ip);
> 	        mutex_set_owner(lock);
> 		ww_mutex_set_context_fastpath(lock, ww_ctx, false);
>         	spin_unlock_mutex(&lock->wait_lock, flags);
> 	        goto done;
>   	}
> ...
>
> lock_acquired(&lock->dep_map, ip);
> /* got the lock - rejoice! */
> mutex_remove_waiter(lock, &waiter, current_thread_info());
> mutex_set_owner(lock);
> ww_mutex_set_context_slowpath(lock, ww_ctx, true);
> ...

I used the power of goto's in my own fixed up version below, and reshuffled some calls a bit.

Maybe you could verify if it's correct, and if it is use it as base?
8<---------
diff --git a/kernel/mutex.c b/kernel/mutex.c
index e581ada..f93be1d 100644
--- a/kernel/mutex.c
+++ b/kernel/mutex.c
@@ -486,8 +486,7 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
 
 			mutex_set_owner(lock);
 			mspin_unlock(MLOCK(lock), &node);
-			preempt_enable();
-			return 0;
+			goto done;
 		}
 		mspin_unlock(MLOCK(lock), &node);
 
@@ -512,6 +511,10 @@ slowpath:
 #endif
 	spin_lock_mutex(&lock->wait_lock, flags);
 
+	/* once more, can we acquire the lock? */
+	if (MUTEX_SHOW_NO_WAITER(lock) && (atomic_xchg(&lock->count, 0) == 1))
+		goto skip_wait;
+
 	debug_mutex_lock_common(lock, &waiter);
 	debug_mutex_add_waiter(lock, &waiter, task_thread_info(task));
 
@@ -519,9 +522,6 @@ slowpath:
 	list_add_tail(&waiter.list, &lock->wait_list);
 	waiter.task = task;
 
-	if (MUTEX_SHOW_NO_WAITER(lock) && (atomic_xchg(&lock->count, -1) == 1))
-		goto done;
-
 	lock_contended(&lock->dep_map, ip);
 
 	for (;;) {
@@ -535,7 +535,7 @@ slowpath:
 		 * other waiters:
 		 */
 		if (MUTEX_SHOW_NO_WAITER(lock) &&
-		   (atomic_xchg(&lock->count, -1) == 1))
+		    (atomic_xchg(&lock->count, -1) == 1))
 			break;
 
 		/*
@@ -560,11 +560,15 @@ slowpath:
 		schedule_preempt_disabled();
 		spin_lock_mutex(&lock->wait_lock, flags);
 	}
+	mutex_remove_waiter(lock, &waiter, current_thread_info());
+	/* set it to 0 if there are no waiters left: */
+	if (likely(list_empty(&lock->wait_list)))
+		atomic_set(&lock->count, 0);
+	debug_mutex_free_waiter(&waiter);
 
-done:
+skip_wait:
+	/* got the lock - cleanup and rejoice! */
 	lock_acquired(&lock->dep_map, ip);
-	/* got the lock - rejoice! */
-	mutex_remove_waiter(lock, &waiter, current_thread_info());
 	mutex_set_owner(lock);
 
 	if (!__builtin_constant_p(ww_ctx == NULL)) {
@@ -591,15 +595,9 @@ done:
 		}
 	}
 
-	/* set it to 0 if there are no waiters left: */
-	if (likely(list_empty(&lock->wait_list)))
-		atomic_set(&lock->count, 0);
-
 	spin_unlock_mutex(&lock->wait_lock, flags);
-
-	debug_mutex_free_waiter(&waiter);
+done:
 	preempt_enable();
-
 	return 0;
 
 err:


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH] mutex: do not unnecessarily deal with waiters
  2013-06-28  5:53     ` Maarten Lankhorst
@ 2013-06-28 19:29       ` Davidlohr Bueso
  2013-06-28 20:13       ` [PATCH v2] " Davidlohr Bueso
  1 sibling, 0 replies; 12+ messages in thread
From: Davidlohr Bueso @ 2013-06-28 19:29 UTC (permalink / raw)
  To: Maarten Lankhorst
  Cc: Ingo Molnar, Rik van Riel, LKML, Peter Zijlstra, Thomas Gleixner

On Fri, 2013-06-28 at 07:53 +0200, Maarten Lankhorst wrote:
> Op 28-06-13 03:32, Davidlohr Bueso schreef:
> > On Thu, 2013-06-27 at 11:00 +0200, Ingo Molnar wrote:
> > [...]
> >> So I tried this out yesterday, but it interacted with the Wait/Wound 
> >> patches in tip:core/mutexes.
> >>
> >> Maarten Lankhorst pointed out that if this patch is applied on top of the 
> >> WW patches as-is, then we get this semantic merge conflict:
> >>
> >>>> @@ -340,6 +339,14 @@ slowpath:
> >>>>  #endif
> >>>>     spin_lock_mutex(&lock->wait_lock, flags);
> >>>>  
> >>>> +   /* once more, can we acquire the lock? */
> >>>> +   if (MUTEX_SHOW_NO_WAITER(lock) && (atomic_xchg(&lock->count, 0) == 1)) {
> >>>> +           lock_acquired(&lock->dep_map, ip);
> >>>> +           mutex_set_owner(lock);
> >>>> +           spin_unlock_mutex(&lock->wait_lock, flags);
> >>>> +           goto done;
> >>>> +   }
> >>>>
> >>> ^^^^^^^^^^^^^^
> >>>
> >>> This part skips the whole if (!__builtin_constant_p(ww_ctx == NULL)) { 
> >>> section with the wait_lock held.
> > I see what you mean, I hadn't really looked at the W/W patches. BTW
> > those __builtin_constant_p() calls are pretty ugly/annoying to read,
> > plus why the negation of the NULL check? Couldn't we just do something
> > like:
> It's to kill overhead.. ww_ctx == NULL is a constant only when the function is called with null as explicit parameter.
> 
> So !__builtin_constant_p(ww_ctx == NULL) means that the function was called with a variable ww_ctx.
> > #define is_ww_ctx(x) (__builtin_constant_p(x))
> > ...
> > if (is_ww_ctxt(ww_ctx)) { ... }
> >
> >
> > Anyway, so going back to the actual patch, we need a few cleanups in
> > __mutex_lock_common() before we can rebase this patch - otherwise we're
> > going to end up duplicating a lot of code (and the function is already
> > big enough):
> >
> > How about a new ww_mutex_set_context_slowpath() function that does the
> > w/w lock acquiring and wakes up any sleeping processes. We'd use this
> > function whenever we acquire the lock in the slowpath (with the
> > ->wait_lock taken):
> >
> > static __always_inline void
> > ww_mutex_set_context_slowpath(struct mutex *lock,
> > 			      struct ww_acquire_ctx *ww_ctx, bool debug)
> > {
> > 	if (!__builtin_constant_p(ww_ctx == NULL)) {
> > 		struct mutex_waiter *cur;
> > 		struct ww_mutex *ww = container_of(lock, struct ww_mutex, base);
> >
> > 		/*
> > 		 * This branch gets optimized out for the common case,
> > 		 * and is only important for ww_mutex_lock.
> > 		 */
> > 		ww_mutex_lock_acquired(ww, ww_ctx);
> > 		ww->ctx = ww_ctx;
> >
> > 		/*
> > 		 * Give any possible sleeping processes the chance to wake up,
> > 		 * so they can recheck if they have to back off.
> > 		 */
> > 		list_for_each_entry(cur, &lock->wait_list, list) {
> > 			if (debug)
> > 				debug_mutex_wake_waiter(lock, cur);
> > 			wake_up_process(cur->task);
> > 		}
> > 	}
> > }
> >
> > In ww_mutex_set_context_fastpath() I'm a little confused with the
> > debug_mutex_wake_waiter() calls since we don't deal with debug in the
> > fast path (->wait_lock isn't held). So are these calls
> > correct/necessary?
> Well spotted, but in that case the !debug case mutex_wake_waiter gets optimized out anyway,
> so please don't add a conditional like that.
> > For ww_mutex_set_context_slowpath(), the 'debug' parameter would be
> > necessary since with this patch we avoid doing the debug_mutex on a
> > quick attempt to grab the lock, otherwise we do the slowpath debug,
> > waiters, etc. For instance:
> >
> > ...
> > slowpath:
> > #endif
> > 	spin_lock_mutex(&lock->wait_lock, flags);
> > 	/* once more, can we acquire the lock? */
> > 	if (MUTEX_SHOW_NO_WAITER(lock) && (atomic_xchg(&lock->count, 0) == 1)) {
> >         	lock_acquired(&lock->dep_map, ip);
> > 	        mutex_set_owner(lock);
> > 		ww_mutex_set_context_fastpath(lock, ww_ctx, false);
> >         	spin_unlock_mutex(&lock->wait_lock, flags);
> > 	        goto done;
> >   	}
> > ...
> >
> > lock_acquired(&lock->dep_map, ip);
> > /* got the lock - rejoice! */
> > mutex_remove_waiter(lock, &waiter, current_thread_info());
> > mutex_set_owner(lock);
> > ww_mutex_set_context_slowpath(lock, ww_ctx, true);
> > ...
> 
> I used the power of goto's in my own fixed up version below, and reshuffled some calls a bit.

Ok, so I was over complicating things to workaround the debug bits. With
that sorted out then your changes below look correct. I'll send out a
formal v2.

Thanks,
Davidlohr

> 
> Maybe you could verify if it's correct, and if it is use it as base?
> 8<---------
> diff --git a/kernel/mutex.c b/kernel/mutex.c
> index e581ada..f93be1d 100644
> --- a/kernel/mutex.c
> +++ b/kernel/mutex.c
> @@ -486,8 +486,7 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
>  
>  			mutex_set_owner(lock);
>  			mspin_unlock(MLOCK(lock), &node);
> -			preempt_enable();
> -			return 0;
> +			goto done;
>  		}
>  		mspin_unlock(MLOCK(lock), &node);
>  
> @@ -512,6 +511,10 @@ slowpath:
>  #endif
>  	spin_lock_mutex(&lock->wait_lock, flags);
>  
> +	/* once more, can we acquire the lock? */
> +	if (MUTEX_SHOW_NO_WAITER(lock) && (atomic_xchg(&lock->count, 0) == 1))
> +		goto skip_wait;
> +
>  	debug_mutex_lock_common(lock, &waiter);
>  	debug_mutex_add_waiter(lock, &waiter, task_thread_info(task));
>  
> @@ -519,9 +522,6 @@ slowpath:
>  	list_add_tail(&waiter.list, &lock->wait_list);
>  	waiter.task = task;
>  
> -	if (MUTEX_SHOW_NO_WAITER(lock) && (atomic_xchg(&lock->count, -1) == 1))
> -		goto done;
> -
>  	lock_contended(&lock->dep_map, ip);
>  
>  	for (;;) {
> @@ -535,7 +535,7 @@ slowpath:
>  		 * other waiters:
>  		 */
>  		if (MUTEX_SHOW_NO_WAITER(lock) &&
> -		   (atomic_xchg(&lock->count, -1) == 1))
> +		    (atomic_xchg(&lock->count, -1) == 1))
>  			break;
>  
>  		/*
> @@ -560,11 +560,15 @@ slowpath:
>  		schedule_preempt_disabled();
>  		spin_lock_mutex(&lock->wait_lock, flags);
>  	}
> +	mutex_remove_waiter(lock, &waiter, current_thread_info());
> +	/* set it to 0 if there are no waiters left: */
> +	if (likely(list_empty(&lock->wait_list)))
> +		atomic_set(&lock->count, 0);
> +	debug_mutex_free_waiter(&waiter);
>  
> -done:
> +skip_wait:
> +	/* got the lock - cleanup and rejoice! */
>  	lock_acquired(&lock->dep_map, ip);
> -	/* got the lock - rejoice! */
> -	mutex_remove_waiter(lock, &waiter, current_thread_info());
>  	mutex_set_owner(lock);
>  
>  	if (!__builtin_constant_p(ww_ctx == NULL)) {
> @@ -591,15 +595,9 @@ done:
>  		}
>  	}
>  
> -	/* set it to 0 if there are no waiters left: */
> -	if (likely(list_empty(&lock->wait_list)))
> -		atomic_set(&lock->count, 0);
> -
>  	spin_unlock_mutex(&lock->wait_lock, flags);
> -
> -	debug_mutex_free_waiter(&waiter);
> +done:
>  	preempt_enable();
> -
>  	return 0;
>  
>  err:
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/



^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v2] mutex: do not unnecessarily deal with waiters
  2013-06-28  5:53     ` Maarten Lankhorst
  2013-06-28 19:29       ` Davidlohr Bueso
@ 2013-06-28 20:13       ` Davidlohr Bueso
  2013-06-28 20:53         ` Rik van Riel
                           ` (3 more replies)
  1 sibling, 4 replies; 12+ messages in thread
From: Davidlohr Bueso @ 2013-06-28 20:13 UTC (permalink / raw)
  To: Maarten Lankhorst
  Cc: Ingo Molnar, Rik van Riel, LKML, Peter Zijlstra, Thomas Gleixner

From: Davidlohr Bueso <davidlohr.bueso@hp.com>

Upon entering the slowpath, we immediately attempt to acquire the lock
by checking if it is already unlocked. If we are lucky enough that this
is the case, then we don't need to deal with any waiter related logic.

Furthermore any checks for an empty wait_list are unnecessary as we
already know that count is non-negative and hence no one is waiting for
the lock.

Move the count check and xchg calls to be done before any waiters are
setup - including waiter debugging. Upon failure to acquire the lock,
the xchg sets the counter to 0, instead of -1 as it was originally.
This can be done here since we set it back to -1 right at the beginning
of the loop so other waiters are woken up when the lock is released.

When tested on a 8-socket (80 core) system against a vanilla 3.10-rc1
kernel, this patch provides some small performance benefits (+2-6%).
While it could be considered in the noise level, the average percentages
were stable across multiple runs and no performance regressions were seen.
Two big winners, for small amounts of users (10-100), were the short and
compute workloads had a +19.36% and +%15.76% in jobs per minute.

Also change some break statements to 'goto slowpath', which IMO makes a
little more intuitive to read.

Signed-off-by: Davidlohr Bueso <davidlohr.bueso@hp.com>
---
v1->v2: Rebase on -tip, dealing with the new W/W mutexes.

 kernel/mutex.c | 41 ++++++++++++++++++-----------------------
 1 file changed, 18 insertions(+), 23 deletions(-)

diff --git a/kernel/mutex.c b/kernel/mutex.c
index e581ada..61cce1f 100644
--- a/kernel/mutex.c
+++ b/kernel/mutex.c
@@ -460,7 +460,7 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
 			 * performed the optimistic spinning cannot be done.
 			 */
 			if (ACCESS_ONCE(ww->ctx))
-				break;
+				goto slowpath;
 		}
 
 		/*
@@ -471,7 +471,7 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
 		owner = ACCESS_ONCE(lock->owner);
 		if (owner && !mutex_spin_on_owner(lock, owner)) {
 			mspin_unlock(MLOCK(lock), &node);
-			break;
+			goto slowpath;
 		}
 
 		if ((atomic_read(&lock->count) == 1) &&
@@ -486,8 +486,7 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
 
 			mutex_set_owner(lock);
 			mspin_unlock(MLOCK(lock), &node);
-			preempt_enable();
-			return 0;
+			goto done;
 		}
 		mspin_unlock(MLOCK(lock), &node);
 
@@ -498,7 +497,7 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
 		 * the owner complete.
 		 */
 		if (!owner && (need_resched() || rt_task(task)))
-			break;
+			goto slowpath;
 
 		/*
 		 * The cpu_relax() call is a compiler barrier which forces
@@ -512,6 +511,10 @@ slowpath:
 #endif
 	spin_lock_mutex(&lock->wait_lock, flags);
 
+	/* once more, can we acquire the lock? */
+	if (MUTEX_SHOW_NO_WAITER(lock) && (atomic_xchg(&lock->count, 0) == 1))
+		goto skip_wait;
+
 	debug_mutex_lock_common(lock, &waiter);
 	debug_mutex_add_waiter(lock, &waiter, task_thread_info(task));
 
@@ -519,9 +522,6 @@ slowpath:
 	list_add_tail(&waiter.list, &lock->wait_list);
 	waiter.task = task;
 
-	if (MUTEX_SHOW_NO_WAITER(lock) && (atomic_xchg(&lock->count, -1) == 1))
-		goto done;
-
 	lock_contended(&lock->dep_map, ip);
 
 	for (;;) {
@@ -535,7 +535,7 @@ slowpath:
 		 * other waiters:
 		 */
 		if (MUTEX_SHOW_NO_WAITER(lock) &&
-		   (atomic_xchg(&lock->count, -1) == 1))
+		    (atomic_xchg(&lock->count, -1) == 1))
 			break;
 
 		/*
@@ -560,24 +560,25 @@ slowpath:
 		schedule_preempt_disabled();
 		spin_lock_mutex(&lock->wait_lock, flags);
 	}
+	mutex_remove_waiter(lock, &waiter, current_thread_info());
+	/* set it to 0 if there are no waiters left: */
+	if (likely(list_empty(&lock->wait_list)))
+		atomic_set(&lock->count, 0);
+	debug_mutex_free_waiter(&waiter);
 
-done:
+skip_wait:
+	/* got the lock - cleanup and rejoice! */
 	lock_acquired(&lock->dep_map, ip);
-	/* got the lock - rejoice! */
-	mutex_remove_waiter(lock, &waiter, current_thread_info());
 	mutex_set_owner(lock);
 
 	if (!__builtin_constant_p(ww_ctx == NULL)) {
-		struct ww_mutex *ww = container_of(lock,
-						      struct ww_mutex,
-						      base);
+		struct ww_mutex *ww = container_of(lock, struct ww_mutex, base);
 		struct mutex_waiter *cur;
 
 		/*
 		 * This branch gets optimized out for the common case,
 		 * and is only important for ww_mutex_lock.
 		 */
-
 		ww_mutex_lock_acquired(ww, ww_ctx);
 		ww->ctx = ww_ctx;
 
@@ -591,15 +592,9 @@ done:
 		}
 	}
 
-	/* set it to 0 if there are no waiters left: */
-	if (likely(list_empty(&lock->wait_list)))
-		atomic_set(&lock->count, 0);
-
 	spin_unlock_mutex(&lock->wait_lock, flags);
-
-	debug_mutex_free_waiter(&waiter);
+done:
 	preempt_enable();
-
 	return 0;
 
 err:
-- 
1.7.11.7




^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH v2] mutex: do not unnecessarily deal with waiters
  2013-06-28 20:13       ` [PATCH v2] " Davidlohr Bueso
@ 2013-06-28 20:53         ` Rik van Riel
  2013-06-29  7:17         ` Maarten Lankhorst
                           ` (2 subsequent siblings)
  3 siblings, 0 replies; 12+ messages in thread
From: Rik van Riel @ 2013-06-28 20:53 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: Maarten Lankhorst, Ingo Molnar, LKML, Peter Zijlstra,
	Thomas Gleixner

On 06/28/2013 04:13 PM, Davidlohr Bueso wrote:
> From: Davidlohr Bueso <davidlohr.bueso@hp.com>
>
> Upon entering the slowpath, we immediately attempt to acquire the lock
> by checking if it is already unlocked. If we are lucky enough that this
> is the case, then we don't need to deal with any waiter related logic.
>
> Furthermore any checks for an empty wait_list are unnecessary as we
> already know that count is non-negative and hence no one is waiting for
> the lock.
>
> Move the count check and xchg calls to be done before any waiters are
> setup - including waiter debugging. Upon failure to acquire the lock,
> the xchg sets the counter to 0, instead of -1 as it was originally.
> This can be done here since we set it back to -1 right at the beginning
> of the loop so other waiters are woken up when the lock is released.
>
> When tested on a 8-socket (80 core) system against a vanilla 3.10-rc1
> kernel, this patch provides some small performance benefits (+2-6%).
> While it could be considered in the noise level, the average percentages
> were stable across multiple runs and no performance regressions were seen.
> Two big winners, for small amounts of users (10-100), were the short and
> compute workloads had a +19.36% and +%15.76% in jobs per minute.
>
> Also change some break statements to 'goto slowpath', which IMO makes a
> little more intuitive to read.
>
> Signed-off-by: Davidlohr Bueso <davidlohr.bueso@hp.com>

Acked-by: Rik van Riel <riel@redhat.com>


-- 
All rights reversed

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2] mutex: do not unnecessarily deal with waiters
  2013-06-28 20:13       ` [PATCH v2] " Davidlohr Bueso
  2013-06-28 20:53         ` Rik van Riel
@ 2013-06-29  7:17         ` Maarten Lankhorst
  2013-07-19 17:57         ` Davidlohr Bueso
  2013-07-24  3:55         ` [tip:core/locking] mutex: Do " tip-bot for Davidlohr Bueso
  3 siblings, 0 replies; 12+ messages in thread
From: Maarten Lankhorst @ 2013-06-29  7:17 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: Ingo Molnar, Rik van Riel, LKML, Peter Zijlstra, Thomas Gleixner

Op 28-06-13 22:13, Davidlohr Bueso schreef:
> From: Davidlohr Bueso <davidlohr.bueso@hp.com>
>
> Upon entering the slowpath, we immediately attempt to acquire the lock
> by checking if it is already unlocked. If we are lucky enough that this
> is the case, then we don't need to deal with any waiter related logic.
>
> Furthermore any checks for an empty wait_list are unnecessary as we
> already know that count is non-negative and hence no one is waiting for
> the lock.
>
> Move the count check and xchg calls to be done before any waiters are
> setup - including waiter debugging. Upon failure to acquire the lock,
> the xchg sets the counter to 0, instead of -1 as it was originally.
> This can be done here since we set it back to -1 right at the beginning
> of the loop so other waiters are woken up when the lock is released.
>
> When tested on a 8-socket (80 core) system against a vanilla 3.10-rc1
> kernel, this patch provides some small performance benefits (+2-6%).
> While it could be considered in the noise level, the average percentages
> were stable across multiple runs and no performance regressions were seen.
> Two big winners, for small amounts of users (10-100), were the short and
> compute workloads had a +19.36% and +%15.76% in jobs per minute.
>
> Also change some break statements to 'goto slowpath', which IMO makes a
> little more intuitive to read.
Nice turquoise bikeshed you built there. ;-)

Acked-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
> Signed-off-by: Davidlohr Bueso <davidlohr.bueso@hp.com>
> ---
> v1->v2: Rebase on -tip, dealing with the new W/W mutexes.
>
>  kernel/mutex.c | 41 ++++++++++++++++++-----------------------
>  1 file changed, 18 insertions(+), 23 deletions(-)
>
> diff --git a/kernel/mutex.c b/kernel/mutex.c
> index e581ada..61cce1f 100644
> --- a/kernel/mutex.c
> +++ b/kernel/mutex.c
> @@ -460,7 +460,7 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
>  			 * performed the optimistic spinning cannot be done.
>  			 */
>  			if (ACCESS_ONCE(ww->ctx))
> -				break;
> +				goto slowpath;
>  		}
>  
>  		/*
> @@ -471,7 +471,7 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
>  		owner = ACCESS_ONCE(lock->owner);
>  		if (owner && !mutex_spin_on_owner(lock, owner)) {
>  			mspin_unlock(MLOCK(lock), &node);
> -			break;
> +			goto slowpath;
>  		}
>  
>  		if ((atomic_read(&lock->count) == 1) &&
> @@ -486,8 +486,7 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
>  
>  			mutex_set_owner(lock);
>  			mspin_unlock(MLOCK(lock), &node);
> -			preempt_enable();
> -			return 0;
> +			goto done;
>  		}
>  		mspin_unlock(MLOCK(lock), &node);
>  
> @@ -498,7 +497,7 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
>  		 * the owner complete.
>  		 */
>  		if (!owner && (need_resched() || rt_task(task)))
> -			break;
> +			goto slowpath;
>  
>  		/*
>  		 * The cpu_relax() call is a compiler barrier which forces
> @@ -512,6 +511,10 @@ slowpath:
>  #endif
>  	spin_lock_mutex(&lock->wait_lock, flags);
>  
> +	/* once more, can we acquire the lock? */
> +	if (MUTEX_SHOW_NO_WAITER(lock) && (atomic_xchg(&lock->count, 0) == 1))
> +		goto skip_wait;
> +
>  	debug_mutex_lock_common(lock, &waiter);
>  	debug_mutex_add_waiter(lock, &waiter, task_thread_info(task));
>  
> @@ -519,9 +522,6 @@ slowpath:
>  	list_add_tail(&waiter.list, &lock->wait_list);
>  	waiter.task = task;
>  
> -	if (MUTEX_SHOW_NO_WAITER(lock) && (atomic_xchg(&lock->count, -1) == 1))
> -		goto done;
> -
>  	lock_contended(&lock->dep_map, ip);
>  
>  	for (;;) {
> @@ -535,7 +535,7 @@ slowpath:
>  		 * other waiters:
>  		 */
>  		if (MUTEX_SHOW_NO_WAITER(lock) &&
> -		   (atomic_xchg(&lock->count, -1) == 1))
> +		    (atomic_xchg(&lock->count, -1) == 1))
>  			break;
>  
>  		/*
> @@ -560,24 +560,25 @@ slowpath:
>  		schedule_preempt_disabled();
>  		spin_lock_mutex(&lock->wait_lock, flags);
>  	}
> +	mutex_remove_waiter(lock, &waiter, current_thread_info());
> +	/* set it to 0 if there are no waiters left: */
> +	if (likely(list_empty(&lock->wait_list)))
> +		atomic_set(&lock->count, 0);
> +	debug_mutex_free_waiter(&waiter);
>  
> -done:
> +skip_wait:
> +	/* got the lock - cleanup and rejoice! */
>  	lock_acquired(&lock->dep_map, ip);
> -	/* got the lock - rejoice! */
> -	mutex_remove_waiter(lock, &waiter, current_thread_info());
>  	mutex_set_owner(lock);
>  
>  	if (!__builtin_constant_p(ww_ctx == NULL)) {
> -		struct ww_mutex *ww = container_of(lock,
> -						      struct ww_mutex,
> -						      base);
> +		struct ww_mutex *ww = container_of(lock, struct ww_mutex, base);
>  		struct mutex_waiter *cur;
>  
>  		/*
>  		 * This branch gets optimized out for the common case,
>  		 * and is only important for ww_mutex_lock.
>  		 */
> -
>  		ww_mutex_lock_acquired(ww, ww_ctx);
>  		ww->ctx = ww_ctx;
>  
> @@ -591,15 +592,9 @@ done:
>  		}
>  	}
>  
> -	/* set it to 0 if there are no waiters left: */
> -	if (likely(list_empty(&lock->wait_list)))
> -		atomic_set(&lock->count, 0);
> -
>  	spin_unlock_mutex(&lock->wait_lock, flags);
> -
> -	debug_mutex_free_waiter(&waiter);
> +done:
>  	preempt_enable();
> -
>  	return 0;
>  
>  err:


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2] mutex: do not unnecessarily deal with waiters
  2013-06-28 20:13       ` [PATCH v2] " Davidlohr Bueso
  2013-06-28 20:53         ` Rik van Riel
  2013-06-29  7:17         ` Maarten Lankhorst
@ 2013-07-19 17:57         ` Davidlohr Bueso
  2013-07-24  3:55         ` [tip:core/locking] mutex: Do " tip-bot for Davidlohr Bueso
  3 siblings, 0 replies; 12+ messages in thread
From: Davidlohr Bueso @ 2013-07-19 17:57 UTC (permalink / raw)
  To: Maarten Lankhorst
  Cc: Ingo Molnar, Rik van Riel, LKML, Peter Zijlstra, Thomas Gleixner

Ingo, any chance of picking this up? Thanks!

On Fri, 2013-06-28 at 13:13 -0700, Davidlohr Bueso wrote:
> From: Davidlohr Bueso <davidlohr.bueso@hp.com>
> 
> Upon entering the slowpath, we immediately attempt to acquire the lock
> by checking if it is already unlocked. If we are lucky enough that this
> is the case, then we don't need to deal with any waiter related logic.
> 
> Furthermore any checks for an empty wait_list are unnecessary as we
> already know that count is non-negative and hence no one is waiting for
> the lock.
> 
> Move the count check and xchg calls to be done before any waiters are
> setup - including waiter debugging. Upon failure to acquire the lock,
> the xchg sets the counter to 0, instead of -1 as it was originally.
> This can be done here since we set it back to -1 right at the beginning
> of the loop so other waiters are woken up when the lock is released.
> 
> When tested on a 8-socket (80 core) system against a vanilla 3.10-rc1
> kernel, this patch provides some small performance benefits (+2-6%).
> While it could be considered in the noise level, the average percentages
> were stable across multiple runs and no performance regressions were seen.
> Two big winners, for small amounts of users (10-100), were the short and
> compute workloads had a +19.36% and +%15.76% in jobs per minute.
> 
> Also change some break statements to 'goto slowpath', which IMO makes a
> little more intuitive to read.
> 
> Signed-off-by: Davidlohr Bueso <davidlohr.bueso@hp.com>
> ---
> v1->v2: Rebase on -tip, dealing with the new W/W mutexes.
> 
>  kernel/mutex.c | 41 ++++++++++++++++++-----------------------
>  1 file changed, 18 insertions(+), 23 deletions(-)
> 
> diff --git a/kernel/mutex.c b/kernel/mutex.c
> index e581ada..61cce1f 100644
> --- a/kernel/mutex.c
> +++ b/kernel/mutex.c
> @@ -460,7 +460,7 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
>  			 * performed the optimistic spinning cannot be done.
>  			 */
>  			if (ACCESS_ONCE(ww->ctx))
> -				break;
> +				goto slowpath;
>  		}
>  
>  		/*
> @@ -471,7 +471,7 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
>  		owner = ACCESS_ONCE(lock->owner);
>  		if (owner && !mutex_spin_on_owner(lock, owner)) {
>  			mspin_unlock(MLOCK(lock), &node);
> -			break;
> +			goto slowpath;
>  		}
>  
>  		if ((atomic_read(&lock->count) == 1) &&
> @@ -486,8 +486,7 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
>  
>  			mutex_set_owner(lock);
>  			mspin_unlock(MLOCK(lock), &node);
> -			preempt_enable();
> -			return 0;
> +			goto done;
>  		}
>  		mspin_unlock(MLOCK(lock), &node);
>  
> @@ -498,7 +497,7 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
>  		 * the owner complete.
>  		 */
>  		if (!owner && (need_resched() || rt_task(task)))
> -			break;
> +			goto slowpath;
>  
>  		/*
>  		 * The cpu_relax() call is a compiler barrier which forces
> @@ -512,6 +511,10 @@ slowpath:
>  #endif
>  	spin_lock_mutex(&lock->wait_lock, flags);
>  
> +	/* once more, can we acquire the lock? */
> +	if (MUTEX_SHOW_NO_WAITER(lock) && (atomic_xchg(&lock->count, 0) == 1))
> +		goto skip_wait;
> +
>  	debug_mutex_lock_common(lock, &waiter);
>  	debug_mutex_add_waiter(lock, &waiter, task_thread_info(task));
>  
> @@ -519,9 +522,6 @@ slowpath:
>  	list_add_tail(&waiter.list, &lock->wait_list);
>  	waiter.task = task;
>  
> -	if (MUTEX_SHOW_NO_WAITER(lock) && (atomic_xchg(&lock->count, -1) == 1))
> -		goto done;
> -
>  	lock_contended(&lock->dep_map, ip);
>  
>  	for (;;) {
> @@ -535,7 +535,7 @@ slowpath:
>  		 * other waiters:
>  		 */
>  		if (MUTEX_SHOW_NO_WAITER(lock) &&
> -		   (atomic_xchg(&lock->count, -1) == 1))
> +		    (atomic_xchg(&lock->count, -1) == 1))
>  			break;
>  
>  		/*
> @@ -560,24 +560,25 @@ slowpath:
>  		schedule_preempt_disabled();
>  		spin_lock_mutex(&lock->wait_lock, flags);
>  	}
> +	mutex_remove_waiter(lock, &waiter, current_thread_info());
> +	/* set it to 0 if there are no waiters left: */
> +	if (likely(list_empty(&lock->wait_list)))
> +		atomic_set(&lock->count, 0);
> +	debug_mutex_free_waiter(&waiter);
>  
> -done:
> +skip_wait:
> +	/* got the lock - cleanup and rejoice! */
>  	lock_acquired(&lock->dep_map, ip);
> -	/* got the lock - rejoice! */
> -	mutex_remove_waiter(lock, &waiter, current_thread_info());
>  	mutex_set_owner(lock);
>  
>  	if (!__builtin_constant_p(ww_ctx == NULL)) {
> -		struct ww_mutex *ww = container_of(lock,
> -						      struct ww_mutex,
> -						      base);
> +		struct ww_mutex *ww = container_of(lock, struct ww_mutex, base);
>  		struct mutex_waiter *cur;
>  
>  		/*
>  		 * This branch gets optimized out for the common case,
>  		 * and is only important for ww_mutex_lock.
>  		 */
> -
>  		ww_mutex_lock_acquired(ww, ww_ctx);
>  		ww->ctx = ww_ctx;
>  
> @@ -591,15 +592,9 @@ done:
>  		}
>  	}
>  
> -	/* set it to 0 if there are no waiters left: */
> -	if (likely(list_empty(&lock->wait_list)))
> -		atomic_set(&lock->count, 0);
> -
>  	spin_unlock_mutex(&lock->wait_lock, flags);
> -
> -	debug_mutex_free_waiter(&waiter);
> +done:
>  	preempt_enable();
> -
>  	return 0;
>  
>  err:



^ permalink raw reply	[flat|nested] 12+ messages in thread

* [tip:core/locking] mutex: Do not unnecessarily deal with waiters
  2013-06-28 20:13       ` [PATCH v2] " Davidlohr Bueso
                           ` (2 preceding siblings ...)
  2013-07-19 17:57         ` Davidlohr Bueso
@ 2013-07-24  3:55         ` tip-bot for Davidlohr Bueso
  3 siblings, 0 replies; 12+ messages in thread
From: tip-bot for Davidlohr Bueso @ 2013-07-24  3:55 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, hpa, mingo, davidlohr.bueso, a.p.zijlstra, riel,
	tglx, maarten.lankhorst

Commit-ID:  ec83f425dbca47e19c6737e8e7db0d0924a5de1b
Gitweb:     http://git.kernel.org/tip/ec83f425dbca47e19c6737e8e7db0d0924a5de1b
Author:     Davidlohr Bueso <davidlohr.bueso@hp.com>
AuthorDate: Fri, 28 Jun 2013 13:13:18 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 23 Jul 2013 11:48:37 +0200

mutex: Do not unnecessarily deal with waiters

Upon entering the slowpath, we immediately attempt to acquire
the lock by checking if it is already unlocked. If we are lucky
enough that this is the case, then we don't need to deal with
any waiter related logic.

Furthermore any checks for an empty wait_list are unnecessary as
we already know that count is non-negative and hence no one is
waiting for the lock.

Move the count check and xchg calls to be done before any
waiters are setup - including waiter debugging. Upon failure to
acquire the lock, the xchg sets the counter to 0, instead of -1
as it was originally. This can be done here since we set it back
to -1 right at the beginning of the loop so other waiters are
woken up when the lock is released.

When tested on a 8-socket (80 core) system against a vanilla
3.10-rc1 kernel, this patch provides some small performance
benefits (+2-6%). While it could be considered in the noise
level, the average percentages were stable across multiple runs
and no performance regressions were seen. Two big winners, for
small amounts of users (10-100), were the short and compute
workloads had a +19.36% and +%15.76% in jobs per minute.

Also change some break statements to 'goto slowpath', which IMO
makes a little more intuitive to read.

Signed-off-by: Davidlohr Bueso <davidlohr.bueso@hp.com>
Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1372450398.2106.1.camel@buesod1.americas.hpqcorp.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/mutex.c | 41 ++++++++++++++++++-----------------------
 1 file changed, 18 insertions(+), 23 deletions(-)

diff --git a/kernel/mutex.c b/kernel/mutex.c
index 7ff48c5..386ad5d 100644
--- a/kernel/mutex.c
+++ b/kernel/mutex.c
@@ -463,7 +463,7 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
 			 * performed the optimistic spinning cannot be done.
 			 */
 			if (ACCESS_ONCE(ww->ctx))
-				break;
+				goto slowpath;
 		}
 
 		/*
@@ -474,7 +474,7 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
 		owner = ACCESS_ONCE(lock->owner);
 		if (owner && !mutex_spin_on_owner(lock, owner)) {
 			mspin_unlock(MLOCK(lock), &node);
-			break;
+			goto slowpath;
 		}
 
 		if ((atomic_read(&lock->count) == 1) &&
@@ -489,8 +489,7 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
 
 			mutex_set_owner(lock);
 			mspin_unlock(MLOCK(lock), &node);
-			preempt_enable();
-			return 0;
+			goto done;
 		}
 		mspin_unlock(MLOCK(lock), &node);
 
@@ -501,7 +500,7 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
 		 * the owner complete.
 		 */
 		if (!owner && (need_resched() || rt_task(task)))
-			break;
+			goto slowpath;
 
 		/*
 		 * The cpu_relax() call is a compiler barrier which forces
@@ -515,6 +514,10 @@ slowpath:
 #endif
 	spin_lock_mutex(&lock->wait_lock, flags);
 
+	/* once more, can we acquire the lock? */
+	if (MUTEX_SHOW_NO_WAITER(lock) && (atomic_xchg(&lock->count, 0) == 1))
+		goto skip_wait;
+
 	debug_mutex_lock_common(lock, &waiter);
 	debug_mutex_add_waiter(lock, &waiter, task_thread_info(task));
 
@@ -522,9 +525,6 @@ slowpath:
 	list_add_tail(&waiter.list, &lock->wait_list);
 	waiter.task = task;
 
-	if (MUTEX_SHOW_NO_WAITER(lock) && (atomic_xchg(&lock->count, -1) == 1))
-		goto done;
-
 	lock_contended(&lock->dep_map, ip);
 
 	for (;;) {
@@ -538,7 +538,7 @@ slowpath:
 		 * other waiters:
 		 */
 		if (MUTEX_SHOW_NO_WAITER(lock) &&
-		   (atomic_xchg(&lock->count, -1) == 1))
+		    (atomic_xchg(&lock->count, -1) == 1))
 			break;
 
 		/*
@@ -563,24 +563,25 @@ slowpath:
 		schedule_preempt_disabled();
 		spin_lock_mutex(&lock->wait_lock, flags);
 	}
+	mutex_remove_waiter(lock, &waiter, current_thread_info());
+	/* set it to 0 if there are no waiters left: */
+	if (likely(list_empty(&lock->wait_list)))
+		atomic_set(&lock->count, 0);
+	debug_mutex_free_waiter(&waiter);
 
-done:
+skip_wait:
+	/* got the lock - cleanup and rejoice! */
 	lock_acquired(&lock->dep_map, ip);
-	/* got the lock - rejoice! */
-	mutex_remove_waiter(lock, &waiter, current_thread_info());
 	mutex_set_owner(lock);
 
 	if (!__builtin_constant_p(ww_ctx == NULL)) {
-		struct ww_mutex *ww = container_of(lock,
-						      struct ww_mutex,
-						      base);
+		struct ww_mutex *ww = container_of(lock, struct ww_mutex, base);
 		struct mutex_waiter *cur;
 
 		/*
 		 * This branch gets optimized out for the common case,
 		 * and is only important for ww_mutex_lock.
 		 */
-
 		ww_mutex_lock_acquired(ww, ww_ctx);
 		ww->ctx = ww_ctx;
 
@@ -594,15 +595,9 @@ done:
 		}
 	}
 
-	/* set it to 0 if there are no waiters left: */
-	if (likely(list_empty(&lock->wait_list)))
-		atomic_set(&lock->count, 0);
-
 	spin_unlock_mutex(&lock->wait_lock, flags);
-
-	debug_mutex_free_waiter(&waiter);
+done:
 	preempt_enable();
-
 	return 0;
 
 err:

^ permalink raw reply related	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2013-07-24  3:56 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-05-23 23:59 [PATCH] mutex: do not unnecessarily deal with waiters Davidlohr Bueso
2013-05-31  1:12 ` Davidlohr Bueso
2013-06-26 17:49   ` Davidlohr Bueso
2013-06-27  9:00 ` Ingo Molnar
2013-06-28  1:32   ` Davidlohr Bueso
2013-06-28  5:53     ` Maarten Lankhorst
2013-06-28 19:29       ` Davidlohr Bueso
2013-06-28 20:13       ` [PATCH v2] " Davidlohr Bueso
2013-06-28 20:53         ` Rik van Riel
2013-06-29  7:17         ` Maarten Lankhorst
2013-07-19 17:57         ` Davidlohr Bueso
2013-07-24  3:55         ` [tip:core/locking] mutex: Do " tip-bot for Davidlohr Bueso

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox