From: Andrea Parri <andrea.parri@amarulasolutions.com>
To: Waiman Long <longman@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	linux-kernel@vger.kernel.org, Dave Chinner <david@fromorbit.com>,
	Eric Sandeen <sandeen@redhat.com>,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Subject: Re: [PATCH] locking/rwsem: Synchronize task state & waiter->task of readers
Date: Mon, 23 Apr 2018 22:55:14 +0200
Message-ID: <20180423205514.GA5876@andrea>
In-Reply-To: <6c112ecb-d662-b1fc-152a-32060ec46dae@redhat.com>

Hi Waiman,

On Mon, Apr 23, 2018 at 12:46:12PM -0400, Waiman Long wrote:
> On 04/10/2018 01:22 PM, Waiman Long wrote:
> > It was observed occasionally in PowerPC systems that there was reader
> > who had not been woken up but that its waiter->task had been cleared.

Can you provide more details about these observations?  (links to LKML
posts, traces, applications used/micro-benchmarks, ...)


> >
> > One probable cause of this missed wakeup may be the fact that the
> > waiter->task and the task state have not been properly synchronized as
> > the lock release-acquire pair of different locks in the wakeup code path
> > does not provide a full memory barrier guarantee.

I guess that by the "pair of different locks" you mean (sem->wait_lock,
p->pi_lock), right?  BTW, __rwsem_down_write_failed_common() is calling
wake_up_q() _before_ releasing the wait_lock: did you intend to exclude
this callsite? (why?)


> > So smp_store_mb()
> > is now used to set waiter->task to NULL to provide a proper memory
> > barrier for synchronization.

Mmh; the patch is not introducing an smp_store_mb()... My guess is that
you are thinking of the sequence:

	smp_store_release(&waiter->task, NULL);
	[...]
	smp_mb(); /* added with your patch */

or what am I missing?


> >
> > Signed-off-by: Waiman Long <longman@redhat.com>
> > ---
> >  kernel/locking/rwsem-xadd.c | 17 +++++++++++++++++
> >  1 file changed, 17 insertions(+)
> >
> > diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c
> > index e795908..b3c588c 100644
> > --- a/kernel/locking/rwsem-xadd.c
> > +++ b/kernel/locking/rwsem-xadd.c
> > @@ -209,6 +209,23 @@ static void __rwsem_mark_wake(struct rw_semaphore *sem,
> >  		smp_store_release(&waiter->task, NULL);
> >  	}
> >  
> > +	/*
> > +	 * To avoid missed wakeup of reader, we need to make sure
> > +	 * that task state and waiter->task are properly synchronized.
> > +	 *
> > +	 *     wakeup		      sleep
> > +	 *     ------		      -----
> > +	 * __rwsem_mark_wake:	rwsem_down_read_failed*:
> > +	 *   [S] waiter->task	  [S] set_current_state(state)
> > +	 *	 MB		      MB
> > +	 * try_to_wake_up:
> > +	 *   [L] state		  [L] waiter->task
> > +	 *
> > +	 * For the wakeup path, the original lock release-acquire pair
> > +	 * does not provide enough guarantee of proper synchronization.
> > +	 */
> > +	smp_mb();
> > +
> >  	adjustment = woken * RWSEM_ACTIVE_READ_BIAS - adjustment;
> >  	if (list_empty(&sem->wait_list)) {
> >  		/* hit end of list above */
> 
> Ping!
> 
> Any thought on this patch?
> 
> I am wondering if there is a cheaper way to apply the memory barrier
> just on architectures that need it.

try_to_wake_up() does:

	raw_spin_lock_irqsave(&p->pi_lock, flags);
	smp_mb__after_spinlock();
	if (!(p->state & state))

My understanding is that this smp_mb__after_spinlock() provides us with
the guarantee you described above.  The smp_mb__after_spinlock() should
represent a 'cheaper way' to provide such a guarantee.
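
For illustration only, here is a userspace sketch of the store-buffering
pattern from the diagram quoted above, written with C11 atomics and
pthreads rather than kernel primitives.  The names waiter_task,
task_state and count_missed_wakeups() are hypothetical stand-ins, and
the seq_cst fences merely play the role of the full barriers under
discussion (smp_mb() or smp_mb__after_spinlock() on the wakeup side,
the barrier implied by set_current_state() on the sleep side); this is
not how the rwsem code is written.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for waiter->task and the task state. */
static atomic_int waiter_task;	/* 1 = still queued, 0 = cleared by the waker */
static atomic_int task_state;	/* 0 = running, 1 = about to sleep */

static int waker_saw_state;	/* value the waker loaded from task_state */
static int sleeper_saw_task;	/* value the sleeper loaded from waiter_task */

/* Wakeup side: [S] waiter->task, MB, [L] state. */
static void *waker(void *unused)
{
	(void)unused;
	atomic_store_explicit(&waiter_task, 0, memory_order_release);
	atomic_thread_fence(memory_order_seq_cst);	/* full barrier */
	waker_saw_state = atomic_load_explicit(&task_state,
					       memory_order_relaxed);
	return NULL;
}

/* Sleep side: [S] state, MB, [L] waiter->task. */
static void *sleeper(void *unused)
{
	(void)unused;
	atomic_store_explicit(&task_state, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);	/* full barrier */
	sleeper_saw_task = atomic_load_explicit(&waiter_task,
						memory_order_relaxed);
	return NULL;
}

/* Run the two sides concurrently and count trials with a missed wakeup. */
int count_missed_wakeups(int trials)
{
	int missed = 0;

	for (int i = 0; i < trials; i++) {
		pthread_t a, b;

		atomic_store(&waiter_task, 1);
		atomic_store(&task_state, 0);
		pthread_create(&a, NULL, waker, NULL);
		pthread_create(&b, NULL, sleeper, NULL);
		pthread_join(a, NULL);
		pthread_join(b, NULL);

		/* Missed wakeup: waker saw "running", sleeper saw "queued". */
		if (waker_saw_state == 0 && sleeper_saw_task == 1)
			missed++;
	}
	return missed;
}
```

With a full fence on both sides, the C11 memory model forbids the
outcome counted above, so count_missed_wakeups() should return 0.
Weakening either fence (for instance, keeping only the release store on
the wakeup side) makes that outcome permitted, which is the situation
the patch description worries about.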

If this understanding is correct, the remaining question would be about
whether you want to rely on (and document) the smp_mb__after_spinlock()
in the callsite in question (the comment in wake_up_q()

   /*
    * wake_up_process() implies a wmb() to pair with the queueing
    * in wake_q_add() so as not to miss wakeups.
    */

does not appear to be sufficient...).

  Andrea


> 
> Cheers,
> Longman
> 
