All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andrea Parri <andrea.parri@amarulasolutions.com>
To: Waiman Long <longman@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	linux-kernel@vger.kernel.org, Dave Chinner <david@fromorbit.com>,
	Eric Sandeen <sandeen@redhat.com>,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Subject: Re: [PATCH] locking/rwsem: Synchronize task state & waiter->task of readers
Date: Mon, 23 Apr 2018 22:55:14 +0200	[thread overview]
Message-ID: <20180423205514.GA5876@andrea> (raw)
In-Reply-To: <6c112ecb-d662-b1fc-152a-32060ec46dae@redhat.com>

Hi Waiman,

On Mon, Apr 23, 2018 at 12:46:12PM -0400, Waiman Long wrote:
> On 04/10/2018 01:22 PM, Waiman Long wrote:
> > It was observed occasionally in PowerPC systems that there was reader
> > who had not been woken up but that its waiter->task had been cleared.

Can you provide more details about these observations?  (links to LKML
posts, traces, applications used/micro-benchmarks, ...)


> >
> > One probable cause of this missed wakeup may be the fact that the
> > waiter->task and the task state have not been properly synchronized as
> > the lock release-acquire pair of different locks in the wakeup code path
> > does not provide a full memory barrier guarantee.

I guess that by the "pair of different locks" you mean (sem->wait_lock,
p->pi_lock), right?  BTW, __rwsem_down_write_failed_common() is calling
wake_up_q() _before_ releasing the wait_lock: did you intend to exclude
this callsite? (why?)


> So smp_store_mb()
> > is now used to set waiter->task to NULL to provide a proper memory
> > barrier for synchronization.

Mmh; the patch is not introducing an smp_store_mb()... My guess is that
you are thinking at the sequence:

	smp_store_release(&waiter->task, NULL);
	[...]
	smp_mb(); /* added with your patch */

or what am I missing?


> >
> > Signed-off-by: Waiman Long <longman@redhat.com>
> > ---
> >  kernel/locking/rwsem-xadd.c | 17 +++++++++++++++++
> >  1 file changed, 17 insertions(+)
> >
> > diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c
> > index e795908..b3c588c 100644
> > --- a/kernel/locking/rwsem-xadd.c
> > +++ b/kernel/locking/rwsem-xadd.c
> > @@ -209,6 +209,23 @@ static void __rwsem_mark_wake(struct rw_semaphore *sem,
> >  		smp_store_release(&waiter->task, NULL);
> >  	}
> >  
> > +	/*
> > +	 * To avoid missed wakeup of reader, we need to make sure
> > +	 * that task state and waiter->task are properly synchronized.
> > +	 *
> > +	 *     wakeup		      sleep
> > +	 *     ------		      -----
> > +	 * __rwsem_mark_wake:	rwsem_down_read_failed*:
> > +	 *   [S] waiter->task	  [S] set_current_state(state)
> > +	 *	 MB		      MB
> > +	 * try_to_wake_up:
> > +	 *   [L] state		  [L] waiter->task
> > +	 *
> > +	 * For the wakeup path, the original lock release-acquire pair
> > +	 * does not provide enough guarantee of proper synchronization.
> > +	 */
> > +	smp_mb();
> > +
> >  	adjustment = woken * RWSEM_ACTIVE_READ_BIAS - adjustment;
> >  	if (list_empty(&sem->wait_list)) {
> >  		/* hit end of list above */
> 
> Ping!
> 
> Any thought on this patch?
> 
> I am wondering if there is a cheaper way to apply the memory barrier
> just on architectures that need it.

try_to_wake_up() does:

	raw_spin_lock_irqsave(&p->pi_lock, flags);
	smp_mb__after_spinlock();
	if (!(p->state & state))

My understanding is that this smp_mb__after_spinlock() provides us with
the guarantee you described above.  The smp_mb__after_spinlock() should
represent a 'cheaper way' to provide such a guarantee.

If this understanding is correct, the remaining question would be about
whether you want to rely on (and document) the smp_mb__after_spinlock()
in the callsite in question (the comment in wake_up_q()

   /*
    * wake_up_process() implies a wmb() to pair with the queueing
    * in wake_q_add() so as not to miss wakeups.
    */

does not appear to be suffient...).

  Andrea


> 
> Cheers,
> Longman
> 

  reply	other threads:[~2018-04-23 20:55 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-04-10 17:22 [PATCH] locking/rwsem: Synchronize task state & waiter->task of readers Waiman Long
2018-04-18  6:20 ` Benjamin Herrenschmidt
2018-04-23 16:46 ` Waiman Long
2018-04-23 20:55   ` Andrea Parri [this message]
2018-04-23 21:30     ` Waiman Long
2018-04-24  9:15     ` Peter Zijlstra
2018-04-24 14:49       ` Waiman Long

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180423205514.GA5876@andrea \
    --to=andrea.parri@amarulasolutions.com \
    --cc=david@fromorbit.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=longman@redhat.com \
    --cc=mingo@redhat.com \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=peterz@infradead.org \
    --cc=sandeen@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.