From: Peter Zijlstra <peterz@infradead.org>
To: Waiman Long <Waiman.Long@hpe.com>
Cc: Ingo Molnar <mingo@redhat.com>,
	linux-kernel@vger.kernel.org,
	Scott J Norton <scott.norton@hpe.com>,
	Douglas Hatch <doug.hatch@hpe.com>,
	Will Deacon <will.deacon@arm.com>
Subject: Re: [PATCH] locking/qrwlock: Allow multiple spinning readers
Date: Tue, 29 Mar 2016 22:20:50 +0200
Message-ID: <20160329202050.GN3408@twins.programming.kicks-ass.net>
In-Reply-To: <1458444079-59601-1-git-send-email-Waiman.Long@hpe.com>

On Sat, Mar 19, 2016 at 11:21:19PM -0400, Waiman Long wrote:
> In qrwlock, the reader that is spinning on the lock will need to notify
> the next reader in the queue when the lock is free. That introduces a
> reader-to-reader latency that is not present in the original rwlock.

How did you find this 'problem'?

> That is the price for reducing lock cacheline contention. It also
> reduces the performance benefit of qrwlock on reader heavy workloads.
> 
> However, if we allow a limited number of readers to spin on the
> lock simultaneously, we can eliminate some of the reader-to-reader
> latencies at the expense of a bit more cacheline contention and
> probably more power consumption.

So the embedded people might not like that much.

> This patch changes the reader slowpath to allow multiple readers to
> spin on the lock. The maximum number of concurrent readers allowed
> is currently set to 4 to limit the amount of additional cacheline
> contention while improving reader performance on most workloads. If
> a writer comes to the queue head, however, it will stop additional
> readers from coming out.
> 
> Using a multi-threaded locking microbenchmark on a 4-socket 40-core
> Haswell-EX system, the locking throughput of a 4.5-rc6 kernel with and
> without the patch was as follows:

Do you have an actual real world benchmark where this makes a
difference?

>  /**
>   * queued_read_lock_slowpath - acquire read lock of a queue rwlock
>   * @lock: Pointer to queue rwlock structure
>   * @cnts: Current qrwlock lock value
>   */
>  void queued_read_lock_slowpath(struct qrwlock *lock, u32 cnts)
>  {
> +	bool locked = true;
> +
>  	/*
>  	 * Readers come here when they cannot get the lock without waiting
>  	 */
> @@ -78,7 +71,10 @@ void queued_read_lock_slowpath(struct qrwlock *lock, u32 cnts)
>  		 * semantics) until the lock is available without waiting in
>  		 * the queue.
>  		 */
> +		while ((cnts & _QW_WMASK) == _QW_LOCKED) {
> +			cpu_relax_lowlatency();
> +			cnts = atomic_read_acquire(&lock->cnts);
> +		}
>  		return;
>  	}
>  	atomic_sub(_QR_BIAS, &lock->cnts);
> @@ -92,14 +88,31 @@ void queued_read_lock_slowpath(struct qrwlock *lock, u32 cnts)
>  	 * The ACQUIRE semantics of the following spinning code ensure
>  	 * that accesses can't leak upwards out of our subsequent critical
>  	 * section in the case that the lock is currently held for write.
> +	 *
> +	 * The reader increments the reader count & wait until the writer
> +	 * releases the lock.
>  	 */
>  	cnts = atomic_add_return_acquire(_QR_BIAS, &lock->cnts) - _QR_BIAS;
> +	while ((cnts & _QW_WMASK) == _QW_LOCKED) {
> +		if (locked && ((cnts >> _QR_SHIFT) < MAX_SPINNING_READERS)) {
> +			/*
> +			 * Unlock the wait queue so that more readers can
> +			 * come forward and waiting for the writer to exit
> +			 * as long as no more than MAX_SPINNING_READERS
> +			 * readers are present.
> +			 */
> +			arch_spin_unlock(&lock->wait_lock);
> +			locked = false;

Only 1 more can come forward with this logic. How can you ever get to 4?

Also, what says the next in queue is a reader?

> +		}
> +		cpu_relax_lowlatency();
> +		cnts = atomic_read_acquire(&lock->cnts);
> +	}
>  
>  	/*
>  	 * Signal the next one in queue to become queue head
>  	 */
> +	if (locked)
> +		arch_spin_unlock(&lock->wait_lock);
>  }
>  EXPORT_SYMBOL(queued_read_lock_slowpath);
>  
> -- 
> 1.7.1
> 
