All of lore.kernel.org
 help / color / mirror / Atom feed
From: Nick Piggin <nickpiggin@yahoo.com.au>
To: Arjan van de Ven <arjan@infradead.org>
Cc: Ingo Molnar <mingo@elte.hu>, Andrew Morton <akpm@osdl.org>,
	Dave Olson <olson@unixfolk.com>,
	ccb@acm.org, linux-kernel@vger.kernel.org,
	Peter Chubb <peter@chubb.wattle.id.au>
Subject: Re: [patch] increase spinlock-debug looping timeouts (write_lock and NMI)
Date: Tue, 20 Jun 2006 23:36:49 +1000	[thread overview]
Message-ID: <4497F9F1.8060708@yahoo.com.au> (raw)
In-Reply-To: <1150808692.2891.194.camel@laptopd505.fenrus.org>

Arjan van de Ven wrote:
>>Correct me if I'm wrong, but... a read-lock requires at most a single
>>cacheline transfer per lock acq and a single per release, no matter the
>>concurrency on the lock (so long as it is read only).
>>
>>A spinlock is going to take more. If the hardware perfectly round-robins
>>the cacheline, it will take lockers+1 transfers per lock+unlock.
> 
> 
> This is a bit too simplistic view; shared cachelines are cheap, it's
> getting the cacheline exclusive (or transitioning to/from exclusive)
> that is the expensive part...

Taking the lock is going to transiation the cacheline to exclusive. If
the next locker tries to take the lock, they transfer the cacheline and
exclusive access and fail. If they have already tried to take the lock
earlier, they might only request a readonly state, but it still requires
a cacheline transfer (which is the expensive part).

The only way it is simplistic is that hardware will be unfair and give
the same, or "close" requesters priority for some time, so the cacheline
stays close.

At some point, when it gets transferred away, there is no guarantee that
the spinlock will be unlocked. Quite likely the opposite, if there is
large contention for it and/or its cacheline.

> 
> (note that our spinlocks are fixed nowadays to only do the slowpath side
> of things for read, eg allow shared cachelines there)

To put it another way, when 1 CPU takes or releases the lock, the cachelines
of 11 others are invalidated. In a perfect round-robin, if 12 queue up at the
same time, 1 will go through and 11 will fail (= 12 cacheline transfers). So
in this situation, the reader lock has a factor of 12 better acquisition
throughput.

Now the situation is simplistic (all queueing at the same time, perfectly
fair hardware), but the cacheline transfer costs are accurate *for this
situation*.

So I think rwlocks do have a fundamental advantage over spinlocks (aside
from the multiple concurrent readers advantage, although the two properties
are obviously fundamentally related). It is yet to be shown whether that is
actually the cause of Peter's performance improvement, but that is my
guess.

-- 
SUSE Labs, Novell Inc.
Send instant messages to your online friends http://au.messenger.yahoo.com 

  parent reply	other threads:[~2006-06-20 13:49 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <fa.VT2rwoX1M/2O/aO5crhlRDNx4YA@ifi.uio.no>
     [not found] ` <fa.Zp589GPrIISmAAheRowfRgZ1jgs@ifi.uio.no>
2006-06-20  5:35   ` [patch] increase spinlock-debug looping timeouts (write_lock and NMI) Dave Olson
2006-06-20  6:39     ` Andrew Morton
2006-06-20  6:53       ` Dave Jones
2006-06-20  7:37       ` Nick Piggin
2006-06-20  8:03         ` Andrew Morton
2006-06-20  8:33         ` Ingo Molnar
2006-06-20  9:37           ` Nick Piggin
2006-06-20  9:51             ` Ingo Molnar
2006-06-20 10:59               ` Nick Piggin
2006-06-20 13:04                 ` Arjan van de Ven
2006-06-20 13:28                   ` update pci device id cckuo
2006-06-20 14:06                     ` Arjan van de Ven
2006-06-20 13:36                   ` Nick Piggin [this message]
2006-06-20 14:53                     ` [patch] increase spinlock-debug looping timeouts (write_lock and NMI) Arjan van de Ven
2006-06-20 15:16                       ` Nick Piggin
2006-06-20 16:27                         ` Nick Piggin
2006-06-20  8:43         ` Arjan van de Ven
2006-06-20 16:11       ` Dave Olson
2006-06-20 21:10         ` Andrew Morton
2006-06-22  5:45 Dave Olson
2006-06-22  5:57 ` Andrew Morton
2006-06-23  7:57 ` Ingo Molnar
  -- strict thread matches above, loose matches on Subject: below --
2006-06-23 16:27 Dave Olson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4497F9F1.8060708@yahoo.com.au \
    --to=nickpiggin@yahoo.com.au \
    --cc=akpm@osdl.org \
    --cc=arjan@infradead.org \
    --cc=ccb@acm.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=olson@unixfolk.com \
    --cc=peter@chubb.wattle.id.au \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.