Re: [PATCH linux-2.6 01/04] brsem: implement big reader semaphore

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Tejun Heo <htejun@gmail.com>
To: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: zwane@linuxpower.ca, viro@zeniv.linux.org.uk,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH linux-2.6 01/04] brsem: implement big reader semaphore
Date: Sun, 25 Sep 2005 17:53:56 +0900	[thread overview]
Message-ID: <433665A4.6010400@gmail.com> (raw)
In-Reply-To: <43365F82.1040801@yahoo.com.au>


  Hello, Nick.

Nick Piggin wrote:
> Tejun Heo wrote:
> 
>>
>>  As I've said in the other reply, I'll first rip off three stage init 
>> thing for cpucontrol.  That added pretty much complexity to it.  And 
>> with the weird naming scheme, please tell me how to fix it.  I have no 
>> problem renaming things.
>>
> 
> OK, my criticism of your naming was not constructive so I apologise
> for that. I willll help you with some of those minor issues if we
> establish that your overall design is a a goer.
> 

  Thanks. :-)

> 
>>> What would be wrong with an array of NR_CPUS rwsems? The only
>>> tiny trick you would have to do AFAIKS is have up_read remember
>>> what rwsem down_read took, but that could be returned from
>>> down_read as a token.
>>
>>
>>
>>  Using array of rwsems means that every reader-side ops performs 
>> (unnecessary) real down and up operations on the semaphore which 
>> involve atomic memory op and probably memory barrier.  These are 
>> pretty expensive operations.
>>
>>  What brsem tries to do is implementing rwsem semantics while 
>> performing only normal (as opposed to atomic/barrier) intstructions 
>> during reader-side operations.  That's why all the call_on_all_cpus 
>> stuff is needed while write-locking.  Do you think avoiding 
>> atomic/barrier stuff would be an overkill?
>>
> 
> Yes I think so. I think the main problem on modern CPUs is
> not atomic operations as such, but cacheline bouncing.
> 
> Without any numbers, I'd guess that your down_read is more
> expensive than mine simply due to touching more cachelines
> and having more branches.

  Other than local_bh_disable/enable(), all brsem read ops do are

  1. accessing sem->idx
  2. calculate per-cpu rcnt address from sem->idx
  3. do one branch on the value of per-cpu rcnt
  4. inc/dec per-cpu rcnt

  So, it does access one more cachline and, yeap, there is one branch 
for bh enabling and several more inside local_bh_enable.  I'll try to 
get some benchmark numbers for comparison.

  I'm thinking about adding down_read(&xxx->s_umount) to write(2) and 
compare normal rwsem and brsem performance by repeitively writing short 
data into a file on a UP machine.  Do you have better ideas?

> 
> The other thing is simply that you really want your
> synchronization primitives to be as simple and verifiable
> as possible. For example rwsems even recently have had subtle
> memory ordering and other implemntation corner cases, and
> they're much less complex than this brsem.
> 
> Nick
> 

  Thanks.

-- 
tejun

next prev parent reply	other threads:[~2005-09-25  8:54 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2005-09-25  6:43 [PATCH linux-2.6 00/04] brsem: [RFC] big reader semaphore Tejun Heo
2005-09-25  6:43 ` [PATCH linux-2.6 01/04] brsem: implement " Tejun Heo
2005-09-25  7:19   ` Nick Piggin
2005-09-25  8:03     ` Nick Piggin
2005-09-25  8:11     ` Tejun Heo
2005-09-25  8:27       ` Nick Piggin
2005-09-25  8:53         ` Tejun Heo [this message]
2005-09-25  9:24           ` Nick Piggin
2005-09-25 10:05             ` Tejun Heo
2005-09-25 11:22               ` Nick Piggin
2005-09-25  6:43 ` [PATCH linux-2.6 02/04] brsem: convert super_block->s_umount to brsem Tejun Heo
2005-09-25  6:43 ` [PATCH linux-2.6 03/04] brsem: fix ro-remount <-> open race condition Tejun Heo
2005-09-25  6:43 ` [PATCH linux-2.6 04/04] brsem: convert cpucontrol to brsem Tejun Heo
2005-09-25  7:39   ` Nick Piggin
2005-09-25  8:03     ` Tejun Heo
2005-09-25 23:46     ` Nathan Lynch
2005-09-26  1:11       ` Nick Piggin
2005-09-26  4:05         ` Tejun Heo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=433665A4.6010400@gmail.com \
    --to=htejun@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=nickpiggin@yahoo.com.au \
    --cc=viro@zeniv.linux.org.uk \
    --cc=zwane@linuxpower.ca \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.