From: Andrea Arcangeli <andrea@suse.de>
To: "D . W . Howells" <dhowells@astarte.free-online.co.uk>
Cc: torvalds@transmeta.com, linux-kernel@vger.kernel.org,
dhowells@redhat.com, davem@redhat.com
Subject: rwsem benchmark [was Re: [PATCH] rw_semaphores, optimisations try #3]
Date: Tue, 24 Apr 2001 06:56:33 +0200 [thread overview]
Message-ID: <20010424065633.A16845@athlon.random> (raw)
In-Reply-To: <01042321353400.00801@orion.ddi.co.uk> <20010423233435.N19938@athlon.random>
In-Reply-To: <20010423233435.N19938@athlon.random>; from andrea@suse.de on Mon, Apr 23, 2001 at 11:34:35PM +0200
[-- Attachment #1: Type: text/plain, Size: 6546 bytes --]
On Mon, Apr 23, 2001 at 11:34:35PM +0200, Andrea Arcangeli wrote:
> On Mon, Apr 23, 2001 at 09:35:34PM +0100, D . W . Howells wrote:
> > This patch (made against linux-2.4.4-pre6) makes a number of changes to the
> > rwsem implementation:
> >
> > (1) Everything in try #2
> >
> > plus
> >
> > (2) Changes proposed by Linus for the generic semaphore code.
> >
> > (3) Ideas from Andrea and how he implemented his semaphores.
>
> I benchmarked try3 on top of pre6 and I get this:
>
> ----------------------------------------------
> RWSEM_GENERIC_SPINLOCK y in rwsem-2.4.4-pre6 + your latest #try3
>
> rw
>
> reads taken: 5842496
> writes taken: 3016649
> reads taken: 5823381
> writes taken: 3006773
>
> r1
>
> reads taken: 13309316
> reads taken: 13311722
>
> r2
>
> reads taken: 5010534
> reads taken: 5023185
>
> ro
>
> reads taken: 3850228
> reads taken: 3845954
>
> w1
>
> writes taken: 13012701
> writes taken: 13021716
>
> wo
>
> writes taken: 1825789
> writes taken: 1802560
>
> ----------------------------------------------
> RWSEM_XCHGADD y in rwsem-2.4.4-pre6 + your latest #try3
>
> rw
>
> reads taken: 5789542
> writes taken: 2989478
> reads taken: 5801777
> writes taken: 2995669
>
> r1
>
> reads taken: 16922653
> reads taken: 16946132
>
> r2
>
> reads taken: 5650211
> reads taken: 5647272
>
> ro
>
> reads taken: 4956250
> reads taken: 4959828
>
> w1
>
> writes taken: 15431139
> writes taken: 15439790
>
> wo
>
> writes taken: 813756
> writes taken: 816005
>
> graph updated attached. so in short my fast path is still quite faster (r1/w1),
> slow path is comparable now (I still win in all tests but wo which is probably
> the less interesting one in real life [write contention]). I still have room to
> improve the wo test [write contention] by spending more icache btw but it
> probably doesn't worth.
Ok I finished now my asm optimized rwsemaphores and I improved a little my
spinlock based one but without touching the icache usage.
Here it is the code against pre6 vanilla:
ftp://ftp.us.kernel.org/pub/linux/kernel/people/andrea/patches/v2.4/2.4.4pre6/rwsem-8
here the same but against David's try#2:
ftp://ftp.us.kernel.org/pub/linux/kernel/people/andrea/patches/v2.4/2.4.4pre6/rwsem-8-against-dh-try2
and here again the same but against David's latest try#3:
ftp://ftp.us.kernel.org/pub/linux/kernel/people/andrea/patches/v2.4/2.4.4pre6/rwsem-8-against-dh-try3
The main advantage of my rewrite are:
- my x86 version is visibly faster than yours in the write fast path and it
also saves icache because it's smaller (read fast path is reproducibly a bit faster too)
- my spinlock version is visibly faster in both read and write fast path and still
has smaller icache footprint (of course my version is completly out of line too
and I'm comparing apples to apples)
- at least to me the slow path of my code seems to be much simpler
than yours (and I can actually understand it ;), for example I don't feel
any need of any atomic exchange anywhere, I admit now comparing
my code to yours I still don't know why you need it
- the common code automatically extends itself to support 2^32 concurrent sleepers
on 64bit archs
- there is no code duplication at all in supporting xchgadd common code logic
for other archs (and I prepared a skeleton to fill for the alpha)
Only disavantage is that sparc64 will have to fixup some bit again.
Here the numbers of the above patches on top of vanilla 2.4.4-pre6:
----------------------------------------------
RWSEM_XCHGADD y in rwsem-2.4.4-pre6 + my latest rwsem
rw
reads taken: 5736978
writes taken: 2962325
reads taken: 5799163
writes taken: 2994404
r1
reads taken: 17044842
reads taken: 17053405
r2
reads taken: 5603085
reads taken: 5601647
ro
reads taken: 4831655
reads taken: 4833518
w1
writes taken: 16064773
writes taken: 16037018
wo
writes taken: 860791
writes taken: 864103
----------------------------------------------
RWSEM_SPINLOCK y in rwsem-2.4.4-pre6 + my latest rwsem
rw
reads taken: 6061713
writes taken: 3129801
reads taken: 6099046
writes taken: 3148951
r1
reads taken: 14251500
reads taken: 14265389
r2
reads taken: 4972932
reads taken: 4936267
ro
reads taken: 4253814
reads taken: 4253432
w1
writes taken: 13652385
writes taken: 13632914
wo
writes taken: 1751857
writes taken: 1753608
I draw a graph with the above numbers compared to the previous quoted numbers I
generated a few hours ago with your latest try#3 (attached). (W1 and R1 are
respectively the benchmark of the write fast path and read fast path, only one
writer and only one reader and they're of course the most interesting ones)
So I'd suggest Linus to apply one of my above -8 patches for pre7. (I hope I
won't need any secondary try)
this is a diffstat of the rwsem-8 patch:
arch/alpha/config.in | 1
arch/alpha/kernel/alpha_ksyms.c | 4
arch/arm/config.in | 2
arch/cris/config.in | 1
arch/i386/config.in | 4
arch/ia64/config.in | 1
arch/m68k/config.in | 1
arch/mips/config.in | 1
arch/mips64/config.in | 1
arch/parisc/config.in | 1
arch/ppc/config.in | 1
arch/ppc/kernel/ppc_ksyms.c | 2
arch/s390/config.in | 1
arch/s390x/config.in | 1
arch/sh/config.in | 1
arch/sparc/config.in | 1
arch/sparc64/config.in | 4
include/asm-alpha/compiler.h | 9 -
include/asm-alpha/rwsem_xchgadd.h | 27 +++
include/asm-alpha/semaphore.h | 2
include/asm-i386/rwsem.h | 225 --------------------------------
include/asm-i386/rwsem_xchgadd.h | 93 +++++++++++++
include/asm-sparc64/rwsem.h | 35 ++---
include/linux/compiler.h | 13 +
include/linux/rwsem-spinlock.h | 57 --------
include/linux/rwsem.h | 106 ---------------
include/linux/rwsem_spinlock.h | 61 ++++++++
include/linux/rwsem_xchgadd.h | 96 +++++++++++++
include/linux/sched.h | 2
lib/Makefile | 6
lib/rwsem-spinlock.c | 245 -----------------------------------
lib/rwsem.c | 265 --------------------------------------
lib/rwsem_spinlock.c | 124 +++++++++++++++++
lib/rwsem_xchgadd.c | 88 ++++++++++++
35 files changed, 531 insertions(+), 951 deletions(-)
Andrea
[-- Attachment #2: rwsem-8.png --]
[-- Type: image/png, Size: 6697 bytes --]
next prev parent reply other threads:[~2001-04-24 4:57 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2001-04-23 20:35 [PATCH] rw_semaphores, optimisations try #3 D.W.Howells
2001-04-23 21:34 ` Andrea Arcangeli
2001-04-24 4:56 ` Andrea Arcangeli [this message]
2001-04-24 8:56 ` rwsem benchmark [was Re: [PATCH] rw_semaphores, optimisations try #3] David Howells
2001-04-24 9:49 ` Andrea Arcangeli
2001-04-24 10:25 ` David Howells
2001-04-24 10:44 ` Andrea Arcangeli
2001-04-24 13:07 ` David Howells
2001-04-24 13:59 ` Andrea Arcangeli
2001-04-24 15:49 ` Linus Torvalds
2001-04-24 10:17 ` Andrea Arcangeli
2001-04-24 10:33 ` David Howells
2001-04-24 10:46 ` Andrea Arcangeli
2001-04-24 12:19 ` Andrea Arcangeli
2001-04-24 13:10 ` Andrea Arcangeli
2001-04-23 22:23 ` [PATCH] rw_semaphores, optimisations try #3 Linus Torvalds
2001-04-24 10:05 ` David Howells
2001-04-24 15:40 ` Linus Torvalds
2001-04-24 16:37 ` David Howells
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20010424065633.A16845@athlon.random \
--to=andrea@suse.de \
--cc=davem@redhat.com \
--cc=dhowells@astarte.free-online.co.uk \
--cc=dhowells@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=torvalds@transmeta.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox