From: Heinz Mauelshagen <heinzm@redhat.com>
To: Neil Brown <neilb@suse.de>
Cc: Christoph Hellwig <hch@infradead.org>,
device-mapper development <dm-devel@redhat.com>,
Dan Williams <dan.j.williams@intel.com>,
Ed Ciechanowski <ed.ciechanowski@intel.com>
Subject: Re: [PATCH 1/6] dm raid45 target: export region hash functions and add a needed one
Date: Wed, 08 Jul 2009 20:56:20 +0200
Message-ID: <1247079380.9206.99.camel@o>
In-Reply-To: <19025.28110.236108.810671@notabene.brown>
On Mon, 2009-07-06 at 13:21 +1000, Neil Brown wrote:
> On Thursday July 2, heinzm@redhat.com wrote:
> >
> > Dan, Neil,
Hi,
back after > 4 days of Internet outage caused by lightning :-(
I'll respond to Neil's comments here in order to have a comparable
microbenchmark based on his recommended change
(and one bug I fixed; see below).
> >
> > like mentioned before I left to LinuxTag last week, here comes an initial
> > take on dm-raid45 warm/cold CPU cache xor speed optimization metrics.
> >
> > This shall give us the base to decide to keep or drop the dm-raid45
> > internal xor optimization magic or move (part of) it into the crypto
> > subsystem.
>
> Thanks for doing this.
You're welcome.
> >
> >
> > Intel results with 128 iterations each:
> > ---------------------------------------
> >
> > 1 stripe : NB:10 111/80 HM:118 111/82
> > 2 stripes : NB:25 113/87 HM:103 112/91
> > 3 stripes : NB:24 115/93 HM:104 114/93
> > 4 stripes : NB:48 114/93 HM:80 114/93
> > 5 stripes : NB:38 113/94 HM:90 114/94
> > 6 stripes : NB:25 116/94 HM:103 114/94
> > 7 stripes : NB:25 115/95 HM:103 115/95
> > 8 stripes : NB:62 117/96 HM:66 116/95 <<<--- cold cache starts here
> > 9 stripes : NB:66 117/96 HM:62 116/95
> > 10 stripes: NB:73 117/96 HM:55 114/95
> > 11 stripes: NB:63 114/96 HM:65 112/95
> > 12 stripes: NB:51 111/96 HM:77 110/95
> > 13 stripes: NB:65 109/96 HM:63 112/95
>
> These results seem to suggest that the two different routines provide
> very similar results on this hardware, particularly when the cache is cold.
> The high degree of variability might be because you have dropped this:
>
> > - /* Wait for next tick. */
> > - for (j = jiffies; j == jiffies; )
> > - ;
> ??
> Without that, it could be running the test over anything from 4 to 5
> jiffies.
> I note that do_xor_speed in crypto/xor.c doesn't synchronise at the
> start either. I think that is a bug.
> The variability seem to generally be close to 20%, which is consistent
> with the difference between 4 and 5.
>
> Could you put that loop back in and re-test?
>
Reintroduced and rerun tests.
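As a rough illustration of why the tick synchronisation matters, here is
a user-space sketch that simulates the measurement window instead of
timing anything real. The names TICK_NS, WINDOW_TICKS and count_ops()
are made up for this example, and the 1 ms tick and the per-call cost
passed in are assumptions, not measured values. A window defined as
"until the tick counter has advanced by 5" covers anything between 4 and
5 full ticks unless the start is aligned to a tick edge first:

```c
#include <assert.h>
#include <stdint.h>

#define TICK_NS      1000000ULL /* assumed tick length (HZ=1000) */
#define WINDOW_TICKS 5ULL       /* measure until the tick counter advanced by 5 */

/*
 * Simulated benchmark loop: count how many calls of cost op_ns fit
 * into the window when the measurement starts at start_ns.
 * If sync is non-zero, first wait for the next tick edge, which is
 * what the "wait for next tick" loop does with jiffies.
 */
uint64_t count_ops(uint64_t start_ns, uint64_t op_ns, int sync)
{
	uint64_t t = start_ns, end_tick, count = 0;

	assert(op_ns > 0); /* a zero-cost call would never leave the loop */

	if (sync)
		t = (t / TICK_NS + 1) * TICK_NS; /* jump to the next tick edge */

	end_tick = t / TICK_NS + WINDOW_TICKS;
	while (t / TICK_NS < end_tick) {
		t += op_ns; /* one simulated xor call */
		count++;
	}

	return count;
}
```

With a per-call cost of 1000 ns, a start 1 us before a tick edge
(count_ops(999000, 1000, 0)) counts 4001 calls against 5000 for an
aligned start, which is the roughly 20% spread Neil describes.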
In addition to that, I fixed a flaw which led to
dm-raid45.c:xor_optimize() running xor_speed() with chunks > raid
devices, which doesn't make sense and led to longer test runs and
erroneous chunk values (e.g. 7 when only 3 raid devices were configured).
Hence we could end up with an algorithm claiming it was selected
for > raid devices.
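The idea behind the fix can be sketched as below. This is not the
dm-raid45 code: pick_chunks(), speed_fn and fake_speed() are
hypothetical names for illustration, and the real xor_speed() signature
is not shown in this mail. The only point is that the search for the
best chunk count is capped at the number of raid devices:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback: returns a speed figure for xoring 'chunks' chunks. */
typedef unsigned (*speed_fn)(unsigned chunks);

/*
 * Try chunk counts from 2 up to the smaller of max_chunks and
 * raid_devs and return the fastest one, or 0 if no count is valid.
 */
unsigned pick_chunks(unsigned raid_devs, unsigned max_chunks, speed_fn speed)
{
	unsigned limit = raid_devs < max_chunks ? raid_devs : max_chunks;
	unsigned best = 0, best_speed = 0, c;

	assert(speed != NULL);

	for (c = 2; c <= limit; c++) {
		unsigned s = speed(c);

		if (s > best_speed) {
			best_speed = s;
			best = c;
		}
	}

	return best;
}

/* Stand-in speed function that always favours more chunks. */
unsigned fake_speed(unsigned chunks)
{
	return chunks * 10;
}
```

With the stand-in speed function, pick_chunks(3, 8, fake_speed) returns
3 instead of running past the configured device count.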
Here are the new results:
Intel Core i7:
--------------
1 stripe : NB:54 114/94 HM:74 113/93
2 stripes : NB:57 116/94 HM:71 115/94
3 stripes : NB:64 115/94 HM:64 114/94
4 stripes : NB:51 112/94 HM:77 114/94
5 stripes : NB:77 115/94 HM:51 114/94
6 stripes : NB:25 111/89 HM:103 105/90
7 stripes : NB:13 105/91 HM:115 111/90
8 stripes : NB:27 108/92 HM:101 111/93
9 stripes : NB:29 113/92 HM:99 114/93
10 stripes: NB:41 110/92 HM:87 112/93
11 stripes: NB:34 105/92 HM:94 107/93
12 stripes: NB:51 114/93 HM:77 114/93
13 stripes: NB:54 115/94 HM:74 114/93
14 stripes: NB:64 115/94 HM:64 114/93
AMD Opteron:
--------
1 stripe : NB:0 25/17 HM:128 48/38
2 stripes : NB:0 24/18 HM:128 46/36
3 stripes : NB:0 25/18 HM:128 47/37
4 stripes : NB:0 27/19 HM:128 48/41
5 stripes : NB:0 30/18 HM:128 49/40
6 stripes : NB:0 27/19 HM:128 49/40
7 stripes : NB:0 29/18 HM:128 49/39
8 stripes : NB:0 26/19 HM:128 49/40
9 stripes : NB:0 28/19 HM:128 51/41
10 stripes: NB:0 28/18 HM:128 50/41
11 stripes: NB:0 31/19 HM:128 49/40
12 stripes: NB:0 28/19 HM:128 50/40
13 stripes: NB:0 26/19 HM:128 50/40
14 stripes: NB:0 27/20 HM:128 49/40
Still too much variability...
> >
> > Opteron results with 128 iterations each:
> > -----------------------------------------
> > 1 stripe : NB:0 30/20 HM:128 64/53
> > 2 stripes : NB:0 31/21 HM:128 68/55
> > 3 stripes : NB:0 31/22 HM:128 68/57
> > 4 stripes : NB:0 32/22 HM:128 70/61
> > 5 stripes : NB:0 32/22 HM:128 70/63
> > 6 stripes : NB:0 35/22 HM:128 70/64
> > 7 stripes : NB:0 32/23 HM:128 69/63
> > 8 stripes : NB:0 44/23 HM:128 76/65
> > 9 stripes : NB:0 43/23 HM:128 73/65
> > 10 stripes: NB:0 35/23 HM:128 72/64
> > 11 stripes: NB:0 35/24 HM:128 72/64
> > 12 stripes: NB:0 33/24 HM:128 72/65
> > 13 stripes: NB:0 33/23 HM:128 71/64
>
> Here your code seems to be 2-3 times faster!
> Can you check which function xor_block is using?
> If it is :
> xor: automatically using best checksumming function: ....
> then it might be worth disabling that test in calibrate_xor_blocks and
> see if it picks one that ends up being faster.
It picks the same SSE routine (generic_sse) on both archs, whether
selected automatically or measured, with obvious variability:
[37414.875236] xor: automatically using best checksumming function: generic_sse
[37414.893930] generic_sse: 12619.000 MB/sec
[37414.893932] xor: using function: generic_sse (12619.000 MB/sec)
[37445.679501] xor: measuring software checksum speed
[37445.696829] generic_sse: 15375.000 MB/sec
[37445.696830] xor: using function: generic_sse (15375.000 MB/sec)
Will get to Doug's recommendation to run loaded benchmarks tomorrow...
Heinz
>
> There is still the fact that by using the cache for data that will be
> accessed once, we are potentially slowing down the rest of the system.
> i.e. the reason to avoid the cache is not just because it won't
> benefit the xor much, but because it will hurt other users.
> I don't know how to measure that effect :-(
> But if avoiding the cache makes xor 1/3 the speed of using the cache
> even though it is cold, then it would be hard to justify not using the
> cache I think.
>
> >
> > Questions/Recommendations:
> > --------------------------
> > Review the code changes and the data analysis please.
>
> It seems to mostly make sense
> - the 'wait for next tick' should stay
> - it would be interesting to see what the final choice of 'chunks'
> was (i.e. how many to xor together at a time).
>
>
> Thanks!
>
> NeilBrown