From: Helge Deller <deller@gmx.de>
To: James Bottomley <James.Bottomley@HansenPartnership.com>,
John David Anglin <dave.anglin@bell.net>
Cc: linux-parisc@vger.kernel.org
Subject: Re: [PATCH] parisc: adjust L1_CACHE_BYTES to 128 bytes on PA8800 and PA8900 CPUs
Date: Tue, 22 Sep 2015 18:20:03 +0200 [thread overview]
Message-ID: <56017FB3.5050709@gmx.de> (raw)
In-Reply-To: <55EB5EFA.4040407@gmx.de>
On 05.09.2015 23:30, Helge Deller wrote:
> Hi James,
> ...
> I haven't done any performance measurements yet, but your patch looks
> absolutely correct.
> ...
Hello everyone,
I did some timing tests with the various patches for
a) atomic_hash patches:
https://patchwork.kernel.org/patch/7116811/
b) alignment of LWS locks:
https://patchwork.kernel.org/patch/7137931/
The testcase I used is basically the following:
- It starts 32 threads.
- We have 16 atomic ints organized in an array.
- The first thread increments NITERS times the first atomic int.
- The second thread decrements NITERS times the first atomic int.
- The third/fourth thread increments/decrements the second atomic int, and so on...
- So, we have 32 threads, of which 16 increments and 16 decrements 16 different atomic ints.
- All threads run in parallel on a 4-way SMP PA8700 rp5470 machine.
- I used the "time" command to measure the timings.
- I did not stopped other services on the machine, but ran each test a few times and the timing results did not show significant variation between each run.
- All timings were done on a vanilla kernel 4.2 with only the mentioned patch applied.
The code is a modified testcase from the libatomic-ops debian package:
AO_t counter_array[16] = { 0, };
#define NITERS 1000000
void * add1sub1_thr(void * id)
{
int me = (int)(AO_PTRDIFF_T)id;
AO_t *counter;
int i;
counter = &counter_array[me >> 1];
for (i = 0; i < NITERS; ++i)
if ((me & 1) != 0) {
(void)AO_fetch_and_sub1(counter);
} else {
(void)AO_fetch_and_add1(counter);
}
return 0;
...
run_parallel(32, add1sub1_thr)
...
The baseline for all results is the timing with a vanilla kernel 4.2:
real 0m13.596s
user 0m18.152s
sys 0m35.752s
The next results are with the atomic_hash (a) patch applied:
For ATOMIC_HASH_SIZE = 4.
real 0m21.892s
user 0m27.492s
sys 0m59.704s
For ATOMIC_HASH_SIZE = 64.
real 0m20.604s
user 0m24.832s
sys 0m56.552s
Next I applied the LWS locks patch (b):
XXXXXXXXXXXXXXXXXXXX LWS_LOCK_ALIGN_BITS = 4
real 0m13.660s
user 0m18.592s
sys 0m35.236s
XXXXXXXXXXXXXXXXXXXX LWS_LOCK_ALIGN_BITS = L1_CACHE_SHIFT
real 0m11.992s
user 0m19.064s
sys 0m28.476s
Then I applied both patches (a and b):
ATOMIC_HASH_SIZE = 64, LWS_LOCK_ALIGN_BITS = 4
ATOMIC_HASH_SIZE = 64, LWS_LOCK_ALIGN_BITS = 4
real 0m13.232s
user 0m17.704s
sys 0m33.884s
ATOMIC_HASH_SIZE = 64, LWS_LOCK_ALIGN_BITS = L1_CACHE_SHIFT
real 0m12.300s
user 0m20.268s
sys 0m28.424s
ATOMIC_HASH_SIZE = 4, LWS_LOCK_ALIGN_BITS = 4
real 0m13.181s
user 0m17.584s
sys 0m34.800s
ATOMIC_HASH_SIZE = 4, LWS_LOCK_ALIGN_BITS = L1_CACHE_SHIFT
real 0m11.692s
user 0m18.232s
sys 0m27.072s
In summary I'm astonished about those results.
Especially from patch (a) I would have expected (when applied stand-alone) the same or better performance, because it makes the spinlocks more fine-grained.
But a performance drop of 50% is strange.
Patch (b) stand-alone does significantly increases performance (~20%), and together with patch (a) it even adds a few more percent performance increase on top.
Given the numbers above I currently would suggest to apply both patches (with ATOMIC_HASH_SIZE = 4 and LWS_LOCK_ALIGN_BITS = L1_CACHE_SHIFT).
Thoughts?
Helge
next prev parent reply other threads:[~2015-09-22 16:20 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-09-02 16:20 [PATCH] parisc: adjust L1_CACHE_BYTES to 128 bytes on PA8800 and PA8900 CPUs Helge Deller
2015-09-03 13:30 ` James Bottomley
2015-09-03 13:57 ` James Bottomley
2015-09-05 21:30 ` Helge Deller
2015-09-22 16:20 ` Helge Deller [this message]
2015-09-23 0:12 ` John David Anglin
2015-09-23 19:30 ` Helge Deller
2015-09-23 21:00 ` John David Anglin
2015-09-24 14:20 ` James Bottomley
2015-09-24 16:39 ` John David Anglin
2015-09-24 16:57 ` James Bottomley
2015-09-25 12:20 ` John David Anglin
2015-09-25 15:56 ` John David Anglin
2015-09-27 16:27 ` [PATCH] parisc: " John David Anglin
2015-09-28 15:57 ` Helge Deller
2015-09-28 20:00 ` John David Anglin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=56017FB3.5050709@gmx.de \
--to=deller@gmx.de \
--cc=James.Bottomley@HansenPartnership.com \
--cc=dave.anglin@bell.net \
--cc=linux-parisc@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.