Re: [PATCH] parisc: adjust L1_CACHE_BYTES to 128 bytes on PA8800 and PA8900 CPUs

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Helge Deller <deller@gmx.de>
To: John David Anglin <dave.anglin@bell.net>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>,
	linux-parisc@vger.kernel.org
Subject: Re: [PATCH] parisc: adjust L1_CACHE_BYTES to 128 bytes on PA8800 and PA8900 CPUs
Date: Wed, 23 Sep 2015 21:30:31 +0200	[thread overview]
Message-ID: <5602FDD7.6020501@gmx.de> (raw)
In-Reply-To: <17069A9B-BA68-4BDA-9342-83E33A22D547@bell.net>

On 23.09.2015 02:12, John David Anglin wrote:
> On 2015-09-22, at 12:20 PM, Helge Deller wrote:
> 
>> On 05.09.2015 23:30, Helge Deller wrote:
>>> Hi James,
>>> ...
>>> I haven't done any performance measurements yet, but your patch looks
>>> absolutely correct.
>>> ...
>>
>> Hello everyone,
>>
>> I did some timing tests with the various patches for
>> a) atomic_hash patches:
>>        https://patchwork.kernel.org/patch/7116811/
>> b) alignment of LWS locks:
>>        https://patchwork.kernel.org/patch/7137931/
>>
>> The testcase I used is basically the following:
>> - It starts 32 threads.
>> - We have 16 atomic ints organized in an array.
>> - The first thread increments NITERS times the first atomic int.
>> - The second thread decrements NITERS times the first atomic int.
>> - The third/fourth thread increments/decrements the second atomic int, and so on...
>> - So, we have 32 threads, of which 16 increments and 16 decrements 16 different atomic ints.
>> - All threads run in parallel on a 4-way SMP PA8700 rp5470 machine.
>> - I used the "time" command to measure the timings.
>> - I did not stopped other services on the machine, but ran each test a few times and the timing results did not show significant variation between each run.
>> - All timings were done on a vanilla kernel 4.2 with only the mentioned patch applied.
>>
>> The code is a modified testcase from the libatomic-ops debian package:
>>
>> AO_t counter_array[16] = { 0, };
>> #define NITERS 1000000
>>
>> void * add1sub1_thr(void * id)
>> {
>>  int me = (int)(AO_PTRDIFF_T)id;
>>  AO_t *counter;
>>  int i;
>>
>>  counter = &counter_array[me >> 1];
>>  for (i = 0; i < NITERS; ++i)
>>    if ((me & 1) != 0) {
>>      (void)AO_fetch_and_sub1(counter);
>>    } else {
>>      (void)AO_fetch_and_add1(counter);
>>    }
>>  return 0;
>> ...
>> run_parallel(32, add1sub1_thr)
>> ...
>>
>>
> 
> Does libatomic-ops now use the GCC sync builtins and the LWS calls?

Yes, if you apply the patch from https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=785654
on top of the libatomic-ops debian package.
That's what I tested.

>> The baseline for all results is the timing with a vanilla kernel 4.2:
>> real    0m13.596s
>> user    0m18.152s
>> sys     0m35.752s
>>
>>
>> The next results are with the atomic_hash (a) patch applied:
>> For ATOMIC_HASH_SIZE = 4. 
>> real    0m21.892s 
>> user    0m27.492s
>> sys     0m59.704s
>>
>> For ATOMIC_HASH_SIZE = 64.
>> real    0m20.604s
>> user    0m24.832s
>> sys     0m56.552s
>>
> 
> I'm not sure why the atomic_hash patch would directly affect performance of this test as I
> don't think the patch affects the LWS locks.

True, but even so more, the patch should not have slowed down the (unrelated) testcase.
 
> I question the the atomic hash changes as the original defines are taken directly from generic code.
> Optimally, we want one spinlock per cacheline.  Why do we care about the size of atomic_t?

Assume two unrelated code paths which are protected by two different spinlocks (which are of size atomic_t).
So, if the addresses of those spinlocks calculate to be (virtually) on the same cacheline they would block each other.
With James patch the possibility of blocking each other is theoretically lower (esp. if you increase the number of locks). 

> The above indicate the change is detrimental.

Yes, true for patch (a), but not (b).

Helge

next prev parent reply	other threads:[~2015-09-23 19:30 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-09-02 16:20 [PATCH] parisc: adjust L1_CACHE_BYTES to 128 bytes on PA8800 and PA8900 CPUs Helge Deller
2015-09-03 13:30 ` James Bottomley
2015-09-03 13:57   ` James Bottomley
2015-09-05 21:30     ` Helge Deller
2015-09-22 16:20       ` Helge Deller
2015-09-23  0:12         ` John David Anglin
2015-09-23 19:30           ` Helge Deller [this message]
2015-09-23 21:00             ` John David Anglin
2015-09-24 14:20           ` James Bottomley
2015-09-24 16:39             ` John David Anglin
2015-09-24 16:57               ` James Bottomley
2015-09-25 12:20                 ` John David Anglin
2015-09-25 15:56                   ` John David Anglin
2015-09-27 16:27         ` [PATCH] parisc: " John David Anglin
2015-09-28 15:57           ` Helge Deller
2015-09-28 20:00             ` John David Anglin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5602FDD7.6020501@gmx.de \
    --to=deller@gmx.de \
    --cc=James.Bottomley@HansenPartnership.com \
    --cc=dave.anglin@bell.net \
    --cc=linux-parisc@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.