Re: [PATCH] MIPS: Optimize spinlocks.

Linux MIPS Architecture development

From: David Daney <ddaney@caviumnetworks.com>
To: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Subject: Re: [PATCH] MIPS: Optimize spinlocks.
Date: Wed, 24 Feb 2010 08:55:12 -0800
Message-ID: <4B8559F0.6080908@caviumnetworks.com>
In-Reply-To: <20100224155336.GA5130@linux-mips.org>

On 02/24/2010 07:53 AM, Ralf Baechle wrote:
> On Thu, Feb 04, 2010 at 11:31:49AM -0800, David Daney wrote:
>
>> The current locking mechanism uses a ll/sc sequence to release a
>> spinlock.  This is slower than a wmb() followed by a store to unlock.
>>
>> The branching forward to .subsection 2 on sc failure slows down the
>> contended case.  So we get rid of that part too.
>>
>> Since we are now working on naturally aligned u16 values, we can get
>> rid of a masking operation as the LHU already does the right thing.
>> The ANDI are reversed for better scheduling on multi-issue CPUs.
>>
>> On a 12 CPU 750MHz Octeon cn5750 this patch improves ipv4 UDP packet
>> forwarding rates from 3.58*10^6 PPS to 3.99*10^6 PPS, or about 11%.
>
> And in your benchmarking patch you wrote:
>
>>                 spin_single  spin_multi
>> base                 106885      247941
>> spinlock_patch        75194      219465
>
> I did some benchmarking on an IP27 (180MHz, 2 CPU, needs LL/SC workaround):
>
>                 spin_single  spin_multi
> base                 229341     3505690
> spinlock_patch       177847     3615326
>
> So about 22% speedup for spin_single but 3% slowdown for spin_multi.
>
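
As a sanity check on those percentages, the relative change can be
recomputed from the raw counts.  The helper below is only an
illustration (pct_change() is a made-up name), and it assumes the
benchmark reports a cost, so that lower is better:

```c
/*
 * Relative change from base to patched, in percent.  Negative means the
 * patched count is lower than the base.  base must be nonzero.
 * Hypothetical helper, not part of the benchmarking patch.
 */
static double pct_change(double base, double patched)
{
    return (patched - base) / base * 100.0;
}
```

With the IP27 numbers, pct_change(229341, 177847) comes out at roughly
-22.5 and pct_change(3505690, 3615326) at roughly +3.1, which matches
the figures quoted.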

It is possible that choosing a better nudge_writes() implementation for
R10K could erase the 3% degradation.  Perhaps:

#define nudge_writes() do { } while (0)

Basically you want something that is fast, but that also forces the
write to become globally visible as soon as possible.  Some processors
have a prefetch instruction that does this.  On other processors a NOP
is optimal, since they don't combine writes in the write-back buffer.

There is a wbflush() function that could potentially be used, but its 
implementation is too heavy on Octeon.
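
To make the idea concrete, here is a rough userspace sketch in C11 of a
ticket lock whose unlock is a plain release store followed by
nudge_writes(), rather than an ll/sc loop.  It is not the kernel code
from the patch: struct ticket_lock, ticket_lock_acquire() and
ticket_lock_release() are names made up for illustration, C11 atomics
stand in for the MIPS barriers, and nudge_writes() is the empty variant
suggested above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Assumption: the empty variant, for CPUs that do not hold stores back. */
#define nudge_writes() do { } while (0)

/* Hypothetical ticket lock built from two naturally aligned 16-bit halves. */
struct ticket_lock {
    _Atomic uint16_t serving_now;  /* ticket currently allowed in */
    _Atomic uint16_t next_ticket;  /* next ticket to hand out */
};

/* Both halves are expected to share one 32-bit word on common ABIs. */
static_assert(sizeof(struct ticket_lock) == sizeof(uint32_t),
              "ticket halves should pack into 32 bits");

static void ticket_lock_acquire(struct ticket_lock *l)
{
    /* Taking a ticket needs an atomic read-modify-write. */
    uint16_t me = atomic_fetch_add_explicit(&l->next_ticket, 1,
                                            memory_order_relaxed);

    /* Spin until our ticket is being served. */
    while (atomic_load_explicit(&l->serving_now,
                                memory_order_acquire) != me)
        ;
}

static void ticket_lock_release(struct ticket_lock *l)
{
    /* Only the lock holder writes serving_now, so no RMW is needed here. */
    uint16_t next = (uint16_t)(atomic_load_explicit(&l->serving_now,
                                                    memory_order_relaxed) + 1);

    /* The release ordering plays the role of wmb() before the store. */
    atomic_store_explicit(&l->serving_now, next, memory_order_release);

    /* Encourage the store to become globally visible quickly. */
    nudge_writes();
}
```

The point of the sketch is the asymmetry: acquiring still needs an
atomic increment, but releasing is just a barrier, an ordinary store
and whatever nudge_writes() expands to on the CPU in question.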

> Disabling the R10k LL/SC workaround btw. gives another 23% speedup for
> spin_single and marginal 0.3% for spin_multi; the latter may well be
> statistical noise.
>
>    Ralf
>

Thread overview: 6+ messages
2010-02-04 19:31 [PATCH] MIPS: Optimize spinlocks David Daney
2010-02-24 15:53 ` Ralf Baechle
2010-02-24 15:54   ` Ralf Baechle
2010-02-24 16:55   ` David Daney [this message]
2010-02-25 14:15     ` Ralf Baechle
2010-02-25 17:31       ` David Daney