linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH 0/3] x86,smp: make ticket spinlock proportional backoff w/ auto tuning
@ 2012-12-21 23:49 Rik van Riel
  2012-12-21 23:50 ` [RFC PATCH 1/3] x86,smp: move waiting on contended lock out of line Rik van Riel
                   ` (3 more replies)
  0 siblings, 4 replies; 53+ messages in thread
From: Rik van Riel @ 2012-12-21 23:49 UTC (permalink / raw)
  To: linux-kernel
  Cc: aquini, walken, lwoodman, jeremy, Jan Beulich, Thomas Gleixner

[-- Attachment #1: Type: text/plain, Size: 2682 bytes --]

Many spinlocks are embedded in data structures; having many CPUs
pounce on the cache line the lock is in will slow down the lock
holder, and can cause system performance to fall off a cliff.

The paper "Non-scalable locks are dangerous" is a good reference:

	http://pdos.csail.mit.edu/papers/linux:lock.pdf

In the Linux kernel, spinlocks are optimized for the case of
there not being contention. After all, if there is contention,
the data structure can be improved to reduce or eliminate
lock contention.

Likewise, the spinlock API should remain simple, and the
common case of the lock not being contended should remain
as fast as ever.

However, since spinlock contention should be fairly uncommon,
we can add functionality into the spinlock slow path that keeps
system performance from falling off a cliff when there is lock
contention.

Proportional delay in ticket locks is delaying the time between
checking the ticket based on a delay factor, and the number of
CPUs ahead of us in the queue for this lock. Checking the lock
less often allows the lock holder to continue running, resulting
in better throughput and preventing performance from dropping
off a cliff.

The test case has a number of threads locking and unlocking a
semaphore. With just one thread, everything sits in the CPU
cache and throughput is around 2.6 million operations per
second, with a 5-10% variation.

Once a second thread gets involved, data structures bounce
from CPU to CPU, and performance deteriorates to about 1.25
million operations per second, with a 5-10% variation.

However, as more and more threads get added to the mix,
performance with the vanilla kernel continues to deteriorate.
Once I hit 24 threads, on a 24 CPU, 4 node test system,
performance is down to about 290k operations/second.

With a proportional backoff delay added to the spinlock
code, performance with 24 threads goes up to about 400k
operations/second with a 50x delay, and about 900k operations/second
with a 250x delay. However, with a 250x delay, performance with
2-5 threads is worse than with a 50x delay.

Making the code auto-tune the delay factor results in a system
that performs well with both light and heavy lock contention,
and should also protect against the (likely) case of the fixed
delay factor being wrong for other hardware.

The attached graph shows the performance of the multi threaded
semaphore lock/unlock test case, with 1-24 threads, on the
vanilla kernel, with 10x, 50x, and 250x proportional delay,
and with autotuning proportional delay tuning for either 2 or
2.7 spins before the lock is obtained.

Please let me know if you manage to break this code in any way,
so I can fix it...

[-- Attachment #2: spinlock-backoff.png --]
[-- Type: image/png, Size: 20083 bytes --]

^ permalink raw reply	[flat|nested] 53+ messages in thread

end of thread, other threads:[~2013-01-03 18:23 UTC | newest]

Thread overview: 53+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-12-21 23:49 [RFC PATCH 0/3] x86,smp: make ticket spinlock proportional backoff w/ auto tuning Rik van Riel
2012-12-21 23:50 ` [RFC PATCH 1/3] x86,smp: move waiting on contended lock out of line Rik van Riel
2012-12-22  3:05   ` Steven Rostedt
2012-12-22  4:40   ` Michel Lespinasse
2012-12-22  4:48     ` Rik van Riel
2012-12-23 22:52   ` Rafael Aquini
2012-12-21 23:51 ` [RFC PATCH 2/3] x86,smp: proportional backoff for ticket spinlocks Rik van Riel
2012-12-22  3:07   ` Steven Rostedt
2012-12-22  3:14     ` Steven Rostedt
2012-12-22  3:47       ` Rik van Riel
2012-12-22  4:44   ` Michel Lespinasse
2012-12-23 22:55   ` Rafael Aquini
2012-12-21 23:51 ` [RFC PATCH 3/3] x86,smp: auto tune spinlock backoff delay factor Rik van Riel
2012-12-21 23:56   ` [RFC PATCH 3/3 -v2] " Rik van Riel
2012-12-22  0:18     ` Eric Dumazet
2012-12-22  2:43       ` Rik van Riel
2012-12-22  0:48     ` Eric Dumazet
2012-12-22  2:57       ` Rik van Riel
2012-12-22  3:29     ` Eric Dumazet
2012-12-22  3:44       ` Rik van Riel
2012-12-22  3:33     ` Steven Rostedt
2012-12-22  3:50       ` Rik van Riel
2012-12-26 19:10         ` Eric Dumazet
2012-12-26 19:27           ` Eric Dumazet
2012-12-26 19:51           ` Rik van Riel
2012-12-27  6:07             ` Michel Lespinasse
2012-12-27 14:27               ` Eric Dumazet
2012-12-27 14:35                 ` Rik van Riel
2012-12-27 18:41                   ` Jan Beulich
2012-12-27 19:09                     ` Rik van Riel
2013-01-03  9:05                       ` Jan Beulich
2013-01-03 13:24                         ` Steven Rostedt
2013-01-03 13:35                           ` Eric Dumazet
2013-01-03 15:32                             ` Steven Rostedt
2013-01-03 16:10                               ` Eric Dumazet
2013-01-03 16:45                                 ` Steven Rostedt
2013-01-03 17:54                                   ` Eric Dumazet
2012-12-27 18:49                   ` Eric Dumazet
2012-12-27 19:31                     ` Rik van Riel
2012-12-29  0:42                       ` Eric Dumazet
2012-12-29 10:27           ` Michel Lespinasse
2013-01-03 18:17             ` Eric Dumazet
2012-12-22  0:47   ` [RFC PATCH 3/3] " David Daney
2012-12-22  2:51     ` Rik van Riel
2012-12-22  3:49       ` Steven Rostedt
2012-12-22  3:58         ` Rik van Riel
2012-12-23 23:08           ` Rafael Aquini
2012-12-22  5:42   ` Michel Lespinasse
2012-12-22 14:32     ` Rik van Riel
2013-01-02  0:06 ` ticket spinlock proportional backoff experiments Michel Lespinasse
2013-01-02  0:09   ` [PATCH 1/2] x86,smp: simplify __ticket_spin_lock Michel Lespinasse
2013-01-02 15:31     ` Rik van Riel
2013-01-02  0:10   ` [PATCH 2/2] x86,smp: proportional backoff for ticket spinlocks Michel Lespinasse

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).