* [PATCH [RT] 00/14] RFC - adaptive real-time locks
@ 2008-02-21 15:26 Gregory Haskins
  2008-02-21 15:26 ` [PATCH [RT] 01/14] spinlocks: fix preemption feature when PREEMPT_RT is enabled Gregory Haskins
                   ` (15 more replies)
  0 siblings, 16 replies; 59+ messages in thread
From: Gregory Haskins @ 2008-02-21 15:26 UTC
  To: mingo, a.p.zijlstra, tglx, rostedt, linux-rt-users
  Cc: linux-kernel, bill.huey, kevin, cminyard, dsingleton, dwalker,
	npiggin, dsaxena, ak, gregkh, sdietrich, pmorreale, mkohari,
	ghaskins

The Real-Time patches to the Linux kernel convert the
architecture-specific SMP synchronization primitives commonly referred
to as "spinlocks" to an "RT mutex" implementation that supports a
priority inheritance protocol and priority-ordered wait queues.  The RT
mutex implementation allows tasks that would otherwise busy-wait for a
contended lock to be preempted by higher-priority tasks without
compromising the integrity of critical sections protected by the lock.
The unintended side effect is that the -rt kernel suffers from
significant degradation of IO throughput (disk and net) due to the
extra overhead associated with managing pi-lists and context switching.
This has been generally accepted as the price to pay for low-latency
preemption.

Our research indicates that it doesn't necessarily have to be this
way.  This patch set introduces an adaptive technology that retains both
the priority inheritance protocol and the preemptive nature of
spinlocks and mutexes, and increases the throughput of the Linux
Real-Time kernel by more than 300%.  It applies to 2.6.24-rt1.

These performance increases apply to disk IO as well as netperf UDP
benchmarks, without compromising RT preemption latency.  For more
complex applications, overall I/O throughput seems to approach that of
a PREEMPT_VOLUNTARY or PREEMPT_DESKTOP kernel, as shipped by most
distros.

Essentially, the RT Mutex has been modified to busy-wait under
contention for a limited (and configurable) time.  This works because
most locks are typically held for very short time spans.  Too often,
by the time a task goes to sleep on a mutex, the mutex is already being
released on another CPU.
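
As a rough illustration of the spin-then-sleep idea described above,
here is a minimal userspace sketch built on C11 atomics and pthreads.
It is not the code from this series: the names (struct adaptive_lock,
spin_limit, and the adaptive_* functions) are hypothetical, and it
leaves out priority inheritance and priority-ordered wait queues
entirely.  It only shows a bounded poll followed by a blocking
fallback.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace analogue; NOT the kernel rt_mutex code. */
struct adaptive_lock {
	atomic_bool     locked;      /* true while some thread owns the lock */
	pthread_mutex_t sleep_mutex; /* serializes the sleeping slow path */
	pthread_cond_t  released;    /* signalled on every release */
	unsigned int    spin_limit;  /* max polls before sleeping (tunable) */
};

void adaptive_lock_init(struct adaptive_lock *l, unsigned int spin_limit)
{
	atomic_init(&l->locked, false);
	pthread_mutex_init(&l->sleep_mutex, NULL);
	pthread_cond_init(&l->released, NULL);
	l->spin_limit = spin_limit;
}

/* Single attempt: returns true if the lock was free and is now ours. */
bool adaptive_trylock(struct adaptive_lock *l)
{
	bool expected = false;

	return atomic_compare_exchange_strong(&l->locked, &expected, true);
}

/*
 * Returns true if the lock was acquired during the bounded spin phase,
 * false if the caller had to fall back to the sleeping path.
 */
bool adaptive_lock_acquire(struct adaptive_lock *l)
{
	/* Phase 1: poll for a limited number of iterations. */
	for (unsigned int i = 0; i < l->spin_limit; i++) {
		if (adaptive_trylock(l))
			return true;
	}

	/* Phase 2: give up the CPU and sleep until a release is signalled. */
	pthread_mutex_lock(&l->sleep_mutex);
	while (!adaptive_trylock(l))
		pthread_cond_wait(&l->released, &l->sleep_mutex);
	pthread_mutex_unlock(&l->sleep_mutex);
	return false;
}

void adaptive_lock_release(struct adaptive_lock *l)
{
	assert(atomic_load(&l->locked)); /* releasing an unheld lock is a bug */

	/* Clear the flag under sleep_mutex so a sleeper cannot miss the wakeup. */
	pthread_mutex_lock(&l->sleep_mutex);
	atomic_store(&l->locked, false);
	pthread_cond_signal(&l->released);
	pthread_mutex_unlock(&l->sleep_mutex);
}
```

A caller would run adaptive_lock_init(&l, N) once and then bracket each
critical section with adaptive_lock_acquire(&l) and
adaptive_lock_release(&l).  The return value of the acquire call makes
it easy to count how often the spin phase avoided a sleep for a given N.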

The effect (on SMP) is that by polling a mutex for a limited time we
reduce context switch overhead by up to 90%, and therefore save CPU
cycles and eliminate massive hot-spots in the scheduler and other
bottlenecks in the kernel, even though we busy-wait (using CPU cycles)
to poll the lock.

We have put together some data from different types of benchmarks for
this patch series, which you can find here:

ftp://ftp.novell.com/dev/ghaskins/adaptive-locks.pdf

It compares a stock kernel.org 2.6.24 (PREEMPT_DESKTOP), a stock
2.6.24-rt1 (PREEMPT_RT), and a 2.6.24-rt1 + adaptive-lock
(2.6.24-rt1-al) (PREEMPT_RT) kernel.  The machine is a 4-way (dual-core,
dual-socket) 2 GHz 5130 Xeon (core2duo-woodcrest) Dell Precision 490.

Some tests show a marked improvement (for instance, dbench and
hackbench), whereas for others (make -j 128) the results were not as
profound but were still net-positive.  In all cases we have also used
cyclictest to verify that deterministic latency is not impacted.

This patch series also includes some re-work of the raw_spinlock
infrastructure, including Nick Piggin's x86-ticket-locks.  We found that
the increased pressure on the lock->wait_locks could cause rare but
serious latency spikes, which are fixed by a FIFO raw_spinlock_t
implementation.  Nick was gracious enough to allow us to re-use his
work (which is already accepted in 2.6.25).  Note that we also have a
C version of his protocol available if other architectures need
FIFO-lock support as well, which we will gladly post upon request.
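
For readers unfamiliar with ticket locks, the FIFO property can be
sketched in portable C11 as follows.  This is a generic textbook ticket
lock written for userspace, not Nick's x86 implementation and not the C
version mentioned above.  The names (struct ticket_lock and the
ticket_lock_* functions) are hypothetical, and sched_yield() is only a
userspace stand-in for a CPU-relax hint inside the spin loop.

```c
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>

/* Generic ticket lock: waiters are served strictly in arrival order. */
struct ticket_lock {
	atomic_uint next_ticket; /* next ticket to hand out */
	atomic_uint now_serving; /* ticket currently allowed to own the lock */
};

void ticket_lock_init(struct ticket_lock *l)
{
	atomic_init(&l->next_ticket, 0);
	atomic_init(&l->now_serving, 0);
}

/* Takes a ticket, waits for its turn, and returns the ticket number. */
unsigned int ticket_lock_acquire(struct ticket_lock *l)
{
	/* Atomically draw a ticket; this fixes our place in the queue. */
	unsigned int me = atomic_fetch_add(&l->next_ticket, 1);

	/* Spin until our number comes up. */
	while (atomic_load_explicit(&l->now_serving,
				    memory_order_acquire) != me)
		sched_yield();

	return me;
}

void ticket_lock_release(struct ticket_lock *l)
{
	/* The lock is held only if at least one ticket is outstanding. */
	assert(atomic_load(&l->now_serving) != atomic_load(&l->next_ticket));

	/* Hand the lock to the holder of the next ticket, if any. */
	atomic_fetch_add_explicit(&l->now_serving, 1, memory_order_release);
}
```

Because each waiter spins only until its own ticket is served, a
late-arriving CPU cannot jump ahead of one that has been waiting longer.
That ordering is what bounds the worst-case wait under heavy contention.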

Special thanks go to many people who were instrumental to this project,
including:
  *) the -rt team here at Novell for research, development, and testing.
  *) Nick Piggin for his invaluable consultation/feedback and use of his
     x86-ticket-locks.
  *) The reviewers/testers at SUSE, MontaVista, and Bill Huey for their
     time and feedback on the early versions of these patches.

As always, comments/feedback/bug-fixes are welcome.

Regards,
-Greg

Thread overview: 59+ messages
2008-02-21 15:26 [PATCH [RT] 00/14] RFC - adaptive real-time locks Gregory Haskins
2008-02-21 15:26 ` [PATCH [RT] 01/14] spinlocks: fix preemption feature when PREEMPT_RT is enabled Gregory Haskins
2008-02-21 15:26 ` [PATCH [RT] 02/14] spinlock: make preemptible-waiter feature a specific config option Gregory Haskins
2008-02-22 19:09   ` Pavel Machek
2008-02-21 15:26 ` [PATCH [RT] 03/14] x86: FIFO ticket spinlocks Gregory Haskins
2008-02-21 15:26 ` [PATCH [RT] 04/14] disable PREEMPT_SPINLOCK_WAITERS when x86 ticket/fifo spins are in use Gregory Haskins
2008-02-21 15:26 ` [PATCH [RT] 05/14] rearrange rt_spin_lock sleep Gregory Haskins
2008-02-22 13:29   ` Gregory Haskins
2008-02-22 13:35     ` Steven Rostedt
2008-02-22 13:40       ` Peter Zijlstra
2008-02-22 13:35     ` Ingo Molnar
2008-02-22 13:43       ` Steven Rostedt
2008-02-22 13:46       ` Steven Rostedt
2008-02-21 15:26 ` [PATCH [RT] 06/14] optimize rt lock wakeup Gregory Haskins
2008-02-21 15:27 ` [PATCH [RT] 07/14] adaptive real-time lock support Gregory Haskins
2008-02-22 19:14   ` Pavel Machek
2008-02-21 15:27 ` [PATCH [RT] 08/14] add a loop counter based timeout mechanism Gregory Haskins
2008-02-21 16:41   ` Andi Kleen
2008-02-21 17:02     ` Gregory Haskins
2008-02-21 17:04     ` Peter W. Morreale
2008-02-21 17:06     ` Sven-Thorsten Dietrich
2008-02-22 19:08     ` Paul E. McKenney
2008-02-22 19:19       ` Bill Huey (hui)
2008-02-22 19:21         ` Bill Huey (hui)
2008-02-22 19:43           ` Paul E. McKenney
2008-02-22 19:55             ` Sven-Thorsten Dietrich
2008-02-22 20:23               ` Paul E. McKenney
2008-02-22 22:03                 ` Gregory Haskins
2008-02-23 12:31                   ` Andi Kleen
2008-02-23 16:32                     ` Paul E. McKenney
2008-02-25 23:52                   ` Sven-Thorsten Dietrich
2008-02-22 20:36               ` Peter W. Morreale
2008-02-23  7:36                 ` Sven-Thorsten Dietrich
2008-02-22 20:15             ` Peter W. Morreale
2008-02-21 15:27 ` [PATCH [RT] 09/14] adaptive mutexes Gregory Haskins
2008-02-21 15:27 ` [PATCH [RT] 10/14] adjust pi_lock usage in wakeup Gregory Haskins
2008-02-21 16:48   ` Steven Rostedt
2008-02-21 17:09     ` Peter W. Morreale
2008-02-21 15:27 ` [PATCH [RT] 11/14] optimize the !printk fastpath through the lock acquisition Gregory Haskins
2008-02-21 16:36   ` Andi Kleen
2008-02-21 16:47     ` Gregory Haskins
2008-02-22 19:18   ` Pavel Machek
2008-02-22 22:20     ` Gregory Haskins
2008-02-23  0:43       ` Bill Huey (hui)
2008-02-25  5:20         ` Gregory Haskins
2008-02-25  6:21           ` Bill Huey (hui)
2008-02-25  9:02             ` Bill Huey (hui)
2008-02-21 15:27 ` [PATCH [RT] 12/14] remove the extra call to try_to_take_lock Gregory Haskins
2008-02-21 15:27 ` [PATCH [RT] 13/14] allow rt-mutex lock-stealing to include lateral priority Gregory Haskins
2008-02-21 15:27 ` [PATCH [RT] 14/14] sysctl for runtime-control of lateral mutex stealing Gregory Haskins
2008-02-21 16:05 ` [PATCH [RT] 00/14] RFC - adaptive real-time locks Gregory Haskins
2008-02-21 21:24 ` Ingo Molnar
2008-02-21 21:33   ` Bill Huey (hui)
     [not found]     ` <20080221214219.GA27209@elte.hu>
2008-02-21 21:56       ` Gregory Haskins
2008-02-21 22:53       ` Bill Huey (hui)
2008-02-21 21:40   ` Gregory Haskins
2008-02-21 22:12   ` Peter W. Morreale
2008-02-21 22:42     ` Peter W. Morreale
2008-02-23  8:03   ` Andrew Morton
