linux-kernel.vger.kernel.org archive mirror
* Re-tune x86 uaccess code for PREEMPT_VOLUNTARY v2
@ 2013-08-14  0:07 Andi Kleen
  2013-08-14  0:07 ` [PATCH 1/8] x86: Add 1/2/4/8 byte optimization to 64bit __copy_{from,to}_user_inatomic Andi Kleen
                   ` (8 more replies)
  0 siblings, 9 replies; 11+ messages in thread
From: Andi Kleen @ 2013-08-14  0:07 UTC (permalink / raw)
  To: linux-kernel; +Cc: x86, torvalds

The x86 user access functions (*_user) were originally very well tuned,
with partially inlined code and other optimizations.

Then over time, various new checks -- particularly the sleep checks for
a voluntary preempt kernel -- destroyed a lot of these tunings.

A typical user access operation now does multiple useless
function calls. Also, without forced inlining, gcc's inlining
policy makes it even worse, adding more unnecessary calls.

Here's a typical example from ftrace:

     10)               |    might_fault() {
     10)               |      _cond_resched() {
     10)               |        should_resched() {
     10)               |          need_resched() {
     10)   0.063 us    |            test_ti_thread_flag();
     10)   0.643 us    |          }
     10)   1.238 us    |        }
     10)   1.845 us    |      }
     10)   2.438 us    |    }

So we spent 2.5us doing nothing (ok, it's a bit less without
ftrace, but still pretty bad).
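
Roughly, that chain corresponds to the following (a simplified
stand-alone model; the names mirror the kernel functions, the bodies
are reduced to stubs):

    int test_ti_thread_flag(void)  { return 0; }  /* TIF_NEED_RESCHED clear */
    int need_resched(void)         { return test_ti_thread_flag(); }
    int should_resched(void)       { return need_resched(); }
    int _cond_resched(void)        { return should_resched() ? 1 /* would schedule() */ : 0; }
    void might_fault(void)         { _cond_resched(); }

Every one of those is a real call at runtime, because none of them
gets inlined.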

Then in other cases we would have an out-of-line function,
but would actually do the might_sleep() checks in the inlined
caller. This doesn't make any sense at all.
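
The pattern is roughly this (illustrative stubs, not the actual
uaccess code):

    /* stands in for the real out-of-line copy routine */
    unsigned long copy_user_out_of_line(void *dst, const void *src, unsigned long n)
    {
            (void)dst; (void)src; (void)n;  /* actual copy elided */
            return 0;
    }

    /* stands in for might_sleep()/_cond_resched() */
    void might_sleep_check(void) { }

    static inline unsigned long
    copy_wrapper(void *dst, const void *src, unsigned long n)
    {
            might_sleep_check();                        /* expanded at every inlined call site */
            return copy_user_out_of_line(dst, src, n);  /* the real work is out of line anyway */
    }

Since the copy is an out-of-line call anyway, the check might as well
be done there, once, instead of being expanded into every caller.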

There were also a few other problems: for example, the x86-64 uaccess
code regularly falls back to string functions, even though a simple
mov would be enough. Every futex access to the lock variable would
actually use string instructions, even though it's just 4 bytes.
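
Patch 1/8 special-cases the constant 1/2/4/8 byte sizes. The idea,
sketched as plain user space C (simplified; the real kernel code also
has to handle __user pointers and faults):

    #include <string.h>

    static inline __attribute__((always_inline)) unsigned long
    copy_small(void *dst, const void *src, unsigned long size)
    {
            if (__builtin_constant_p(size)) {
                    switch (size) {
                    case 1: *(unsigned char *)dst  = *(const unsigned char *)src;  return 0;
                    case 2: *(unsigned short *)dst = *(const unsigned short *)src; return 0;
                    case 4: *(unsigned int *)dst   = *(const unsigned int *)src;   return 0;
                    case 8: *(unsigned long *)dst  = *(const unsigned long *)src;  return 0;
                    }
            }
            memcpy(dst, src, size);  /* stands in for the generic string-based copy */
            return 0;
    }

A constant 4-byte access like the futex case then compiles down to a
single mov instead of going through the string copy path.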

This patch kit is an attempt to get us back to sane code,
mostly by doing proper inlining and doing the sleep checks in the right
place. Unfortunately I had to add one tree sweep to avoid a nasty
include loop.

v2: Now completely remove reschedule checks for uaccess functions.



Thread overview: 11+ messages
2013-08-14  0:07 Re-tune x86 uaccess code for PREEMPT_VOLUNTARY v2 Andi Kleen
2013-08-14  0:07 ` [PATCH 1/8] x86: Add 1/2/4/8 byte optimization to 64bit __copy_{from,to}_user_inatomic Andi Kleen
2013-08-14  0:17   ` Linus Torvalds
2013-08-14  0:07 ` [PATCH 2/8] x86: Include linux/sched.h in asm/uaccess.h Andi Kleen
2013-08-14  0:07 ` [PATCH 3/8] tree-sweep: Include linux/sched.h for might_sleep users Andi Kleen
2013-08-14  0:07 ` [PATCH 4/8] Move might_sleep and friends from kernel.h to sched.h Andi Kleen
2013-08-14  0:07 ` [PATCH 5/8] sched: mark should_resched() __always_inline Andi Kleen
2013-08-14  0:07 ` [PATCH 6/8] Add might_fault_debug_only() Andi Kleen
2013-08-14  0:07 ` [PATCH 7/8] x86: Remove cond_resched() from uaccess code Andi Kleen
2013-08-14  0:07 ` [PATCH 8/8] sched: Inline the need_resched test into the caller for _cond_resched Andi Kleen
2013-08-14  9:56 ` Re-tune x86 uaccess code for PREEMPT_VOLUNTARY v2 Ingo Molnar
