* [PATCH 1/5] ptrace: remove incorrect unlikelys
2009-03-25 5:19 [PATCH 0/5] [PATCH][GIT PULL] remove unnecessary (un)likelys Steven Rostedt
@ 2009-03-25 5:19 ` Steven Rostedt
2009-03-25 7:21 ` Ingo Molnar
2009-03-25 9:28 ` Roland McGrath
2009-03-25 5:19 ` [PATCH 2/5] mm: remove unlikly NULL from kfree Steven Rostedt
` (6 subsequent siblings)
7 siblings, 2 replies; 37+ messages in thread
From: Steven Rostedt @ 2009-03-25 5:19 UTC (permalink / raw)
To: linux-kernel
Cc: Ingo Molnar, Andrew Morton, Thomas Gleixner, Peter Zijlstra,
Roland McGrath, Nick Piggin, Steven Rostedt
[-- Attachment #1: 0001-ptrace-remove-incorrect-unlikelys.patch --]
[-- Type: text/plain, Size: 1409 bytes --]
From: Steven Rostedt <rostedt@goodmis.org>
Impact: clean up
Accounding to the annotated branch profiler, the unlikelys used by
ptrace is incorrect every time.
correct incorrect % Function File Line
------- --------- - -------- ---- ----
0 24176 100 syscall_trace_leave ptrace.c 1444
0 21478 100 syscall_trace_enter ptrace.c 1424
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
---
arch/x86/kernel/ptrace.c | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kernel/ptrace.c b/arch/x86/kernel/ptrace.c
index 06ca07f..74a16db 100644
--- a/arch/x86/kernel/ptrace.c
+++ b/arch/x86/kernel/ptrace.c
@@ -1421,7 +1421,7 @@ asmregparm long syscall_trace_enter(struct pt_regs *regs)
tracehook_report_syscall_entry(regs))
ret = -1L;
- if (unlikely(current->audit_context)) {
+ if (current->audit_context) {
if (IS_IA32)
audit_syscall_entry(AUDIT_ARCH_I386,
regs->orig_ax,
@@ -1441,7 +1441,7 @@ asmregparm long syscall_trace_enter(struct pt_regs *regs)
asmregparm void syscall_trace_leave(struct pt_regs *regs)
{
- if (unlikely(current->audit_context))
+ if (current->audit_context)
audit_syscall_exit(AUDITSC_RESULT(regs->ax), regs->ax);
if (test_thread_flag(TIF_SYSCALL_TRACE))
--
1.6.2
--
^ permalink raw reply related [flat|nested] 37+ messages in thread* Re: [PATCH 1/5] ptrace: remove incorrect unlikelys
2009-03-25 5:19 ` [PATCH 1/5] ptrace: remove incorrect unlikelys Steven Rostedt
@ 2009-03-25 7:21 ` Ingo Molnar
2009-03-25 13:31 ` Christian Borntraeger
2009-03-25 13:42 ` Steven Rostedt
2009-03-25 9:28 ` Roland McGrath
1 sibling, 2 replies; 37+ messages in thread
From: Ingo Molnar @ 2009-03-25 7:21 UTC (permalink / raw)
To: Steven Rostedt
Cc: linux-kernel, Andrew Morton, Thomas Gleixner, Peter Zijlstra,
Roland McGrath, Nick Piggin, Steven Rostedt
* Steven Rostedt <rostedt@goodmis.org> wrote:
> From: Steven Rostedt <rostedt@goodmis.org>
>
> Impact: clean up
>
> Accounding to the annotated branch profiler, the unlikelys used by
> ptrace is incorrect every time.
>
> correct incorrect % Function File Line
> ------- --------- - -------- ---- ----
>
> 0 24176 100 syscall_trace_leave ptrace.c 1444
> 0 21478 100 syscall_trace_enter ptrace.c 1424
>
> Signed-off-by: Steven Rostedt <srostedt@redhat.com>
> ---
> arch/x86/kernel/ptrace.c | 4 ++--
> 1 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kernel/ptrace.c b/arch/x86/kernel/ptrace.c
> index 06ca07f..74a16db 100644
> --- a/arch/x86/kernel/ptrace.c
> +++ b/arch/x86/kernel/ptrace.c
> @@ -1421,7 +1421,7 @@ asmregparm long syscall_trace_enter(struct pt_regs *regs)
> tracehook_report_syscall_entry(regs))
> ret = -1L;
>
> - if (unlikely(current->audit_context)) {
> + if (current->audit_context) {
i suspect you got this result because you are running Fedora with
auditd enabled and running, right? Does SuSE and Ubuntu run with
auditing enabled as well? If yes then removing this annotation would
be right - otherwise the auditing-enabled case is considered the
less likely variant. (despite it being 100% wrong for your
particular configuration)
Ingo
^ permalink raw reply [flat|nested] 37+ messages in thread* Re: [PATCH 1/5] ptrace: remove incorrect unlikelys
2009-03-25 7:21 ` Ingo Molnar
@ 2009-03-25 13:31 ` Christian Borntraeger
2009-03-25 13:42 ` Steven Rostedt
1 sibling, 0 replies; 37+ messages in thread
From: Christian Borntraeger @ 2009-03-25 13:31 UTC (permalink / raw)
To: Ingo Molnar
Cc: Steven Rostedt, linux-kernel, Andrew Morton, Thomas Gleixner,
Peter Zijlstra, Roland McGrath, Nick Piggin, Steven Rostedt
Am Wednesday 25 March 2009 08:21:29 schrieb Ingo Molnar:
> > @@ -1421,7 +1421,7 @@ asmregparm long syscall_trace_enter(struct pt_regs *regs)
> > tracehook_report_syscall_entry(regs))
> > ret = -1L;
> >
> > - if (unlikely(current->audit_context)) {
> > + if (current->audit_context) {
>
> i suspect you got this result because you are running Fedora with
> auditd enabled and running, right? Does SuSE and Ubuntu run with
> auditing enabled as well? If yes then removing this annotation would
> be right - otherwise the auditing-enabled case is considered the
> less likely variant. (despite it being 100% wrong for your
> particular configuration)
Auditd can be enabled/disabled via kernel command line, no? In that case
there should be no unlikely and no likely. We should not optimize for a specific
value at compile time if the value can be changed at runtime.
This patch makes a lot of sense to me.
^ permalink raw reply [flat|nested] 37+ messages in thread* Re: [PATCH 1/5] ptrace: remove incorrect unlikelys
2009-03-25 7:21 ` Ingo Molnar
2009-03-25 13:31 ` Christian Borntraeger
@ 2009-03-25 13:42 ` Steven Rostedt
1 sibling, 0 replies; 37+ messages in thread
From: Steven Rostedt @ 2009-03-25 13:42 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, Andrew Morton, Thomas Gleixner, Peter Zijlstra,
Roland McGrath, Nick Piggin, Steven Rostedt
On Wed, 25 Mar 2009, Ingo Molnar wrote:
> > diff --git a/arch/x86/kernel/ptrace.c b/arch/x86/kernel/ptrace.c
> > index 06ca07f..74a16db 100644
> > --- a/arch/x86/kernel/ptrace.c
> > +++ b/arch/x86/kernel/ptrace.c
> > @@ -1421,7 +1421,7 @@ asmregparm long syscall_trace_enter(struct pt_regs *regs)
> > tracehook_report_syscall_entry(regs))
> > ret = -1L;
> >
> > - if (unlikely(current->audit_context)) {
> > + if (current->audit_context) {
>
> i suspect you got this result because you are running Fedora with
> auditd enabled and running, right? Does SuSE and Ubuntu run with
> auditing enabled as well? If yes then removing this annotation would
> be right - otherwise the auditing-enabled case is considered the
> less likely variant. (despite it being 100% wrong for your
> particular configuration)
I'm running an old RHEL 5.1 box. I guess we can look at what the other
distros run.
-- Steve
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH 1/5] ptrace: remove incorrect unlikelys
2009-03-25 5:19 ` [PATCH 1/5] ptrace: remove incorrect unlikelys Steven Rostedt
2009-03-25 7:21 ` Ingo Molnar
@ 2009-03-25 9:28 ` Roland McGrath
1 sibling, 0 replies; 37+ messages in thread
From: Roland McGrath @ 2009-03-25 9:28 UTC (permalink / raw)
To: Steven Rostedt
Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
Peter Zijlstra, Nick Piggin, Steven Rostedt
This is mislabeled, really. This is "x86: syscall-audit", not ptrace.
I think this one also warrants moving the if inside a generic (non-arch)
inline. The choice about unlikely() is not arch-specific, nor (AFIAK) is
the logic of the if.
Thanks,
Roland
^ permalink raw reply [flat|nested] 37+ messages in thread
* [PATCH 2/5] mm: remove unlikly NULL from kfree
2009-03-25 5:19 [PATCH 0/5] [PATCH][GIT PULL] remove unnecessary (un)likelys Steven Rostedt
2009-03-25 5:19 ` [PATCH 1/5] ptrace: remove incorrect unlikelys Steven Rostedt
@ 2009-03-25 5:19 ` Steven Rostedt
2009-03-25 7:34 ` Pekka Enberg
2009-03-25 5:19 ` [PATCH 3/5] sched: remove unlikely in pre_schedule_rt Steven Rostedt
` (5 subsequent siblings)
7 siblings, 1 reply; 37+ messages in thread
From: Steven Rostedt @ 2009-03-25 5:19 UTC (permalink / raw)
To: linux-kernel
Cc: Ingo Molnar, Andrew Morton, Thomas Gleixner, Peter Zijlstra,
Roland McGrath, Nick Piggin, Steven Rostedt
[-- Attachment #1: 0002-mm-remove-unlikly-NULL-from-kfree.patch --]
[-- Type: text/plain, Size: 1614 bytes --]
From: Steven Rostedt <rostedt@goodmis.org>
Impact: clean up
A NULL pointer to kfree is no longer unlikely, as seen by the
annotated branch profiler:
correct incorrect % Function File Line
------- --------- - -------- ---- ----
728571 1315540 64 kfree slab.c 3719
This makes sense, since we now encourage developers to just call kfree
without checking for NULL.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
---
mm/slab.c | 2 +-
mm/slob.c | 2 +-
mm/slub.c | 2 +-
3 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/mm/slab.c b/mm/slab.c
index 4d00855..0386c33 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -3716,7 +3716,7 @@ void kfree(const void *objp)
struct kmem_cache *c;
unsigned long flags;
- if (unlikely(ZERO_OR_NULL_PTR(objp)))
+ if (ZERO_OR_NULL_PTR(objp))
return;
local_irq_save(flags);
kfree_debugcheck(objp);
diff --git a/mm/slob.c b/mm/slob.c
index 52bc8a2..e077174 100644
--- a/mm/slob.c
+++ b/mm/slob.c
@@ -491,7 +491,7 @@ void kfree(const void *block)
{
struct slob_page *sp;
- if (unlikely(ZERO_OR_NULL_PTR(block)))
+ if (ZERO_OR_NULL_PTR(block))
return;
sp = (struct slob_page *)virt_to_page(block);
diff --git a/mm/slub.c b/mm/slub.c
index 0280eee..65dc436 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2743,7 +2743,7 @@ void kfree(const void *x)
struct page *page;
void *object = (void *)x;
- if (unlikely(ZERO_OR_NULL_PTR(x)))
+ if (ZERO_OR_NULL_PTR(x))
return;
page = virt_to_head_page(x);
--
1.6.2
--
^ permalink raw reply related [flat|nested] 37+ messages in thread* Re: [PATCH 2/5] mm: remove unlikly NULL from kfree
2009-03-25 5:19 ` [PATCH 2/5] mm: remove unlikly NULL from kfree Steven Rostedt
@ 2009-03-25 7:34 ` Pekka Enberg
2009-03-25 7:50 ` Thomas Gleixner
2009-03-25 8:02 ` Hua Zhong
0 siblings, 2 replies; 37+ messages in thread
From: Pekka Enberg @ 2009-03-25 7:34 UTC (permalink / raw)
To: Steven Rostedt
Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
Peter Zijlstra, Roland McGrath, Nick Piggin, Steven Rostedt,
Christoph Lameter, Matt Mackall
On Wed, Mar 25, 2009 at 7:19 AM, Steven Rostedt <rostedt@goodmis.org> wrote:
> From: Steven Rostedt <rostedt@goodmis.org>
>
> Impact: clean up
>
> A NULL pointer to kfree is no longer unlikely, as seen by the
> annotated branch profiler:
>
> correct incorrect % Function File Line
> ------- --------- - -------- ---- ----
> 728571 1315540 64 kfree slab.c 3719
>
> This makes sense, since we now encourage developers to just call kfree
> without checking for NULL.
But those are _error handling paths_ (at least supposed to be). I
wonder which call-sites are responsible for this. Can frtrace help us
here?
> ---
> mm/slab.c | 2 +-
> mm/slob.c | 2 +-
> mm/slub.c | 2 +-
> 3 files changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/mm/slab.c b/mm/slab.c
> index 4d00855..0386c33 100644
> --- a/mm/slab.c
> +++ b/mm/slab.c
> @@ -3716,7 +3716,7 @@ void kfree(const void *objp)
> struct kmem_cache *c;
> unsigned long flags;
>
> - if (unlikely(ZERO_OR_NULL_PTR(objp)))
> + if (ZERO_OR_NULL_PTR(objp))
> return;
> local_irq_save(flags);
> kfree_debugcheck(objp);
> diff --git a/mm/slob.c b/mm/slob.c
> index 52bc8a2..e077174 100644
> --- a/mm/slob.c
> +++ b/mm/slob.c
> @@ -491,7 +491,7 @@ void kfree(const void *block)
> {
> struct slob_page *sp;
>
> - if (unlikely(ZERO_OR_NULL_PTR(block)))
> + if (ZERO_OR_NULL_PTR(block))
> return;
>
> sp = (struct slob_page *)virt_to_page(block);
> diff --git a/mm/slub.c b/mm/slub.c
> index 0280eee..65dc436 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -2743,7 +2743,7 @@ void kfree(const void *x)
> struct page *page;
> void *object = (void *)x;
>
> - if (unlikely(ZERO_OR_NULL_PTR(x)))
> + if (ZERO_OR_NULL_PTR(x))
> return;
>
> page = virt_to_head_page(x);
> --
> 1.6.2
>
> --
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>
^ permalink raw reply [flat|nested] 37+ messages in thread* Re: [PATCH 2/5] mm: remove unlikly NULL from kfree
2009-03-25 7:34 ` Pekka Enberg
@ 2009-03-25 7:50 ` Thomas Gleixner
2009-03-25 8:01 ` Pekka Enberg
2009-03-25 21:20 ` Andrew Morton
2009-03-25 8:02 ` Hua Zhong
1 sibling, 2 replies; 37+ messages in thread
From: Thomas Gleixner @ 2009-03-25 7:50 UTC (permalink / raw)
To: Pekka Enberg
Cc: Steven Rostedt, linux-kernel, Ingo Molnar, Andrew Morton,
Peter Zijlstra, Roland McGrath, Nick Piggin, Steven Rostedt,
Christoph Lameter, Matt Mackall
On Wed, 25 Mar 2009, Pekka Enberg wrote:
> On Wed, Mar 25, 2009 at 7:19 AM, Steven Rostedt <rostedt@goodmis.org> wrote:
> > This makes sense, since we now encourage developers to just call kfree
> > without checking for NULL.
>
> But those are _error handling paths_ (at least supposed to be). I
> wonder which call-sites are responsible for this. Can frtrace help us
> here?
Why is this an error handler. We replaced tons of
if (obj)
kfree(obj);
constructs all over the kernel with kfree(obj); and let kfree deal
with the NULL pointer.
Thanks,
tglx
^ permalink raw reply [flat|nested] 37+ messages in thread* Re: [PATCH 2/5] mm: remove unlikly NULL from kfree
2009-03-25 7:50 ` Thomas Gleixner
@ 2009-03-25 8:01 ` Pekka Enberg
2009-03-25 21:20 ` Andrew Morton
1 sibling, 0 replies; 37+ messages in thread
From: Pekka Enberg @ 2009-03-25 8:01 UTC (permalink / raw)
To: Thomas Gleixner
Cc: Steven Rostedt, linux-kernel, Ingo Molnar, Andrew Morton,
Peter Zijlstra, Roland McGrath, Nick Piggin, Steven Rostedt,
Christoph Lameter, Matt Mackall
Hi Thomas,
On Wed, Mar 25, 2009 at 7:19 AM, Steven Rostedt <rostedt@goodmis.org> wrote:
> > > This makes sense, since we now encourage developers to just call kfree
> > > without checking for NULL.
On Wed, 25 Mar 2009, Pekka Enberg wrote:
> > But those are _error handling paths_ (at least supposed to be). I
> > wonder which call-sites are responsible for this. Can frtrace help us
> > here?
On Wed, 2009-03-25 at 08:50 +0100, Thomas Gleixner wrote:
> Why is this an error handler. We replaced tons of
>
> if (obj)
> kfree(obj);
>
> constructs all over the kernel with kfree(obj); and let kfree deal
> with the NULL pointer.
We encourage developers not to check for kfree() in the common
out-of-memory error handling paths. But what Steven's results suggest is
that the common case is something like this:
void *p = NULL;
if (/* unlikely condition */)
p = kmalloc(...);
kfree(p);
which, quite frankly, doesn't make much sense to me. That's why I would
really want to know which call-sites are causing this before applying
the patch.
Pekka
^ permalink raw reply [flat|nested] 37+ messages in thread* Re: [PATCH 2/5] mm: remove unlikly NULL from kfree
2009-03-25 7:50 ` Thomas Gleixner
2009-03-25 8:01 ` Pekka Enberg
@ 2009-03-25 21:20 ` Andrew Morton
1 sibling, 0 replies; 37+ messages in thread
From: Andrew Morton @ 2009-03-25 21:20 UTC (permalink / raw)
To: Thomas Gleixner
Cc: Pekka Enberg, Steven Rostedt, linux-kernel, Ingo Molnar,
Peter Zijlstra, Roland McGrath, Nick Piggin, Steven Rostedt,
Christoph Lameter, Matt Mackall
On Wed, 25 Mar 2009 08:50:44 +0100 (CET) Thomas Gleixner <tglx@linutronix.de> wrote:
> On Wed, 25 Mar 2009, Pekka Enberg wrote:
> > On Wed, Mar 25, 2009 at 7:19 AM, Steven Rostedt <rostedt@goodmis.org> wrote:
> > > This makes sense, since we now encourage developers to just call kfree
> > > without checking for NULL.
> >
> > But those are _error handling paths_ (at least supposed to be). I
> > wonder which call-sites are responsible for this. Can frtrace help us
> > here?
>
-mm's profile-likely-unlikely-macros.patch gives the backtraces you're
looking for.
I agree, probably some particular callsite is doing something silly and
is skewing all the instrumentation.
We have at times identified callsites where kfree(0) is so likely that
"remove the NULL test" was an undesirable change.
Many many many kfree() callsites _know_ that the pointer isn't NULL.
Having that test in kfree() was always stupid. What we should do is to
have a kfree_it_isnt_null() for those callsites so they can omit the test
altogether.
^ permalink raw reply [flat|nested] 37+ messages in thread
* RE: [PATCH 2/5] mm: remove unlikly NULL from kfree
2009-03-25 7:34 ` Pekka Enberg
2009-03-25 7:50 ` Thomas Gleixner
@ 2009-03-25 8:02 ` Hua Zhong
2009-03-25 8:06 ` Pekka Enberg
1 sibling, 1 reply; 37+ messages in thread
From: Hua Zhong @ 2009-03-25 8:02 UTC (permalink / raw)
To: 'Pekka Enberg', 'Steven Rostedt'
Cc: linux-kernel, 'Ingo Molnar', 'Andrew Morton',
'Thomas Gleixner', 'Peter Zijlstra',
'Roland McGrath', 'Nick Piggin',
'Steven Rostedt', 'Christoph Lameter',
'Matt Mackall'
> But those are _error handling paths_ (at least supposed to be). I
> wonder which call-sites are responsible for this. Can frtrace help us
> here?
I am not sure why you call these error paths.
I submitted the same patch two years ago, and you are still holding the same
argument.
http://www.archivum.info/linux.kernel/2006-04/msg06042.html
Have you used likely profiler? These are real numbers. If you insist on
calling them error paths then error paths are obviously the norm.
Hua
^ permalink raw reply [flat|nested] 37+ messages in thread
* RE: [PATCH 2/5] mm: remove unlikly NULL from kfree
2009-03-25 8:02 ` Hua Zhong
@ 2009-03-25 8:06 ` Pekka Enberg
2009-03-25 13:51 ` Pekka Enberg
0 siblings, 1 reply; 37+ messages in thread
From: Pekka Enberg @ 2009-03-25 8:06 UTC (permalink / raw)
To: Hua Zhong
Cc: 'Steven Rostedt', linux-kernel, 'Ingo Molnar',
'Andrew Morton', 'Thomas Gleixner',
'Peter Zijlstra', 'Roland McGrath',
'Nick Piggin', 'Steven Rostedt',
'Christoph Lameter', 'Matt Mackall'
On Wed, 2009-03-25 at 01:02 -0700, Hua Zhong wrote:
> > But those are _error handling paths_ (at least supposed to be). I
> > wonder which call-sites are responsible for this. Can frtrace help us
> > here?
>
> I am not sure why you call these error paths.
>
> I submitted the same patch two years ago, and you are still holding the same
> argument.
>
> http://www.archivum.info/linux.kernel/2006-04/msg06042.html
>
> Have you used likely profiler? These are real numbers. If you insist on
> calling them error paths then error paths are obviously the norm.
I am not denying the results, I am just saying that they don't make much
sense to me. Like I said, I would love to see the actual call-sites to
prove my argument wrong.
Pekka
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH 2/5] mm: remove unlikly NULL from kfree
2009-03-25 8:06 ` Pekka Enberg
@ 2009-03-25 13:51 ` Pekka Enberg
2009-03-25 14:47 ` Steven Rostedt
2009-03-26 16:10 ` Al Viro
0 siblings, 2 replies; 37+ messages in thread
From: Pekka Enberg @ 2009-03-25 13:51 UTC (permalink / raw)
To: Hua Zhong
Cc: Steven Rostedt, linux-kernel, Ingo Molnar, Andrew Morton,
Thomas Gleixner, Peter Zijlstra, Roland McGrath, Nick Piggin,
Steven Rostedt, Christoph Lameter, Matt Mackall, Al Viro
On Wed, 2009-03-25 at 01:02 -0700, Hua Zhong wrote:
>> > But those are _error handling paths_ (at least supposed to be). I
>> > wonder which call-sites are responsible for this. Can frtrace help us
>> > here?
>>
>> I am not sure why you call these error paths.
>>
>> I submitted the same patch two years ago, and you are still holding the same
>> argument.
>>
>> http://www.archivum.info/linux.kernel/2006-04/msg06042.html
>>
>> Have you used likely profiler? These are real numbers. If you insist on
>> calling them error paths then error paths are obviously the norm.
On Wed, Mar 25, 2009 at 10:06 AM, Pekka Enberg <penberg@cs.helsinki.fi> wrote:
> I am not denying the results, I am just saying that they don't make much
> sense to me. Like I said, I would love to see the actual call-sites to
> prove my argument wrong.
OK, so according to Steven, audit_syscall_exit() is one such call-site
that shows up in the traces. I don't really understand what's going on
there but if it is sane, maybe that warrants the removal of unlikely()
from kfree(). Hmm?
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH 2/5] mm: remove unlikly NULL from kfree
2009-03-25 13:51 ` Pekka Enberg
@ 2009-03-25 14:47 ` Steven Rostedt
2009-03-25 14:59 ` Pekka Enberg
2009-03-25 15:27 ` Steven Rostedt
2009-03-26 16:10 ` Al Viro
1 sibling, 2 replies; 37+ messages in thread
From: Steven Rostedt @ 2009-03-25 14:47 UTC (permalink / raw)
To: Pekka Enberg
Cc: Hua Zhong, linux-kernel, Ingo Molnar, Andrew Morton,
Thomas Gleixner, Peter Zijlstra, Roland McGrath, Nick Piggin,
Steven Rostedt, Christoph Lameter, Matt Mackall, Al Viro
On Wed, 25 Mar 2009, Pekka Enberg wrote:
> On Wed, 2009-03-25 at 01:02 -0700, Hua Zhong wrote:
> >> > But those are _error handling paths_ (at least supposed to be). I
> >> > wonder which call-sites are responsible for this. Can frtrace help us
> >> > here?
> >>
> >> I am not sure why you call these error paths.
> >>
> >> I submitted the same patch two years ago, and you are still holding the same
> >> argument.
> >>
> >> http://www.archivum.info/linux.kernel/2006-04/msg06042.html
> >>
> >> Have you used likely profiler? These are real numbers. If you insist on
> >> calling them error paths then error paths are obviously the norm.
>
> On Wed, Mar 25, 2009 at 10:06 AM, Pekka Enberg <penberg@cs.helsinki.fi> wrote:
> > I am not denying the results, I am just saying that they don't make much
> > sense to me. Like I said, I would love to see the actual call-sites to
> > prove my argument wrong.
>
> OK, so according to Steven, audit_syscall_exit() is one such call-site
> that shows up in the traces. I don't really understand what's going on
> there but if it is sane, maybe that warrants the removal of unlikely()
> from kfree(). Hmm?
After disabling AUDIT_SYSCALLS I have this:
# cat /debug/tracing/trace | sort -u
record_nulls: ptr=(null) (ext3_get_acl+0x1e0/0x3f0 [ext3])
record_nulls: ptr=(null) (free_bitmap+0x29/0x70)
record_nulls: ptr=(null) (free_tty_struct+0x1d/0x40)
record_nulls: ptr=(null) (ftrace_graph_exit_task+0x1e/0x20)
record_nulls: ptr=(null) (inet_sock_destruct+0x1cb/0x2a0)
record_nulls: ptr=(null) (ip_cork_release+0x24/0x50)
record_nulls: ptr=(null) (keyctl_join_session_keyring+0x5a/0x70)
record_nulls: ptr=(null) (key_user_lookup+0x183/0x220)
record_nulls: ptr=(null) (kobject_set_name_vargs+0x43/0x50)
record_nulls: ptr=(null) (netlink_release+0x1a4/0x2f0)
record_nulls: ptr=(null) (release_sysfs_dirent+0x20/0xc0)
record_nulls: ptr=(null) (sysfs_open_file+0x1c8/0x3e0)
record_nulls: ptr=(null) (tty_write+0x16a/0x290)
I added a hook to only record when NULL is passed into kfree.
Also note, that after disabling AUDIT_SYSCALLS I now only have roughly 7%
NULL hit rate. Still, unlikely is probably not a benefit here.
correct incorrect % Function File Line
------- --------- - -------- ---- ----
108129 8214 7 kfree slub.c 2796
And yes, this is a different kernel than the patch. This is running on tip
and not 2.6.29, and also, you can see, uses slub.c not slab.c. The reason
I changed was because:
1) it has better tracing utilities
2) included your trace point patch ;-)
The header is still messed up, because I'm using the kmemtrace branch and
not the ftrace branch, but I had better go back and make sure they are
fixed.
-- Steve
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH 2/5] mm: remove unlikly NULL from kfree
2009-03-25 14:47 ` Steven Rostedt
@ 2009-03-25 14:59 ` Pekka Enberg
2009-03-25 15:04 ` Steven Rostedt
2009-03-25 15:27 ` Steven Rostedt
1 sibling, 1 reply; 37+ messages in thread
From: Pekka Enberg @ 2009-03-25 14:59 UTC (permalink / raw)
To: Steven Rostedt
Cc: Hua Zhong, linux-kernel, Ingo Molnar, Andrew Morton,
Thomas Gleixner, Peter Zijlstra, Roland McGrath, Nick Piggin,
Steven Rostedt, Christoph Lameter, Matt Mackall, Al Viro
Hi Steven,
On Wed, 25 Mar 2009, Pekka Enberg wrote:
> > OK, so according to Steven, audit_syscall_exit() is one such call-site
> > that shows up in the traces. I don't really understand what's going on
> > there but if it is sane, maybe that warrants the removal of unlikely()
> > from kfree(). Hmm?
On Wed, 2009-03-25 at 10:47 -0400, Steven Rostedt wrote:
> After disabling AUDIT_SYSCALLS I have this:
>
> # cat /debug/tracing/trace | sort -u
>
> record_nulls: ptr=(null) (ext3_get_acl+0x1e0/0x3f0 [ext3])
> record_nulls: ptr=(null) (free_bitmap+0x29/0x70)
> record_nulls: ptr=(null) (free_tty_struct+0x1d/0x40)
> record_nulls: ptr=(null) (ftrace_graph_exit_task+0x1e/0x20)
> record_nulls: ptr=(null) (inet_sock_destruct+0x1cb/0x2a0)
> record_nulls: ptr=(null) (ip_cork_release+0x24/0x50)
> record_nulls: ptr=(null) (keyctl_join_session_keyring+0x5a/0x70)
> record_nulls: ptr=(null) (key_user_lookup+0x183/0x220)
> record_nulls: ptr=(null) (kobject_set_name_vargs+0x43/0x50)
> record_nulls: ptr=(null) (netlink_release+0x1a4/0x2f0)
> record_nulls: ptr=(null) (release_sysfs_dirent+0x20/0xc0)
> record_nulls: ptr=(null) (sysfs_open_file+0x1c8/0x3e0)
> record_nulls: ptr=(null) (tty_write+0x16a/0x290)
>
> I added a hook to only record when NULL is passed into kfree.
>
> Also note, that after disabling AUDIT_SYSCALLS I now only have roughly 7%
> NULL hit rate. Still, unlikely is probably not a benefit here.
Thanks for doing this. Do you mean that 93% hit ratio is not enough to
be a performance gain?
Pekka
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH 2/5] mm: remove unlikly NULL from kfree
2009-03-25 14:59 ` Pekka Enberg
@ 2009-03-25 15:04 ` Steven Rostedt
2009-03-25 15:08 ` Steven Rostedt
0 siblings, 1 reply; 37+ messages in thread
From: Steven Rostedt @ 2009-03-25 15:04 UTC (permalink / raw)
To: Pekka Enberg
Cc: Hua Zhong, linux-kernel, Ingo Molnar, Andrew Morton,
Thomas Gleixner, Peter Zijlstra, Roland McGrath, Nick Piggin,
Steven Rostedt, Christoph Lameter, Matt Mackall, Al Viro
On Wed, 25 Mar 2009, Pekka Enberg wrote:
> Hi Steven,
>
> On Wed, 25 Mar 2009, Pekka Enberg wrote:
> > > OK, so according to Steven, audit_syscall_exit() is one such call-site
> > > that shows up in the traces. I don't really understand what's going on
> > > there but if it is sane, maybe that warrants the removal of unlikely()
> > > from kfree(). Hmm?
>
> On Wed, 2009-03-25 at 10:47 -0400, Steven Rostedt wrote:
> > After disabling AUDIT_SYSCALLS I have this:
> >
> > # cat /debug/tracing/trace | sort -u
> >
> > record_nulls: ptr=(null) (ext3_get_acl+0x1e0/0x3f0 [ext3])
> > record_nulls: ptr=(null) (free_bitmap+0x29/0x70)
> > record_nulls: ptr=(null) (free_tty_struct+0x1d/0x40)
> > record_nulls: ptr=(null) (ftrace_graph_exit_task+0x1e/0x20)
> > record_nulls: ptr=(null) (inet_sock_destruct+0x1cb/0x2a0)
> > record_nulls: ptr=(null) (ip_cork_release+0x24/0x50)
> > record_nulls: ptr=(null) (keyctl_join_session_keyring+0x5a/0x70)
> > record_nulls: ptr=(null) (key_user_lookup+0x183/0x220)
> > record_nulls: ptr=(null) (kobject_set_name_vargs+0x43/0x50)
> > record_nulls: ptr=(null) (netlink_release+0x1a4/0x2f0)
> > record_nulls: ptr=(null) (release_sysfs_dirent+0x20/0xc0)
> > record_nulls: ptr=(null) (sysfs_open_file+0x1c8/0x3e0)
> > record_nulls: ptr=(null) (tty_write+0x16a/0x290)
> >
> > I added a hook to only record when NULL is passed into kfree.
> >
> > Also note, that after disabling AUDIT_SYSCALLS I now only have roughly 7%
> > NULL hit rate. Still, unlikely is probably not a benefit here.
>
> Thanks for doing this. Do you mean that 93% hit ratio is not enough to
> be a performance gain?
I think it was Christoph Lameter (good you Cc'd him) told me that anything
less that 99% is not worthy of a (un)likely macro.
I honestly don't know.
-- Steve
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH 2/5] mm: remove unlikly NULL from kfree
2009-03-25 15:04 ` Steven Rostedt
@ 2009-03-25 15:08 ` Steven Rostedt
2009-03-25 16:14 ` Matt Mackall
0 siblings, 1 reply; 37+ messages in thread
From: Steven Rostedt @ 2009-03-25 15:08 UTC (permalink / raw)
To: Pekka Enberg
Cc: Hua Zhong, linux-kernel, Ingo Molnar, Andrew Morton,
Thomas Gleixner, Peter Zijlstra, Roland McGrath, Nick Piggin,
Steven Rostedt, Christoph Lameter, Matt Mackall, Al Viro
On Wed, 25 Mar 2009, Steven Rostedt wrote:
>
> On Wed, 25 Mar 2009, Pekka Enberg wrote:
>
> > Hi Steven,
> >
> > On Wed, 25 Mar 2009, Pekka Enberg wrote:
> > > > OK, so according to Steven, audit_syscall_exit() is one such call-site
> > > > that shows up in the traces. I don't really understand what's going on
> > > > there but if it is sane, maybe that warrants the removal of unlikely()
> > > > from kfree(). Hmm?
> >
> > On Wed, 2009-03-25 at 10:47 -0400, Steven Rostedt wrote:
> > > After disabling AUDIT_SYSCALLS I have this:
> > >
> > > # cat /debug/tracing/trace | sort -u
> > >
> > > record_nulls: ptr=(null) (ext3_get_acl+0x1e0/0x3f0 [ext3])
> > > record_nulls: ptr=(null) (free_bitmap+0x29/0x70)
> > > record_nulls: ptr=(null) (free_tty_struct+0x1d/0x40)
> > > record_nulls: ptr=(null) (ftrace_graph_exit_task+0x1e/0x20)
> > > record_nulls: ptr=(null) (inet_sock_destruct+0x1cb/0x2a0)
> > > record_nulls: ptr=(null) (ip_cork_release+0x24/0x50)
> > > record_nulls: ptr=(null) (keyctl_join_session_keyring+0x5a/0x70)
> > > record_nulls: ptr=(null) (key_user_lookup+0x183/0x220)
> > > record_nulls: ptr=(null) (kobject_set_name_vargs+0x43/0x50)
> > > record_nulls: ptr=(null) (netlink_release+0x1a4/0x2f0)
> > > record_nulls: ptr=(null) (release_sysfs_dirent+0x20/0xc0)
> > > record_nulls: ptr=(null) (sysfs_open_file+0x1c8/0x3e0)
> > > record_nulls: ptr=(null) (tty_write+0x16a/0x290)
> > >
> > > I added a hook to only record when NULL is passed into kfree.
> > >
> > > Also note, that after disabling AUDIT_SYSCALLS I now only have roughly 7%
> > > NULL hit rate. Still, unlikely is probably not a benefit here.
> >
> > Thanks for doing this. Do you mean that 93% hit ratio is not enough to
> > be a performance gain?
>
> I think it was Christoph Lameter (good you Cc'd him) told me that anything
> less that 99% is not worthy of a (un)likely macro.
>
> I honestly don't know.
I think the theory is that gcc and the CPU can handle normal branch
predictions well. But if you use one of the prediction macros, gcc
(and some archs) behaves differently, such that taking the wrong branch
can cost more than the time saved with all the other correct hits.
Again, I'm not sure. I haven't done the bench marks. Perhaps someone else
is more apt at knowing the details here.
-- Steve
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH 2/5] mm: remove unlikly NULL from kfree
2009-03-25 15:08 ` Steven Rostedt
@ 2009-03-25 16:14 ` Matt Mackall
2009-03-25 16:34 ` Steven Rostedt
0 siblings, 1 reply; 37+ messages in thread
From: Matt Mackall @ 2009-03-25 16:14 UTC (permalink / raw)
To: Steven Rostedt
Cc: Pekka Enberg, Hua Zhong, linux-kernel, Ingo Molnar, Andrew Morton,
Thomas Gleixner, Peter Zijlstra, Roland McGrath, Nick Piggin,
Steven Rostedt, Christoph Lameter, Al Viro
On Wed, 2009-03-25 at 11:08 -0400, Steven Rostedt wrote:
> On Wed, 25 Mar 2009, Steven Rostedt wrote:
>
> >
> > On Wed, 25 Mar 2009, Pekka Enberg wrote:
> >
> > > Hi Steven,
> > >
> > > On Wed, 25 Mar 2009, Pekka Enberg wrote:
> > > > > OK, so according to Steven, audit_syscall_exit() is one such call-site
> > > > > that shows up in the traces. I don't really understand what's going on
> > > > > there but if it is sane, maybe that warrants the removal of unlikely()
> > > > > from kfree(). Hmm?
> > >
> > > On Wed, 2009-03-25 at 10:47 -0400, Steven Rostedt wrote:
> > > > After disabling AUDIT_SYSCALLS I have this:
> > > >
> > > > # cat /debug/tracing/trace | sort -u
> > > >
> > > > record_nulls: ptr=(null) (ext3_get_acl+0x1e0/0x3f0 [ext3])
> > > > record_nulls: ptr=(null) (free_bitmap+0x29/0x70)
> > > > record_nulls: ptr=(null) (free_tty_struct+0x1d/0x40)
> > > > record_nulls: ptr=(null) (ftrace_graph_exit_task+0x1e/0x20)
> > > > record_nulls: ptr=(null) (inet_sock_destruct+0x1cb/0x2a0)
> > > > record_nulls: ptr=(null) (ip_cork_release+0x24/0x50)
> > > > record_nulls: ptr=(null) (keyctl_join_session_keyring+0x5a/0x70)
> > > > record_nulls: ptr=(null) (key_user_lookup+0x183/0x220)
> > > > record_nulls: ptr=(null) (kobject_set_name_vargs+0x43/0x50)
> > > > record_nulls: ptr=(null) (netlink_release+0x1a4/0x2f0)
> > > > record_nulls: ptr=(null) (release_sysfs_dirent+0x20/0xc0)
> > > > record_nulls: ptr=(null) (sysfs_open_file+0x1c8/0x3e0)
> > > > record_nulls: ptr=(null) (tty_write+0x16a/0x290)
> > > >
> > > > I added a hook to only record when NULL is passed into kfree.
> > > >
> > > > Also note, that after disabling AUDIT_SYSCALLS I now only have roughly 7%
> > > > NULL hit rate. Still, unlikely is probably not a benefit here.
> > >
> > > Thanks for doing this. Do you mean that 93% hit ratio is not enough to
> > > be a performance gain?
> >
> > I think it was Christoph Lameter (good you Cc'd him) told me that anything
> > less that 99% is not worthy of a (un)likely macro.
> >
> > I honestly don't know.
>
> I think the theory is that gcc and the CPU can handle normal branch
> predictions well. But if you use one of the prediction macros, gcc
> (and some archs) behaves differently, such that taking the wrong branch
> can cost more than the time saved with all the other correct hits.
>
> Again, I'm not sure. I haven't done the bench marks. Perhaps someone else
> is more apt at knowing the details here.
>From first principles, we can make a reasonable model of branch
prediction success with a branch cache:
hot cache cold cache cold cache cold cache
w|w/o hint good hint bad hint
p near 0 + + + -
p near .5 0 0 0 0
p near 1 + - + -
(this assumes the CPU is biased against branching in the cold cache
case)
Branch prediction miss has a penalty measured in some smallish number of
cycles. So the impact in cycles/sec[1] is (p(miss) * penalty) * (calls /
sec). Because the branch cache kicks in and hides our unlikely hint with
a hot cache, we can't get a high calls/sec, so to have much impact,
we've got to have a very high probably of a missed branch (p near 1)
_and_ cold cache.
So for CPUs with a branch cache, unlikely hints only make sense in
fairly extreme cases. And I think that includes most CPU families these
days as it's pretty much required to get much advantage from making the
CPU clock faster than the memory bus.
We'd have a lot of trouble benchmarking this meaningfully as hot caches
kill the effect. And it would of course depend directly on a given CPU's
branch cache size and branch miss penalty, numbers that vary from model
to model. So I think unless we can confidently state that a branch is
rarely taken, there's very little upside to adding unlikely.
On the other hand, there's also very little downside until our hint is
grossly inaccurate. So there's a huge hysteresis here: if p is < .99,
not much point adding unlikely. If p is > .1, not much point removing
it.
[1] Note that cycles/sec is dimensionless as cycles and seconds are both
measures of time. So impact here is in units of very small fractions of
a percent.
--
http://selenic.com : development and support for Mercurial and Linux
^ permalink raw reply [flat|nested] 37+ messages in thread* Re: [PATCH 2/5] mm: remove unlikly NULL from kfree
2009-03-25 16:14 ` Matt Mackall
@ 2009-03-25 16:34 ` Steven Rostedt
2009-03-25 20:26 ` Jeremy Fitzhardinge
2009-03-25 21:01 ` Matt Mackall
0 siblings, 2 replies; 37+ messages in thread
From: Steven Rostedt @ 2009-03-25 16:34 UTC (permalink / raw)
To: Matt Mackall
Cc: Pekka Enberg, Hua Zhong, linux-kernel, Ingo Molnar, Andrew Morton,
Thomas Gleixner, Peter Zijlstra, Roland McGrath, Nick Piggin,
Steven Rostedt, Christoph Lameter, Al Viro
On Wed, 25 Mar 2009, Matt Mackall wrote:
> >
> > I think the theory is that gcc and the CPU can handle normal branch
> > predictions well. But if you use one of the prediction macros, gcc
> > (and some archs) behaves differently, such that taking the wrong branch
> > can cost more than the time saved with all the other correct hits.
> >
> > Again, I'm not sure. I haven't done the bench marks. Perhaps someone else
> > is more apt at knowing the details here.
>
> >From first principles, we can make a reasonable model of branch
> prediction success with a branch cache:
>
> hot cache cold cache cold cache cold cache
> w|w/o hint good hint bad hint
> p near 0 + + + -
> p near .5 0 0 0 0
> p near 1 + - + -
>
> (this assumes the CPU is biased against branching in the cold cache
> case)
>
> Branch prediction miss has a penalty measured in some smallish number of
> cycles. So the impact in cycles/sec[1] is (p(miss) * penalty) * (calls /
> sec). Because the branch cache kicks in and hides our unlikely hint with
> a hot cache, we can't get a high calls/sec, so to have much impact,
> we've got to have a very high probably of a missed branch (p near 1)
> _and_ cold cache.
>
> So for CPUs with a branch cache, unlikely hints only make sense in
> fairly extreme cases. And I think that includes most CPU families these
> days as it's pretty much required to get much advantage from making the
> CPU clock faster than the memory bus.
>
> We'd have a lot of trouble benchmarking this meaningfully as hot caches
> kill the effect. And it would of course depend directly on a given CPU's
> branch cache size and branch miss penalty, numbers that vary from model
> to model. So I think unless we can confidently state that a branch is
> rarely taken, there's very little upside to adding unlikely.
>
> On the other hand, there's also very little downside until our hint is
> grossly inaccurate. So there's a huge hysteresis here: if p is < .99,
> not much point adding unlikely. If p is > .1, not much point removing
> it.
>
> [1] Note that cycles/sec is dimensionless as cycles and seconds are both
> measures of time. So impact here is in units of very small fractions of
> a percent.
Hi Matt,
Thanks for this info!
Although gcc plays a role too. That is, if we have
if (x)
do something small;
do something large;
this can be broken into:
cmp x
beq 1f
do something small
1:
do something large
Which plays nice with the cache. But, by adding a unlikely(x), gcc will
probably choose to do:
cmp x
bne 2f
1:
do something large
ret;
2:
do something small
b 1b
which hurts in a number of ways.
-- Steve
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH 2/5] mm: remove unlikly NULL from kfree
2009-03-25 16:34 ` Steven Rostedt
@ 2009-03-25 20:26 ` Jeremy Fitzhardinge
2009-03-25 21:09 ` Matt Mackall
2009-03-25 21:01 ` Matt Mackall
1 sibling, 1 reply; 37+ messages in thread
From: Jeremy Fitzhardinge @ 2009-03-25 20:26 UTC (permalink / raw)
To: Steven Rostedt
Cc: Matt Mackall, Pekka Enberg, Hua Zhong, linux-kernel, Ingo Molnar,
Andrew Morton, Thomas Gleixner, Peter Zijlstra, Roland McGrath,
Nick Piggin, Steven Rostedt, Christoph Lameter, Al Viro
Steven Rostedt wrote:
> Thanks for this info!
>
> Although gcc plays a role too. That is, if we have
>
> if (x)
> do something small;
>
> do something large;
>
>
> this can be broken into:
>
> cmp x
> beq 1f
> do something small
> 1:
> do something large
>
> Which plays nice with the cache. But, by adding a unlikely(x), gcc will
> probably choose to do:
>
> cmp x
> bne 2f
>
> 1:
> do something large
>
> ret;
>
> 2:
> do something small
> b 1b
>
> which hurts in a number of ways.
>
I think that's probably the dominant effect on x86 systems, because
Intel doesn't recommend using the branch hint prefixes as far as I can
tell (their consumption of icache space outweighs any benefit of priming
the predictor).
J
^ permalink raw reply [flat|nested] 37+ messages in thread* Re: [PATCH 2/5] mm: remove unlikly NULL from kfree
2009-03-25 20:26 ` Jeremy Fitzhardinge
@ 2009-03-25 21:09 ` Matt Mackall
0 siblings, 0 replies; 37+ messages in thread
From: Matt Mackall @ 2009-03-25 21:09 UTC (permalink / raw)
To: Jeremy Fitzhardinge
Cc: Steven Rostedt, Pekka Enberg, Hua Zhong, linux-kernel,
Ingo Molnar, Andrew Morton, Thomas Gleixner, Peter Zijlstra,
Roland McGrath, Nick Piggin, Steven Rostedt, Christoph Lameter,
Al Viro
On Wed, 2009-03-25 at 13:26 -0700, Jeremy Fitzhardinge wrote:
> Steven Rostedt wrote:
> > Thanks for this info!
> >
> > Although gcc plays a role too. That is, if we have
> >
> > if (x)
> > do something small;
> >
> > do something large;
> >
> >
> > this can be broken into:
> >
> > cmp x
> > beq 1f
> > do something small
> > 1:
> > do something large
> >
> > Which plays nice with the cache. But, by adding a unlikely(x), gcc will
> > probably choose to do:
> >
> > cmp x
> > bne 2f
> >
> > 1:
> > do something large
> >
> > ret;
> >
> > 2:
> > do something small
> > b 1b
> >
> > which hurts in a number of ways.
> >
>
> I think that's probably the dominant effect on x86 systems, because
> Intel doesn't recommend using the branch hint prefixes as far as I can
> tell (their consumption of icache space outweighs any benefit of priming
> the predictor).
Yeah, I was actually thinking 'hint' in the gcc sense of adding
unlikely() or not, which results in, say, choosing one code layout vs
the other based on the CPU's cold-cache bias.
--
http://selenic.com : development and support for Mercurial and Linux
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH 2/5] mm: remove unlikly NULL from kfree
2009-03-25 16:34 ` Steven Rostedt
2009-03-25 20:26 ` Jeremy Fitzhardinge
@ 2009-03-25 21:01 ` Matt Mackall
2009-03-25 21:24 ` Steven Rostedt
1 sibling, 1 reply; 37+ messages in thread
From: Matt Mackall @ 2009-03-25 21:01 UTC (permalink / raw)
To: Steven Rostedt
Cc: Pekka Enberg, Hua Zhong, linux-kernel, Ingo Molnar, Andrew Morton,
Thomas Gleixner, Peter Zijlstra, Roland McGrath, Nick Piggin,
Steven Rostedt, Christoph Lameter, Al Viro
On Wed, 2009-03-25 at 12:34 -0400, Steven Rostedt wrote:
> On Wed, 25 Mar 2009, Matt Mackall wrote:
> > >
> > > I think the theory is that gcc and the CPU can handle normal branch
> > > predictions well. But if you use one of the prediction macros, gcc
> > > (and some archs) behaves differently, such that taking the wrong branch
> > > can cost more than the time saved with all the other correct hits.
> > >
> > > Again, I'm not sure. I haven't done the bench marks. Perhaps someone else
> > > is more apt at knowing the details here.
> >
> > >From first principles, we can make a reasonable model of branch
> > prediction success with a branch cache:
> >
> > hot cache cold cache cold cache cold cache
> > w|w/o hint good hint bad hint
> > p near 0 + + + -
> > p near .5 0 0 0 0
> > p near 1 + - + -
> >
> > (this assumes the CPU is biased against branching in the cold cache
> > case)
> >
> > Branch prediction miss has a penalty measured in some smallish number of
> > cycles. So the impact in cycles/sec[1] is (p(miss) * penalty) * (calls /
> > sec). Because the branch cache kicks in and hides our unlikely hint with
> > a hot cache, we can't get a high calls/sec, so to have much impact,
> > we've got to have a very high probably of a missed branch (p near 1)
> > _and_ cold cache.
> >
> > So for CPUs with a branch cache, unlikely hints only make sense in
> > fairly extreme cases. And I think that includes most CPU families these
> > days as it's pretty much required to get much advantage from making the
> > CPU clock faster than the memory bus.
> >
> > We'd have a lot of trouble benchmarking this meaningfully as hot caches
> > kill the effect. And it would of course depend directly on a given CPU's
> > branch cache size and branch miss penalty, numbers that vary from model
> > to model. So I think unless we can confidently state that a branch is
> > rarely taken, there's very little upside to adding unlikely.
> >
> > On the other hand, there's also very little downside until our hint is
> > grossly inaccurate. So there's a huge hysteresis here: if p is < .99,
> > not much point adding unlikely. If p is > .1, not much point removing
> > it.
> >
> > [1] Note that cycles/sec is dimensionless as cycles and seconds are both
> > measures of time. So impact here is in units of very small fractions of
> > a percent.
>
> Hi Matt,
>
> Thanks for this info!
>
> Although gcc plays a role too. That is, if we have
>
> if (x)
> do something small;
>
> do something large;
>
>
> this can be broken into:
>
> cmp x
> beq 1f
> do something small
> 1:
> do something large
>
> Which plays nice with the cache. But, by adding a unlikely(x), gcc will
> probably choose to do:
>
> cmp x
> bne 2f
>
> 1:
> do something large
>
> ret;
>
> 2:
> do something small
> b 1b
>
> which hurts in a number of ways.
The cost is an unconditional branch; the ret already exists. There's a
slightly larger cache footprint for the small branch and a slightly
smaller footprint for the large branch. If p is close to .5 and
calls/sec is high, the cache footprint is the sum of the footprint of
both branches. But if calls/sec is close to low, cache footprint is also
low.
So, yeah, I think this is a good additional argument to err on the side
of not adding these things at all. And I certainly wasn't intending to
defend the ones in kfree.
But I'm also skeptical of whether it's worth spending much time actively
routing out the moderately incorrect instances. It's going to be nearly
immune to performance benchmarking. We should instead just actively
discourage using unlikely in new code.
--
http://selenic.com : development and support for Mercurial and Linux
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH 2/5] mm: remove unlikly NULL from kfree
2009-03-25 21:01 ` Matt Mackall
@ 2009-03-25 21:24 ` Steven Rostedt
0 siblings, 0 replies; 37+ messages in thread
From: Steven Rostedt @ 2009-03-25 21:24 UTC (permalink / raw)
To: Matt Mackall
Cc: Pekka Enberg, Hua Zhong, linux-kernel, Ingo Molnar, Andrew Morton,
Thomas Gleixner, Peter Zijlstra, Roland McGrath, Nick Piggin,
Steven Rostedt, Christoph Lameter, Al Viro
On Wed, 25 Mar 2009, Matt Mackall wrote:
>
> The cost is an unconditional branch; the ret already exists. There's a
> slightly larger cache footprint for the small branch and a slightly
> smaller footprint for the large branch. If p is close to .5 and
> calls/sec is high, the cache footprint is the sum of the footprint of
> both branches. But if calls/sec is close to low, cache footprint is also
> low.
>
> So, yeah, I think this is a good additional argument to err on the side
> of not adding these things at all. And I certainly wasn't intending to
> defend the ones in kfree.
>
> But I'm also skeptical of whether it's worth spending much time actively
> routing out the moderately incorrect instances. It's going to be nearly
> immune to performance benchmarking. We should instead just actively
> discourage using unlikely in new code.
>
Sure. OK, actually I'd say they are valid for 100% hits. These are for
error conditions and trace point like data. Where, we want the least
amount of overhead when the condition is false.
But I can see, we've been brought up (well some of us) that branches are
horrible for pipelines, and any help in deciding the choice is always
tempting.
-- Steve
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH 2/5] mm: remove unlikly NULL from kfree
2009-03-25 14:47 ` Steven Rostedt
2009-03-25 14:59 ` Pekka Enberg
@ 2009-03-25 15:27 ` Steven Rostedt
1 sibling, 0 replies; 37+ messages in thread
From: Steven Rostedt @ 2009-03-25 15:27 UTC (permalink / raw)
To: Pekka Enberg
Cc: Hua Zhong, linux-kernel, Ingo Molnar, Andrew Morton,
Thomas Gleixner, Peter Zijlstra, Roland McGrath, Nick Piggin,
Steven Rostedt, Christoph Lameter, Matt Mackall, Al Viro
On Wed, 25 Mar 2009, Steven Rostedt wrote:
>
> # cat /debug/tracing/trace | sort -u
>
> record_nulls: ptr=(null) (ext3_get_acl+0x1e0/0x3f0 [ext3])
> record_nulls: ptr=(null) (free_bitmap+0x29/0x70)
> record_nulls: ptr=(null) (free_tty_struct+0x1d/0x40)
> record_nulls: ptr=(null) (ftrace_graph_exit_task+0x1e/0x20)
> record_nulls: ptr=(null) (inet_sock_destruct+0x1cb/0x2a0)
> record_nulls: ptr=(null) (ip_cork_release+0x24/0x50)
> record_nulls: ptr=(null) (keyctl_join_session_keyring+0x5a/0x70)
> record_nulls: ptr=(null) (key_user_lookup+0x183/0x220)
> record_nulls: ptr=(null) (kobject_set_name_vargs+0x43/0x50)
> record_nulls: ptr=(null) (netlink_release+0x1a4/0x2f0)
> record_nulls: ptr=(null) (release_sysfs_dirent+0x20/0xc0)
> record_nulls: ptr=(null) (sysfs_open_file+0x1c8/0x3e0)
> record_nulls: ptr=(null) (tty_write+0x16a/0x290)
>
Hmm, it is looking like the patter of calling kfree(NULL) is due to this:
struct some_struct *some_init_function(int some_option)
{
some_pointer = kzalloc(sizeof(struct some_struct));
if (some_option)
some_pointer->some_item = kmalloc();
return some_pointer;
}
some_destructor(struct some_struct *some_pointer)
{
kfree(some_pointer->some_item);
kfree(some_pointer);
}
That is, a structure may or may not allocate memory for various items in
the structure. They may stay as NULL. On the destructor side, instead of
testing if the items are NULL, we simply call kfree on them.
This explains why the object is not unlikely to be NULL in kfree.
Note, likely is not the same as majority of the time. If I were to say, 7%
of the next 100 days will snow. Can you say it is unlikely that it will
snow? Perhaps on a certain day it may be unlikely, but it is very likely
it will eventually snow. The same should go with the macros. If you say,
unlikely, it really should mean, I do not expect this to ever happen.
Not, the majority of the times it will not happen.
-- Steve
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH 2/5] mm: remove unlikly NULL from kfree
2009-03-25 13:51 ` Pekka Enberg
2009-03-25 14:47 ` Steven Rostedt
@ 2009-03-26 16:10 ` Al Viro
2009-03-26 16:15 ` Andrew Morton
1 sibling, 1 reply; 37+ messages in thread
From: Al Viro @ 2009-03-26 16:10 UTC (permalink / raw)
To: Pekka Enberg
Cc: Hua Zhong, Steven Rostedt, linux-kernel, Ingo Molnar,
Andrew Morton, Thomas Gleixner, Peter Zijlstra, Roland McGrath,
Nick Piggin, Steven Rostedt, Christoph Lameter, Matt Mackall
On Wed, Mar 25, 2009 at 03:51:49PM +0200, Pekka Enberg wrote:
> OK, so according to Steven, audit_syscall_exit() is one such call-site
> that shows up in the traces. I don't really understand what's going on
> there but if it is sane, maybe that warrants the removal of unlikely()
> from kfree(). Hmm?
*Shrug*
We certainly can add explicit check, but that'll keep asking for
patches "removing useless check". The same applies to other places,
really.
And making kfree(NULL) break will keep reintroducing bugs, since people
will expect the behaviour of free(3)...
I don't have any serious objections to adding a check there; indeed,
the normal case there (
if (context->state != AUDIT_RECORD_CONTEXT) {
kfree(context->filterkey);
context->filterkey = NULL;
}
) is context->state != AUDIT_RECORD_CONTEXT, context->filterkey == NULL,
so it might be worth turning into
if (unlikely(context->filterkey)) {
if (context->state != AUDIT_RECORD_CONTEXT) {
kfree(context->filterkey);
context->filterkey = NULL;
}
}
anyway. Just dubious about the goal in general...
^ permalink raw reply [flat|nested] 37+ messages in thread* Re: [PATCH 2/5] mm: remove unlikly NULL from kfree
2009-03-26 16:10 ` Al Viro
@ 2009-03-26 16:15 ` Andrew Morton
0 siblings, 0 replies; 37+ messages in thread
From: Andrew Morton @ 2009-03-26 16:15 UTC (permalink / raw)
To: Al Viro
Cc: Pekka Enberg, Hua Zhong, Steven Rostedt, linux-kernel,
Ingo Molnar, Thomas Gleixner, Peter Zijlstra, Roland McGrath,
Nick Piggin, Steven Rostedt, Christoph Lameter, Matt Mackall
On Thu, 26 Mar 2009 16:10:19 +0000 Al Viro <viro@ZenIV.linux.org.uk> wrote:
> We certainly can add explicit check, but that'll keep asking for
> patches "removing useless check". The same applies to other places,
> really.
Such patches are easily prevented by commenting the exceptional cases.
^ permalink raw reply [flat|nested] 37+ messages in thread
* [PATCH 3/5] sched: remove unlikely in pre_schedule_rt
2009-03-25 5:19 [PATCH 0/5] [PATCH][GIT PULL] remove unnecessary (un)likelys Steven Rostedt
2009-03-25 5:19 ` [PATCH 1/5] ptrace: remove incorrect unlikelys Steven Rostedt
2009-03-25 5:19 ` [PATCH 2/5] mm: remove unlikly NULL from kfree Steven Rostedt
@ 2009-03-25 5:19 ` Steven Rostedt
2009-03-25 5:24 ` Steven Rostedt
2009-03-25 5:19 ` [PATCH 4/5] sched: remove unlikelys from sched_move_task Steven Rostedt
` (4 subsequent siblings)
7 siblings, 1 reply; 37+ messages in thread
From: Steven Rostedt @ 2009-03-25 5:19 UTC (permalink / raw)
To: linux-kernel
Cc: Ingo Molnar, Andrew Morton, Thomas Gleixner, Peter Zijlstra,
Roland McGrath, Nick Piggin, Steven Rostedt
[-- Attachment #1: 0003-sched-remove-unlikely-in-pre_schedule_rt.patch --]
[-- Type: text/plain, Size: 1027 bytes --]
From: Steven Rostedt <rostedt@goodmis.org>
Impact: clean up
The annotated branch profiler shows that the unlikely used by
pre_schedule_rt is always incorrect. This makes sense because in
sched.c we have:
if (prev->sched_class->pre_schedule)
prev->sched_class->pre_schedule(rq, prev);
And we are saying that prev is unlikely to be an rt task. This looks
more like a likely candidate to me.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
---
kernel/sched_rt.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/kernel/sched_rt.c b/kernel/sched_rt.c
index bac1061..537af77 100644
--- a/kernel/sched_rt.c
+++ b/kernel/sched_rt.c
@@ -1260,7 +1260,7 @@ static int pull_rt_task(struct rq *this_rq)
static void pre_schedule_rt(struct rq *rq, struct task_struct *prev)
{
/* Try to pull RT tasks here if we lower this rq's prio */
- if (unlikely(rt_task(prev)) && rq->rt.highest_prio > prev->prio)
+ if (rt_task(prev) && rq->rt.highest_prio > prev->prio)
pull_rt_task(rq);
}
--
1.6.2
--
^ permalink raw reply related [flat|nested] 37+ messages in thread* Re: [PATCH 3/5] sched: remove unlikely in pre_schedule_rt
2009-03-25 5:19 ` [PATCH 3/5] sched: remove unlikely in pre_schedule_rt Steven Rostedt
@ 2009-03-25 5:24 ` Steven Rostedt
0 siblings, 0 replies; 37+ messages in thread
From: Steven Rostedt @ 2009-03-25 5:24 UTC (permalink / raw)
To: linux-kernel
Cc: Ingo Molnar, Andrew Morton, Thomas Gleixner, Peter Zijlstra,
Roland McGrath, Nick Piggin, Steven Rostedt
On Wed, 25 Mar 2009, Steven Rostedt wrote:
> From: Steven Rostedt <rostedt@goodmis.org>
>
> Impact: clean up
>
> The annotated branch profiler shows that the unlikely used by
> pre_schedule_rt is always incorrect. This makes sense because in
> sched.c we have:
>
> if (prev->sched_class->pre_schedule)
> prev->sched_class->pre_schedule(rq, prev);
>
> And we are saying that prev is unlikely to be an rt task. This looks
> more like a likely candidate to me.
I forgot to add the profiler output:
correct incorrect % Function File Line
------- --------- - -------- ---- ----
0 2248 100 pre_schedule_rt sched_rt.c 1263
-- Steve
>
> Signed-off-by: Steven Rostedt <srostedt@redhat.com>
> ---
> kernel/sched_rt.c | 2 +-
> 1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/kernel/sched_rt.c b/kernel/sched_rt.c
> index bac1061..537af77 100644
> --- a/kernel/sched_rt.c
> +++ b/kernel/sched_rt.c
> @@ -1260,7 +1260,7 @@ static int pull_rt_task(struct rq *this_rq)
> static void pre_schedule_rt(struct rq *rq, struct task_struct *prev)
> {
> /* Try to pull RT tasks here if we lower this rq's prio */
> - if (unlikely(rt_task(prev)) && rq->rt.highest_prio > prev->prio)
> + if (rt_task(prev) && rq->rt.highest_prio > prev->prio)
> pull_rt_task(rq);
> }
>
> --
> 1.6.2
>
> --
>
^ permalink raw reply [flat|nested] 37+ messages in thread
* [PATCH 4/5] sched: remove unlikelys from sched_move_task
2009-03-25 5:19 [PATCH 0/5] [PATCH][GIT PULL] remove unnecessary (un)likelys Steven Rostedt
` (2 preceding siblings ...)
2009-03-25 5:19 ` [PATCH 3/5] sched: remove unlikely in pre_schedule_rt Steven Rostedt
@ 2009-03-25 5:19 ` Steven Rostedt
2009-03-25 5:19 ` [PATCH 5/5] mm: remove unlikelys for unlock in rmap.c Steven Rostedt
` (3 subsequent siblings)
7 siblings, 0 replies; 37+ messages in thread
From: Steven Rostedt @ 2009-03-25 5:19 UTC (permalink / raw)
To: linux-kernel
Cc: Ingo Molnar, Andrew Morton, Thomas Gleixner, Peter Zijlstra,
Roland McGrath, Nick Piggin, Steven Rostedt
[-- Attachment #1: 0004-sched-remove-unlikelys-from-sched_move_task.patch --]
[-- Type: text/plain, Size: 1260 bytes --]
From: Steven Rostedt <rostedt@goodmis.org>
Impact: clean up
We seem to be moving running tasks more than non running tasks.
I guess tasks like to move themselves.
These are also candidates for likely:
correct incorrect % Function File Line
------- --------- - -------- ---- ----
0 10027 100 sched_move_task sched.c 8918
0 10027 100 sched_move_task sched.c 8908
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
---
kernel/sched.c | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/kernel/sched.c b/kernel/sched.c
index 8e2558c..5a60745 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -8905,7 +8905,7 @@ void sched_move_task(struct task_struct *tsk)
if (on_rq)
dequeue_task(rq, tsk, 0);
- if (unlikely(running))
+ if (running)
tsk->sched_class->put_prev_task(rq, tsk);
set_task_rq(tsk, task_cpu(tsk));
@@ -8915,7 +8915,7 @@ void sched_move_task(struct task_struct *tsk)
tsk->sched_class->moved_group(tsk);
#endif
- if (unlikely(running))
+ if (running)
tsk->sched_class->set_curr_task(rq);
if (on_rq)
enqueue_task(rq, tsk, 0);
--
1.6.2
--
^ permalink raw reply related [flat|nested] 37+ messages in thread* [PATCH 5/5] mm: remove unlikelys for unlock in rmap.c
2009-03-25 5:19 [PATCH 0/5] [PATCH][GIT PULL] remove unnecessary (un)likelys Steven Rostedt
` (3 preceding siblings ...)
2009-03-25 5:19 ` [PATCH 4/5] sched: remove unlikelys from sched_move_task Steven Rostedt
@ 2009-03-25 5:19 ` Steven Rostedt
2009-04-24 11:12 ` KOSAKI Motohiro
2009-03-25 7:19 ` [PATCH 0/5] [PATCH][GIT PULL] remove unnecessary (un)likelys Ingo Molnar
` (2 subsequent siblings)
7 siblings, 1 reply; 37+ messages in thread
From: Steven Rostedt @ 2009-03-25 5:19 UTC (permalink / raw)
To: linux-kernel
Cc: Ingo Molnar, Andrew Morton, Thomas Gleixner, Peter Zijlstra,
Roland McGrath, Nick Piggin, Steven Rostedt
[-- Attachment #1: 0005-mm-remove-unlikelys-for-unlock-in-rmap.c.patch --]
[-- Type: text/plain, Size: 2157 bytes --]
From: Steven Rostedt <rostedt@goodmis.org>
Impact: clean up
The annotated branch profiler shows that the rmap calls are likely
called with unlock set.
correct incorrect % Function File Line
------- --------- - -------- ---- ----
0 46100 100 try_to_unmap_anon rmap.c 1013
0 46100 100 try_to_unmap_anon rmap.c 1005
0 5763 100 try_to_unmap_file rmap.c 1074
0 5763 100 try_to_unmap_file rmap.c 1069
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
---
mm/rmap.c | 8 ++++----
1 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/mm/rmap.c b/mm/rmap.c
index 1652166..ad62fe0 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1002,7 +1002,7 @@ static int try_to_unmap_anon(struct page *page, int unlock, int migration)
unsigned int mlocked = 0;
int ret = SWAP_AGAIN;
- if (MLOCK_PAGES && unlikely(unlock))
+ if (MLOCK_PAGES && unlock)
ret = SWAP_SUCCESS; /* default for try_to_munlock() */
anon_vma = page_lock_anon_vma(page);
@@ -1010,7 +1010,7 @@ static int try_to_unmap_anon(struct page *page, int unlock, int migration)
return ret;
list_for_each_entry(vma, &anon_vma->head, anon_vma_node) {
- if (MLOCK_PAGES && unlikely(unlock)) {
+ if (MLOCK_PAGES && unlock) {
if (!((vma->vm_flags & VM_LOCKED) &&
page_mapped_in_vma(page, vma)))
continue; /* must visit all unlocked vmas */
@@ -1066,12 +1066,12 @@ static int try_to_unmap_file(struct page *page, int unlock, int migration)
unsigned int mapcount;
unsigned int mlocked = 0;
- if (MLOCK_PAGES && unlikely(unlock))
+ if (MLOCK_PAGES && unlock)
ret = SWAP_SUCCESS; /* default for try_to_munlock() */
spin_lock(&mapping->i_mmap_lock);
vma_prio_tree_foreach(vma, &iter, &mapping->i_mmap, pgoff, pgoff) {
- if (MLOCK_PAGES && unlikely(unlock)) {
+ if (MLOCK_PAGES && unlock) {
if (!((vma->vm_flags & VM_LOCKED) &&
page_mapped_in_vma(page, vma)))
continue; /* must visit all vmas */
--
1.6.2
--
^ permalink raw reply related [flat|nested] 37+ messages in thread* Re: [PATCH 5/5] mm: remove unlikelys for unlock in rmap.c
2009-03-25 5:19 ` [PATCH 5/5] mm: remove unlikelys for unlock in rmap.c Steven Rostedt
@ 2009-04-24 11:12 ` KOSAKI Motohiro
2009-04-24 12:15 ` Lee Schermerhorn
0 siblings, 1 reply; 37+ messages in thread
From: KOSAKI Motohiro @ 2009-04-24 11:12 UTC (permalink / raw)
To: Steven Rostedt
Cc: kosaki.motohiro, linux-kernel, Ingo Molnar, Andrew Morton,
Thomas Gleixner, Peter Zijlstra, Roland McGrath, Nick Piggin,
Steven Rostedt, Lee Schermerhorn
(cc to lee)
> From: Steven Rostedt <rostedt@goodmis.org>
>
> Impact: clean up
>
> The annotated branch profiler shows that the rmap calls are likely
> called with unlock set.
>
> correct incorrect % Function File Line
> ------- --------- - -------- ---- ----
> 0 46100 100 try_to_unmap_anon rmap.c 1013
> 0 46100 100 try_to_unmap_anon rmap.c 1005
> 0 5763 100 try_to_unmap_file rmap.c 1074
> 0 5763 100 try_to_unmap_file rmap.c 1069
>
> Signed-off-by: Steven Rostedt <srostedt@redhat.com>
unlock==1 mean munlock() is called.
unlock==0 mean memory shortage and reclaim happend.
So, we did guess end-user don't use munlock() so frequently.
but reclaim is frequently happend.
Oh well, but you have rich machine. hmm...
ok, I can agree user can use munlock() frequently.
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> ---
> mm/rmap.c | 8 ++++----
> 1 files changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/mm/rmap.c b/mm/rmap.c
> index 1652166..ad62fe0 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -1002,7 +1002,7 @@ static int try_to_unmap_anon(struct page *page, int unlock, int migration)
> unsigned int mlocked = 0;
> int ret = SWAP_AGAIN;
>
> - if (MLOCK_PAGES && unlikely(unlock))
> + if (MLOCK_PAGES && unlock)
> ret = SWAP_SUCCESS; /* default for try_to_munlock() */
>
> anon_vma = page_lock_anon_vma(page);
> @@ -1010,7 +1010,7 @@ static int try_to_unmap_anon(struct page *page, int unlock, int migration)
> return ret;
>
> list_for_each_entry(vma, &anon_vma->head, anon_vma_node) {
> - if (MLOCK_PAGES && unlikely(unlock)) {
> + if (MLOCK_PAGES && unlock) {
> if (!((vma->vm_flags & VM_LOCKED) &&
> page_mapped_in_vma(page, vma)))
> continue; /* must visit all unlocked vmas */
> @@ -1066,12 +1066,12 @@ static int try_to_unmap_file(struct page *page, int unlock, int migration)
> unsigned int mapcount;
> unsigned int mlocked = 0;
>
> - if (MLOCK_PAGES && unlikely(unlock))
> + if (MLOCK_PAGES && unlock)
> ret = SWAP_SUCCESS; /* default for try_to_munlock() */
>
> spin_lock(&mapping->i_mmap_lock);
> vma_prio_tree_foreach(vma, &iter, &mapping->i_mmap, pgoff, pgoff) {
> - if (MLOCK_PAGES && unlikely(unlock)) {
> + if (MLOCK_PAGES && unlock) {
> if (!((vma->vm_flags & VM_LOCKED) &&
> page_mapped_in_vma(page, vma)))
> continue; /* must visit all vmas */
> --
> 1.6.2
>
> --
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
^ permalink raw reply [flat|nested] 37+ messages in thread* Re: [PATCH 5/5] mm: remove unlikelys for unlock in rmap.c
2009-04-24 11:12 ` KOSAKI Motohiro
@ 2009-04-24 12:15 ` Lee Schermerhorn
0 siblings, 0 replies; 37+ messages in thread
From: Lee Schermerhorn @ 2009-04-24 12:15 UTC (permalink / raw)
To: KOSAKI Motohiro
Cc: Steven Rostedt, linux-kernel, Ingo Molnar, Andrew Morton,
Thomas Gleixner, Peter Zijlstra, Roland McGrath, Nick Piggin,
Steven Rostedt, Hugh Dickins
On Fri, 2009-04-24 at 20:12 +0900, KOSAKI Motohiro wrote:
> (cc to lee)
>
> > From: Steven Rostedt <rostedt@goodmis.org>
> >
> > Impact: clean up
> >
> > The annotated branch profiler shows that the rmap calls are likely
> > called with unlock set.
> >
> > correct incorrect % Function File Line
> > ------- --------- - -------- ---- ----
> > 0 46100 100 try_to_unmap_anon rmap.c 1013
> > 0 46100 100 try_to_unmap_anon rmap.c 1005
> > 0 5763 100 try_to_unmap_file rmap.c 1074
> > 0 5763 100 try_to_unmap_file rmap.c 1069
> >
> > Signed-off-by: Steven Rostedt <srostedt@redhat.com>
>
> unlock==1 mean munlock() is called.
> unlock==0 mean memory shortage and reclaim happend.
>
> So, we did guess end-user don't use munlock() so frequently.
> but reclaim is frequently happend.
>
> Oh well, but you have rich machine. hmm...
> ok, I can agree user can use munlock() frequently.
>
>
> Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
>
In current mainline, try_to_munlock() is called only from
munlock_vma_page() and then only when page is actually mlocked. So, you
should only see the profile above during a test that is munlock()ing a
lot of pages. However, after try_to_munlock() was introduced, we'd
added a call to it in vmscan.c to test for mlocked anon pages to avoid
allocating swap to mlocked pages. At the time, we couldn't reliably
free swap there. Hugh fixed this and removed the call from vmscan. If
Steve was testing a kernel before the call was removed, he would have
seen the calls when reclaiming any anon pages.
Steve: what test were you using and on what version?
Regards,
Lee
>
>
>
> > ---
> > mm/rmap.c | 8 ++++----
> > 1 files changed, 4 insertions(+), 4 deletions(-)
> >
> > diff --git a/mm/rmap.c b/mm/rmap.c
> > index 1652166..ad62fe0 100644
> > --- a/mm/rmap.c
> > +++ b/mm/rmap.c
> > @@ -1002,7 +1002,7 @@ static int try_to_unmap_anon(struct page *page, int unlock, int migration)
> > unsigned int mlocked = 0;
> > int ret = SWAP_AGAIN;
> >
> > - if (MLOCK_PAGES && unlikely(unlock))
> > + if (MLOCK_PAGES && unlock)
> > ret = SWAP_SUCCESS; /* default for try_to_munlock() */
> >
> > anon_vma = page_lock_anon_vma(page);
> > @@ -1010,7 +1010,7 @@ static int try_to_unmap_anon(struct page *page, int unlock, int migration)
> > return ret;
> >
> > list_for_each_entry(vma, &anon_vma->head, anon_vma_node) {
> > - if (MLOCK_PAGES && unlikely(unlock)) {
> > + if (MLOCK_PAGES && unlock) {
> > if (!((vma->vm_flags & VM_LOCKED) &&
> > page_mapped_in_vma(page, vma)))
> > continue; /* must visit all unlocked vmas */
> > @@ -1066,12 +1066,12 @@ static int try_to_unmap_file(struct page *page, int unlock, int migration)
> > unsigned int mapcount;
> > unsigned int mlocked = 0;
> >
> > - if (MLOCK_PAGES && unlikely(unlock))
> > + if (MLOCK_PAGES && unlock)
> > ret = SWAP_SUCCESS; /* default for try_to_munlock() */
> >
> > spin_lock(&mapping->i_mmap_lock);
> > vma_prio_tree_foreach(vma, &iter, &mapping->i_mmap, pgoff, pgoff) {
> > - if (MLOCK_PAGES && unlikely(unlock)) {
> > + if (MLOCK_PAGES && unlock) {
> > if (!((vma->vm_flags & VM_LOCKED) &&
> > page_mapped_in_vma(page, vma)))
> > continue; /* must visit all vmas */
> > --
> > 1.6.2
> >
> > --
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
> > Please read the FAQ at http://www.tux.org/lkml/
>
>
>
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH 0/5] [PATCH][GIT PULL] remove unnecessary (un)likelys
2009-03-25 5:19 [PATCH 0/5] [PATCH][GIT PULL] remove unnecessary (un)likelys Steven Rostedt
` (4 preceding siblings ...)
2009-03-25 5:19 ` [PATCH 5/5] mm: remove unlikelys for unlock in rmap.c Steven Rostedt
@ 2009-03-25 7:19 ` Ingo Molnar
2009-03-25 7:25 ` Ingo Molnar
2009-04-27 16:30 ` Daniel Walker
7 siblings, 0 replies; 37+ messages in thread
From: Ingo Molnar @ 2009-03-25 7:19 UTC (permalink / raw)
To: Steven Rostedt
Cc: linux-kernel, Andrew Morton, Thomas Gleixner, Peter Zijlstra,
Roland McGrath, Nick Piggin
* Steven Rostedt <rostedt@goodmis.org> wrote:
> Ingo,
>
> I guess this can go through you. This is a start of clean ups to
> get rid of (un)likelys that are at least 50% incorrect. This
> series has some that are 100% incorrect.
Nice!
I'll wait for the acks from affected (and Cc:-ed) maintainers though
- and they might decide to carry the patches in their trees straight
away (or disagree with the change) so a rebase is in the cards as
well.
Ingo
^ permalink raw reply [flat|nested] 37+ messages in thread* Re: [PATCH 0/5] [PATCH][GIT PULL] remove unnecessary (un)likelys
2009-03-25 5:19 [PATCH 0/5] [PATCH][GIT PULL] remove unnecessary (un)likelys Steven Rostedt
` (5 preceding siblings ...)
2009-03-25 7:19 ` [PATCH 0/5] [PATCH][GIT PULL] remove unnecessary (un)likelys Ingo Molnar
@ 2009-03-25 7:25 ` Ingo Molnar
2009-03-25 13:43 ` Steven Rostedt
2009-04-27 16:30 ` Daniel Walker
7 siblings, 1 reply; 37+ messages in thread
From: Ingo Molnar @ 2009-03-25 7:25 UTC (permalink / raw)
To: Steven Rostedt, Peter Zijlstra, Pekka Enberg
Cc: linux-kernel, Andrew Morton, Thomas Gleixner, Peter Zijlstra,
Roland McGrath, Nick Piggin
* Steven Rostedt <rostedt@goodmis.org> wrote:
> ----
> arch/x86/kernel/ptrace.c | 4 ++--
> kernel/sched.c | 4 ++--
> kernel/sched_rt.c | 2 +-
> mm/rmap.c | 8 ++++----
> mm/slab.c | 2 +-
> mm/slob.c | 2 +-
> mm/slub.c | 2 +-
> 7 files changed, 12 insertions(+), 12 deletions(-)
I commented on the ptrace one - i think we should take that one out
or at least wait for more info.
The MM ones look correct. Maybe your kfree observation will be
debated - but i think it's probably right that we should not special
annotate kfree(NULL) anymore in any direction. But it's ultimately
up to the MM folks. (I've Cc:-ed Pekka too)
The rmap and the scheduler ones definitely look correct. When you
rebase, please include the profiler output to 3/5 too.
Thanks,
Ingo
^ permalink raw reply [flat|nested] 37+ messages in thread* Re: [PATCH 0/5] [PATCH][GIT PULL] remove unnecessary (un)likelys
2009-03-25 7:25 ` Ingo Molnar
@ 2009-03-25 13:43 ` Steven Rostedt
0 siblings, 0 replies; 37+ messages in thread
From: Steven Rostedt @ 2009-03-25 13:43 UTC (permalink / raw)
To: Ingo Molnar
Cc: Peter Zijlstra, Pekka Enberg, linux-kernel, Andrew Morton,
Thomas Gleixner, Peter Zijlstra, Roland McGrath, Nick Piggin
On Wed, 25 Mar 2009, Ingo Molnar wrote:
>
> * Steven Rostedt <rostedt@goodmis.org> wrote:
>
> > ----
> > arch/x86/kernel/ptrace.c | 4 ++--
> > kernel/sched.c | 4 ++--
> > kernel/sched_rt.c | 2 +-
> > mm/rmap.c | 8 ++++----
> > mm/slab.c | 2 +-
> > mm/slob.c | 2 +-
> > mm/slub.c | 2 +-
> > 7 files changed, 12 insertions(+), 12 deletions(-)
>
> I commented on the ptrace one - i think we should take that one out
> or at least wait for more info.
>
> The MM ones look correct. Maybe your kfree observation will be
> debated - but i think it's probably right that we should not special
> annotate kfree(NULL) anymore in any direction. But it's ultimately
> up to the MM folks. (I've Cc:-ed Pekka too)
>
> The rmap and the scheduler ones definitely look correct. When you
> rebase, please include the profiler output to 3/5 too.
I'll rebase it to only include the non controversial changes.
Thanks,
-- Steve
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH 0/5] [PATCH][GIT PULL] remove unnecessary (un)likelys
2009-03-25 5:19 [PATCH 0/5] [PATCH][GIT PULL] remove unnecessary (un)likelys Steven Rostedt
` (6 preceding siblings ...)
2009-03-25 7:25 ` Ingo Molnar
@ 2009-04-27 16:30 ` Daniel Walker
7 siblings, 0 replies; 37+ messages in thread
From: Daniel Walker @ 2009-04-27 16:30 UTC (permalink / raw)
To: Steven Rostedt
Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
Peter Zijlstra, Roland McGrath, Nick Piggin
On Wed, 2009-03-25 at 01:19 -0400, Steven Rostedt wrote:
> Ingo,
>
> I guess this can go through you. This is a start of clean ups to
> get rid of (un)likelys that are at least 50% incorrect. This series
> has some that are 100% incorrect.
>
> The branch profiler used is from 2.6.29 which does not have the
> fixed header that is in tip. This branch is also based off of
> 2.6.29 and not tip.
>
> Perhaps, since some of these cases are 100% wrong they can simply
> be reversed. I'm choosing to just remove the annotation, and then
> later I'll be using the full branch profiler to look for candidates
> for adding (un)likelys. That way each added annotation can be
> scrutinized individually.
What kind of methodology are you using to determine which to remove? It
looks like some you review the code, and other you just remove based on
it being %50 wrong or more.. I found workloads are especially important.
Where finding one that is %50 or %100 wrong does necessarily mean
anything..
Daniel
^ permalink raw reply [flat|nested] 37+ messages in thread