[BUG] perf and kmemcheck : fatal combination

* [BUG] perf and kmemcheck : fatal combination
@ 2011-04-25 16:08 Eric Dumazet
  2011-04-26  7:38 ` Peter Zijlstra
  0 siblings, 1 reply; 14+ messages in thread
From: Eric Dumazet @ 2011-04-25 16:08 UTC (permalink / raw)
  To: Ingo Molnar, Arnaldo Carvalho de Melo, Peter Zijlstra,
	Paul Mackerras, Pekka Enberg, Vegard Nossum
  Cc: linux-kernel

Hi guys

Just got a panic on a kmemcheck kernel, latest linux-2.6 tree.

I forgot I had kmemcheck enabled, and started "perf top" just because my
machine was damn slow... Oh well...

Crash in do_nmi -> nmi_enter() -> BUG_ON(in_nmi());

Thanks
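
For readers unfamiliar with the check that fired, here is a minimal
user-space sketch (not kernel code) of how an nmi_enter()-style guard
detects a nested NMI. The nmi_model_* names and the plain depth counter
are assumptions made for illustration; the kernel keeps this state in
the preempt count.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one CPU; not the kernel's real data structures. */
struct nmi_model_cpu {
	unsigned int nmi_depth;	/* number of NMI handlers currently active */
	bool bug_hit;		/* set where the kernel would hit BUG_ON() */
};

static bool nmi_model_in_nmi(const struct nmi_model_cpu *cpu)
{
	return cpu->nmi_depth != 0;
}

static void nmi_model_enter(struct nmi_model_cpu *cpu)
{
	if (nmi_model_in_nmi(cpu))
		cpu->bug_hit = true;	/* kernel: BUG_ON(in_nmi()) */
	cpu->nmi_depth++;
}

static void nmi_model_exit(struct nmi_model_cpu *cpu)
{
	assert(cpu->nmi_depth > 0);	/* every exit must pair with an enter */
	cpu->nmi_depth--;
}
```

A second nmi_model_enter() before the matching nmi_model_exit() sets
bug_hit, which is the situation the reported crash corresponds to.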





* Re: [BUG] perf and kmemcheck : fatal combination
  2011-04-25 16:08 [BUG] perf and kmemcheck : fatal combination Eric Dumazet
@ 2011-04-26  7:38 ` Peter Zijlstra
  2011-04-26  7:43   ` Pekka Enberg
  2011-04-26  8:04   ` Ingo Molnar
  0 siblings, 2 replies; 14+ messages in thread
From: Peter Zijlstra @ 2011-04-26  7:38 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Paul Mackerras,
	Pekka Enberg, Vegard Nossum, linux-kernel, Mathieu Desnoyers

On Mon, 2011-04-25 at 18:08 +0200, Eric Dumazet wrote:
> Hi guys
> 
> Just got a panic on a kmemcheck kernel, latest linux-2.6 tree.
> 
> I forgot I had kmemcheck enabled, and started "perf top" just because my
> machine was damn slow... Oh well...
> 
> Crash in do_nmi -> nmi_enter() -> BUG_ON(in_nmi());

Hmm, I bet that's because kmemcheck triggers faults from NMI context; it
messes about with the page protection bits a lot to track things.

Can't really think of anything except not making perf available on
kmemcheck kernels.

---
 init/Kconfig |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/init/Kconfig b/init/Kconfig
index 32745bf..94735b4 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1125,6 +1125,7 @@ config PERF_EVENTS
 	bool "Kernel performance events and counters"
 	default y if (PROFILING || PERF_COUNTERS)
 	depends on HAVE_PERF_EVENTS
+	depends on !KMEMCHECK
 	select ANON_INODES
 	select IRQ_WORK
 	help



* Re: [BUG] perf and kmemcheck : fatal combination
  2011-04-26  7:38 ` Peter Zijlstra
@ 2011-04-26  7:43   ` Pekka Enberg
  2011-04-26  8:04   ` Ingo Molnar
  1 sibling, 0 replies; 14+ messages in thread
From: Pekka Enberg @ 2011-04-26  7:43 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Eric Dumazet, Ingo Molnar, Arnaldo Carvalho de Melo,
	Paul Mackerras, Vegard Nossum, linux-kernel, Mathieu Desnoyers

On Tue, Apr 26, 2011 at 10:38 AM, Peter Zijlstra <a.p.zijlstra@chello.nl> wrote:
> On Mon, 2011-04-25 at 18:08 +0200, Eric Dumazet wrote:
>> Hi guys
>>
>> Just got a panic on a kmemcheck kernel, latest linux-2.6 tree.
>>
>> I forgot I had kmemcheck enabled, and started "perf top" just because my
>> machine was damn slow... Oh well...
>>
>> Crash in do_nmi -> nmi_enter() -> BUG_ON(in_nmi());
>
> Hmm, I bet that's because kmemcheck triggers faults from NMI context; it
> messes about with the page protection bits a lot to track things.
>
> Can't really think of anything except not making perf available on
> kmemcheck kernels.
>
> ---
>  init/Kconfig |    1 +
>  1 files changed, 1 insertions(+), 0 deletions(-)
>
> diff --git a/init/Kconfig b/init/Kconfig
> index 32745bf..94735b4 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -1125,6 +1125,7 @@ config PERF_EVENTS
>        bool "Kernel performance events and counters"
>        default y if (PROFILING || PERF_COUNTERS)
>        depends on HAVE_PERF_EVENTS
> +       depends on !KMEMCHECK
>        select ANON_INODES
>        select IRQ_WORK
>        help

Acked-by: Pekka Enberg <penberg@kernel.org>


* Re: [BUG] perf and kmemcheck : fatal combination
  2011-04-26  7:38 ` Peter Zijlstra
  2011-04-26  7:43   ` Pekka Enberg
@ 2011-04-26  8:04   ` Ingo Molnar
  2011-04-26  8:57     ` Eric Dumazet
  1 sibling, 1 reply; 14+ messages in thread
From: Ingo Molnar @ 2011-04-26  8:04 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Eric Dumazet, Arnaldo Carvalho de Melo, Paul Mackerras,
	Pekka Enberg, Vegard Nossum, linux-kernel, Mathieu Desnoyers


* Peter Zijlstra <a.p.zijlstra@chello.nl> wrote:

> On Mon, 2011-04-25 at 18:08 +0200, Eric Dumazet wrote:
> > Hi guys
> > 
> > Just got a panic on a kmemcheck kernel, latest linux-2.6 tree.
> > 
> > I forgot I had kmemcheck enabled, and started "perf top" just because my
> > machine was damn slow... Oh well...
> > 
> > Crash in do_nmi -> nmi_enter() -> BUG_ON(in_nmi());
> 
> Hmm, I bet that's because kmemcheck triggers faults from NMI context; it
> messes about with the page protection bits a lot to track things.
> 
> Can't really think of anything except not making perf available on
> kmemcheck kernels.
> 
> ---
>  init/Kconfig |    1 +
>  1 files changed, 1 insertions(+), 0 deletions(-)
> 
> diff --git a/init/Kconfig b/init/Kconfig
> index 32745bf..94735b4 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -1125,6 +1125,7 @@ config PERF_EVENTS
>  	bool "Kernel performance events and counters"
>  	default y if (PROFILING || PERF_COUNTERS)
>  	depends on HAVE_PERF_EVENTS
> +	depends on !KMEMCHECK
>  	select ANON_INODES
>  	select IRQ_WORK
>  	help

Eric, does it manage to limp along if you remove the BUG_ON()?

That risks NMI recursion but maybe it allows you to see why things are slow, 
before it crashes ;-)

Thanks,

	Ingo


* Re: [BUG] perf and kmemcheck : fatal combination
  2011-04-26  8:04   ` Ingo Molnar
@ 2011-04-26  8:57     ` Eric Dumazet
  2011-04-26  9:53       ` Eric Dumazet
  0 siblings, 1 reply; 14+ messages in thread
From: Eric Dumazet @ 2011-04-26  8:57 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Peter Zijlstra, Arnaldo Carvalho de Melo, Paul Mackerras,
	Pekka Enberg, Vegard Nossum, linux-kernel, Mathieu Desnoyers


On Tuesday, 26 April 2011 at 10:04 +0200, Ingo Molnar wrote:

> Eric, does it manage to limp along if you remove the BUG_ON()?
> 
> That risks NMI recursion but maybe it allows you to see why things are slow, 
> before it crashes ;-)
> 

If I remove the BUG_ON from nmi_enter, it seems to crash very fast.



[-- Attachment #2: capture2.png --]
[-- Type: image/png, Size: 132197 bytes --]


* Re: [BUG] perf and kmemcheck : fatal combination
  2011-04-26  8:57     ` Eric Dumazet
@ 2011-04-26  9:53       ` Eric Dumazet
  2011-04-26 10:08         ` Pekka Enberg
  2011-04-26 13:53         ` Mathieu Desnoyers
  0 siblings, 2 replies; 14+ messages in thread
From: Eric Dumazet @ 2011-04-26  9:53 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Peter Zijlstra, Arnaldo Carvalho de Melo, Paul Mackerras,
	Pekka Enberg, Vegard Nossum, linux-kernel, Mathieu Desnoyers

On Tuesday, 26 April 2011 at 10:57 +0200, Eric Dumazet wrote:
> On Tuesday, 26 April 2011 at 10:04 +0200, Ingo Molnar wrote:
> 
> > Eric, does it manage to limp along if you remove the BUG_ON()?
> > 
> > That risks NMI recursion but maybe it allows you to see why things are slow, 
> > before it crashes ;-)
> > 
> 
> If I remove the BUG_ON from nmi_enter, it seems to crash very fast 
> 
> 

Before you ask, here are some more complete netconsole traces:



[  306.657192] ------------[ cut here ]------------
[  306.657195] ------------[ cut here ]------------
[  306.657202] WARNING: at arch/x86/mm/kmemcheck/kmemcheck.c:634 kmemcheck_fault+0xa9/0xc0()
[  306.657204] Hardware name: ProLiant BL460c G6
[  306.657205] Modules linked in: nfsd lockd auth_rpcgss sunrpc tg3 libphy sg [last unloaded: x_tables]
[  306.657211] Pid: 3955, comm: perf Not tainted 2.6.39-rc4-00369-g23cf772-dirty #559
[  306.657212] Call Trace:
[  306.657214]  <NMI>  [<ffffffff8102ac39>] ? kmemcheck_fault+0xa9/0xc0
[  306.657221]  [<ffffffff810427db>] warn_slowpath_common+0x8b/0xc0
[  306.657223]  [<ffffffff81042825>] warn_slowpath_null+0x15/0x20
[  306.657226]  [<ffffffff8102ac39>] kmemcheck_fault+0xa9/0xc0
[  306.657229]  [<ffffffff8147ca4b>] do_page_fault+0x1fb/0x560
[  306.657234]  [<ffffffff811d0289>] ? put_dec+0x59/0x60
[  306.657237]  [<ffffffff811d0591>] ? number+0x301/0x330
[  306.657239]  [<ffffffff8147a48f>] page_fault+0x1f/0x30
[  306.657245]  [<ffffffff8124dce5>] ? vt_console_print+0x85/0x360
[  306.657247]  [<ffffffff8124dcda>] ? vt_console_print+0x7a/0x360
[  306.657250]  [<ffffffff81043159>] __call_console_drivers+0x89/0xa0
[  306.657252]  [<ffffffff810431bb>] _call_console_drivers+0x4b/0x80
[  306.657254]  [<ffffffff810432d7>] console_unlock+0xe7/0x1e0
[  306.657257]  [<ffffffff8104388e>] vprintk+0x1ee/0x4a0
[  306.657260]  [<ffffffff8102ac39>] ? kmemcheck_fault+0xa9/0xc0
[  306.657262]  [<ffffffff81043ba7>] printk+0x67/0x70
[  306.657264]  [<ffffffff8102ac39>] ? kmemcheck_fault+0xa9/0xc0
[  306.657267]  [<ffffffff81042789>] warn_slowpath_common+0x39/0xc0
[  306.657269]  [<ffffffff81042825>] warn_slowpath_null+0x15/0x20
[  306.657271]  [<ffffffff8102ac39>] kmemcheck_fault+0xa9/0xc0
[  306.657273]  [<ffffffff8147ca4b>] do_page_fault+0x1fb/0x560
[  306.657276]  [<ffffffff8101167b>] ? intel_pmu_drain_bts_buffer+0x2b/0x170
[  306.657279]  [<ffffffff8147a48f>] page_fault+0x1f/0x30
[  306.657282]  [<ffffffff8100ef42>] ? x86_perf_event_update+0x12/0x70
[  306.657284]  [<ffffffff810104b1>] ? intel_pmu_save_and_restart+0x11/0x20
[  306.657287]  [<ffffffff81012e84>] intel_pmu_handle_irq+0x1d4/0x420
[  306.657290]  [<ffffffff8147b570>] perf_event_nmi_handler+0x50/0xc0
[  306.657292]  [<ffffffff8147cfa3>] notifier_call_chain+0x53/0x80
[  306.657294]  [<ffffffff8147d018>] __atomic_notifier_call_chain+0x48/0x70
[  306.657296]  [<ffffffff8147d051>] atomic_notifier_call_chain+0x11/0x20
[  306.657298]  [<ffffffff8147d08e>] notify_die+0x2e/0x30
[  306.657300]  [<ffffffff8147a8af>] do_nmi+0x4f/0x200
[  306.657302]  [<ffffffff8147a6ea>] nmi+0x1a/0x20
[  306.657304]  [<ffffffff8100fd4d>] ? intel_pmu_enable_all+0x9d/0x110
[  306.657305]  <<EOE>>  [<ffffffff810104da>] intel_pmu_nhm_enable_all+0x1a/0x120
[  306.657309]  [<ffffffff810131d4>] x86_pmu_enable+0x104/0x260
[  306.657313]  [<ffffffff810a84e9>] perf_pmu_enable+0x39/0x50
[  306.657314]  [<ffffffff8101236c>] x86_pmu_add+0xac/0x120
[  306.657317]  [<ffffffff810aae68>] ? perf_install_in_context+0x18/0xa0
[  306.657319]  [<ffffffff8102b001>] ? kmemcheck_pte_lookup+0x11/0x40
[  306.657322]  [<ffffffff8147a48f>] ? page_fault+0x1f/0x30
[  306.657325]  [<ffffffff810acf15>] event_sched_in+0x65/0x110
[  306.657327]  [<ffffffff810afb95>] __perf_install_in_context+0x125/0x140
[  306.657330]  [<ffffffff810ab100>] ? perf_remove_from_context+0xa0/0xa0
[  306.657332]  [<ffffffff810ab159>] remote_function+0x59/0x70
[  306.657335]  [<ffffffff81075d6e>] smp_call_function_single+0x8e/0x170
[  306.657338]  [<ffffffff810a86a4>] cpu_function_call+0x34/0x40
[  306.657340]  [<ffffffff810afa70>] ? perf_tp_event+0xf0/0xf0
[  306.657342]  [<ffffffff810aaedf>] perf_install_in_context+0x8f/0xa0
[  306.657345]  [<ffffffff810b0792>] sys_perf_event_open+0x592/0x7a0
[  306.657348]  [<ffffffff814819a9>] sysenter_dispatch+0x7/0x27
[  306.657350] ---[ end trace 7333dc2d81c31e96 ]---
[  306.699715] WARNING: at arch/x86/mm/kmemcheck/kmemcheck.c:634 kmemcheck_fault+0xa9/0xc0()
[  306.700659] Hardware name: ProLiant BL460c G6
[  306.701487] Modules linked in: nfsd lockd auth_rpcgss sunrpc tg3 libphy sg [last unloaded: x_tables]
[  306.704964] Pid: 3955, comm: perf Tainted: G        W   2.6.39-rc4-00369-g23cf772-dirty #559
[  306.705922] Call Trace:
[  306.706405]  <NMI>  [<ffffffff8102ac39>] ? kmemcheck_fault+0xa9/0xc0
[  306.707439]  [<ffffffff810427db>] warn_slowpath_common+0x8b/0xc0
[  306.708173]  [<ffffffff81042825>] warn_slowpath_null+0x15/0x20
[  306.708893]  [<ffffffff8102ac39>] kmemcheck_fault+0xa9/0xc0
[  306.709597]  [<ffffffff8147ca4b>] do_page_fault+0x1fb/0x560
[  306.710301]  [<ffffffff8101167b>] ? intel_pmu_drain_bts_buffer+0x2b/0x170
[  306.711091]  [<ffffffff8147a48f>] page_fault+0x1f/0x30
[  306.711764]  [<ffffffff8100ef42>] ? x86_perf_event_update+0x12/0x70
[  306.712727]  [<ffffffff810104b1>] ? intel_pmu_save_and_restart+0x11/0x20
[  306.713509]  [<ffffffff81012e84>] intel_pmu_handle_irq+0x1d4/0x420
[  306.714254]  [<ffffffff8147b570>] perf_event_nmi_handler+0x50/0xc0
[  306.714999]  [<ffffffff8147cfa3>] notifier_call_chain+0x53/0x80
[  306.715728]  [<ffffffff8147d018>] __atomic_notifier_call_chain+0x48/0x70
[  306.716510]  [<ffffffff8147d051>] atomic_notifier_call_chain+0x11/0x20
[  306.717279]  [<ffffffff8147d08e>] notify_die+0x2e/0x30
[  306.717951]  [<ffffffff8147a8af>] do_nmi+0x4f/0x200
[  306.718605]  [<ffffffff8147a6ea>] nmi+0x1a/0x20
[  306.719237]  [<ffffffff8100fd4d>] ? intel_pmu_enable_all+0x9d/0x110
[  306.719988]  <<EOE>>  [<ffffffff810104da>] intel_pmu_nhm_enable_all+0x1a/0x120
[  306.721347]  [<ffffffff810131d4>] x86_pmu_enable+0x104/0x260
[  306.722056]  [<ffffffff810a84e9>] perf_pmu_enable+0x39/0x50
[  306.722760]  [<ffffffff8101236c>] x86_pmu_add+0xac/0x120
[  306.723445]  [<ffffffff810aae68>] ? perf_install_in_context+0x18/0xa0
[  306.724210]  [<ffffffff8102b001>] ? kmemcheck_pte_lookup+0x11/0x40
[  306.724955]  [<ffffffff8147a48f>] ? page_fault+0x1f/0x30
[  306.725640]  [<ffffffff810acf15>] event_sched_in+0x65/0x110
[  306.726345]  [<ffffffff810afb95>] __perf_install_in_context+0x125/0x140
[  306.727124]  [<ffffffff810ab100>] ? perf_remove_from_context+0xa0/0xa0
[  306.727893]  [<ffffffff810ab159>] remote_function+0x59/0x70
[  306.728597]  [<ffffffff81075d6e>] smp_call_function_single+0x8e/0x170
[  306.729363]  [<ffffffff810a86a4>] cpu_function_call+0x34/0x40
[  306.730079]  [<ffffffff810afa70>] ? perf_tp_event+0xf0/0xf0
[  306.730783]  [<ffffffff810aaedf>] perf_install_in_context+0x8f/0xa0
[  306.731535]  [<ffffffff810b0792>] sys_perf_event_open+0x592/0x7a0
[  306.732277]  [<ffffffff814819a9>] sysenter_dispatch+0x7/0x27
[  306.735272] ---[ end trace 7333dc2d81c31e97 ]---
[  306.736401] BUG: unable to handle kernel 




* Re: [BUG] perf and kmemcheck : fatal combination
  2011-04-26  9:53       ` Eric Dumazet
@ 2011-04-26 10:08         ` Pekka Enberg
  2011-04-26 10:27           ` Eric Dumazet
  2011-04-26 13:53         ` Mathieu Desnoyers
  1 sibling, 1 reply; 14+ messages in thread
From: Pekka Enberg @ 2011-04-26 10:08 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Ingo Molnar, Peter Zijlstra, Arnaldo Carvalho de Melo,
	Paul Mackerras, Vegard Nossum, linux-kernel, Mathieu Desnoyers

On Tue, Apr 26, 2011 at 12:53 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Tuesday, 26 April 2011 at 10:57 +0200, Eric Dumazet wrote:
>> On Tuesday, 26 April 2011 at 10:04 +0200, Ingo Molnar wrote:
>>
>> > Eric, does it manage to limp along if you remove the BUG_ON()?
>> >
>> > That risks NMI recursion but maybe it allows you to see why things are slow,
>> > before it crashes ;-)
>> >
>>
>> If I remove the BUG_ON from nmi_enter, it seems to crash very fast
>
> Before you ask, some more complete netconsole traces :
>
> [  306.657192] ------------[ cut here ]------------
> [  306.657195] ------------[ cut here ]------------
> [  306.657202] WARNING: at arch/x86/mm/kmemcheck/kmemcheck.c:634 kmemcheck_fault+0xa9/0xc0()
> [  306.657204] Hardware name: ProLiant BL460c G6
> [  306.657205] Modules linked in: nfsd lockd auth_rpcgss sunrpc tg3 libphy sg [last unloaded: x_tables]
> [  306.657211] Pid: 3955, comm: perf Not tainted 2.6.39-rc4-00369-g23cf772-dirty #559
> [  306.657212] Call Trace:
> [  306.657214]  <NMI>  [<ffffffff8102ac39>] ? kmemcheck_fault+0xa9/0xc0
> [  306.657221]  [<ffffffff810427db>] warn_slowpath_common+0x8b/0xc0
> [  306.657223]  [<ffffffff81042825>] warn_slowpath_null+0x15/0x20
> [  306.657226]  [<ffffffff8102ac39>] kmemcheck_fault+0xa9/0xc0
> [  306.657229]  [<ffffffff8147ca4b>] do_page_fault+0x1fb/0x560
> [  306.657234]  [<ffffffff811d0289>] ? put_dec+0x59/0x60
> [  306.657237]  [<ffffffff811d0591>] ? number+0x301/0x330
> [  306.657239]  [<ffffffff8147a48f>] page_fault+0x1f/0x30
> [  306.657245]  [<ffffffff8124dce5>] ? vt_console_print+0x85/0x360
> [  306.657247]  [<ffffffff8124dcda>] ? vt_console_print+0x7a/0x360
> [  306.657250]  [<ffffffff81043159>] __call_console_drivers+0x89/0xa0
> [  306.657252]  [<ffffffff810431bb>] _call_console_drivers+0x4b/0x80
> [  306.657254]  [<ffffffff810432d7>] console_unlock+0xe7/0x1e0
> [  306.657257]  [<ffffffff8104388e>] vprintk+0x1ee/0x4a0
> [  306.657260]  [<ffffffff8102ac39>] ? kmemcheck_fault+0xa9/0xc0
> [  306.657262]  [<ffffffff81043ba7>] printk+0x67/0x70
> [  306.657264]  [<ffffffff8102ac39>] ? kmemcheck_fault+0xa9/0xc0
> [  306.657267]  [<ffffffff81042789>] warn_slowpath_common+0x39/0xc0
> [  306.657269]  [<ffffffff81042825>] warn_slowpath_null+0x15/0x20
> [  306.657271]  [<ffffffff8102ac39>] kmemcheck_fault+0xa9/0xc0
> [  306.657273]  [<ffffffff8147ca4b>] do_page_fault+0x1fb/0x560
> [  306.657276]  [<ffffffff8101167b>] ? intel_pmu_drain_bts_buffer+0x2b/0x170
> [  306.657279]  [<ffffffff8147a48f>] page_fault+0x1f/0x30
> [  306.657282]  [<ffffffff8100ef42>] ? x86_perf_event_update+0x12/0x70
> [  306.657284]  [<ffffffff810104b1>] ? intel_pmu_save_and_restart+0x11/0x20
> [  306.657287]  [<ffffffff81012e84>] intel_pmu_handle_irq+0x1d4/0x420
> [  306.657290]  [<ffffffff8147b570>] perf_event_nmi_handler+0x50/0xc0
> [  306.657292]  [<ffffffff8147cfa3>] notifier_call_chain+0x53/0x80
> [  306.657294]  [<ffffffff8147d018>] __atomic_notifier_call_chain+0x48/0x70
> [  306.657296]  [<ffffffff8147d051>] atomic_notifier_call_chain+0x11/0x20
> [  306.657298]  [<ffffffff8147d08e>] notify_die+0x2e/0x30
> [  306.657300]  [<ffffffff8147a8af>] do_nmi+0x4f/0x200
> [  306.657302]  [<ffffffff8147a6ea>] nmi+0x1a/0x20
> [  306.657304]  [<ffffffff8100fd4d>] ? intel_pmu_enable_all+0x9d/0x110
> [  306.657305]  <<EOE>>  [<ffffffff810104da>] intel_pmu_nhm_enable_all+0x1a/0x120
> [  306.657309]  [<ffffffff810131d4>] x86_pmu_enable+0x104/0x260
> [  306.657313]  [<ffffffff810a84e9>] perf_pmu_enable+0x39/0x50
> [  306.657314]  [<ffffffff8101236c>] x86_pmu_add+0xac/0x120
> [  306.657317]  [<ffffffff810aae68>] ? perf_install_in_context+0x18/0xa0
> [  306.657319]  [<ffffffff8102b001>] ? kmemcheck_pte_lookup+0x11/0x40
> [  306.657322]  [<ffffffff8147a48f>] ? page_fault+0x1f/0x30
> [  306.657325]  [<ffffffff810acf15>] event_sched_in+0x65/0x110
> [  306.657327]  [<ffffffff810afb95>] __perf_install_in_context+0x125/0x140
> [  306.657330]  [<ffffffff810ab100>] ? perf_remove_from_context+0xa0/0xa0
> [  306.657332]  [<ffffffff810ab159>] remote_function+0x59/0x70
> [  306.657335]  [<ffffffff81075d6e>] smp_call_function_single+0x8e/0x170
> [  306.657338]  [<ffffffff810a86a4>] cpu_function_call+0x34/0x40
> [  306.657340]  [<ffffffff810afa70>] ? perf_tp_event+0xf0/0xf0
> [  306.657342]  [<ffffffff810aaedf>] perf_install_in_context+0x8f/0xa0
> [  306.657345]  [<ffffffff810b0792>] sys_perf_event_open+0x592/0x7a0
> [  306.657348]  [<ffffffff814819a9>] sysenter_dispatch+0x7/0x27
> [  306.657350] ---[ end trace 7333dc2d81c31e96 ]---

That's just the kmemcheck fault handler warning about in_nmi(). You could
try to make the relevant perf allocations use __GFP_NOTRACK and/or
SLAB_NOTRACK to avoid page faulting in the perf NMI handler.
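
As an illustration of the idea only, the sketch below models an
allocator that honours a "do not track" flag, so that objects allocated
with it never fault on access. GFP_SKETCH_NOTRACK and the gfp_sketch_*
functions are hypothetical stand-ins, not the kernel's __GFP_NOTRACK
implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define GFP_SKETCH_NOTRACK 0x1u	/* hypothetical flag bit for this model */

/* Hypothetical object header remembering whether the object is tracked. */
struct gfp_sketch_obj {
	bool tracked;
	size_t size;
};

/* Allocate a model object; tracking is on unless the caller opts out. */
static struct gfp_sketch_obj *gfp_sketch_alloc(size_t size, unsigned int flags)
{
	struct gfp_sketch_obj *obj = malloc(sizeof(*obj));

	if (!obj)
		return NULL;
	obj->tracked = !(flags & GFP_SKETCH_NOTRACK);
	obj->size = size;
	return obj;
}

/* In this model, touching a tracked object is what causes a page fault. */
static bool gfp_sketch_access_faults(const struct gfp_sketch_obj *obj)
{
	assert(obj != NULL);
	return obj->tracked;
}
```

Every allocation that the NMI handler can reach would need the flag;
a single tracked object on that path is enough to fault again.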

                        Pekka


* Re: [BUG] perf and kmemcheck : fatal combination
  2011-04-26 10:08         ` Pekka Enberg
@ 2011-04-26 10:27           ` Eric Dumazet
  2011-04-26 12:27             ` Eric Dumazet
  0 siblings, 1 reply; 14+ messages in thread
From: Eric Dumazet @ 2011-04-26 10:27 UTC (permalink / raw)
  To: Pekka Enberg
  Cc: Ingo Molnar, Peter Zijlstra, Arnaldo Carvalho de Melo,
	Paul Mackerras, Vegard Nossum, linux-kernel, Mathieu Desnoyers

On Tuesday, 26 April 2011 at 13:08 +0300, Pekka Enberg wrote:

> That's just kmemcheck fault handler warning about in_nmi(). You could
> try to make the relevant perf allocations use __GFP_NOTRACK and/or
> SLAB_NOTRACK to avoid page faulting in the perf nmi handler.

Yes, I am going to try that, thanks




* Re: [BUG] perf and kmemcheck : fatal combination
  2011-04-26 10:27           ` Eric Dumazet
@ 2011-04-26 12:27             ` Eric Dumazet
  2011-04-26 12:33               ` Peter Zijlstra
  0 siblings, 1 reply; 14+ messages in thread
From: Eric Dumazet @ 2011-04-26 12:27 UTC (permalink / raw)
  To: Pekka Enberg
  Cc: Ingo Molnar, Peter Zijlstra, Arnaldo Carvalho de Melo,
	Paul Mackerras, Vegard Nossum, linux-kernel, Mathieu Desnoyers

On Tuesday, 26 April 2011 at 12:27 +0200, Eric Dumazet wrote:
> On Tuesday, 26 April 2011 at 13:08 +0300, Pekka Enberg wrote:
> 
> > That's just kmemcheck fault handler warning about in_nmi(). You could
> > try to make the relevant perf allocations use __GFP_NOTRACK and/or
> > SLAB_NOTRACK to avoid page faulting in the perf nmi handler.
> 
> Yes, I am going to try that, thanks
> 

That's far from trivial, maybe because we don't have a NOTRACK API for
percpu allocations?

I tried the following patch, without success:

diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index 632e5dc..bea4949 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -1632,7 +1632,7 @@ static int validate_event(struct perf_event *event)
 	struct event_constraint *c;
 	int ret = 0;
 
-	fake_cpuc = kmalloc(sizeof(*fake_cpuc), GFP_KERNEL | __GFP_ZERO);
+	fake_cpuc = kmalloc(sizeof(*fake_cpuc), GFP_KERNEL | __GFP_ZERO | ___GFP_NOTRACK);
 	if (!fake_cpuc)
 		return -ENOMEM;
 
@@ -1667,7 +1667,7 @@ static int validate_group(struct perf_event *event)
 	int ret, n;
 
 	ret = -ENOMEM;
-	fake_cpuc = kmalloc(sizeof(*fake_cpuc), GFP_KERNEL | __GFP_ZERO);
+	fake_cpuc = kmalloc(sizeof(*fake_cpuc), GFP_KERNEL | __GFP_ZERO | ___GFP_NOTRACK);
 	if (!fake_cpuc)
 		goto out;
 
diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
index 43fa20b..a659b61 100644
--- a/arch/x86/kernel/cpu/perf_event_intel.c
+++ b/arch/x86/kernel/cpu/perf_event_intel.c
@@ -1209,7 +1209,7 @@ static int intel_pmu_cpu_prepare(int cpu)
 		return NOTIFY_OK;
 
 	cpuc->per_core = kzalloc_node(sizeof(struct intel_percore),
-				      GFP_KERNEL, cpu_to_node(cpu));
+				      GFP_KERNEL | ___GFP_NOTRACK, cpu_to_node(cpu));
 	if (!cpuc->per_core)
 		return NOTIFY_BAD;
 
diff --git a/arch/x86/kernel/cpu/perf_event_intel_ds.c b/arch/x86/kernel/cpu/perf_event_intel_ds.c
index bab491b..e921a2f 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_ds.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_ds.c
@@ -84,7 +84,7 @@ static int alloc_pebs_buffer(int cpu)
 	if (!x86_pmu.pebs)
 		return 0;
 
-	buffer = kmalloc_node(PEBS_BUFFER_SIZE, GFP_KERNEL | __GFP_ZERO, node);
+	buffer = kmalloc_node(PEBS_BUFFER_SIZE, GFP_KERNEL | __GFP_ZERO | ___GFP_NOTRACK, node);
 	if (unlikely(!buffer))
 		return -ENOMEM;
 
@@ -122,7 +122,7 @@ static int alloc_bts_buffer(int cpu)
 	if (!x86_pmu.bts)
 		return 0;
 
-	buffer = kmalloc_node(BTS_BUFFER_SIZE, GFP_KERNEL | __GFP_ZERO, node);
+	buffer = kmalloc_node(BTS_BUFFER_SIZE, GFP_KERNEL | __GFP_ZERO | ___GFP_NOTRACK, node);
 	if (unlikely(!buffer))
 		return -ENOMEM;
 
@@ -155,7 +155,7 @@ static int alloc_ds_buffer(int cpu)
 	int node = cpu_to_node(cpu);
 	struct debug_store *ds;
 
-	ds = kmalloc_node(sizeof(*ds), GFP_KERNEL | __GFP_ZERO, node);
+	ds = kmalloc_node(sizeof(*ds), GFP_KERNEL | __GFP_ZERO | ___GFP_NOTRACK, node);
 	if (unlikely(!ds))
 		return -ENOMEM;
 
diff --git a/include/linux/hardirq.h b/include/linux/hardirq.h
index ba36217..8c2e3e6 100644
--- a/include/linux/hardirq.h
+++ b/include/linux/hardirq.h
@@ -211,7 +211,6 @@ extern void irq_exit(void);
 #define nmi_enter()						\
 	do {							\
 		ftrace_nmi_enter();				\
-		BUG_ON(in_nmi());				\
 		add_preempt_count(NMI_OFFSET + HARDIRQ_OFFSET);	\
 		lockdep_off();					\
 		rcu_nmi_enter();				\
diff --git a/kernel/perf_event.c b/kernel/perf_event.c
index 8e81a98..b09ba81 100644
--- a/kernel/perf_event.c
+++ b/kernel/perf_event.c
@@ -2589,14 +2589,14 @@ static int alloc_callchain_buffers(void)
 	 */
 	size = offsetof(struct callchain_cpus_entries, cpu_entries[nr_cpu_ids]);
 
-	entries = kzalloc(size, GFP_KERNEL);
+	entries = kzalloc(size, GFP_KERNEL | ___GFP_NOTRACK);
 	if (!entries)
 		return -ENOMEM;
 
 	size = sizeof(struct perf_callchain_entry) * PERF_NR_CONTEXTS;
 
 	for_each_possible_cpu(cpu) {
-		entries->cpu_entries[cpu] = kmalloc_node(size, GFP_KERNEL,
+		entries->cpu_entries[cpu] = kmalloc_node(size, GFP_KERNEL | ___GFP_NOTRACK,
 							 cpu_to_node(cpu));
 		if (!entries->cpu_entries[cpu])
 			goto fail;
@@ -2756,7 +2756,8 @@ alloc_perf_context(struct pmu *pmu, struct task_struct *task)
 {
 	struct perf_event_context *ctx;
 
-	ctx = kzalloc(sizeof(struct perf_event_context), GFP_KERNEL);
+	ctx = kzalloc(sizeof(struct perf_event_context),
+		      GFP_KERNEL | ___GFP_NOTRACK);
 	if (!ctx)
 		return NULL;
 
@@ -3451,7 +3452,7 @@ static void *perf_mmap_alloc_page(int cpu)
 	int node;
 
 	node = (cpu == -1) ? cpu : cpu_to_node(cpu);
-	page = alloc_pages_node(node, GFP_KERNEL | __GFP_ZERO, 0);
+	page = alloc_pages_node(node, GFP_KERNEL | __GFP_ZERO | ___GFP_NOTRACK, 0);
 	if (!page)
 		return NULL;
 
@@ -3468,7 +3469,7 @@ perf_buffer_alloc(int nr_pages, long watermark, int cpu, int flags)
 	size = sizeof(struct perf_buffer);
 	size += nr_pages * sizeof(void *);
 
-	buffer = kzalloc(size, GFP_KERNEL);
+	buffer = kzalloc(size, GFP_KERNEL | ___GFP_NOTRACK);
 	if (!buffer)
 		goto fail;
 
@@ -3585,7 +3586,7 @@ perf_buffer_alloc(int nr_pages, long watermark, int cpu, int flags)
 	size = sizeof(struct perf_buffer);
 	size += sizeof(void *);
 
-	buffer = kzalloc(size, GFP_KERNEL);
+	buffer = kzalloc(size, GFP_KERNEL | ___GFP_NOTRACK);
 	if (!buffer)
 		goto fail;
 
@@ -4841,7 +4842,7 @@ static void perf_event_mmap_event(struct perf_mmap_event *mmap_event)
 		 * need to add enough zero bytes after the string to handle
 		 * the 64bit alignment we do later.
 		 */
-		buf = kzalloc(PATH_MAX + sizeof(u64), GFP_KERNEL);
+		buf = kzalloc(PATH_MAX + sizeof(u64), GFP_KERNEL | ___GFP_NOTRACK);
 		if (!buf) {
 			name = strncpy(tmp, "//enomem", sizeof(tmp));
 			goto got_name;
@@ -5385,7 +5386,7 @@ static int swevent_hlist_get_cpu(struct perf_event *event, int cpu)
 	if (!swevent_hlist_deref(swhash) && cpu_online(cpu)) {
 		struct swevent_hlist *hlist;
 
-		hlist = kzalloc(sizeof(*hlist), GFP_KERNEL);
+		hlist = kzalloc(sizeof(*hlist), GFP_KERNEL | ___GFP_NOTRACK);
 		if (!hlist) {
 			err = -ENOMEM;
 			goto exit;
@@ -5969,7 +5970,7 @@ static int pmu_dev_alloc(struct pmu *pmu)
 {
 	int ret = -ENOMEM;
 
-	pmu->dev = kzalloc(sizeof(struct device), GFP_KERNEL);
+	pmu->dev = kzalloc(sizeof(struct device), GFP_KERNEL | ___GFP_NOTRACK);
 	if (!pmu->dev)
 		goto out;
 
@@ -6170,7 +6171,7 @@ perf_event_alloc(struct perf_event_attr *attr, int cpu,
 			return ERR_PTR(-EINVAL);
 	}
 
-	event = kzalloc(sizeof(*event), GFP_KERNEL);
+	event = kzalloc(sizeof(*event), GFP_KERNEL | ___GFP_NOTRACK);
 	if (!event)
 		return ERR_PTR(-ENOMEM);
 
@@ -7222,7 +7223,8 @@ static void __cpuinit perf_event_init_cpu(int cpu)
 	if (swhash->hlist_refcount > 0) {
 		struct swevent_hlist *hlist;
 
-		hlist = kzalloc_node(sizeof(*hlist), GFP_KERNEL, cpu_to_node(cpu));
+		hlist = kzalloc_node(sizeof(*hlist), GFP_KERNEL | ___GFP_NOTRACK,
+				     cpu_to_node(cpu));
 		WARN_ON(!hlist);
 		rcu_assign_pointer(swhash->swevent_hlist, hlist);
 	}




* Re: [BUG] perf and kmemcheck : fatal combination
  2011-04-26 12:27             ` Eric Dumazet
@ 2011-04-26 12:33               ` Peter Zijlstra
  2011-04-26 12:56                 ` Eric Dumazet
  2011-04-26 19:13                 ` Pekka Enberg
  0 siblings, 2 replies; 14+ messages in thread
From: Peter Zijlstra @ 2011-04-26 12:33 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Pekka Enberg, Ingo Molnar, Arnaldo Carvalho de Melo,
	Paul Mackerras, Vegard Nossum, linux-kernel, Mathieu Desnoyers

On Tue, 2011-04-26 at 14:27 +0200, Eric Dumazet wrote:
> That's far from trivial, maybe because we don't have a NOTRACK API for
> percpu allocations?

We can't use per-cpu allocations from NMI context because of the same
problem: per-cpu uses vmalloc, and vmalloc needs faults. Hence that
shouldn't be a problem.
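
To picture why "vmalloc needs faults", here is a small model that
assumes new mappings are first installed in a master table and copied
into a task's own table only when the task faults on them. That lazy
synchronisation is an assumption of this sketch, and the lazy_model_*
names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define LAZY_MODEL_SLOTS 4	/* arbitrary size of the modelled area */

/* Hypothetical pair of page tables plus a fault counter. */
struct lazy_model_tables {
	bool master[LAZY_MODEL_SLOTS];	/* mappings known to the master table */
	bool task[LAZY_MODEL_SLOTS];	/* mappings already synced to the task */
	int faults;			/* faults taken to sync mappings */
};

/* A new mapping is installed in the master table only. */
static void lazy_model_map(struct lazy_model_tables *t, int slot)
{
	assert(slot >= 0 && slot < LAZY_MODEL_SLOTS);
	t->master[slot] = true;
}

/* Returns true if the access succeeds, taking a sync fault if needed. */
static bool lazy_model_access(struct lazy_model_tables *t, int slot)
{
	assert(slot >= 0 && slot < LAZY_MODEL_SLOTS);
	if (t->task[slot])
		return true;		/* already synced, no fault */
	if (!t->master[slot])
		return false;		/* genuinely unmapped */
	t->faults++;			/* first touch: fault and copy entry */
	t->task[slot] = true;
	return true;
}
```

The first access to a freshly mapped slot costs one fault, which is
exactly what code running in NMI context cannot afford.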

It looks like you covered most of it though: the buffer and the
callchain stuff. Aside from that, it should only use some static data.

Pekka, what does kmemcheck do for .data and .bss things?


* Re: [BUG] perf and kmemcheck : fatal combination
  2011-04-26 12:33               ` Peter Zijlstra
@ 2011-04-26 12:56                 ` Eric Dumazet
  2011-04-26 13:09                   ` Eric Dumazet
  2011-04-26 19:13                 ` Pekka Enberg
  1 sibling, 1 reply; 14+ messages in thread
From: Eric Dumazet @ 2011-04-26 12:56 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Pekka Enberg, Ingo Molnar, Arnaldo Carvalho de Melo,
	Paul Mackerras, Vegard Nossum, linux-kernel, Mathieu Desnoyers

On Tuesday, 26 April 2011 at 14:33 +0200, Peter Zijlstra wrote:
> On Tue, 2011-04-26 at 14:27 +0200, Eric Dumazet wrote:
> > That's far from trivial, maybe because we don't have a NOTRACK API for
> > percpu allocations?
> 
> We can't use per-cpu allocations from NMI context because of the same
> problem, per-cpu uses vmalloc and vmalloc needs faults. Hence that
> shouldn't be a problem.
> 
> It looks like you covered most of it though, the buffer and the
> callchain stuff, aside from that it should only use some static data.
> 
> Pekka, what does kmemcheck do for .data and .bss things?

Hmm, maybe I have a problem because of the WARN_ON in kmemcheck: my
boot had "log_buf_len=32M", so kmemcheck was called again.


I am now trying to remove line 634 from
arch/x86/mm/kmemcheck/kmemcheck.c

[ 4564.554310] WARNING: at arch/x86/mm/kmemcheck/kmemcheck.c:634 kmemcheck_fault+0xa9/0xc0()
[ 4564.554312] Hardware name: ProLiant BL460c G6
[ 4564.554313] Modules linked in: nfsd lockd auth_rpcgss sunrpc tg3 libphy sg [last unloaded: x_tables]
[ 4564.554319] Pid: 4276, comm: perf Not tainted 2.6.39-rc4-00369-g23cf772-dirty #561
[ 4564.554320] Call Trace:
[ 4564.554322]  <NMI>  [<ffffffff8102ac39>] ? kmemcheck_fault+0xa9/0xc0
[ 4564.554329]  [<ffffffff810427db>] warn_slowpath_common+0x8b/0xc0
[ 4564.554331]  [<ffffffff81042825>] warn_slowpath_null+0x15/0x20
[ 4564.554333]  [<ffffffff8102ac39>] kmemcheck_fault+0xa9/0xc0
[ 4564.554337]  [<ffffffff8147ca4b>] do_page_fault+0x1fb/0x560
[ 4564.554342]  [<ffffffff811d0289>] ? put_dec+0x59/0x60
[ 4564.554344]  [<ffffffff811d0591>] ? number+0x301/0x330
[ 4564.554347]  [<ffffffff8147a48f>] page_fault+0x1f/0x30
[ 4564.554352]  [<ffffffff8124dce5>] ? vt_console_print+0x85/0x360
[ 4564.554355]  [<ffffffff8124dcda>] ? vt_console_print+0x7a/0x360
[ 4564.554358]  [<ffffffff81043159>] __call_console_drivers+0x89/0xa0
[ 4564.554360]  [<ffffffff810431bb>] _call_console_drivers+0x4b/0x80
[ 4564.554362]  [<ffffffff810432d7>] console_unlock+0xe7/0x1e0
[ 4564.554365]  [<ffffffff8104388e>] vprintk+0x1ee/0x4a0
[ 4564.554368]  [<ffffffff8102ac39>] ? kmemcheck_fault+0xa9/0xc0
[ 4564.554370]  [<ffffffff81043ba7>] printk+0x67/0x70
[ 4564.554372]  [<ffffffff8102ac39>] ? kmemcheck_fault+0xa9/0xc0
[ 4564.554375]  [<ffffffff81042789>] warn_slowpath_common+0x39/0xc0
[ 4564.554377]  [<ffffffff81042825>] warn_slowpath_null+0x15/0x20
[ 4564.554379]  [<ffffffff8102ac39>] kmemcheck_fault+0xa9/0xc0
[ 4564.554381]  [<ffffffff8147ca4b>] do_page_fault+0x1fb/0x560
[ 4564.554384]  [<ffffffff8101167b>] ? intel_pmu_drain_bts_buffer+0x2b/0x170
[ 4564.554387]  [<ffffffff8147a48f>] page_fault+0x1f/0x30
[ 4564.554390]  [<ffffffff8100ef42>] ? x86_perf_event_update+0x12/0x70
[ 4564.554392]  [<ffffffff810104b1>] ? intel_pmu_save_and_restart+0x11/0x20
[ 4564.554394]  [<ffffffff81012e84>] intel_pmu_handle_irq+0x1d4/0x420
[ 4564.554397]  [<ffffffff8147b570>] perf_event_nmi_handler+0x50/0xc0
[ 4564.554399]  [<ffffffff8147cfa3>] notifier_call_chain+0x53/0x80
[ 4564.554401]  [<ffffffff8147d018>] __atomic_notifier_call_chain+0x48/0x70
[ 4564.554404]  [<ffffffff8147d051>] atomic_notifier_call_chain+0x11/0x20
[ 4564.554406]  [<ffffffff8147d08e>] notify_die+0x2e/0x30
[ 4564.554407]  [<ffffffff8147a8af>] do_nmi+0x4f/0x200
[ 4564.554409]  [<ffffffff8147a6ea>] nmi+0x1a/0x20
[ 4564.554411]  [<ffffffff8100fd4d>] ? intel_pmu_enable_all+0x9d/0x110





* Re: [BUG] perf and kmemcheck : fatal combination
  2011-04-26 12:56                 ` Eric Dumazet
@ 2011-04-26 13:09                   ` Eric Dumazet
  0 siblings, 0 replies; 14+ messages in thread
From: Eric Dumazet @ 2011-04-26 13:09 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Pekka Enberg, Ingo Molnar, Arnaldo Carvalho de Melo,
	Paul Mackerras, Vegard Nossum, linux-kernel, Mathieu Desnoyers

On Tuesday 26 April 2011 at 14:56 +0200, Eric Dumazet wrote:
> On Tuesday 26 April 2011 at 14:33 +0200, Peter Zijlstra wrote:
> > On Tue, 2011-04-26 at 14:27 +0200, Eric Dumazet wrote:
> > > That's far from trivial, maybe because we don't have a NOTRACK API for
> > > percpu allocations?
> > 
> > We can't use per-cpu allocations from NMI context because of the same
> > problem, per-cpu uses vmalloc and vmalloc needs faults. Hence that
> > shouldn't be a problem.
> > 
> > It looks like you covered most of it though, the buffer and the
> > callchain stuff, aside from that it should only use some static data.
> > 
> > Pekka, what does kmemcheck do for .data and .bss things?
> 
> Hmm, maybe I have a problem because of the WARN_ON in kmemcheck, and my
> boot had "log_buf_len=32M", so kmemcheck was called again.
> 
> I am now trying to remove line 634 from
> arch/x86/mm/kmemcheck/kmemcheck.c

Yes, it's making some progress:

[  328.696312] BUG: unable to handle kernel 
[  328.697078] kernel tried to execute NX-protected page - exploit attempt? (uid: 0)
[  328.698284] BUG: unable to handle kernel paging request at ffff88011fc07b58
[  328.699569] IP: [<ffff88011fc07968>] 0xffff88011fc07967
[  328.700488] PGD 1a94063 PUD 11f6f9067 PMD 11f7f8067 PTE 800000011fc07163
[  328.702327] Oops: 0011 [#1] PREEMPT SMP 
[  328.703693] last sysfs file: /sys/devices/system/cpu/online
[  328.704361] CPU 0 
[  328.704777] Modules linked in: nfsd lockd auth_rpcgss sunrpc tg3 libphy sg [last unloaded: x_tables]
[  328.708693] 
[  328.709075] Pid: 4019, comm: perf Not tainted 2.6.39-rc4-00369-g23cf772-dirty #562 HP ProLiant BL460c G6
[  328.710581] RIP: 3c4c:[<ffff88011fc07968>]  [<ffff88011fc07968>] 0xffff88011fc07967
[  328.711692] RSP: 0000:ffff88011b3cbfd8  EFLAGS: ffff88011fc0793c
[  328.712388] RAX: ffff88011b3ca000 RBX: ffff88011fc07ff8 RCX: ffff88011fc07158
[  328.713180] RDX: ffff88011fc03fc0 RSI: ffff88011fc00000 RDI: ffff88011fc07158
[  328.713959] RBP: ffff88011fc07888 R08: ffffffff81601680 R09: ffff880114278400
[  328.714738] R10: ffffffff81004ed3 R11: ffff88011fc078d8 R12: ffffffff8105eb99
[  328.715516] R13: ffff88011fc07888 R14: 0000000000000018 R15: ffff880112638238
[  328.716294] FS:  0000000000000000(0000) GS:ffff88011fc00000(0063) knlGS:00000000f77176c0




* Re: [BUG] perf and kmemcheck : fatal combination
  2011-04-26  9:53       ` Eric Dumazet
  2011-04-26 10:08         ` Pekka Enberg
@ 2011-04-26 13:53         ` Mathieu Desnoyers
  1 sibling, 0 replies; 14+ messages in thread
From: Mathieu Desnoyers @ 2011-04-26 13:53 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Ingo Molnar, Peter Zijlstra, Arnaldo Carvalho de Melo,
	Paul Mackerras, Pekka Enberg, Vegard Nossum, linux-kernel

* Eric Dumazet (eric.dumazet@gmail.com) wrote:
> On Tuesday 26 April 2011 at 10:57 +0200, Eric Dumazet wrote:
> > On Tuesday 26 April 2011 at 10:04 +0200, Ingo Molnar wrote:
> > 
> > > Eric, does it manage to limp along if you remove the BUG_ON()?
> > > 
> > > That risks NMI recursion but maybe it allows you to see why things are slow, 
> > > before it crashes ;-)
> > > 
> > 
> > If I remove the BUG_ON from nmi_enter, it seems to crash very fast 
> > 
> > 
> 
> Before you ask, some more complete netconsole traces :
[...]
> [  306.657279]  [<ffffffff8147a48f>] page_fault+0x1f/0x30
> [  306.657282]  [<ffffffff8100ef42>] ? x86_perf_event_update+0x12/0x70
> [  306.657284]  [<ffffffff810104b1>] ? intel_pmu_save_and_restart+0x11/0x20
> [  306.657287]  [<ffffffff81012e84>] intel_pmu_handle_irq+0x1d4/0x420
> [  306.657290]  [<ffffffff8147b570>] perf_event_nmi_handler+0x50/0xc0
> [  306.657292]  [<ffffffff8147cfa3>] notifier_call_chain+0x53/0x80
> [  306.657294]  [<ffffffff8147d018>] __atomic_notifier_call_chain+0x48/0x70
> [  306.657296]  [<ffffffff8147d051>] atomic_notifier_call_chain+0x11/0x20
> [  306.657298]  [<ffffffff8147d08e>] notify_die+0x2e/0x30
> [  306.657300]  [<ffffffff8147a8af>] do_nmi+0x4f/0x200
> [  306.657302]  [<ffffffff8147a6ea>] nmi+0x1a/0x20
> [  306.657304]  [<ffffffff8100fd4d>] ? intel_pmu_enable_all+0x9d/0x110

Just a thought: I've seen this kind of issue with LTTng before, and my
approach is to ensure this does not happen by issuing a
vmalloc_sync_all() call between all vmalloc/vmap calls and accesses to
those memory regions from the tracer code. So it boils down to:

1 - perform all memory allocation at trace session creation (from thread
    context). I do the page table in software (and allocate my buffer
    pages with alloc_pages()), so no page fault is generated by those
    accesses. However, I use kmalloc() to allocate my own
    software-page-table, which uses vmalloc if the allocation is larger
    than a certain threshold. Therefore, I need to issue
    vmalloc_sync_all() before NMI starts using the buffers.

2 - issue vmalloc_sync_all() from the tracer code, after buffer
    allocation, but before the trace session is added to the RCU list of
    active traces.

3 - issue vmalloc_sync_all() when each LTTng module is loaded, before
    they are registered to LTTng, so the memory used to keep their
    code and data is faulted in.

Until we find time and resources to finally implement the virtualized
NMI handling (which handles page faults within NMIs) as discussed with
Linus last summer, I am staying with this work-around. It might be good
enough for perf too.
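
The ordering in steps 1 and 2 above can be sketched as a small userspace
model. This is only an illustration under stated assumptions, not LTTng or
perf code: struct trace_session, session_create(), session_publish() and
model_nmi_write() are hypothetical names, and model_vmalloc_sync_all() is a
stand-in that merely sets a flag where the kernel would call
vmalloc_sync_all(). The point it tries to show is "allocate, sync, and only
then publish", so the modelled NMI path never touches unsynced memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace model; none of these names come from LTTng or perf. */
struct trace_session {
	unsigned char *buf;	/* stands in for a vmalloc-backed allocation */
	size_t size;
	bool synced;		/* set once the modelled sync has run */
	bool published;		/* set once the NMI path may see the session */
};

enum nmi_result { NMI_SKIPPED, NMI_WROTE, NMI_WOULD_FAULT };

/* Stand-in for the kernel's vmalloc_sync_all(); here it only sets a flag. */
static void model_vmalloc_sync_all(struct trace_session *s)
{
	s->synced = true;
}

/* Step 1: allocate everything up front, from thread context. */
static int session_create(struct trace_session *s, size_t size)
{
	if (size == 0)
		return -1;
	s->buf = calloc(1, size);
	if (!s->buf)
		return -1;
	s->size = size;
	s->synced = false;
	s->published = false;
	return 0;
}

/* Step 2: sync first, and only then make the session visible. */
static void session_publish(struct trace_session *s)
{
	assert(s->buf != NULL);
	model_vmalloc_sync_all(s);
	s->published = true;
}

/* Modelled NMI path: it must never touch memory that could fault. */
static enum nmi_result model_nmi_write(struct trace_session *s,
				       unsigned char byte)
{
	if (!s->published)
		return NMI_SKIPPED;	/* session not visible yet */
	if (!s->synced)
		return NMI_WOULD_FAULT;	/* the case the ordering prevents */
	s->buf[0] = byte;
	return NMI_WROTE;
}

/* Tear down from thread context, hiding the session before freeing. */
static void session_destroy(struct trace_session *s)
{
	s->published = false;
	free(s->buf);
	s->buf = NULL;
}
```

In this model a write attempted before session_publish() is skipped, and
NMI_WOULD_FAULT can only be reached if a session is published without the
sync, which is the situation the work-around is meant to rule out.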

Thanks,

Mathieu

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com


* Re: [BUG] perf and kmemcheck : fatal combination
  2011-04-26 12:33               ` Peter Zijlstra
  2011-04-26 12:56                 ` Eric Dumazet
@ 2011-04-26 19:13                 ` Pekka Enberg
  1 sibling, 0 replies; 14+ messages in thread
From: Pekka Enberg @ 2011-04-26 19:13 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Eric Dumazet, Ingo Molnar, Arnaldo Carvalho de Melo,
	Paul Mackerras, Vegard Nossum, linux-kernel, Mathieu Desnoyers

On Tue, 2011-04-26 at 14:33 +0200, Peter Zijlstra wrote:
> On Tue, 2011-04-26 at 14:27 +0200, Eric Dumazet wrote:
> > That's far from trivial, maybe because we don't have a NOTRACK API for
> > percpu allocations?
> 
> We can't use per-cpu allocations from NMI context because of the same
> problem, per-cpu uses vmalloc and vmalloc needs faults. Hence that
> shouldn't be a problem.
> 
> It looks like you covered most of it though, the buffer and the
> callchain stuff, aside from that it should only use some static data.
> 
> Pekka, what does kmemcheck do for .data and .bss things?

Nothing. kmemcheck is only active for memory allocated with slab and the
page allocator.



end of thread, other threads:[~2011-04-26 19:13 UTC | newest]

Thread overview: 14+ messages
2011-04-25 16:08 [BUG] perf and kmemcheck : fatal combination Eric Dumazet
2011-04-26  7:38 ` Peter Zijlstra
2011-04-26  7:43   ` Pekka Enberg
2011-04-26  8:04   ` Ingo Molnar
2011-04-26  8:57     ` Eric Dumazet
2011-04-26  9:53       ` Eric Dumazet
2011-04-26 10:08         ` Pekka Enberg
2011-04-26 10:27           ` Eric Dumazet
2011-04-26 12:27             ` Eric Dumazet
2011-04-26 12:33               ` Peter Zijlstra
2011-04-26 12:56                 ` Eric Dumazet
2011-04-26 13:09                   ` Eric Dumazet
2011-04-26 19:13                 ` Pekka Enberg
2011-04-26 13:53         ` Mathieu Desnoyers