* [PATCH] [0/2] Fix panic regression in 2.6.27
@ 2008-09-02 13:49 Andi Kleen
2008-09-02 13:49 ` [PATCH] [1/2] Add a SYSTEM_PANIC state Andi Kleen
2008-09-02 13:49 ` [PATCH] [2/2] Don't complain about disabled irqs when the system has paniced Andi Kleen
0 siblings, 2 replies; 18+ messages in thread
From: Andi Kleen @ 2008-09-02 13:49 UTC (permalink / raw)
To: torvalds, linux-kernel
panic() from an interrupt right now spews lots of ugly WARN_ON
warnings because panic->smp_send_stop calls the new smp
call helpers which complain on irqs_disabled().
I fixed it in two parts: add a new SYSTEM_PANIC state to check
cleanly for panic and then disable the warnings in this case.
I believe this should be fixed for 2.6.27.
-Andi
^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH] [1/2] Add a SYSTEM_PANIC state
2008-09-02 13:49 [PATCH] [0/2] Fix panic regression in 2.6.27 Andi Kleen
@ 2008-09-02 13:49 ` Andi Kleen
2008-09-03 19:04 ` Andrew Morton
2008-09-02 13:49 ` [PATCH] [2/2] Don't complain about disabled irqs when the system has paniced Andi Kleen
1 sibling, 1 reply; 18+ messages in thread
From: Andi Kleen @ 2008-09-02 13:49 UTC (permalink / raw)
To: torvalds, linux-kernel
Natural extension of the other system states.
This is useful to do some special case code on system panic.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
include/linux/kernel.h | 1 +
kernel/panic.c | 2 ++
2 files changed, 3 insertions(+)
Index: linux/include/linux/kernel.h
===================================================================
--- linux.orig/include/linux/kernel.h
+++ linux/include/linux/kernel.h
@@ -248,6 +248,7 @@ extern enum system_states {
SYSTEM_POWER_OFF,
SYSTEM_RESTART,
SYSTEM_SUSPEND_DISK,
+ SYSTEM_PANIC,
} system_state;
#define TAINT_PROPRIETARY_MODULE (1<<0)
Index: linux/kernel/panic.c
===================================================================
--- linux.orig/kernel/panic.c
+++ linux/kernel/panic.c
@@ -68,6 +68,8 @@ NORET_TYPE void panic(const char * fmt,
unsigned long caller = (unsigned long) __builtin_return_address(0);
#endif
+ system_state = SYSTEM_PANIC;
+
/*
* It's possible to come here directly from a panic-assertion and not
* have preempt disabled. Some functions called from here want
^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH] [2/2] Don't complain about disabled irqs when the system has paniced
2008-09-02 13:49 [PATCH] [0/2] Fix panic regression in 2.6.27 Andi Kleen
2008-09-02 13:49 ` [PATCH] [1/2] Add a SYSTEM_PANIC state Andi Kleen
@ 2008-09-02 13:49 ` Andi Kleen
2008-09-02 14:28 ` Peter Zijlstra
2008-09-12 19:50 ` Pavel Machek
1 sibling, 2 replies; 18+ messages in thread
From: Andi Kleen @ 2008-09-02 13:49 UTC (permalink / raw)
To: torvalds, linux-kernel
panic calls smp_send_stop which eventually calls smp_call_function_*.
smp_call_function warns about disabled interrupts. But it's legal
to call panic in this case. When this happens panic() prints
several ugly backtraces. So don't check for disabled interrupts
in panic state.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Index: linux/kernel/smp.c
===================================================================
--- linux.orig/kernel/smp.c
+++ linux/kernel/smp.c
@@ -216,7 +216,7 @@ int smp_call_function_single(int cpu, vo
int err = 0;
/* Can deadlock when called with interrupts disabled */
- WARN_ON(irqs_disabled());
+ WARN_ON(system_state < SYSTEM_PANIC && irqs_disabled());
if (cpu == me) {
local_irq_save(flags);
@@ -260,7 +260,8 @@ EXPORT_SYMBOL(smp_call_function_single);
void __smp_call_function_single(int cpu, struct call_single_data *data)
{
/* Can deadlock when called with interrupts disabled */
- WARN_ON((data->flags & CSD_FLAG_WAIT) && irqs_disabled());
+ WARN_ON(system_state < SYSTEM_PANIC &&
+ (data->flags & CSD_FLAG_WAIT) && irqs_disabled());
generic_exec_single(cpu, data);
}
@@ -329,7 +330,7 @@ int smp_call_function_mask(cpumask_t mas
int slowpath = 0;
/* Can deadlock when called with interrupts disabled */
- WARN_ON(irqs_disabled());
+ WARN_ON(system_state < SYSTEM_PANIC && irqs_disabled());
cpu = smp_processor_id();
allbutself = cpu_online_map;
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] [2/2] Don't complain about disabled irqs when the system has paniced
2008-09-02 13:49 ` [PATCH] [2/2] Don't complain about disabled irqs when the system has paniced Andi Kleen
@ 2008-09-02 14:28 ` Peter Zijlstra
2008-09-02 14:40 ` Andi Kleen
2008-09-12 19:50 ` Pavel Machek
1 sibling, 1 reply; 18+ messages in thread
From: Peter Zijlstra @ 2008-09-02 14:28 UTC (permalink / raw)
To: Andi Kleen; +Cc: torvalds, linux-kernel
On Tue, 2008-09-02 at 15:49 +0200, Andi Kleen wrote:
> panic calls smp_send_stop which eventually calls smp_call_function_*.
> smp_call_function warns about disabled interrupts. But it's legal
> to call panic in this case. When this happens panic() prints
> several ugly backtraces. So don't check for disabled interrupts
> in panic state.
While it might be legal for panic to be called from such contexts, I
understand those warnings are there to warn of deadlocks.
So with the below patch you allow panic to deadlock if I understand
things correctly.
> Signed-off-by: Andi Kleen <ak@linux.intel.com>
>
> Index: linux/kernel/smp.c
> ===================================================================
> --- linux.orig/kernel/smp.c
> +++ linux/kernel/smp.c
> @@ -216,7 +216,7 @@ int smp_call_function_single(int cpu, vo
> int err = 0;
>
> /* Can deadlock when called with interrupts disabled */
> - WARN_ON(irqs_disabled());
> + WARN_ON(system_state < SYSTEM_PANIC && irqs_disabled());
>
> if (cpu == me) {
> local_irq_save(flags);
> @@ -260,7 +260,8 @@ EXPORT_SYMBOL(smp_call_function_single);
> void __smp_call_function_single(int cpu, struct call_single_data *data)
> {
> /* Can deadlock when called with interrupts disabled */
> - WARN_ON((data->flags & CSD_FLAG_WAIT) && irqs_disabled());
> + WARN_ON(system_state < SYSTEM_PANIC &&
> + (data->flags & CSD_FLAG_WAIT) && irqs_disabled());
>
> generic_exec_single(cpu, data);
> }
> @@ -329,7 +330,7 @@ int smp_call_function_mask(cpumask_t mas
> int slowpath = 0;
>
> /* Can deadlock when called with interrupts disabled */
> - WARN_ON(irqs_disabled());
> + WARN_ON(system_state < SYSTEM_PANIC && irqs_disabled());
>
> cpu = smp_processor_id();
> allbutself = cpu_online_map;
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] [2/2] Don't complain about disabled irqs when the system has paniced
2008-09-02 14:28 ` Peter Zijlstra
@ 2008-09-02 14:40 ` Andi Kleen
2008-09-02 14:45 ` Peter Zijlstra
0 siblings, 1 reply; 18+ messages in thread
From: Andi Kleen @ 2008-09-02 14:40 UTC (permalink / raw)
To: Peter Zijlstra; +Cc: Andi Kleen, torvalds, linux-kernel
On Tue, Sep 02, 2008 at 04:28:03PM +0200, Peter Zijlstra wrote:
> On Tue, 2008-09-02 at 15:49 +0200, Andi Kleen wrote:
> > panic calls smp_send_stop which eventually calls smp_call_function_*.
> > smp_call_function warns about disabled interrupts. But it's legal
> > to call panic in this case. When this happens panic() prints
> > several ugly backtraces. So don't check for disabled interrupts
> > in panic state.
>
> While it might be legal for panic to be called from such contexts, I
> understand those warnings are there to warn of deadlocks.
>
> So with the below patch you allow panic to deadlock if I understand
> things correctly.
Please describe the deadlock exactly. I don't think it can deadlock
in this case.
Besides do you prefer to not allow panic from interrupts/machine
checks etc. anymore?
-Andi
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] [2/2] Don't complain about disabled irqs when the system has paniced
2008-09-02 14:40 ` Andi Kleen
@ 2008-09-02 14:45 ` Peter Zijlstra
2008-09-02 15:00 ` Andi Kleen
0 siblings, 1 reply; 18+ messages in thread
From: Peter Zijlstra @ 2008-09-02 14:45 UTC (permalink / raw)
To: Andi Kleen; +Cc: torvalds, linux-kernel, Jens Axboe, Nick Piggin
On Tue, 2008-09-02 at 16:40 +0200, Andi Kleen wrote:
> On Tue, Sep 02, 2008 at 04:28:03PM +0200, Peter Zijlstra wrote:
> > On Tue, 2008-09-02 at 15:49 +0200, Andi Kleen wrote:
> > > panic calls smp_send_stop which eventually calls smp_call_function_*.
> > > smp_call_function warns about disabled interrupts. But it's legal
> > > to call panic in this case. When this happens panic() prints
> > > several ugly backtraces. So don't check for disabled interrupts
> > > in panic state.
> >
> > While it might be legal for panic to be called from such contexts, I
> > understand those warnings are there to warn of deadlocks.
> >
> > So with the below patch you allow panic to deadlock if I understand
> > things correctly.
>
> Please describe the deadlock exactly. I don't think it can deadlock
> in this case.
Then why are those warnings there? The deadlock is for the CSD_FLAG_WAIT
case, which can always happen due to the static csd data fallback.
The deadlock scenario is long the lines of two such smp_call_function*()
both under irq disabled calling each other with CSD_FLAG_WAIT set.
Neither remote cpu will handle the IPI due to irqs being disabled, so
we'll wait at-infinitum for completion.
> Besides do you prefer to not allow panic from interrupts/machine
> checks etc. anymore?
However did I imply that, I just said your fix looked iffy.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] [2/2] Don't complain about disabled irqs when the system has paniced
2008-09-02 14:45 ` Peter Zijlstra
@ 2008-09-02 15:00 ` Andi Kleen
2008-09-03 6:02 ` Nick Piggin
0 siblings, 1 reply; 18+ messages in thread
From: Andi Kleen @ 2008-09-02 15:00 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Andi Kleen, torvalds, linux-kernel, Jens Axboe, Nick Piggin
On Tue, Sep 02, 2008 at 04:45:17PM +0200, Peter Zijlstra wrote:
> On Tue, 2008-09-02 at 16:40 +0200, Andi Kleen wrote:
> > On Tue, Sep 02, 2008 at 04:28:03PM +0200, Peter Zijlstra wrote:
> > > On Tue, 2008-09-02 at 15:49 +0200, Andi Kleen wrote:
> > > > panic calls smp_send_stop which eventually calls smp_call_function_*.
> > > > smp_call_function warns about disabled interrupts. But it's legal
> > > > to call panic in this case. When this happens panic() prints
> > > > several ugly backtraces. So don't check for disabled interrupts
> > > > in panic state.
> > >
> > > While it might be legal for panic to be called from such contexts, I
> > > understand those warnings are there to warn of deadlocks.
> > >
> > > So with the below patch you allow panic to deadlock if I understand
> > > things correctly.
> >
> > Please describe the deadlock exactly. I don't think it can deadlock
> > in this case.
>
> Then why are those warnings there? The deadlock is for the CSD_FLAG_WAIT
> case, which can always happen due to the static csd data fallback.
>
> The deadlock scenario is long the lines of two such smp_call_function*()
> both under irq disabled calling each other with CSD_FLAG_WAIT set.
> Neither remote cpu will handle the IPI due to irqs being disabled, so
> we'll wait at-infinitum for completion.
First smp_send_stop does not wait (or at least not pass the
wait flag, it will still wait for the first ack like everyone else)
I don't claim to understand the new kernel/smp.c code (it seems to me quite
overdesigned and complicated and I admit I got lost in it somewhere),
but I think your scenario would rely on a global lock and presumably there
is none in the new code?
>
> > Besides do you prefer to not allow panic from interrupts/machine
> > checks etc. anymore?
>
> However did I imply that, I just said your fix looked iffy.
Well it would be the only alternative. Or have a timeout (I had
such a hack a long time ago) but that has also other issues.
In fact for smp_send_stop() it would be far better to just use an NMI,
but we unfortunately have a few BIOS that do not support NMI properly.
I think for 2.6.27 at least this is the best fix. At least keeping
panic that broken is no option I would say.
-Andi
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] [2/2] Don't complain about disabled irqs when the system has paniced
2008-09-02 15:00 ` Andi Kleen
@ 2008-09-03 6:02 ` Nick Piggin
2008-09-03 6:59 ` Andi Kleen
0 siblings, 1 reply; 18+ messages in thread
From: Nick Piggin @ 2008-09-03 6:02 UTC (permalink / raw)
To: Andi Kleen; +Cc: Peter Zijlstra, torvalds, linux-kernel, Jens Axboe
On Wednesday 03 September 2008 01:00, Andi Kleen wrote:
> On Tue, Sep 02, 2008 at 04:45:17PM +0200, Peter Zijlstra wrote:
> > The deadlock scenario is long the lines of two such smp_call_function*()
> > both under irq disabled calling each other with CSD_FLAG_WAIT set.
> > Neither remote cpu will handle the IPI due to irqs being disabled, so
> > we'll wait at-infinitum for completion.
>
> First smp_send_stop does not wait (or at least not pass the
> wait flag, it will still wait for the first ack like everyone else)
It won't wait, but it may have to wait for an ack as you say (but now
this is a very rare case when kmalloc fails rather than always having
to wait for so long).
So yes, it does wait in some cases. If interrupts are disabled then it
will stop processing IPIs which are sent to it from another CPU, which
might be also calling smp_send_stop and which itself is not processing
IPIs from the CPU. This is the deadlock.
We could serialise smp_send_stop (using a simple xchg, in case we panic
in lockdep), and then it is possible to get away without deadlocking.
> I don't claim to understand the new kernel/smp.c code (it seems to me quite
> overdesigned and complicated and I admit I got lost in it somewhere),
> but I think your scenario would rely on a global lock and presumably there
> is none in the new code?
Overdesigned? Well the old code was horribly slow and synchronized, and
made it useless for doing anything except things like smp_send_stop so
I would say it was under designed ;)
> > > Besides do you prefer to not allow panic from interrupts/machine
> > > checks etc. anymore?
> >
> > However did I imply that, I just said your fix looked iffy.
>
> Well it would be the only alternative. Or have a timeout (I had
> such a hack a long time ago) but that has also other issues.
>
> In fact for smp_send_stop() it would be far better to just use an NMI,
> but we unfortunately have a few BIOS that do not support NMI properly.
>
> I think for 2.6.27 at least this is the best fix. At least keeping
> panic that broken is no option I would say.
It is reasonable I think, but I don't like testing symbolic constants
with inequalities like in your patch 2/2. Can you just make it
system_state != SYSTEM_PANIC ?
Also... is it really a regression? The old code definitely had deadlock
warnings too, but maybe they were made more complete in the new code?
Thanks,
Nick
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] [2/2] Don't complain about disabled irqs when the system has paniced
2008-09-03 6:02 ` Nick Piggin
@ 2008-09-03 6:59 ` Andi Kleen
2008-09-03 9:37 ` Nick Piggin
0 siblings, 1 reply; 18+ messages in thread
From: Andi Kleen @ 2008-09-03 6:59 UTC (permalink / raw)
To: Nick Piggin
Cc: Andi Kleen, Peter Zijlstra, torvalds, linux-kernel, Jens Axboe
On Wed, Sep 03, 2008 at 04:02:37PM +1000, Nick Piggin wrote:
> > > Neither remote cpu will handle the IPI due to irqs being disabled, so
> > > we'll wait at-infinitum for completion.
> >
> > First smp_send_stop does not wait (or at least not pass the
> > wait flag, it will still wait for the first ack like everyone else)
>
> It won't wait, but it may have to wait for an ack as you say (but now
> this is a very rare case when kmalloc fails rather than always having
> to wait for so long).
kmalloc is actually not safe in panic. panic could happen inside
kmalloc(). Hmm I missed it does that unconditionally.
If yes the code is more broken than I thought.
> So yes, it does wait in some cases. If interrupts are disabled then it
> will stop processing IPIs which are sent to it from another CPU, which
> might be also calling smp_send_stop and which itself is not processing
> IPIs from the CPU. This is the deadlock.
>
> We could serialise smp_send_stop (using a simple xchg, in case we panic
> in lockdep), and then it is possible to get away without deadlocking.
Yes we need to get rid of the kmalloc too. Either use a static data
structure or a special path :/
> > I don't claim to understand the new kernel/smp.c code (it seems to me quite
> > overdesigned and complicated and I admit I got lost in it somewhere),
> > but I think your scenario would rely on a global lock and presumably there
> > is none in the new code?
>
> Overdesigned? Well the old code was horribly slow and synchronized, and
> made it useless for doing anything except things like smp_send_stop so
> I would say it was under designed ;)
It just seems quite complicated. Do you really need four layers
of function calls just to ask the low level code to trigger
an IPI? And are there really benchmarks that show the
queueing does actually improve something significantly? Ok I can see the point
of having distributed locks (did that on my own for the x86-64
TLB flusher), but that really doesn't need that much code !
And the queueing has bad side effects like breaking panic
and adding hard to test fallback paths.
Perhaps I'm old fashioned, but somehow my feeling is noone tries
to keep code simple anymore :/
> > > > Besides do you prefer to not allow panic from interrupts/machine
> > > > checks etc. anymore?
> > >
> > > However did I imply that, I just said your fix looked iffy.
> >
> > Well it would be the only alternative. Or have a timeout (I had
> > such a hack a long time ago) but that has also other issues.
> >
> > In fact for smp_send_stop() it would be far better to just use an NMI,
> > but we unfortunately have a few BIOS that do not support NMI properly.
> >
> > I think for 2.6.27 at least this is the best fix. At least keeping
> > panic that broken is no option I would say.
>
> It is reasonable I think, but I don't like testing symbolic constants
> with inequalities like in your patch 2/2. Can you just make it
> system_state != SYSTEM_PANIC ?
Well I like it.
>
> Also... is it really a regression? The old code definitely had deadlock
> warnings too, but maybe they were made more complete in the new code?
Yes 2.6.26 did panic without these endless warnings.
-Andi
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] [2/2] Don't complain about disabled irqs when the system has paniced
2008-09-03 6:59 ` Andi Kleen
@ 2008-09-03 9:37 ` Nick Piggin
2008-09-03 9:48 ` Andi Kleen
0 siblings, 1 reply; 18+ messages in thread
From: Nick Piggin @ 2008-09-03 9:37 UTC (permalink / raw)
To: Andi Kleen; +Cc: Peter Zijlstra, torvalds, linux-kernel, Jens Axboe
On Wednesday 03 September 2008 16:59, Andi Kleen wrote:
> On Wed, Sep 03, 2008 at 04:02:37PM +1000, Nick Piggin wrote:
> > > > Neither remote cpu will handle the IPI due to irqs being disabled, so
> > > > we'll wait at-infinitum for completion.
> > >
> > > First smp_send_stop does not wait (or at least not pass the
> > > wait flag, it will still wait for the first ack like everyone else)
> >
> > It won't wait, but it may have to wait for an ack as you say (but now
> > this is a very rare case when kmalloc fails rather than always having
> > to wait for so long).
>
> kmalloc is actually not safe in panic. panic could happen inside
> kmalloc(). Hmm I missed it does that unconditionally.
> If yes the code is more broken than I thought.
Hmm... that does pull in significantly more code yes...
It should be fixed. I guess a quickfix is to add a new call in
kernel/smp.c for use by panic code.
smp_send_stop is not a very precise heuristic when used at panic time,
though, so I prefer not to slow down anything just for that case.
> > > I don't claim to understand the new kernel/smp.c code (it seems to me
> > > quite overdesigned and complicated and I admit I got lost in it
> > > somewhere), but I think your scenario would rely on a global lock and
> > > presumably there is none in the new code?
> >
> > Overdesigned? Well the old code was horribly slow and synchronized, and
> > made it useless for doing anything except things like smp_send_stop so
> > I would say it was under designed ;)
>
> It just seems quite complicated. Do you really need four layers
> of function calls just to ask the low level code to trigger
> an IPI?
Oh. I thought the actual function call structure itself should be the
nice cleanup part that everyone would like...
- call_function->call_function_mask pre existing code
New:
- arch IPI function allows most logic to be implemented generically
- new API that allows deadlock-free operation with irqs disabled in
some cases.
Queueing for smp_call_function_mask I understand if it is less liked ;)
> And are there really benchmarks that show the
> queueing does actually improve something significantly?
For the call_function_mask case, I did see a benefit of queueing when
testing vmap scalability on a bigish system, yes. I had since done
vunmap batching in the vmap layer so the IPI costs don't really show
up at all there anymore. But the code was already implemented, and I
still don't want a smp_call_function from CPU0 to CPUs 1 and 2 to hold
up an smp_call_function from CPU4 to CPUs 5 and 6 for example.
For call_function_single, queueing could really help with high interrupt
loads like the block completion migration that Jens has been looking at.
In that case, queueing will act like interrupt mitigation when the target
CPU becomes saturated so it will help keep performance from degrading,
and it also allows multiple source CPUs to target a single destination
without losing too much scalability.
Also there was some straight-line benefit of queueing due to the source
CPU not having to wait for an ack from the destination before continuing.
Basically that would previously put a hard upper limit of remote function
call to the interrupt latency.
Benchmarks for "real" things unfortunately not easy to come by yet,
simply because it was so crap before that it was unusable for anything
remotely performance oriented. After the rewrite, I hope to see some
more interesting uses popping up slowly.
> Ok I can see the
> point of having distributed locks (did that on my own for the x86-64
> TLB flusher), but that really doesn't need that much code !
> And the queueing has bad side effects like breaking panic
> and adding hard to test fallback paths.
>
> Perhaps I'm old fashioned, but somehow my feeling is noone tries
> to keep code simple anymore :/
It was certianly tried. It's not a simple problem to do this with any
reasonable scalability, however.
> > > I think for 2.6.27 at least this is the best fix. At least keeping
> > > panic that broken is no option I would say.
> >
> > It is reasonable I think, but I don't like testing symbolic constants
> > with inequalities like in your patch 2/2. Can you just make it
> > system_state != SYSTEM_PANIC ?
>
> Well I like it.
How is it better than system_state != SYSTEM_PANIC? It breaks if
SYSTEM_PANIC gets something in front of it, and even when it is
correct it makes you double check the definition of SYSTEM_PANIC
to see what it does (is there any state above SYSTEM_PANIC?)
> > Also... is it really a regression? The old code definitely had deadlock
> > warnings too, but maybe they were made more complete in the new code?
>
> Yes 2.6.26 did panic without these endless warnings.
OK that should be fixed I agree.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] [2/2] Don't complain about disabled irqs when the system has paniced
2008-09-03 9:37 ` Nick Piggin
@ 2008-09-03 9:48 ` Andi Kleen
2008-09-03 10:17 ` Nick Piggin
0 siblings, 1 reply; 18+ messages in thread
From: Andi Kleen @ 2008-09-03 9:48 UTC (permalink / raw)
To: Nick Piggin
Cc: Andi Kleen, Peter Zijlstra, torvalds, linux-kernel, Jens Axboe
> Hmm... that does pull in significantly more code yes...
>
> It should be fixed. I guess a quickfix is to add a new call in
> kernel/smp.c for use by panic code.
Or just a new vector. I'll take a look later.
>
> > an IPI?
>
> Oh. I thought the actual function call structure itself should be the
> nice cleanup part that everyone would like...
I fail to see how much more complex code is a cleanup to be honest.
> > And are there really benchmarks that show the
> > queueing does actually improve something significantly?
>
> For the call_function_mask case, I did see a benefit of queueing when
> testing vmap scalability on a bigish system, yes. I had since done
Does that improve anything real-world?
> For call_function_single, queueing could really help with high interrupt
> loads like the block completion migration that Jens has been looking at.
But does it or not?
> In that case, queueing will act like interrupt mitigation when the target
> CPU becomes saturated so it will help keep performance from degrading,
> and it also allows multiple source CPUs to target a single destination
> without losing too much scalability.
>
> Also there was some straight-line benefit of queueing due to the source
> CPU not having to wait for an ack from the destination before continuing.
> Basically that would previously put a hard upper limit of remote function
> call to the interrupt latency.
>
> Benchmarks for "real" things unfortunately not easy to come by yet,
> simply because it was so crap before that it was unusable for anything
> remotely performance oriented. After the rewrite, I hope to see some
At least the "industry standard benchmark" Intel spends a lot of time
one used IPIs and iirc didn't run into too big issues. It was far
less a problem than the endless sl[au]b regressions at least.
> It was certianly tried. It's not a simple problem to do this with any
> reasonable scalability, however.
Well just not having a big lock is a big step forward. But that
doesn't need that much code, it's ~10-20 lines of new code
as you can see in the scalable tlb flush.
The queueing seems to add all the complexity and so far I haven't
seen much evidence that the queueing actually helps enough
to pay for its costs in complexity and maintenance overhead
(and bugginess as the panic case and the earlier crash issue shows)
> > > It is reasonable I think, but I don't like testing symbolic constants
> > > with inequalities like in your patch 2/2. Can you just make it
> > > system_state != SYSTEM_PANIC ?
> >
> > Well I like it.
>
> How is it better than system_state != SYSTEM_PANIC? It breaks if
> SYSTEM_PANIC gets something in front of it, and even when it is
No it will not break in that case.
> correct it makes you double check the definition of SYSTEM_PANIC
> to see what it does (is there any state above SYSTEM_PANIC?)
The system states are already ordered and it would make
a lot of sense to keep it that way.
Anyways the code will go anyways I guess because we have
established that it's not enough to fix panic.
-Andi
--
ak@linux.intel.com
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] [2/2] Don't complain about disabled irqs when the system has paniced
2008-09-03 9:48 ` Andi Kleen
@ 2008-09-03 10:17 ` Nick Piggin
2008-09-03 10:26 ` Jens Axboe
0 siblings, 1 reply; 18+ messages in thread
From: Nick Piggin @ 2008-09-03 10:17 UTC (permalink / raw)
To: Andi Kleen; +Cc: Peter Zijlstra, torvalds, linux-kernel, Jens Axboe
On Wednesday 03 September 2008 19:48, Andi Kleen wrote:
> > Hmm... that does pull in significantly more code yes...
> >
> > It should be fixed. I guess a quickfix is to add a new call in
> > kernel/smp.c for use by panic code.
>
> Or just a new vector. I'll take a look later.
That would work.
> > > an IPI?
> >
> > Oh. I thought the actual function call structure itself should be the
> > nice cleanup part that everyone would like...
>
> I fail to see how much more complex code is a cleanup to be honest.
I was referring to the call structure. You complained about 4 levels of
functions, but that's just to allow most of the code to be moved into
architecture independent kernel which is a cleanup I think.
The rest of the changes obviously are more complex and are not cleanups.
> > > And are there really benchmarks that show the
> > > queueing does actually improve something significantly?
> >
> > For the call_function_mask case, I did see a benefit of queueing when
> > testing vmap scalability on a bigish system, yes. I had since done
>
> Does that improve anything real-world?
Yes. XFS directory workloads when it's using name block size > PAGE_SIZE.
> > For call_function_single, queueing could really help with high interrupt
> > loads like the block completion migration that Jens has been looking at.
>
> But does it or not?
Well yes. It was basically impossible to implement without this IIRC,
due to performance problems. And that code itself also showed some
improvements in cases. I think it may need more tuning and testing.
> > In that case, queueing will act like interrupt mitigation when the target
> > CPU becomes saturated so it will help keep performance from degrading,
> > and it also allows multiple source CPUs to target a single destination
> > without losing too much scalability.
> >
> > Also there was some straight-line benefit of queueing due to the source
> > CPU not having to wait for an ack from the destination before continuing.
> > Basically that would previously put a hard upper limit of remote function
> > call to the interrupt latency.
> >
> > Benchmarks for "real" things unfortunately not easy to come by yet,
> > simply because it was so crap before that it was unusable for anything
> > remotely performance oriented. After the rewrite, I hope to see some
>
> At least the "industry standard benchmark" Intel spends a lot of time
> one used IPIs and iirc didn't run into too big issues. It was far
> less a problem than the endless sl[au]b regressions at least.
IPIs for smp_call_function? Generated at any significant rate? I guarantee
not, otherwise there would have been problems :)
> > It was certianly tried. It's not a simple problem to do this with any
> > reasonable scalability, however.
>
> Well just not having a big lock is a big step forward. But that
> doesn't need that much code, it's ~10-20 lines of new code
> as you can see in the scalable tlb flush.
As I said, if you don't queue, you put a pretty hard, pretty low upper
limit on the rate at which you can queue calls on the target, and that
limit slows down as your system gets larger which is pretty nasty.
> The queueing seems to add all the complexity and so far I haven't
> seen much evidence that the queueing actually helps enough
> to pay for its costs in complexity and maintenance overhead
> (and bugginess as the panic case and the earlier crash issue shows)
If Jens is ever running benchmarks on the request migration completion
patches (I think it still uses call_function_single), then maybe he could
disable queueing and report what happens.
> > > > It is reasonable I think, but I don't like testing symbolic constants
> > > > with inequalities like in your patch 2/2. Can you just make it
> > > > system_state != SYSTEM_PANIC ?
> > >
> > > Well I like it.
> >
> > How is it better than system_state != SYSTEM_PANIC? It breaks if
> > SYSTEM_PANIC gets something in front of it, and even when it is
>
> No it will not break in that case.
It breaks because it stops warning for the states above panic,
doesn't it?
> > correct it makes you double check the definition of SYSTEM_PANIC
> > to see what it does (is there any state above SYSTEM_PANIC?)
>
> The system states are already ordered and it would make
> a lot of sense to keep it that way.
OK, out of curiosity, please tell me what is the downside of != ?
> Anyways the code will go anyways I guess because we have
> established that it's not enough to fix panic.
Sure.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] [2/2] Don't complain about disabled irqs when the system has paniced
2008-09-03 10:17 ` Nick Piggin
@ 2008-09-03 10:26 ` Jens Axboe
0 siblings, 0 replies; 18+ messages in thread
From: Jens Axboe @ 2008-09-03 10:26 UTC (permalink / raw)
To: Nick Piggin; +Cc: Andi Kleen, Peter Zijlstra, torvalds, linux-kernel
On Wed, Sep 03 2008, Nick Piggin wrote:
> > > For call_function_single, queueing could really help with high interrupt
> > > loads like the block completion migration that Jens has been looking at.
> >
> > But does it or not?
>
> Well yes. It was basically impossible to implement without this IIRC,
> due to performance problems. And that code itself also showed some
> improvements in cases. I think it may need more tuning and testing.
The old code was useless for this, far too slow. With the new code I've
been able to demonstrate VERY good benefits from migrating interrupts,
both in synthetic cache locality benchmarks but also from something as
general as the compilebench test app (where reductions in system time
are as huge as 20-40%!). I hope to be able to share these results soon.
I have a rough profile from last week on XFS.
http://brick.kernel.dk/lock_rq0
vs
http://brick.kernel.dk/lock_rq1
rq0 is normal interrupt handling, rq1 is with rq_affinity set to 1 for
that block device queue. With that set, request completions are migrated
to the CPU that submitted them.
That is the whole reason why I got into generalizing the IPI function
call handling.
--
Jens Axboe
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] [1/2] Add a SYSTEM_PANIC state
2008-09-02 13:49 ` [PATCH] [1/2] Add a SYSTEM_PANIC state Andi Kleen
@ 2008-09-03 19:04 ` Andrew Morton
2008-09-03 19:16 ` Andi Kleen
0 siblings, 1 reply; 18+ messages in thread
From: Andrew Morton @ 2008-09-03 19:04 UTC (permalink / raw)
To: Andi Kleen; +Cc: torvalds, linux-kernel
On Tue, 2 Sep 2008 15:49:22 +0200 (CEST)
Andi Kleen <andi@firstfloor.org> wrote:
> --- linux.orig/include/linux/kernel.h
> +++ linux/include/linux/kernel.h
> @@ -248,6 +248,7 @@ extern enum system_states {
> SYSTEM_POWER_OFF,
> SYSTEM_RESTART,
> SYSTEM_SUSPEND_DISK,
> + SYSTEM_PANIC,
> } system_state;
system_state is such a crock. I wonder what other random code all over
the place is looking at system_state and will get unexpectedly broken
by other "unrelated" changes such as this..
It's not a heck of a lot nicer, but we could do this:
--- a/kernel/panic.c~a
+++ a/kernel/panic.c
@@ -80,7 +80,6 @@ NORET_TYPE void panic(const char * fmt,
vsnprintf(buf, sizeof(buf), fmt, args);
va_end(args);
printk(KERN_EMERG "Kernel panic - not syncing: %s\n",buf);
- bust_spinlocks(0);
/*
* If we have crashed and we have a crash kernel loaded let it handle
@@ -97,6 +96,7 @@ NORET_TYPE void panic(const char * fmt,
*/
smp_send_stop();
#endif
+ bust_spinlocks(0);
atomic_notifier_call_chain(&panic_notifier_list, 0, buf);
_
then test oops_in_progress down in the IPI code. This has the
advantage of not introducing any additional global states and is a bit
more logical, I think. Need to check whether crash_kexec() would be
affected by the above change.
Bear in mind that the oops-handling code can call panic(), if
panic_on_oops==1. I can't think of any adverse or special consequences
of this, but it needs to be thought about.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] [1/2] Add a SYSTEM_PANIC state
2008-09-03 19:04 ` Andrew Morton
@ 2008-09-03 19:16 ` Andi Kleen
2008-09-03 19:32 ` Andrew Morton
0 siblings, 1 reply; 18+ messages in thread
From: Andi Kleen @ 2008-09-03 19:16 UTC (permalink / raw)
To: Andrew Morton; +Cc: Andi Kleen, torvalds, linux-kernel
On Wed, Sep 03, 2008 at 12:04:23PM -0700, Andrew Morton wrote:
> On Tue, 2 Sep 2008 15:49:22 +0200 (CEST)
> Andi Kleen <andi@firstfloor.org> wrote:
>
> > --- linux.orig/include/linux/kernel.h
> > +++ linux/include/linux/kernel.h
> > @@ -248,6 +248,7 @@ extern enum system_states {
> > SYSTEM_POWER_OFF,
> > SYSTEM_RESTART,
> > SYSTEM_SUSPEND_DISK,
> > + SYSTEM_PANIC,
> > } system_state;
>
> system_state is such a crock. I wonder what other random code all over
> the place is looking at system_state and will get unexpectedly broken
> by other "unrelated" changes such as this..
>From a quick grep none.
Also I think it's a natural extension.
>
>
> It's not a heck of a lot nicer, but we could do this:
Sorry but I think it's far worse. How do you think it's
better?
-Andi
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] [1/2] Add a SYSTEM_PANIC state
2008-09-03 19:16 ` Andi Kleen
@ 2008-09-03 19:32 ` Andrew Morton
2008-09-03 19:44 ` Andi Kleen
0 siblings, 1 reply; 18+ messages in thread
From: Andrew Morton @ 2008-09-03 19:32 UTC (permalink / raw)
To: Andi Kleen; +Cc: andi, torvalds, linux-kernel
On Wed, 3 Sep 2008 21:16:51 +0200
Andi Kleen <andi@firstfloor.org> wrote:
> On Wed, Sep 03, 2008 at 12:04:23PM -0700, Andrew Morton wrote:
> > On Tue, 2 Sep 2008 15:49:22 +0200 (CEST)
> > Andi Kleen <andi@firstfloor.org> wrote:
> >
> > > --- linux.orig/include/linux/kernel.h
> > > +++ linux/include/linux/kernel.h
> > > @@ -248,6 +248,7 @@ extern enum system_states {
> > > SYSTEM_POWER_OFF,
> > > SYSTEM_RESTART,
> > > SYSTEM_SUSPEND_DISK,
> > > + SYSTEM_PANIC,
> > > } system_state;
> >
> > system_state is such a crock. I wonder what other random code all over
> > the place is looking at system_state and will get unexpectedly broken
> > by other "unrelated" changes such as this..
>
> >From a quick grep none.
>
> Also I think it's a natural extension.
>
> >
> >
> > It's not a heck of a lot nicer, but we could do this:
>
> Sorry but I think it's far worse. How do you think it's
> better?
>
For the reason which I stated and which you carefully deleted prior to
asking my reason.
Sheesh.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] [1/2] Add a SYSTEM_PANIC state
2008-09-03 19:32 ` Andrew Morton
@ 2008-09-03 19:44 ` Andi Kleen
0 siblings, 0 replies; 18+ messages in thread
From: Andi Kleen @ 2008-09-03 19:44 UTC (permalink / raw)
To: Andrew Morton; +Cc: Andi Kleen, torvalds, linux-kernel
On Wed, Sep 03, 2008 at 12:32:02PM -0700, Andrew Morton wrote:
> On Wed, 3 Sep 2008 21:16:51 +0200
> Andi Kleen <andi@firstfloor.org> wrote:
>
> > On Wed, Sep 03, 2008 at 12:04:23PM -0700, Andrew Morton wrote:
> > > On Tue, 2 Sep 2008 15:49:22 +0200 (CEST)
> > > Andi Kleen <andi@firstfloor.org> wrote:
> > >
> > > > --- linux.orig/include/linux/kernel.h
> > > > +++ linux/include/linux/kernel.h
> > > > @@ -248,6 +248,7 @@ extern enum system_states {
> > > > SYSTEM_POWER_OFF,
> > > > SYSTEM_RESTART,
> > > > SYSTEM_SUSPEND_DISK,
> > > > + SYSTEM_PANIC,
> > > > } system_state;
> > >
> > > system_state is such a crock. I wonder what other random code all over
> > > the place is looking at system_state and will get unexpectedly broken
> > > by other "unrelated" changes such as this..
> >
> > >From a quick grep none.
> >
> > Also I think it's a natural extension.
> >
> > >
> > >
> > > It's not a heck of a lot nicer, but we could do this:
> >
> > Sorry but I think it's far worse. How do you think it's
> > better?
> >
>
> For the reason which I stated and which you carefully deleted prior to
> asking my reason.
You mean " This has the
advantage of not introducing any additional global states and is a bit
more logical, I think." ?
Seems dodgy to me but ok. Also I think personally SYSTEM_PANIC
is more logical than making oops_in_progress stand for panic too.
Anyways it's moot because it looks like smp_call_function is not
easily salavagable for panic anyways.
-Andi
--
ak@linux.intel.com
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] [2/2] Don't complain about disabled irqs when the system has paniced
2008-09-02 13:49 ` [PATCH] [2/2] Don't complain about disabled irqs when the system has paniced Andi Kleen
2008-09-02 14:28 ` Peter Zijlstra
@ 2008-09-12 19:50 ` Pavel Machek
1 sibling, 0 replies; 18+ messages in thread
From: Pavel Machek @ 2008-09-12 19:50 UTC (permalink / raw)
To: Andi Kleen; +Cc: torvalds, linux-kernel
Hi!
> panic calls smp_send_stop which eventually calls smp_call_function_*.
> smp_call_function warns about disabled interrupts. But it's legal
> to call panic in this case. When this happens panic() prints
> several ugly backtraces. So don't check for disabled interrupts
> in panic state.
>
> Signed-off-by: Andi Kleen <ak@linux.intel.com>
...
> /* Can deadlock when called with interrupts disabled */
> - WARN_ON(irqs_disabled());
> + WARN_ON(system_state < SYSTEM_PANIC && irqs_disabled());
Nice trap if someone adds state after panic... replace < with !=?
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
^ permalink raw reply [flat|nested] 18+ messages in thread
end of thread, other threads:[~2008-09-12 19:52 UTC | newest]
Thread overview: 18+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-09-02 13:49 [PATCH] [0/2] Fix panic regression in 2.6.27 Andi Kleen
2008-09-02 13:49 ` [PATCH] [1/2] Add a SYSTEM_PANIC state Andi Kleen
2008-09-03 19:04 ` Andrew Morton
2008-09-03 19:16 ` Andi Kleen
2008-09-03 19:32 ` Andrew Morton
2008-09-03 19:44 ` Andi Kleen
2008-09-02 13:49 ` [PATCH] [2/2] Don't complain about disabled irqs when the system has paniced Andi Kleen
2008-09-02 14:28 ` Peter Zijlstra
2008-09-02 14:40 ` Andi Kleen
2008-09-02 14:45 ` Peter Zijlstra
2008-09-02 15:00 ` Andi Kleen
2008-09-03 6:02 ` Nick Piggin
2008-09-03 6:59 ` Andi Kleen
2008-09-03 9:37 ` Nick Piggin
2008-09-03 9:48 ` Andi Kleen
2008-09-03 10:17 ` Nick Piggin
2008-09-03 10:26 ` Jens Axboe
2008-09-12 19:50 ` Pavel Machek
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).