[RFC][PATCH] pstore: Skip spinlock when just one cpu is online

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

* [RFC][PATCH] pstore: Skip spinlock when just one cpu is online
@ 2012-12-07 21:41 Seiji Aguchi
  2012-12-07 22:17 ` Luck, Tony
  2012-12-10 16:42 ` Don Zickus
  0 siblings, 2 replies; 9+ messages in thread
From: Seiji Aguchi @ 2012-12-07 21:41 UTC (permalink / raw)
  To: linux-kernel@vger.kernel.org, Luck, Tony (tony.luck@intel.com)
  Cc: cbouatmailru@gmail.com, ccross@android.com, keescook@chromium.org,
	dzickus@redhat.com, dle-develop@lists.sourceforge.net,
	Satoru Moriya

[Issue]

If one cpu ,which is taking a psinfo->buf_lock, 
receive NMI from a panicked cpu via smp_send_stop(),
the panicked cpu hangs up in pstore_dump() called by kmsg_dump(KMSG_DUMP_PANIC)
because the psinfo->buf_lock is taken again in it.

To avoid the deadlock, an easy solution is moving kmsg_dump above
smp_send_stop() in panic path.

But, it is not safe to kick pstore while multiple cpus are running in panic case,
because they may touch corrupted data/variables and unnecessary failures may happen.
In that case, we can't guarantee that a panicked cpu can log messages reliably
because it may have harmful effects due to the failures.

[Solution]

This patch skips taking a psinfo->buf_lock when just one cpu is online
because stopped cpus turn to offline via smp_send_stop()
in some architectures like x86, powerpc or arm64.

It may be a hack but solves my concern deadlocking in x86 architecture.

Signed-off-by: Seiji Aguchi <seiji.aguchi@hds.com>
---
 fs/pstore/platform.c |   14 +++++++++++---
 1 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/fs/pstore/platform.c b/fs/pstore/platform.c
index 947fbe0..ca4d2ab 100644
--- a/fs/pstore/platform.c
+++ b/fs/pstore/platform.c
@@ -107,7 +107,7 @@ static void pstore_dump(struct kmsg_dumper *dumper,
 	unsigned long	total = 0;
 	const char	*why;
 	u64		id;
-	unsigned int	part = 1;
+	unsigned int	part = 1, cpu_num = num_online_cpus();
 	unsigned long	flags = 0;
 	int		is_locked = 0;
 	int		ret;
@@ -118,8 +118,14 @@ static void pstore_dump(struct kmsg_dumper *dumper,
 		is_locked = spin_trylock(&psinfo->buf_lock);
 		if (!is_locked)
 			pr_err("pstore dump routine blocked in NMI, may corrupt error record\n");
-	} else
+	} else if (cpu_num > 1) {
+		/*
+		 * Take a spin lock only when multiple cpus are online.
+		 */
 		spin_lock_irqsave(&psinfo->buf_lock, flags);
+	} else
+		local_irq_save(flags);
+
 	oopscount++;
 	while (total < kmsg_bytes) {
 		char *dst;
@@ -146,8 +152,10 @@ static void pstore_dump(struct kmsg_dumper *dumper,
 	if (in_nmi()) {
 		if (is_locked)
 			spin_unlock(&psinfo->buf_lock);
-	} else
+	} else if (cpu_num > 1) {
 		spin_unlock_irqrestore(&psinfo->buf_lock, flags);
+	} else
+		local_irq_restore(flags);
 }

 static struct kmsg_dumper pstore_dumper = {
-- 1.7.1

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* RE: [RFC][PATCH] pstore: Skip spinlock when just one cpu is online
  2012-12-07 21:41 [RFC][PATCH] pstore: Skip spinlock when just one cpu is online Seiji Aguchi
@ 2012-12-07 22:17 ` Luck, Tony
  2012-12-07 23:43   ` Seiji Aguchi
  2012-12-10 16:42 ` Don Zickus
  1 sibling, 1 reply; 9+ messages in thread
From: Luck, Tony @ 2012-12-07 22:17 UTC (permalink / raw)
  To: Seiji Aguchi, linux-kernel@vger.kernel.org
  Cc: cbouatmailru@gmail.com, ccross@android.com, keescook@chromium.org,
	dzickus@redhat.com, dle-develop@lists.sourceforge.net,
	Satoru Moriya

> This patch skips taking a psinfo->buf_lock when just one cpu is online
> because stopped cpus turn to offline via smp_send_stop()
> in some architectures like x86, powerpc or arm64.

That seems an impressive list of preconditions.  So for this to
help we need to have taken all but one cpu offline, then be in
some code that is holding the pstore lock and get hit by an NMI
which causes us to recurse into the pstore code.

Can all these things really happen (did you run into this problem
on a real system?). Or is this just a theoretical problem.  Ugly (but
practical) hacks might be OK to solve real problems. But do we really
want them to fix problems that actually never happen?

-Tony

^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: [RFC][PATCH] pstore: Skip spinlock when just one cpu is online
  2012-12-07 22:17 ` Luck, Tony
@ 2012-12-07 23:43   ` Seiji Aguchi
  2012-12-10 16:48     ` Don Zickus
  0 siblings, 1 reply; 9+ messages in thread
From: Seiji Aguchi @ 2012-12-07 23:43 UTC (permalink / raw)
  To: Luck, Tony, linux-kernel@vger.kernel.org
  Cc: cbouatmailru@gmail.com, ccross@android.com, keescook@chromium.org,
	dzickus@redhat.com, dle-develop@lists.sourceforge.net,
	Satoru Moriya

> Can all these things really happen (did you run into this problem on a real system?). Or is this just a theoretical problem.  Ugly (but
> practical) hacks might be OK to solve real problems. 

It is a theoretical problem right now.
But it is a timing issue and there is a possibility to happen actually.

> But do we really want them to fix problems that actually never happen?

If we find a problem (even if it is theoretical), we can't say "It actually never happen.".

I have some reasons to submit this patch before reproducing actually.

1)
It is too late if we fix a problem after it actually happened in case where we apply Linux, including pstore, 
to mission critical systems, because the failure of those systems has a great impact on a whole society.
Customers in this area ask us to fix a problem as soon as possible.
On the other hand, this kind of timing issue is hard to reproduce.
So, our support service engineers often work all night to reproduce it.
It is a nightmare for us.

If we can fix it with a small patch in adance, it is really helpful for us.

2)
In the long term, I plan to add a kmsg_dump to a kexec path because kdump may fail in the real world.
In that case, we need another troubleshooting material like pstore to detect a root cause of failure.

Actually, someone blamed for a reliability of kdump in LinuxCON Europe.
http://events.linuxfoundation.org/images/stories/pdf/lceu2012_holzheu.pdf

To convince a kexec maintainer to add a kmsg_dump, I need to prove that there is no problem in pstore code
causing a failure of kdump.

Seiji

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC][PATCH] pstore: Skip spinlock when just one cpu is online
  2012-12-07 21:41 [RFC][PATCH] pstore: Skip spinlock when just one cpu is online Seiji Aguchi
  2012-12-07 22:17 ` Luck, Tony
@ 2012-12-10 16:42 ` Don Zickus
  2012-12-10 23:52   ` Seiji Aguchi
  1 sibling, 1 reply; 9+ messages in thread
From: Don Zickus @ 2012-12-10 16:42 UTC (permalink / raw)
  To: Seiji Aguchi
  Cc: linux-kernel@vger.kernel.org, Luck, Tony (tony.luck@intel.com),
	cbouatmailru@gmail.com, ccross@android.com, keescook@chromium.org,
	dle-develop@lists.sourceforge.net, Satoru Moriya

On Fri, Dec 07, 2012 at 09:41:13PM +0000, Seiji Aguchi wrote:
> [Issue]
> 
> If one cpu ,which is taking a psinfo->buf_lock, 
> receive NMI from a panicked cpu via smp_send_stop(),
> the panicked cpu hangs up in pstore_dump() called by kmsg_dump(KMSG_DUMP_PANIC)
> because the psinfo->buf_lock is taken again in it.

Hi Seiji,

I am trying to understand this scenario.  You said it is theoretical.
Looking through smp_send_stop, it only sends an NMI if the IRQ did not
work after a second.

So the scenario you are looking at is:

cpuA grabs psinfo->buf_lock
cpuB panics and calls smp_send_stop
smp_send_stop sends IRQ to cpuA
after 1 second, cpuB gives up on cpuA and sends an NMI instead
cpuA is now in an NMI handler while still holding buf_lock
cpuB is deadlocked

Now my first reaction would be, if that is the scenario, why couldn't cpuA
release the lock within one second.  Because if cpuA is stuck talking with
firmware, then your patch to force the unlock is probably going to trip
over the same problems.
(those problems include dealing with resetting a firmware state machine)

IOW, your patch is just one of many minefields you will have to cross in
order to be successful.  

Without any testing to show that you have cleared all those minefields, I
am leaning against your patch for now.  I would rather have a complete
patchset that fixes all the problems in this path.

> 
> To avoid the deadlock, an easy solution is moving kmsg_dump above
> smp_send_stop() in panic path.
> 
> But, it is not safe to kick pstore while multiple cpus are running in panic case,
> because they may touch corrupted data/variables and unnecessary failures may happen.
> In that case, we can't guarantee that a panicked cpu can log messages reliably
> because it may have harmful effects due to the failures.
> 
> [Solution]
> 
> This patch skips taking a psinfo->buf_lock when just one cpu is online
> because stopped cpus turn to offline via smp_send_stop()
> in some architectures like x86, powerpc or arm64.
> 
> It may be a hack but solves my concern deadlocking in x86 architecture.

If you did have to go this route, why not take the 'reason' variable and
pass it to some function 'in_panic_path(reason)' that tells you if you are
panicing or not.  Then you can change the 'if in_nmi()' to 'if in_nmi() ||
in_panic'.

The panic path already disables irqs for you, not sure you need to do it
again.  Unless you have some other path you are trying to protect in mind.

Cheers,
Don

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC][PATCH] pstore: Skip spinlock when just one cpu is online
  2012-12-07 23:43   ` Seiji Aguchi
@ 2012-12-10 16:48     ` Don Zickus
  2012-12-10 18:19       ` Luck, Tony
  2012-12-10 23:55       ` Seiji Aguchi
  0 siblings, 2 replies; 9+ messages in thread
From: Don Zickus @ 2012-12-10 16:48 UTC (permalink / raw)
  To: Seiji Aguchi
  Cc: Luck, Tony, linux-kernel@vger.kernel.org, cbouatmailru@gmail.com,
	ccross@android.com, keescook@chromium.org,
	dle-develop@lists.sourceforge.net, Satoru Moriya

On Fri, Dec 07, 2012 at 11:43:03PM +0000, Seiji Aguchi wrote:
> > Can all these things really happen (did you run into this problem on a real system?). Or is this just a theoretical problem.  Ugly (but
> > practical) hacks might be OK to solve real problems. 
> 
> It is a theoretical problem right now.
> But it is a timing issue and there is a possibility to happen actually.
> 
> > But do we really want them to fix problems that actually never happen?
> 
> If we find a problem (even if it is theoretical), we can't say "It actually never happen.".
> 
> I have some reasons to submit this patch before reproducing actually.
> 
> 1)
> It is too late if we fix a problem after it actually happened in case where we apply Linux, including pstore, 
> to mission critical systems, because the failure of those systems has a great impact on a whole society.
> Customers in this area ask us to fix a problem as soon as possible.
> On the other hand, this kind of timing issue is hard to reproduce.
> So, our support service engineers often work all night to reproduce it.
> It is a nightmare for us.
> 
> If we can fix it with a small patch in adance, it is really helpful for us.

As I said in my email I just sent, it may not help you without testing it.
As there are probably other problems in that un-tested theoretical
scenario.

> 
> 2)
> In the long term, I plan to add a kmsg_dump to a kexec path because kdump may fail in the real world.
> In that case, we need another troubleshooting material like pstore to detect a root cause of failure.

But you are assuming that kmsg_dump is perfect and it isn't, in which case
by putting kmsg_dump in the kdump path, you actually may be blocking kdump
from working.

That is the biggest hold up for those guys from including it I believe.

Cheers,
Don

^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: [RFC][PATCH] pstore: Skip spinlock when just one cpu is online
  2012-12-10 16:48     ` Don Zickus
@ 2012-12-10 18:19       ` Luck, Tony
  2012-12-11  0:06         ` Seiji Aguchi
  2012-12-10 23:55       ` Seiji Aguchi
  1 sibling, 1 reply; 9+ messages in thread
From: Luck, Tony @ 2012-12-10 18:19 UTC (permalink / raw)
  To: Don Zickus, Seiji Aguchi
  Cc: linux-kernel@vger.kernel.org, cbouatmailru@gmail.com,
	ccross@android.com, keescook@chromium.org,
	dle-develop@lists.sourceforge.net, Satoru Moriya

> But you are assuming that kmsg_dump is perfect and it isn't, in which case
> by putting kmsg_dump in the kdump path, you actually may be blocking kdump
> from working.

I think the concern is that kdump isn't perfect, so sometimes we don't get a good dump
from it. In those cases it would have been nice to have a pstore log of the original
problem.

But ... I don't see an answer to this problem.  Adding more code just increases the
number of possible places we can fail (especially as we are executing in a state
where we know that things are all messed up ... the first kernel panic'd because
something bad happened  that we didn't know how to fix).

A boot argument might help - so we can force use of pstore in cases where
kdump is failing (or prevent use of pstore in cases where it seem to be preventing
us getting to kdump ... I don't have a preference).  BUT this would only be useful
if we had a repeatable problem so that we could switch to the other mode ... and
it seems likely that the kinds of problems that cause pstore or kdump to fail would
be weird cases that are not very repeatable :-(

-Tony

^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: [RFC][PATCH] pstore: Skip spinlock when just one cpu is online
  2012-12-10 16:42 ` Don Zickus
@ 2012-12-10 23:52   ` Seiji Aguchi
  0 siblings, 0 replies; 9+ messages in thread
From: Seiji Aguchi @ 2012-12-10 23:52 UTC (permalink / raw)
  To: Don Zickus
  Cc: linux-kernel@vger.kernel.org, Luck, Tony (tony.luck@intel.com),
	cbouatmailru@gmail.com, ccross@android.com, keescook@chromium.org,
	dle-develop@lists.sourceforge.net, Satoru Moriya

> Now my first reaction would be, if that is the scenario, why couldn't cpuA release the lock within one second.  Because if cpuA is stuck
> talking with firmware, then your patch to force the unlock is probably going to trip over the same problems.
> (those problems include dealing with resetting a firmware state machine)
> 
> IOW, your patch is just one of many minefields you will have to cross in order to be successful.

I agree that I need to fix the platform driver's deadlocking as well to make an overall system workable. 
(I just wanted to fix the deadlocking problems step by step.)


> 
> Without any testing to show that you have cleared all those minefields, I am leaning against your patch for now.  I would rather have a
> complete patchset that fixes all the problems in this path.

Thanks. I will try to make the complete patchset.

> 
> If you did have to go this route, why not take the 'reason' variable and pass it to some function 'in_panic_path(reason)' that tells you if
> you are panicing or not.  Then you can change the 'if in_nmi()' to 'if in_nmi() || in_panic'.
> 
> The panic path already disables irqs for you, not sure you need to do it again.  Unless you have some other path you are trying to
> protect in mind.

Thank you for the suggestion.
I will update my patch in accordance with your comment if anyone else disagree that.

> 
> Cheers,
> Don

^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: [RFC][PATCH] pstore: Skip spinlock when just one cpu is online
  2012-12-10 16:48     ` Don Zickus
  2012-12-10 18:19       ` Luck, Tony
@ 2012-12-10 23:55       ` Seiji Aguchi
  1 sibling, 0 replies; 9+ messages in thread
From: Seiji Aguchi @ 2012-12-10 23:55 UTC (permalink / raw)
  To: Don Zickus
  Cc: Luck, Tony, linux-kernel@vger.kernel.org, cbouatmailru@gmail.com,
	ccross@android.com, keescook@chromium.org,
	dle-develop@lists.sourceforge.net, Satoru Moriya

> > If we can fix it with a small patch in adance, it is really helpful for us.
> 
> As I said in my email I just sent, it may not help you without testing it.
> As there are probably other problems in that un-tested theoretical scenario.

OK. I understood.

 
> >
> > 2)
> > In the long term, I plan to add a kmsg_dump to a kexec path because kdump may fail in the real world.
> > In that case, we need another troubleshooting material like pstore to detect a root cause of failure.
> 
> But you are assuming that kmsg_dump is perfect and it isn't, in which case by putting kmsg_dump in the kdump path, you actually may
> be blocking kdump from working.

That is why I'm trying to fix problems of pstore in panic path right now.
If kmsg_dump/pstore/platform drivers work well in panic path, they work in kdump path as well.

Seiji

^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: [RFC][PATCH] pstore: Skip spinlock when just one cpu is online
  2012-12-10 18:19       ` Luck, Tony
@ 2012-12-11  0:06         ` Seiji Aguchi
  0 siblings, 0 replies; 9+ messages in thread
From: Seiji Aguchi @ 2012-12-11  0:06 UTC (permalink / raw)
  To: Luck, Tony, Don Zickus
  Cc: linux-kernel@vger.kernel.org, cbouatmailru@gmail.com,
	ccross@android.com, keescook@chromium.org,
	dle-develop@lists.sourceforge.net, Satoru Moriya

> A boot argument might help - so we can force use of pstore in cases where kdump is failing (or prevent use of pstore in cases where it
> seem to be preventing us getting to kdump ... I don't have a preference).  BUT this would only be useful if we had a repeatable
> problem so that we could switch to the other mode ... and it seems likely that the kinds of problems that cause pstore or kdump to fail
> would be weird cases that are not very repeatable :-(

I think it is helpful for reproducible problems because reasons of kdump failure vary widely.
For example, users may forget to erase vmcores from a dump disk and kdump fails due to lack of space.
Also, FC cable or other hardware may be broken.

But, The person I have to convince is not you but kexec engineers...

> 
> -Tony

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2012-12-11  0:06 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-12-07 21:41 [RFC][PATCH] pstore: Skip spinlock when just one cpu is online Seiji Aguchi
2012-12-07 22:17 ` Luck, Tony
2012-12-07 23:43   ` Seiji Aguchi
2012-12-10 16:48     ` Don Zickus
2012-12-10 18:19       ` Luck, Tony
2012-12-11  0:06         ` Seiji Aguchi
2012-12-10 23:55       ` Seiji Aguchi
2012-12-10 16:42 ` Don Zickus
2012-12-10 23:52   ` Seiji Aguchi

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox