[RFC][PATCH v2 -next 2/2] Adding lock operations to kmsg_dump()/pstore

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

* [RFC][PATCH v2 -next 2/2] Adding lock operations to kmsg_dump()/pstore_dump()
@ 2011-10-21 21:21 Seiji Aguchi
  2011-10-28 18:33 ` Don Zickus
  2011-11-10 12:50 ` Peter Zijlstra
  0 siblings, 2 replies; 6+ messages in thread
From: Seiji Aguchi @ 2011-10-21 21:21 UTC (permalink / raw)
  To: Andrew Morton, Chen, Gong, linux-kernel@vger.kernel.org,
	Luck, Tony, Matthew Garrett, Vivek Goyal, len.brown@intel.com,
	ying.huang@intel.com, ak@linux.intel.com, hughd@chromium.org,
	mingo@elte.hu, jmorris@namei.org, a.p.zijlstra@chello.nl,
	namhyung@gmail.com, Don Zickus
  Cc: dle-develop@lists.sourceforge.net, Satoru Moriya

pstore_dump()/kmsg_dump() may be called everywhere in kernel.
So we have to care about following cases.

  - Panic path
    In this case, Logging message process is serialized via smp_send_stop().
    So, we can bust spin_locks.

    Currently, kmsg_dump() may be called twice (KMSG_DUMP_PANIC and KMSG_DUMP_EMERGY)
    So, for avoiding deadlock, I suggest to bust locks rather than skipping them.

  - NMI context 
     While a cpu is in NMI handler, other cpus may be running.
     So, trylock should be called so that lockdep cheking works.
 
  - Process context
    In this case, we can simply take locks.

Patch Descriptions:
  Adding a lock operation below in pstore_dump()
  - busting psinfo->buf_lock of pstore_dump() in panic path for avoiding deadlock.

  Adding lock operations below in kmsg_dump()
  - busting logbuf_lock of kmsg_dump() in panic path for avoiding deadlock.
  - Adding trylock for taking logbuf_lock of kmsg_dump() in NMI context. 
  

 Signed-off-by: Seiji Aguchi <seiji.aguchi@hds.com>

---
 fs/pstore/platform.c |   11 +++++++++++
 kernel/printk.c      |   28 +++++++++++++++++++++++++---
 2 files changed, 36 insertions(+), 3 deletions(-)

diff --git a/fs/pstore/platform.c b/fs/pstore/platform.c
index 2bd620f..91ef164 100644
--- a/fs/pstore/platform.c
+++ b/fs/pstore/platform.c
@@ -97,6 +97,17 @@ static void pstore_dump(struct kmsg_dumper *dumper,
 	else
 		why = "Unknown";
 
+	/*
+	 * pstore_dump() is called after smp_send_stop() in panic path.
+	 * So, spin_lock should be bust for avoiding deadlock.
+	 */
+	if (reason == KMSG_DUMP_PANIC)
+		spin_lock_init(&psinfo->buf_lock);
+
+	/*
+	 * While a cpu is in NMI handler, other cpus may be running.
+	 * So, trylock should be called so that lockdep checking works.
+	 */
 	if (in_nmi()) {
 		is_locked = spin_trylock(&psinfo->buf_lock);
 		if (!is_locked)
diff --git a/kernel/printk.c b/kernel/printk.c
index 1455a0d..f51f547 100644
--- a/kernel/printk.c
+++ b/kernel/printk.c
@@ -1730,15 +1730,37 @@ void kmsg_dump(enum kmsg_dump_reason reason)
 	struct kmsg_dumper *dumper;
 	const char *s1, *s2;
 	unsigned long l1, l2;
-	unsigned long flags;
+	unsigned long flags = 0;
+	int is_locked = 0;
 
 	/* Theoretically, the log could move on after we do this, but
 	   there's not a lot we can do about that. The new messages
 	   will overwrite the start of what we dump. */
-	raw_spin_lock_irqsave(&logbuf_lock, flags);
+
+	/*
+	 * kmsg_dump() is called after smp_send_stop() in panic path.
+	 * So, spin_lock should be bust for avoiding deadlock.
+	 */
+	if (reason == KMSG_DUMP_PANIC)
+		raw_spin_lock_init(&logbuf_lock);
+	/*
+	 * While a cpu is in NMI handler, other cpus may be running.
+	 * So, trylock should be called so that lockdep checking works.
+	 */
+	if (in_nmi())
+		is_locked = raw_spin_trylock(&logbuf_lock);
+	else
+		raw_spin_lock_irqsave(&logbuf_lock, flags);
+
 	end = log_end & LOG_BUF_MASK;
 	chars = logged_chars;
-	raw_spin_unlock_irqrestore(&logbuf_lock, flags);
+
+	if (in_nmi()) {
+		if (is_locked)
+			raw_spin_unlock(&logbuf_lock);
+	} else
+		raw_spin_unlock_irqrestore(&logbuf_lock, flags);
+
 
 	if (chars > end) {
 		s1 = log_buf + log_buf_len - chars + end;
-- 1.7.1

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [RFC][PATCH v2 -next 2/2] Adding lock operations to kmsg_dump()/pstore_dump()
  2011-10-21 21:21 [RFC][PATCH v2 -next 2/2] Adding lock operations to kmsg_dump()/pstore_dump() Seiji Aguchi
@ 2011-10-28 18:33 ` Don Zickus
  2011-10-28 19:02   ` Luck, Tony
  2011-11-10 12:50 ` Peter Zijlstra
  1 sibling, 1 reply; 6+ messages in thread
From: Don Zickus @ 2011-10-28 18:33 UTC (permalink / raw)
  To: Seiji Aguchi
  Cc: Andrew Morton, Chen, Gong, linux-kernel@vger.kernel.org,
	Luck, Tony, Matthew Garrett, Vivek Goyal, len.brown@intel.com,
	ying.huang@intel.com, ak@linux.intel.com, hughd@chromium.org,
	mingo@elte.hu, jmorris@namei.org, a.p.zijlstra@chello.nl,
	namhyung@gmail.com, dle-develop@lists.sourceforge.net,
	Satoru Moriya

On Fri, Oct 21, 2011 at 05:21:35PM -0400, Seiji Aguchi wrote:
> pstore_dump()/kmsg_dump() may be called everywhere in kernel.
> So we have to care about following cases.
> 
>   - Panic path
>     In this case, Logging message process is serialized via smp_send_stop().
>     So, we can bust spin_locks.
> 
>     Currently, kmsg_dump() may be called twice (KMSG_DUMP_PANIC and KMSG_DUMP_EMERGY)
>     So, for avoiding deadlock, I suggest to bust locks rather than skipping them.
> 
>   - NMI context 
>      While a cpu is in NMI handler, other cpus may be running.
>      So, trylock should be called so that lockdep cheking works.
>  
>   - Process context
>     In this case, we can simply take locks.

It ain't pretty but it moves things towards a more reliable message dump.
The odds of us needing to bust the spinlocks are really small.  Most of
the time no one reads the pstore filesystem.

I would love to figure out a prettier solution for this locking mess, but
I can't think of anything.  We have customers who want to utilize this
technology, so I am trying to make sure it is stable and robust for now.
A little selfish I suppose.  But we are open to ideas?

Acked-by: Don Zickus <dzickus@redhat.com>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* RE: [RFC][PATCH v2 -next 2/2] Adding lock operations to kmsg_dump()/pstore_dump()
  2011-10-28 18:33 ` Don Zickus
@ 2011-10-28 19:02   ` Luck, Tony
  2011-10-28 20:00     ` Don Zickus
  0 siblings, 1 reply; 6+ messages in thread
From: Luck, Tony @ 2011-10-28 19:02 UTC (permalink / raw)
  To: Don Zickus, Seiji Aguchi
  Cc: Andrew Morton, Chen, Gong, linux-kernel@vger.kernel.org,
	Matthew Garrett, Vivek Goyal, Brown, Len, Huang, Ying,
	ak@linux.intel.com, hughd@chromium.org, mingo@elte.hu,
	jmorris@namei.org, a.p.zijlstra@chello.nl, namhyung@gmail.com,
	dle-develop@lists.sourceforge.net, Satoru Moriya

> It ain't pretty but it moves things towards a more reliable message dump.
> The odds of us needing to bust the spinlocks are really small.  Most of
> the time no one reads the pstore filesystem.

Does it really change the odds much?  As you say, the common case is that
pstore front-end doesn't have the lock held - so that case is unchanged.
We can get the lock anyway, we don't need to bust it.

Looking at the uncommon case where the lock is held - that means that
pstore was in the middle of some back-end operation. Busting the lock
means that the back-end will be surprised by being called again when the
first operation had not yet completed. In the case of a state machine
driven back end like ERST, I don't think this has a high probability of
working out well.

So you might be moving the needle from 99.999% chance of saving to pstore
with 0.001% chance of hanging on the spin lock. to 99.9991% chance of
saving, and 0.0009% chance of something highly weird happening in the
back-end driver because you busted the lock and called it anyway.

> I would love to figure out a prettier solution for this locking mess, but
> I can't think of anything.  We have customers who want to utilize this
> technology, so I am trying to make sure it is stable and robust for now.
> A little selfish I suppose.  But we are open to ideas?

If a prettier solution is needed - it will have to involve the back-end.
Perhaps a whole separate write/panic path (with separate buffer). Then
a sufficiently smart back end could do the right thing. I have little
confidence that ERST could be made smart in this way, because almost all
of the heavy lifting is done by the BIOS - so Linux has no way to influence
the flow of execution.

-Tony

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [RFC][PATCH v2 -next 2/2] Adding lock operations to kmsg_dump()/pstore_dump()
  2011-10-28 19:02   ` Luck, Tony
@ 2011-10-28 20:00     ` Don Zickus
  0 siblings, 0 replies; 6+ messages in thread
From: Don Zickus @ 2011-10-28 20:00 UTC (permalink / raw)
  To: Luck, Tony
  Cc: Seiji Aguchi, Andrew Morton, Chen, Gong,
	linux-kernel@vger.kernel.org, Matthew Garrett, Vivek Goyal,
	Brown, Len, Huang, Ying, ak@linux.intel.com, hughd@chromium.org,
	mingo@elte.hu, jmorris@namei.org, a.p.zijlstra@chello.nl,
	namhyung@gmail.com, dle-develop@lists.sourceforge.net,
	Satoru Moriya

On Fri, Oct 28, 2011 at 12:02:15PM -0700, Luck, Tony wrote:
> > It ain't pretty but it moves things towards a more reliable message dump.
> > The odds of us needing to bust the spinlocks are really small.  Most of
> > the time no one reads the pstore filesystem.
> 
> Does it really change the odds much?  As you say, the common case is that
> pstore front-end doesn't have the lock held - so that case is unchanged.
> We can get the lock anyway, we don't need to bust it.

Agreed.

> 
> Looking at the uncommon case where the lock is held - that means that
> pstore was in the middle of some back-end operation. Busting the lock
> means that the back-end will be surprised by being called again when the
> first operation had not yet completed. In the case of a state machine
> driven back end like ERST, I don't think this has a high probability of
> working out well.

Remember ERST has two modes, state machine and NVRAM.  The state machine
will have issues, but the NVRAM part (which isn't implemented yet) might
not.  Not sure about EFI. But shouldn't the backend determine that, not pstore?

> 
> So you might be moving the needle from 99.999% chance of saving to pstore
> with 0.001% chance of hanging on the spin lock. to 99.9991% chance of
> saving, and 0.0009% chance of something highly weird happening in the
> back-end driver because you busted the lock and called it anyway.

Sure.  But at the same time, APEI is one of those 'value add' by OEMs.  If
you are paying $20K for this feature, wouldn't you expect this feature to
work 100% of the time?  At least with kdump/netdump, you can say, hey it
was free so you get what you pay for.

I guess it would help if we had more machines with working firmware to
test this.

> 
> > I would love to figure out a prettier solution for this locking mess, but
> > I can't think of anything.  We have customers who want to utilize this
> > technology, so I am trying to make sure it is stable and robust for now.
> > A little selfish I suppose.  But we are open to ideas?
> 
> If a prettier solution is needed - it will have to involve the back-end.
> Perhaps a whole separate write/panic path (with separate buffer). Then
> a sufficiently smart back end could do the right thing. I have little
> confidence that ERST could be made smart in this way, because almost all
> of the heavy lifting is done by the BIOS - so Linux has no way to influence
> the flow of execution.

Sadly I agree.

Perhaps I have been hanging around mjg too much, but I little confidence
in anything ACPI related being smart.

I don't have that much motivation to push this patch very hard.  I just
saw some a theoretical issue and thought I could help Seiji solve it.  I
am more interested in getting the first patch of this series in than this one.

If you find this patch adds more complexity for very little gain, so be
it.  We tried. :-)

Cheers,
Don

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [RFC][PATCH v2 -next 2/2] Adding lock operations to kmsg_dump()/pstore_dump()
  2011-10-21 21:21 [RFC][PATCH v2 -next 2/2] Adding lock operations to kmsg_dump()/pstore_dump() Seiji Aguchi
  2011-10-28 18:33 ` Don Zickus
@ 2011-11-10 12:50 ` Peter Zijlstra
  2011-11-10 13:33   ` Don Zickus
  1 sibling, 1 reply; 6+ messages in thread
From: Peter Zijlstra @ 2011-11-10 12:50 UTC (permalink / raw)
  To: Seiji Aguchi
  Cc: Andrew Morton, Chen, Gong, linux-kernel@vger.kernel.org,
	Luck, Tony, Matthew Garrett, Vivek Goyal, len.brown@intel.com,
	ying.huang@intel.com, ak@linux.intel.com, hughd@chromium.org,
	mingo@elte.hu, jmorris@namei.org, namhyung@gmail.com, Don Zickus,
	dle-develop@lists.sourceforge.net, Satoru Moriya

On Fri, 2011-10-21 at 17:21 -0400, Seiji Aguchi wrote:
> +++ b/fs/pstore/platform.c
> @@ -97,6 +97,17 @@ static void pstore_dump(struct kmsg_dumper *dumper,
>         else
>                 why = "Unknown";
>  
> +       /*
> +        * pstore_dump() is called after smp_send_stop() in panic path.
> +        * So, spin_lock should be bust for avoiding deadlock.
> +        */
> +       if (reason == KMSG_DUMP_PANIC)
> +               spin_lock_init(&psinfo->buf_lock);
> +
> +       /*
> +        * While a cpu is in NMI handler, other cpus may be running.
> +        * So, trylock should be called so that lockdep checking works.
> +        */

Don't be silly, lockdep doesn't cover NMI, in fact you shouldn't use
locks from NMI context ever.

>         if (in_nmi()) {
>                 is_locked = spin_trylock(&psinfo->buf_lock);
>                 if (!is_locked)
> diff --git a/kernel/printk.c b/kernel/printk.c
> index 1455a0d..f51f547 100644
> --- a/kernel/printk.c
> +++ b/kernel/printk.c
> @@ -1730,15 +1730,37 @@ void kmsg_dump(enum kmsg_dump_reason reason)
>         struct kmsg_dumper *dumper;
>         const char *s1, *s2;
>         unsigned long l1, l2;
> -       unsigned long flags;
> +       unsigned long flags = 0;
> +       int is_locked = 0;
>  
>         /* Theoretically, the log could move on after we do this, but
>            there's not a lot we can do about that. The new messages
>            will overwrite the start of what we dump. */
> -       raw_spin_lock_irqsave(&logbuf_lock, flags);
> +
> +       /*
> +        * kmsg_dump() is called after smp_send_stop() in panic path.
> +        * So, spin_lock should be bust for avoiding deadlock.
> +        */
> +       if (reason == KMSG_DUMP_PANIC)
> +               raw_spin_lock_init(&logbuf_lock); 

In both cases where you bust the spinlock at least yell loudly and
disable lock debugging.

And I guess this is where Don wants to use NMIs for smp_send_stop() so
what you get around the fact that this lock you're busting disabled
IRQs?

All in all this patch is way ugly and doesn't make me feel all warm and
fuzzy.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [RFC][PATCH v2 -next 2/2] Adding lock operations to kmsg_dump()/pstore_dump()
  2011-11-10 12:50 ` Peter Zijlstra
@ 2011-11-10 13:33   ` Don Zickus
  0 siblings, 0 replies; 6+ messages in thread
From: Don Zickus @ 2011-11-10 13:33 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Seiji Aguchi, Andrew Morton, Chen, Gong,
	linux-kernel@vger.kernel.org, Luck, Tony, Matthew Garrett,
	Vivek Goyal, len.brown@intel.com, ying.huang@intel.com,
	ak@linux.intel.com, hughd@chromium.org, mingo@elte.hu,
	jmorris@namei.org, namhyung@gmail.com,
	dle-develop@lists.sourceforge.net, Satoru Moriya

On Thu, Nov 10, 2011 at 01:50:25PM +0100, Peter Zijlstra wrote:
> On Fri, 2011-10-21 at 17:21 -0400, Seiji Aguchi wrote:
> > +++ b/fs/pstore/platform.c
> > @@ -97,6 +97,17 @@ static void pstore_dump(struct kmsg_dumper *dumper,
> >         else
> >                 why = "Unknown";
> >  
> > +       /*
> > +        * pstore_dump() is called after smp_send_stop() in panic path.
> > +        * So, spin_lock should be bust for avoiding deadlock.
> > +        */
> > +       if (reason == KMSG_DUMP_PANIC)
> > +               spin_lock_init(&psinfo->buf_lock);
> > +
> > +       /*
> > +        * While a cpu is in NMI handler, other cpus may be running.
> > +        * So, trylock should be called so that lockdep checking works.
> > +        */
> 
> Don't be silly, lockdep doesn't cover NMI, in fact you shouldn't use
> locks from NMI context ever.

Heh.  I would normally agree, but in this case we have a piece of hardware
that can be accessed from normal, irq and NMI context.  I still scratch my
head for the best way to handle this.  This approach was sorta of a
bandaid effort to prevent a deadlock in the NMI panic case.

> 
> >         if (in_nmi()) {
> >                 is_locked = spin_trylock(&psinfo->buf_lock);
> >                 if (!is_locked)
> > diff --git a/kernel/printk.c b/kernel/printk.c
> > index 1455a0d..f51f547 100644
> > --- a/kernel/printk.c
> > +++ b/kernel/printk.c
> > @@ -1730,15 +1730,37 @@ void kmsg_dump(enum kmsg_dump_reason reason)
> >         struct kmsg_dumper *dumper;
> >         const char *s1, *s2;
> >         unsigned long l1, l2;
> > -       unsigned long flags;
> > +       unsigned long flags = 0;
> > +       int is_locked = 0;
> >  
> >         /* Theoretically, the log could move on after we do this, but
> >            there's not a lot we can do about that. The new messages
> >            will overwrite the start of what we dump. */
> > -       raw_spin_lock_irqsave(&logbuf_lock, flags);
> > +
> > +       /*
> > +        * kmsg_dump() is called after smp_send_stop() in panic path.
> > +        * So, spin_lock should be bust for avoiding deadlock.
> > +        */
> > +       if (reason == KMSG_DUMP_PANIC)
> > +               raw_spin_lock_init(&logbuf_lock); 
> 
> In both cases where you bust the spinlock at least yell loudly and
> disable lock debugging.
> 
> And I guess this is where Don wants to use NMIs for smp_send_stop() so
> what you get around the fact that this lock you're busting disabled
> IRQs?

:-)  I thought it would be safer to bust spinlocks if we could have a
better gaurantee the other cpus were not accessing the hardware at the
same time.

> 
> All in all this patch is way ugly and doesn't make me feel all warm and
> fuzzy.

I understand.

Cheers,
Don

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2011-11-10 13:33 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-10-21 21:21 [RFC][PATCH v2 -next 2/2] Adding lock operations to kmsg_dump()/pstore_dump() Seiji Aguchi
2011-10-28 18:33 ` Don Zickus
2011-10-28 19:02   ` Luck, Tony
2011-10-28 20:00     ` Don Zickus
2011-11-10 12:50 ` Peter Zijlstra
2011-11-10 13:33   ` Don Zickus

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox