linuxppc-dev.lists.ozlabs.org archive mirror
* [RFC PATCH] powerpc/kexec: Wait 1s for secondaries to enter OPAL
@ 2015-07-22  5:54 Samuel Mendoza-Jonas
  2015-07-27  5:56 ` Stewart Smith
  2015-10-12 11:21 ` [RFC] " Michael Ellerman
  0 siblings, 2 replies; 6+ messages in thread
From: Samuel Mendoza-Jonas @ 2015-07-22  5:54 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Samuel Mendoza-Jonas, benh

Always include a timeout when waiting for secondary cpus to enter OPAL
in the kexec path, rather than only when crashing.

Signed-off-by: Samuel Mendoza-Jonas <sam.mj@au1.ibm.com>
---
 arch/powerpc/platforms/powernv/setup.c | 21 +++++++++++++--------
 1 file changed, 13 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/setup.c b/arch/powerpc/platforms/powernv/setup.c
index 59076db..f916601 100644
--- a/arch/powerpc/platforms/powernv/setup.c
+++ b/arch/powerpc/platforms/powernv/setup.c
@@ -195,7 +195,7 @@ static void pnv_kexec_wait_secondaries_down(void)
 
 	for_each_online_cpu(i) {
 		uint8_t status;
-		int64_t rc;
+		int64_t rc, timeout = 1000;
 
 		if (i == my_cpu)
 			continue;
@@ -212,6 +212,18 @@ static void pnv_kexec_wait_secondaries_down(void)
 				       i, paca[i].hw_cpu_id);
 				notified = i;
 			}
+
+			/*
+			 * On crash secondaries might be unreachable or hung,
+			 * so time out if we've waited too long.
+			 */
+			mdelay(1);
+			if (timeout-- == 0) {
+				printk(KERN_ERR "kexec: timed out waiting for "
+				       "cpu %d (physical %d) to enter OPAL\n",
+				       i, paca[i].hw_cpu_id);
+				break;
+			}
 		}
 	}
 }
@@ -233,13 +245,6 @@ static void pnv_kexec_cpu_down(int crash_shutdown, int secondary)
 
 		/* Return the CPU to OPAL */
 		opal_return_cpu();
-	} else if (crash_shutdown) {
-		/*
-		 * On crash, we don't wait for secondaries to go
-		 * down as they might be unreachable or hung, so
-		 * instead we just wait a bit and move on.
-		 */
-		mdelay(1);
 	} else {
 		/* Primary waits for the secondaries to have reached OPAL */
 		pnv_kexec_wait_secondaries_down();
-- 
2.4.6
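
The bounded wait that the patch adds can be sketched in plain C outside the kernel. This is only an illustration under stated assumptions: wait_for_cpu(), cpu_parked_fn, delay_ms_fn and the fake_* helpers are hypothetical names that do not appear in the patch. The real loop lives in pnv_kexec_wait_secondaries_down() and uses mdelay(1).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical callback: returns true once `cpu` has reached firmware. */
typedef bool (*cpu_parked_fn)(int cpu, void *ctx);

/* Hypothetical callback standing in for the kernel's mdelay(). */
typedef void (*delay_ms_fn)(unsigned int ms);

/*
 * Poll until `cpu` is parked or roughly `budget_ms` milliseconds elapse.
 * Returns true if the CPU parked, false if the budget ran out.
 */
static bool wait_for_cpu(int cpu, cpu_parked_fn parked, void *ctx,
			 delay_ms_fn delay, int budget_ms)
{
	int remaining = budget_ms;

	assert(budget_ms >= 0);

	while (!parked(cpu, ctx)) {
		if (remaining-- == 0)
			return false;	/* give up; the caller logs and moves on */
		delay(1);
	}
	return true;
}

/* Demo stub: a fake CPU that parks after a fixed number of polls. */
struct fake_cpu {
	int polls_until_parked;	/* negative means "never parks" */
	int polls_seen;
};

static bool fake_parked(int cpu, void *ctx)
{
	struct fake_cpu *f = ctx;

	(void)cpu;
	f->polls_seen++;
	return f->polls_until_parked >= 0 &&
	       f->polls_seen > f->polls_until_parked;
}

/* Demo stub: no real sleeping, so the sketch runs instantly. */
static void fake_delay(unsigned int ms)
{
	(void)ms;
}
```

With a struct fake_cpu whose polls_until_parked is negative, wait_for_cpu() returns false once the budget is used up instead of spinning forever, which is the behaviour the patch extends to the non-crash kexec path.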

* Re: [RFC PATCH] powerpc/kexec: Wait 1s for secondaries to enter OPAL
  2015-07-22  5:54 [RFC PATCH] powerpc/kexec: Wait 1s for secondaries to enter OPAL Samuel Mendoza-Jonas
@ 2015-07-27  5:56 ` Stewart Smith
  2015-07-28  6:13   ` Samuel Mendoza-Jonas
  2015-10-12 11:21 ` [RFC] " Michael Ellerman
  1 sibling, 1 reply; 6+ messages in thread
From: Stewart Smith @ 2015-07-27  5:56 UTC (permalink / raw)
  To: Samuel Mendoza-Jonas, linuxppc-dev; +Cc: Samuel Mendoza-Jonas

Samuel Mendoza-Jonas <sam.mj@au1.ibm.com> writes:
> Always include a timeout when waiting for secondary cpus to enter OPAL
> in the kexec path, rather than only when crashing.

This *sounds* reasonable... but I wonder what the actual worst case could
be and why we'd get stuck too long waiting for things?

What was the original bug/problem that inspired this patch?

and is 1s enough?

* Re: [RFC PATCH] powerpc/kexec: Wait 1s for secondaries to enter OPAL
  2015-07-27  5:56 ` Stewart Smith
@ 2015-07-28  6:13   ` Samuel Mendoza-Jonas
  2015-07-28  9:58     ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 6+ messages in thread
From: Samuel Mendoza-Jonas @ 2015-07-28  6:13 UTC (permalink / raw)
  To: Stewart Smith, linuxppc-dev

On 27/07/15 15:56, Stewart Smith wrote:
> Samuel Mendoza-Jonas <sam.mj@au1.ibm.com> writes:
>> Always include a timeout when waiting for secondary cpus to enter OPAL
>> in the kexec path, rather than only when crashing.
> 
> This *sounds* reasonable... but I wonder what the actual worst case could
> be and why we'd get stuck too long waiting for things?
> 
> What was the original bug/problem that inspired this patch?
> 
> and is 1s enough?

"It sounds reasonable" was more or less the inspiration :)
While I was going over some of the code relating to the previous kexec
fix with Ben he pointed this out and suggested there wasn't
much of a reason to differentiate between a crashing/non-crashing
cpu as far as the timeout goes - if we're not 'crashing' we still
don't want to spin forever.

I'll let Ben comment on whether 1s per cpu is enough.
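
Because the patch declares the timeout inside the per-CPU loop, the budget restarts for every secondary, so the total wait grows with the number of online CPUs. A rough upper bound can be sketched as below. The helper name worst_case_wait_ms() is hypothetical, and the estimate assumes every secondary is hung and ignores time spent printing the error messages.

```c
#include <assert.h>

/*
 * Approximate upper bound, in milliseconds, on the total time the primary
 * spends waiting when every secondary is hung and each one gets its own
 * budget. This is an estimate, not a measured figure.
 */
static long worst_case_wait_ms(int online_cpus, long per_cpu_budget_ms)
{
	assert(online_cpus >= 1);
	assert(per_cpu_budget_ms >= 0);

	/* The primary skips itself, so only online_cpus - 1 CPUs are polled. */
	return (long)(online_cpus - 1) * per_cpu_budget_ms;
}
```

For example, worst_case_wait_ms(160, 1000) gives 159000, i.e. a little under three minutes if all 159 secondaries of a hypothetical 160-thread machine failed to respond.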


-- 
-----------
LTC Ozlabs
IBM

* Re: [RFC PATCH] powerpc/kexec: Wait 1s for secondaries to enter OPAL
  2015-07-28  6:13   ` Samuel Mendoza-Jonas
@ 2015-07-28  9:58     ` Benjamin Herrenschmidt
  2015-07-29  7:24       ` Stewart Smith
  0 siblings, 1 reply; 6+ messages in thread
From: Benjamin Herrenschmidt @ 2015-07-28  9:58 UTC (permalink / raw)
  To: sam.mj; +Cc: Stewart Smith, linuxppc-dev

On Tue, 2015-07-28 at 16:13 +1000, Samuel Mendoza-Jonas wrote:

> "It sounds reasonable" was more or less the inspiration :)
> While I was going over some of the code relating to the previous kexec
> fix with Ben he pointed this out and suggested there wasn't
> much of a reason to differentiate between a crashing/non-crashing
> cpu as far as the timeout goes - if we're not 'crashing' we still
> don't want to spin forever.
> 
> I'll let Ben comment on whether 1s per cpu is enough.

Well, if the scheduler doesn't give us the CPU at the point of kexec
within a second, I think we are in pretty bad shape already, don't you
think?

I don't mind bumping the timeout if you have worries...

Cheers,
Ben.

* Re: [RFC PATCH] powerpc/kexec: Wait 1s for secondaries to enter OPAL
  2015-07-28  9:58     ` Benjamin Herrenschmidt
@ 2015-07-29  7:24       ` Stewart Smith
  0 siblings, 0 replies; 6+ messages in thread
From: Stewart Smith @ 2015-07-29  7:24 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, sam.mj; +Cc: linuxppc-dev

Benjamin Herrenschmidt <benh@kernel.crashing.org> writes:
> On Tue, 2015-07-28 at 16:13 +1000, Samuel Mendoza-Jonas wrote:
>
>> "It sounds reasonable" was more or less the inspiration :)
>> While I was going over some of the code relating to the previous kexec
>> fix with Ben he pointed this out and suggested there wasn't
>> much of a reason to differentiate between a crashing/non-crashing
>> cpu as far as the timeout goes - if we're not 'crashing' we still
>> don't want to spin forever.
>> 
>> I'll let Ben comment on whether 1s per cpu is enough.
>
> Well, if the scheduler doesn't give us the CPU at the point of kexec
> within a second, I think we are in pretty bad shape already, don't you
> think?

Quite likely, I think my dislike of magic timeouts just kicked in :)

* Re: [RFC] powerpc/kexec: Wait 1s for secondaries to enter OPAL
  2015-07-22  5:54 [RFC PATCH] powerpc/kexec: Wait 1s for secondaries to enter OPAL Samuel Mendoza-Jonas
  2015-07-27  5:56 ` Stewart Smith
@ 2015-10-12 11:21 ` Michael Ellerman
  1 sibling, 0 replies; 6+ messages in thread
From: Michael Ellerman @ 2015-10-12 11:21 UTC (permalink / raw)
  To: Samuel Mendoza-Jonas, linuxppc-dev; +Cc: Samuel Mendoza-Jonas

On Wed, 2015-07-22 at 05:54:29 UTC, Samuel Mendoza-Jonas wrote:
> Always include a timeout when waiting for secondary cpus to enter OPAL
> in the kexec path, rather than only when crashing.
> 
> Signed-off-by: Samuel Mendoza-Jonas <sam.mj@au1.ibm.com>

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/1b70386c99e997b359735c75

cheers
