linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
From: Nathan Lynch via B4 Submission Endpoint <devnull+nathanl.linux.ibm.com@kernel.org>
To: Michael Ellerman <mpe@ellerman.id.au>,
	 Nicholas Piggin <npiggin@gmail.com>,
	 Christophe Leroy <christophe.leroy@csgroup.eu>,
	 Kajol Jain <kjain@linux.ibm.com>,
	Laurent Dufour <ldufour@linux.ibm.com>,
	 Mahesh J Salgaonkar <mahesh@linux.ibm.com>,
	 Andrew Donnellan <ajd@linux.ibm.com>,
	Nick Child <nnac123@linux.ibm.com>
Cc: Nathan Lynch <nathanl@linux.ibm.com>, linuxppc-dev@lists.ozlabs.org
Subject: [PATCH v2 01/19] powerpc/rtas: handle extended delays safely in early boot
Date: Mon, 06 Feb 2023 12:54:17 -0600	[thread overview]
Message-ID: <20230125-b4-powerpc-rtas-queue-v2-1-9aa6bd058063@linux.ibm.com> (raw)
In-Reply-To: <20230125-b4-powerpc-rtas-queue-v2-0-9aa6bd058063@linux.ibm.com>

From: Nathan Lynch <nathanl@linux.ibm.com>

Some code that runs early in boot calls RTAS functions that can return
-2 or 990x statuses, which mean the caller should retry. An example is
pSeries_cmo_feature_init(), which invokes ibm,get-system-parameter but
treats these benign statuses as errors instead of retrying.

pSeries_cmo_feature_init() and similar code should be made to retry
until they succeed or receive a real error, using the usual pattern:

	do {
		rc = rtas_call(token, etc...);
	} while (rtas_busy_delay(rc));

But rtas_busy_delay() will perform a timed sleep on any 990x
status. This isn't safe so early in boot, before the CPU scheduler and
timer subsystem have initialized.

The -2 RTAS status is much more likely to occur during single-threaded
boot than 990x in practice, at least on PowerVM. This is because -2
usually means that RTAS made progress but exhausted its self-imposed
timeslice, while 990x is associated with concurrent requests from the
OS causing internal contention. Regardless, according to the language
in PAPR, the OS should be prepared to handle either type of status at
any time.

Add a fallback path to rtas_busy_delay() to handle this as safely as
possible, performing a small delay on 990x. Include a counter to
detect retry loops that aren't making progress and bail out.

This was found by inspection and I'm not aware of any real
failures. However, the implementation of rtas_busy_delay() before
commit 38f7b7067dae ("powerpc/rtas: rtas_busy_delay() improvements")
was not susceptible to this problem, so let's treat this as a
regression.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Fixes: 38f7b7067dae ("powerpc/rtas: rtas_busy_delay() improvements")
---
 arch/powerpc/kernel/rtas.c | 48 +++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 47 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
index 795225d7f138..ec2df09a70cf 100644
--- a/arch/powerpc/kernel/rtas.c
+++ b/arch/powerpc/kernel/rtas.c
@@ -606,6 +606,46 @@ unsigned int rtas_busy_delay_time(int status)
 	return ms;
 }
 
+/*
+ * Early boot fallback for rtas_busy_delay().
+ */
+static bool __init rtas_busy_delay_early(int status)
+{
+	static size_t successive_ext_delays __initdata;
+	bool ret;
+
+	switch (status) {
+	case RTAS_EXTENDED_DELAY_MIN...RTAS_EXTENDED_DELAY_MAX:
+		/*
+		 * In the unlikely case that we receive an extended
+		 * delay status in early boot, the OS is probably not
+		 * the cause, and there's nothing we can do to clear
+		 * the condition. Best we can do is delay for a bit
+		 * and hope it's transient. Lie to the caller if it
+		 * seems like we're stuck in a retry loop.
+		 */
+		mdelay(1);
+		ret = true;
+		successive_ext_delays += 1;
+		if (successive_ext_delays > 1000) {
+			pr_err("too many extended delays, giving up\n");
+			dump_stack();
+			ret = false;
+		}
+		break;
+	case RTAS_BUSY:
+		ret = true;
+		successive_ext_delays = 0;
+		break;
+	default:
+		ret = false;
+		successive_ext_delays = 0;
+		break;
+	}
+
+	return ret;
+}
+
 /**
  * rtas_busy_delay() - helper for RTAS busy and extended delay statuses
  *
@@ -624,11 +664,17 @@ unsigned int rtas_busy_delay_time(int status)
  * * false - @status is not @RTAS_BUSY nor an extended delay hint. The
  *           caller is responsible for handling @status.
  */
-bool rtas_busy_delay(int status)
+bool __ref rtas_busy_delay(int status)
 {
 	unsigned int ms;
 	bool ret;
 
+	/*
+	 * Can't do timed sleeps before timekeeping is up.
+	 */
+	if (system_state < SYSTEM_SCHEDULING)
+		return rtas_busy_delay_early(status);
+
 	switch (status) {
 	case RTAS_EXTENDED_DELAY_MIN...RTAS_EXTENDED_DELAY_MAX:
 		ret = true;

-- 
2.39.1


  reply	other threads:[~2023-02-06 19:02 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-02-06 18:54 [PATCH v2 00/19] RTAS maintenance Nathan Lynch via B4 Submission Endpoint
2023-02-06 18:54 ` Nathan Lynch via B4 Submission Endpoint [this message]
2023-02-08 11:20   ` [PATCH v2 01/19] powerpc/rtas: handle extended delays safely in early boot Michael Ellerman
2023-02-08 13:14     ` Nathan Lynch
2023-02-10  5:54       ` Michael Ellerman
2023-02-06 18:54 ` [PATCH v2 02/19] powerpc/perf/hv-24x7: add missing RTAS retry status handling Nathan Lynch via B4 Submission Endpoint
2023-02-06 18:54 ` [PATCH v2 03/19] powerpc/pseries/lpar: " Nathan Lynch via B4 Submission Endpoint
2023-02-06 18:54 ` [PATCH v2 04/19] powerpc/pseries/lparcfg: " Nathan Lynch via B4 Submission Endpoint
2023-02-06 18:54 ` [PATCH v2 05/19] powerpc/pseries/setup: " Nathan Lynch via B4 Submission Endpoint
2023-02-06 18:54 ` [PATCH v2 06/19] powerpc/pseries: drop RTAS-based timebase synchronization Nathan Lynch via B4 Submission Endpoint
2023-02-06 18:54 ` [PATCH v2 07/19] powerpc/rtas: improve function information lookups Nathan Lynch via B4 Submission Endpoint
2023-02-08 11:57   ` Michael Ellerman
2023-02-08 13:16     ` Nathan Lynch
2023-02-06 18:54 ` [PATCH v2 08/19] powerpc/rtas: strengthen do_enter_rtas() type safety, drop inline Nathan Lynch via B4 Submission Endpoint
2023-02-06 18:54 ` [PATCH v2 09/19] powerpc/tracing: tracepoints for RTAS entry and exit Nathan Lynch via B4 Submission Endpoint
2023-02-06 18:54 ` [PATCH v2 10/19] powerpc/rtas: add tracepoints around RTAS entry Nathan Lynch via B4 Submission Endpoint
2023-02-06 18:54 ` [PATCH v2 11/19] powerpc/rtas: add work area allocator Nathan Lynch via B4 Submission Endpoint
2023-02-08 11:58   ` Michael Ellerman
2023-02-08 14:48     ` Nathan Lynch
2023-02-10  6:07       ` Michael Ellerman
2023-02-06 18:54 ` [PATCH v2 12/19] powerpc/pseries/dlpar: use RTAS work area API Nathan Lynch via B4 Submission Endpoint
2023-02-06 18:54 ` [PATCH v2 13/19] powerpc/pseries: PAPR system parameter API Nathan Lynch via B4 Submission Endpoint
2023-02-06 18:54 ` [PATCH v2 14/19] powerpc/pseries: convert CMO probe to papr_sysparm API Nathan Lynch via B4 Submission Endpoint
2023-02-06 18:54 ` [PATCH v2 15/19] powerpc/pseries/lparcfg: convert " Nathan Lynch via B4 Submission Endpoint
2023-02-06 18:54 ` [PATCH v2 16/19] powerpc/pseries/hv-24x7: " Nathan Lynch via B4 Submission Endpoint
2023-02-06 18:54 ` [PATCH v2 17/19] powerpc/pseries/lpar: " Nathan Lynch via B4 Submission Endpoint
2023-02-06 18:54 ` [PATCH v2 18/19] powerpc/rtas: introduce rtas_function_token() API Nathan Lynch via B4 Submission Endpoint
2023-02-08 12:09   ` Michael Ellerman
2023-02-08 15:44     ` Nathan Lynch
2023-02-06 18:54 ` [PATCH v2 19/19] powerpc/rtas: arch-wide function token lookup conversions Nathan Lynch via B4 Submission Endpoint

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230125-b4-powerpc-rtas-queue-v2-1-9aa6bd058063@linux.ibm.com \
    --to=devnull+nathanl.linux.ibm.com@kernel.org \
    --cc=ajd@linux.ibm.com \
    --cc=christophe.leroy@csgroup.eu \
    --cc=kjain@linux.ibm.com \
    --cc=ldufour@linux.ibm.com \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=mahesh@linux.ibm.com \
    --cc=mpe@ellerman.id.au \
    --cc=nathanl@linux.ibm.com \
    --cc=nnac123@linux.ibm.com \
    --cc=npiggin@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).