linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
From: Aneesh Kumar K.V (IBM) <aneesh.kumar@kernel.org>
To: Nathan Lynch via B4 Relay
	<devnull+nathanl.linux.ibm.com@kernel.org>,
	Michael Ellerman <mpe@ellerman.id.au>,
	Nicholas Piggin <npiggin@gmail.com>
Cc: "Nathan Lynch" <nathanl@linux.ibm.com>,
	tyreld@linux.ibm.com, "Michal Suchánek" <msuchanek@suse.de>,
	linuxppc-dev@lists.ozlabs.org, gcwilson@linux.ibm.com
Subject: Re: [PATCH v4 05/13] powerpc/rtas: Facilitate high-level call sequences
Date: Mon, 20 Nov 2023 13:40:22 +0530	[thread overview]
Message-ID: <878r6slua9.fsf@kernel.org> (raw)
In-Reply-To: <20231117-papr-sys_rtas-vs-lockdown-v4-5-b794d8cb8502@linux.ibm.com>

Nathan Lynch via B4 Relay <devnull+nathanl.linux.ibm.com@kernel.org>
writes:

> From: Nathan Lynch <nathanl@linux.ibm.com>
>
> On RTAS platforms there is a general restriction that the OS must not
> enter RTAS on more than one CPU at a time. This low-level
> serialization requirement is satisfied by holding a spin
> lock (rtas_lock) across most RTAS function invocations.
>
> However, some pseries RTAS functions require multiple successive calls
> to complete a logical operation. Beginning a new call sequence for such a
> function may disrupt any other sequences of that function already in
> progress. Safe and reliable use of these functions effectively
> requires higher-level serialization beyond what is already done at the
> level of RTAS entry and exit.
>
> Where a sequence-based RTAS function is invoked only through
> sys_rtas(), with no in-kernel users, there is no issue as far as the
> kernel is concerned. User space is responsible for appropriately
> serializing its call sequences. (Whether user space code actually
> takes measures to prevent sequence interleaving is another matter.)
> Examples of such functions currently include ibm,platform-dump and
> ibm,get-vpd.
>
> But where a sequence-based RTAS function has both user space and
> in-kernel uesrs, there is a hazard. Even if the in-kernel call sites
> of such a function serialize their sequences correctly, a user of
> sys_rtas() can invoke the same function at any time, potentially
> disrupting a sequence in progress.
>
> So in order to prevent disruption of kernel-based RTAS call sequences,
> they must serialize not only with themselves but also with sys_rtas()
> users, somehow. Preferably without adding global locks or adding more
> function-specific hacks to sys_rtas(). This is a prerequisite for
> adding an in-kernel call sequence of ibm,get-vpd, which is in a change
> to follow.
>
> Note that it has never been feasible for the kernel to prevent
> sys_rtas()-based sequences from being disrupted because control
> returns to user space on every call. sys_rtas()-based users of these
> functions have always been, and continue to be, responsible for
> coordinating their call sequences with other users, even those which
> may invoke the RTAS functions through less direct means than
> sys_rtas(). This is an unavoidable consequence of exposing
> sequence-based RTAS functions through sys_rtas().
>
> * Add new rtas_function_lock() and rtas_function_unlock() APIs for use
>   with sequence-based RTAS functions.
>
> * Add an optional per-function mutex to struct rtas_function. When this
>   member is set, kernel-internal callers of the RTAS function are
>   required to guard their call sequences with rtas_function_lock() and
>   rtas_function_unlock(). This requirement will be enforced in a later
>   change, after all affected call sites are updated.
>
> * Populate the lock members of function table entries where
>   serialization of call sequences is known to be necessary, along with
>   justifying commentary.
>
> * In sys_rtas(), acquire the per-function mutex when it is present.
>
> There should be no perceivable change introduced here except that
> concurrent callers of the same RTAS function via sys_rtas() may block
> on a mutex instead of spinning on rtas_lock. Changes to follow will
> add rtas_function_lock()/unlock() pairs to kernel-based call
> sequences.
>

Can you add an example of the last part. I did look at to find 06 to
find the details 

	rtas_function_lock(RTAS_FN_IBM_ACTIVATE_FIRMWARE);

	do {
		fwrc = rtas_call(token, 0, 1, NULL);
	} while (rtas_busy_delay(fwrc));

	rtas_function_unlock(RTAS_FN_IBM_ACTIVATE_FIRMWARE);

Reviewed-by: Aneesh Kumar K.V (IBM) <aneesh.kumar@kernel.org>

>
> Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
> ---
>  arch/powerpc/include/asm/rtas.h |   2 +
>  arch/powerpc/kernel/rtas.c      | 101 +++++++++++++++++++++++++++++++++++++++-
>  2 files changed, 101 insertions(+), 2 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/rtas.h b/arch/powerpc/include/asm/rtas.h
> index b73010583a8d..9a20caba6858 100644
> --- a/arch/powerpc/include/asm/rtas.h
> +++ b/arch/powerpc/include/asm/rtas.h
> @@ -421,6 +421,8 @@ static inline bool rtas_function_implemented(const rtas_fn_handle_t handle)
>  {
>  	return rtas_function_token(handle) != RTAS_UNKNOWN_SERVICE;
>  }
> +void rtas_function_lock(rtas_fn_handle_t handle);
> +void rtas_function_unlock(rtas_fn_handle_t handle);
>  extern int rtas_token(const char *service);
>  extern int rtas_service_present(const char *service);
>  extern int rtas_call(int token, int, int, int *, ...);
> diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
> index 1fc0b3fffdd1..52f2242d0c28 100644
> --- a/arch/powerpc/kernel/rtas.c
> +++ b/arch/powerpc/kernel/rtas.c
> @@ -18,6 +18,7 @@
>  #include <linux/kernel.h>
>  #include <linux/lockdep.h>
>  #include <linux/memblock.h>
> +#include <linux/mutex.h>
>  #include <linux/of.h>
>  #include <linux/of_fdt.h>
>  #include <linux/reboot.h>
> @@ -70,14 +71,34 @@ struct rtas_filter {
>   *                            ppc64le, and we want to keep it that way. It does
>   *                            not make sense for this to be set when @filter
>   *                            is NULL.
> + * @lock: Pointer to an optional dedicated per-function mutex. This
> + *        should be set for functions that require multiple calls in
> + *        sequence to complete a single operation, and such sequences
> + *        will disrupt each other if allowed to interleave. Users of
> + *        this function are required to hold the associated lock for
> + *        the duration of the call sequence. Add an explanatory
> + *        comment to the function table entry if setting this member.
>   */
>  struct rtas_function {
>  	s32 token;
>  	const bool banned_for_syscall_on_le:1;
>  	const char * const name;
>  	const struct rtas_filter *filter;
> +	struct mutex *lock;
>  };
>  
> +/*
> + * Per-function locks. Access these through the
> + * rtas_function_lock/unlock APIs, not directly.
> + */
> +static DEFINE_MUTEX(rtas_ibm_activate_firmware_lock);
> +static DEFINE_MUTEX(rtas_ibm_get_dynamic_sensor_state_lock);
> +static DEFINE_MUTEX(rtas_ibm_get_indices_lock);
> +static DEFINE_MUTEX(rtas_ibm_get_vpd_lock);
> +static DEFINE_MUTEX(rtas_ibm_lpar_perftools_lock);
> +static DEFINE_MUTEX(rtas_ibm_physical_attestation_lock);
> +static DEFINE_MUTEX(rtas_ibm_set_dynamic_indicator_lock);
> +
>  static struct rtas_function rtas_function_table[] __ro_after_init = {
>  	[RTAS_FNIDX__CHECK_EXCEPTION] = {
>  		.name = "check-exception",
> @@ -125,6 +146,13 @@ static struct rtas_function rtas_function_table[] __ro_after_init = {
>  			.buf_idx1 = -1, .size_idx1 = -1,
>  			.buf_idx2 = -1, .size_idx2 = -1,
>  		},
> +		/*
> +		 * PAPR doesn't explicitly impose any restriction, but
> +		 * this typically requires multiple calls before
> +		 * success, and there's no reason to allow sequences
> +		 * to interleave.
> +		 */
> +		.lock = &rtas_ibm_activate_firmware_lock,
>  	},
>  	[RTAS_FNIDX__IBM_CBE_START_PTCAL] = {
>  		.name = "ibm,cbe-start-ptcal",
> @@ -196,6 +224,12 @@ static struct rtas_function rtas_function_table[] __ro_after_init = {
>  			.buf_idx1 = 1, .size_idx1 = -1,
>  			.buf_idx2 = -1, .size_idx2 = -1,
>  		},
> +		/*
> +		 * PAPR+ R1–7.3.19–3 is explicit that the OS must not
> +		 * call ibm,get-dynamic-sensor-state with different
> +		 * inputs until a non-retry status has been returned.
> +		 */
> +		.lock = &rtas_ibm_get_dynamic_sensor_state_lock,
>  	},
>  	[RTAS_FNIDX__IBM_GET_INDICES] = {
>  		.name = "ibm,get-indices",
> @@ -203,6 +237,12 @@ static struct rtas_function rtas_function_table[] __ro_after_init = {
>  			.buf_idx1 = 2, .size_idx1 = 3,
>  			.buf_idx2 = -1, .size_idx2 = -1,
>  		},
> +		/*
> +		 * PAPR+ R1–7.3.17–2 says that the OS must not
> +		 * interleave ibm,get-indices call sequences with
> +		 * different inputs.
> +		 */
> +		.lock = &rtas_ibm_get_indices_lock,
>  	},
>  	[RTAS_FNIDX__IBM_GET_RIO_TOPOLOGY] = {
>  		.name = "ibm,get-rio-topology",
> @@ -220,6 +260,11 @@ static struct rtas_function rtas_function_table[] __ro_after_init = {
>  			.buf_idx1 = 0, .size_idx1 = -1,
>  			.buf_idx2 = 1, .size_idx2 = 2,
>  		},
> +		/*
> +		 * PAPR+ R1–7.3.20–4 indicates that sequences
> +		 * should not be allowed to interleave.
> +		 */
> +		.lock = &rtas_ibm_get_vpd_lock,
>  	},
>  	[RTAS_FNIDX__IBM_GET_XIVE] = {
>  		.name = "ibm,get-xive",
> @@ -239,6 +284,11 @@ static struct rtas_function rtas_function_table[] __ro_after_init = {
>  			.buf_idx1 = 2, .size_idx1 = 3,
>  			.buf_idx2 = -1, .size_idx2 = -1,
>  		},
> +		/*
> +		 * PAPR+ R1–7.3.26–6 says the OS should allow only one
> +		 * call sequence in progress at a time.
> +		 */
> +		.lock = &rtas_ibm_lpar_perftools_lock,
>  	},
>  	[RTAS_FNIDX__IBM_MANAGE_FLASH_IMAGE] = {
>  		.name = "ibm,manage-flash-image",
> @@ -277,6 +327,14 @@ static struct rtas_function rtas_function_table[] __ro_after_init = {
>  			.buf_idx1 = 0, .size_idx1 = 1,
>  			.buf_idx2 = -1, .size_idx2 = -1,
>  		},
> +		/*
> +		 * This follows a sequence-based pattern similar to
> +		 * ibm,get-vpd et al. Since PAPR+ restricts
> +		 * interleaving call sequences for other functions of
> +		 * this style, assume the restriction applies here,
> +		 * even though it's not explicit in the spec.
> +		 */
> +		.lock = &rtas_ibm_physical_attestation_lock,
>  	},
>  	[RTAS_FNIDX__IBM_PLATFORM_DUMP] = {
>  		.name = "ibm,platform-dump",
> @@ -284,6 +342,13 @@ static struct rtas_function rtas_function_table[] __ro_after_init = {
>  			.buf_idx1 = 4, .size_idx1 = 5,
>  			.buf_idx2 = -1, .size_idx2 = -1,
>  		},
> +		/*
> +		 * PAPR+ 7.3.3.4.1 indicates that concurrent sequences
> +		 * of ibm,platform-dump are allowed if they are
> +		 * operating on different dump tags. So leave
> +		 * the lock pointer unset for now. This may need
> +		 * reconsideration if kernel-internal users appear.
> +		 */
>  	},
>  	[RTAS_FNIDX__IBM_POWER_OFF_UPS] = {
>  		.name = "ibm,power-off-ups",
> @@ -326,6 +391,12 @@ static struct rtas_function rtas_function_table[] __ro_after_init = {
>  			.buf_idx1 = 2, .size_idx1 = -1,
>  			.buf_idx2 = -1, .size_idx2 = -1,
>  		},
> +		/*
> +		 * PAPR+ R1–7.3.18–3 says the OS must not call this
> +		 * function with different inputs until a non-retry
> +		 * status has been returned.
> +		 */
> +		.lock = &rtas_ibm_set_dynamic_indicator_lock,
>  	},
>  	[RTAS_FNIDX__IBM_SET_EEH_OPTION] = {
>  		.name = "ibm,set-eeh-option",
> @@ -556,9 +627,9 @@ static int __init rtas_token_to_function_xarray_init(void)
>  }
>  arch_initcall(rtas_token_to_function_xarray_init);
>  
> -static const struct rtas_function *rtas_token_to_function(s32 token)
> +static struct rtas_function *rtas_token_to_function(s32 token)
>  {
> -	const struct rtas_function *func;
> +	struct rtas_function *func;
>  
>  	if (WARN_ONCE(token < 0, "invalid token %d", token))
>  		return NULL;
> @@ -581,6 +652,28 @@ static const struct rtas_function *rtas_token_to_function(s32 token)
>  	return NULL;
>  }
>  
> +static void __rtas_function_lock(struct rtas_function *func)
> +{
> +	if (func && func->lock)
> +		mutex_lock(func->lock);
> +}
> +
> +static void __rtas_function_unlock(struct rtas_function *func)
> +{
> +	if (func && func->lock)
> +		mutex_unlock(func->lock);
> +}
> +
> +void rtas_function_lock(const rtas_fn_handle_t handle)
> +{
> +	__rtas_function_lock(rtas_function_lookup(handle));
> +}
> +
> +void rtas_function_unlock(const rtas_fn_handle_t handle)
> +{
> +	__rtas_function_unlock(rtas_function_lookup(handle));
> +}
> +
>  /* This is here deliberately so it's only used in this file */
>  void enter_rtas(unsigned long);
>  
> @@ -1885,6 +1978,8 @@ SYSCALL_DEFINE1(rtas, struct rtas_args __user *, uargs)
>  
>  	buff_copy = get_errorlog_buffer();
>  
> +	__rtas_function_lock(rtas_token_to_function(token));
> +
>  	raw_spin_lock_irqsave(&rtas_lock, flags);
>  	cookie = lockdep_pin_lock(&rtas_lock);
>  
> @@ -1900,6 +1995,8 @@ SYSCALL_DEFINE1(rtas, struct rtas_args __user *, uargs)
>  	lockdep_unpin_lock(&rtas_lock, cookie);
>  	raw_spin_unlock_irqrestore(&rtas_lock, flags);
>  
> +	__rtas_function_unlock(rtas_token_to_function(token));
> +
>  	if (buff_copy) {
>  		if (errbuf)
>  			log_error(errbuf, ERR_TYPE_RTAS_LOG, 0);
>
> -- 
> 2.41.0

  reply	other threads:[~2023-11-20 20:21 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-11-18  5:14 [PATCH v4 00/13] powerpc/pseries: New character devices for system parameters and VPD Nathan Lynch via B4 Relay
2023-11-18  5:14 ` [PATCH v4 01/13] powerpc/rtas: Add for_each_rtas_function() iterator Nathan Lynch via B4 Relay
2023-11-20  8:07   ` Aneesh Kumar K.V
2023-11-18  5:14 ` [PATCH v4 02/13] powerpc/rtas: Fall back to linear search on failed token->function lookup Nathan Lynch via B4 Relay
2023-11-20  8:07   ` Aneesh Kumar K.V
2023-11-18  5:14 ` [PATCH v4 03/13] powerpc/rtas: Add function return status constants Nathan Lynch via B4 Relay
2023-11-20  8:08   ` Aneesh Kumar K.V
2023-11-18  5:14 ` [PATCH v4 04/13] powerpc/rtas: Factor out function descriptor lookup Nathan Lynch via B4 Relay
2023-11-20  8:08   ` Aneesh Kumar K.V
2023-11-18  5:14 ` [PATCH v4 05/13] powerpc/rtas: Facilitate high-level call sequences Nathan Lynch via B4 Relay
2023-11-20  8:10   ` Aneesh Kumar K.V [this message]
2023-11-28 15:35     ` Nathan Lynch
2023-11-28 22:30   ` Michael Ellerman
2023-11-28 23:05     ` Nathan Lynch
2023-11-29 13:20       ` Michael Ellerman
2023-11-30 18:26         ` Nathan Lynch
2023-11-30 21:41           ` Nathan Lynch
2023-11-30 22:46           ` Michael Ellerman
2023-11-30 23:53             ` Nathan Lynch
2023-12-05 16:51               ` Nathan Lynch
2023-12-07  1:02                 ` Michael Ellerman
2023-11-29  2:11   ` Michael Ellerman
2023-11-29  2:37     ` Nathan Lynch
2023-11-29  3:16       ` Michael Ellerman
2023-11-18  5:14 ` [PATCH v4 06/13] powerpc/rtas: Serialize firmware activation sequences Nathan Lynch via B4 Relay
2023-11-20  8:12   ` Aneesh Kumar K.V
2023-11-28 15:32     ` Nathan Lynch
2023-11-28 15:46       ` Aneesh Kumar K.V
2023-11-28 16:16         ` Nathan Lynch
2023-11-28 16:41           ` Nathan Lynch
2023-11-18  5:14 ` [PATCH v4 07/13] powerpc/rtas: Warn if per-function lock isn't held Nathan Lynch via B4 Relay
2023-11-20  8:13   ` Aneesh Kumar K.V
2023-11-18  5:14 ` [PATCH v4 08/13] powerpc/uapi: Export papr-miscdev.h header Nathan Lynch via B4 Relay
2023-11-18  5:14 ` [PATCH v4 09/13] powerpc/pseries: Add papr-vpd character driver for VPD retrieval Nathan Lynch via B4 Relay
2023-11-21  8:31   ` Michal Suchánek
2023-11-28 15:38     ` Nathan Lynch
2023-11-29  2:07   ` Michael Ellerman
2023-11-29  2:41     ` Nathan Lynch
2023-11-29  3:13       ` Michael Ellerman
2023-11-18  5:14 ` [PATCH v4 10/13] powerpc/pseries/papr-sysparm: Validate buffer object lengths Nathan Lynch via B4 Relay
2023-11-18  5:14 ` [PATCH v4 11/13] powerpc/pseries/papr-sysparm: Expose character device to user space Nathan Lynch via B4 Relay
2023-11-18  5:14 ` [PATCH v4 12/13] powerpc/selftests: Add test for papr-vpd Nathan Lynch via B4 Relay
2023-11-18  5:14 ` [PATCH v4 13/13] powerpc/selftests: Add test for papr-sysparm Nathan Lynch via B4 Relay
2023-11-29  1:08   ` Michael Ellerman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=878r6slua9.fsf@kernel.org \
    --to=aneesh.kumar@kernel.org \
    --cc=devnull+nathanl.linux.ibm.com@kernel.org \
    --cc=gcwilson@linux.ibm.com \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=mpe@ellerman.id.au \
    --cc=msuchanek@suse.de \
    --cc=nathanl@linux.ibm.com \
    --cc=npiggin@gmail.com \
    --cc=tyreld@linux.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).