From: Aneesh Kumar K.V (IBM) <aneesh.kumar@kernel.org>
To: Nathan Lynch via B4 Relay
<devnull+nathanl.linux.ibm.com@kernel.org>,
Michael Ellerman <mpe@ellerman.id.au>,
Nicholas Piggin <npiggin@gmail.com>
Cc: "Nathan Lynch" <nathanl@linux.ibm.com>,
tyreld@linux.ibm.com, "Michal Suchánek" <msuchanek@suse.de>,
linuxppc-dev@lists.ozlabs.org, gcwilson@linux.ibm.com
Subject: Re: [PATCH v4 05/13] powerpc/rtas: Facilitate high-level call sequences
Date: Mon, 20 Nov 2023 13:40:22 +0530 [thread overview]
Message-ID: <878r6slua9.fsf@kernel.org> (raw)
In-Reply-To: <20231117-papr-sys_rtas-vs-lockdown-v4-5-b794d8cb8502@linux.ibm.com>
Nathan Lynch via B4 Relay <devnull+nathanl.linux.ibm.com@kernel.org>
writes:
> From: Nathan Lynch <nathanl@linux.ibm.com>
>
> On RTAS platforms there is a general restriction that the OS must not
> enter RTAS on more than one CPU at a time. This low-level
> serialization requirement is satisfied by holding a spin
> lock (rtas_lock) across most RTAS function invocations.
>
> However, some pseries RTAS functions require multiple successive calls
> to complete a logical operation. Beginning a new call sequence for such a
> function may disrupt any other sequences of that function already in
> progress. Safe and reliable use of these functions effectively
> requires higher-level serialization beyond what is already done at the
> level of RTAS entry and exit.
>
> Where a sequence-based RTAS function is invoked only through
> sys_rtas(), with no in-kernel users, there is no issue as far as the
> kernel is concerned. User space is responsible for appropriately
> serializing its call sequences. (Whether user space code actually
> takes measures to prevent sequence interleaving is another matter.)
> Examples of such functions currently include ibm,platform-dump and
> ibm,get-vpd.
>
> But where a sequence-based RTAS function has both user space and
> in-kernel uesrs, there is a hazard. Even if the in-kernel call sites
> of such a function serialize their sequences correctly, a user of
> sys_rtas() can invoke the same function at any time, potentially
> disrupting a sequence in progress.
>
> So in order to prevent disruption of kernel-based RTAS call sequences,
> they must serialize not only with themselves but also with sys_rtas()
> users, somehow. Preferably without adding global locks or adding more
> function-specific hacks to sys_rtas(). This is a prerequisite for
> adding an in-kernel call sequence of ibm,get-vpd, which is in a change
> to follow.
>
> Note that it has never been feasible for the kernel to prevent
> sys_rtas()-based sequences from being disrupted because control
> returns to user space on every call. sys_rtas()-based users of these
> functions have always been, and continue to be, responsible for
> coordinating their call sequences with other users, even those which
> may invoke the RTAS functions through less direct means than
> sys_rtas(). This is an unavoidable consequence of exposing
> sequence-based RTAS functions through sys_rtas().
>
> * Add new rtas_function_lock() and rtas_function_unlock() APIs for use
> with sequence-based RTAS functions.
>
> * Add an optional per-function mutex to struct rtas_function. When this
> member is set, kernel-internal callers of the RTAS function are
> required to guard their call sequences with rtas_function_lock() and
> rtas_function_unlock(). This requirement will be enforced in a later
> change, after all affected call sites are updated.
>
> * Populate the lock members of function table entries where
> serialization of call sequences is known to be necessary, along with
> justifying commentary.
>
> * In sys_rtas(), acquire the per-function mutex when it is present.
>
> There should be no perceivable change introduced here except that
> concurrent callers of the same RTAS function via sys_rtas() may block
> on a mutex instead of spinning on rtas_lock. Changes to follow will
> add rtas_function_lock()/unlock() pairs to kernel-based call
> sequences.
>
Can you add an example of the last part. I did look at to find 06 to
find the details
rtas_function_lock(RTAS_FN_IBM_ACTIVATE_FIRMWARE);
do {
fwrc = rtas_call(token, 0, 1, NULL);
} while (rtas_busy_delay(fwrc));
rtas_function_unlock(RTAS_FN_IBM_ACTIVATE_FIRMWARE);
Reviewed-by: Aneesh Kumar K.V (IBM) <aneesh.kumar@kernel.org>
>
> Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
> ---
> arch/powerpc/include/asm/rtas.h | 2 +
> arch/powerpc/kernel/rtas.c | 101 +++++++++++++++++++++++++++++++++++++++-
> 2 files changed, 101 insertions(+), 2 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/rtas.h b/arch/powerpc/include/asm/rtas.h
> index b73010583a8d..9a20caba6858 100644
> --- a/arch/powerpc/include/asm/rtas.h
> +++ b/arch/powerpc/include/asm/rtas.h
> @@ -421,6 +421,8 @@ static inline bool rtas_function_implemented(const rtas_fn_handle_t handle)
> {
> return rtas_function_token(handle) != RTAS_UNKNOWN_SERVICE;
> }
> +void rtas_function_lock(rtas_fn_handle_t handle);
> +void rtas_function_unlock(rtas_fn_handle_t handle);
> extern int rtas_token(const char *service);
> extern int rtas_service_present(const char *service);
> extern int rtas_call(int token, int, int, int *, ...);
> diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
> index 1fc0b3fffdd1..52f2242d0c28 100644
> --- a/arch/powerpc/kernel/rtas.c
> +++ b/arch/powerpc/kernel/rtas.c
> @@ -18,6 +18,7 @@
> #include <linux/kernel.h>
> #include <linux/lockdep.h>
> #include <linux/memblock.h>
> +#include <linux/mutex.h>
> #include <linux/of.h>
> #include <linux/of_fdt.h>
> #include <linux/reboot.h>
> @@ -70,14 +71,34 @@ struct rtas_filter {
> * ppc64le, and we want to keep it that way. It does
> * not make sense for this to be set when @filter
> * is NULL.
> + * @lock: Pointer to an optional dedicated per-function mutex. This
> + * should be set for functions that require multiple calls in
> + * sequence to complete a single operation, and such sequences
> + * will disrupt each other if allowed to interleave. Users of
> + * this function are required to hold the associated lock for
> + * the duration of the call sequence. Add an explanatory
> + * comment to the function table entry if setting this member.
> */
> struct rtas_function {
> s32 token;
> const bool banned_for_syscall_on_le:1;
> const char * const name;
> const struct rtas_filter *filter;
> + struct mutex *lock;
> };
>
> +/*
> + * Per-function locks. Access these through the
> + * rtas_function_lock/unlock APIs, not directly.
> + */
> +static DEFINE_MUTEX(rtas_ibm_activate_firmware_lock);
> +static DEFINE_MUTEX(rtas_ibm_get_dynamic_sensor_state_lock);
> +static DEFINE_MUTEX(rtas_ibm_get_indices_lock);
> +static DEFINE_MUTEX(rtas_ibm_get_vpd_lock);
> +static DEFINE_MUTEX(rtas_ibm_lpar_perftools_lock);
> +static DEFINE_MUTEX(rtas_ibm_physical_attestation_lock);
> +static DEFINE_MUTEX(rtas_ibm_set_dynamic_indicator_lock);
> +
> static struct rtas_function rtas_function_table[] __ro_after_init = {
> [RTAS_FNIDX__CHECK_EXCEPTION] = {
> .name = "check-exception",
> @@ -125,6 +146,13 @@ static struct rtas_function rtas_function_table[] __ro_after_init = {
> .buf_idx1 = -1, .size_idx1 = -1,
> .buf_idx2 = -1, .size_idx2 = -1,
> },
> + /*
> + * PAPR doesn't explicitly impose any restriction, but
> + * this typically requires multiple calls before
> + * success, and there's no reason to allow sequences
> + * to interleave.
> + */
> + .lock = &rtas_ibm_activate_firmware_lock,
> },
> [RTAS_FNIDX__IBM_CBE_START_PTCAL] = {
> .name = "ibm,cbe-start-ptcal",
> @@ -196,6 +224,12 @@ static struct rtas_function rtas_function_table[] __ro_after_init = {
> .buf_idx1 = 1, .size_idx1 = -1,
> .buf_idx2 = -1, .size_idx2 = -1,
> },
> + /*
> + * PAPR+ R1–7.3.19–3 is explicit that the OS must not
> + * call ibm,get-dynamic-sensor-state with different
> + * inputs until a non-retry status has been returned.
> + */
> + .lock = &rtas_ibm_get_dynamic_sensor_state_lock,
> },
> [RTAS_FNIDX__IBM_GET_INDICES] = {
> .name = "ibm,get-indices",
> @@ -203,6 +237,12 @@ static struct rtas_function rtas_function_table[] __ro_after_init = {
> .buf_idx1 = 2, .size_idx1 = 3,
> .buf_idx2 = -1, .size_idx2 = -1,
> },
> + /*
> + * PAPR+ R1–7.3.17–2 says that the OS must not
> + * interleave ibm,get-indices call sequences with
> + * different inputs.
> + */
> + .lock = &rtas_ibm_get_indices_lock,
> },
> [RTAS_FNIDX__IBM_GET_RIO_TOPOLOGY] = {
> .name = "ibm,get-rio-topology",
> @@ -220,6 +260,11 @@ static struct rtas_function rtas_function_table[] __ro_after_init = {
> .buf_idx1 = 0, .size_idx1 = -1,
> .buf_idx2 = 1, .size_idx2 = 2,
> },
> + /*
> + * PAPR+ R1–7.3.20–4 indicates that sequences
> + * should not be allowed to interleave.
> + */
> + .lock = &rtas_ibm_get_vpd_lock,
> },
> [RTAS_FNIDX__IBM_GET_XIVE] = {
> .name = "ibm,get-xive",
> @@ -239,6 +284,11 @@ static struct rtas_function rtas_function_table[] __ro_after_init = {
> .buf_idx1 = 2, .size_idx1 = 3,
> .buf_idx2 = -1, .size_idx2 = -1,
> },
> + /*
> + * PAPR+ R1–7.3.26–6 says the OS should allow only one
> + * call sequence in progress at a time.
> + */
> + .lock = &rtas_ibm_lpar_perftools_lock,
> },
> [RTAS_FNIDX__IBM_MANAGE_FLASH_IMAGE] = {
> .name = "ibm,manage-flash-image",
> @@ -277,6 +327,14 @@ static struct rtas_function rtas_function_table[] __ro_after_init = {
> .buf_idx1 = 0, .size_idx1 = 1,
> .buf_idx2 = -1, .size_idx2 = -1,
> },
> + /*
> + * This follows a sequence-based pattern similar to
> + * ibm,get-vpd et al. Since PAPR+ restricts
> + * interleaving call sequences for other functions of
> + * this style, assume the restriction applies here,
> + * even though it's not explicit in the spec.
> + */
> + .lock = &rtas_ibm_physical_attestation_lock,
> },
> [RTAS_FNIDX__IBM_PLATFORM_DUMP] = {
> .name = "ibm,platform-dump",
> @@ -284,6 +342,13 @@ static struct rtas_function rtas_function_table[] __ro_after_init = {
> .buf_idx1 = 4, .size_idx1 = 5,
> .buf_idx2 = -1, .size_idx2 = -1,
> },
> + /*
> + * PAPR+ 7.3.3.4.1 indicates that concurrent sequences
> + * of ibm,platform-dump are allowed if they are
> + * operating on different dump tags. So leave
> + * the lock pointer unset for now. This may need
> + * reconsideration if kernel-internal users appear.
> + */
> },
> [RTAS_FNIDX__IBM_POWER_OFF_UPS] = {
> .name = "ibm,power-off-ups",
> @@ -326,6 +391,12 @@ static struct rtas_function rtas_function_table[] __ro_after_init = {
> .buf_idx1 = 2, .size_idx1 = -1,
> .buf_idx2 = -1, .size_idx2 = -1,
> },
> + /*
> + * PAPR+ R1–7.3.18–3 says the OS must not call this
> + * function with different inputs until a non-retry
> + * status has been returned.
> + */
> + .lock = &rtas_ibm_set_dynamic_indicator_lock,
> },
> [RTAS_FNIDX__IBM_SET_EEH_OPTION] = {
> .name = "ibm,set-eeh-option",
> @@ -556,9 +627,9 @@ static int __init rtas_token_to_function_xarray_init(void)
> }
> arch_initcall(rtas_token_to_function_xarray_init);
>
> -static const struct rtas_function *rtas_token_to_function(s32 token)
> +static struct rtas_function *rtas_token_to_function(s32 token)
> {
> - const struct rtas_function *func;
> + struct rtas_function *func;
>
> if (WARN_ONCE(token < 0, "invalid token %d", token))
> return NULL;
> @@ -581,6 +652,28 @@ static const struct rtas_function *rtas_token_to_function(s32 token)
> return NULL;
> }
>
> +static void __rtas_function_lock(struct rtas_function *func)
> +{
> + if (func && func->lock)
> + mutex_lock(func->lock);
> +}
> +
> +static void __rtas_function_unlock(struct rtas_function *func)
> +{
> + if (func && func->lock)
> + mutex_unlock(func->lock);
> +}
> +
> +void rtas_function_lock(const rtas_fn_handle_t handle)
> +{
> + __rtas_function_lock(rtas_function_lookup(handle));
> +}
> +
> +void rtas_function_unlock(const rtas_fn_handle_t handle)
> +{
> + __rtas_function_unlock(rtas_function_lookup(handle));
> +}
> +
> /* This is here deliberately so it's only used in this file */
> void enter_rtas(unsigned long);
>
> @@ -1885,6 +1978,8 @@ SYSCALL_DEFINE1(rtas, struct rtas_args __user *, uargs)
>
> buff_copy = get_errorlog_buffer();
>
> + __rtas_function_lock(rtas_token_to_function(token));
> +
> raw_spin_lock_irqsave(&rtas_lock, flags);
> cookie = lockdep_pin_lock(&rtas_lock);
>
> @@ -1900,6 +1995,8 @@ SYSCALL_DEFINE1(rtas, struct rtas_args __user *, uargs)
> lockdep_unpin_lock(&rtas_lock, cookie);
> raw_spin_unlock_irqrestore(&rtas_lock, flags);
>
> + __rtas_function_unlock(rtas_token_to_function(token));
> +
> if (buff_copy) {
> if (errbuf)
> log_error(errbuf, ERR_TYPE_RTAS_LOG, 0);
>
> --
> 2.41.0
next prev parent reply other threads:[~2023-11-20 20:21 UTC|newest]
Thread overview: 58+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-11-18 5:14 [PATCH v4 00/13] powerpc/pseries: New character devices for system parameters and VPD Nathan Lynch
2023-11-18 5:14 ` Nathan Lynch via B4 Relay
2023-11-18 5:14 ` [PATCH v4 01/13] powerpc/rtas: Add for_each_rtas_function() iterator Nathan Lynch
2023-11-18 5:14 ` Nathan Lynch via B4 Relay
2023-11-20 8:07 ` Aneesh Kumar K.V
2023-11-18 5:14 ` [PATCH v4 02/13] powerpc/rtas: Fall back to linear search on failed token->function lookup Nathan Lynch
2023-11-18 5:14 ` Nathan Lynch via B4 Relay
2023-11-20 8:07 ` Aneesh Kumar K.V
2023-11-18 5:14 ` [PATCH v4 03/13] powerpc/rtas: Add function return status constants Nathan Lynch
2023-11-18 5:14 ` Nathan Lynch via B4 Relay
2023-11-20 8:08 ` Aneesh Kumar K.V
2023-11-18 5:14 ` [PATCH v4 04/13] powerpc/rtas: Factor out function descriptor lookup Nathan Lynch
2023-11-18 5:14 ` Nathan Lynch via B4 Relay
2023-11-20 8:08 ` Aneesh Kumar K.V
2023-11-18 5:14 ` [PATCH v4 05/13] powerpc/rtas: Facilitate high-level call sequences Nathan Lynch
2023-11-18 5:14 ` Nathan Lynch via B4 Relay
2023-11-20 8:10 ` Aneesh Kumar K.V [this message]
2023-11-28 15:35 ` Nathan Lynch
2023-11-28 22:30 ` Michael Ellerman
2023-11-28 23:05 ` Nathan Lynch
2023-11-29 13:20 ` Michael Ellerman
2023-11-30 18:26 ` Nathan Lynch
2023-11-30 21:41 ` Nathan Lynch
2023-11-30 22:46 ` Michael Ellerman
2023-11-30 23:53 ` Nathan Lynch
2023-12-05 16:51 ` Nathan Lynch
2023-12-07 1:02 ` Michael Ellerman
2023-11-29 2:11 ` Michael Ellerman
2023-11-29 2:37 ` Nathan Lynch
2023-11-29 3:16 ` Michael Ellerman
2023-11-18 5:14 ` [PATCH v4 06/13] powerpc/rtas: Serialize firmware activation sequences Nathan Lynch
2023-11-18 5:14 ` Nathan Lynch via B4 Relay
2023-11-20 8:12 ` Aneesh Kumar K.V
2023-11-28 15:32 ` Nathan Lynch
2023-11-28 15:46 ` Aneesh Kumar K.V
2023-11-28 16:16 ` Nathan Lynch
2023-11-28 16:41 ` Nathan Lynch
2023-11-18 5:14 ` [PATCH v4 07/13] powerpc/rtas: Warn if per-function lock isn't held Nathan Lynch
2023-11-18 5:14 ` Nathan Lynch via B4 Relay
2023-11-20 8:13 ` Aneesh Kumar K.V
2023-11-18 5:14 ` [PATCH v4 08/13] powerpc/uapi: Export papr-miscdev.h header Nathan Lynch
2023-11-18 5:14 ` Nathan Lynch via B4 Relay
2023-11-18 5:14 ` [PATCH v4 09/13] powerpc/pseries: Add papr-vpd character driver for VPD retrieval Nathan Lynch
2023-11-18 5:14 ` Nathan Lynch via B4 Relay
2023-11-21 8:31 ` Michal Suchánek
2023-11-28 15:38 ` Nathan Lynch
2023-11-29 2:07 ` Michael Ellerman
2023-11-29 2:41 ` Nathan Lynch
2023-11-29 3:13 ` Michael Ellerman
2023-11-18 5:14 ` [PATCH v4 10/13] powerpc/pseries/papr-sysparm: Validate buffer object lengths Nathan Lynch
2023-11-18 5:14 ` Nathan Lynch via B4 Relay
2023-11-18 5:14 ` [PATCH v4 11/13] powerpc/pseries/papr-sysparm: Expose character device to user space Nathan Lynch
2023-11-18 5:14 ` Nathan Lynch via B4 Relay
2023-11-18 5:14 ` [PATCH v4 12/13] powerpc/selftests: Add test for papr-vpd Nathan Lynch
2023-11-18 5:14 ` Nathan Lynch via B4 Relay
2023-11-18 5:14 ` [PATCH v4 13/13] powerpc/selftests: Add test for papr-sysparm Nathan Lynch
2023-11-18 5:14 ` Nathan Lynch via B4 Relay
2023-11-29 1:08 ` Michael Ellerman
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=878r6slua9.fsf@kernel.org \
--to=aneesh.kumar@kernel.org \
--cc=devnull+nathanl.linux.ibm.com@kernel.org \
--cc=gcwilson@linux.ibm.com \
--cc=linuxppc-dev@lists.ozlabs.org \
--cc=mpe@ellerman.id.au \
--cc=msuchanek@suse.de \
--cc=nathanl@linux.ibm.com \
--cc=npiggin@gmail.com \
--cc=tyreld@linux.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.