* [PATCH 1/2] workqueue: Allow to expose ordered workqueues via sysfs
2026-02-05 11:55 [PATCH 0/2] efi: Expose the runtime-services workqueue via sysfs Sebastian Andrzej Siewior
@ 2026-02-05 11:55 ` Sebastian Andrzej Siewior
2026-02-05 13:39 ` Sebastian Andrzej Siewior
2026-02-05 11:55 ` [PATCH 2/2] efi: Allow to expose the workqueue " Sebastian Andrzej Siewior
` (2 subsequent siblings)
3 siblings, 1 reply; 12+ messages in thread
From: Sebastian Andrzej Siewior @ 2026-02-05 11:55 UTC (permalink / raw)
To: linux-efi, linux-kernel, linux-rt-devel
Cc: Luis Claudio R. Goncalves, Ard Biesheuvel, John Ogness,
Lai Jiangshan, Tejun Heo, Sebastian Andrzej Siewior
Ordered workqueues are not exposed via sysfs because the 'max_active'
attribute changes the number of active workers. More than one active worker
can break ordering guarantees.
This can be avoided by forbidding writes to the file for ordered
workqueues. Exposing it via sysfs allows altering other attributes such
as the cpumask, which selects the CPUs the worker can run on.
The 'max_active' value shouldn't be changed for BH workers because the
core never spawns additional workers and the worker itself cannot be
preempted, so changing it makes no sense.
Allow exposing ordered workqueues via sysfs if requested and forbid
changing the 'max_active' value for ordered and BH workers.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
kernel/workqueue.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 253311af47c6d..625ee8cc47f40 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -7097,6 +7097,13 @@ static ssize_t max_active_store(struct device *dev,
struct workqueue_struct *wq = dev_to_wq(dev);
int val;
+ /*
+ * Adjusting max_active breaks ordering guarantee. Changing it has no
+ * effect on BH worker.
+ */
+ if (wq->flags & (WQ_BH | __WQ_ORDERED))
+ return -EACCES;
+
if (sscanf(buf, "%d", &val) != 1 || val <= 0)
return -EINVAL;
@@ -7413,13 +7420,6 @@ int workqueue_sysfs_register(struct workqueue_struct *wq)
struct wq_device *wq_dev;
int ret;
- /*
- * Adjusting max_active breaks ordering guarantee. Disallow exposing
- * ordered workqueues.
- */
- if (WARN_ON(wq->flags & __WQ_ORDERED))
- return -EINVAL;
-
wq->wq_dev = wq_dev = kzalloc(sizeof(*wq_dev), GFP_KERNEL);
if (!wq_dev)
return -ENOMEM;
--
2.51.0
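
For readers who want to see the resulting write behaviour in isolation, here is a minimal userspace sketch of the checks in the patched max_active_store(). It is a model only, not kernel code: the function name and the MODEL_* flag bits are made up for illustration and do not match the real values in include/linux/workqueue.h.

```c
#include <errno.h>
#include <stdio.h>

/* Placeholder flag bits for this model; NOT the kernel's real values. */
#define MODEL_WQ_BH      (1u << 0)
#define MODEL_WQ_ORDERED (1u << 1)

/*
 * Userspace model of the checks in the patched max_active_store().
 * Returns the accepted value (> 0) or a negative errno.
 */
int model_max_active_store(unsigned int wq_flags, const char *buf)
{
	int val;

	/* Ordered and BH workqueues reject every write. */
	if (wq_flags & (MODEL_WQ_BH | MODEL_WQ_ORDERED))
		return -EACCES;

	/* Same parsing rule as the diff: exactly one positive integer. */
	if (sscanf(buf, "%d", &val) != 1 || val <= 0)
		return -EINVAL;

	return val;
}
```

Note that the flag check comes first, so a write to an ordered or BH workqueue fails with -EACCES even when the value itself would have been valid.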
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH 1/2] workqueue: Allow to expose ordered workqueues via sysfs
2026-02-05 11:55 ` [PATCH 1/2] workqueue: Allow to expose ordered workqueues " Sebastian Andrzej Siewior
@ 2026-02-05 13:39 ` Sebastian Andrzej Siewior
2026-02-05 21:59 ` Tejun Heo
0 siblings, 1 reply; 12+ messages in thread
From: Sebastian Andrzej Siewior @ 2026-02-05 13:39 UTC (permalink / raw)
To: linux-efi, linux-kernel, linux-rt-devel
Cc: Luis Claudio R. Goncalves, Ard Biesheuvel, John Ogness,
Lai Jiangshan, Tejun Heo, Thomas Weißschuh
On 2026-02-05 12:55:58 [+0100], To linux-efi@vger.kernel.org wrote:
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -7097,6 +7097,13 @@ static ssize_t max_active_store(struct device *dev,
> struct workqueue_struct *wq = dev_to_wq(dev);
> int val;
>
> + /*
> + * Adjusting max_active breaks ordering guarantee. Changing it has no
> + * effect on BH worker.
> + */
> + if (wq->flags & (WQ_BH | __WQ_ORDERED))
> + return -EACCES;
> +
> if (sscanf(buf, "%d", &val) != 1 || val <= 0)
> return -EINVAL;
I have been informed that instead of returning -EACCES I could do the
following and expose the max_active (and per_cpu) attribute as
read-only instead.
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 625ee8cc47f40..793b59ce99edb 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -7097,13 +7097,6 @@ static ssize_t max_active_store(struct device *dev,
struct workqueue_struct *wq = dev_to_wq(dev);
int val;
- /*
- * Adjusting max_active breaks ordering guarantee. Changing it has no
- * effect on BH worker.
- */
- if (wq->flags & (WQ_BH | __WQ_ORDERED))
- return -EACCES;
-
if (sscanf(buf, "%d", &val) != 1 || val <= 0)
return -EINVAL;
@@ -7117,7 +7110,26 @@ static struct attribute *wq_sysfs_attrs[] = {
&dev_attr_max_active.attr,
NULL,
};
-ATTRIBUTE_GROUPS(wq_sysfs);
+
+static umode_t wq_sysfs_is_visible(struct kobject *kobj, struct attribute *a, int n)
+{
+ struct device *dev = kobj_to_dev(kobj);
+ struct workqueue_struct *wq = dev_to_wq(dev);
+
+ /*
+ * Adjusting max_active breaks ordering guarantee. Changing it has no
+ * effect on BH worker. Limit max_active to RO in such case.
+ */
+ if (wq->flags & (WQ_BH | __WQ_ORDERED))
+ return 0444;
+ return a->mode;
+}
+
+static const struct attribute_group wq_sysfs_group = {
+ .is_visible = wq_sysfs_is_visible,
+ .attrs = wq_sysfs_attrs,
+};
+__ATTRIBUTE_GROUPS(wq_sysfs);
static ssize_t wq_nice_show(struct device *dev, struct device_attribute *attr,
char *buf)
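
The effect of the is_visible callback can be modelled in userspace as below. This is a sketch under assumptions: the function name and the MODEL_* flag bits are placeholders for illustration, and the real callback receives a kobject and an attribute rather than plain integers.

```c
/* Placeholder flag bits for this model; NOT the kernel's real values. */
#define MODEL_WQ_BH      (1u << 0)
#define MODEL_WQ_ORDERED (1u << 1)

/*
 * Userspace model of wq_sysfs_is_visible(): return the file mode that
 * sysfs would use for an attribute of the given workqueue.
 */
unsigned int model_wq_sysfs_is_visible(unsigned int wq_flags,
				       unsigned int attr_mode)
{
	/* Ordered and BH workqueues get every attribute of the group read-only. */
	if (wq_flags & (MODEL_WQ_BH | MODEL_WQ_ORDERED))
		return 0444;

	/* Everything else keeps the mode declared with the attribute. */
	return attr_mode;
}
```

Compared to the -EACCES variant, the restriction is visible in the file permissions themselves, so userspace can see it without attempting a write.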
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH 1/2] workqueue: Allow to expose ordered workqueues via sysfs
2026-02-05 13:39 ` Sebastian Andrzej Siewior
@ 2026-02-05 21:59 ` Tejun Heo
0 siblings, 0 replies; 12+ messages in thread
From: Tejun Heo @ 2026-02-05 21:59 UTC (permalink / raw)
To: Sebastian Andrzej Siewior
Cc: linux-efi, linux-kernel, linux-rt-devel,
Luis Claudio R. Goncalves, Ard Biesheuvel, John Ogness,
Lai Jiangshan, Thomas Weißschuh
On Thu, Feb 05, 2026 at 02:39:13PM +0100, Sebastian Andrzej Siewior wrote:
> +static umode_t wq_sysfs_is_visible(struct kobject *kobj, struct attribute *a, int n)
> +{
> + struct device *dev = kobj_to_dev(kobj);
> + struct workqueue_struct *wq = dev_to_wq(dev);
> +
> + /*
> + * Adjusting max_active breaks ordering guarantee. Changing it has no
> + * effect on BH worker. Limit max_active to RO in such case.
> + */
> + if (wq->flags & (WQ_BH | __WQ_ORDERED))
> + return 0444;
> + return a->mode;
> +}
> +
> +static const struct attribute_group wq_sysfs_group = {
> + .is_visible = wq_sysfs_is_visible,
> + .attrs = wq_sysfs_attrs,
> +};
> +__ATTRIBUTE_GROUPS(wq_sysfs);
Yeah, this looks fine to me.
Thanks.
--
tejun
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH 2/2] efi: Allow to expose the workqueue via sysfs
2026-02-05 11:55 [PATCH 0/2] efi: Expose the runtime-services workqueue via sysfs Sebastian Andrzej Siewior
2026-02-05 11:55 ` [PATCH 1/2] workqueue: Allow to expose ordered workqueues " Sebastian Andrzej Siewior
@ 2026-02-05 11:55 ` Sebastian Andrzej Siewior
2026-02-09 15:17 ` [PATCH 0/2] efi: Expose the runtime-services " Luis Claudio R. Goncalves
2026-02-09 17:10 ` Ard Biesheuvel
3 siblings, 0 replies; 12+ messages in thread
From: Sebastian Andrzej Siewior @ 2026-02-05 11:55 UTC (permalink / raw)
To: linux-efi, linux-kernel, linux-rt-devel
Cc: Luis Claudio R. Goncalves, Ard Biesheuvel, John Ogness,
Lai Jiangshan, Tejun Heo, Sebastian Andrzej Siewior
Exposing the efi_rts_wq workqueue via sysfs provides an easy mechanism
to restrict EFI firmware invocation to certain CPU(s).
This can be used to restrict EFI invocations to specific CPUs while
allowing other workqueues to use the remaining CPUs.
Expose the workqueue via sysfs. Change the name to efi_runtime which is
what will be visible under sysfs.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
drivers/firmware/efi/efi.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c
index 17b5f3415465e..96c47dbb743a7 100644
--- a/drivers/firmware/efi/efi.c
+++ b/drivers/firmware/efi/efi.c
@@ -423,7 +423,7 @@ static int __init efisubsys_init(void)
* ordered workqueue (which creates only one execution context)
* should suffice for all our needs.
*/
- efi_rts_wq = alloc_ordered_workqueue("efi_rts_wq", 0);
+ efi_rts_wq = alloc_ordered_workqueue("efi_runtime", WQ_SYSFS);
if (!efi_rts_wq) {
pr_err("Creating efi_rts_wq failed, EFI runtime services disabled.\n");
clear_bit(EFI_RUNTIME_SERVICES, &efi.flags);
--
2.51.0
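
As a rough illustration of how the new file could be driven from userspace, the sketch below builds a hex CPU mask and writes it to a cpumask file such as /sys/devices/virtual/workqueue/efi_runtime/cpumask. The helper names are hypothetical, the code assumes a kernel with this series applied, and for brevity it only handles CPUs 0..31 (a single 32-bit mask word).

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Build a 32-bit CPU mask from a list of CPU numbers.
 * Returns 0 if the list is empty or any CPU is outside 0..31.
 */
uint32_t cpu_list_to_mask32(const int *cpus, size_t n)
{
	uint32_t mask = 0;

	for (size_t i = 0; i < n; i++) {
		if (cpus[i] < 0 || cpus[i] > 31)
			return 0;
		mask |= UINT32_C(1) << cpus[i];
	}
	return mask;
}

/*
 * Write the mask in hex to a workqueue cpumask file.
 * Returns 0 on success or a negative errno.
 */
int write_wq_cpumask(const char *path, uint32_t mask)
{
	FILE *f;
	int ret = 0;

	if (mask == 0)
		return -EINVAL;

	f = fopen(path, "w");
	if (!f)
		return -errno;

	if (fprintf(f, "%x\n", (unsigned int)mask) < 0)
		ret = -EIO;
	/* The write is buffered, so a rejected mask may only show up here. */
	if (fclose(f) != 0 && ret == 0)
		ret = -errno;

	return ret;
}
```

For example, cpu_list_to_mask32((int[]){4}, 1) yields 0x10, which would confine the workqueue to CPU4, provided the kernel accepts that mask.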
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH 0/2] efi: Expose the runtime-services workqueue via sysfs
2026-02-05 11:55 [PATCH 0/2] efi: Expose the runtime-services workqueue via sysfs Sebastian Andrzej Siewior
2026-02-05 11:55 ` [PATCH 1/2] workqueue: Allow to expose ordered workqueues " Sebastian Andrzej Siewior
2026-02-05 11:55 ` [PATCH 2/2] efi: Allow to expose the workqueue " Sebastian Andrzej Siewior
@ 2026-02-09 15:17 ` Luis Claudio R. Goncalves
2026-02-09 15:55 ` Sebastian Andrzej Siewior
2026-02-09 17:10 ` Ard Biesheuvel
3 siblings, 1 reply; 12+ messages in thread
From: Luis Claudio R. Goncalves @ 2026-02-09 15:17 UTC (permalink / raw)
To: Sebastian Andrzej Siewior
Cc: linux-efi, linux-kernel, linux-rt-devel, Ard Biesheuvel,
John Ogness, Lai Jiangshan, Tejun Heo
On Thu, Feb 05, 2026 at 12:55:57PM +0100, Sebastian Andrzej Siewior wrote:
> EFI runtime services are disabled on PREEMPT_RT by default which can be
> overwritten on the boot command line. For native EFI, an invocation
> requires to disable preemption while a call is made into EFI. The time
> spent in EFI is not deterministic and depends on SW and HW of the
> system.
> While accessing the efi-rtc device can be avoided by using a native
> driver, accessing the "variables" is important and there is no second
> path.
>
> The "runtime-wrappers" is wrapping access to the EFI callback via a
> workqueue. On a SMP system one CPU could be declared as housekeeping/
> not-realtime-capable and force all EFI invocation to be performed on
> this CPU. This could be achieved by setting workqueue.unbound_cpus or
> /sys/devices/virtual/workqueue/cpumask
>
> at runtime. This however will affect all workqueues and might not be
> desired. With an explicit setting such as
> /sys/devices/virtual/workqueue/efi_runtime/cpumask
>
> it looks like an official way to limit the CPUs involved here.
>
> With this in place I was wondering if EFI_DISABLE_RUNTIME could be
> lifted at runtime on SMP systems. But given the unbound_cpus option
> and the auto-config based on NOHZ-full it might not be wise to add yet
> another smart option here. Also it needs to be a subset of root cpumask
> or it won't be effective.
I ran tests on two aarch64 boxes and two x86_64 boxes. I ran timerlat on a
set of isolated CPUs (10-30) and serviced the EFI Runtime requests on CPU4.
During the tests I ran operations such as df, efibootmgr, read individual
EFIvars and performed read/write ops to the boxes using efi-rtc. All that,
at regular intervals during the test duration.
I had previously checked the interference/latency generated by these
operations on each box, so I knew what to look for in terms of expected
latency spikes.
On the aarch64 boxes the impact of the EFI-related requests was confined to
CPU4, as expected, and no apparent noise was recorded on the isolated CPUs.
On one of the x86_64 boxes the noise also seemed to be contained to CPU4.
The other box gave me the impression that SMIs were being triggered by the
EFI runtime requests and that was affecting all the CPUs. I will explore
both x86_64 cases a bit more, but I am more than satisfied with the results
of the proposed patches.
Sebastian, as for the TEE feature you mentioned, is there a specific test I
should run? Or is there any test you would like me to run in the context of
this change?
In any case,
Tested-by: Luis Claudio R. Goncalves <lgoncalv@redhat.com>
> There are two EFI invocations which are not covered by this
> - mixed EFI
> Used on x86 with 64bit kernel but 32bit EFI. Would it work to use here
> the same workqueue mechanism?
>
> - TEE / ARM secure monitor
> If I understand this right then TEE invokes the secure monitor which
> is preemptible. So an interrupt will interrupt and enter "normal"
> world immediately and could wake a user task. The following context
> switch will not happen because the return from interrupt path goes
> back to the secure monitor/ TEE.
> If so, or if TEE may disable interrupts from normal world, would it
> make sense to use a wrapper here, too?
>
> Any comments or things I have missed?
>
> Sebastian
>
> Sebastian Andrzej Siewior (2):
> workqueue: Allow to expose ordered workqueues via sysfs
> efi: Allow to expose the workqueue via sysfs
>
> drivers/firmware/efi/efi.c | 2 +-
> kernel/workqueue.c | 14 +++++++-------
> 2 files changed, 8 insertions(+), 8 deletions(-)
>
> --
> 2.51.0
>
---end quoted text---
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 0/2] efi: Expose the runtime-services workqueue via sysfs
2026-02-09 15:17 ` [PATCH 0/2] efi: Expose the runtime-services " Luis Claudio R. Goncalves
@ 2026-02-09 15:55 ` Sebastian Andrzej Siewior
2026-02-12 7:09 ` Ilias Apalodimas
0 siblings, 1 reply; 12+ messages in thread
From: Sebastian Andrzej Siewior @ 2026-02-09 15:55 UTC (permalink / raw)
To: Luis Claudio R. Goncalves
Cc: linux-efi, linux-kernel, linux-rt-devel, Ard Biesheuvel,
John Ogness, Lai Jiangshan, Tejun Heo
On 2026-02-09 12:17:35 [-0300], Luis Claudio R. Goncalves wrote:
> Sebastian, as for the TEE feature you mentioned, is there specific test I
> should run? Or is there any test you would like me to run in the context of
> this change?
Puh.
If you have a TEE environment, then the EFI interface should be
"supplied" by the TEE instead of the runtime-wrappers. My guess is that
tee_get_variable() would be used instead and here the workqueue won't be
used (I think). So that is the easy part.
What I don't know is if this is a problem, i.e. is it possible to
interrupt the secure monitor and continue in Linux before heading back
to the secure environment or not.
Could you check how long the next variable and RTC calls take and, if the
time is noticeable, whether it shows up in cyclictest or not?
So if the EFI-TEE-RTC-callback always takes >1ms and you don't see this
in cyclictest as a spike then it should be good.
Sebastian
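
One possible way to get the per-call duration is sketched below: time a single read of a file with CLOCK_MONOTONIC and compare the result against the cyclictest output. The helper names are hypothetical, and the assumption is that the path points at an EFI variable file (for instance one under /sys/firmware/efi/efivars/) so that the read ends up in the firmware or TEE call.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Difference end - start in nanoseconds. */
int64_t timespec_diff_ns(struct timespec start, struct timespec end)
{
	return (int64_t)(end.tv_sec - start.tv_sec) * 1000000000LL +
	       (end.tv_nsec - start.tv_nsec);
}

/*
 * Time one complete read of a file.
 * Returns the elapsed time in nanoseconds, or -1 if the file cannot be
 * opened.
 */
int64_t time_file_read_ns(const char *path)
{
	struct timespec t0, t1;
	char buf[4096];
	FILE *f;

	clock_gettime(CLOCK_MONOTONIC, &t0);
	f = fopen(path, "rb");
	if (!f)
		return -1;
	/* Drain the file; only the time spent matters, not the content. */
	while (fread(buf, 1, sizeof(buf), f) > 0)
		;
	fclose(f);
	clock_gettime(CLOCK_MONOTONIC, &t1);

	return timespec_diff_ns(t0, t1);
}
```

The measured time includes the open and close overhead, so it is an upper bound on the call itself rather than an exact figure.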
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 0/2] efi: Expose the runtime-services workqueue via sysfs
2026-02-09 15:55 ` Sebastian Andrzej Siewior
@ 2026-02-12 7:09 ` Ilias Apalodimas
2026-02-12 16:20 ` Sebastian Andrzej Siewior
0 siblings, 1 reply; 12+ messages in thread
From: Ilias Apalodimas @ 2026-02-12 7:09 UTC (permalink / raw)
To: Sebastian Andrzej Siewior
Cc: Luis Claudio R. Goncalves, linux-efi, linux-kernel,
linux-rt-devel, Ard Biesheuvel, John Ogness, Lai Jiangshan,
Tejun Heo
Hi Sebastian,
Late to the party but ...
On Mon, 9 Feb 2026 at 17:55, Sebastian Andrzej Siewior
<bigeasy@linutronix.de> wrote:
>
> On 2026-02-09 12:17:35 [-0300], Luis Claudio R. Goncalves wrote:
> > Sebastian, as for the TEE feature you mentioned, is there specific test I
> > should run? Or is there any test you would like me to run in the context of
> > this change?
>
> Puh.
> If you have a TEE environment, then the EFI interface should be
> "supplied" the TEE instead the runtime-wrappers. My guess is that
> tee_get_variable() would be used instead and here the workqueue won't be
> used (I think). So that is the easy part.
>
> What I don't know is if this is a problem, i.e. is it possible to
> interrupt the secure monitor and continue in Linux before heading back
> to the secure environment or not.
In theory yes. In practice, at least for arm & OP-TEE, the
communication between the TEE and the secure-world app doing the
variable checks & authentication is via the MM protocol [0].
IIRC that requires to run to completion. So what happens is that you
enter OP-TEE and right before the StMM is invoked (the app that
handles EFI variables) all exceptions are masked and it must run to
completion.
The period of masking does not include writing the variables to
storage. That's handled differently and is interruptible.
> If you could check how long you end up in the next variable and RTC call
> and if the time is noticeable, do you see it in cyclictest or not.
> So if the EFI-TEE-RTC-callback takes always >1ms and you don't see this
> in cyclictest as a spike then it should be good.
>
> Sebastian
>
[0] https://documentation-service.arm.com/static/5ed11e40ca06a95ce53f905c
Cheers
/Ilias
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 0/2] efi: Expose the runtime-services workqueue via sysfs
2026-02-12 7:09 ` Ilias Apalodimas
@ 2026-02-12 16:20 ` Sebastian Andrzej Siewior
2026-02-13 6:28 ` Ilias Apalodimas
0 siblings, 1 reply; 12+ messages in thread
From: Sebastian Andrzej Siewior @ 2026-02-12 16:20 UTC (permalink / raw)
To: Ilias Apalodimas
Cc: Luis Claudio R. Goncalves, linux-efi, linux-kernel,
linux-rt-devel, Ard Biesheuvel, John Ogness, Lai Jiangshan,
Tejun Heo
On 2026-02-12 09:09:51 [+0200], Ilias Apalodimas wrote:
> Hi Sebastian,
Hi Ilias,
> Late to the party but ...
glad to have you.
> On Mon, 9 Feb 2026 at 17:55, Sebastian Andrzej Siewior
> > What I don't know is if this is a problem, i.e. is it possible to
> > interrupt the secure monitor and continue in Linux before heading back
> > to the secure environment or not.
>
> In theory yes. In practice, at least for arm & OP-TEE, the
> communication between the TEE and the secure-world app doing the
> variable checks & authentication is via the MM protocol [0].
> IIRC that requires to run to completion. So what happens is that you
> enter OP-TEE and right before the StMM is invoked (the app that
> handles EFI variables) all exceptions are masked and it must run to
> completion.
> The period of masking does not include writing the variables to
> storage. That's handled differently and is interruptible.
There is RTC and variables, which are the most common things. If you can
somehow outsource variable read/write then fine, but I guess you need to
wait somehow to ensure the data is written. Anyway.
That referenced document describes the protocol but not the
implementation of how communication works. What I found is that most
interfaces in the TEE world end up either in "SMCCC_1_2 hvc" or
"SMCCC_1_2 smc". The smc command in terms of arguments is described in
https://documentation-service.arm.com/static/5f8ea482f86e16515cdbe3c6
but it does not say if the interrupts are masked. I would assume that it
transfers the execution control to the secure monitor which is then
entered with disabled interrupts similar to an exception on the linux
side. In that case it would mandate a workqueue kind of solution so it
can be pinned to a CPU.
The only exception here seems to be the amdtee driver
(psp_tee_process_cmd()) which sends a command and waits for an answer.
Sebastian
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 0/2] efi: Expose the runtime-services workqueue via sysfs
2026-02-12 16:20 ` Sebastian Andrzej Siewior
@ 2026-02-13 6:28 ` Ilias Apalodimas
0 siblings, 0 replies; 12+ messages in thread
From: Ilias Apalodimas @ 2026-02-13 6:28 UTC (permalink / raw)
To: Sebastian Andrzej Siewior
Cc: Luis Claudio R. Goncalves, linux-efi, linux-kernel,
linux-rt-devel, Ard Biesheuvel, John Ogness, Lai Jiangshan,
Tejun Heo
Hi Sebastian,
On Thu, 12 Feb 2026 at 18:20, Sebastian Andrzej Siewior
<bigeasy@linutronix.de> wrote:
>
> On 2026-02-12 09:09:51 [+0200], Ilias Apalodimas wrote:
> > Hi Sebastian,
> Hi Ilias,
>
> > Late to the party but ...
>
> glad to have you.
>
> > On Mon, 9 Feb 2026 at 17:55, Sebastian Andrzej Siewior
> > > What I don't know is if this is a problem, i.e. is it possible to
> > > interrupt the secure monitor and continue in Linux before heading back
> > > to the secure environment or not.
> >
> > In theory yes. In practice, at least for arm & OP-TEE, the
> > communication between the TEE and the secure-world app doing the
> > variable checks & authentication is via the MM protocol [0].
> > IIRC that requires to run to completion. So what happens is that you
> > enter OP-TEE and right before the StMM is invoked (the app that
> > handles EFI variables) all exceptions are masked and it must run to
> > completion.
> > The period of masking does not include writing the variables to
> > storage. That's handled differently and is interruptible.
>
> There it RTC and variables which is the most common thing. If you can
> somehow outsource variable read/ write then fine but I guess you need to
> wait somehow to ensure the data is written. Anyway.
>
Yes, the variables are processed with exceptions disabled, but the
actual writing to the RPMB runs as OP-TEE RPCs (remote procedure calls)
which can be interrupted.
> That referenced document describes the protocol but not the
> implementation of how communication works. What I found is that most
> interfaces in the TEE world end up either in "SMCCC_1_2 hvc" or
> "SMCCC_1_2 smc". The smc command in terms of arguments is described in
> https://documentation-service.arm.com/static/5f8ea482f86e16515cdbe3c6
>
> but it does not say if the interrupts are masked.
It's a bit cryptic indeed. It doesn't specifically mandate it, but the
chapter 2 introduction says
"A description of how MM services can be invoked asynchronously is
beyond the scope of this specification". So we tend to keep them
disabled. But as I said I am pretty sure keeping them enabled, if
needed, won't break anything.
> I would assume that it
> transfers the execution control to the secure monitor which is then
> entered with disabled interrupts similar to an exception on the linux
> side. In that case it would mandate a workqueue kind of solution so it
> can be pinned to a CPU.
We enter OP-TEE with exceptions enabled. It's only when we enter
the S-EL0 application that processes the variables that we mask exceptions
[0].
>
> The only exception here seems to be the amdtee driver
> (psp_tee_process_cmd()) which sends a command and waits for an answer.
>
> Sebastian
[0] https://github.com/OP-TEE/optee_os/blob/master/core/arch/arm/kernel/stmm_sp.c#L124
Regards
/Ilias
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 0/2] efi: Expose the runtime-services workqueue via sysfs
2026-02-05 11:55 [PATCH 0/2] efi: Expose the runtime-services workqueue via sysfs Sebastian Andrzej Siewior
` (2 preceding siblings ...)
2026-02-09 15:17 ` [PATCH 0/2] efi: Expose the runtime-services " Luis Claudio R. Goncalves
@ 2026-02-09 17:10 ` Ard Biesheuvel
2026-02-12 15:18 ` Sebastian Andrzej Siewior
3 siblings, 1 reply; 12+ messages in thread
From: Ard Biesheuvel @ 2026-02-09 17:10 UTC (permalink / raw)
To: Sebastian Andrzej Siewior, linux-efi, linux-kernel,
linux-rt-devel
Cc: Luis Claudio R. Goncalves, John Ogness, Lai Jiangshan, Tejun Heo
Hello Sebastian,
On Thu, 5 Feb 2026, at 12:55, Sebastian Andrzej Siewior wrote:
> EFI runtime services are disabled on PREEMPT_RT by default which can be
> overwritten on the boot command line. For native EFI, an invocation
> requires to disable preemption while a call is made into EFI.
This is no longer true on arm64 since
commit a5baf582f4c026c25a206ac121bceade926aec74
Author: Ard Biesheuvel <ardb@kernel.org>
Date: Wed Oct 15 22:56:42 2025 +0200
arm64/efi: Call EFI runtime services without disabling preemption
except for some corner cases (reboot, pstore crash dump).
> The time
> spent in EFI is not deterministic and depends on SW and HW of the
> system.
> While accessing the efi-rtc device can be avoided by using a native
> driver, accessing the "variables" is important and there is no second
> path.
>
> The "runtime-wrappers" is wrapping access to the EFI callback via a
> workqueue. On a SMP system one CPU could be declared as housekeeping/
> not-realtime-capable and force all EFI invocation to be performed on
> this CPU. This could be achieved by setting workqueue.unbound_cpus or
> /sys/devices/virtual/workqueue/cpumask
>
> at runtime. This however will affect all workqueues and might not be
> desired. With an explicit setting such as
> /sys/devices/virtual/workqueue/efi_runtime/cpumask
>
> it looks like an official way to limit the CPUs involved here.
>
> With this in place I was wondering if EFI_DISABLE_RUNTIME could be
> lifted at runtime on SMP systems. But given the unbound_cpus option
> and the auto-config based on NOHZ-full it might not be wise to add yet
> another smart option here. Also it needs to be a subset of root cpumask
> or it won't be effective.
>
> There are two EFI invocations which are not covered by this
> - mixed EFI
> Used on x86 with 64bit kernel but 32bit EFI. Would it work to use here
> the same workqueue mechanism?
>
That stuff is beyond obsolete, so I don't think it is relevant for RT.
> - TEE / ARM secure monitor
> If I understand this right then TEE invokes the secure monitor which
> is preemptible. So an interrupt will interrupt and enter "normal"
> world immediately and could wake a user task. The following context
> switch will not happen because the return from interrupt path goes
> back to the secure monitor/ TEE.
> If so, or if TEE may disable interrupts from normal world, would it
> make sense to use a wrapper here, too?
>
> Any comments or things I have missed?
>
> Sebastian
>
> Sebastian Andrzej Siewior (2):
> workqueue: Allow to expose ordered workqueues via sysfs
> efi: Allow to expose the workqueue via sysfs
>
> drivers/firmware/efi/efi.c | 2 +-
> kernel/workqueue.c | 14 +++++++-------
> 2 files changed, 8 insertions(+), 8 deletions(-)
>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 0/2] efi: Expose the runtime-services workqueue via sysfs
2026-02-09 17:10 ` Ard Biesheuvel
@ 2026-02-12 15:18 ` Sebastian Andrzej Siewior
0 siblings, 0 replies; 12+ messages in thread
From: Sebastian Andrzej Siewior @ 2026-02-12 15:18 UTC (permalink / raw)
To: Ard Biesheuvel
Cc: linux-efi, linux-kernel, linux-rt-devel,
Luis Claudio R. Goncalves, John Ogness, Lai Jiangshan, Tejun Heo
On 2026-02-09 18:10:52 [+0100], Ard Biesheuvel wrote:
> Hello Sebastian,
>
> On Thu, 5 Feb 2026, at 12:55, Sebastian Andrzej Siewior wrote:
> > EFI runtime services are disabled on PREEMPT_RT by default which can be
> > overwritten on the boot command line. For native EFI, an invocation
> > requires to disable preemption while a call is made into EFI.
>
> This is no longer true on arm64 since
>
> commit a5baf582f4c026c25a206ac121bceade926aec74
> Author: Ard Biesheuvel <ardb@kernel.org>
> Date: Wed Oct 15 22:56:42 2025 +0200
>
> arm64/efi: Call EFI runtime services without disabling preemption
>
> except for some corner cases (reboot, pstore crash dump).
Hmm. While this sounds familiar (and I think you told me that FPU usage
no longer disables preemption here, too) there is x86 for instance. Here
arch_efi_call_virt_setup() disables preemption twice (efi_fpu_begin() +
firmware_restrict_branch_speculation_start()) followed by
efi_call_virt_save_flags() where interrupts are off.
Also I don't know if the EFI implementation itself is allowed to disable
interrupts.
> > There are two EFI invocations which are not covered by this
> > - mixed EFI
> > Used on x86 with 64bit kernel but 32bit EFI. Would it work to use here
> > the same workqueue mechanism?
> >
>
> That stuff is beyond obsolete, so I don't think it is relevant for RT.
agreed.
> Acked-by: Ard Biesheuvel <ardb@kernel.org>
Oh. Thank you.
Sebastian
^ permalink raw reply [flat|nested] 12+ messages in thread