Linux ACPI
* Re: [PATCH v4 09/10] riscv: Select HAVE_ACPI_APEI required for RAS
From: Sunil V L @ 2026-06-09 14:30 UTC
  To: Himanshu Chauhan
  Cc: linux-riscv, linux-kernel, linux-acpi, linux-efi, acpica-devel,
	paul.walmsley, palmer, lenb, james.morse, tony.luck, ardb, conor,
	cleger, robert.moore, anup.patel
In-Reply-To: <20260513084325.2176952-10-himanshu.chauhan@oss.qualcomm.com>

On Wed, May 13, 2026 at 2:14 PM Himanshu Chauhan
<himanshu.chauhan@oss.qualcomm.com> wrote:
>
> Select the HAVE_ACPI_APEI option so that APEI GHES config options
> are visible.
>
> Signed-off-by: Himanshu Chauhan <himanshu.chauhan@oss.qualcomm.com>
> ---
>  arch/riscv/Kconfig | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> index d235396c4514..b94b19fb4249 100644
> --- a/arch/riscv/Kconfig
> +++ b/arch/riscv/Kconfig
> @@ -187,6 +187,7 @@ config RISCV
>         select HAVE_MOVE_PUD
>         select HAVE_PAGE_SIZE_4KB
>         select HAVE_PCI
> +       select HAVE_ACPI_APEI if ACPI
>         select HAVE_PERF_EVENTS
>         select HAVE_PERF_REGS
>         select HAVE_PERF_USER_STACK_DUMP
>
Reviewed-by: Sunil V L <sunilvl@oss.qualcomm.com>


* Re: [PATCH v3 03/17] clocksource/drivers/arm_arch_timer: Default to EL2 virtual timer when running VHE
From: Marc Zyngier @ 2026-06-09 12:15 UTC
  To: Marek Szyprowski
  Cc: linux-arm-kernel, linux-acpi, linux-kernel, devicetree,
	Lorenzo Pieralisi, Hanjun Guo, Sudeep Holla, Catalin Marinas,
	Will Deacon, Rafael J. Wysocki, Mark Rutland, Daniel Lezcano,
	Thomas Gleixner, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Chen-Yu Tsai, Jernej Skrabec, Samuel Holland, Neil Armstrong,
	Kevin Hilman, Jerome Brunet, Martin Blumenstingl, Ge Gordon,
	BST Linux Kernel Upstream Group, Jesper Nilsson, Lars Persson,
	Alim Akhtar, Ivaylo Ivanov, Frank Li, Sascha Hauer,
	Pengutronix Kernel Team, Fabio Estevam, Dinh Nguyen,
	Matthias Brugger, AngeloGioacchino Del Regno, Thierry Reding,
	Jonathan Hunter, Bjorn Andersson, Konrad Dybcio,
	Andreas Färber,
	"Yu-Chun Lin [林祐君]", Heiko Stuebner,
	Shawn Lin, Orson Zhai, Baolin Wang, Michal Simek,
	Florian Fainelli
In-Reply-To: <658bffa9-bd70-4b62-ac03-505822ba0be9@samsung.com>

On Tue, 09 Jun 2026 12:32:24 +0100,
Marek Szyprowski <m.szyprowski@samsung.com> wrote:
> 
> On 09.06.2026 12:46, Marc Zyngier wrote:
> > On Tue, 09 Jun 2026 11:35:24 +0100,
> > Marek Szyprowski <m.szyprowski@samsung.com> wrote:
> >> On 09.06.2026 12:21, Marc Zyngier wrote:
> >>> On Tue, 09 Jun 2026 11:03:21 +0100,
> >>> Marek Szyprowski <m.szyprowski@samsung.com> wrote:
> >>>> On 23.05.2026 16:02, Marc Zyngier wrote:
> >>>>> When running at EL2 with VHE enabled, the architecture provides
> >>>>> two EL2 timer/counters, dubbed physical and virtual. Apart from their
> >>>>> names, they are strictly identical.
> >>>>>
> >>>>> However, they don't get virtualised the same way, especially when
> >>>>> it comes to adding arbitrary offsets to the timers. When running as
> >>>>> a guest, the host CNTVOFF_EL2 does apply to the guest's view of
> >>>>> CNTHV*_EL2. This is not true for CNTPOFF_EL2 and CNTHP*_EL2, as
> >>>>> the architecture is broken past the first level of virtualisation
> >>>>> (it lacks some essential mechanisms to be usable, despite what
> >>>>> the ARM ARM pretends).
> >>>>>
> >>>>> This means that when running as a L2 guest hypervisor, using the
> >>>>> physical timer results in traps to L0, which are then forwarded to
> >>>>> L1 in order to emulate the offset, leading to even worse performance
> >>>>> due to massive trap amplification (the combination of register and
> >>>>> ERET trapping is absolutely lethal).
> >>>>>
> >>>>> Switch the arch timer code to using the virtual timer when running
> >>>>> in VHE by default, only using the physical timer if the interrupt
> >>>>> is not correctly described in the firmware tables (which seems
> >>>>> to be an unfortunately common case). This has no impact on
> >>>>> bare-metal, and slightly improves the situation in the virtualised
> >>>>> case.
> >>>>>
> >>>>> Signed-off-by: Marc Zyngier <maz@kernel.org>
> >>>> This patch landed recently in linux-next as commit d87773de9efe
> >>>> ("clocksource/drivers/arm_arch_timer: Default to EL2 virtual timer when
> >>>> running VHE"). In my tests I found that it breaks booting of RaspberryPi5
> >>>> board. Reverting it on top of linux-next fixes the issue. Here is a boot
> >>>> log:
> >>> Huh.
> >>>
> >>> [...]
> >>>
> >>>> arch_timer: cp15 timer running at 54.00MHz (hyp-virt).
> >>>> clocksource: arch_sys_counter: mask: 0xffffffffffffff max_cycles: 0xc743ce346, max_idle_ns: 440795203123 ns
> >>>> sched_clock: 56 bits at 54MHz, resolution 18ns, wraps every 4398046511102ns
> >>> The interrupt appears to be advertised in the DT, but doesn't seem to
> >>> fire. That's obviously not going to end well. My suspicion is that
> >>> either the interrupt isn't wired (that'd be hilariously bad), or is
> >>> left as Group-0 by the firmware (copy-paste from RPi4).
> >>>
> >>> Can you try the following hack and let me know if the kernel shouts at
> >>> you?
> >>>
> >>> Thanks,
> >>>
> >>> 	M.
> >>>
> >>> diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
> >>> index ec70c84e9f91d..d05791e6cc0db 100644
> >>> --- a/drivers/irqchip/irq-gic.c
> >>> +++ b/drivers/irqchip/irq-gic.c
> >>> @@ -213,6 +213,7 @@ static void gic_eoimode1_mask_irq(struct irq_data *d)
> >>>  static void gic_unmask_irq(struct irq_data *d)
> >>>  {
> >>>  	gic_poke_irq(d, GIC_DIST_ENABLE_SET);
> >>> +	WARN_ON(!gic_peek_irq(d, GIC_DIST_ENABLE_SET));
> >>>  }
> >>>  
> >>>  static void gic_eoi_irq(struct irq_data *d)
> >> I've applied this change, but it doesn't trigger any warning in the boot log.
> > [+ Florian]
> >
> > Huh. So that really points at the timer not being wired into the GIC,
> > Samsung style... Can you confirm that removing the EL2 virtual timer
> > from the DT results in a booting machine?
> 
> With the following diff the board boots again:

OK, so it really is the EL2 virtual timer PPI that is misbehaving.
Let's see what Florian comes up with on the status of this interrupt.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.


* Re: [PATCH v3 03/17] clocksource/drivers/arm_arch_timer: Default to EL2 virtual timer when running VHE
From: Marek Szyprowski @ 2026-06-09 11:32 UTC
  To: Marc Zyngier
  Cc: linux-arm-kernel, linux-acpi, linux-kernel, devicetree,
	Lorenzo Pieralisi, Hanjun Guo, Sudeep Holla, Catalin Marinas,
	Will Deacon, Rafael J. Wysocki, Mark Rutland, Daniel Lezcano,
	Thomas Gleixner, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Chen-Yu Tsai, Jernej Skrabec, Samuel Holland, Neil Armstrong,
	Kevin Hilman, Jerome Brunet, Martin Blumenstingl, Ge Gordon,
	BST Linux Kernel Upstream Group, Jesper Nilsson, Lars Persson,
	Alim Akhtar, Ivaylo Ivanov, Frank Li, Sascha Hauer,
	Pengutronix Kernel Team, Fabio Estevam, Dinh Nguyen,
	Matthias Brugger, AngeloGioacchino Del Regno, Thierry Reding,
	Jonathan Hunter, Bjorn Andersson, Konrad Dybcio,
	Andreas Färber, Yu-Chun Lin [林祐君],
	Heiko Stuebner, Shawn Lin, Orson Zhai, Baolin Wang, Michal Simek,
	Florian Fainelli
In-Reply-To: <86ecigt2hq.wl-maz@kernel.org>

On 09.06.2026 12:46, Marc Zyngier wrote:
> On Tue, 09 Jun 2026 11:35:24 +0100,
> Marek Szyprowski <m.szyprowski@samsung.com> wrote:
>> On 09.06.2026 12:21, Marc Zyngier wrote:
>>> On Tue, 09 Jun 2026 11:03:21 +0100,
>>> Marek Szyprowski <m.szyprowski@samsung.com> wrote:
>>>> On 23.05.2026 16:02, Marc Zyngier wrote:
>>>>> When running at EL2 with VHE enabled, the architecture provides
>>>>> two EL2 timer/counters, dubbed physical and virtual. Apart from their
>>>>> names, they are strictly identical.
>>>>>
>>>>> However, they don't get virtualised the same way, especially when
>>>>> it comes to adding arbitrary offsets to the timers. When running as
>>>>> a guest, the host CNTVOFF_EL2 does apply to the guest's view of
>>>>> CNTHV*_EL2. This is not true for CNTPOFF_EL2 and CNTHP*_EL2, as
>>>>> the architecture is broken past the first level of virtualisation
>>>>> (it lacks some essential mechanisms to be usable, despite what
>>>>> the ARM ARM pretends).
>>>>>
>>>>> This means that when running as a L2 guest hypervisor, using the
>>>>> physical timer results in traps to L0, which are then forwarded to
>>>>> L1 in order to emulate the offset, leading to even worse performance
>>>>> due to massive trap amplification (the combination of register and
>>>>> ERET trapping is absolutely lethal).
>>>>>
>>>>> Switch the arch timer code to using the virtual timer when running
>>>>> in VHE by default, only using the physical timer if the interrupt
>>>>> is not correctly described in the firmware tables (which seems
>>>>> to be an unfortunately common case). This has no impact on
>>>>> bare-metal, and slightly improves the situation in the virtualised
>>>>> case.
>>>>>
>>>>> Signed-off-by: Marc Zyngier <maz@kernel.org>
>>>> This patch landed recently in linux-next as commit d87773de9efe
>>>> ("clocksource/drivers/arm_arch_timer: Default to EL2 virtual timer when
>>>> running VHE"). In my tests I found that it breaks booting of RaspberryPi5
>>>> board. Reverting it on top of linux-next fixes the issue. Here is a boot
>>>> log:
>>> Huh.
>>>
>>> [...]
>>>
>>>> arch_timer: cp15 timer running at 54.00MHz (hyp-virt).
>>>> clocksource: arch_sys_counter: mask: 0xffffffffffffff max_cycles: 0xc743ce346, max_idle_ns: 440795203123 ns
>>>> sched_clock: 56 bits at 54MHz, resolution 18ns, wraps every 4398046511102ns
>>> The interrupt appears to be advertised in the DT, but doesn't seem to
>>> fire. That's obviously not going to end well. My suspicion is that
>>> either the interrupt isn't wired (that'd be hilariously bad), or is
>>> left as Group-0 by the firmware (copy-paste from RPi4).
>>>
>>> Can you try the following hack and let me know if the kernel shouts at
>>> you?
>>>
>>> Thanks,
>>>
>>> 	M.
>>>
>>> diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
>>> index ec70c84e9f91d..d05791e6cc0db 100644
>>> --- a/drivers/irqchip/irq-gic.c
>>> +++ b/drivers/irqchip/irq-gic.c
>>> @@ -213,6 +213,7 @@ static void gic_eoimode1_mask_irq(struct irq_data *d)
>>>  static void gic_unmask_irq(struct irq_data *d)
>>>  {
>>>  	gic_poke_irq(d, GIC_DIST_ENABLE_SET);
>>> +	WARN_ON(!gic_peek_irq(d, GIC_DIST_ENABLE_SET));
>>>  }
>>>  
>>>  static void gic_eoi_irq(struct irq_data *d)
>> I've applied this change, but it doesn't trigger any warning in the boot log.
> [+ Florian]
>
> Huh. So that really points at the timer not being wired into the GIC,
> Samsung style... Can you confirm that removing the EL2 virtual timer
> from the DT results in a booting machine?

With the following diff the board boots again:

diff --git a/arch/arm64/boot/dts/broadcom/bcm2712.dtsi b/arch/arm64/boot/dts/broadcom/bcm2712.dtsi
index 761c59d90ffc..09ff5e9959d3 100644
--- a/arch/arm64/boot/dts/broadcom/bcm2712.dtsi
+++ b/arch/arm64/boot/dts/broadcom/bcm2712.dtsi
@@ -678,8 +678,6 @@ IRQ_TYPE_LEVEL_LOW)>,
                             <GIC_PPI 11 (GIC_CPU_MASK_SIMPLE(4) |
                                          IRQ_TYPE_LEVEL_LOW)>,
                             <GIC_PPI 10 (GIC_CPU_MASK_SIMPLE(4) |
-                                         IRQ_TYPE_LEVEL_LOW)>,
-                            <GIC_PPI 12 (GIC_CPU_MASK_SIMPLE(4) |
                                          IRQ_TYPE_LEVEL_LOW)>;
        };

> Florian, can you please check whether PPI12 is actually the EL2
> virtual timer on the RPI5 SoC?

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland


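Removing the interrupt entry lets the board boot because of the
fallback rule in the quoted commit message: when running VHE, the EL2
virtual timer is used by default, and the EL2 physical timer is used
only if the virtual timer's interrupt is not described by firmware.
A minimal sketch of that rule follows, under the assumption that
"described" simply means the interrupt entry is present. The names
enum sim_timer, sim_select_timer() and sim_timer_label() are
hypothetical and are not the driver's own.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for illustration; the kernel's own enum differs. */
enum sim_timer {
	SIM_TIMER_NOT_VHE,	/* not running VHE: outside this sketch */
	SIM_TIMER_HYP_PHYS,	/* EL2 physical timer */
	SIM_TIMER_HYP_VIRT,	/* EL2 virtual timer */
};

/*
 * Pick a timer the way the commit message describes it: prefer the
 * EL2 virtual timer under VHE, fall back to the EL2 physical timer
 * when the virtual timer's interrupt is not described by firmware.
 */
enum sim_timer sim_select_timer(bool running_vhe, bool hyp_virt_irq_described)
{
	if (!running_vhe)
		return SIM_TIMER_NOT_VHE;

	return hyp_virt_irq_described ? SIM_TIMER_HYP_VIRT : SIM_TIMER_HYP_PHYS;
}

/* Human-readable label; these strings are not the kernel's log text. */
const char *sim_timer_label(enum sim_timer t)
{
	switch (t) {
	case SIM_TIMER_NOT_VHE:
		return "not VHE";
	case SIM_TIMER_HYP_PHYS:
		return "EL2 physical";
	case SIM_TIMER_HYP_VIRT:
		return "EL2 virtual";
	}
	assert(0 && "unknown timer");
	return "?";
}
```

The rule only looks at what the firmware tables say. On this board the
tables describe the interrupt yet it does not seem to fire, which is a
case the selection cannot detect on its own.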

* [PATCH] ACPI: RISC-V: Fix false warning in cpc_read_ffh() for same-CPU reads
From: wang.yechao255 @ 2026-06-09 11:28 UTC
  To: linux-acpi, linux-riscv, linux-kernel
  Cc: sunilvl, rafael, lenb, pjw, palmer, aou, alex, zhenglifeng1,
	pierre.gondois, viresh.kumar, zhanjie9

From: Wang Yechao <wang.yechao255@zte.com.cn>

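Commit 997c021abc6e ("cpufreq: CPPC: Update FIE arch_freq_scale in
ticks for non-PCC regs") changed the CPPC Frequency Invariance Engine
to read AMU counters directly from the scheduler tick for non-PCC
register spaces (like FFH), instead of deferring to a kthread. This
means cpc_read_ffh() is now called with IRQs disabled from the tick
handler, triggering the WARN_ON_ONCE(irqs_disabled()) check at the top
of the function.

When IRQs are disabled, read the register directly if the target CPU
is the current CPU, and only warn and return -EPERM for a cross-CPU
read. With IRQs enabled, keep using smp_call_function_single().

This is the same fix as commit df6e4ab654dc ("arm64: topology: Fix
false warning in counters_read_on_cpu() for same-CPU reads").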

Fixes: 997c021abc6e ("cpufreq: CPPC: Update FIE arch_freq_scale in ticks for non-PCC regs")
Signed-off-by: Wang Yechao <wang.yechao255@zte.com.cn>
---
 drivers/acpi/riscv/cppc.c | 19 ++++++++++++++-----
 1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/drivers/acpi/riscv/cppc.c b/drivers/acpi/riscv/cppc.c
index 42c1a9052470..5dce0377a8df 100644
--- a/drivers/acpi/riscv/cppc.c
+++ b/drivers/acpi/riscv/cppc.c
@@ -98,16 +98,19 @@ int cpc_read_ffh(int cpu, struct cpc_reg *reg, u64 *val)
 {
 	struct sbi_cppc_data data;

-	if (WARN_ON_ONCE(irqs_disabled()))
-		return -EPERM;
-
 	if (FFH_CPPC_TYPE(reg->address) == FFH_CPPC_SBI) {
 		if (!cppc_ext_present)
 			return -EINVAL;

 		data.reg = FFH_CPPC_SBI_REG(reg->address);

-		smp_call_function_single(cpu, sbi_cppc_read, &data, 1);
+		if (irqs_disabled()) {
+			if (WARN_ON_ONCE(cpu != smp_processor_id()))
+				return -EPERM;
+			sbi_cppc_read(&data);
+		} else {
+			smp_call_function_single(cpu, sbi_cppc_read, &data, 1);
+		}

 		*val = data.ret.value;

@@ -115,7 +118,13 @@ int cpc_read_ffh(int cpu, struct cpc_reg *reg, u64 *val)
 	} else if (FFH_CPPC_TYPE(reg->address) == FFH_CPPC_CSR) {
 		data.reg = FFH_CPPC_CSR_NUM(reg->address);

-		smp_call_function_single(cpu, cppc_ffh_csr_read, &data, 1);
+		if (irqs_disabled()) {
+			if (WARN_ON_ONCE(cpu != smp_processor_id()))
+				return -EPERM;
+			cppc_ffh_csr_read(&data);
+		} else {
+			smp_call_function_single(cpu, cppc_ffh_csr_read, &data, 1);
+		}

 		*val = data.ret.value;

-- 
2.43.5

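The control flow this patch introduces can be modelled in user space.
The sketch below is a simplified illustration, not the driver code:
struct sim_ctx, sim_counter[] and sim_read_counter() are hypothetical
stand-ins for irqs_disabled(), smp_processor_id(), the per-CPU register
and the cross-CPU call. It shows the three outcomes: a same-CPU read
with IRQs off succeeds directly, a cross-CPU read with IRQs off is
refused with -EPERM, and a read with IRQs on takes the cross-call path.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define SIM_NR_CPUS 4

/* Hypothetical stand-ins for kernel state; these are not kernel APIs. */
struct sim_ctx {
	int this_cpu;		/* models smp_processor_id() */
	bool irqs_off;		/* models irqs_disabled() */
	int cross_calls;	/* counts simulated cross-CPU calls */
};

/* One fake register value per CPU. */
uint64_t sim_counter[SIM_NR_CPUS] = { 100, 200, 300, 400 };

/*
 * Read the register of @cpu. With IRQs off only the local CPU may be
 * read, and it is read directly. With IRQs on, the read is counted as
 * a cross-call, mirroring the unchanged path in the patch.
 */
int sim_read_counter(struct sim_ctx *ctx, int cpu, uint64_t *val)
{
	assert(ctx && val);

	if (cpu < 0 || cpu >= SIM_NR_CPUS)
		return -EINVAL;

	if (ctx->irqs_off) {
		if (cpu != ctx->this_cpu)
			return -EPERM;	/* cross-CPU read is not allowed here */
		*val = sim_counter[cpu];	/* direct local read */
		return 0;
	}

	ctx->cross_calls++;		/* stands in for the cross-CPU call */
	*val = sim_counter[cpu];
	return 0;
}
```

The split exists because a synchronous cross-CPU call can deadlock when
made with interrupts disabled, whereas reading a register of the CPU
the caller is already running on needs no cross-call at all.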

* Re: [PATCH v3 03/17] clocksource/drivers/arm_arch_timer: Default to EL2 virtual timer when running VHE
From: Marc Zyngier @ 2026-06-09 10:46 UTC
  To: Marek Szyprowski
  Cc: linux-arm-kernel, linux-acpi, linux-kernel, devicetree,
	Lorenzo Pieralisi, Hanjun Guo, Sudeep Holla, Catalin Marinas,
	Will Deacon, Rafael J. Wysocki, Mark Rutland, Daniel Lezcano,
	Thomas Gleixner, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Chen-Yu Tsai, Jernej Skrabec, Samuel Holland, Neil Armstrong,
	Kevin Hilman, Jerome Brunet, Martin Blumenstingl, Ge Gordon,
	BST Linux Kernel Upstream Group, Jesper Nilsson, Lars Persson,
	Alim Akhtar, Ivaylo Ivanov, Frank Li, Sascha Hauer,
	Pengutronix Kernel Team, Fabio Estevam, Dinh Nguyen,
	Matthias Brugger, AngeloGioacchino Del Regno, Thierry Reding,
	Jonathan Hunter, Bjorn Andersson, Konrad Dybcio,
	Andreas Färber,
	"Yu-Chun Lin [林祐君]", Heiko Stuebner,
	Shawn Lin, Orson Zhai, Baolin Wang, Michal Simek,
	Florian Fainelli
In-Reply-To: <193cc406-0834-4dee-9b4a-02cdfd85e05c@samsung.com>

On Tue, 09 Jun 2026 11:35:24 +0100,
Marek Szyprowski <m.szyprowski@samsung.com> wrote:
> 
> On 09.06.2026 12:21, Marc Zyngier wrote:
> > On Tue, 09 Jun 2026 11:03:21 +0100,
> > Marek Szyprowski <m.szyprowski@samsung.com> wrote:
> >> On 23.05.2026 16:02, Marc Zyngier wrote:
> >>> When running at EL2 with VHE enabled, the architecture provides
> >>> two EL2 timer/counters, dubbed physical and virtual. Apart from their
> >>> names, they are strictly identical.
> >>>
> >>> However, they don't get virtualised the same way, especially when
> >>> it comes to adding arbitrary offsets to the timers. When running as
> >>> a guest, the host CNTVOFF_EL2 does apply to the guest's view of
> >>> CNTHV*_EL2. This is not true for CNTPOFF_EL2 and CNTHP*_EL2, as
> >>> the architecture is broken past the first level of virtualisation
> >>> (it lacks some essential mechanisms to be usable, despite what
> >>> the ARM ARM pretends).
> >>>
> >>> This means that when running as a L2 guest hypervisor, using the
> >>> physical timer results in traps to L0, which are then forwarded to
> >>> L1 in order to emulate the offset, leading to even worse performance
> >>> due to massive trap amplification (the combination of register and
> >>> ERET trapping is absolutely lethal).
> >>>
> >>> Switch the arch timer code to using the virtual timer when running
> >>> in VHE by default, only using the physical timer if the interrupt
> >>> is not correctly described in the firmware tables (which seems
> >>> to be an unfortunately common case). This has no impact on
> >>> bare-metal, and slightly improves the situation in the virtualised
> >>> case.
> >>>
> >>> Signed-off-by: Marc Zyngier <maz@kernel.org>
> >> This patch landed recently in linux-next as commit d87773de9efe
> >> ("clocksource/drivers/arm_arch_timer: Default to EL2 virtual timer when
> >> running VHE"). In my tests I found that it breaks booting of RaspberryPi5
> >> board. Reverting it on top of linux-next fixes the issue. Here is a boot
> >> log:
> > Huh.
> >
> > [...]
> >
> >> arch_timer: cp15 timer running at 54.00MHz (hyp-virt).
> >> clocksource: arch_sys_counter: mask: 0xffffffffffffff max_cycles: 0xc743ce346, max_idle_ns: 440795203123 ns
> >> sched_clock: 56 bits at 54MHz, resolution 18ns, wraps every 4398046511102ns
> > The interrupt appears to be advertised in the DT, but doesn't seem to
> > fire. That's obviously not going to end well. My suspicion is that
> > either the interrupt isn't wired (that'd be hilariously bad), or is
> > left as Group-0 by the firmware (copy-paste from RPi4).
> >
> > Can you try the following hack and let me know if the kernel shouts at
> > you?
> >
> > Thanks,
> >
> > 	M.
> >
> > diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
> > index ec70c84e9f91d..d05791e6cc0db 100644
> > --- a/drivers/irqchip/irq-gic.c
> > +++ b/drivers/irqchip/irq-gic.c
> > @@ -213,6 +213,7 @@ static void gic_eoimode1_mask_irq(struct irq_data *d)
> >  static void gic_unmask_irq(struct irq_data *d)
> >  {
> >  	gic_poke_irq(d, GIC_DIST_ENABLE_SET);
> > +	WARN_ON(!gic_peek_irq(d, GIC_DIST_ENABLE_SET));
> >  }
> >  
> >  static void gic_eoi_irq(struct irq_data *d)
> 
> I've applied this change, but it doesn't trigger any warning in the boot log.

[+ Florian]

Huh. So that really points at the timer not being wired into the GIC,
Samsung style... Can you confirm that removing the EL2 virtual timer
from the DT results in a booting machine?

Florian, can you please check whether PPI12 is actually the EL2
virtual timer on the RPI5 SoC?

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.


* Re: [PATCH v3 03/17] clocksource/drivers/arm_arch_timer: Default to EL2 virtual timer when running VHE
From: Marek Szyprowski @ 2026-06-09 10:35 UTC
  To: Marc Zyngier
  Cc: linux-arm-kernel, linux-acpi, linux-kernel, devicetree,
	Lorenzo Pieralisi, Hanjun Guo, Sudeep Holla, Catalin Marinas,
	Will Deacon, Rafael J. Wysocki, Mark Rutland, Daniel Lezcano,
	Thomas Gleixner, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Chen-Yu Tsai, Jernej Skrabec, Samuel Holland, Neil Armstrong,
	Kevin Hilman, Jerome Brunet, Martin Blumenstingl, Ge Gordon,
	BST Linux Kernel Upstream Group, Jesper Nilsson, Lars Persson,
	Alim Akhtar, Ivaylo Ivanov, Frank Li, Sascha Hauer,
	Pengutronix Kernel Team, Fabio Estevam, Dinh Nguyen,
	Matthias Brugger, AngeloGioacchino Del Regno, Thierry Reding,
	Jonathan Hunter, Bjorn Andersson, Konrad Dybcio,
	Andreas Färber, Yu-Chun Lin [林祐君],
	Heiko Stuebner, Shawn Lin, Orson Zhai, Baolin Wang, Michal Simek
In-Reply-To: <86ik7st3nh.wl-maz@kernel.org>

On 09.06.2026 12:21, Marc Zyngier wrote:
> On Tue, 09 Jun 2026 11:03:21 +0100,
> Marek Szyprowski <m.szyprowski@samsung.com> wrote:
>> On 23.05.2026 16:02, Marc Zyngier wrote:
>>> When running at EL2 with VHE enabled, the architecture provides
>>> two EL2 timer/counters, dubbed physical and virtual. Apart from their
>>> names, they are strictly identical.
>>>
>>> However, they don't get virtualised the same way, especially when
>>> it comes to adding arbitrary offsets to the timers. When running as
>>> a guest, the host CNTVOFF_EL2 does apply to the guest's view of
>>> CNTHV*_EL2. This is not true for CNTPOFF_EL2 and CNTHP*_EL2, as
>>> the architecture is broken past the first level of virtualisation
>>> (it lacks some essential mechanisms to be usable, despite what
>>> the ARM ARM pretends).
>>>
>>> This means that when running as a L2 guest hypervisor, using the
>>> physical timer results in traps to L0, which are then forwarded to
>>> L1 in order to emulate the offset, leading to even worse performance
>>> due to massive trap amplification (the combination of register and
>>> ERET trapping is absolutely lethal).
>>>
>>> Switch the arch timer code to using the virtual timer when running
>>> in VHE by default, only using the physical timer if the interrupt
>>> is not correctly described in the firmware tables (which seems
>>> to be an unfortunately common case). This has no impact on
>>> bare-metal, and slightly improves the situation in the virtualised
>>> case.
>>>
>>> Signed-off-by: Marc Zyngier <maz@kernel.org>
>> This patch landed recently in linux-next as commit d87773de9efe
>> ("clocksource/drivers/arm_arch_timer: Default to EL2 virtual timer when
>> running VHE"). In my tests I found that it breaks booting of RaspberryPi5
>> board. Reverting it on top of linux-next fixes the issue. Here is a boot
>> log:
> Huh.
>
> [...]
>
>> arch_timer: cp15 timer running at 54.00MHz (hyp-virt).
>> clocksource: arch_sys_counter: mask: 0xffffffffffffff max_cycles: 0xc743ce346, max_idle_ns: 440795203123 ns
>> sched_clock: 56 bits at 54MHz, resolution 18ns, wraps every 4398046511102ns
> The interrupt appears to be advertised in the DT, but doesn't seem to
> fire. That's obviously not going to end well. My suspicion is that
> either the interrupt isn't wired (that'd be hilariously bad), or is
> left as Group-0 by the firmware (copy-paste from RPi4).
>
> Can you try the following hack and let me know if the kernel shouts at
> you?
>
> Thanks,
>
> 	M.
>
> diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
> index ec70c84e9f91d..d05791e6cc0db 100644
> --- a/drivers/irqchip/irq-gic.c
> +++ b/drivers/irqchip/irq-gic.c
> @@ -213,6 +213,7 @@ static void gic_eoimode1_mask_irq(struct irq_data *d)
>  static void gic_unmask_irq(struct irq_data *d)
>  {
>  	gic_poke_irq(d, GIC_DIST_ENABLE_SET);
> +	WARN_ON(!gic_peek_irq(d, GIC_DIST_ENABLE_SET));
>  }
>  
>  static void gic_eoi_irq(struct irq_data *d)

I've applied this change, but it doesn't trigger any warning in the boot log.

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland



* Re: [PATCH v3 03/17] clocksource/drivers/arm_arch_timer: Default to EL2 virtual timer when running VHE
From: Marc Zyngier @ 2026-06-09 10:21 UTC
  To: Marek Szyprowski
  Cc: linux-arm-kernel, linux-acpi, linux-kernel, devicetree,
	Lorenzo Pieralisi, Hanjun Guo, Sudeep Holla, Catalin Marinas,
	Will Deacon, Rafael J. Wysocki, Mark Rutland, Daniel Lezcano,
	Thomas Gleixner, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Chen-Yu Tsai, Jernej Skrabec, Samuel Holland, Neil Armstrong,
	Kevin Hilman, Jerome Brunet, Martin Blumenstingl, Ge Gordon,
	BST Linux Kernel Upstream Group, Jesper Nilsson, Lars Persson,
	Alim Akhtar, Ivaylo Ivanov, Frank Li, Sascha Hauer,
	Pengutronix Kernel Team, Fabio Estevam, Dinh Nguyen,
	Matthias Brugger, AngeloGioacchino Del Regno, Thierry Reding,
	Jonathan Hunter, Bjorn Andersson, Konrad Dybcio,
	Andreas Färber,
	"Yu-Chun Lin [林祐君]", Heiko Stuebner,
	Shawn Lin, Orson Zhai, Baolin Wang, Michal Simek
In-Reply-To: <ea15cce1-b393-43f6-8d58-3d6f90f0c0cd@samsung.com>

On Tue, 09 Jun 2026 11:03:21 +0100,
Marek Szyprowski <m.szyprowski@samsung.com> wrote:
> 
> Dear All,
> 
> On 23.05.2026 16:02, Marc Zyngier wrote:
> > When running at EL2 with VHE enabled, the architecture provides
> > two EL2 timer/counters, dubbed physical and virtual. Apart from their
> > names, they are strictly identical.
> >
> > However, they don't get virtualised the same way, especially when
> > it comes to adding arbitrary offsets to the timers. When running as
> > a guest, the host CNTVOFF_EL2 does apply to the guest's view of
> > CNTHV*_EL2. This is not true for CNTPOFF_EL2 and CNTHP*_EL2, as
> > the architecture is broken past the first level of virtualisation
> > (it lacks some essential mechanisms to be usable, despite what
> > the ARM ARM pretends).
> >
> > This means that when running as a L2 guest hypervisor, using the
> > physical timer results in traps to L0, which are then forwarded to
> > L1 in order to emulate the offset, leading to even worse performance
> > due to massive trap amplification (the combination of register and
> > ERET trapping is absolutely lethal).
> >
> > Switch the arch timer code to using the virtual timer when running
> > in VHE by default, only using the physical timer if the interrupt
> > is not correctly described in the firmware tables (which seems
> > to be an unfortunately common case). This has no impact on
> > bare-metal, and slightly improves the situation in the virtualised
> > case.
> >
> > Signed-off-by: Marc Zyngier <maz@kernel.org>
> This patch landed recently in linux-next as commit d87773de9efe
> ("clocksource/drivers/arm_arch_timer: Default to EL2 virtual timer when
> running VHE"). In my tests I found that it breaks booting of RaspberryPi5
> board. Reverting it on top of linux-next fixes the issue. Here is a boot
> log:

Huh.

[...]

> arch_timer: cp15 timer running at 54.00MHz (hyp-virt).
> clocksource: arch_sys_counter: mask: 0xffffffffffffff max_cycles: 0xc743ce346, max_idle_ns: 440795203123 ns
> sched_clock: 56 bits at 54MHz, resolution 18ns, wraps every 4398046511102ns

The interrupt appears to be advertised in the DT, but doesn't seem to
fire. That's obviously not going to end well. My suspicion is that
either the interrupt isn't wired (that'd be hilariously bad), or is
left as Group-0 by the firmware (copy-paste from RPi4).

Can you try the following hack and let me know if the kernel shouts at
you?

Thanks,

	M.

diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
index ec70c84e9f91d..d05791e6cc0db 100644
--- a/drivers/irqchip/irq-gic.c
+++ b/drivers/irqchip/irq-gic.c
@@ -213,6 +213,7 @@ static void gic_eoimode1_mask_irq(struct irq_data *d)
 static void gic_unmask_irq(struct irq_data *d)
 {
 	gic_poke_irq(d, GIC_DIST_ENABLE_SET);
+	WARN_ON(!gic_peek_irq(d, GIC_DIST_ENABLE_SET));
 }
 
 static void gic_eoi_irq(struct irq_data *d)

-- 
Without deviation from the norm, progress is not possible.

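The hack in this message is a write-then-read-back probe: set the
enable bit, then check that it reads back as set. A bit that does not
stick is what one would expect (an assumption here, not something
stated in the thread) if the interrupt were left in Group 0 and so out
of reach of a non-secure kernel. The sketch below models a single
32-bit enable-set register in user space. struct sim_enable_reg,
sim_poke(), sim_peek() and sim_unmask_and_check() are hypothetical
names, not the irq-gic.c helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one 32-bit enable-set register. */
struct sim_enable_reg {
	uint32_t enabled;	/* enable state, one bit per interrupt */
	uint32_t writable;	/* bits the caller may change; the rest
				 * ignore writes and read back as zero */
};

/* Request that @irq be enabled; silently ignored if not writable. */
void sim_poke(struct sim_enable_reg *r, unsigned int irq)
{
	uint32_t mask = UINT32_C(1) << (irq % 32);

	r->enabled |= mask & r->writable;
}

/* Read back the enable bit of @irq as the caller would see it. */
bool sim_peek(const struct sim_enable_reg *r, unsigned int irq)
{
	uint32_t mask = UINT32_C(1) << (irq % 32);

	return (r->enabled & r->writable & mask) != 0;
}

/* Enable @irq and report whether the enable bit actually stuck. */
bool sim_unmask_and_check(struct sim_enable_reg *r, unsigned int irq)
{
	assert(r);
	sim_poke(r, irq);
	return sim_peek(r, irq);
}
```

In this model PPI 12 is interrupt ID 28, since PPIs start at ID 16.
A false return from sim_unmask_and_check() corresponds to the WARN_ON
firing. In the thread the warning did not fire, so the enable bit did
stick, which is why attention moved to whether the interrupt is wired
at all.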

* Re: [PATCH v3 03/17] clocksource/drivers/arm_arch_timer: Default to EL2 virtual timer when running VHE
From: Marek Szyprowski @ 2026-06-09 10:03 UTC
  To: Marc Zyngier, linux-arm-kernel, linux-acpi, linux-kernel,
	devicetree
  Cc: Lorenzo Pieralisi, Hanjun Guo, Sudeep Holla, Catalin Marinas,
	Will Deacon, Rafael J. Wysocki, Mark Rutland, Daniel Lezcano,
	Thomas Gleixner, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Chen-Yu Tsai, Jernej Skrabec, Samuel Holland, Neil Armstrong,
	Kevin Hilman, Jerome Brunet, Martin Blumenstingl, Ge Gordon,
	BST Linux Kernel Upstream Group, Jesper Nilsson, Lars Persson,
	Alim Akhtar, Ivaylo Ivanov, Frank Li, Sascha Hauer,
	Pengutronix Kernel Team, Fabio Estevam, Dinh Nguyen,
	Matthias Brugger, AngeloGioacchino Del Regno, Thierry Reding,
	Jonathan Hunter, Bjorn Andersson, Konrad Dybcio,
	Andreas Färber, Yu-Chun Lin [林祐君],
	Heiko Stuebner, Shawn Lin, Orson Zhai, Baolin Wang, Michal Simek
In-Reply-To: <20260523140242.586031-4-maz@kernel.org>

Dear All,

On 23.05.2026 16:02, Marc Zyngier wrote:
> When running at EL2 with VHE enabled, the architecture provides
> two EL2 timer/counters, dubbed physical and virtual. Apart from their
> names, they are strictly identical.
>
> However, they don't get virtualised the same way, especially when
> it comes to adding arbitrary offsets to the timers. When running as
> a guest, the host CNTVOFF_EL2 does apply to the guest's view of
> CNTHV*_EL2. This is not true for CNTPOFF_EL2 and CNTHP*_EL2, as
> the architecture is broken past the first level of virtualisation
> (it lacks some essential mechanisms to be usable, despite what
> the ARM ARM pretends).
>
> This means that when running as a L2 guest hypervisor, using the
> physical timer results in traps to L0, which are then forwarded to
> L1 in order to emulate the offset, leading to even worse performance
> due to massive trap amplification (the combination of register and
> ERET trapping is absolutely lethal).
>
> Switch the arch timer code to using the virtual timer when running
> in VHE by default, only using the physical timer if the interrupt
> is not correctly described in the firmware tables (which seems
> to be an unfortunately common case). This has no impact on
> bare-metal, and slightly improves the situation in the virtualised
> case.
>
> Signed-off-by: Marc Zyngier <maz@kernel.org>
This patch landed recently in linux-next as commit d87773de9efe
("clocksource/drivers/arm_arch_timer: Default to EL2 virtual timer when
running VHE"). In my tests I found that it breaks booting of the Raspberry
Pi 5 board. Reverting it on top of linux-next fixes the issue. Here is a boot
log:

Booting Linux on physical CPU 0x0000000000 [0x414fd0b1]
Linux version 7.0.0+ (m.szyprowski@AMDC4653) (aarch64-linux-gnu-gcc (Ubuntu 11.4.0-1ubuntu1~22.04.3) 11.4.0, GNU ld (GNU Binutils for Ubuntu) 2.38) #16769 SMP PREEMPT Tue Jun  9 11:57:24 CEST 2026
KASLR enabled
Machine model: Raspberry Pi 5 Model B Rev 1.0
earlycon: pl11 at MMIO 0x000000107d001000 (options '115200n8')
printk: legacy bootconsole [pl11] enabled
Reserved memory: created CMA memory pool at 0x000000003bc00000, size 64 MiB
OF: reserved mem: initialized node linux,cma, compatible id shared-dma-pool
OF: reserved mem: 0x000000003bc00000..0x000000003fbfffff (65536 KiB) map reusable linux,cma
OF: reserved mem: 0x0000000000000000..0x000000000007ffff (512 KiB) nomap non-reusable atf@0
NUMA: Faking a node at [mem 0x0000000000000000-0x00000001ffffffff]
NODE_DATA(0) allocated [mem 0x1fefe0480-0x1fefe313f]
psci: probing for conduit method from DT.
psci: PSCIv1.1 detected in firmware.
psci: Using standard PSCI v0.2 function IDs
psci: MIGRATE_INFO_TYPE not supported.
psci: SMC Calling Convention v1.2
Zone ranges:
  DMA      [mem 0x0000000000000000-0x00000000ffffffff]
  DMA32    empty
  Normal   [mem 0x0000000100000000-0x00000001ffffffff]
Movable zone start for each node
Early memory node ranges
  node   0: [mem 0x0000000000000000-0x000000000007ffff]
  node   0: [mem 0x0000000000080000-0x000000003fbfffff]
  node   0: [mem 0x0000000040000000-0x00000001ffffffff]
Initmem setup node 0 [mem 0x0000000000000000-0x00000001ffffffff]
On node 0, zone DMA: 1024 pages in unavailable ranges
percpu: Embedded 36 pages/cpu s109456 r8192 d29808 u147456
Detected PIPT I-cache on CPU0
CPU features: detected: Virtualization Host Extensions
CPU features: detected: Spectre-v4
CPU features: detected: Spectre-BHB
CPU features: kernel page table isolation forced ON by KASLR
CPU features: detected: Kernel page table isolation (KPTI)
CPU features: detected: SSBS not fully self-synchronizing
alternatives: applying boot alternatives
Kernel command line: console=ttyAMA10,115200n8 earlycon root=PARTUUID=11111111-03 rw clk_ignore_unused rootdelay=2 retain_initrd
printk: log buffer data + meta data: 131072 + 458752 = 589824 bytes
Dentry cache hash table entries: 1048576 (order: 11, 8388608 bytes, linear)
Inode-cache hash table entries: 524288 (order: 10, 4194304 bytes, linear)
software IO TLB: area num 4.
software IO TLB: mapped [mem 0x00000000fbfff000-0x00000000fffff000] (64MB)
Fallback order for Node 0: 0
Built 1 zonelists, mobility grouping on.  Total pages: 2096128
Policy zone: Normal
mem auto-init: stack:off, heap alloc:off, heap free:off
SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=4, Nodes=1
Running RCU self tests
Running RCU synchronous self tests
rcu: Preemptible hierarchical RCU implementation.
rcu:     RCU event tracing is enabled.
rcu:     RCU lockdep checking is enabled.
rcu:     RCU restricting CPUs from NR_CPUS=512 to nr_cpu_ids=4.
 Trampoline variant of Tasks RCU enabled.
 Tracing variant of Tasks RCU enabled.
rcu: RCU calculated value of scheduler-enlistment delay is 25 jiffies.
rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=4
Running RCU synchronous self tests
RCU Tasks: Setting shift to 2 and lim to 1 rcu_task_cb_adjust=1 rcu_task_cpu_ids=4.
NR_IRQS: 64, nr_irqs: 64, preallocated irqs: 0
Root IRQ handler: gic_handle_irq
GIC: Using split EOI/Deactivate mode
rcu: srcu_init: Setting srcu_struct sizes based on contention.
arch_timer: cp15 timer running at 54.00MHz (hyp-virt).
clocksource: arch_sys_counter: mask: 0xffffffffffffff max_cycles: 0xc743ce346, max_idle_ns: 440795203123 ns
sched_clock: 56 bits at 54MHz, resolution 18ns, wraps every 4398046511102ns
Console: colour dummy device 80x25
Lock dependency validator: Copyright (c) 2006 Red Hat, Inc., Ingo Molnar
... MAX_LOCKDEP_SUBCLASSES:  8
... MAX_LOCK_DEPTH:          48
... MAX_LOCKDEP_KEYS:        8192
... CLASSHASH_SIZE:          4096
... MAX_LOCKDEP_ENTRIES:     32768
... MAX_LOCKDEP_CHAINS:      65536
... CHAINHASH_SIZE:          32768
 memory used by lock dependency info: 6429 kB
 memory used for stack traces: 4224 kB
 per task-struct memory footprint: 1920 bytes
Calibrating delay loop (skipped), value calculated using timer frequency.. 108.00 BogoMIPS (lpj=216000)
pid_max: default: 32768 minimum: 301
Mount-cache hash table entries: 16384 (order: 5, 131072 bytes, linear)
Mountpoint-cache hash table entries: 16384 (order: 5, 131072 bytes, linear)
VFS: Finished mounting rootfs on nullfs
Running RCU synchronous self tests
Running RCU synchronous self tests

(booting freezes)


> ---
>  drivers/clocksource/arm_arch_timer.c | 55 +++++++++++++++++-----------
>  1 file changed, 33 insertions(+), 22 deletions(-)
>
> diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c
> index 90aeff44a2764..4adf756423de9 100644
> --- a/drivers/clocksource/arm_arch_timer.c
> +++ b/drivers/clocksource/arm_arch_timer.c
> @@ -688,6 +688,7 @@ static void __arch_timer_setup(struct clock_event_device *clk)
>  	clk->irq = arch_timer_ppi[arch_timer_uses_ppi];
>  	switch (arch_timer_uses_ppi) {
>  	case ARCH_TIMER_VIRT_PPI:
> +	case ARCH_TIMER_HYP_VIRT_PPI:
>  		clk->set_state_shutdown = arch_timer_shutdown_virt;
>  		clk->set_state_oneshot_stopped = arch_timer_shutdown_virt;
>  		sne = erratum_handler(set_next_event_virt);
> @@ -879,7 +880,7 @@ static void __init arch_timer_banner(void)
>  	pr_info("cp15 timer running at %lu.%02luMHz (%s).\n",
>  		(unsigned long)arch_timer_rate / 1000000,
>  		(unsigned long)(arch_timer_rate / 10000) % 100,
> -		(arch_timer_uses_ppi == ARCH_TIMER_VIRT_PPI) ? "virt" : "phys");
> +		arch_timer_ppi_names[arch_timer_uses_ppi]);
>  }
>  
>  u32 arch_timer_get_rate(void)
> @@ -912,7 +913,8 @@ static void __init arch_counter_register(void)
>  	int width;
>  
>  	if ((IS_ENABLED(CONFIG_ARM64) && !is_hyp_mode_available()) ||
> -	    arch_timer_uses_ppi == ARCH_TIMER_VIRT_PPI) {
> +	    arch_timer_uses_ppi == ARCH_TIMER_VIRT_PPI ||
> +	    arch_timer_uses_ppi == ARCH_TIMER_HYP_VIRT_PPI) {
>  		if (arch_timer_counter_has_wa()) {
>  			rd = arch_counter_get_cntvct_stable;
>  			scr = raw_counter_get_cntvct_stable;
> @@ -1023,6 +1025,7 @@ static int __init arch_timer_register(void)
>  	ppi = arch_timer_ppi[arch_timer_uses_ppi];
>  	switch (arch_timer_uses_ppi) {
>  	case ARCH_TIMER_VIRT_PPI:
> +	case ARCH_TIMER_HYP_VIRT_PPI:
>  		err = request_percpu_irq(ppi, arch_timer_handler_virt,
>  					 "arch_timer", arch_timer_evt);
>  		break;
> @@ -1090,25 +1093,34 @@ static int __init arch_timer_common_init(void)
>  /**
>   * arch_timer_select_ppi() - Select suitable PPI for the current system.
>   *
> - * If HYP mode is available, we know that the physical timer
> - * has been configured to be accessible from PL1. Use it, so
> - * that a guest can use the virtual timer instead.
> + * On AArch32, if HYP mode is available, we know that the physical
> + * timer has been configured to be accessible from PL1. Use it, so
> + * that a guest can use the virtual timer instead (though KVM host
> + * support has long been removed).
>   *
> - * On ARMv8.1 with VH extensions, the kernel runs in HYP. VHE
> - * accesses to CNTP_*_EL1 registers are silently redirected to
> - * their CNTHP_*_EL2 counterparts, and use a different PPI
> - * number.
> + * On ARMv8.1 with FEAT_VHE, the kernel runs in EL2. Accesses to
> + * CNTV_*_EL1 registers are silently redirected to their CNTHV_*_EL2
> + * counterparts, and the timer uses a different PPI number. Similar
> + * thing happen when using the EL2 physical timer. Note that a bunch
> + * of DTs out there omit the virtual EL2 timer, so fallback gracefully
> + * on the physical timer.
> + *
> + * Without VHE, if no interrupt provided for virtual timer, we'll have
> + * to stick to the physical timer. It'd better be accessible...
>   *
> - * If no interrupt provided for virtual timer, we'll have to
> - * stick to the physical timer. It'd better be accessible...
>   * For arm64 we never use the secure interrupt.
>   *
>   * Return: a suitable PPI type for the current system.
>   */
>  static enum arch_timer_ppi_nr __init arch_timer_select_ppi(void)
>  {
> -	if (is_kernel_in_hyp_mode())
> +	if (is_kernel_in_hyp_mode()) {
> +		if (arch_timer_ppi[ARCH_TIMER_HYP_VIRT_PPI])
> +			return ARCH_TIMER_HYP_VIRT_PPI;
> +
> +		pr_warn_once(FW_BUG "VHE-capable CPU without EL2 virtual timer interrupt\n");
>  		return ARCH_TIMER_HYP_PPI;
> +	}
>  
>  	if (!is_hyp_mode_available() && arch_timer_ppi[ARCH_TIMER_VIRT_PPI])
>  		return ARCH_TIMER_VIRT_PPI;
> @@ -1200,14 +1212,9 @@ static int __init arch_timer_acpi_init(struct acpi_table_header *table)
>  	if (ret)
>  		return ret;
>  
> -	arch_timer_ppi[ARCH_TIMER_PHYS_NONSECURE_PPI] =
> -		acpi_gtdt_map_ppi(ARCH_TIMER_PHYS_NONSECURE_PPI);
> -
> -	arch_timer_ppi[ARCH_TIMER_VIRT_PPI] =
> -		acpi_gtdt_map_ppi(ARCH_TIMER_VIRT_PPI);
> -
> -	arch_timer_ppi[ARCH_TIMER_HYP_PPI] =
> -		acpi_gtdt_map_ppi(ARCH_TIMER_HYP_PPI);
> +	/* The GTDT parser can't be bothered with the secure timer */
> +	for (int i = ARCH_TIMER_PHYS_NONSECURE_PPI; i < ARCH_TIMER_MAX_TIMER_PPI; i++)
> +		arch_timer_ppi[i] = acpi_gtdt_map_ppi(i);
>  
>  	arch_timer_populate_kvm_info();
>  
> @@ -1253,10 +1260,14 @@ int kvm_arch_ptp_get_crosststamp(u64 *cycle, struct timespec64 *ts,
>  	if (!IS_ENABLED(CONFIG_HAVE_ARM_SMCCC_DISCOVERY))
>  		return -EOPNOTSUPP;
>  
> -	if (arch_timer_uses_ppi == ARCH_TIMER_VIRT_PPI)
> +	switch (arch_timer_uses_ppi) {
> +	case ARCH_TIMER_VIRT_PPI:
> +	case ARCH_TIMER_HYP_VIRT_PPI:
>  		ptp_counter = KVM_PTP_VIRT_COUNTER;
> -	else
> +		break;
> +	default:
>  		ptp_counter = KVM_PTP_PHYS_COUNTER;
> +	}
>  
>  	arm_smccc_1_1_invoke(ARM_SMCCC_VENDOR_HYP_KVM_PTP_FUNC_ID,
>  			     ptp_counter, &hvc_res);

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland


^ permalink raw reply

* Re: [PATCH v4 2/2] ACPI: CPPC: Add ospm_nominal_perf support
From: Sumit Gupta @ 2026-06-09  9:59 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: viresh.kumar, lenb, pierre.gondois, zhenglifeng1, zhanjie9,
	mario.limonciello, saket.dumbre, linux-acpi, linux-kernel,
	linux-pm, acpica-devel, treding, jonathanh, vsethi, ksitaraman,
	sanjayc, mochs, bbasu, sumitg
In-Reply-To: <CAJZ5v0icPAmaiThjgUTxWRymNGBQFarodurJhODFOkLKpVwtOQ@mail.gmail.com>


On 01/06/26 23:07, Rafael J. Wysocki wrote:
> External email: Use caution opening links or attachments
>
>
> On Wed, May 27, 2026 at 9:47 PM Sumit Gupta <sumitg@nvidia.com> wrote:
>> Expose the OSPM Nominal Performance register (ACPI 6.6, Section
>> 8.4.6.1.2.6), which conveys the desired nominal performance level
>> at which the platform may run. Unlike the existing read-only
>> Nominal Performance register, it is writable and lets OSPM
>> request a lower nominal level than the platform-reported nominal.
>> The platform classifies performance above this level as boosted
>> and below as throttled for its power/thermal decisions.
>>
>> It is exposed as a per-policy cpufreq sysfs attribute in kHz, to
>> match the cpufreq sysfs unit convention:
>>
>>    /sys/devices/system/cpu/cpufreq/policyN/ospm_nominal_freq
>>
>> The attribute is documented in
>> Documentation/ABI/testing/sysfs-devices-system-cpu.
>>
>> Writes are converted to perf via cppc_khz_to_perf(), validated
>> against [Lowest Performance, Nominal Performance], and applied to
>> every CPU in policy->cpus.
>>
>> The register is write-only; the kernel caches the last written
>> value in struct cppc_cpudata for sysfs readback (returns 0 until
>> userspace writes a value).
>>
>> Signed-off-by: Sumit Gupta <sumitg@nvidia.com>
> There is some sashiko.dev feedback on this one that is valid AFAICS:
>
> https://sashiko.dev/#/patchset/20260527194626.185286-1-sumitg%40nvidia.com


Thanks for sharing. Will address all four in v5.


 > Could this validation reject valid frequencies on systems with CPPC
 > favored cores?

1.
In v5, moved the range check into store_ospm_nominal_freq() in
cppc_cpufreq.c, where it uses the same cpu_data->perf_caps as the
kHz->perf conversion. OSPM Nominal is treated as a single per-policy
value. It's validated once against the policy caps and applied to all
CPUs in policy->cpus. This avoids validating against a sibling's
separately-read caps.


 > When the platform's ACPI _CPC table does not support this register,
 > will this unconditionally return '0' instead of an error like
 > -EOPNOTSUPP?

2.
In v5, added cppc_get_ospm_nominal_perf() and dropped the cached
value. show() now reads the register and returns "<unsupported>"
on -EOPNOTSUPP.


 > Are newly onlined CPUs missing this cached ospm_nominal_perf value?

3.
show() now reads the register directly rather than a cached value, so
it always reflects the true per-CPU state.


 > Does skipping this rollback loop when prev_set is false leave the CPUs
 > in a desynchronized state?

4.
In v5, the cache and the ospm_nominal_perf_set bool are removed and
show() reads the register, so a partial write is no longer hidden
(sysfs shows the real per-CPU value, not '0').
store() reads the current value before writing and restores it on
already-updated CPUs if a sibling write fails, so there is no
first-write skip.
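
As a rough user-space sketch of that store() flow (not the v5 code:
fake_reg, fake_get(), fake_set(), fail_cpu and store_all() are
illustrative stand-ins, and the caps are passed in as plain numbers):

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define NR_CPUS 4

/* Hypothetical stand-in for the per-CPU OSPM Nominal Performance register. */
static unsigned long long fake_reg[NR_CPUS];
/* CPU whose write is made to fail, or -1 for none (illustration only). */
static int fail_cpu = -1;

static int fake_get(int cpu, unsigned long long *val)
{
	*val = fake_reg[cpu];
	return 0;
}

static int fake_set(int cpu, unsigned long long val)
{
	if (cpu == fail_cpu)
		return -EIO;
	fake_reg[cpu] = val;
	return 0;
}

/*
 * Validate once against the policy-wide caps, then write every CPU.
 * On failure, restore the previously read value on CPUs already updated.
 */
static int store_all(const int *cpus, size_t n, unsigned long long perf,
		     unsigned long long lowest, unsigned long long nominal)
{
	unsigned long long prev[NR_CPUS];
	size_t i;
	int ret;

	assert(n <= NR_CPUS);

	if (perf < lowest || perf > nominal)
		return -EINVAL;

	for (i = 0; i < n; i++) {
		ret = fake_get(cpus[i], &prev[i]);
		if (ret)
			goto rollback;
		ret = fake_set(cpus[i], perf);
		if (ret)
			goto rollback;
	}
	return 0;

rollback:
	/* CPUs at indices [0, i) were updated; put their old values back. */
	while (i-- > 0)
		fake_set(cpus[i], prev[i]);
	return ret;
}
```

Because the old value is read per CPU right before each write, the
rollback does not depend on whether userspace has written before.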

Thank you,
Sumit Gupta
....



^ permalink raw reply

* Re: [PATCH v2 6/6] irqchip/gic-v5: Enable GICv5 IWB ACPI probe ordering detection
From: Lorenzo Pieralisi @ 2026-06-09  9:31 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Len Brown, Sunil V L, Marc Zyngier, Thomas Gleixner, Huacai Chen,
	Anup Patel, Hanjun Guo, Sudeep Holla, Catalin Marinas,
	Will Deacon, linux-riscv, linux-kernel, linux-acpi,
	linux-arm-kernel, loongarch
In-Reply-To: <CAJZ5v0ibZKfzJwGyUb92-K1N9C_ab0QujpAKCrvMdyygquS1Vw@mail.gmail.com>

On Mon, Jun 08, 2026 at 07:18:15PM +0200, Rafael J. Wysocki wrote:
> On Wed, Jun 3, 2026 at 10:21 AM Lorenzo Pieralisi <lpieralisi@kernel.org> wrote:
> >
> > Register an ACPI hook in the ACPI interrupt management code for GICv5 to
> > retrieve the ACPI interrupt controller handle (if any) of the controller
> > handling a specific GSI, by updating the acpi_set_irq_model() call with
> > the gic_v5_get_gsi_handle() function pointer parameter.
> >
> > gic_v5_get_gsi_handle() allows ACPI core to detect the ACPI handle
> > of the controller that manages a specific GSI interrupt.
> >
> > Update the IWB driver to clear device dependencies in ACPI core once the
> > IWB driver has probed.
> >
> > Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
> > Cc: Thomas Gleixner <tglx@kernel.org>
> > Cc: Marc Zyngier <maz@kernel.org>
> > ---
> >  drivers/irqchip/irq-gic-v5-iwb.c |  5 +++++
> >  drivers/irqchip/irq-gic-v5.c     | 13 +++++++++++--
> >  2 files changed, 16 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/irqchip/irq-gic-v5-iwb.c b/drivers/irqchip/irq-gic-v5-iwb.c
> > index 9103feb70ce8..a02cb9537b15 100644
> > --- a/drivers/irqchip/irq-gic-v5-iwb.c
> > +++ b/drivers/irqchip/irq-gic-v5-iwb.c
> > @@ -269,6 +269,11 @@ static int gicv5_iwb_device_probe(struct platform_device *pdev)
> >         if (IS_ERR(iwb_node))
> >                 return PTR_ERR(iwb_node);
> >
> > +#ifdef CONFIG_ACPI
> > +       if (has_acpi_companion(&pdev->dev))
> > +               acpi_dev_clear_dependencies(ACPI_COMPANION(&pdev->dev));
> > +#endif
> 
> I would rather add a wrapper for this, along with an empty stub for
> the !CONFIG_ACPI case.

Ok, I will.
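
Roughly along these lines, as a user-space sketch only: the wrapper
name clear_deps_if_acpi() is hypothetical, and struct device,
fake_clear_dependencies() and fake_probe() are stand-ins so that it
builds outside the kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins so the sketch builds outside the kernel. */
struct device { void *acpi_companion; };
static int cleared;	/* counts calls, for illustration only */

#define CONFIG_ACPI 1	/* comment out to build the stub variant */

#ifdef CONFIG_ACPI
static void fake_clear_dependencies(void *adev)
{
	assert(adev != NULL);
	cleared++;
}

/* Hypothetical wrapper: a no-op for devices without an ACPI companion. */
static inline void clear_deps_if_acpi(struct device *dev)
{
	if (dev->acpi_companion)
		fake_clear_dependencies(dev->acpi_companion);
}
#else
/* Empty stub: callers need no #ifdef of their own. */
static inline void clear_deps_if_acpi(struct device *dev) { (void)dev; }
#endif

/* The probe path then reads the same in both configurations. */
static int fake_probe(struct device *dev)
{
	clear_deps_if_acpi(dev);
	return 0;
}
```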

> > +
> >         return 0;
> >  }
> >
> > diff --git a/drivers/irqchip/irq-gic-v5.c b/drivers/irqchip/irq-gic-v5.c
> > index 03cc2830b260..26cfaea1af41 100644
> > --- a/drivers/irqchip/irq-gic-v5.c
> > +++ b/drivers/irqchip/irq-gic-v5.c
> > @@ -1217,11 +1217,19 @@ static struct fwnode_handle *gsi_domain_handle;
> >  static struct fwnode_handle *gic_v5_get_gsi_domain_id(u32 gsi)
> >  {
> >         if (FIELD_GET(GICV5_GSI_IC_TYPE, gsi) == GICV5_GSI_IWB_TYPE)
> > -               return iort_iwb_handle(FIELD_GET(GICV5_GSI_IWB_FRAME_ID, gsi));
> > +               return iort_iwb_handle_fwnode(FIELD_GET(GICV5_GSI_IWB_FRAME_ID, gsi));
> 
> Why is this change needed?

This is a mistake, thanks for spotting it. It belongs in the previous patch
(IMO patches 5 and 6 are a single logical entity; I split them to try to keep
irqchip-specific changes in one patch).

iort_iwb_handle() was refactored to return an acpi_handle rather than a
fwnode_handle, in preparation for adding gic_v5_get_gsi_handle() below (the
dependency chain requires acpi_handle retrieval, not fwnode_handle). So the
change above belongs in patch 5; I will move it there in v3.

Thanks,
Lorenzo

> >
> >         return gsi_domain_handle;
> >  }
> >
> > +static acpi_handle gic_v5_get_gsi_handle(u32 gsi)
> > +{
> > +       if (FIELD_GET(GICV5_GSI_IC_TYPE, gsi) == GICV5_GSI_IWB_TYPE)
> > +               return iort_iwb_handle(FIELD_GET(GICV5_GSI_IWB_FRAME_ID, gsi));
> > +
> > +       return NULL;
> > +}
> > +
> >  static int __init gic_acpi_init(union acpi_subtable_headers *header, const unsigned long end)
> >  {
> >         struct acpi_madt_gicv5_irs *irs = (struct acpi_madt_gicv5_irs *)header;
> > @@ -1242,7 +1250,8 @@ static int __init gic_acpi_init(union acpi_subtable_headers *header, const unsig
> >         if (ret)
> >                 goto out_irs;
> >
> > -       acpi_set_irq_model(ACPI_IRQ_MODEL_GIC_V5, gic_v5_get_gsi_domain_id, NULL);
> > +       acpi_set_irq_model(ACPI_IRQ_MODEL_GIC_V5, gic_v5_get_gsi_domain_id,
> > +                          gic_v5_get_gsi_handle);
> >
> >         return 0;
> >
> >
> > --

^ permalink raw reply

* Re: [PATCH v2 1/6] ACPI: RISC-V: Fix riscv_acpi_irq_get_dep() loop termination
From: Lorenzo Pieralisi @ 2026-06-09  9:19 UTC (permalink / raw)
  To: Sunil V L
  Cc: Rafael J. Wysocki, Len Brown, Sunil V L, Marc Zyngier,
	Thomas Gleixner, Huacai Chen, Anup Patel, Hanjun Guo,
	Sudeep Holla, Catalin Marinas, Will Deacon, linux-riscv,
	linux-kernel, linux-acpi, linux-arm-kernel, loongarch
In-Reply-To: <CAB19ukFFwm3ehzkBFr+oXRjA7VK_4_=XHFSuqdEpbVqUz8Do4Q@mail.gmail.com>

On Mon, Jun 08, 2026 at 09:54:28PM +0530, Sunil V L wrote:
> Hi Lorenzo,
> 
> On Wed, Jun 3, 2026 at 1:51 PM Lorenzo Pieralisi <lpieralisi@kernel.org> wrote:
> >
> > In riscv_acpi_add_irq_dep() the main loop condition would currently stop
> > the loop if an interrupt descriptor contains an interrupt for which the
> > respective GSI handle is NULL, which is not correct because subsequent
> > interrupts in the interrupt descriptor might still have a GSI dependency
> > that must not be skipped.
> >
> > Rework riscv_acpi_add_irq_dep() and the riscv_acpi_irq_get_dep() call chain
> > to fix it - by not forcing the loop to stop in order to guarantee
> > dependency detection for all the interrupt entries in the CRS descriptor.
> >
> > Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
> > Cc: "Rafael J. Wysocki" <rafael@kernel.org>
> > Cc: Sunil V L <sunilvl@ventanamicro.com>
> > ---
> >  drivers/acpi/riscv/irq.c | 10 ++++++----
> >  1 file changed, 6 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/acpi/riscv/irq.c b/drivers/acpi/riscv/irq.c
> > index 9b88d0993e88..cd83c3035cf6 100644
> > --- a/drivers/acpi/riscv/irq.c
> > +++ b/drivers/acpi/riscv/irq.c
> > @@ -299,6 +299,7 @@ static acpi_status riscv_acpi_irq_get_parent(struct acpi_resource *ares, void *c
> >                         return AE_OK;
> >
> >                 ctx->handle = riscv_acpi_get_gsi_handle(eirq->interrupts[ctx->index]);
> > +               ctx->rc = 0;
> >                 return AE_CTRL_TERMINATE;
> >         }
> >
> > @@ -314,10 +315,8 @@ static int riscv_acpi_irq_get_dep(acpi_handle handle, unsigned int index, acpi_h
> >
> >         acpi_walk_resources(handle, METHOD_NAME__CRS, riscv_acpi_irq_get_parent, &ctx);
> >         *gsi_handle = ctx.handle;
> > -       if (*gsi_handle)
> > -               return 1;
> >
> > -       return 0;
> > +       return ctx.rc;
> >  }
> >
> >  static u32 riscv_acpi_add_prt_dep(acpi_handle handle)
> > @@ -381,8 +380,11 @@ static u32 riscv_acpi_add_irq_dep(acpi_handle handle)
> >         int i;
> >
> >         for (i = 0;
> > -            riscv_acpi_irq_get_dep(handle, i, &gsi_handle);
> > +            !riscv_acpi_irq_get_dep(handle, i, &gsi_handle);
> >              i++) {
> > +               if (!gsi_handle)
> > +                       continue;
> > +
> >                 dep_devices.count = 1;
> >                 dep_devices.handles = kzalloc_objs(*dep_devices.handles, 1);
> >                 if (!dep_devices.handles) {
> >
> Do these fixes need the Fixes tag?

I can add a Fixes: tag, but I first wanted some help testing them: I
found these issues through code perusal.
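
To check the reasoning outside the kernel, here is a small user-space
model of the two loop conditions. Everything in it is a stand-in: the
handles[] table, get_dep_old()/get_dep_new() and the -1 "no more
entries" return value are assumptions for illustration, not the driver
code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical descriptor: one GSI handle per interrupt, NULL if none. */
static const char *handles[] = { "plic0", NULL, "plic1" };
#define NR_ENTRIES (sizeof(handles) / sizeof(handles[0]))

/* Old scheme: the return value doubles as "handle found". */
static int get_dep_old(size_t index, const char **h)
{
	assert(h);
	*h = index < NR_ENTRIES ? handles[index] : NULL;
	return *h != NULL;
}

/* New scheme: 0 means "entry exists", independent of the handle. */
static int get_dep_new(size_t index, const char **h)
{
	assert(h);
	*h = NULL;
	if (index >= NR_ENTRIES)
		return -1;	/* past the last interrupt entry */
	*h = handles[index];
	return 0;
}

/* Old loop: stops at the first entry whose handle is NULL. */
static size_t count_deps_old(void)
{
	const char *h;
	size_t i, n = 0;

	for (i = 0; get_dep_old(i, &h); i++)
		n++;
	return n;
}

/* New loop: skips NULL handles and keeps walking the descriptor. */
static size_t count_deps_new(void)
{
	const char *h;
	size_t i, n = 0;

	for (i = 0; !get_dep_new(i, &h); i++) {
		if (!h)
			continue;
		n++;
	}
	return n;
}
```

With a NULL handle in the middle of the descriptor, the old loop never
reaches the third entry, while the new one does.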

> Otherwise, LGTM.
> Reviewed-by: Sunil V L <sunilvl@oss.qualcomm.com>

Thanks,
Lorenzo

^ permalink raw reply

* Re: [PATCH v3 2/2] ACPI: CPPC: Add ospm_nominal_perf support
From: Sumit Gupta @ 2026-06-09  8:53 UTC (permalink / raw)
  To: Pierre Gondois, rafael, viresh.kumar, lenb, zhenglifeng1,
	zhanjie9, mario.limonciello, saket.dumbre, linux-acpi,
	linux-kernel, linux-pm, acpica-devel
  Cc: treding, jonathanh, vsethi, ksitaraman, sanjayc, bbasu, sumitg
In-Reply-To: <86780f97-29ee-4a72-b311-38c89434b707@arm.com>


On 28/05/26 17:37, Pierre Gondois wrote:
> Hello Sumit,
>>
>> Hi Pierre,
>>
>> Thanks for the review and the complementary patch.
>> Going point by point:
>>
>> 1. Rollback for a partially applied multiple CPU write in
>>     store_ospm_nominal_freq(): Agreed, will add into v4.
>>
>> 2. cppc_get_ospm_nominal_perf() and the show/init/exit coherence
>>     checks that rely on it: I'd skip these as the register is write-only
>>     as per spec.
>>
> NIT:
> IIUC having a write-only register doesn't mean we cannot read it.
> Cf. cppc_get_desired_perf()


Good point.
v5 reads the register via a new cppc_get_ospm_nominal_perf().
So show() returns the register value or "<unsupported>", and the
cache/bool is dropped.


>
>> 3. Initializing the register at startup and restoring at exit: In v3, we
>>     dropped the unconditional cpu_init write so user values would
>>     survive CPU hotplug. The spec also makes the explicit init
>>     unnecessary: "If this register is not provided, then OSPM must
>>     assume that the OSPM Nominal Performance value is equal to
>>     the Nominal Performance value.". The unwritten default already
>>     looks well defined.
>
> The concern I had was for the scenario where:
>
> - the driver is loaded
>
> - the user sets an ospm_nominal_freq value
>
> - the driver is unloaded
>
> In such case, the ospm_nominal_freq value will still be set to a
> non-default value. The modifications suggested previously would
> allow to handle that case to come back to the default value.
>
> FWIU, we have:
>
> +------+     +---------+     +-----------+     +------+
> | User | <-> | CPPC    | <-> | CPPC      | <-> | CPPC |
> +------+     | driver  |     | reg       |     | HW   |
>              +---------+     | interface |     | reg  |
>                              +-----------+     +------+
>
> So if we want to handle:
>
> - the case described above
>
> - the case you mentioned, i.e. hot-plugging CPUs
>
> maybe the scratch values should be stored along the CPPC register
> interface. This would allow to handle complex cases where CPUs
> are hotplugged and the driver is loaded/unloaded ?
>
> Note: the same kind of scenario should apply to the auto_sel register
>

Right.
After unload, the register keeps the user-set value instead of the
firmware value. In a follow-up, I will restore the firmware value on
unload and reapply the user value across hotplug, grouping the
OSPM-set registers together (ospm_nominal_perf, auto_sel and EPP).
On my test platforms the registers survive hotplug, but that isn't
guaranteed in general.

I think it's better to keep the saved state in the cppc_cpufreq driver
rather than the CPPC register interface. intel_pstate and amd-pstate
do the same.
For the reapply, I will use a CPU hotplug callback rather than the
->online/->offline hooks. Those are only called when a policy gains its
first online CPU or loses its last one. cppc_cpufreq also has shared
(SHARED_TYPE_ANY) policies, where offlining and onlining a single CPU
keeps the policy active, so neither hook is called for it. A per-CPU
hotplug callback is needed to cover that case.

Let me know if you have other thoughts.

Thank you,
Sumit Gupta


^ permalink raw reply

* Re: [PATCH v4 07/10] riscv: Add RISC-V entries in processor type and ISA strings
From: Sunil V L @ 2026-06-09  8:27 UTC (permalink / raw)
  To: Himanshu Chauhan
  Cc: linux-riscv, linux-kernel, linux-acpi, linux-efi, acpica-devel,
	paul.walmsley, palmer, lenb, james.morse, tony.luck, ardb, conor,
	cleger, robert.moore, anup.patel
In-Reply-To: <20260513084325.2176952-8-himanshu.chauhan@oss.qualcomm.com>

On Wed, May 13, 2026 at 2:14 PM Himanshu Chauhan
<himanshu.chauhan@oss.qualcomm.com> wrote:
>
> Add RISCV and RISCV32/64 strings in the processor type and ISA strings
> respectively. These are defined for cper records.
>
> Signed-off-by: Himanshu Chauhan <himanshu.chauhan@oss.qualcomm.com>
> ---
>  drivers/firmware/efi/cper.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/drivers/firmware/efi/cper.c b/drivers/firmware/efi/cper.c
> index 06b4fdb59917..1b1ab2f1355b 100644
> --- a/drivers/firmware/efi/cper.c
> +++ b/drivers/firmware/efi/cper.c
> @@ -170,6 +170,7 @@ static const char * const proc_type_strs[] = {
>         "IA32/X64",
>         "IA64",
>         "ARM",
> +       "RISC-V",
>  };
>
>  static const char * const proc_isa_strs[] = {
> @@ -178,6 +179,8 @@ static const char * const proc_isa_strs[] = {
>         "X64",
>         "ARM A32/T32",
>         "ARM A64",
> +       "RV32/RV32E",
> +       "RV64",
>  };
>
Reviewed-by: Sunil V L <sunilvl@oss.qualcomm.com>

^ permalink raw reply

* Re: [PATCH v4 05/10] riscv: conditionally compile GHES NMI spool function
From: Sunil V L @ 2026-06-09  8:17 UTC (permalink / raw)
  To: Himanshu Chauhan
  Cc: linux-riscv, linux-kernel, linux-acpi, linux-efi, acpica-devel,
	paul.walmsley, palmer, lenb, james.morse, tony.luck, ardb, conor,
	cleger, robert.moore, anup.patel
In-Reply-To: <20260513084325.2176952-6-himanshu.chauhan@oss.qualcomm.com>

On Wed, May 13, 2026 at 2:14 PM Himanshu Chauhan
<himanshu.chauhan@oss.qualcomm.com> wrote:
>
> Compile ghes_in_nmi_spool_from_list only when NMI or SEA
> is enabled. Otherwise compilation fails with a "defined but
> not used" error.
>
> Signed-off-by: Himanshu Chauhan <himanshu.chauhan@oss.qualcomm.com>
> ---
>  drivers/acpi/apei/ghes.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
> index 3236a3ce79d6..8edc2c8db1bb 100644
> --- a/drivers/acpi/apei/ghes.c
> +++ b/drivers/acpi/apei/ghes.c
> @@ -1397,6 +1397,7 @@ static int ghes_in_nmi_queue_one_entry(struct ghes *ghes,
>         return rc;
>  }
>
> +#if defined(CONFIG_HAVE_ACPI_APEI_NMI) || defined(CONFIG_ACPI_APEI_SEA)
>
>  static int ghes_in_nmi_spool_from_list(struct list_head *rcu_list,
>                                        enum fixed_addresses fixmap_idx)
>  {
> @@ -1415,6 +1416,7 @@ static int ghes_in_nmi_spool_from_list(struct list_head *rcu_list,
>
>         return ret;
>  }
> +#endif
>
Reviewed-by: Sunil V L <sunilvl@oss.qualcomm.com>

^ permalink raw reply

* Re: [PATCH v4 02/10] riscv: Define arch_apei_get_mem_attribute for RISC-V
From: Sunil V L @ 2026-06-09  8:01 UTC (permalink / raw)
  To: Himanshu Chauhan
  Cc: linux-riscv, linux-kernel, linux-acpi, linux-efi, acpica-devel,
	paul.walmsley, palmer, lenb, james.morse, tony.luck, ardb, conor,
	cleger, robert.moore, anup.patel
In-Reply-To: <20260513084325.2176952-3-himanshu.chauhan@oss.qualcomm.com>

On Wed, May 13, 2026 at 2:13 PM Himanshu Chauhan
<himanshu.chauhan@oss.qualcomm.com> wrote:
>
> ghes_map function uses arch_apei_get_mem_attribute to get the
> protection bits for a given physical address. These protection
> bits are then used to map the physical address.
>
> Signed-off-by: Himanshu Chauhan <himanshu.chauhan@oss.qualcomm.com>
> ---
>  arch/riscv/include/asm/acpi.h | 16 ++++++++++++++++
>  arch/riscv/kernel/acpi.c      | 12 ++++++++++++
>  2 files changed, 28 insertions(+)
>
> diff --git a/arch/riscv/include/asm/acpi.h b/arch/riscv/include/asm/acpi.h
> index 26ab37c171bc..c142db9f81a7 100644
> --- a/arch/riscv/include/asm/acpi.h
> +++ b/arch/riscv/include/asm/acpi.h
> @@ -14,6 +14,7 @@
>
>  /* Basic configuration for ACPI */
>  #ifdef CONFIG_ACPI
> +pgprot_t __acpi_get_mem_attribute(phys_addr_t addr);
>
>  typedef u64 phys_cpuid_t;
>  #define PHYS_CPUID_INVALID INVALID_HARTID
> @@ -27,6 +28,21 @@ extern int acpi_disabled;
>  extern int acpi_noirq;
>  extern int acpi_pci_disabled;
>
> +#ifdef CONFIG_ACPI_APEI
> +/*
> + * acpi_disable_cmcff to disable IA-32 Corrected Machine Check (CMC)
> + * Firmware-First mode. It is not required in RISC-V architecture
> + * and is present for compatibility
> + */
> +#define acpi_disable_cmcff 1
> +static inline pgprot_t arch_apei_get_mem_attribute(phys_addr_t addr)
> +{
> +       return  __acpi_get_mem_attribute(addr);
> +}
> +#else /* CONFIG_ACPI_APEI */
> +#define acpi_disable_cmcff 0
> +#endif /* !CONFIG_ACPI_APEI */
> +
>  static inline void disable_acpi(void)
>  {
>         acpi_disabled = 1;
> diff --git a/arch/riscv/kernel/acpi.c b/arch/riscv/kernel/acpi.c
> index 068e0b404b6f..7a6770697999 100644
> --- a/arch/riscv/kernel/acpi.c
> +++ b/arch/riscv/kernel/acpi.c
> @@ -204,6 +204,18 @@ struct acpi_madt_rintc *acpi_cpu_get_madt_rintc(int cpu)
>         return &cpu_madt_rintc[cpu];
>  }
>
> +pgprot_t __acpi_get_mem_attribute(phys_addr_t addr)
> +{
> +       u64 attr;
> +
> +       attr = efi_mem_attributes(addr);
> +       if (attr & EFI_MEMORY_WB)
> +               return PAGE_KERNEL;
> +       if ((attr & EFI_MEMORY_WC) || (attr & EFI_MEMORY_WT))
> +               return pgprot_writecombine(PAGE_KERNEL);
> +       return PAGE_KERNEL;
> +}
> +
>  /*
>   * __acpi_map_table() will be called before paging_init(), so early_ioremap()
>   * or early_memremap() should be called here to for ACPI table mapping.
>
Reviewed-by: Sunil V L <sunilvl@oss.qualcomm.com>

^ permalink raw reply

* Re: [PATCH v4 2/2] ACPI: CPPC: Add ospm_nominal_perf support
From: Sumit Gupta @ 2026-06-09  7:47 UTC (permalink / raw)
  To: Pierre Gondois, rafael, viresh.kumar, lenb, zhenglifeng1,
	zhanjie9, mario.limonciello, saket.dumbre, linux-acpi,
	linux-kernel, linux-pm, acpica-devel
  Cc: treding, jonathanh, vsethi, ksitaraman, sanjayc, mochs, bbasu,
	sumitg
In-Reply-To: <e3528e78-1397-439a-b8c2-22648d66efb5@arm.com>


On 28/05/26 17:42, Pierre Gondois wrote:
> Hello Sumit,
>
> On 5/27/26 21:46, Sumit Gupta wrote:
>> Expose the OSPM Nominal Performance register (ACPI 6.6, Section
>> 8.4.6.1.2.6), which conveys the desired nominal performance level
>> at which the platform may run. Unlike the existing read-only
>> Nominal Performance register, it is writable and lets OSPM
>> request a lower nominal level than the platform-reported nominal.
>> The platform classifies performance above this level as boosted
>> and below as throttled for its power/thermal decisions.
>>
>> It is exposed as a per-policy cpufreq sysfs attribute in kHz, to
>> match the cpufreq sysfs unit convention:
>>
>>    /sys/devices/system/cpu/cpufreq/policyN/ospm_nominal_freq
>>
>> The attribute is documented in
>> Documentation/ABI/testing/sysfs-devices-system-cpu.
>>
>> Writes are converted to perf via cppc_khz_to_perf(), validated
>> against [Lowest Performance, Nominal Performance], and applied to
>> every CPU in policy->cpus.
>>
>> The register is write-only; the kernel caches the last written
>> value in struct cppc_cpudata for sysfs readback (returns 0 until
>> userspace writes a value).
>>
>> Signed-off-by: Sumit Gupta <sumitg@nvidia.com>
>> ---
>>   .../ABI/testing/sysfs-devices-system-cpu      | 17 ++++++
>>   drivers/acpi/cppc_acpi.c                      | 35 +++++++++++
>>   drivers/cpufreq/cppc_cpufreq.c                | 60 +++++++++++++++++++
>>   include/acpi/cppc_acpi.h                      | 12 ++++
>>   4 files changed, 124 insertions(+)
>>
>> diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu
>> index 82d10d556cc8..ac1bf1b89ac4 100644
>> --- a/Documentation/ABI/testing/sysfs-devices-system-cpu
>> +++ b/Documentation/ABI/testing/sysfs-devices-system-cpu
>> @@ -346,6 +346,23 @@ Description:     Performance Limited
>>
>>               This file is only present if the cppc-cpufreq driver is in use.
>>
>> +What: /sys/devices/system/cpu/cpuX/cpufreq/ospm_nominal_freq
>> +Date:                May 2026
>> +Contact:     linux-pm@vger.kernel.org
>> +Description: OSPM Nominal Performance (kHz)
>> +
>> +             OSPM uses this attribute to request a nominal performance
>> +             level lower than the platform-reported nominal. The
>> +             platform treats performance above this level as boost
>> +             and below as throttle for power and thermal decisions.
>> +
>> +             Read returns the last written value in kHz, or 0 if no
>> +             value has been written. Write a kHz value in the range
>> +             [lowest_freq, nominal_freq].
>> +
>> +             This file is only present if the cppc-cpufreq driver is
>> +             in use.
>> +
>>   What: /sys/devices/system/cpu/cpu*/cache/index3/cache_disable_{0,1}
>>   Date:               August 2008
>>   KernelVersion:      2.6.27
>> diff --git a/drivers/acpi/cppc_acpi.c b/drivers/acpi/cppc_acpi.c
>> index c76cfafa3589..ad6ece16c30d 100644
>> --- a/drivers/acpi/cppc_acpi.c
>> +++ b/drivers/acpi/cppc_acpi.c
>> @@ -1682,6 +1682,41 @@ int cppc_set_epp(int cpu, u64 epp_val)
>>   }
>>   EXPORT_SYMBOL_GPL(cppc_set_epp);
>>
>> +/**
>> + * cppc_set_ospm_nominal_perf() - Write OSPM Nominal Performance register.
>> + * @cpu: CPU on which to write register.
>> + * @ospm_nominal_perf: Value to write to the OSPM Nominal Performance register.
>> + *
>> + * OSPM Nominal Performance conveys the desired nominal performance level
>> + * at which the platform may run. Per ACPI 6.6, s8.4.6.1.2.6, the value
>> + * must lie within [Lowest Performance, Nominal Performance] and may be
>> + * set independently of Minimum, Maximum and Desired performance.
>> + *
>> + * Return: 0 on success or negative error code.
>> + */
>> +int cppc_set_ospm_nominal_perf(int cpu, u64 ospm_nominal_perf)
>> +{
>> +     struct cpc_desc *cpc_desc = per_cpu(cpc_desc_ptr, cpu);
>> +     struct cppc_perf_caps caps;
>> +     int ret;
>> +
>> +     if (!cpc_desc) {
>> +             pr_debug("No CPC descriptor for CPU:%d\n", cpu);
>> +             return -ENODEV;
>> +     }
>> +
>> +     ret = cppc_get_perf_caps(cpu, &caps);
>> +     if (ret)
>> +             return ret;
>> +
>> +     if (ospm_nominal_perf < caps.lowest_perf ||
>> +         ospm_nominal_perf > caps.nominal_perf)
>> +             return -EINVAL;
>> +
>> +     return cppc_set_reg_val(cpu, OSPM_NOMINAL_PERF, ospm_nominal_perf);
>> +}
>> +EXPORT_SYMBOL_GPL(cppc_set_ospm_nominal_perf);
>> +
>>   /**
>>    * cppc_get_auto_act_window() - Read autonomous activity window register.
>>    * @cpu: CPU from which to read register.
>> diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
>> index 15a728dea911..5c54af1655b5 100644
>> --- a/drivers/cpufreq/cppc_cpufreq.c
>> +++ b/drivers/cpufreq/cppc_cpufreq.c
>> @@ -1139,11 +1139,70 @@ static int cppc_get_perf_limited_filtered(int cpu, u64 *perf_limited)
>>   CPPC_CPUFREQ_ATTR_RW_U64(perf_limited, cppc_get_perf_limited_filtered,
>>                        cppc_set_perf_limited)
>>
>> +static ssize_t show_ospm_nominal_freq(struct cpufreq_policy *policy, char *buf)
>> +{
>> +     struct cppc_cpudata *cpu_data = policy->driver_data;
>> +     unsigned int freq_khz;
>> +
>> +     if (!cpu_data->ospm_nominal_perf_set)
>> +             return sysfs_emit(buf, "0\n");
>
> The questions on v3 might be more relevant, but for instance here,
> the ospm_nominal_perf value is not necessarily 0: the hardware
> register might contain any value.
>
> So reading the register might be more meaningful than
> returning 0.
>
>

Agreed.
In v5, the register is read instead of caching the last written value:

- Added cppc_get_ospm_nominal_perf().
- show_ospm_nominal_freq() now returns the current register value,
   or "<unsupported>" on -EOPNOTSUPP.
- Dropped the cpu_data->ospm_nominal_perf cache and the
   ospm_nominal_perf_set bool entirely.

This also follows your earlier point that being write-only doesn't mean
we can't read it (cppc_get_desired_perf() being the precedent).
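
A rough userspace sketch of the read path described above, not the actual
v5 patch. The fake_cpc struct, both model_* helpers, and the linear
khz_per_perf scale are hypothetical stand-ins; the driver would read the
real register and use its own perf-to-kHz conversion.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical model of one CPU's register state; not a kernel structure. */
struct fake_cpc {
	int supported;              /* 0 if the platform lacks the register */
	uint64_t ospm_nominal_perf; /* current register contents */
};

/* Stand-in for a register read: 0 on success, -EOPNOTSUPP if absent. */
static int model_get_ospm_nominal_perf(const struct fake_cpc *cpc,
				       uint64_t *perf)
{
	if (!cpc->supported)
		return -EOPNOTSUPP;
	*perf = cpc->ospm_nominal_perf;
	return 0;
}

/*
 * Report whatever the register holds instead of a cached last write.
 * khz_per_perf is an assumed linear scale, used only for illustration.
 */
static int model_show_ospm_nominal_freq(const struct fake_cpc *cpc,
					uint64_t khz_per_perf,
					char *buf, size_t len)
{
	uint64_t perf;
	int ret = model_get_ospm_nominal_perf(cpc, &perf);

	if (ret == -EOPNOTSUPP)
		return snprintf(buf, len, "<unsupported>\n");
	if (ret)
		return ret;

	return snprintf(buf, len, "%llu\n",
			(unsigned long long)(perf * khz_per_perf));
}
```

With this shape, a read before any write reports the firmware-provided
register contents rather than a synthetic 0.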

Thank you,
Sumit Gupta



^ permalink raw reply

* Re: [PATCH v4 2/2] ACPI: CPPC: Add ospm_nominal_perf support
From: Sumit Gupta @ 2026-06-09  7:37 UTC (permalink / raw)
  To: Christian Loehle, rafael, viresh.kumar, lenb, pierre.gondois,
	zhenglifeng1, zhanjie9, mario.limonciello, saket.dumbre,
	linux-acpi, linux-kernel, linux-pm, acpica-devel
  Cc: treding, jonathanh, vsethi, ksitaraman, sanjayc, mochs, bbasu,
	sumitg
In-Reply-To: <19b92583-ab78-4b45-a260-03a13874b0fe@arm.com>


On 29/05/26 18:42, Christian Loehle wrote:
> On 5/27/26 20:46, Sumit Gupta wrote:
>> Expose the OSPM Nominal Performance register (ACPI 6.6, Section
>> 8.4.6.1.2.6), which conveys the desired nominal performance level
>> at which the platform may run. Unlike the existing read-only
>> Nominal Performance register, it is writable and lets OSPM
>> request a lower nominal level than the platform-reported nominal.
>> The platform classifies performance above this level as boosted
>> and below as throttled for its power/thermal decisions.
>>
>> It is exposed as a per-policy cpufreq sysfs attribute in kHz, to
>> match the cpufreq sysfs unit convention:
>>
>>    /sys/devices/system/cpu/cpufreq/policyN/ospm_nominal_freq
>>
>> The attribute is documented in
>> Documentation/ABI/testing/sysfs-devices-system-cpu.
>>
>> Writes are converted to perf via cppc_khz_to_perf(), validated
>> against [Lowest Performance, Nominal Performance], and applied to
>> every CPU in policy->cpus.
>>
>> The register is write-only; the kernel caches the last written
>> value in struct cppc_cpudata for sysfs readback (returns 0 until
>> userspace writes a value).
>>
>> Signed-off-by: Sumit Gupta <sumitg@nvidia.com>
>> ---
>>   .../ABI/testing/sysfs-devices-system-cpu      | 17 ++++++
>>   drivers/acpi/cppc_acpi.c                      | 35 +++++++++++
>>   drivers/cpufreq/cppc_cpufreq.c                | 60 +++++++++++++++++++
>>   include/acpi/cppc_acpi.h                      | 12 ++++
>>   4 files changed, 124 insertions(+)
>>
>> diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu
>> index 82d10d556cc8..ac1bf1b89ac4 100644
>> --- a/Documentation/ABI/testing/sysfs-devices-system-cpu
>> +++ b/Documentation/ABI/testing/sysfs-devices-system-cpu
>> @@ -346,6 +346,23 @@ Description:     Performance Limited
>>
>>                This file is only present if the cppc-cpufreq driver is in use.
>>
>> +What:                /sys/devices/system/cpu/cpuX/cpufreq/ospm_nominal_freq
>> +Date:                May 2026
>> +Contact:     linux-pm@vger.kernel.org
>> +Description: OSPM Nominal Performance (kHz)
>> +
>> +             OSPM uses this attribute to request a nominal performance
>> +             level lower than the platform-reported nominal. The
>> +             platform treats performance above this level as boost
>> +             and below as throttle for power and thermal decisions.
>> +
>> +             Read returns the last written value in kHz, or 0 if no
>> +             value has been written. Write a kHz value in the range
>> +             [lowest_freq, nominal_freq].
>> +
>> +             This file is only present if the cppc-cpufreq driver is
>> +             in use.
>> +
> Given that this value, depending also on firmware behavior, can create vast asymmetries
> between CPUs that the scheduler would be unaware of, I wonder if this warrants
> a disclaimer similar to the one intel_pstate carries for its per-CPU EPP:
> https://www.kernel.org/doc/html/v7.0/admin-guide/pm/intel_pstate.html#energy-vs-performance-hints
>
> "[Note that tasks may be migrated from one CPU to another by the scheduler’s load-balancing algorithm and if different energy vs performance hints are set for those CPUs, that may lead to undesirable outcomes. To avoid such issues it is better to set the same energy vs performance hint for all CPUs or to pin every task potentially sensitive to them to a specific CPU.]"

Good point.
In v5, I added the note below to the sysfs ABI doc, modeled on the intel_pstate note:

   Note that tasks may be migrated from one CPU to another
   by the scheduler's load-balancing algorithm, and if
   different OSPM Nominal Performance values are set for
   those CPUs, that may lead to undesirable outcomes. To
   avoid such issues it is better to set the same value
   across all policies, or to pin every potentially
   sensitive task to a specific CPU.

Thank you,
Sumit Gupta



^ permalink raw reply

* Re: [PATCH RESEND v2 1/2] serial: earlycon: add uart_clk_freq parameter
From: Geert Uytterhoeven @ 2026-06-09  6:53 UTC (permalink / raw)
  To: Markus Probst
  Cc: Rafael J. Wysocki, Len Brown, Thomas Bogendoerfer, Ard Biesheuvel,
	Ilias Apalodimas, Greg Kroah-Hartman, Jiri Slaby, linux-acpi,
	linux-kernel, linux-m68k, linux-mips, linux-efi, linux-serial
In-Reply-To: <20260609-acpi_spcr-v2-1-3cd9a3bda727@posteo.de>

Hi Markus,

On Tue, 9 Jun 2026 at 00:40, Markus Probst <markus.probst@posteo.de> wrote:
> Add `uart_clk_freq` parameter to `setup_earlycon`. This allows the
> options string to be reused with `add_preferred_console`, while still
> allowing to set the uart clock frequency. This will be used in the
> following commit ("ACPI: SPCR: Support UART clock frequency field").
>
> No logical change intended.
>
> Signed-off-by: Markus Probst <markus.probst@posteo.de>

> --- a/drivers/tty/serial/earlycon.c
> +++ b/drivers/tty/serial/earlycon.c
> @@ -135,11 +135,14 @@ static int __init parse_options(struct earlycon_device *device, char *options)
>         return 0;
>  }
>
> -static int __init register_earlycon(char *buf, const struct earlycon_id *match)
> +static int __init register_earlycon(char *buf, unsigned int uart_clk_freq,
> +                                   const struct earlycon_id *match)
>  {
>         int err;
>         struct uart_port *port = &early_console_dev.port;
>
> +       port->uartclk = uart_clk_freq;

Who is actually consuming this value?
Earlycon typically works with the serial console, as configured before
Linux boot by the firmware.
The Microsoft doc referenced in patch 2 seems to agree with that:

   "On a system where the BIOS or system firmware uses the serial
    port for console input/output, this table should be used to convey
    information about the settings, to ensure a seamless transition
    between the firmware console output and Windows EMS output."

> +
>         /* On parsing error, pass the options buf to the setup function */
>         if (buf && !parse_options(&early_console_dev, buf))
>                 buf = NULL;

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply

* Re: [PATCH] software node: allow passing reference args to PROPERTY_ENTRY_REF()
From: Andy Shevchenko @ 2026-06-09  6:04 UTC (permalink / raw)
  To: Dmitry Torokhov
  Cc: Danilo Krummrich, Greg Kroah-Hartman, Daniel Scally,
	Heikki Krogerus, Sakari Ailus, Rafael J. Wysocki, linux-acpi,
	driver-core, linux-kernel
In-Reply-To: <aiTo4dvKu8pyimHA@google.com>

On Sat, Jun 06, 2026 at 08:51:29PM -0700, Dmitry Torokhov wrote:
> When dynamically creating software nodes and properties for subsequent
> use with software_node_register() current implementation of
> PROPERTY_ENTRY_REF() is not suitable because it creates a temporary
> instance of struct software_node_ref_args on stack which will later
> disappear, and software_node_register() only does shallow copy of
> properties.
> 
> Fix this by allowing to pass address of reference arguments structure
> directly into PROPERTY_ENTRY_REF(), so that caller can manage lifetime
> of the object properly.

Now it looks nice and good, thanks!
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>

-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply

* Re: [PATCH RESEND v2 1/2] serial: earlycon: add uart_clk_freq parameter
From: Greg Kroah-Hartman @ 2026-06-09  5:30 UTC (permalink / raw)
  To: Markus Probst
  Cc: Rafael J. Wysocki, Len Brown, Geert Uytterhoeven,
	Thomas Bogendoerfer, Ard Biesheuvel, Ilias Apalodimas, Jiri Slaby,
	linux-acpi, linux-kernel, linux-m68k, linux-mips, linux-efi,
	linux-serial
In-Reply-To: <20260609-acpi_spcr-v2-1-3cd9a3bda727@posteo.de>

On Mon, Jun 08, 2026 at 10:40:21PM +0000, Markus Probst wrote:
> Add `uart_clk_freq` parameter to `setup_earlycon`. This allows the
> options string to be reused with `add_preferred_console`, while still
> allowing to set the uart clock frequency. This will be used in the
> following commit ("ACPI: SPCR: Support UART clock frequency field").

Ick, this is bad, now you need to look up what this 0 is as a parameter
every time you see this call.  Please just add a new function that takes
the new parameter, don't abuse the old one for this.
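
One possible shape for that suggestion, as a userspace sketch only. The
names model_setup_earlycon_clk() and model_uartclk are hypothetical, and
the body is a placeholder for the real matching and registration logic:

```c
#include <errno.h>

/* Last clock seen by the model; stands in for port->uartclk. */
static unsigned int model_uartclk;

/* Hypothetical new entry point that carries the extra parameter. */
static int model_setup_earlycon_clk(const char *buf,
				    unsigned int uart_clk_freq)
{
	if (!buf || !buf[0])
		return -EINVAL;
	model_uartclk = uart_clk_freq;
	return 0;
}

/* The one-argument form keeps its signature, so callers stay untouched. */
static int model_setup_earlycon(const char *buf)
{
	return model_setup_earlycon_clk(buf, 0);
}
```

Only the SPCR caller would need the two-argument form; the m68k, MIPS
and EFI call sites would keep calling the one-argument wrapper.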

thanks,

greg k-h

^ permalink raw reply

* Re: [PATCH] iommu/riscv: Add dependency between iommu and devices
From: wang.yechao255 @ 2026-06-09  1:04 UTC (permalink / raw)
  To: sunilvl
  Cc: linux-acpi, linux-riscv, linux-kernel, iommu, sunilvl, rafael,
	lenb, pjw, palmer, aou, alex, tomasz.jeznach, anup.patel,
	andrew.jones
In-Reply-To: <CAB19ukFMu1tW5QkCOY2e=DWbKrehEDv_iO5T7vEfKEGkbTFEEw@mail.gmail.com>

> On Mon, Jun 8, 2026 at 11:52 AM <wang.yechao255@zte.com.cn> wrote:
> >
> > > >
> > > > From: Wang Yechao <wang.yechao255@zte.com.cn>
> > > >
> > > > Commit 9156585280f1 ("ACPI: RIMT: Add dependency between iommu and
> > > > devices") adds the dependency between iommu and devices on ACPI
> > > > systems. On devicetree systems, the incorrect removal order also
> > > > occurs.
> > > >
> > > Interesting. Why is it not handled by the fw_devlink infrastructure in DT?
> > >
> > > Thanks,
> > > Sunil
> >
> > Thank you. I have noticed that commit e149573b2f84 ("of: property: Add device
> > link support for "iommu-map"") adds support for "iommu-map" to the device link
> > supplier bindings, so that probing of PCI devices can be deferred until after
> > the IOMMU is available.
> >
> > The QEMU RISC-V virt machine currently lacks the "iommu-map" property for the
> > PCIe bus when the iommu-pci device is used, which leads to this issue. Therefore,
> > I believe the correct fix is to add the "iommu-map" property for PCIe in QEMU.
> >
> For PCI IOMMU, the dependency should be automatically taken care since iommu
> should always be the first device in the hierarchy and scanned first.

Thank you for pointing it out. The device is not removed before the PCI IOMMU
device because the BDF of the device I configured is smaller than the BDF of
the PCI IOMMU device.
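
To illustrate the ordering with made-up addresses: a BDF packs bus,
device and function into 8 + 5 + 3 bits, so an endpoint at 00:01.0 sorts
below an IOMMU at 00:02.0. The helper names below are hypothetical, and
the ascending-BDF scan order is an assumption taken from this thread.

```c
#include <stdint.h>

/* Pack bus/device/function into a 16-bit BDF: 8 + 5 + 3 bits. */
static uint16_t bdf_pack(uint8_t bus, uint8_t dev, uint8_t fn)
{
	return (uint16_t)((bus << 8) | ((dev & 0x1f) << 3) | (fn & 0x07));
}

/* Nonzero if a is enumerated before b under an ascending-BDF scan. */
static int bdf_scanned_before(uint16_t a, uint16_t b)
{
	return a < b;
}
```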

Regards,
Yechao

^ permalink raw reply

* [PATCH RESEND v2 2/2] ACPI: SPCR: Support UART clock frequency field
From: Markus Probst @ 2026-06-08 22:40 UTC (permalink / raw)
  To: Rafael J. Wysocki, Len Brown, Geert Uytterhoeven,
	Thomas Bogendoerfer, Ard Biesheuvel, Ilias Apalodimas,
	Greg Kroah-Hartman, Jiri Slaby
  Cc: linux-acpi, linux-kernel, linux-m68k, linux-mips, linux-efi,
	linux-serial, Markus Probst
In-Reply-To: <20260609-acpi_spcr-v2-0-3cd9a3bda727@posteo.de>

The Microsoft Serial Port Console Redirection (SPCR) specification
revision 1.08 adds a new field: UART Clock Frequency [1].

It contains a non-zero value indicating the UART clock frequency in Hz.
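
A minimal sketch of the revision-gated read that the diff below performs
inline. The model_spcr struct is a trimmed, hypothetical view and not
the ACPICA table layout; the ">= 3" check mirrors the condition used in
this patch.

```c
#include <stdint.h>

/* Trimmed, hypothetical view of the SPCR fields used here. */
struct model_spcr {
	uint8_t revision;
	uint32_t uart_clk_freq;	/* Hz; only meaningful on newer revisions */
};

/* Return the clock in Hz, or 0 to mean "not provided, use the default". */
static unsigned int model_spcr_uart_clk(const struct model_spcr *t)
{
	return t->revision >= 3 ? t->uart_clk_freq : 0;
}
```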

Link: https://learn.microsoft.com/en-us/windows-hardware/drivers/serports/serial-port-console-redirection-table [1]
Signed-off-by: Markus Probst <markus.probst@posteo.de>
---
 drivers/acpi/spcr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/acpi/spcr.c b/drivers/acpi/spcr.c
index cfacbe53f279..16f94073fde6 100644
--- a/drivers/acpi/spcr.c
+++ b/drivers/acpi/spcr.c
@@ -228,7 +228,7 @@ int __init acpi_parse_spcr(bool enable_earlycon, bool enable_console)
 	pr_info("console: %s\n", opts);
 
 	if (enable_earlycon)
-		setup_earlycon(opts, 0);
+		setup_earlycon(opts, table->header.revision >= 3 ? table->uart_clk_freq : 0);
 
 	if (enable_console)
 		err = add_preferred_console(uart, 0, opts + strlen(uart) + 1);

-- 
2.53.0


^ permalink raw reply related

* [PATCH RESEND v2 1/2] serial: earlycon: add uart_clk_freq parameter
From: Markus Probst @ 2026-06-08 22:40 UTC (permalink / raw)
  To: Rafael J. Wysocki, Len Brown, Geert Uytterhoeven,
	Thomas Bogendoerfer, Ard Biesheuvel, Ilias Apalodimas,
	Greg Kroah-Hartman, Jiri Slaby
  Cc: linux-acpi, linux-kernel, linux-m68k, linux-mips, linux-efi,
	linux-serial, Markus Probst
In-Reply-To: <20260609-acpi_spcr-v2-0-3cd9a3bda727@posteo.de>

Add a `uart_clk_freq` parameter to `setup_earlycon`. This allows the
options string to be reused with `add_preferred_console`, while still
allowing the UART clock frequency to be set. This will be used in the
following commit ("ACPI: SPCR: Support UART clock frequency field").

No logical change intended.

Signed-off-by: Markus Probst <markus.probst@posteo.de>
---
 arch/m68k/virt/config.c          |  2 +-
 arch/mips/mti-malta/malta-init.c |  2 +-
 drivers/acpi/spcr.c              |  2 +-
 drivers/firmware/efi/earlycon.c  |  2 +-
 drivers/tty/serial/earlycon.c    | 17 ++++++++++++-----
 include/linux/serial_core.h      |  7 +++++--
 6 files changed, 21 insertions(+), 11 deletions(-)

diff --git a/arch/m68k/virt/config.c b/arch/m68k/virt/config.c
index b338e2a8da6a..2c35ec15a51b 100644
--- a/arch/m68k/virt/config.c
+++ b/arch/m68k/virt/config.c
@@ -83,7 +83,7 @@ void __init config_virt(void)
 
 	snprintf(earlycon, sizeof(earlycon), "early_gf_tty,0x%08x",
 		 virt_bi_data.tty.mmio);
-	setup_earlycon(earlycon);
+	setup_earlycon(earlycon, 0);
 
 	mach_init_IRQ = virt_init_IRQ;
 	mach_sched_init = virt_sched_init;
diff --git a/arch/mips/mti-malta/malta-init.c b/arch/mips/mti-malta/malta-init.c
index 82b0fd8576a2..88ef17967ced 100644
--- a/arch/mips/mti-malta/malta-init.c
+++ b/arch/mips/mti-malta/malta-init.c
@@ -75,7 +75,7 @@ static void __init console_config(void)
 	if ((strstr(fw_getcmdline(), "earlycon=")) == NULL) {
 		sprintf(console_string, "uart8250,io,0x3f8,%d%c%c", baud,
 			parity, bits);
-		setup_earlycon(console_string);
+		setup_earlycon(console_string, 0);
 	}
 
 	if ((strstr(fw_getcmdline(), "console=")) == NULL) {
diff --git a/drivers/acpi/spcr.c b/drivers/acpi/spcr.c
index 73cb933fdc89..cfacbe53f279 100644
--- a/drivers/acpi/spcr.c
+++ b/drivers/acpi/spcr.c
@@ -228,7 +228,7 @@ int __init acpi_parse_spcr(bool enable_earlycon, bool enable_console)
 	pr_info("console: %s\n", opts);
 
 	if (enable_earlycon)
-		setup_earlycon(opts);
+		setup_earlycon(opts, 0);
 
 	if (enable_console)
 		err = add_preferred_console(uart, 0, opts + strlen(uart) + 1);
diff --git a/drivers/firmware/efi/earlycon.c b/drivers/firmware/efi/earlycon.c
index 3d060d59968c..0e3c2cb08966 100644
--- a/drivers/firmware/efi/earlycon.c
+++ b/drivers/firmware/efi/earlycon.c
@@ -221,7 +221,7 @@ static bool __initdata fb_probed;
 void __init efi_earlycon_reprobe(void)
 {
 	if (fb_probed)
-		setup_earlycon("efifb");
+		setup_earlycon("efifb", 0);
 }
 
 static int __init efi_earlycon_setup(struct earlycon_device *device,
diff --git a/drivers/tty/serial/earlycon.c b/drivers/tty/serial/earlycon.c
index ab9af37f6cda..a419943e083b 100644
--- a/drivers/tty/serial/earlycon.c
+++ b/drivers/tty/serial/earlycon.c
@@ -135,11 +135,14 @@ static int __init parse_options(struct earlycon_device *device, char *options)
 	return 0;
 }
 
-static int __init register_earlycon(char *buf, const struct earlycon_id *match)
+static int __init register_earlycon(char *buf, unsigned int uart_clk_freq,
+				    const struct earlycon_id *match)
 {
 	int err;
 	struct uart_port *port = &early_console_dev.port;
 
+	port->uartclk = uart_clk_freq;
+
 	/* On parsing error, pass the options buf to the setup function */
 	if (buf && !parse_options(&early_console_dev, buf))
 		buf = NULL;
@@ -164,7 +167,8 @@ static int __init register_earlycon(char *buf, const struct earlycon_id *match)
 
 /**
  *	setup_earlycon - match and register earlycon console
- *	@buf:	earlycon param string
+ *	@buf:		earlycon param string
+ *	@uart_clk_freq:	uart clock frequency in Hz or 0 for BASE_BAUD*16
  *
  *	Registers the earlycon console matching the earlycon specified
  *	in the param string @buf. Acceptable param strings are of the form
@@ -177,10 +181,13 @@ static int __init register_earlycon(char *buf, const struct earlycon_id *match)
  *	<options> string in the 'options' parameter; all other forms set
  *	the parameter to NULL.
  *
+ *	If the uart clock frequency is specified in the 'options' parameter,
+ *	the value of the param @uart_clk_freq will be ignored.
+ *
  *	Returns 0 if an attempt to register the earlycon was made,
  *	otherwise negative error code
  */
-int __init setup_earlycon(char *buf)
+int __init setup_earlycon(char *buf, unsigned int uart_clk_freq)
 {
 	const struct earlycon_id *match;
 	bool empty_compatible = true;
@@ -209,7 +216,7 @@ int __init setup_earlycon(char *buf)
 		} else
 			buf = NULL;
 
-		return register_earlycon(buf, match);
+		return register_earlycon(buf, uart_clk_freq, match);
 	}
 
 	if (empty_compatible) {
@@ -241,7 +248,7 @@ static int __init param_setup_earlycon(char *buf)
 		}
 	}
 
-	err = setup_earlycon(buf);
+	err = setup_earlycon(buf, 0);
 	if (err == -ENOENT || err == -EALREADY)
 		return 0;
 	return err;
diff --git a/include/linux/serial_core.h b/include/linux/serial_core.h
index 666430b47899..5c60fda9dd3a 100644
--- a/include/linux/serial_core.h
+++ b/include/linux/serial_core.h
@@ -1097,10 +1097,13 @@ int of_setup_earlycon(const struct earlycon_id *match, unsigned long node,
 
 #ifdef CONFIG_SERIAL_EARLYCON
 extern bool earlycon_acpi_spcr_enable __initdata;
-int setup_earlycon(char *buf);
+int setup_earlycon(char *buf, unsigned int uart_clk_freq);
 #else
 static const bool earlycon_acpi_spcr_enable EARLYCON_USED_OR_UNUSED;
-static inline int setup_earlycon(char *buf) { return 0; }
+static inline int setup_earlycon(char *buf, unsigned int uart_clk_freq)
+{
+	return 0;
+}
 #endif
 
 /* Variant of uart_console_registered() when the console_list_lock is held. */

-- 
2.53.0


^ permalink raw reply related

* [PATCH RESEND v2 0/2] ACPI: SPCR: Support UART clock frequency field
From: Markus Probst @ 2026-06-08 22:40 UTC (permalink / raw)
  To: Rafael J. Wysocki, Len Brown, Geert Uytterhoeven,
	Thomas Bogendoerfer, Ard Biesheuvel, Ilias Apalodimas,
	Greg Kroah-Hartman, Jiri Slaby
  Cc: linux-acpi, linux-kernel, linux-m68k, linux-mips, linux-efi,
	linux-serial, Markus Probst

Support the uart clock frequency in the SPCR table.
See the commit messages for details.

Signed-off-by: Markus Probst <markus.probst@posteo.de>
---
Changes in v2:
- fix uart_clk_freq possibly being interpreted as parity/bits/flow
- Link to v1: https://patch.msgid.link/20260505-acpi_spcr-v1-1-fd4bc6f4eb53@posteo.de

---
Markus Probst (2):
      serial: earlycon: add uart_clk_freq parameter
      ACPI: SPCR: Support UART clock frequency field

 arch/m68k/virt/config.c          |  2 +-
 arch/mips/mti-malta/malta-init.c |  2 +-
 drivers/acpi/spcr.c              |  2 +-
 drivers/firmware/efi/earlycon.c  |  2 +-
 drivers/tty/serial/earlycon.c    | 17 ++++++++++++-----
 include/linux/serial_core.h      |  7 +++++--
 6 files changed, 21 insertions(+), 11 deletions(-)
---
base-commit: aa61612ab641d7d62b0b6889f2c7c9251489f6e3
change-id: 20260430-acpi_spcr-61902fd923f2
-- 
Markus Probst <markus.probst@posteo.de>


^ permalink raw reply

* Re: [PATCH v1] tpm_crb: Check ACPI_COMPANION() against NULL during probe
From: Rafael J. Wysocki @ 2026-06-08 17:35 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: Peter Huewe, Jason Gunthorpe, linux-integrity, Andy Shevchenko,
	LKML, Linux ACPI
In-Reply-To: <CAJZ5v0jQUQ85MpyPZNbLmxqaGGvsTBKsdf8gdPNmFFSpZkj4eQ@mail.gmail.com>

On Tue, May 19, 2026 at 11:01 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
>
> On Sat, May 16, 2026 at 3:15 AM Jarkko Sakkinen <jarkko@kernel.org> wrote:
> >
> > On Tue, May 12, 2026 at 06:16:23PM +0200, Rafael J. Wysocki wrote:
> > > From: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
> > >
> > > Every platform driver can be forced to match a device that doesn't match
> > > its list of device IDs because of device_match_driver_override(), so
> > > platform drivers that rely on the existence of a device's ACPI companion
> > > object need to verify its presence.
> > >
> > > Accordingly, add a requisite ACPI_COMPANION() check against NULL to the
> > > tpm_crb driver.
> > >
> > > Fixes: 48fe2cddc85c ("tpm_crb: Convert ACPI driver to a platform one")
> > > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > > ---
> > >  drivers/char/tpm/tpm_crb.c |    6 +++++-
> > >  1 file changed, 5 insertions(+), 1 deletion(-)
> > >
> > > --- a/drivers/char/tpm/tpm_crb.c
> > > +++ b/drivers/char/tpm/tpm_crb.c
> > > @@ -786,8 +786,8 @@ static int crb_map_pluton(struct device
> > >  static int crb_acpi_probe(struct platform_device *pdev)
> > >  {
> > >       struct device *dev = &pdev->dev;
> > > -     struct acpi_device *device = ACPI_COMPANION(dev);
> > >       struct acpi_table_tpm2 *buf;
> > > +     struct acpi_device *device;
> > >       struct crb_priv *priv;
> > >       struct tpm_chip *chip;
> > >       struct tpm2_crb_smc *crb_smc;
> > > @@ -797,6 +797,10 @@ static int crb_acpi_probe(struct platfor
> > >       u32 sm;
> > >       int rc;
> > >
> > > +     device = ACPI_COMPANION(dev);
> > > +     if (!device)
> > > +             return -ENODEV;
> > > +
> > >       status = acpi_get_table(ACPI_SIG_TPM2, 1,
> > >                               (struct acpi_table_header **) &buf);
> > >       if (ACPI_FAILURE(status) || buf->header.length < sizeof(*buf)) {
> > >
> > >
> > >
> >
> > Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
>
> Thanks!
>
> So do you want me to pick up this one?

I took the silence as consent and picked it up.  If you'd rather route
it differently, please let me know.

Thanks!

^ permalink raw reply

